Description
Background
In this homework, we focus on the fine-grained sentence-level sentiment classification problem. After implementing MLP and CNN in Python with NumPy and TensorFlow, you should already have the basic skills of deep learning. In this homework, you will apply a recurrent neural network (RNN) to sentence-level sentiment classification and implement a self-attention mechanism to improve the performance of your network.
We have learned from the lecture that an RNN encodes the input sequence into hidden states $H$ of size $n \times d$, where $n$ is the length of the sequence and $d$ is the dimension of each hidden state. With the self-attention mechanism [1], we transform the "vector" hidden states into "matrix" hidden states $M$:

$$A = \mathrm{softmax}\left(W_{s2}\,\tanh\left(W_{s1} H^{\top}\right)\right), \qquad M = AH,$$

where $W_{s1}$ is a $d_a \times d$ weight matrix and $W_{s2}$ is an $r \times d_a$ weight matrix. $A$ contains $r$ different attention weight vectors, each of size $n$, and each weight vector sums to 1 (because each weight vector is normalized by the softmax function). $M = AH$ denotes the matrix hidden states. In addition, we add a penalization term $P$ to the loss function to avoid redundant weight vectors:

$$P = \left\| AA^{\top} - I \right\|_F^2,$$

where $\|\cdot\|_F$ stands for the Frobenius norm of a matrix and $I$ indicates the identity matrix. For more details, please refer to [1].
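To make the equations concrete, below is a minimal TensorFlow sketch of the attention step and the penalization term. It assumes the hidden states H have shape [batch, n, d]; the function name self_attention and the default values of d_a and r are illustrative, not part of the provided skeleton.

```python
import tensorflow as tf

def self_attention(H, d_a=128, r=16):
    """Structured self-attention (Lin et al., 2017).

    H:   RNN hidden states, shape [batch, n, d].
    d_a: hidden size of the attention MLP (hypothetical default).
    r:   number of attention hops (hypothetical default).
    Returns the matrix embedding M ([batch, r, d]) and the penalization term P.
    """
    d = H.get_shape().as_list()[-1]
    W_s1 = tf.get_variable("W_s1", [d_a, d])
    W_s2 = tf.get_variable("W_s2", [r, d_a])

    # tanh(W_s1 H^T): contract the d axis -> [batch, n, d_a]
    u = tf.tanh(tf.tensordot(H, W_s1, axes=[[2], [1]]))
    # W_s2 (.): contract the d_a axis -> [batch, n, r] -> [batch, r, n]
    scores = tf.transpose(tf.tensordot(u, W_s2, axes=[[2], [1]]), [0, 2, 1])
    A = tf.nn.softmax(scores)   # softmax over the n tokens (last axis)
    M = tf.matmul(A, H)         # [batch, r, d]

    # Penalization P = ||A A^T - I||_F^2, averaged over the batch
    AAT = tf.matmul(A, A, transpose_b=True)               # [batch, r, r]
    P = tf.reduce_mean(tf.reduce_sum(tf.square(AAT - tf.eye(r)), axis=[1, 2]))
    return M, P
```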
In this homework, you should implement the #todo parts of the code, including the basic RNN cells in rnn_cell.py, the RNN framework in model.py, and the self-attention mechanism in model.py. We encourage you to rewrite part of main.py so that you can use TensorBoard to visualize your experimental results.
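As a reference for rnn_cell.py, here is a minimal sketch of the vanilla RNN update h_t = tanh(W [x_t; h_{t-1}] + b). The function and variable names are illustrative; adapt it to whatever cell interface the skeleton code defines.

```python
import tensorflow as tf

def basic_rnn_step(inputs, state, num_units, scope="basic_rnn_cell"):
    """One step of a vanilla RNN: h_t = tanh(W [x_t; h_{t-1}] + b).

    inputs: [batch, input_dim], state: [batch, num_units].
    Returns (output, new_state); for a vanilla RNN they are the same tensor.
    """
    with tf.variable_scope(scope):
        input_dim = inputs.get_shape().as_list()[-1]
        W = tf.get_variable("kernel", [input_dim + num_units, num_units])
        b = tf.get_variable("bias", [num_units], initializer=tf.zeros_initializer())
        h = tf.tanh(tf.matmul(tf.concat([inputs, state], axis=1), W) + b)
    return h, h
```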
Requirements
Python 2.7 or Python 3.6
TensorFlow >= 1.0.0
Dataset Description
The Stanford Sentiment Treebank (SST) dataset contains 11,855 sentences and has been split into training / validation / test sets, containing 8,544 / 1,101 / 2,210 sentences respectively. We also provide pre-trained word embeddings in vector.txt, which are loaded by a function in main.py.
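The loader in main.py defines the exact format of vector.txt; the sketch below only assumes the common layout of one token per line followed by its vector components.

```python
import numpy as np

def load_word_vectors(path="vector.txt"):
    """Parse a whitespace-separated embedding file (assumed format:
    token followed by its float vector components on each line)."""
    vocab, vectors = [], []
    with open(path, "r") as f:
        for line in f:
            parts = line.rstrip().split()
            if len(parts) < 2:
                continue
            vocab.append(parts[0])
            vectors.append([float(x) for x in parts[1:]])
    return vocab, np.asarray(vectors, dtype=np.float32)
```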
Python Files Description
main.py contains the main script to run the whole program.
model.py contains the main script for model implementation.
rnn_cell.py contains various basic RNN cells.
Command
Train: python main.py
Test: python main.py --is_train=False --inference_version=xxx
NOTE: Feel free to fine-tune all the hyperparameters to get better results. You can choose the checkpoint with the best performance on the validation set as inference_version. After you run the test command, you will get a result file, which you should submit together with your codes and report.
Report
Everyone needs to perform the following experiments in this homework:
1. Plot the loss value and accuracy value of a one-layer RNN with 3 kinds of RNN cells (i.e., BasicRNNCell, GRUCell, BasicLSTMCell) against the training epoch. Compare and analyze the performance of the 3 kinds of RNN cells.
2. Plot the loss value and accuracy value of your final network against the training epoch (e.g., with TensorBoard summaries; see the sketch after this list).
3. Describe and analyze the details and the performance of your final network.
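If you use TensorBoard as suggested above, a minimal sketch for logging per-epoch loss and accuracy might look like the following; the class name EpochLogger and the placeholder-based design are assumptions, not part of the provided code.

```python
import tensorflow as tf

class EpochLogger(object):
    """Writes per-epoch loss/accuracy scalars that TensorBoard can plot."""

    def __init__(self, logdir):
        self.loss_ph = tf.placeholder(tf.float32, [], name="epoch_loss")
        self.acc_ph = tf.placeholder(tf.float32, [], name="epoch_accuracy")
        self.merged = tf.summary.merge([
            tf.summary.scalar("loss", self.loss_ph),
            tf.summary.scalar("accuracy", self.acc_ph),
        ])
        self.writer = tf.summary.FileWriter(logdir)

    def log(self, sess, epoch, loss, acc):
        summary = sess.run(self.merged, {self.loss_ph: loss, self.acc_ph: acc})
        self.writer.add_summary(summary, global_step=epoch)
```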
Submission Guideline
You need to submit a report document and your codes, as required below:
Report: a well-formatted and readable summary that describes your results, discussions, and analysis.
Source code should not be included in the report; only a few essential lines of code are permitted.
For the format of a good report, you can refer to top-conference papers.
You should submit a .zip file named after your student number, organized as below:
Reference
[1] Zhouhan Lin, Minwei Feng, Cicero Nogueira dos Santos, Mo Yu, Bing Xiang, Bowen Zhou, and Yoshua Bengio. 2017. A structured self-attentive sentence embedding. In Proceedings of International Conference on Learning Representations.