Description
Scenario:
Assignment Tasks:
1. Data Preparation:
a. Select a dataset suitable for training a language model. It could be a collection of books,
articles, or any other text source.
b. Preprocess the dataset by tokenizing the text into words or characters.
c. Split the dataset into training and validation sets using ratios of 3:7, 4:6, and 2:8, and average the results across the three splits.
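The preparation steps above can be sketched in plain Python. The toy corpus is an assumption standing in for the chosen dataset, and the ratios are read as train:validation exactly as listed in the task:

```python
# Toy corpus standing in for the chosen dataset (assumption).
corpus = "the quick brown fox jumps over the lazy dog . " * 50
tokens = corpus.split()  # word-level tokenization

def split_tokens(tokens, train_fraction):
    # Contiguous split: first part for training, remainder for validation.
    cut = int(len(tokens) * train_fraction)
    return tokens[:cut], tokens[cut:]

# The three train:validation ratios named in the task.
ratios = {"3:7": 0.3, "4:6": 0.4, "2:8": 0.2}
splits = {name: split_tokens(tokens, f) for name, f in ratios.items()}
for name, (train, val) in splits.items():
    print(name, len(train), len(val))
```

Metrics from the model would then be computed once per split and averaged, as the task requires.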
2. Implementing a Vanilla RNN:
a. Design and implement a Vanilla RNN architecture using a deep learning library of your
choice (e.g., TensorFlow, PyTorch, Keras).
b. Train the Vanilla RNN on the training dataset using the backpropagation through time (BPTT)
algorithm.
c. Experiment with different hyperparameters (e.g., learning rate, number of hidden units) to
optimize the model’s performance.
d. Monitor the training process by tracking the loss and other relevant metrics.
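A minimal character-level sketch of steps 2a–2b and 2d, written in NumPy so the BPTT mechanics are explicit (the assignment itself calls for TensorFlow, PyTorch, or Keras; the corpus, layer sizes, learning rate, and function names here are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(0)
data = "hello world " * 20                  # toy corpus (assumption)
chars = sorted(set(data))
V, H = len(chars), 16                       # vocab size, hidden units
c2i = {c: i for i, c in enumerate(chars)}

Wxh = rng.normal(0, 0.01, (H, V))           # input-to-hidden weights
Whh = rng.normal(0, 0.01, (H, H))           # hidden-to-hidden (recurrent) weights
Why = rng.normal(0, 0.01, (V, H))           # hidden-to-output weights
bh, by = np.zeros(H), np.zeros(V)

def loss_and_grads(inputs, targets, h):
    # Forward pass over the window, then backpropagation through time.
    xs, hs, ps = {}, {-1: h}, {}
    loss = 0.0
    for t, (i, j) in enumerate(zip(inputs, targets)):
        xs[t] = np.zeros(V); xs[t][i] = 1.0                 # one-hot input
        hs[t] = np.tanh(Wxh @ xs[t] + Whh @ hs[t - 1] + bh)  # recurrence
        y = Why @ hs[t]
        ps[t] = np.exp(y - y.max()); ps[t] /= ps[t].sum()    # softmax
        loss -= np.log(ps[t][j])                             # cross-entropy
    dWxh, dWhh, dWhy = np.zeros_like(Wxh), np.zeros_like(Whh), np.zeros_like(Why)
    dbh, dby, dhnext = np.zeros_like(bh), np.zeros_like(by), np.zeros(H)
    for t in reversed(range(len(inputs))):
        dy = ps[t].copy(); dy[targets[t]] -= 1
        dWhy += np.outer(dy, hs[t]); dby += dy
        dh = Why.T @ dy + dhnext
        draw = (1 - hs[t] ** 2) * dh                         # through tanh
        dbh += draw
        dWxh += np.outer(draw, xs[t]); dWhh += np.outer(draw, hs[t - 1])
        dhnext = Whh.T @ draw                                # pass back in time
    return loss, (dWxh, dWhh, dWhy, dbh, dby), hs[len(inputs) - 1]

# A few SGD steps over fixed-length windows (truncated BPTT),
# tracking the loss as step 2d asks.
h, lr, seq = np.zeros(H), 0.1, 8
ids = [c2i[c] for c in data]
losses = []
for step in range(200):
    p = (step * seq) % (len(ids) - seq - 1)
    loss, grads, h = loss_and_grads(ids[p:p + seq], ids[p + 1:p + seq + 1], h)
    for W, dW in zip((Wxh, Whh, Why, bh, by), grads):
        W -= lr * np.clip(dW, -5, 5)        # clip to curb exploding gradients
    losses.append(loss)
print(losses[0], losses[-1])
```

In a deep learning library the recurrence and backward pass collapse to a few lines (e.g. `nn.RNN` plus an optimizer in PyTorch), and the hyperparameters in step 2c correspond to `lr`, `H`, and `seq` above.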
3. Text Generation:
a. After training the Vanilla RNN, use the model to generate text based on a given input
prompt.
b. Experiment with different input prompts to observe how the generated text changes.
c. Analyze the quality and coherence of the generated text and compare it with the training
dataset.
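The generation loop in step 3 is the same regardless of framework: feed the prompt through the model, sample the next token from the output distribution, and feed it back in. In the sketch below, `next_logits` is a hypothetical stand-in for the trained RNN's output step, so the sampling mechanics (including a temperature knob worth experimenting with in 3b) are visible on their own:

```python
import numpy as np

rng = np.random.default_rng(0)
vocab = list("abc ")

def next_logits(history):
    # Hypothetical stand-in for a trained model: favors a -> b -> c -> space.
    prefs = {"a": "b", "b": "c", "c": " ", " ": "a"}
    logits = np.zeros(len(vocab))
    logits[vocab.index(prefs[history[-1]])] = 4.0
    return logits

def generate(prompt, n_tokens, temperature=1.0):
    out = list(prompt)
    for _ in range(n_tokens):
        z = next_logits(out) / temperature       # low temperature -> greedier
        p = np.exp(z - z.max()); p /= p.sum()    # softmax over the vocab
        out.append(vocab[rng.choice(len(vocab), p=p)])
    return "".join(out)

print(generate("a", 12, temperature=0.5))
```

Varying the prompt and the temperature, then comparing the samples against the training text, covers the observations asked for in 3b and 3c.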
4. Limitations of Vanilla RNN:
a. Research and discuss the limitations of Vanilla RNNs in capturing long-term dependencies
in sequences.
b. Describe the vanishing and exploding gradient problems that can occur during training.
c. Explain how these limitations can affect the performance of the language model you
implemented.
d. Discuss potential solutions or alternative models that can overcome these limitations.
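The gradient problems in step 4b can be shown numerically. In BPTT, the gradient reaching a state k steps back is multiplied by k recurrent Jacobians; the sketch below isolates that product using a scaled orthogonal matrix, so each backward step rescales the gradient norm by exactly `scale` (a tanh-derivative factor of at most 1 would only shrink it further). The matrix and sizes are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)
H = 16

def backprop_norms(scale, steps=50):
    # Orthogonal matrix times `scale`: every backward step multiplies
    # the gradient norm by exactly `scale`.
    Whh = scale * np.linalg.qr(rng.normal(size=(H, H)))[0]
    g = np.ones(H) / np.sqrt(H)              # unit gradient at the last step
    norms = [float(np.linalg.norm(g))]
    for _ in range(steps):
        g = Whh.T @ g                        # one linearized BPTT step
        norms.append(float(np.linalg.norm(g)))
    return norms

vanishing = backprop_norms(0.9)   # 0.9**50 ~ 5e-3: 50-step-old signal is gone
exploding = backprop_norms(1.1)   # 1.1**50 ~ 1e+2: gradients blow up instead
print(vanishing[-1], exploding[-1])
```

This is why the language model above struggles with long-range dependencies (4c), and it motivates the standard remedies for 4d: gradient clipping bounds the exploding case, while gated architectures such as LSTM and GRU give gradients an additive path through time that resists vanishing.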
5. Report and Presentation:
a. Prepare a report summarizing your implementation, including details about the dataset, model architecture, training process, and text generation results.
b. Include an analysis of the limitations of Vanilla RNNs and their impact on the language
model’s performance.
c. Create a presentation to share your findings, explaining the steps taken, challenges faced,
and potential improvements.