Homework 14
Regularization-based Lifelong Learning
Machine Learning Exit Evaluation
Friendly reminder:
This questionnaire is used only for educational analysis and will never affect your grades, so please fill it in with peace of mind.
https://forms.gle/jAQPSoFQaoXcA22U6
Questionnaire of Education Course Series for AI Technologies and Applications
Friendly reminder:
https://forms.gle/boFFRPwvyMeaExxB6
Outline
● Introduction
● Dataset
● Sample Code
● Grading
● Submission
Introduction – LifeLong Learning
Goal: A single model that can beat all tasks!
Condition: The model learns the different tasks sequentially! (At training time)
Dataset
Rotated MNIST (Generated by TAs)
Sample Code – Training Details
● 5 tasks / Each task is trained for 10 epochs.
● Each method takes ~20 minutes to train. (Tesla T4)
● Each method takes ~60 minutes to train. (Tesla K80)
● Colab link
Sample Code – Guideline
● Utilities
● Prepare Data
● Prepare Model
● Train and Evaluate
● Methods
● Plot function
Sample Code – Prepare Data
● Prepare Data
○ Rotation and Transformation
○ Dataloaders and Arguments
○ Visualization
5 tasks
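As a rough illustration of the rotation step, the sketch below builds one rotated copy of an image per task. It is plain Python with nearest-neighbour sampling, not the TAs' generation code; `rotate_image`, `make_task_images`, and the values in `TASK_ANGLES` are hypothetical names and numbers chosen only for this example.

```python
import math

def rotate_image(img, angle_deg):
    """Rotate a 2D list of pixels about its centre (nearest-neighbour sampling)."""
    h, w = len(img), len(img[0])
    cy, cx = (h - 1) / 2, (w - 1) / 2
    theta = math.radians(angle_deg)
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            dx, dy = x - cx, y - cy
            # Inverse mapping: find the source pixel that lands on (x, y).
            sx = round(cos_t * dx + sin_t * dy + cx)
            sy = round(-sin_t * dx + cos_t * dy + cy)
            if 0 <= sx < w and 0 <= sy < h:
                out[y][x] = img[sy][sx]
    return out

# Hypothetical angles, one per task; the real values live in the sample code.
TASK_ANGLES = [0, 10, 20, 30, 40]

def make_task_images(img, angles=TASK_ANGLES):
    """Return one rotated copy of `img` per task."""
    return [rotate_image(img, a) for a in angles]
```

Each task then sees the same digits under a different rotation, which is what makes the tasks related but distinct.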
Sample Code – Prepare Model
● Prepare Model
○ Model Architecture
Sample Code – Train and Evaluate
● Train:
○ Sequentially train.
○ Add regularization term and update it.
● Evaluate:
○ Evaluate by using a special metric.
○ (Please read sample code and describe it in your report.)
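The regularization term used by methods such as EWC can be sketched as a quadratic penalty around the parameters learned on earlier tasks, weighted by a per-parameter importance. This is a minimal plain-Python sketch, not the sample code: `regularized_loss`, `lam`, and the list-based parameters are assumptions made for illustration, and the exact weighting in the Colab may differ.

```python
def regularized_loss(task_loss, params, old_params, importance, lam=1.0):
    """Task loss plus a quadratic penalty that anchors important parameters.

    penalty = sum_i importance[i] * (params[i] - old_params[i]) ** 2
    """
    penalty = sum(
        b * (p - p_old) ** 2
        for p, p_old, b in zip(params, old_params, importance)
    )
    return task_loss + lam * penalty

# Baseline: zero importance, so the penalty vanishes and nothing is protected.
baseline = regularized_loss(0.3, [1.0, 2.0], [1.0, 0.0], [0.0, 0.0])

# Non-zero importance: moving the second parameter by 2 costs 0.5 * 2 ** 2.
guarded = regularized_loss(0.3, [1.0, 2.0], [1.0, 0.0], [1.0, 0.5])
```

The methods listed in this homework differ mainly in how the importance values are computed and updated after each task.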
Training Pipeline:
Sample Code – Methods
● Baseline (Do nothing in regularization term)
● EWC
● MAS
● SI
● RWalk
● SCP
EWC – Elastic Weight Consolidation
1. You need to know how EWC generates the guard (importance) weights!
2. Does this method need to use labels?
3. Hint: (Trace the class ewc and its calculate_importance function)
Paper Link: https://arxiv.org/abs/1612.00796
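To make the idea of a Fisher-based importance concrete, here is a minimal sketch of a diagonal empirical Fisher estimate for a tiny logistic-regression model. It is an assumption-laden toy in plain Python, not the `ewc` class from the sample code: `empirical_fisher_diag` is a hypothetical helper, and it uses the dataset labels directly, which may or may not match what `calculate_importance` does.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def empirical_fisher_diag(weights, data):
    """Diagonal empirical Fisher for logistic regression.

    data: list of (x, y) pairs, x a list of floats and y in {0, 1}.
    The gradient of log p(y | x) w.r.t. weight i is (y - p) * x[i];
    the diagonal estimate is the mean of its square over the data.
    """
    fisher = [0.0] * len(weights)
    for x, y in data:
        p = sigmoid(sum(w * xi for w, xi in zip(weights, x)))
        for i, xi in enumerate(x):
            grad_i = (y - p) * xi
            fisher[i] += grad_i ** 2
    return [f / len(data) for f in fisher]

# A weight whose input is always zero gets zero importance.
fisher = empirical_fisher_diag([0.0, 0.0], [([2.0, 0.0], 1)])
```

Weights with a large Fisher value are the ones the penalty term holds closest to their old values.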
MAS – Memory Aware Synapses
1. You need to know how MAS generates the guard (importance) weights!
2. Does this method need to use labels?
3. We want you to implement the Omega matrix for MAS! Please read page 21 carefully and paste your code (only the TODO block) in your report.
Please do not modify any part of the sample code except the TODO block.
Paper Link: https://arxiv.org/abs/1711.09601
● The method proposed in the paper is the local version, which takes the squared L2 norm of the outputs of each layer of the model.
● Here we only want you to implement the global version, which takes the outputs of the last layer of the model.
● Hint: (It is similar to the way you generate the Fisher matrix for EWC; the only difference is how the importance weight is calculated.)
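The global version can be illustrated on a toy linear model where the gradient is available in closed form. This sketch is plain Python and is not the TODO block: `mas_omega` is a hypothetical helper, the model o = W x is an assumption made for the example, and the sample code works with PyTorch tensors and autograd instead.

```python
def mas_omega(W, inputs):
    """MAS-style importance for a linear model o = W x (last layer only).

    The gradient of ||o||^2 w.r.t. W[k][j] is 2 * o[k] * x[j]; the importance
    is the mean absolute value of that gradient over the inputs.
    """
    rows, cols = len(W), len(W[0])
    omega = [[0.0] * cols for _ in range(rows)]
    for x in inputs:
        o = [sum(W[k][j] * x[j] for j in range(cols)) for k in range(rows)]
        for k in range(rows):
            for j in range(cols):
                omega[k][j] += abs(2.0 * o[k] * x[j])
    return [[v / len(inputs) for v in row] for row in omega]

omega = mas_omega([[1.0, 0.0], [0.0, 1.0]], [[1.0, 2.0]])
```

Note that the function only receives inputs; the quantity being differentiated is the squared norm of the model output rather than a loss.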
SI – Synaptic Intelligence
1. You need to know how SI generates the guard (importance) weights!
2. Does this method need to use labels?
3. Hint: (Accumulated loss change in each update step)
Paper Link: https://arxiv.org/abs/1703.04200, Talk Slide
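The hint about accumulating the loss change at each update step can be sketched as follows. This is a toy in plain Python under stated assumptions: `si_importance`, the plain SGD loop, and the damping value `xi` are illustrative choices, not the sample code's implementation.

```python
def si_importance(grad_fn, theta, lr=0.1, steps=100, xi=0.1):
    """Run plain SGD and accumulate an SI-style per-parameter path integral.

    grad_fn(theta) returns the loss gradient as a list.
    Returns (final_theta, importance).
    """
    start = list(theta)
    theta = list(theta)
    path = [0.0] * len(theta)  # running sum of -grad * parameter change
    for _ in range(steps):
        g = grad_fn(theta)
        delta = [-lr * gi for gi in g]
        for i in range(len(theta)):
            path[i] += -g[i] * delta[i]
            theta[i] += delta[i]
    # Normalise by the squared total displacement, damped by xi.
    importance = [
        path[i] / ((theta[i] - start[i]) ** 2 + xi) for i in range(len(theta))
    ]
    return theta, importance

# Toy loss 0.5 * (4 * t0**2 + 0.1 * t1**2): steep in t0, nearly flat in t1.
grad = lambda t: [4.0 * t[0], 0.1 * t[1]]
final_theta, omega = si_importance(grad, [1.0, 1.0])
```

In this toy run the steep direction receives the larger importance, because updates along it account for most of the drop in the loss.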
RWalk – Riemannian Walk
1. Trace the Rwalk class and its update function!
2. Does this method need to use labels?
3. Hint: (The code is similar to two of the methods in the sample code.)
Paper Link: https://arxiv.org/abs/1801.10112
SCP – Sliced Cramer Preservation
1. Paper Link: https://openreview.net/pdf?id=BJge3TNKwH
2. Does this method need to use labels?
SCP – Main Idea
● Proposes a distribution-based distance to prevent fast intransigence and avoid overestimating the importance of parameters.
Other Methods and Scenarios
● Only in Multiple Choice Questions
● iCaRL (https://arxiv.org/abs/1611.07725)
● LwF (https://arxiv.org/abs/1606.09282)
● GEM (https://arxiv.org/abs/1706.08840)
● DGR (https://arxiv.org/abs/1705.08690)
● Three scenarios for continual learning (https://arxiv.org/abs/1904.07734)
Grading
● 20 multiple choice questions (8pts, 0.4pt each)
● Report (2pts)
● You have to choose ALL the correct answers for each question
● No leaderboards are needed!!
Grading – Multiple Choice Questions
● 20 multiple choice questions (8pts, 0.4pt each)
○ Basic Concept: 3 Questions
○ EWC: 2 Questions
○ MAS: 2 Questions
○ SI: 2 Questions
○ RWalk: 2 Questions
○ SCP: 3 Questions
○ Other Methods & scenarios: 6 Questions
■ iCaRL, LwF, GEM, DGR
■ Three Scenarios
Grading – Report
● Plot the learning curve of the metric for every method. (The plotting function is provided in the sample code.) (0.5pt)
● Describe the metric. (0.5pt)
● Paste the code with which you implement the Omega matrix for MAS. (1pt)
Please do not modify any part of the sample code except the TODO block.
Please just paste the TODO block.
If you plot the right learning curve, you will still get the points for the first part whether or not you implement the Omega matrix for MAS.
Submission
● The questions are on Gradescope
● Submit your report to Gradescope
● You can answer the questions an unlimited number of times
● No late submission!
● Remember to save the answer when answering the questions!
Link
● Code: Colab
If you have any questions, you can ask us via…
● NTU COOL (Recommended)
● Email
○ The title should begin with “[hw14]”