Martijho-PathNet-thesis
Opening
Abstract
- What is all this about?
- Why should I read this thesis?
- Is it any good?
- What's new?
Acknowledgements
- Who is your advisor?
- Did anyone help you?
- Who funded this work?
- What's the name of your favorite pet?
Introduction
From essay. More on multi-task learning; more on transfer learning.
Raise problem: catastrophic forgetting.
Multiple solutions (PNN, PN, EWC)
- Large structures (PNN, PN)
- Limited in the number of tasks it can retain (EWC)
Optimize reuse of knowledge while still providing valid solutions to tasks. More reuse and limited capacity use will increase the number of tasks a structure can learn.
- Where do I start?
A question DeepMind left unanswered is how different GAs influence task learning and module reuse: exploration vs. exploitation (see Theoretical Background).
- Why this?
Broad answers first, specify later. We know PN works; would it work better with different algorithms? A logical next step from the original paper's "unit of evolution".
Problem/hypothesis
- What does modular PN training do with the knowledge?
- More/less accuracy?
- More/less transferability?
Test by learning end-to-end first, then with PN search. Difference in performance or reuse?
- Can we make reuse easier by shifting the focus of the search algorithm?
- Original PN: naive search. Would higher exploitation improve module selection?
How to answer?
- Set up simple multi-task scenarios and try.
- Two tasks, where the first is learned end-to-end vs. with PN (a reuse metric is sketched after this list)
- List algorithms with different selection pressure and try them on multiple tasks.
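Not from the thesis: to make "reuse" concrete, a minimal Python sketch of one possible reuse metric, the fraction of task-2 path modules that already belong to the frozen task-1 path. The metric definition and the path representation are assumptions for illustration, not the thesis's own code.

```python
# Hypothetical reuse metric (an assumption, not the thesis definition):
# a path is a list of layers, each a list of active module indices.
def reuse_fraction(path1, path2):
    """Fraction of task-2 modules that were already in the task-1 path."""
    shared = sum(len(set(l1) & set(l2)) for l1, l2 in zip(path1, path2))
    total = sum(len(l2) for l2 in path2)
    return shared / total

# Example: 3 layers, 3 active modules per layer in each path.
p1 = [[0, 1, 2], [3, 4, 5], [6, 7, 8]]
p2 = [[0, 1, 9], [3, 4, 5], [0, 1, 2]]
print(reuse_fraction(p1, p2))  # 0.555..., 5 of 9 task-2 modules reused
```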
Theoretical Background
Machine Learning
Intro about ML from the thesis
MLP and NN modeling as function approximation
Inspired by the structure of the brain, a Neural Network (NN) consists of one or more layers, where each layer is made up of perceptrons
- What is a perceptron? How is it connected to input, output?
- How is training done? Input against target
- Multi-layer perceptron (MLP) as an artificial Neural Network (ANN).
- Ref binary MNIST classification in exp 1
- Backpropagation and optimizers (SGD and Adam); see the sketch after this list
- ref binary MNIST/Quinary MNIST/exp2
- Regression/function approximation (ReLU activation)
- Classification (Softmax and probability approximation)
- ref experiments
- Image classification
- ref experiments
- Convolutional Neural Networks (CNN)
- ref transition binary-quinary exp1 and exp2
- Deep Learning and Deep neural networks (DNN)
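Not the thesis code: a minimal sketch of the MLP items above, assuming Keras with the TensorFlow backend. It trains a small ReLU network with a softmax output on a binary MNIST subset (digits 0 and 1), loosely mirroring the binary classification in experiment 1; layer sizes and epoch count are illustrative.

```python
from tensorflow import keras

# Load MNIST and keep only two classes to make the task binary.
(x_train, y_train), (x_test, y_test) = keras.datasets.mnist.load_data()
mask_tr, mask_te = y_train < 2, y_test < 2
x_train, y_train = x_train[mask_tr] / 255.0, y_train[mask_tr]
x_test, y_test = x_test[mask_te] / 255.0, y_test[mask_te]

# An MLP: flattened pixels -> ReLU hidden layer -> softmax over 2 classes.
model = keras.Sequential([
    keras.layers.Flatten(input_shape=(28, 28)),
    keras.layers.Dense(64, activation='relu'),
    keras.layers.Dense(2, activation='softmax'),
])

# Backpropagation with an optimizer; swap 'adam' for
# keras.optimizers.SGD() to compare the two.
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])
model.fit(x_train, y_train, epochs=2, validation_data=(x_test, y_test))
```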
Deep Learning
- Feature extraction
- Bigger black box
- Network designs
- Transfer learning
- What is it?
- Why do it?
- How to do it? (see the sketch after this list)
- TL in CNNs
- Who has done it?
- Results?
- Gabor approximation
- Multi-task Learning
- Curriculum Learning
- ref to motivation behind task ordering in exp2
- Catastrophic forgetting
- EWC
- PNN
- PathNet
- Super Neural Networks
- What are they?
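Not the thesis code: a minimal Keras sketch (Keras assumed, as above) of transfer learning in a CNN. The convolutional base is trained on a first task, then frozen while only a new classifier head is trained on a second task; the architecture and class counts are illustrative.

```python
from tensorflow import keras

# Convolutional feature extractor, shared between the two tasks.
conv_base = keras.Sequential([
    keras.layers.Conv2D(16, 3, activation='relu', input_shape=(28, 28, 1)),
    keras.layers.MaxPooling2D(),
    keras.layers.Conv2D(32, 3, activation='relu'),
    keras.layers.Flatten(),
])

# Task A: train base + head end to end (the fit() call is omitted here).
task_a = keras.Sequential([conv_base, keras.layers.Dense(2, activation='softmax')])
task_a.compile(optimizer='adam', loss='sparse_categorical_crossentropy')

# Task B: freeze the learned features, train only a new head.
conv_base.trainable = False
task_b = keras.Sequential([conv_base, keras.layers.Dense(5, activation='softmax')])
task_b.compile(optimizer='adam', loss='sparse_categorical_crossentropy')
```

Freezing the base protects the task-A features from being overwritten, which is the same motive PathNet follows when it locks the winning path after a task is learned.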
Evolutionary algorithms
- What is it? Where does it come from?
- Exploration vs Exploitation
- ref experiments (formulated in the context of this trade-off)
- Terms used in the evolutionary programming context
- Population
- Genotype and genome
- Fitness function
- Selection
- Recombination
- Generation
- Mutation
- Population diversity and convergence
- Some types
- GA
- Evolutionary searches
- Keep this short; go straight into tournament search
- Tournament search
- How does it work, and what are the steps? (see the sketch after this list)
- Selection pressure (in larger context of EAs and then tournament search)
- ref to search
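Not the thesis code: a minimal Python sketch of one tournament-search step in the PathNet spirit. The evaluate callback, which would train a path for a few steps and return its fitness, is a hypothetical placeholder; population size, layer count, and mutation rate are illustrative.

```python
import random

LAYERS, MODULES, PATH_WIDTH = 3, 10, 3  # illustrative sizes

def random_path():
    # A genotype: PATH_WIDTH active module indices per layer.
    return [random.sample(range(MODULES), PATH_WIDTH) for _ in range(LAYERS)]

def mutate(path, rate=0.1):
    # Re-draw each module index with a small probability (exploration).
    return [[random.randrange(MODULES) if random.random() < rate else m
             for m in layer] for layer in path]

def tournament_step(population, evaluate):
    # 1) sample two competitors, 2) evaluate both,
    # 3) overwrite the loser with a mutated copy of the winner (exploitation).
    i, j = random.sample(range(len(population)), 2)
    if evaluate(population[i]) >= evaluate(population[j]):
        winner, loser = i, j
    else:
        winner, loser = j, i
    population[loser] = mutate(population[winner])

population = [random_path() for _ in range(8)]
```

Selection pressure can be tuned by sampling more than two competitors per tournament: the more paths that compete, the stronger the pull toward the current best and the less exploration survives.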