Paper | Presenter | Week |
---|---|---|
Different activation functions and optimizers: 1. Maxout Networks 2. ADADELTA: An Adaptive Learning Rate Method | | |
Initializers: 1. Understanding the difficulty of training deep feedforward neural networks 2. Delving Deep into Rectifiers: Surpassing Human-Level Performance on ImageNet Classification | Ali Alsetri | 12/1 |
Generalization error of optimizers: 1. The Marginal Value of Adaptive Gradient Methods in Machine Learning 2. When do adaptive optimizers fail to generalize? | | |
Different local minima and generalization: 1. On Large-Batch Training for Deep Learning: Generalization Gap and Sharp Minima 2. Sharp Minima Can Generalize For Deep Nets | Kotaro Kajita | 11/20 |
Adam optimizer: Adam: A Method for Stochastic Optimization | Alex Emmons | 11/29 |
CNN visualization: Visualizing and Understanding Convolutional Networks | Joshua Peterson | 11/27 |