Joseph Keshet

Deep Learning

Topics
  • Introduction
  • Optimization
  • Efficient Differentiation
  • Single Neuron
  • Multilayer Neural Networks
  • Neural Networks for Vision Tasks
  • Neural Networks for Sequential Tasks
  • Training Methods
  • Impact Analysis of Training Methods
  • Data Efficiency and Pre-training
  • Resource Efficiency and Model Compression

We will explore both the theoretical foundations and practical techniques for designing, building, and analyzing deep neural networks, with a focus on supervised learning. Topics include the behavior and convergence of gradient descent and its variants, efficient methods for automatic differentiation, and the theoretical and empirical properties of multilayer networks—such as approximation capabilities, initialization strategies, generalization, and symmetry. We will also cover convolutional networks and their extensions for visual tasks, advanced training methods and their analysis, and neural architectures for sequential data, including recurrent networks, attention mechanisms, and transformers. Additionally, the course addresses strategies to improve data efficiency (e.g., pre-training, self-supervised learning) and resource efficiency (e.g., model quantization, pruning).
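As a point of reference for the first of these topics, here is a minimal sketch of gradient descent on a toy quadratic objective; the objective, step size, and iteration count are illustrative choices for this example, not course material.

```python
# Minimal gradient descent sketch on a toy objective f(w) = ||w - 1||^2.
# All constants here (step size, number of steps) are arbitrary example choices.
import numpy as np

def f(w):
    """Convex toy objective: squared distance from the all-ones vector."""
    return np.sum((w - 1.0) ** 2)

def grad_f(w):
    """Gradient of f: 2 * (w - 1)."""
    return 2.0 * (w - 1.0)

w = np.zeros(3)   # initial point
eta = 0.1         # step size (learning rate)
for t in range(100):
    w = w - eta * grad_f(w)   # gradient descent update: w <- w - eta * grad f(w)

print(f(w))  # approaches 0 as w converges to the minimizer (1, 1, 1)
```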

This course is co-taught with Prof. Daniel Soudry.

Meetings

There will be 12 meetings: the first 7-8 will be given by the lecturer (me), and the remaining meetings will be given by the course participants (see the note about working in pairs under the grade composition). Each student will present a paper and propose possible future directions. The paper presented in class and the proposed future directions will serve as the basis for the final project. Discussions will take place during all meetings, so attendance at the lectures is mandatory.

Books
Grade composition
  • Final exam: 40%
  • Assignment: 30%
  • Final project: 30%
