The Mathematics of AI


10 lectures

This series consists of 10 sessions of 2 hours each, devoted to the mathematics of machine learning. It covers the main concepts without delving into the details of the proofs. Clicking on the title of a session gives access to its transcript, and basic notes are available to guide you through the structure and progression of the content.

Course #1 - Smooth Optimization

Content:

  • Introduction and motivation
  • Gradients, Jacobians, Hessians
  • Gradient descent and acceleration (see the sketch after this list)
  • Stochastic Gradient Descent (SGD)
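
A minimal numpy sketch of gradient descent and its Nesterov-accelerated variant on a least-squares problem; the random data, the step size 1/L, and the iteration counts are illustrative choices, not course material.

    import numpy as np

    # Illustrative smooth objective: f(x) = 0.5 * ||A x - b||^2
    rng = np.random.default_rng(0)
    A = rng.standard_normal((50, 20))
    b = rng.standard_normal(50)

    def grad(x):
        return A.T @ (A @ x - b)

    L = np.linalg.norm(A, 2) ** 2   # Lipschitz constant of grad f
    tau = 1.0 / L                   # step size guaranteeing descent

    # Plain gradient descent: x_{k+1} = x_k - tau * grad(x_k)
    x = np.zeros(20)
    for _ in range(300):
        x = x - tau * grad(x)

    # Nesterov acceleration: take the gradient step at an extrapolated point
    x_acc, x_prev = np.zeros(20), np.zeros(20)
    for k in range(300):
        y = x_acc + (k / (k + 3)) * (x_acc - x_prev)   # momentum extrapolation
        x_prev = x_acc
        x_acc = y - tau * grad(y)

    print(np.linalg.norm(A @ x - b), np.linalg.norm(A @ x_acc - b))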

Materials:

Bibliography:

Course #2 - From Smooth to Non-Smooth Optimization

Content:

  • Convergence proofs for gradient descent and acceleration
  • Linear models and regularization
  • Ridge versus Lasso
  • The ISTA algorithm (see the sketch after this list)
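
A minimal sketch of ISTA for the Lasso, which minimizes 0.5 ||Ax - b||^2 + lam ||x||_1 by alternating a gradient step on the smooth part with soft-thresholding; the synthetic data and the value of lam are placeholders.

    import numpy as np

    def soft_threshold(v, t):
        # Proximal operator of t * ||.||_1: component-wise soft-thresholding
        return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

    def ista(A, b, lam, n_iter=500):
        # Minimize 0.5 * ||A x - b||^2 + lam * ||x||_1
        L = np.linalg.norm(A, 2) ** 2      # Lipschitz constant of the smooth part
        x = np.zeros(A.shape[1])
        for _ in range(n_iter):
            x = soft_threshold(x - A.T @ (A @ x - b) / L, lam / L)
        return x

    rng = np.random.default_rng(0)
    A = rng.standard_normal((40, 100))     # under-determined system
    x_true = np.zeros(100)
    x_true[:5] = 1.0
    b = A @ x_true
    print(ista(A, b, lam=0.1)[:8])         # approximately recovers the sparse signal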

Materials:

Bibliography:

Course #3 - Lasso and Compressed Sensing

Content:

  • Examples of non-smooth functionals (Lasso, TV regularization, constraints)
  • Subgradient and proximal operators
  • Forward-backward splitting, connection with FISTA (see the sketch after this list)
  • ADMM, Douglas-Rachford (DR), Primal-Dual
  • Compressive sensing theory
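
An illustrative sketch of forward-backward splitting with Nesterov momentum, i.e. FISTA, applied to the Lasso; soft_threshold is the l1 proximal operator from the ISTA sketch of course #2, and the problem data are placeholders.

    import numpy as np

    def soft_threshold(v, t):
        # l1 proximal operator (as in the ISTA sketch of course #2)
        return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

    def fista(A, b, lam, n_iter=300):
        # Forward-backward splitting with Nesterov momentum (FISTA)
        L = np.linalg.norm(A, 2) ** 2
        x = np.zeros(A.shape[1])
        y = x.copy()
        t = 1.0
        for _ in range(n_iter):
            x_next = soft_threshold(y - A.T @ (A @ y - b) / L, lam / L)  # forward-backward step
            t_next = (1.0 + np.sqrt(1.0 + 4.0 * t * t)) / 2.0            # momentum schedule
            y = x_next + ((t - 1.0) / t_next) * (x_next - x)             # extrapolation
            x, t = x_next, t_next
        return x

    rng = np.random.default_rng(0)
    A = rng.standard_normal((40, 100))
    b = A @ np.concatenate([np.ones(5), np.zeros(95)])
    print(np.count_nonzero(np.abs(fista(A, b, lam=0.1)) > 1e-3))  # few active coefficients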

Materials:

Bibliography:

  • S. Foucart and H. Rauhut, A Mathematical Introduction to Compressive Sensing (advanced)
  • S. Boyd and L. Vandenberghe, Convex Optimization
  • N. Parikh and S. Boyd, Proximal Algorithms

Course #4 - Kernel, Perceptron, CNN, and Transformers

Content:

  • Transition from ridge regression to kernels (see the sketch after this list)
  • Multilayer Perceptron (MLP)
  • Convolutional Neural Networks (CNN)
  • ResNet architecture
  • Transformer models
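
A minimal numpy sketch of kernel ridge regression, illustrating the transition from ridge regression to kernels; the Gaussian kernel bandwidth and the regularization weight are illustrative choices.

    import numpy as np

    def gaussian_kernel(X, Y, sigma=1.0):
        # K[i, j] = exp(-||x_i - y_j||^2 / (2 sigma^2))
        d2 = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(-1)
        return np.exp(-d2 / (2 * sigma ** 2))

    def kernel_ridge_fit(X, y, lam=1e-2, sigma=1.0):
        # Solve (K + lam * n * I) alpha = y; predict with f(x) = sum_i alpha_i k(x, x_i)
        n = len(X)
        K = gaussian_kernel(X, X, sigma)
        return np.linalg.solve(K + lam * n * np.eye(n), y)

    rng = np.random.default_rng(0)
    X = rng.uniform(-1, 1, (30, 1))
    y = np.sin(3 * X[:, 0])
    alpha = kernel_ridge_fit(X, y)
    X_test = np.linspace(-1, 1, 5)[:, None]
    print(gaussian_kernel(X_test, X) @ alpha)   # predictions at the test points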

Materials:

Bibliography:

Course #5 - Deep Learning: Theory and Numerics

Content:

  • Review of MLP and its variants (CNN, ResNet)
  • Theoretical framework of two-layer MLPs
  • Gradients and Jacobians in neural networks
  • Introduction to backpropagation (see the sketch after this list)
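
A minimal numpy sketch of the chain rule that backpropagation automates, written out by hand for a two-layer MLP on a toy regression batch; the layer sizes and the data are placeholders.

    import numpy as np

    rng = np.random.default_rng(0)
    X = rng.standard_normal((16, 4))            # batch of inputs
    y = rng.standard_normal((16, 1))            # regression targets
    W1, b1 = rng.standard_normal((4, 8)), np.zeros(8)
    W2, b2 = rng.standard_normal((8, 1)), np.zeros(1)

    # Forward pass: f(x) = W2^T relu(W1^T x + b1) + b2
    Z = X @ W1 + b1
    H = np.maximum(Z, 0.0)                      # ReLU
    out = H @ W2 + b2
    loss = 0.5 * np.mean((out - y) ** 2)

    # Backward pass: apply the chain rule layer by layer
    g_out = (out - y) / len(X)                  # dLoss / dout
    g_W2 = H.T @ g_out
    g_b2 = g_out.sum(0)
    g_H = g_out @ W2.T
    g_Z = g_H * (Z > 0)                         # ReLU derivative
    g_W1 = X.T @ g_Z
    g_b1 = g_Z.sum(0)

    print(loss, np.linalg.norm(g_W1), np.linalg.norm(g_W2))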

Materials:

Bibliography:

Course #6 - Differentiable Programming

Content:

  • Recap on gradients and Jacobians
  • Forward and reverse mode automatic differentiation
  • Introduction to PyTorch (see the sketch after this list)
  • The adjoint method in computational mathematics
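
A minimal PyTorch sketch contrasting reverse-mode AD (one backward pass, gradients with respect to all parameters) with forward-mode AD (one Jacobian-vector product per sweep); the scalar function is a toy example, and torch.func assumes PyTorch >= 2.0.

    import torch
    from torch.func import jvp   # forward-mode AD (PyTorch >= 2.0)

    x = torch.linspace(-1.0, 1.0, 10)
    w = torch.tensor(2.0, requires_grad=True)
    b = torch.tensor(0.5, requires_grad=True)

    # Reverse mode: one backward pass yields gradients w.r.t. all parameters
    loss = ((w * x + b).tanh() ** 2).mean()
    loss.backward()
    print(w.grad, b.grad)        # dloss/dw and dloss/db

    # Forward mode: one sweep yields a Jacobian-vector product
    f = lambda w_: ((w_ * x + 0.5).tanh() ** 2).mean()
    value, dvalue = jvp(f, (torch.tensor(2.0),), (torch.tensor(1.0),))
    print(dvalue)                # matches w.grad above, since b = 0.5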

Materials:

Bibliography:

Course #7 - Sampling and Diffusion Models

Content:

  • Refresher on Stochastic Gradient Descent (SGD)
  • Introduction to Langevin dynamics (see the sketch after this list)
  • Overview of diffusion models
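
A minimal sketch of the unadjusted Langevin algorithm, which samples from p(x) proportional to exp(-U(x)) by following the score and injecting Gaussian noise; the standard Gaussian target and the step size are illustrative choices.

    import numpy as np

    # Target: standard Gaussian, U(x) = 0.5 * x^2, so the score is -x
    def score(x):
        return -x

    rng = np.random.default_rng(0)
    step = 1e-2
    x = rng.standard_normal(1000)              # 1000 chains run in parallel
    for _ in range(2000):
        # Langevin update: x <- x + step * score(x) + sqrt(2 * step) * noise
        x = x + step * score(x) + np.sqrt(2 * step) * rng.standard_normal(x.shape)

    print(x.mean(), x.std())                   # approximately 0 and 1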

Materials:
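
Bibliography: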

Course #8 - LLM and Generative AI

Content:

  • Overview of generative modeling concepts
  • Introduction to generative models (VAE, GAN, U-Net, diffusion)
  • Self-supervised learning and next-token prediction
  • Tokenizers
  • Transformer architectures, FlashAttention (see the sketch after this list)
  • State space models
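
A minimal numpy sketch of scaled dot-product attention, the kernel at the heart of the Transformer; FlashAttention computes the same quantity blockwise for memory efficiency. The shapes and data here are illustrative.

    import numpy as np

    def attention(Q, K, V):
        # softmax(Q K^T / sqrt(d)) V, with a row-wise softmax
        d = Q.shape[-1]
        scores = Q @ K.T / np.sqrt(d)
        scores -= scores.max(axis=-1, keepdims=True)     # numerical stability
        weights = np.exp(scores)
        weights /= weights.sum(axis=-1, keepdims=True)
        return weights @ V

    rng = np.random.default_rng(0)
    Q, K, V = (rng.standard_normal((6, 8)) for _ in range(3))
    print(attention(Q, K, V).shape)                      # (6, 8)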

Materials:

Bibliography:

Course #9 - Generative Models

Content:

  • Generative models as density fitting techniques
  • Basics of maximum likelihood estimation and f-divergences
  • Gaussian mixtures and the Expectation-Maximization (EM) algorithm (see the sketch after this list)
  • Variational Autoencoders (VAE)
  • Introduction to Normalizing Flows
  • Generative Adversarial Networks (GANs), Wasserstein GANs (WGANs)
  • Diffusion models
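
A minimal sketch of the EM algorithm for a two-component 1-D Gaussian mixture; the synthetic data and the initialization are placeholder choices.

    import numpy as np

    rng = np.random.default_rng(0)
    x = np.concatenate([rng.normal(-2, 1, 300), rng.normal(3, 1, 300)])

    # Parameters: mixture weights pi, means mu, standard deviations sd
    pi, mu, sd = np.array([0.5, 0.5]), np.array([-1.0, 1.0]), np.array([1.0, 1.0])

    def gauss(x, m, s):
        return np.exp(-0.5 * ((x - m) / s) ** 2) / (s * np.sqrt(2 * np.pi))

    for _ in range(50):
        # E-step: posterior responsibility of each component for each point
        r = pi[None, :] * gauss(x[:, None], mu[None, :], sd[None, :])
        r /= r.sum(axis=1, keepdims=True)
        # M-step: weighted maximum-likelihood updates
        n_k = r.sum(axis=0)
        pi = n_k / len(x)
        mu = (r * x[:, None]).sum(axis=0) / n_k
        sd = np.sqrt((r * (x[:, None] - mu[None, :]) ** 2).sum(axis=0) / n_k)

    print(pi, mu, sd)   # close to the generating parameters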

Materials:

Bibliography:

Course #10 - Optimal Transport

Content:

  • Introduction to the Monge and Kantorovich formulations
  • The Sinkhorn algorithm (see the sketch after this list)
  • Training of generative models
  • Duality and Wasserstein GANs
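
A minimal sketch of the Sinkhorn algorithm for entropically regularized optimal transport between two discrete measures; the point clouds, the cost, and the regularization epsilon are illustrative choices.

    import numpy as np

    def sinkhorn(a, b, C, eps=0.1, n_iter=200):
        # Entropic OT: alternate scalings so diag(u) K diag(v) has marginals a and b
        K = np.exp(-C / eps)                   # Gibbs kernel
        u = np.ones_like(a)
        for _ in range(n_iter):
            v = b / (K.T @ u)
            u = a / (K @ v)
        return u[:, None] * K * v[None, :]     # transport plan

    rng = np.random.default_rng(0)
    x, y = rng.standard_normal(5), rng.standard_normal(7)
    C = (x[:, None] - y[None, :]) ** 2         # squared-distance cost
    a, b = np.full(5, 1 / 5), np.full(7, 1 / 7)
    P = sinkhorn(a, b, C)
    print(P.sum(axis=1), P.sum(axis=0))        # marginals match a and b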

Materials:

Bibliography: