CSE 535: Theory of Optimization and Continuous Algorithms
The design of algorithms is traditionally a discrete endeavor. However, many advances have come from a continuous viewpoint. Typically, a continuous process, deterministic or randomized, is designed (or shown) to have desirable properties, such as approaching an optimal solution or a desired distribution, and an algorithm is derived from it by an appropriate discretization (a minimal sketch of this discretization theme follows the list below). In many interesting and general settings, the current fastest methods are a consequence of this perspective. We will discuss several examples of algorithms for high-dimensional optimization and sampling, and use them to understand the following concepts in detail.
- Elimination
- Reduction
- Geometrization
- Acceleration
- Sparsification
- Decomposition
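As a minimal illustration of the continuous-to-discrete theme (not part of the course materials), the sketch below applies a forward-Euler step of size h to the gradient flow dx/dt = -∇f(x), which recovers the gradient descent iterate x_{k+1} = x_k - h ∇f(x_k). The quadratic objective and step size are illustrative assumptions.

```python
import numpy as np

# Gradient flow dx/dt = -grad f(x), discretized by forward Euler with step h,
# gives gradient descent: x_{k+1} = x_k - h * grad f(x_k).
# Illustrative objective: f(x) = 0.5 * x^T A x - b^T x with A positive definite.

A = np.array([[3.0, 1.0], [1.0, 2.0]])
b = np.array([1.0, 1.0])

def grad_f(x):
    return A @ x - b

x = np.zeros(2)
h = 0.1  # step size (assumed; needs h < 2 / lambda_max(A) for convergence here)
for _ in range(200):
    x = x - h * grad_f(x)

# The discretized flow approaches the minimizer A^{-1} b.
print(x, np.linalg.solve(A, b))
```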
This course is offered concurrently at Georgia Tech by Santosh Vempala.
Administrative Information:
- Instructor: Yin Tat Lee
- Office Hours: By appointment, email me at yintat@uw.edu
- TA Office hours: Fri 2:45-4:00, CSE 021
- Lectures: Tue, Thu 10:00-11:20 at ARC G070
- Lecture Notes: https://github.com/YinTat/optimizationbook/blob/main/main.pdf
- Course evaluation: Homework (100%)
- Prerequisites: basic knowledge of algorithms, probability, and linear algebra.
Assignments
Assignments are submitted via Gradescope. Check your email for details.
Tentative Schedule:
Introduction
- Jan 08: Gradient Descent (1.1, 1.2, 1.4)
- Jan 10: Langevin Dynamics (1.3, 1.5, 1.6)
Elimination
- Jan 15: Cutting Plane Methods (2.1, 2.2, 2.3, 2.4)
- Jan 17: Sphere and Parabola Method (2.6, 2.7)
Reduction
- Jan 22: Equivalences (3.1, 3.2, 3.3)
- Jan 24: Equivalences (3.4)
- Jan 29: Duality (3.5)
Geometrization (Optimization)
- Jan 31: Mirror Descent (4.1)
- Jan 31: Frank-Wolfe (4.2)
- Feb 07: Newton Method & L-BFGS (4.3)
- Feb 19: Interior Point Method (4.4)
- Feb 21: Survey on Solving Linear Systems (not in the lecture notes)
Geometrization (Sampling)
- Feb 14: Ball walk & Isoperimetry (5.1, 5.2)
- Feb 26: Ball walk (5.3, 5.4)
- Feb 28: HMC (5.5, 5.6, 5.8)
Sparsification
- Mar 5: Leverage Score Sampling (6.1, 6.2)
- Mar 7: Stochastic Gradient Descent & Variance Reduction (6.3, 6.4)
Acceleration
- Mar 12: Conjugate Gradient & Chebyshev Expansion (8.1, 8.2)
- Mar 14: Accelerated Gradient Descent (8.3)