Talk Keyword Index

This page indexes talks by author-provided keywords.

A
accelerated gradient descent	Accelerated Gradient Descent Escapes Saddle Points Faster than Gradient Descent
Accelerated Methods	Underdamped Langevin MCMC: A non-asymptotic analysis
Accelerated Stochastic Gradient Descent	Accelerating Stochastic Gradient Descent for Least Squares Regression
Acceleration	Accelerating Stochastic Gradient Descent for Least Squares Regression
active learning	Actively Avoiding Nonsense in Generative Models
	Efficient Active Learning of Sparse Halfspaces
Adaptive data analysis	Calibrating Noise to Variance in Adaptive Data Analysis
adaptive methods	The Many Faces of Exponential Weights in Online Learning
adaptive regret bounds	More Adaptive Algorithms for Adversarial Bandits
adaptivity	Adaptivity to Smoothness in X-armed bandits
	Minimax Bounds on Stochastic Batched Convex Optimization
adversarial and stochastic rewards	Best of both worlds: Stochastic & adversarial best-arm identification
Adversarial Multi-armed bandits	Online learning over a finite action set with limited switching
agnostic learning	Active Tolerant Testing
algorithm estimation	Active Tolerant Testing
Algorithm-dependent Generalization Error Bounds	Generalization Bounds of SGLD for Non-convex Learning: Two Theoretical Viewpoints
Algorithmic stability	Calibrating Noise to Variance in Adaptive Data Analysis
analysis of heuristics	An Analysis of the t-SNE Algorithm for Data Visualization
approximate message-passing algorithm	Optimal Errors and Phase Transitions in High-Dimensional Generalized Linear Models
approximation	Approximation beats concentration? An approximation view on inference with smooth radial kernels
	A Faster Approximation Algorithm for the Gibbs Partition Function
approximation algorithms	An Optimal Learning Algorithm for Online Unconstrained Submodular Maximization
approximation rate	Optimal approximation of continuous functions by very deep ReLU networks
assignment problem	An explicit analysis of the entropic penalty in linear programming
Attribute Efficient	Efficient Active Learning of Sparse Halfspaces
Autoregressive processes	Learning Without Mixing: Towards A Sharp Analysis of Linear System Identification
average-case complexity	Reducibility and Computational Lower Bounds for Problems with Planted Sparse Structure
B
Banach space	Black-Box Reductions for Parameter-free Online Learning in Banach Spaces
Bandit convex optimization	Nonstochastic Bandits with Composite Anonymous Feedback
bandit feedback	Online Variance Reduction for Stochastic Optimization
bandit linear optimization	The Many Faces of Exponential Weights in Online Learning
Bandits	Information Directed Sampling and Bandits with Heteroscedastic Noise
bandits with infinitely many arms	Adaptivity to Smoothness in X-armed bandits
Bayesian inference	Underdamped Langevin MCMC: A non-asymptotic analysis
	Optimal Errors and Phase Transitions in High-Dimensional Generalized Linear Models
Bayesian regret	The Externalities of Exploration and How Data Diversity Helps Exploitation
best-arm identification	Best of both worlds: Stochastic & adversarial best-arm identification
bias	Sampling as optimization in the space of measures: The Langevin dynamics as a composite optimization problem
Big data	Testing Symmetric Markov Chains From a Single Trajectory
Binary classification	Exponential convergence of testing error for stochastic gradient methods
bisection algorithm	Private Sequential Learning
boosting	Logistic Regression: The Importance of Being Improper
Burer Monteiro	Smoothed analysis for low-rank solutions to semidefinite programs in quadratic penalty form
C
chaining	Learning Patterns for Detection with Multiscale Scan Statistics
Cohort Analysis	A Data Prism: Semi-verified learning in the small-alpha regime
communication constraints	Detecting Correlations with Little Memory and Communication
Complexity Theory	Hardness of Learning Noisy Halfspaces using Polynomial Thresholds
Composite losses	Nonstochastic Bandits with Composite Anonymous Feedback
compressed sensing	Global Guarantees for Enforcing Deep Generative Priors by Empirical Risk
Computational Complexity	Learning Mixtures of Linear Regressions with Nearly Optimal Complexity
Computational methods	Action-Constrained Markov Decision Processes With Kullback-Leibler Cost
concentration inequalities	Empirical bounds for functions with weak interactions
Conformal inference	Exact and Robust Conformal Inference Methods for Predictive Machine Learning With Dependent Data
contextual bandits	Efficient Contextual Bandits in Non-stationary Worlds
	The Externalities of Exploration and How Data Diversity Helps Exploitation
convergence	Log-concave sampling: Metropolis-Hastings algorithms are fast!
Convergence Rate	Finite Sample Analysis of Two-Timescale Stochastic Approximation with Applications to Reinforcement Learning
Convex optimization	Efficient Convex Optimization with Membership Oracles
	Black-Box Reductions for Parameter-free Online Learning in Banach Spaces
	Lower Bounds for Higher-Order Convex Optimization
	Minimax Bounds on Stochastic Batched Convex Optimization
convexity	Sampling as optimization in the space of measures: The Langevin dynamics as a composite optimization problem
crowd-labeling	Breaking the $1/\sqrt{n}$ Barrier: Faster Rates for Permutation-based Models in Polynomial Time
cumulative regret	Adaptivity to Smoothness in X-armed bandits
cutting plane methods	Cutting plane methods can be extended into nonconvex optimization
D
data visualization	An Analysis of the t-SNE Algorithm for Data Visualization
Deep Learning	Size-Independent Sample Complexity of Neural Networks
	Global Guarantees for Enforcing Deep Generative Priors by Empirical Risk
deep priors	Global Guarantees for Enforcing Deep Generative Priors by Empirical Risk
Delayed feedback	Nonstochastic Bandits with Composite Anonymous Feedback
Deletion channel	Subpolynomial trace reconstruction for random strings and arbitrary deletion probability
density estimation	Near-Optimal Sample Complexity Bounds for Maximum Likelihood Estimation of Multivariate Log-concave Densities
	Fast and Sample Near-Optimal Algorithms for Learning Multidimensional Histograms
dependent data	Exact and Robust Conformal Inference Methods for Predictive Machine Learning With Dependent Data
detecting correlations	Detecting Correlations with Little Memory and Communication
Detection	Learning Patterns for Detection with Multiscale Scan Statistics
Differential privacy	Privacy-preserving Prediction
	Calibrating Noise to Variance in Adaptive Data Analysis
dimension reduction	Approximate Nearest Neighbors in Limited Space
	Polynomial Time and Sample Complexity for Non-Gaussian Component Analysis: Spectral Methods
Direct Sum	A Direct Sum for Information Learners
distance estimation	Approximate Nearest Neighbors in Limited Space
distance sketches	Approximate Nearest Neighbors in Limited Space
Distributed estimation	Geometric Lower Bounds for Distributed Parameter Estimation under Communication Constraints
Distribution Estimation	Local moment matching: A unified methodology for symmetric functional estimation and distribution estimation under Wasserstein distance
dynamic regret	Efficient Contextual Bandits in Non-stationary Worlds
E
Efficient	Efficient Active Learning of Sparse Halfspaces
Empirical process theory	Learning Without Mixing: Towards A Sharp Analysis of Linear System Identification
empirical risk minimization	Online Variance Reduction for Stochastic Optimization
entropic penalization	An explicit analysis of the entropic penalty in linear programming
entropy	Sampling as optimization in the space of measures: The Langevin dynamics as a composite optimization problem
Erdos Renyi Model	Optimal Single Sample Tests for Structured versus Unstructured Network Data
exponential weights	The Many Faces of Exponential Weights in Online Learning
F
fairness	The Externalities of Exploration and How Data Diversity Helps Exploitation
	Unleashing Linear Optimizers for Group-Fair Learning and Optimization
Fano's inequality	Geometric Lower Bounds for Distributed Parameter Estimation under Communication Constraints
fast rates	Faster Rates for Convex-Concave Games
feedback graphs	Small-loss bounds for online learning with partial information
Finite Sample Analysis	Finite Sample Analysis of Two-Timescale Stochastic Approximation with Applications to Reinforcement Learning
finite sample bounds	Empirical bounds for functions with weak interactions
	A Finite Time Analysis of Temporal Difference Learning With Linear Function Approximation
first-order methods	An Estimate Sequence for Geodesically Convex Optimization
Fokker--Planck	Langevin Monte Carlo and JKO splitting
Frank-Wolfe	Faster Rates for Convex-Concave Games
Free energy	The Vertex Sample Complexity of Free Energy is Polynomial
free probability	Fundamental Limits of Weak Recovery with Applications to Phase Retrieval
Functional Estimation	Local moment matching: A unified methodology for symmetric functional estimation and distribution estimation under Wasserstein distance
G
Gaussian	Sampling as optimization in the space of measures: The Langevin dynamics as a composite optimization problem
Generalization	Privacy-preserving Prediction
	Calibrating Noise to Variance in Adaptive Data Analysis
Generalization theory	Algorithmic Regularization in Over-parameterized Matrix Sensing and Neural Networks with Quadratic Activations
generalized linear model	Optimal Errors and Phase Transitions in High-Dimensional Generalized Linear Models
generative models	Actively Avoiding Nonsense in Generative Models
	Global Guarantees for Enforcing Deep Generative Priors by Empirical Risk
geodesically convex optimization	An Estimate Sequence for Geodesically Convex Optimization
Gibbs distribution	A Faster Approximation Algorithm for the Gibbs Partition Function
gradient descent	An Analysis of the t-SNE Algorithm for Data Visualization
gradient flow	Sampling as optimization in the space of measures: The Langevin dynamics as a composite optimization problem
Gradient oracle	Efficient Convex Optimization with Membership Oracles
Gradient Temporal Difference	Finite Sample Analysis of Two-Timescale Stochastic Approximation with Applications to Reinforcement Learning
graph homomorphism numbers	Counting Motifs with Graph Sampling
graph sampling	Counting Motifs with Graph Sampling
greedy algorithm	The Externalities of Exploration and How Data Diversity Helps Exploitation
groups	Exact and Robust Conformal Inference Methods for Predictive Machine Learning With Dependent Data
H
Halfspaces	Hardness of Learning Noisy Halfspaces using Polynomial Thresholds
Hausdorff distance	Fitting a putative manifold to noisy data
heat equation	Sampling as optimization in the space of measures: The Langevin dynamics as a composite optimization problem
Heteroscedastic Noise	Information Directed Sampling and Bandits with Heteroscedastic Noise
High Dimensional Statistics	Optimal Single Sample Tests for Structured versus Unstructured Network Data
High-dimensional geometry	Geometric Lower Bounds for Distributed Parameter Estimation under Communication Constraints
high-dimensional inference	Optimal Errors and Phase Transitions in High-Dimensional Generalized Linear Models
High-dimensional Statistics	Restricted Eigenvalue from Stable Rank with Applications to Sparse Linear Regression
high-dimensionality	Detection limits in the high-dimensional spiked rectangular model
Higher-Order Optimization	Lower Bounds for Higher-Order Convex Optimization
histogram learning	Fast and Sample Near-Optimal Algorithms for Learning Multidimensional Histograms
Horvitz-Thompson estimator	Counting Motifs with Graph Sampling
hypothesis testing	Detection limits in the high-dimensional spiked rectangular model
	Optimal Single Sample Tests for Structured versus Unstructured Network Data
I
Implicit regularization	Algorithmic Regularization in Over-parameterized Matrix Sensing and Neural Networks with Quadratic Activations
importance sampling	Online Variance Reduction for Stochastic Optimization
Incentivizing Exploration	Incentivizing Exploration by Heterogeneous Users
increasing learning rate	More Adaptive Algorithms for Adversarial Bandits
independent component analysis	Polynomial Time and Sample Complexity for Non-Gaussian Component Analysis: Spectral Methods
Information Directed Sampling	Information Directed Sampling and Bandits with Heteroscedastic Noise
Information Theory	A Direct Sum for Information Learners
integer programming	Hidden Integrality of SDP Relaxations for Sub-Gaussian Mixture Models
Ising Model	Optimal Single Sample Tests for Structured versus Unstructured Network Data
Ising models	The Mean-Field Approximation: Information Inequalities, Algorithms, and Complexity
K
k-PCA	Averaged Stochastic Gradient Descent on Riemannian Manifolds
kernel methods	Approximation beats concentration? An approximation view on inference with smooth radial kernels
Kullback-Leibler divergence	Action-Constrained Markov Decision Processes With Kullback-Leibler Cost
L
L1 regression	L1 Regression using Lewis Weights Preconditioning and Stochastic Gradient Descent
Langevin algorithm	Local Optimality and Generalization Guarantees for the Langevin Algorithm via Empirical Metastability
Langevin Diffusion	Underdamped Langevin MCMC: A non-asymptotic analysis
Langevin dynamics	Sampling as optimization in the space of measures: The Langevin dynamics as a composite optimization problem
Langevin Monte Carlo	Langevin Monte Carlo and JKO splitting
Lasso	Restricted Eigenvalue from Stable Rank with Applications to Sparse Linear Regression
learning	Unleashing Linear Optimizers for Group-Fair Learning and Optimization
Learning theory	Testing Symmetric Markov Chains From a Single Trajectory
	Hardness of Learning Noisy Halfspaces using Polynomial Thresholds
least squares	Iterate Averaging as Regularization for Stochastic Gradient Descent
Least Squares Regression	Accelerating Stochastic Gradient Descent for Least Squares Regression
Lewis weights	L1 Regression using Lewis Weights Preconditioning and Stochastic Gradient Descent
likelihood ratio fluctuations	Detection limits in the high-dimensional spiked rectangular model
Linear dynamical systems	Learning Without Mixing: Towards A Sharp Analysis of Linear System Identification
Lipschitz continuity	Learning from Unreliable Datasets
local minima	Smoothed analysis for low-rank solutions to semidefinite programs in quadratic penalty form
	Cutting plane methods can be extended into nonconvex optimization
log-concave	Near-Optimal Sample Complexity Bounds for Maximum Likelihood Estimation of Multivariate Log-concave Densities
low rank	Smoothed analysis for low-rank solutions to semidefinite programs in quadratic penalty form
Lower bounds	Detecting Correlations with Little Memory and Communication
	Time-Space Tradeoffs for Learning Finite Functions from Random Tests, with Applications to Polynomials
	Lower Bounds for Higher-Order Convex Optimization
M
Manifold Learning	Fitting a putative manifold to noisy data
Margin condition	Exponential convergence of testing error for stochastic gradient methods
Markov chain Monte Carlo	Underdamped Langevin MCMC: A non-asymptotic analysis
Markov decision processes	Action-Constrained Markov Decision Processes With Kullback-Leibler Cost
Markov random fields	The Mean-Field Approximation: Information Inequalities, Algorithms, and Complexity
	The Vertex Sample Complexity of Free Energy is Polynomial
martingales	Online Learning: Sufficient Statistics and the Burkholder Method
Matrix Completion	Non-Convex Matrix Completion Against a Semi-Random Adversary
Matrix factorization	Algorithmic Regularization in Over-parameterized Matrix Sensing and Neural Networks with Quadratic Activations
maximum likelihood estimation	Near-Optimal Sample Complexity Bounds for Maximum Likelihood Estimation of Multivariate Log-concave Densities
MCMC	Log-concave sampling: Metropolis-Hastings algorithms are fast!
Mean-field approximation	The Mean-Field Approximation: Information Inequalities, Algorithms, and Complexity
Membership oracle	Efficient Convex Optimization with Membership Oracles
memory constraints	Detecting Correlations with Little Memory and Communication
Memory-bounded learning	Time-Space Tradeoffs for Learning Finite Functions from Random Tests, with Applications to Polynomials
metastability	Local Optimality and Generalization Guarantees for the Langevin Algorithm via Empirical Metastability
metric compression	Approximate Nearest Neighbors in Limited Space
Metropolis-adjusted Langevin algorithm	Log-concave sampling: Metropolis-Hastings algorithms are fast!
Minimax lower bound	Geometric Lower Bounds for Distributed Parameter Estimation under Communication Constraints
minimax lower bounds	Counting Motifs with Graph Sampling
minimax rates	Adaptivity to Smoothness in X-armed bandits
Minimax Risk	Local moment matching: A unified methodology for symmetric functional estimation and distribution estimation under Wasserstein distance
Minimaxity	Minimax Bounds on Stochastic Batched Convex Optimization
Mixtures of Gaussians	Learning Mixtures of Linear Regressions with Nearly Optimal Complexity
Mixtures of Linear Regressions	Learning Mixtures of Linear Regressions with Nearly Optimal Complexity
moments	Polynomial Time and Sample Complexity for Non-Gaussian Component Analysis: Spectral Methods
multi-armed bandit	A General Approach to Multi-Armed Bandits Under Risk Criteria
	More Adaptive Algorithms for Adversarial Bandits
multi-armed bandits	Incentivizing Exploration by Heterogeneous Users
	Best of both worlds: Stochastic & adversarial best-arm identification
Multi-index models	Learning Single Index Models in Gaussian Space
multiscale scan statistics	Learning Patterns for Detection with Multiscale Scan Statistics
mutual information	Fundamental Limits of Weak Recovery with Applications to Phase Retrieval
N
nasty noise	Efficient Algorithms for Outlier-Robust Regression
nearest neighbor	Approximate Nearest Neighbors in Limited Space
Nearest-neighbors	Marginal Singularity, and the Benefits of Labels in Covariate-Shift
Nesterov's accelerated gradient method	An Estimate Sequence for Geodesically Convex Optimization
Network Data	Optimal Single Sample Tests for Structured versus Unstructured Network Data
network motifs	Counting Motifs with Graph Sampling
neural network	Optimal approximation of continuous functions by very deep ReLU networks
Neural networks	Algorithmic Regularization in Over-parameterized Matrix Sensing and Neural Networks with Quadratic Activations
	Size-Independent Sample Complexity of Neural Networks
Newton's method	Lower Bounds for Higher-Order Convex Optimization
Noise addition	Calibrating Noise to Variance in Adaptive Data Analysis
non-convex	Smoothed analysis for low-rank solutions to semidefinite programs in quadratic penalty form
Non-convex Learning	Generalization Bounds of SGLD for Non-convex Learning: Two Theoretical Viewpoints
non-convex optimization	Algorithmic Regularization in Over-parameterized Matrix Sensing and Neural Networks with Quadratic Activations
	Local Optimality and Generalization Guarantees for the Langevin Algorithm via Empirical Metastability
	Non-Convex Matrix Completion Against a Semi-Random Adversary
	Learning Single Index Models in Gaussian Space
	An Analysis of the t-SNE Algorithm for Data Visualization
non-Gaussian component analysis	Polynomial Time and Sample Complexity for Non-Gaussian Component Analysis: Spectral Methods
non-stationary environments	Efficient Contextual Bandits in Non-stationary Worlds
non-stochastic bandits	Small-loss bounds for online learning with partial information
nonconvex	Cutting plane methods can be extended into nonconvex optimization
Nonconvex optimization	Accelerated Gradient Descent Escapes Saddle Points Faster than Gradient Descent
	Convex Optimization with Unbounded Nonconvex Oracles Using Simulated Annealing
nonlinear optimization	An Estimate Sequence for Geodesically Convex Optimization
Nonparametric classification	Marginal Singularity, and the Benefits of Labels in Covariate-Shift
Nonstochastic bandits	Nonstochastic Bandits with Composite Anonymous Feedback
O
On-line learning	Time-Space Tradeoffs for Learning Finite Functions from Random Tests, with Applications to Polynomials
online algorithms	Unleashing Linear Optimizers for Group-Fair Learning and Optimization
Online Combinatorial Optimization	Online learning over a finite action set with limited switching
Online Convex Optimization	The Many Faces of Exponential Weights in Online Learning
	Smoothed Online Convex Optimization in High Dimensions via Online Balanced Descent
Online learning	An Optimal Learning Algorithm for Online Unconstrained Submodular Maximization
	Faster Rates for Convex-Concave Games
	Small-loss bounds for online learning with partial information
	Logistic Regression: The Importance of Being Improper
	Black-Box Reductions for Parameter-free Online Learning in Banach Spaces
	Online learning over a finite action set with limited switching
	Smoothed Online Convex Optimization in High Dimensions via Online Balanced Descent
	Online Learning: Sufficient Statistics and the Burkholder Method
Online Linear Optimization	Online learning over a finite action set with limited switching
operator splitting	Langevin Monte Carlo and JKO splitting
optimal transport	An explicit analysis of the entropic penalty in linear programming
optimistic online mirror descent	More Adaptive Algorithms for Adversarial Bandits
optimization	Cutting plane methods can be extended into nonconvex optimization
	Averaged Stochastic Gradient Descent on Riemannian Manifolds
	Sampling as optimization in the space of measures: The Langevin dynamics as a composite optimization problem
	Unleashing Linear Optimizers for Group-Fair Learning and Optimization
Ornstein-Uhlenbeck	Sampling as optimization in the space of measures: The Langevin dynamics as a composite optimization problem
outlier-robust learning	Efficient Algorithms for Outlier-Robust Regression
Over-parameterized models	Algorithmic Regularization in Over-parameterized Matrix Sensing and Neural Networks with Quadratic Activations
P
PAC Learning	A Direct Sum for Information Learners
PAC-Bayes Theory	Generalization Bounds of SGLD for Non-convex Learning: Two Theoretical Viewpoints
pairwise comparisons	Breaking the $1/\sqrt{n}$ Barrier: Faster Rates for Permutation-based Models in Polynomial Time
Parameter-free	Black-Box Reductions for Parameter-free Online Learning in Banach Spaces
Parameterized family of MDPs	Action-Constrained Markov Decision Processes With Kullback-Leibler Cost
partition function	A Faster Approximation Algorithm for the Gibbs Partition Function
perceptron	Optimal Errors and Phase Transitions in High-Dimensional Generalized Linear Models
permutation and randomization	Exact and Robust Conformal Inference Methods for Predictive Machine Learning With Dependent Data
permutation-based models	Breaking the $1/\sqrt{n}$ Barrier: Faster Rates for Permutation-based Models in Polynomial Time
phase retrieval	Fundamental Limits of Weak Recovery with Applications to Phase Retrieval
phase transition	Fundamental Limits of Weak Recovery with Applications to Phase Retrieval
phase transitions	Optimal Errors and Phase Transitions in High-Dimensional Generalized Linear Models
planning	A General Approach to Multi-Armed Bandits Under Risk Criteria
planted clique	Reducibility and Computational Lower Bounds for Problems with Planted Sparse Structure
Polynomial Threshold Functions	Hardness of Learning Noisy Halfspaces using Polynomial Thresholds
Positive-definite kernels	Exponential convergence of testing error for stochastic gradient methods
Prediction from Experts	Online learning over a finite action set with limited switching
prediction with expert advice	The Many Faces of Exponential Weights in Online Learning
principal component analysis	Polynomial Time and Sample Complexity for Non-Gaussian Component Analysis: Spectral Methods
privacy	Private Sequential Learning
probability	Online Learning: Sufficient Statistics and the Burkholder Method
projection	Unleashing Linear Optimizers for Group-Fair Learning and Optimization
Property testing	Testing Symmetric Markov Chains From a Single Trajectory
	Active Tolerant Testing
	The Vertex Sample Complexity of Free Energy is Polynomial
proximal operators	Langevin Monte Carlo and JKO splitting
Q
quantization	Approximate Nearest Neighbors in Limited Space
R
Rademacher Complexity	Size-Independent Sample Complexity of Neural Networks
random walk	Log-concave sampling: Metropolis-Hastings algorithms are fast!
Randomized algorithms	Testing Symmetric Markov Chains From a Single Trajectory
ranking	Breaking the $1/\sqrt{n}$ Barrier: Faster Rates for Permutation-based Models in Polynomial Time
Reach	Fitting a putative manifold to noisy data
Reductions	Black-Box Reductions for Parameter-free Online Learning in Banach Spaces
regression	Efficient Algorithms for Outlier-Robust Regression
Regret	Information Directed Sampling and Bandits with Heteroscedastic Noise
regularization	Iterate Averaging as Regularization for Stochastic Gradient Descent
reinforcement learning	A Finite Time Analysis of Temporal Difference Learning With Linear Function Approximation
	A General Approach to Multi-Armed Bandits Under Risk Criteria
	Finite Sample Analysis of Two-Timescale Stochastic Approximation with Applications to Reinforcement Learning
ReLU activation	Optimal approximation of continuous functions by very deep ReLU networks
replica symmetry	Detection limits in the high-dimensional spiked rectangular model
representation learning	Learning Patterns for Detection with Multiscale Scan Statistics
Restricted Eigenvalue	Restricted Eigenvalue from Stable Rank with Applications to Sparse Linear Regression
Riemannian Manifold	Averaged Stochastic Gradient Descent on Riemannian Manifolds
Riemannian optimization	An Estimate Sequence for Geodesically Convex Optimization
risk	A General Approach to Multi-Armed Bandits Under Risk Criteria
Robust Learning	A Data Prism: Semi-verified learning in the small-alpha regime
S
saddle points	Accelerated Gradient Descent Escapes Saddle Points Faster than Gradient Descent
Sample complexity	Privacy-preserving Prediction
	Size-Independent Sample Complexity of Neural Networks
	Learning Mixtures of Linear Regressions with Nearly Optimal Complexity
	Subpolynomial trace reconstruction for random strings and arbitrary deletion probability
	The Vertex Sample Complexity of Free Energy is Polynomial
sampling	Log-concave sampling: Metropolis-Hastings algorithms are fast!
	Sampling as optimization in the space of measures: The Langevin dynamics as a composite optimization problem
SDP	Smoothed analysis for low-rank solutions to semidefinite programs in quadratic penalty form
second moment method	Fundamental Limits of Weak Recovery with Applications to Phase Retrieval
Second-Order Optimization	Lower Bounds for Higher-Order Convex Optimization
semi-bandit	More Adaptive Algorithms for Adversarial Bandits
Semi-parametric models	Learning Single Index Models in Gaussian Space
Semi-Random Model	Non-Convex Matrix Completion Against a Semi-Random Adversary
Semi-Verified Learning	A Data Prism: Semi-verified learning in the small-alpha regime
semidefinite	Smoothed analysis for low-rank solutions to semidefinite programs in quadratic penalty form
semidefinite programming	Hidden Integrality of SDP Relaxations for Sub-Gaussian Mixture Models
sequential learning	Private Sequential Learning
SGD	Accelerating Stochastic Gradient Descent for Least Squares Regression
	Exponential convergence of testing error for stochastic gradient methods
shape-constrained estimation	Breaking the $1/\sqrt{n}$ Barrier: Faster Rates for Permutation-based Models in Polynomial Time
simple regret	Adaptivity to Smoothness in X-armed bandits
Simulated annealing	Convex Optimization with Unbounded Nonconvex Oracles Using Simulated Annealing
Single-index models	Learning Single Index Models in Gaussian Space
Sinkhorn algorithm	An explicit analysis of the entropic penalty in linear programming
small-loss regret bounds	Small-loss bounds for online learning with partial information
smoothed analysis	The Externalities of Exploration and How Data Diversity Helps Exploitation
smoothness	Adaptivity to Smoothness in X-armed bandits
Social Learning	Incentivizing Exploration by Heterogeneous Users
Sparse	Efficient Active Learning of Sparse Halfspaces
Sparse Linear Regression	Restricted Eigenvalue from Stable Rank with Applications to Sparse Linear Regression
spectral analysis	Approximation beats concentration? An approximation view on inference with smooth radial kernels
Spectral initialization	Fundamental Limits of Weak Recovery with Applications to Phase Retrieval
Spiked random matrix models	Detection limits in the high-dimensional spiked rectangular model
spin glasses	Detection limits in the high-dimensional spiked rectangular model
Stability of Learning Algorithms	Generalization Bounds of SGLD for Non-convex Learning: Two Theoretical Viewpoints
Stable Rank	Restricted Eigenvalue from Stable Rank with Applications to Sparse Linear Regression
stationarity tests	Efficient Contextual Bandits in Non-stationary Worlds
stationary point	Cutting plane methods can be extended into nonconvex optimization
statistical learning	Actively Avoiding Nonsense in Generative Models
	Logistic Regression: The Importance of Being Improper
statistical learning theory	Empirical bounds for functions with weak interactions
Statistical queries	Calibrating Noise to Variance in Adaptive Data Analysis
statistical-computational gap	Reducibility and Computational Lower Bounds for Problems with Planted Sparse Structure
	Breaking the $1/\sqrt{n}$ Barrier: Faster Rates for Permutation-based Models in Polynomial Time
Statistics	Testing Symmetric Markov Chains From a Single Trajectory
Stochastic Algorithms	Minimax Bounds on Stochastic Batched Convex Optimization
Stochastic Approximation	Averaged Stochastic Gradient Descent on Riemannian Manifolds
	Accelerating Stochastic Gradient Descent for Least Squares Regression
	Finite Sample Analysis of Two-Timescale Stochastic Approximation with Applications to Reinforcement Learning
Stochastic Differential Equations	Local Optimality and Generalization Guarantees for the Langevin Algorithm via Empirical Metastability
Stochastic Gradient Descent	Iterate Averaging as Regularization for Stochastic Gradient Descent
	A Finite Time Analysis of Temporal Difference Learning With Linear Function Approximation
	Accelerating Stochastic Gradient Descent for Least Squares Regression
	Exponential convergence of testing error for stochastic gradient methods
	L1 Regression using Lewis Weights Preconditioning and Stochastic Gradient Descent
Stochastic Gradient Langevin Dynamics	Generalization Bounds of SGLD for Non-convex Learning: Two Theoretical Viewpoints
	Convex Optimization with Unbounded Nonconvex Oracles Using Simulated Annealing
Stochastic Gradient Method	Exponential convergence of testing error for stochastic gradient methods
Strong data processing inequality	Geometric Lower Bounds for Distributed Parameter Estimation under Communication Constraints
Sub-Gaussian Mixture Models	Hidden Integrality of SDP Relaxations for Sub-Gaussian Mixture Models
submodular optimization	An Optimal Learning Algorithm for Online Unconstrained Submodular Maximization
sum-of-squares	Efficient Algorithms for Outlier-Robust Regression
Switching Budgets	Online learning over a finite action set with limited switching
Switching costs	Online learning over a finite action set with limited switching
	Smoothed Online Convex Optimization in High Dimensions via Online Balanced Descent
System identification	Learning Without Mixing: Towards A Sharp Analysis of Linear System Identification
T
t-SNE	An Analysis of the t-SNE Algorithm for Data Visualization
Temporal Difference	Finite Sample Analysis of Two-Timescale Stochastic Approximation with Applications to Reinforcement Learning
temporal difference learning	A Finite Time Analysis of Temporal Difference Learning With Linear Function Approximation
the cavity method	Detection limits in the high-dimensional spiked rectangular model
Tikhonov regularization	Iterate Averaging as Regularization for Stochastic Gradient Descent
Time series	Learning Without Mixing: Towards A Sharp Analysis of Linear System Identification
Trace reconstruction	Subpolynomial trace reconstruction for random strings and arbitrary deletion probability
Transfer learning	Marginal Singularity, and the Benefits of Labels in Covariate-Shift
Two Timescale	Finite Sample Analysis of Two-Timescale Stochastic Approximation with Applications to Reinforcement Learning
U
UCB	A General Approach to Multi-Armed Bandits Under Risk Criteria
unreliable data set	Learning from Unreliable Datasets
unsupervised learning	Polynomial Time and Sample Complexity for Non-Gaussian Component Analysis: Spectral Methods
	Fast and Sample Near-Optimal Algorithms for Learning Multidimensional Histograms
Upper Confidence Bound	A General Approach to Multi-Armed Bandits Under Risk Criteria
V
variance reduction	Online Variance Reduction for Stochastic Optimization
Variational methods	The Mean-Field Approximation: Information Inequalities, Algorithms, and Complexity
VC Dimension	A Direct Sum for Information Learners
verification	Learning from Unreliable Datasets
W
Wasserstein Distance	Local moment matching: A unified methodology for symmetric functional estimation and distribution estimation under Wasserstein distance
Wasserstein gradient flow	Langevin Monte Carlo and JKO splitting
Wasserstein metric	Sampling as optimization in the space of measures: The Langevin dynamics as a composite optimization problem
Z
zero-sum games	Faster Rates for Convex-Concave Games