TALK KEYWORD INDEX
This page indexes talks by author-provided keywords.
| A | |
| accelerated gradient descent | |
| Accelerated Methods | |
| Accelerated Stochastic Gradient Descent | |
| Acceleration | |
| active learning | |
| Adaptive data analysis | |
| adaptive methods | |
| adaptive regret bounds | |
| adaptivity | |
| adversarial and stochastic rewards | |
| Adversarial Multi-armed bandits | |
| agnostic learning | |
| algorithm estimation | |
| Algorithm-dependent Generalization Error Bounds | |
| Algorithmic stability | |
| analysis of heuristics | |
| approximate message-passing algorithm | |
| approximation | |
| approximation algorithms | |
| approximation rate | |
| assignment problem | |
| Attribute Efficient | |
| Autoregressive processes | |
| average-case complexity | |
| B | |
| Banach space | |
| Bandit convex optimization | |
| bandit feedback | |
| bandit linear optimization | |
| Bandits | |
| bandits with infinitely many arms | |
| Bayesian inference | |
| Bayesian regret | |
| best-arm identification | |
| bias | |
| Big data | |
| Binary classification | |
| bisection algorithm | |
| boosting | |
| Burer Monteiro | |
| C | |
| chaining | |
| Cohort Analysis | |
| communication constraints | |
| Complexity Theory | |
| Composite losses | |
| compressed sensing | |
| Computational Complexity | |
| Computational methods | |
| concentration inequalities | |
| Conformal inference | |
| contextual bandits | |
| convergence | |
| Convergence Rate | |
| Convex optimization | |
| convexity | |
| crowd-labeling | |
| cumulative regret | |
| cutting plane methods | |
| D | |
| data visualization | |
| Deep Learning | |
| deep priors | |
| Delayed feedback | |
| Deletion channel | |
| density estimation | |
| dependent data | |
| detecting correlations | |
| Detection | |
| Differential privacy | |
| dimension reduction | |
| Direct Sum | |
| distance estimation | |
| distance sketches | |
| Distributed estimation | |
| Distribution Estimation | |
| dynamic regret | |
| E | |
| Efficient | |
| Empirical process theory | |
| empirical risk minimization | |
| entropic penalization | |
| entropy | |
| Erdos Renyi Model | |
| exponential weights | |
| F | |
| fairness | |
| Fano's inequality | |
| fast rates | |
| feedback graphs | |
| Finite Sample Analysis | |
| finite sample bounds | |
| first-order methods | |
| Fokker-Planck | |
| Frank-Wolfe | |
| Free energy | |
| free probability | |
| Functional Estimation | |
| G | |
| Gaussian | |
| Generalization | |
| Generalization theory | |
| generalized linear model | |
| generative models | |
| geodesically convex optimization | |
| Gibbs distribution | |
| gradient descent | |
| gradient flow | |
| Gradient oracle | |
| Gradient Temporal Difference | |
| graph homomorphism numbers | |
| graph sampling | |
| greedy algorithm | |
| groups | |
| H | |
| Halfspaces | |
| Hausdorff distance | |
| heat equation | |
| Heteroscedastic Noise | |
| High Dimensional Statistics | |
| High-dimensional geometry | |
| high-dimensional inference | |
| High-dimensional Statistics | |
| high-dimensionality | |
| Higher-Order Optimization | |
| histogram learning | |
| Horvitz-Thompson estimator | |
| hypothesis testing | |
| I | |
| Implicit regularization | |
| importance sampling | |
| Incentivizing Exploration | |
| increasing learning rate | |
| independent component analysis | |
| Information Directed Sampling | |
| Information Theory | |
| integer programming | |
| Ising Model | |
| Ising models | |
| K | |
| k-PCA | |
| kernel methods | |
| Kullback-Leibler divergence | |
| L | |
| L1 regression | |
| Langevin algorithm | |
| Langevin Diffusion | |
| Langevin dynamics | |
| Langevin Monte Carlo | |
| Lasso | |
| learning | |
| Learning theory | |
| least squares | |
| Least Squares Regression | |
| Lewis weights | |
| likelihood ratio fluctuations | |
| Linear dynamical systems | |
| Lipschitz continuity | |
| local minima | |
| log-concave | |
| low rank | |
| Lower bounds | |
| M | |
| Manifold Learning | |
| Margin condition | |
| Markov chain Monte Carlo | |
| Markov decision processes | |
| Markov random fields | |
| martingales | |
| Matrix Completion | |
| Matrix factorization | |
| maximum likelihood estimation | |
| MCMC | |
| Mean-field approximation | |
| Membership oracle | |
| memory constraints | |
| Memory-bounded learning | |
| metastability | |
| metric compression | |
| Metropolis-adjusted Langevin algorithm | |
| Minimax lower bound | |
| minimax lower bounds | |
| minimax rates | |
| Minimax Risk | |
| Minimaxity | |
| Mixtures of Gaussians | |
| Mixtures of Linear Regressions | |
| moments | |
| multi-armed bandit | |
| multi-armed bandits | |
| Multi-index models | |
| multiscale scan statistics | |
| mutual information | |
| N | |
| nasty noise | |
| nearest neighbor | |
| Nearest-neighbors | |
| Nesterov's accelerated gradient method | |
| Network Data | |
| network motifs | |
| neural network | |
| Neural networks | |
| Newton's method | |
| Noise addition | |
| non-convex | |
| Non-convex Learning | |
| non-convex optimization | |
| non-Gaussian component analysis | |
| non-stationary environments | |
| non-stochastic bandits | |
| nonconvex | |
| Nonconvex optimization | |
| nonlinear optimization | |
| Nonparametric classification | |
| Nonstochastic bandits | |
| O | |
| On-line learning | |
| online algorithms | |
| Online Combinatorial Optimization | |
| Online Convex Optimization | |
| Online learning | |
| Online Linear Optimization | |
| operator splitting | |
| optimal transport | |
| optimistic online mirror descent | |
| optimization | |
| Ornstein-Uhlenbeck | |
| outlier-robust learning | |
| Over-parameterized models | |
| P | |
| PAC Learning | |
| PAC-Bayes Theory | |
| pairwise comparisons | |
| Parameter-free | |
| Parameterized family of MDPs | |
| partition function | |
| perceptron | |
| permutation and randomization | |
| permutation-based models | |
| phase retrieval | |
| phase transition | |
| phase transitions | |
| planning | |
| planted clique | |
| Polynomial Threshold Functions | |
| Positive-definite kernels | |
| Prediction from Experts | |
| prediction with expert advice | |
| principal component analysis | |
| privacy | |
| probability | |
| projection | |
| Property testing | |
| proximal operators | |
| Q | |
| quantization | |
| R | |
| Rademacher Complexity | |
| random walk | |
| Randomized algorithms | |
| ranking | |
| Reach | |
| Reductions | |
| regression | |
| Regret | |
| regularization | |
| reinforcement learning | |
| ReLU activation | |
| replica symmetry | |
| representation learning | |
| Restricted Eigenvalue | |
| Riemannian Manifold | |
| Riemannian optimization | |
| risk | |
| Robust Learning | |
| S | |
| saddle points | |
| Sample complexity | |
| sampling | |
| SDP | |
| second moment method | |
| Second-Order Optimization | |
| semi-bandit | |
| Semi-parametric models | |
| Semi-Random Model | |
| Semi-Verified Learning | |
| semidefinite | |
| semidefinite programming | |
| sequential learning | |
| SGD | |
| shape-constrained estimation | |
| simple regret | |
| Simulated annealing | |
| Single-index models | |
| Sinkhorn algorithm | |
| small-loss regret bounds | |
| smoothed analysis | |
| smoothness | |
| Social Learning | |
| Sparse | |
| Sparse Linear Regression | |
| spectral analysis | |
| Spectral initialization | |
| Spiked random matrix models | |
| spin glasses | |
| Stability of Learning Algorithms | |
| Stable Rank | |
| stationarity tests | |
| stationary point | |
| statistical learning | |
| statistical learning theory | |
| Statistical queries | |
| statistical-computational gap | |
| Statistics | |
| Stochastic Algorithms | |
| Stochastic Approximation | |
| Stochastic Differential Equations | |
| Stochastic Gradient Descent | |
| Stochastic Gradient Langevin Dynamics | |
| Stochastic Gradient Method | |
| Strong data processing inequality | |
| Sub-Gaussian Mixture Models | |
| submodular optimization | |
| sum-of-squares | |
| Switching Budgets | |
| Switching costs | |
| System identification | |
| T | |
| t-SNE | |
| Temporal Difference | |
| temporal difference learning | |
| the cavity method | |
| Tikhonov regularization | |
| Time series | |
| Trace reconstruction | |
| Transfer learning | |
| Two Timescale | |
| U | |
| UCB | |
| unreliable data set | |
| unsupervised learning | |
| Upper Confidence Bound | |
| V | |
| variance reduction | |
| Variational methods | |
| VC Dimension | |
| verification | |
| W | |
| Wasserstein Distance | |
| Wasserstein gradient flow | |
| Wasserstein metric | |
| Z | |
| zero-sum games | |