This page contains an index consisting of author-provided keywords.
A | |
accelerated gradient descent | |
Accelerated Methods | |
Accelerated Stochastic Gradient Descent | |
Acceleration | |
active learning | |
Adaptive data analysis | |
adaptive methods | |
adaptive regret bounds | |
adaptivity | |
adversarial and stochastic rewards | |
Adversarial Multi-armed bandits | |
agnostic learning | |
algorithm estimation | |
Algorithm-dependent Generalization Error Bounds | |
Algorithmic stability | |
analysis of heuristics | |
approximate message-passing algorithm | |
approximation | |
approximation algorithms | |
approximation rate | |
assignment problem | |
Attribute Efficient | |
Autoregressive processes | |
average-case complexity | |
B | |
Banach space | |
Bandit convex optimization | |
bandit feedback | |
bandit linear optimization | |
Bandits | |
bandits with infinitely many arms | |
Bayesian inference | |
Bayesian regret | |
best-arm identification | |
bias | |
Big data | |
Binary classification | |
bisection algorithm | |
boosting | |
Burer Monteiro | |
C | |
chaining | |
Cohort Analysis | |
communication constraints | |
Complexity Theory | |
Composite losses | |
compressed sensing | |
Computational Complexity | |
Computational methods | |
concentration inequalities | |
Conformal inference | |
contextual bandits | |
convergence | |
Convergence Rate | |
Convex optimization | |
convexity | |
crowd-labeling | |
cumulative regret | |
cutting plane methods | |
D | |
data visualization | |
Deep Learning | |
deep priors | |
Delayed feedback | |
Deletion channel | |
density estimation | |
dependent data | |
detecting correlations | |
Detection | |
Differential privacy | |
dimension reduction | |
Direct Sum | |
distance estimation | |
distance sketches | |
Distributed estimation | |
Distribution Estimation | |
dynamic regret | |
E | |
Efficient | |
Empirical process theory | |
empirical risk minimization | |
entropic penalization | |
entropy | |
Erdos Renyi Model | |
exponential weights | |
F | |
fairness | |
Fano's inequality | |
fast rates | |
feedback graphs | |
Finite Sample Analysis | |
finite sample bounds | |
first-order methods | |
Fokker-Planck | |
Frank-Wolfe | |
Free energy | |
free probability | |
Functional Estimation | |
G | |
Gaussian | |
Generalization | |
Generalization theory | |
generalized linear model | |
generative models | |
geodesically convex optimization | |
Gibbs distribution | |
gradient descent | |
gradient flow | |
Gradient oracle | |
Gradient Temporal Difference | |
graph homomorphism numbers | |
graph sampling | |
greedy algorithm | |
groups | |
H | |
Halfspaces | |
Hausdorff distance | |
heat equation | |
Heteroscedastic Noise | |
High Dimensional Statistics | |
High-dimensional geometry | |
high-dimensional inference | |
High-dimensional Statistics | |
high-dimensionality | |
Higher-Order Optimization | |
histogram learning | |
Horvitz-Thompson estimator | |
hypothesis testing | |
I | |
Implicit regularization | |
importance sampling | |
Incentivizing Exploration | |
increasing learning rate | |
independent component analysis | |
Information Directed Sampling | |
Information Theory | |
integer programming | |
Ising Model | |
Ising models | |
K | |
k-PCA | |
kernel methods | |
Kullback-Leibler divergence | |
L | |
L1 regression | |
Langevin algorithm | |
Langevin Diffusion | |
Langevin dynamics | |
Langevin Monte Carlo | |
Lasso | |
learning | |
Learning theory | |
least squares | |
Least Squares Regression | |
Lewis weights | |
likelihood ratio fluctuations | |
Linear dynamical systems | |
Lipschitz continuity | |
local minima | |
log-concave | |
low rank | |
Lower bounds | |
M | |
Manifold Learning | |
Margin condition | |
Markov chain Monte Carlo | |
Markov decision processes | |
Markov random fields | |
martingales | |
Matrix Completion | |
Matrix factorization | |
maximum likelihood estimation | |
MCMC | |
Mean-field approximation | |
Membership oracle | |
memory constraints | |
Memory-bounded learning | |
metastability | |
metric compression | |
Metropolis-adjusted Langevin algorithm | |
Minimax lower bound | |
minimax lower bounds | |
minimax rates | |
Minimax Risk | |
Minimaxity | |
Mixtures of Gaussians | |
Mixtures of Linear Regressions | |
moments | |
multi-armed bandit | |
multi-armed bandits | |
Multi-index models | |
multiscale scan statistics | |
mutual information | |
N | |
nasty noise | |
nearest neighbor | |
Nearest-neighbors | |
Nesterov's accelerated gradient method | |
Network Data | |
network motifs | |
neural network | |
Neural networks | |
Newton's method | |
Noise addition | |
non-convex | |
Non-convex Learning | |
non-convex optimization | |
non-Gaussian component analysis | |
non-stationary environments | |
non-stochastic bandits | |
nonconvex | |
Nonconvex optimization | |
nonlinear optimization | |
Nonparametric classification | |
Nonstochastic bandits | |
O | |
On-line learning | |
online algorithms | |
Online Combinatorial Optimization | |
Online Convex Optimization | |
Online learning | |
Online Linear Optimization | |
operator splitting | |
optimal transport | |
optimistic online mirror descent | |
optimization | |
Ornstein-Uhlenbeck | |
outlier-robust learning | |
Over-parameterized models | |
P | |
PAC Learning | |
PAC-Bayes Theory | |
pairwise comparisons | |
Parameter-free | |
Parameterized family of MDPs | |
partition function | |
perceptron | |
permutation and randomization | |
permutation-based models | |
phase retrieval | |
phase transition | |
phase transitions | |
planning | |
planted clique | |
Polynomial Threshold Functions | |
Positive-definite kernels | |
Prediction from Experts | |
prediction with expert advice | |
principal component analysis | |
privacy | |
probability | |
projection | |
Property testing | |
proximal operators | |
Q | |
quantization | |
R | |
Rademacher Complexity | |
random walk | |
Randomized algorithms | |
ranking | |
Reach | |
Reductions | |
regression | |
Regret | |
regularization | |
reinforcement learning | |
ReLU activation | |
replica symmetry | |
representation learning | |
Restricted Eigenvalue | |
Riemannian Manifold | |
Riemannian optimization | |
risk | |
Robust Learning | |
S | |
saddle points | |
Sample complexity | |
sampling | |
SDP | |
second moment method | |
Second-Order Optimization | |
semi-bandit | |
Semi-parametric models | |
Semi-Random Model | |
Semi-Verified Learning | |
semidefinite | |
semidefinite programming | |
sequential learning | |
SGD | |
shape-constrained estimation | |
simple regret | |
Simulated annealing | |
Single-index models | |
Sinkhorn algorithm | |
small-loss regret bounds | |
smoothed analysis | |
smoothness | |
Social Learning | |
Sparse | |
Sparse Linear Regression | |
spectral analysis | |
Spectral initialization | |
Spiked random matrix models | |
spin glasses | |
Stability of Learning Algorithms | |
Stable Rank | |
stationarity tests | |
stationary point | |
statistical learning | |
statistical learning theory | |
Statistical queries | |
statistical-computational gap | |
Statistics | |
Stochastic Algorithms | |
Stochastic Approximation | |
Stochastic Differential Equations | |
Stochastic Gradient Descent | |
Stochastic Gradient Langevin Dynamics | |
Stochastic Gradient Method | |
Strong data processing inequality | |
Sub-Gaussian Mixture Models | |
submodular optimization | |
sum-of-squares | |
Switching Budgets | |
Switching costs | |
System identification | |
T | |
t-SNE | |
Temporal Difference | |
temporal difference learning | |
the cavity method | |
Tikhonov regularization | |
Time series | |
Trace reconstruction | |
Transfer learning | |
Two Timescale | |
U | |
UCB | |
unreliable data set | |
unsupervised learning | |
Upper Confidence Bound | |
V | |
variance reduction | |
Variational methods | |
VC Dimension | |
verification | |
W | |
Wasserstein Distance | |
Wasserstein gradient flow | |
Wasserstein metric | |
Z | |
zero-sum games |