Our research focuses on statistical and computational aspects of learning, with an emphasis on over-parameterized models, including deep learning and generative models. Current themes include:
Deep Learning: Optimization, Generalization, and Geometry
Sequential Decision Making, Bandits
Generative Models, Graphical Models
High-dimensional Statistics, Sparsity, Structure
Optimization
We also work on applications of machine learning to problems in several areas, including:
Climate science
Ecology
Recommendation systems
Finance
A brief summary and a few selected papers on each of these themes are below. Please see our Publications for more details.
Deep Learning: Optimization, Generalization, and Geometry
Deep learning models have emerged as the state-of-the-art in
many domains, but such models also have the capacity to perfectly fit
random data. The focus of our work is to understand optimization
and generalization in deep learning, and to use such understanding to
improve algorithms and design choices, e.g., the effect of width, the
effect of noisy training algorithms, etc.
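As a minimal, purely illustrative sketch of the kind of noisy training algorithm we analyze (not the algorithm from any specific paper below; all names are hypothetical placeholders), a Langevin-style SGD step simply adds Gaussian noise to the gradient update:

    # Hypothetical sketch: one Langevin-style noisy SGD step (numpy only).
    import numpy as np

    def noisy_sgd_step(w, grad, step_size=1e-2, temperature=1e-4, rng=None):
        """Gradient step plus injected Gaussian noise, as in Langevin dynamics."""
        rng = np.random.default_rng() if rng is None else rng
        noise = rng.normal(scale=np.sqrt(2.0 * step_size * temperature), size=w.shape)
        return w - step_size * grad + noise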
Selected Papers
- Loss Gradient Gaussian Width based Generalization and Optimization Guarantees
A. Banerjee, Q. Li, and Y. Zhou
Preprint, 2024 (arxiv).
- Neural tangent kernel at initialization: Linear width suffices
A. Banerjee, P. Cisneros-Velarde, L. Zhu, and M. Belkin
Conference on Uncertainty in Artificial Intelligence (UAI), 2023 (pmlr).
- Restricted Strong Convexity of Deep Learning Models with Smooth Activations
A. Banerjee, P. Cisneros-Velarde, L. Zhu, and M. Belkin
International Conference on Learning Representations (ICLR), 2023.
Extended version (arxiv).
- Stability Based Generalization Bounds for Exponential Family Langevin Dynamics
A. Banerjee, T. Chen, X. Li, Y. Zhou
International Conference on Machine Learning (ICML), 2022.
Extended version (arXiv).
- Bypassing the Ambient Dimension: Private SGD with Gradient Subspace Identification
Y. Zhou, S. Wu, and A. Banerjee
International Conference on Learning Representations (ICLR), 2021.
Extended version (arxiv).
- Noisy Truncated SGD: Optimization and Generalization
Y. Zhou, X. Li, and A. Banerjee
Preprint (arXiv), 2021.
- De-randomized PAC-Bayes Margin Bounds: Applications to Non-convex and Non-smooth Predictors
A. Banerjee, T. Chen, and Y. Zhou
Preprint, 2020 (arxiv).
- Hessian based analysis of SGD for Deep Nets: Dynamics and Generalization
X. Li, Q. Gu, Y. Zhou, T. Chen, and A. Banerjee
SIAM International Conference on Data Mining (SDM), 2020.
Extended version (arXiv).
- Stable Gradient Descent
Y. Xue, S. Chen, and A. Banerjee
Conference on Uncertainty in Artificial Intelligence (UAI), 2018 (pdf).
Sequential Decision Making, Bandits
Sequential decision making paradigms, such as contextual bandits,
have been successfully used in a variety of applications. Our work
focuses on new families of contextual bandit algorithms that can
leverage neural representations, handle constraints, and admit
simpler algorithms under smoothed analysis, i.e., against less
powerful adversaries due to noise.
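For intuition, a bare-bones contextual bandit loop is sketched below; this is a hedged illustration assuming a linear reward model with epsilon-greedy exploration, not the neural or constrained algorithms from the papers below, and all names are hypothetical.

    # Hypothetical sketch: epsilon-greedy contextual bandit with a
    # ridge-regularized linear reward model.
    import numpy as np

    def run_epsilon_greedy(contexts, rewards, eps=0.1, lam=1.0, seed=0):
        """contexts: (T, K, d) per-round arm features; rewards: (T, K) realized rewards."""
        rng = np.random.default_rng(seed)
        T, K, d = contexts.shape
        A = lam * np.eye(d)              # regularized design matrix
        b = np.zeros(d)                  # accumulated reward-weighted features
        total_reward = 0.0
        for t in range(T):
            theta = np.linalg.solve(A, b)                    # current model estimate
            if rng.random() < eps:
                arm = int(rng.integers(K))                   # explore
            else:
                arm = int(np.argmax(contexts[t] @ theta))    # exploit
            x, r = contexts[t, arm], rewards[t, arm]
            A += np.outer(x, x)
            b += r * x
            total_reward += r
        return total_reward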
Selected Papers
- Contextual Bandits with Online Neural Regression
R. Deb, Y. Ban, S. Zuo, J. He, and A. Banerjee
International Conference on Learning Representations (ICLR), 2024.
Extended Version (arXiv).
- Improved Algorithms for Neural Active Learning
Y. Ban, Y. Zhang, H. Tong, A. Banerjee, and J. He
Advances in Neural Information Processing Systems (NeurIPS), 2022.
- Smoothed Adversarial Linear Contextual Bandits with Knapsacks
V. Sivakumar, S. Zuo, and A. Banerjee
International Conference on Machine Learning (ICML), 2022.
- EE-Net: Exploitation-Exploration Neural Networks in Contextual Bandits
Y. Ban, Y. Yan, A. Banerjee, J. He
International Conference on Learning Representations (ICLR) [Spotlight], 2022.
- Structured Linear Contextual Bandits: A Sharp and Geometric Smoothed Analysis
V. Sivakumar, S. Wu, and A. Banerjee
International Conference on Machine Learning (ICML), 2020.
Extended version (arXiv).
Generative Models, Graphical Models
Generative and graphical models enable probabilistic modeling of data, suitable uncertainty quantification, and inference.
Such models have also been effective in learning suitable representations of data using deep generative models. We have worked
extensively on directed and undirected graphical models, including dependency structure learning, parameter estimation, and inference.
Our recent work focuses on more flexible deep generative models, with current emphasis on likelihood-based
methods such as normalizing flows and variational autoencoders. Generative models will remain an ongoing focus area, with several
problems motivated by applications in climate science, ecology, and finance.
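As a small illustration of the likelihood-based methods mentioned above, the change-of-variables formula behind normalizing flows, log p_X(x) = log p_Z(f(x)) + log |det df/dx|, can be written out for a single (hypothetical) elementwise affine flow with a standard normal base distribution:

    # Hypothetical sketch: exact log-likelihood under a one-layer affine flow.
    import numpy as np

    def affine_flow_log_likelihood(x, scale, shift):
        """log p(x) when z = scale * x + shift and z ~ N(0, 1) elementwise."""
        z = scale * x + shift
        log_base = -0.5 * (z ** 2 + np.log(2.0 * np.pi))   # standard normal log-density
        log_det = np.log(np.abs(scale))                    # log |Jacobian| of the affine map
        return float(np.sum(log_base + log_det))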
Selected Papers
- AmbientFlow: Invertible generative models from incomplete, noisy measurements
V. A. Kelkar, R. Deshpande, A. Banerjee, and M. A. Anastasio
NeurIPS'23 Workshop on Deep Learning and Inverse Problems
Transactions of Machine Learning Research (TMLR), 2023. (arXiv)
- Gradient Boosted Normalizing Flows
R. Giaquinto and A. Banerjee
Advances in Neural Information Processing Systems (NeurIPS), 2020.
Extended version, 2020 (arXiv).
- DAPPER: Scaling Dynamic Author Persona Topic Model to Billion Word Corpora
R. Giaquinto and A. Banerjee
International Conference on Data Mining (ICDM), 2018 (arXiv).
- A Spectral Algorithm for Inference in Hidden semi-Markov Models
I. Melnyk and A. Banerjee
Journal of Machine Learning Research (JMLR), 2017 (journal version).
- Multi-task Sparse Structure Learning with Gaussian Copula Models
A. Goncalves, F. J. Von Zuben, and A. Banerjee
Journal of Machine Learning Research (JMLR), 2016 (pdf).
- Bethe-ADMM for Tree Decomposition based Parallel MAP inference
Q. Fu, H. Wang, and A. Banerjee
Conference on Uncertainty in Artificial Intelligence (UAI), 2013 (pdf).
- Large Scale Distributed Sparse Precision Estimation
H. Wang, A. Banerjee, C. Hsieh, P. Ravikumar, and I. Dhillon
Advances in Neural Information Processing Systems (NIPS), 2013 (pdf).
High-dimensional Statistics, Sparsity, Structure
High-dimensional over-parameterized models, where the number of parameters is
much larger than the number of training samples, have been shown to be
effective in a wide variety of scenarios. Such models include sparse
and structured estimation based on explicit regularization, e.g., the
Lasso, the Dantzig selector, etc., as well as their non-linear,
non-convex, and semi-parametric extensions. Our work has focused on
statistical and computational guarantees for such models, often based
on geometric ideas and associated tools from empirical processes such
as generic chaining.
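A minimal example of the explicitly regularized estimators mentioned above is the Lasso, solved here by proximal gradient descent (ISTA); this is a hedged, self-contained sketch with hypothetical names, not code from any of the papers below.

    # Hypothetical sketch: ISTA for min_w (1/2n)||y - Xw||^2 + lam * ||w||_1.
    import numpy as np

    def lasso_ista(X, y, lam=0.1, n_iters=500):
        n, d = X.shape
        step = n / (np.linalg.norm(X, 2) ** 2)       # 1 / Lipschitz constant of the gradient
        w = np.zeros(d)
        for _ in range(n_iters):
            grad = X.T @ (X @ w - y) / n             # gradient of the squared loss
            z = w - step * grad
            w = np.sign(z) * np.maximum(np.abs(z) - step * lam, 0.0)   # soft-thresholding prox
        return w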
Selected Papers
- Random Quadratic Forms with Dependence: Applications to Restricted Isometry and Beyond
A. Banerjee, Q. Gu, V. Sivakumar, and S. Wu
Advances in Neural Information Processing Systems (NeurIPS), 2019 (pdf).
Extended version (arXiv).
- An Improved Analysis of Alternating Minimization for Structured Multi-Response Regression
S. Chen and A. Banerjee
Advances in Neural Information Processing Systems (NeurIPS), 2018 (pdf).
- Robust Structured Estimation with Single-Index Models
S. Chen and A. Banerjee
International Conference on Machine Learning (ICML), 2017 (pdf).
- High Dimensional Structured Superposition Models
Q. Gu, A. Banerjee
Advances in Neural Information Processing Systems (NIPS), 2016.
Extended version (from NIPS'16), 2017 (arxiv).
- Generalized Direct Change Estimation in Ising Model Structure
F. Fazayeli and A. Banerjee
International Conference on Machine Learning (ICML), 2016 (pdf).
Extended version (from ICML'16), 2016 (arxiv).
- Estimating Structured Vector Autoregressive Model
I. Melnyk and A. Banerjee
International Conference on Machine Learning (ICML), 2016.
Extended version (from ICML'16), 2016 (arxiv).
- Structured Estimation with Atomic Norms: General Bounds and Applications
S. Chen and A. Banerjee
Advances in Neural Information Processing Systems (NIPS), 2015 (pdf).
- Beyond Sub-Gaussian Measurements: High-Dimensional Structured Estimation with Sub-Exponential Designs
V. Sivakumar, A. Banerjee, and P. Ravikumar
Advances in Neural Information Processing Systems (NIPS), 2015 (pdf).
- Estimation with Norm Regularization
A. Banerjee, S. Chen, F. Fazayeli, and V. Sivakumar
Advances in Neural Information Processing Systems (NIPS), 2014 (pdf).
Extended version (from NIPS'14), 2015 (arxiv).
- Generalized Dantzig Selector: Application to the k-support norm
S. Chatterjee, S. Chen, and A. Banerjee
Advances in Neural Information Processing Systems (NIPS), 2014 (pdf).
Extended version, 2014 (arxiv).
Optimization
Optimization plays a key role in several facets of machine
learning. In addition to using optimization in several projects, we
have worked on certain themes in optimization, including augmented
Lagrangian and primal-dual methods, sketched iterative algorithms, and
accelerated first-order methods. Our current work focuses on noisy
gradient methods and optimization for over-parameterized deep
models.
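As an illustration of the augmented Lagrangian and primal-dual methods mentioned above, a standard ADMM splitting for the Lasso is sketched below; this is a generic textbook-style sketch with hypothetical names, not an implementation from the papers that follow.

    # Hypothetical sketch: ADMM for min_{w,z} (1/2)||Xw - y||^2 + lam*||z||_1, s.t. w = z.
    import numpy as np

    def lasso_admm(X, y, lam=0.1, rho=1.0, n_iters=200):
        n, d = X.shape
        XtX, Xty = X.T @ X, X.T @ y
        w, z, u = np.zeros(d), np.zeros(d), np.zeros(d)
        factor = np.linalg.inv(XtX + rho * np.eye(d))      # cached for repeated w-updates
        for _ in range(n_iters):
            w = factor @ (Xty + rho * (z - u))             # w-update: ridge-like solve
            v = w + u
            z = np.sign(v) * np.maximum(np.abs(v) - lam / rho, 0.0)   # z-update: soft-thresholding
            u = u + w - z                                  # scaled dual update
        return z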
Selected Papers
- Restricted Strong Convexity of Deep Learning Models with Smooth Activations
A. Banerjee, P. Cisneros-Velarde, L. Zhu, and M. Belkin
International Conference on Learning Representations (ICLR), 2023.
Extended version (arxiv).
- Scalable Algorithms for Locally Low-Rank Matrix Modeling
Q. Gu, J. Trzasko, and A. Banerjee
Knowledge and Information Systems (KAIS), 2019 (pdf).
Invited for journal publication as Best of ICDM'17.
- Sketched Iterative Algorithms for Structured Generalized Linear Models
Q. Gu and A. Banerjee
International Joint Conference on Artificial Intelligence (IJCAI) (oral), 2019 (pdf).
- Accelerated Alternating Direction Method of Multipliers
M. Kadkhodaie, K. Christakopoulou, M. Sanjabi, and A. Banerjee
International Conference on Knowledge Discovery and Data Mining (KDD), 2015 (pdf).
- Parallel Direction Method of Multipliers
H. Wang, A. Banerjee, and Z.-Q. Luo
Advances in Neural Information Processing Systems (NIPS), 2014 (pdf).
Extended version, 2014 (pdf).
- Bregman Alternating Direction Method of Multipliers
H. Wang and A. Banerjee
Advances in Neural Information Processing Systems (NIPS), 2014 (pdf).
Extended version, 2014 (arxiv).
- Online Alternating Direction Method
H. Wang and A. Banerjee.
International Conference on Machine Learning (ICML), 2012 (pdf).
Extended version, 2013 (arxiv).
Applications
We work on applications of machine learning to problems in several areas, including:
Climate science
Ecology
Recommendation systems
Finance
Climate Science
There are several challenging problems in climate science where progress can potentially be made
using machine learning. We have worked on several problems in this context, including
better design of climate model ensembles, understanding statistical dependencies among physical
processes, and interpretable predictive models for climate variables. Our current
work focuses on sub-seasonal climate forecasting, i.e., prediction of key climate
variables such as temperature and precipitation on the 2-week to 2-month time horizon.
Selected Papers
- Learning and Dynamical Models for Sub-seasonal Climate Forecasting: Comparison and Collaboration
S. He, X. Li, L. Trenary, B. A. Cash, T. DelSole, and A. Banerjee
AAAI Conference on Artificial Intelligence (AAAI), 2022.
Extended version (arxiv).
- Subseasonal Climate Prediction in the Western US using Bayesian Spatial Models
V. Srinivasan, J. Khim, A. Banerjee, and P. Ravikumar.
Conference on Uncertainty in Artificial Intelligence (UAI), 2021.
- Sub-Seasonal Climate Forecasting via Machine Learning: Challenges, Analysis, and Advances
S. He, X. Li, T. DelSole, P. Ravikumar, and A. Banerjee
AAAI Conference on Artificial Intelligence (AAAI), 2021.
Extended version (arxiv).
- Interpretable Predictive Modeling for Climate Variables with Weighted Lasso
S. He, X. Li, V. Sivakumar, and A. Banerjee
AAAI Conference on Artificial Intelligence (AAAI), 2019 (pdf).
- Statistical Seasonal Prediction Based on Regularized Regression
T. DelSole and A. Banerjee
Journal of Climate, 2017 (journal version).
- High-Dimensional Dependency Structure Learning for Physical Processes
J. Golmohammadi, I. Ebert-Uphoff, S. He, Y. Deng, and A. Banerjee
International Conference on Data Mining (ICDM), 2017.
Extended version (arXiv).
- Understanding Dominant Factors for Precipitation over the Great Lakes Region
S. Chatterjee, S. Liess, A. Banerjee, and V. Kumar
AAAI Conference on Artificial Intelligence (AAAI), 2016 (pdf).
- Multi-task Sparse Structure Learning with Gaussian Copula Models
A. Goncalves, F. J. Von Zuben, and A. Banerjee
Journal of Machine Learning Research (JMLR), 2016 (pdf).
- Climate Multi-model Regression Using Spatial Smoothing
K. Subbian and A. Banerjee
SIAM International Conference on Data Mining (SDM), 2013 (pdf).
- Sparse Group Lasso: Consistency and Climate Applications
S. Chatterjee, K. Steinhaeuser, A. Banerjee, S. Chatterjee, and A. Ganguly.
SIAM International Conference on Data Mining (SDM), 2012.
Ecology
Our work in ecology has focused on modeling of plant functional
traits as a predictive or gap-filling problem, similar to ratings
prediction in recommendation systems. Such work has been extended to
uncertainty-quantified plant trait prediction and the use of such
predicted plant trait distributions in terrestrial land surface
models. Our current work focuses on spectral biology, which uses
hyper-spectral data from drones and satellites to characterize
biodiversity across scales.
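To make the gap-filling analogy concrete, a plain (non-hierarchical) matrix factorization fit by alternating ridge regressions on the observed entries is sketched below; this is a hedged illustration with hypothetical names, and it omits the hierarchical and Bayesian structure used in our actual trait-prediction work.

    # Hypothetical sketch: trait gap-filling as matrix completion via alternating least squares.
    import numpy as np

    def gap_fill(R, mask, rank=5, lam=0.1, n_iters=50, seed=0):
        """R: species-by-trait matrix; mask: True where a trait was measured."""
        rng = np.random.default_rng(seed)
        n, m = R.shape
        U = rng.normal(scale=0.1, size=(n, rank))
        V = rng.normal(scale=0.1, size=(m, rank))
        reg = lam * np.eye(rank)
        for _ in range(n_iters):
            for i in range(n):                             # update species factors
                obs = mask[i]
                if obs.any():
                    Vo = V[obs]
                    U[i] = np.linalg.solve(Vo.T @ Vo + reg, Vo.T @ R[i, obs])
            for j in range(m):                             # update trait factors
                obs = mask[:, j]
                if obs.any():
                    Uo = U[obs]
                    V[j] = np.linalg.solve(Uo.T @ Uo + reg, Uo.T @ R[obs, j])
        return U @ V.T                                     # gap-filled predictions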
Selected Papers
- BII-Implementation: The causes and consequences of plant biodiversity across scales in a rapidly changing world
J. Cavender-Bares, et al.
Research Ideas and Outcomes, 2021.
- Robustness of trait connections across environmental gradients and growth forms
H. Flores-Moreno, F. Fazayeli, A. Banerjee, A. Datta, J. Kattge, E. Butler, O. Atkin, K. Whythers, M. Chen, M. Anand, M. Bahn, C. Byun, J. Cornelissen, J. Craine, A. González-Melo, W. Hattingh, S. Jansen, N. Kraft, K. Kramer, D. Laughlin, V. Minden, Ü. Niinemets, V. Onipchenko, J. Penuelas, N. Soudzilovskaia, R. Dalrymple, P. Reich
Global Ecology and Biogeography (GEB), 2019 (paper).
- Intelligent systems for geosciences: an essential research agenda
Y. Gil, S. A. Pierce, H. A. Babaie, A. Banerjee, K. D. Borne, G. Bust, M. Cheatham, I. Ebert-Uphoff, C. Gomes, M. Hill, J. Horel, L. Hsu, J. Kinter, C. A. Knoblock, D. Krum, V. Kumar, P. Lermusiaux, Y. Liu, C. North, V. Pankratius, S. Peters, B. Plale, A. Pope, S. Ravela, J. Restrepo, A. J. Ridley, H. Samet, and S. Shekhar
Communications of the ACM (CACM), 62(1):76-84, 2019 (paper).
- Mapping local and global variability in plant trait distributions
E. E. Butler, A. Datta, et al.
Proceedings of the National Academy of Sciences (PNAS), 2017 (journal version).
- BHPMF -- a hierarchical Bayesian approach to gap-filling and trait prediction for macroecology and functional biogeography
F. Schrodt, et al.
Global Ecology and Biogeography (GEB), 2015 (pdf).
- Gap Filling in the Plant Kingdom---Trait Prediction Using Hierarchical Probabilistic Matrix Factorization
H. Shan, J. Kattge, P. B. Reich, A. Banerjee, F. Schrodt, and M. Reichstein.
International Conference on Machine Learning (ICML), 2012 (pdf).
Recommendation Systems
We have worked on several aspects of recommendation systems over the years.
Selected Papers
- Adversarial Attacks on an Oblivious Recommender
K. Christakopoulou and A. Banerjee
ACM Recommender Systems Conference (RecSys), 2019 (pdf).
- Recommendation with Capacity Constraints
K. Christakopoulou, J. Kawale, and A. Banerjee
International Conference on Information and Knowledge Management (CIKM), 2017 (pdf).
Extended Version (arXiv).
- Collaborative Ranking with a Push at the Top
K. Christakopoulou and A. Banerjee
International World Wide Web Conference (WWW), 2015 (pdf).
- Kernelized Probabilistic Matrix Factorization: Exploiting Graphs and Side Information
T. Zhou, H. Shan, A. Banerjee, and G. Sapiro.
SIAM International Conference on Data Mining (SDM), 2012 (pdf).
- Generalized Probabilistic Matrix Factorizations for Collaborative Filtering
H. Shan and A. Banerjee.
IEEE International Conference on Data Mining (ICDM), 2010 (pdf, longer version).
- Residual Bayesian Co-clustering for Matrix Approximation
H. Shan and A. Banerjee.
SIAM International Conference on Data Mining (SDM), 2010 (pdf).
Finance
We have worked on certain aspects of portfolio selection from the perspective of online learning and full-information sequential decision making.
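For a flavor of the full-information setting, a bare-bones online portfolio selection loop with an exponentiated-gradient update is sketched below; this is a hedged illustration with hypothetical names, and it ignores the transaction costs, sparsity, and diversification structure studied in the papers that follow.

    # Hypothetical sketch: exponentiated-gradient online portfolio selection.
    import numpy as np

    def online_portfolio(price_relatives, eta=0.05):
        """price_relatives: (T, d) array of per-period price ratios x_t."""
        T, d = price_relatives.shape
        w = np.full(d, 1.0 / d)           # start with the uniform portfolio
        wealth = 1.0
        for x in price_relatives:
            wealth *= float(w @ x)        # realize this period's growth
            grad = x / float(w @ x)       # gradient of log(w . x)
            w = w * np.exp(eta * grad)    # multiplicative (EG) update
            w = w / w.sum()               # re-normalize onto the simplex
        return wealth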
Selected Papers
- Structured Hedging for Resource Allocations with Leverage
N. Johnson and A. Banerjee
International Conference on Knowledge Discovery and Data Mining (KDD), 2015 (pdf).
- Online Resource Allocation with Structured Diversification
N. Johnson and A. Banerjee
SIAM International Conference on Data Mining (SDM), 2015 (pdf).
- Online Portfolio Selection with Group Sparsity
P. Das, N. Johnson, and A. Banerjee
AAAI Conference on Artificial Intelligence (AAAI), 2014 (pdf).
- Online Lazy Updates for Portfolio Selection with Transaction Costs
P. Das, N. Johnson, and A. Banerjee
AAAI Conference on Artificial Intelligence (AAAI), 2013 (pdf).
- Meta Optimization and its Application to Portfolio Selection
P. Das and A. Banerjee.
International Conference on Knowledge Discovery and Data Mining (KDD), 2011 (pdf).