Sarthak Mittal

Ph.D. Student at Mila, Université de Montréal

I am a final-year PhD candidate at Mila, where I work with Guillaume Lajoie and Yoshua Bengio. My research spans meta-learning, amortized inference, diffusion models and variational methods, and large language model optimization, from pretraining to post-training.

During my PhD, I also spent time at various research labs, including Google, Meta, Apple, NVIDIA, Morgan Stanley and Uber ATG, where I tackled problems ranging from long-context meta-learning and LLM pretraining to probabilistic forecasting and distillation.

I am primarily interested in improving the capabilities of current large-scale systems, from leveraging inductive biases in their architectures to using latent variable modeling (and, analogously, RL) as a framework for improving both reasoning and long-context abilities.

Contact: sarthmit@gmail.com

Education

Mila, Université de Montréal

Sept '22 - Jan '26 (Expected)

Degree: Ph.D. in Computer Science | Fast-tracked from M.Sc.

GPA: 4.3/4.3

Advisors: Guillaume Lajoie, Yoshua Bengio

Indian Institute of Technology (IIT) Kanpur

Jul '15 - May '19

Degree: B.S. in Mathematics and Scientific Computing

GPA: 9.7/10.0

Awarded Suman Gupta Gold Medal

Dean's List for Academic Excellence every year

Experience

Sept '25 - Present

Google

Student Researcher

Hybrid systems for efficient inference in large language models

Sept '24 - Aug '25

Meta

Student Researcher

Iterative Amortized Inference (IAI) for efficient meta-learning

Jul '24 - Sept '24

Apple

Research Intern

Demand forecasting using time-series foundational models for Apple products

Jun '23 - Aug '23

Morgan Stanley

Research Intern

Combining continuous time generative models with autoregressive time series predictors

Jan '23 - Apr '23

York University

Research Intern

Amortizing Bayesian posterior inference using in-context learning

May '22 - Dec '22

NVIDIA

Research Intern

Leveraging synthetic targets for machine translation through knowledge distillation

Sept '19 - Mar '20

Uber ATG

Research Intern

Camera and LiDAR fusion for mapping using autonomous vehicle logs

Scholarships

FRQNT Doctoral Scholarship

UNIQUE PhD Excellence Scholarship

IVADO MSc Excellence Scholarship

UNIQUE MSc Excellence Scholarship

Thinking Machines (Tinker) Research Grant

KVPY Fellowship

Academic Service

  • Workshop Organization: Helped organize the Frontiers in Probabilistic Inference: Sampling Meets Learning workshop at ICLR '25
  • Reviewer: ICML, ICLR, NeurIPS, IJCAI, ACL, ACML, NeurIPS Datasets and Benchmarks Track
  • Teaching Assistant: IFT-6135 (Representation Learning), Fall '22
  • Admissions Committee: Reviewed graduate school admissions at Mila for 2024 and 2025



Last updated: November 2025