I am a Founding Member at a stealth startup, working on post-training and distillation for language models and agents. I completed my MS in Machine Learning at Carnegie Mellon University (CMU), where I was advised by Fernando De la Torre and collaborated closely with Raviteja Vemulapalli and Oncel Tuzel from Apple MLR. Before that, I was a Predoctoral Fellow at the Indian Institute of Science, advised by R. Venkatesh Babu.
My work examines how much of a frontier model's capability can be recovered through distillation and post-training, and how to rigorously characterize what changes between model versions — with the goal of making powerful models more accessible and their development more transparent.
State-of-the-art models offer remarkable capabilities, but their deployment is constrained by computational cost and API-only access. I study which data regimes, query strategies, and training objectives make distillation from such models most effective.
Benchmark scores provide an incomplete account of how models change across training runs or releases. I am developing systematic tools for model comparison — frameworks that surface meaningful behavioral differences, track capability shifts, and assess compatibility across a model family over time.
Models that perform well in-distribution frequently degrade when deployed under domain shift. My prior work addresses this through transformer-based methods for source-free domain adaptation, vision-language supervision for cross-domain generalization, and training strategies for long-tail visual recognition.
* Equal contribution