I work on post-training for Gemini. Core contributor to Gemini 3.0, 2.5, and 2.0. My work on RL and evals for code helped push Gemini to #1 on the WebDev Arena.

Mostly I think about how to get models to solve problems they've never seen. Not by memorizing patterns, but by learning to actually think. Right now that means multi-step RL and reward modeling for code.

I also write about the things we might stop noticing.

Research

Industry

Thoughts