Multiturn Evals (and RL) for LLMs by Kartikeya Badola

Name: Multiturn Evals (and RL) for LLMs by Kartikeya Badola
Start: 2025-08-08T12:00:00+05:30
End: 2025-08-08T13:00:00+05:30
Location: SIT 001


Title: Multiturn Evals (and RL) for LLMs

Details: 8th August, 12 pm, SIT 001

Abstract: LLMs often fail at multi-step tasks that require memory and strategic planning, a gap that traditional single-turn evals do not capture. To address this, we’ve developed a suite of human and automated evals that stress-test Gemini on these capabilities. This talk will cover the motivation and design behind these evals, discuss the latest results, and touch on early promising experiments that use multiturn RL methods to address some of these losses.

Bio: Kartikeya Badola is a Software Engineer at Google DeepMind in London, where he works with the Gemini evals and Gemini thinking teams. Prior to this, he was with Google Research in India, working on multilingual semantic parsing. Kartikeya is a graduate of IIT Delhi, where he worked with Prof. Mausam and Prof. Parag Singla on Distantly Supervised Relation Extraction.

Details

Date: August 8
Time: 12:00 pm - 1:00 pm

Venue

SIT 001
Amar Nath and Shashi Khosla School of Information Technology, IIT Delhi, Hauz Khas, New Delhi 110016, India