Google Launches Gemini 3.1 Flash Live for Real-Time Voice and Vision Agents
On Thursday, Google released Gemini 3.1 Flash Live, its highest-quality model for real-time audio and voice applications. The launch targets developers building conversational AI agents that require low-latency, natural-sounding dialogue.
What's New
The model ships with three headline improvements over the previous Flash Live:
Faster responses: designed to match the pace of natural conversation, with latency reduced enough that pauses between exchanges are no longer perceptible. Google specifically called out scenarios like hands-on troubleshooting where timing matters ("Can you help me change this tire in under 5 minutes?").
Doubled context window: Gemini Live conversations can now hold twice as much session history, reducing the common failure mode where voice agents lose track of earlier conversation details.
Wider global availability: Flash Live is now accessible in 200+ additional regions with multimodal, real-time support in users' preferred languages.
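To illustrate why a larger context window helps with that failure mode, here is a minimal sketch of a voice agent that keeps only the most recent turns that fit a fixed budget. It is not Google's implementation: the trim_history helper, the word-count stand-in for tokenization, and the budget values are all hypothetical and chosen only for illustration.

```python
from collections import deque


def trim_history(turns, budget):
    """Keep the most recent turns whose combined size fits within budget.

    Size is approximated by whitespace-separated word count, a hypothetical
    stand-in for real tokenization.
    """
    kept = deque()
    used = 0
    # Walk backwards from the newest turn and stop at the first one that
    # would push the running total over the budget.
    for turn in reversed(turns):
        cost = len(turn.split())
        if used + cost > budget:
            break
        kept.appendleft(turn)
        used += cost
    return list(kept)


turns = [
    "user: my car is a 2014 hatchback with a flat rear tire",
    "agent: okay, find the jack and the spare first",
    "user: found them, what next",
    "agent: loosen the lug nuts before lifting the car",
]

# Illustrative budgets only: the second is double the first.
small = trim_history(turns, budget=18)
large = trim_history(turns, budget=36)

print(len(small), "turns kept with the smaller budget")
print(len(large), "turns kept with the doubled budget")
```

With the smaller budget the agent no longer sees the opening turn that describes the car; with the doubled budget the whole exchange is retained.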
Where It's Available
The model rolls out across three surfaces simultaneously: for consumers, Gemini Live inside the Gemini app and Search Live; for developers, the Gemini Live API and Google AI Studio, both in preview; and for businesses, Gemini Enterprise for Customer Experience.
Google also demoed voice-driven coding in AI Studio, using Flash Live to build apps through spoken instructions, with code updating in real time as the developer talks.
Developer Angle
The API-accessible version is currently in preview. Teams building voice agents with persistent multi-turn context or operating in noisy environments are the clearest beneficiaries. Google positioned the model as the backbone for next-generation voice interfaces, not just a chat upgrade.