
April 9, 2025

Gemini
**Model updates:**

- Released `veo-2.0-generate-001`, a generally available (GA) text- and image-to-video model, capable of generating detailed and artistically nuanced videos. To learn more, see the [Veo docs](https://ai.google.dev/gemini-api/docs/video).
- Released `gemini-2.0-flash-live-001`, a public preview version of the [Live API](https://ai.google.dev/gemini-api/docs/live) model with billing enabled.
  - **Enhanced Session Management and Reliability**
    - **Session Resumption:** Keep sessions alive across temporary network disruptions. The API now supports server-side session state storage (for up to 24 hours) and provides handles (`session_resumption`) to reconnect and resume where you left off.
    - **Longer Sessions via Context Compression:** Enable extended interactions beyond previous time limits. Configure context window compression with a sliding window mechanism to automatically manage context length, preventing abrupt terminations due to context limits.
    - **Graceful Disconnect Notification:** Receive a `GoAway` server message indicating when a connection is about to close, allowing for graceful handling before termination.
  - **More Control over Interaction Dynamics**
    - **Configurable Voice Activity Detection (VAD):** Choose sensitivity levels or disable automatic VAD entirely and use new client events (`activityStart`, `activityEnd`) for manual turn control.
    - **Configurable Interruption Handling:** Decide whether user input should interrupt the model's response.
    - **Configurable Turn Coverage:** Choose whether the API processes all audio and video input continuously or only captures it when the end-user is detected speaking.
    - **Configurable Media Resolution:** Optimize for quality or token usage by selecting the resolution for input media.
  - **Richer Output and Features**
    - **Expanded Voice & Language Options:** Choose from two new voices and 30 new languages for audio output. The output language is now configurable within `speechConfig`.
    - **Text Streaming:** Receive text responses incrementally as they are generated, enabling faster display to the user.
    - **Token Usage Reporting:** Gain insights into usage with detailed token counts provided in the `usageMetadata` field of server messages, broken down by modality and prompt or response phases.
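
As a rough illustration of how the Live API options above might fit together, the following Python sketch builds a session setup message as a plain dictionary and reacts to a few server messages. It makes no network calls. Only `speechConfig`, `usageMetadata`, `activityStart`, `activityEnd`, and the `GoAway` message are named in these notes; every other field name and enum value (`sessionResumption`, `contextWindowCompression`, `realtimeInputConfig`, `sessionResumptionUpdate`, `newHandle`, and so on), as well as the helper functions themselves, are assumptions made for illustration and should be checked against the Live API reference.

```python
import json
from typing import Any, Optional


def build_setup_message(
    resume_handle: Optional[str] = None,
    manual_vad: bool = False,
    language_code: str = "en-US",
) -> dict[str, Any]:
    """Build a hypothetical Live API setup message (field names are assumed)."""
    return {
        "setup": {
            "model": "models/gemini-2.0-flash-live-001",
            "generationConfig": {
                "responseModalities": ["AUDIO"],
                # The notes say the output language lives in speechConfig;
                # the "languageCode" key is an assumption.
                "speechConfig": {"languageCode": language_code},
                # Assumed enum value: trade input quality for token usage.
                "mediaResolution": "MEDIA_RESOLUTION_LOW",
            },
            # Pass a previously stored handle to resume a session (assumed shape).
            "sessionResumption": {"handle": resume_handle} if resume_handle else {},
            # Sliding-window context compression (assumed shape).
            "contextWindowCompression": {"slidingWindow": {}},
            "realtimeInputConfig": {
                # Disabling automatic VAD means the client marks turns itself.
                "automaticActivityDetection": {"disabled": manual_vad},
                # Assumed enum values for interruption handling and turn coverage.
                "activityHandling": "NO_INTERRUPTION",
                "turnCoverage": "TURN_INCLUDES_ALL_INPUT",
            },
        }
    }


def manual_turn_event(start: bool) -> dict[str, Any]:
    """Wrap the activityStart / activityEnd client events (wrapper key assumed)."""
    return {"realtimeInput": {"activityStart" if start else "activityEnd": {}}}


def handle_server_message(message: dict[str, Any], state: dict[str, Any]) -> dict[str, Any]:
    """Update local session state from one decoded server message."""
    update = message.get("sessionResumptionUpdate")
    if update and update.get("resumable") and update.get("newHandle"):
        # Remember the latest handle so a later reconnect can resume.
        state["handle"] = update["newHandle"]
    if "goAway" in message:
        # The server announced that the connection is about to close.
        state["closing"] = True
    if "usageMetadata" in message:
        # Keep the most recent token usage report as-is.
        state["last_usage"] = message["usageMetadata"]
    return state


if __name__ == "__main__":
    print(json.dumps(build_setup_message(manual_vad=True), indent=2))
```

In this sketch a client would store `state["handle"]` whenever it changes and, once `state["closing"]` is set, open a new connection whose setup message is built with `build_setup_message(resume_handle=state["handle"])`.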