[GSoC] Dynamic Text-to-Speech Playback #926

### Feature description

In the [current implementation](https://github.com/cohere-ai/cohere-toolkit/blob/main/src/backend/services/synthesizer.py), playback begins only after the entire text has been processed by the Google Text-to-Speech API, which causes inconvenient delays for long texts. [Journey Voices](https://cloud.google.com/text-to-speech/docs/voice-types#overview) provide [real-time streaming](https://cloud.google.com/text-to-speech/docs/create-audio-text-streaming) with low latency for some common languages (search for "Journey" on the [voices](https://cloud.google.com/text-to-speech/docs/voices) page). The goal of the project is to implement a flexible approach that supports both streaming and waiting for a complete response, depending on the language.
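
One possible shape for the language-dependent dispatch is sketched below. It is only an illustration, not the toolkit's actual code: `STREAMING_LANGUAGES`, `supports_streaming`, and `synthesize_chunks` are hypothetical names, the language set is a placeholder that would need to be filled in from the voices page, and the two synthesis backends are passed in as plain callables so the sketch does not depend on any particular Google client call.

```python
from typing import Callable, Iterator

# Placeholder set (assumption): language codes that have a streaming-capable
# voice. The real contents must be taken from the Google voices page.
STREAMING_LANGUAGES = frozenset({"en-US"})

# Hypothetical backend signatures: both take (text, language_code).
StreamFn = Callable[[str, str], Iterator[bytes]]  # yields audio as it arrives
FullFn = Callable[[str, str], bytes]              # returns the complete audio


def supports_streaming(language_code: str) -> bool:
    """Return True if the language is assumed to have a streaming voice."""
    return language_code in STREAMING_LANGUAGES


def synthesize_chunks(
    text: str,
    language_code: str,
    stream_fn: StreamFn,
    full_fn: FullFn,
) -> Iterator[bytes]:
    """Yield audio chunks, streaming when the language allows it.

    Callers always consume an iterator of chunks, so the playback code
    does not need to know which mode was used.
    """
    if supports_streaming(language_code):
        # Streaming mode: forward each chunk as soon as it is produced.
        yield from stream_fn(text, language_code)
    else:
        # Fallback mode: wait for the whole response, emit it as one chunk.
        yield full_fn(text, language_code)
```

Because both modes are exposed as an iterator of byte chunks, the same playback path could serve either case; the non-streaming case simply yields a single chunk once synthesis has finished.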

### Expected outcomes

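Users can listen to synthesized text in one of two modes, chosen by language support: streaming playback where a Journey voice is available, and playback after the complete response is ready otherwise.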

