Google’s Gemini 3.1 Flash TTS adds audio tags, 70-plus languages, and SynthID watermarking for more controllable AI-generated ...
Abstract: In spoken scenarios, achieving personalized and controllable zero-shot spontaneous style speech synthesis is highly significant, particularly in generating natural and expressive speech for ...