Text To Speech and texts containing words of a different language from the base language

aiotech · November 9, 2022, 3:49pm

It often happens that a text written in Italian, for instance, contains a few words in English or French. If TTS is set to Italian, the pronunciation of the foreign words is terrible. Is there a way to instruct TTS to dynamically change the voice to match the language of those words? For instance, are there tags that can be placed in the text to be spoken, or anything similar?

SteveJG · November 9, 2022, 3:58pm

What a developer can do with Blocks is described in TextToSpeech.

That should answer your question. What can be done is limited.

aiotech · November 9, 2022, 5:29pm

Your answer does not help much; I have already read the docs. The docs are missing some information, such as how commas, punctuation marks, and accents are handled, so I suppose there are other tricks or special markers that can be mixed into the text to dynamically change the voice, the language, and so on. At the very least, knowing which text-to-speech engine TTS uses would help a little more. I think this problem is common to many other languages, not only Italian or Spanish, and hearing a foreign word or phrase spoken with the accent of the base language is not only annoying but also hilarious. Before writing this post I also tried changing the Android text-to-speech engine, but it made no difference for this problem. It seems strange to me that no solution to this problem has been provided.

ewpatton · November 9, 2022, 11:17pm

The answer is probably not, but maybe in the future. Here is a StackOverflow thread discussing support for the Speech Synthesis Markup Language (SSML) in the Android TTS environment. It seems that on newer devices using Google TTS it may be possible to provide some SSML tags, but the <lang> tag, which you would need to accomplish your aim, doesn't seem to be supported yet. And of course older devices would not have that support at all, so things would still sound rather off there.

I should note that there are cloud-based speech synthesis services that are more feature rich, so if you're willing to pay for them you could synthesize your speech with one of those services via a Web component and then play it back via a Player component.

aiotech · November 9, 2022, 11:49pm

Hmm... I don't know about older releases, but on Android 11 it is possible to change the text-to-speech engine, so an engine that handles SSML or other markup for mixing two languages in one phrase could be installed. But can the AI2 TTS component use a different engine?

aiotech · November 9, 2022, 11:57pm

This is an example found in the SSML docs showing how to choose the voice and the language. It seems that breaking a phrase into parts would allow changing the language on the fly, but the problem may be keeping the same voice for both languages; I do not understand whether that is possible or not.

<speak version="1.0" xmlns="http://www.w3.org/2001/10/synthesis"
       xmlns:mstts="https://www.w3.org/2001/mstts" xml:lang="en-US">
    <voice name="en-US-JennyMultilingualNeural">
        <lang xml:lang="es-MX">
            ¡Esperamos trabajar con usted!
        </lang>
        <lang xml:lang="en-US">
            We look forward to working with you!
        </lang>
        <lang xml:lang="fr-FR">
            Nous avons hâte de travailler avec vous!
        </lang>
    </voice>
</speak>
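
A minimal sketch of how markup like the example above could be generated from a list of (language, text) pieces instead of being written by hand. The function name `build_ssml` and its parameters are hypothetical; they are not part of App Inventor or of any TTS engine, and whether a given engine actually honours `<lang>` is a separate question.

```python
from xml.sax.saxutils import escape, quoteattr


def build_ssml(segments, voice, base_lang):
    """Build an SSML string with one <lang> element per segment.

    segments  -- list of (language_code, text) tuples, in speaking order
    voice     -- voice name to use for the whole utterance (assumed to be
                 a multilingual voice; this depends on the engine)
    base_lang -- language code placed on the root <speak> element
    """
    parts = [
        '<speak version="1.0" xmlns="http://www.w3.org/2001/10/synthesis" '
        f"xml:lang={quoteattr(base_lang)}>",
        f"  <voice name={quoteattr(voice)}>",
    ]
    for lang, text in segments:
        # escape() protects characters such as & and < inside the spoken text
        parts.append(f"    <lang xml:lang={quoteattr(lang)}>{escape(text)}</lang>")
    parts.append("  </voice>")
    parts.append("</speak>")
    return "\n".join(parts)


if __name__ == "__main__":
    print(build_ssml(
        [("it-IT", "Andiamo a visitare la portaerei"),
         ("en-US", "George Washington")],
        voice="en-US-JennyMultilingualNeural",
        base_lang="it-IT",
    ))
```

Keeping a single voice for every segment, as here, only makes sense if the chosen voice is multilingual; otherwise each language would need its own `<voice>` element.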

aiotech · November 10, 2022, 12:18am

This works quite well with some of the available text-to-speech engines:

<speak version="1.0" xmlns="http://www.w3.org/2001/10/synthesis" xmlns:mstts="https://www.w3.org/2001/mstts" xml:lang="it-IT">
    <voice name="it-IT-ElsaNeural">
        <lang xml:lang="it-IT"> Andiamo a visitare la portaerei </lang>
    </voice>
    <voice name="en-US-JennyMultilingualNeural">
        <lang xml:lang="en-US"> George Washington </lang>
    </voice>
</speak>
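
The voice names in this snippet (it-IT-ElsaNeural, en-US-JennyMultilingualNeural) look like Azure Speech neural voices. Assuming that service and its REST text-to-speech endpoint, the sketch below prepares (but does not send) the HTTP request a client would post with the SSML as its body. The returned audio could then be saved and played back, in line with the Web component plus Player component suggestion earlier in the thread. The function name, the region value, and the key are placeholders, and the chosen output format is only one possible option.

```python
import urllib.request


def make_tts_request(ssml, region, subscription_key):
    """Prepare a POST request carrying SSML to a cloud TTS endpoint.

    Assumption: the endpoint and headers follow the Azure Speech REST
    text-to-speech API. region and subscription_key are placeholders
    that must come from your own account.
    """
    url = f"https://{region}.tts.speech.microsoft.com/cognitiveservices/v1"
    headers = {
        "Ocp-Apim-Subscription-Key": subscription_key,
        "Content-Type": "application/ssml+xml",
        # Requested audio encoding of the response body (assumed choice)
        "X-Microsoft-OutputFormat": "audio-16khz-128kbitrate-mono-mp3",
        "User-Agent": "ssml-demo",
    }
    return urllib.request.Request(
        url, data=ssml.encode("utf-8"), headers=headers, method="POST"
    )


# Sending is left out on purpose; with a valid key it would look like:
#   with urllib.request.urlopen(make_tts_request(ssml, region, key)) as resp:
#       audio_bytes = resp.read()
```

Nothing here is specific to App Inventor; it only illustrates the shape of the request that any HTTP client would have to reproduce.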

Taifun · September 19, 2023, 4:48pm

2 posts were split to a new topic: Using the Wavel.ai API
