App Inventor SpeechRecognizer1: MP3 or WAV input option

Hi all,

When using the SpeechRecognizer block, is it possible to load a WAV or MP3 file and have it transcribed into text or a text file?
I don't want to use the microphone on the Android device.

Basically, I have a remote device that can send WAV audio files to the mobile phone, and I need the audio transcribed to text and either sent back to the mobile device or saved in a file on the device.

Thank you


Welcome, Thomas.

I don't think so, Thomas. I tried using the microphone (which you do not want to do) and found it virtually impossible to get the SpeechRecognizer to recognize a song's lyrics when the song was played on another device. The background music probably makes it impossible for the SpeechRecognizer to capture the spoken words (lyrics).

I do not think it is possible to capture lyrics from a song unless the background music is very soft. Capturing lyrics from the MP3 directly is probably impossible. Capturing your speech is only possible if you speak in a clear voice with little background interference.

mp3 to text: possible non-App Inventor alternatives for transcribing text from an MP3.
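
For anyone going the non-App Inventor route, most transcription tools are picky about the audio format, so it can help to check what the recording actually contains before sending it anywhere. Below is a minimal sketch in Python using only the standard `wave` module. It assumes the audio arrives as an uncompressed PCM WAV file; the function names `describe_wav` and `make_test_tone` are made up for illustration, and the test tone only stands in for a real recording. It does not do any transcription itself.

```python
import io
import math
import struct
import wave


def describe_wav(data: bytes) -> dict:
    """Return basic format information for a PCM WAV file held in memory."""
    with wave.open(io.BytesIO(data), "rb") as wav:
        frames = wav.getnframes()
        rate = wav.getframerate()
        return {
            "channels": wav.getnchannels(),
            "sample_width_bytes": wav.getsampwidth(),
            "sample_rate_hz": rate,
            "duration_s": frames / rate,
        }


def make_test_tone(seconds: float = 0.5, rate: int = 16000) -> bytes:
    """Build a mono 16-bit sine-wave WAV in memory (stand-in for a real recording)."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wav:
        wav.setnchannels(1)   # mono
        wav.setsampwidth(2)   # 2 bytes per sample = 16-bit PCM
        wav.setframerate(rate)
        n = int(seconds * rate)
        samples = (
            int(12000 * math.sin(2 * math.pi * 440 * i / rate)) for i in range(n)
        )
        wav.writeframes(b"".join(struct.pack("<h", s) for s in samples))
    return buf.getvalue()


if __name__ == "__main__":
    # Hypothetical usage: replace make_test_tone() with the bytes of a received file.
    print(describe_wav(make_test_tone()))
```

If the reported channel count, sample width, or sample rate is not what the chosen transcription tool expects, the file would need converting first.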


Hi, thanks for the reply.
Just to explain in more detail, I'm basically trying to build a speech-to-text converter for a project of mine using an ESP32 microcontroller.

My ESP32 device has dual MEMS microphones and records speech when it is spoken to.
It then sends the speech as a WAV or MP3 audio file to my mobile app, where I was hoping the App Inventor SpeechRecognizer would convert the audio file to text and send it back to my ESP32 device.

I can use AWS Polly and other online API calls, but the complexity and latency of going through certificates and other issues is just hair-pulling lol.

Is App Inventor looking at adding MP3 or WAV file input to its SpeechRecognizer1 component?


P.S. (I've been an engineer for 20 years and App Inventor is the best interface I've ever had the pleasure to use. Hats off to the MIT students.)

My typo: Speech Recognition to Text Converter.



This doesn't use your microcontroller, but it might be something you find useful.


I know this topic is about converting speech to text, but I would like to share this link. You upload a song and get back one file with the music (instrumental) and another with the sung lyrics (vocal). Give it a try.