How to detect language automatically with Speech Recognition in Python

Question

I am developing an app and I want to detect the language automatically and then print it.

My code:

import speech_recognition as sr

r = sr.Recognizer()

with sr.Microphone() as source:
    audio = r.listen(source)

try:
    # Auto detect the language
    print("You said: " + r.recognize_google(audio))
except sr.UnknownValueError:
    print("Google Speech Recognition could not understand audio")
except sr.RequestError as e:
    print("Could not request results from Google Speech Recognition service; {0}".format(e))

Hope you understood.

Hi, welcome to Stack Overflow. There's not a lot of detail here about what specific problem you are having. Try looking at stackoverflow.com/help/how-to-ask
– L.Grozinger, Commented Jul 21, 2020 at 0:19

WinkDoubleguns · Accepted Answer · 2020-10-02 18:03:26Z

This may be too late to help you. However, someone else may be looking for this in the future.

I have not found a way to auto detect the language using speech_recognition. The Google API does support multiple languages in the form of an alternate languages array: it tries the different languages you've specified and transcribes the audio in whichever one fits best.
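For reference, the alternate languages feature belongs to the Google Cloud Speech-to-Text API itself, not to the `recognize_google` call in speech_recognition. Below is a minimal sketch of what such a request body might look like. It assumes the REST field names `languageCode` and `alternativeLanguageCodes` from the Cloud documentation (check them against the API version you use). `build_recognize_request` is a hypothetical helper, and nothing is sent over the network.

```python
import base64


def build_recognize_request(audio_bytes, primary, alternatives,
                            encoding="LINEAR16", sample_rate_hertz=16000):
    """Build a JSON-serializable body for a Cloud Speech-to-Text recognize call.

    Assumption: field names follow the Cloud Speech-to-Text REST API
    (RecognitionConfig); verify them for your API version.
    """
    return {
        "config": {
            "encoding": encoding,
            "sampleRateHertz": sample_rate_hertz,
            # Language the service tries first.
            "languageCode": primary,
            # Other candidate languages the service may pick instead.
            "alternativeLanguageCodes": list(alternatives),
        },
        # Raw audio is sent base64-encoded in the "content" field.
        "audio": {"content": base64.b64encode(audio_bytes).decode("ascii")},
    }


if __name__ == "__main__":
    body = build_recognize_request(b"\x00\x01", "en-US", ["es-ES"])
    print(body["config"])
```

As far as I know, each result in the response reports which language the service settled on, which is the closest thing to automatic detection here. It requires a Cloud project and credentials, unlike the free endpoint behind `recognize_google`.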

The way I worked around language detection is that I only support two languages. I could add more, but two are all I need. I use a wake word and a command, so I can say (for example) "ok computer translate from Spanish" and the command parser determines that the intent is to translate from Spanish. I used Snips for this, but you could just do string tokenization. In any case, at that point, because I know the intent is "from Spanish", I explicitly set the language code to "es" like this:

 said = r.recognize_google(audio, language="es")
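The string tokenization alternative to Snips could look roughly like the sketch below. The `LANGUAGE_CODES` table and the `language_from_command` helper are hypothetical names, and the only phrase shape handled is "translate from <language>".

```python
# Hypothetical mapping from spoken language names to recognizer codes.
LANGUAGE_CODES = {
    "english": "en-US",
    "spanish": "es",
}


def language_from_command(text, default="en-US"):
    """Return the language code named after 'translate from' in text.

    Falls back to `default` when the phrase or the language is unknown.
    """
    tokens = text.lower().split()
    for i in range(len(tokens) - 2):
        if tokens[i] == "translate" and tokens[i + 1] == "from":
            return LANGUAGE_CODES.get(tokens[i + 2], default)
    return default


if __name__ == "__main__":
    print(language_from_command("ok computer translate from Spanish"))
```

The returned code can then be passed as the `language` argument to `r.recognize_google`. Adding a language means adding one entry to the table.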

My SpeechRec class

import speech_recognition as sr

class SpeechRec:
    #https://techwithtim.net/tutorials/voice-assistant/wake-keyword/
    def record(self, lang='en'):
        r = sr.Recognizer()
        with sr.Microphone() as source:
            audio = r.listen(source)
            said = ""

            try:
                # can I detect the language?
                if lang == 'en':
                    said = r.recognize_google(audio, language='en-US')
                elif lang == 'es':
                    said = r.recognize_google(audio, language='es')

                print(said)
            except Exception as e:
                if (str(e) != ""):
                    print("Exception: " + str(e))

        return said.lower()

The listen loop - I call this from a Flask event but it'll work the same in a standalone application

WAKE = "computer"
while True:
    text = SpeechRec().record()

    if WAKE in text:
        text = SpeechRec().record()

        # Here I have a call to determine the intent via Snips. I've removed that and
        # just placed a text comparison for simplicity. Also note that with Snips I can
        # combine the wake word and command and remove two of these
        # "text = SpeechRec().record()" lines.
        if text == 'translate from spanish':
            text = SpeechRec().record('es')
        else:
            text = SpeechRec().record()

        # At this point I do the translation and print the value from the translation.

This is probably not the most elegant solution. I will probably rewrite this many times in the future. However, it works well for what I need at the moment.

I hope this helps answer your question.
