B4A Class Android Speech Recognition API Wrapper

JohnC · Feb 3, 2016

Steve,

Great job - I just happened to find your wrapper while looking into something today: I want to end up with a file of recorded audio and the text of what was spoken in that audio file.

I figured there are basically two possible ways to do this:

1) Record audio to a file, then somehow pass the audio stream from the file to the voice recognizer so it will generate the text of what was spoken in the audio.
2) Alternatively, I was hoping that the "onBufferReceived" event in your recognizer wrapper would provide the audio data (buffer()) that I could then save into an audio file, so I would end up with both an audio file and the text spoken in it.

But it looks like the "onBufferReceived" event never gets triggered.

stevel05 · Feb 3, 2016

Yes, unfortunately the API documentation says there is no guarantee that this method will be called.
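
For readers who still want to try option 2, here is a minimal sketch of how the buffers could be written to a file on devices where the callback does fire. It is an illustration only, not part of the wrapper: AudioOut, OpenAudioFile, SaveBuffer and CloseAudioFile are hypothetical names, and the file would contain raw audio samples with no header, in whatever format the recognition service delivers.

```b4x
'Add to the wrapper class. Assumed extra global in Class_Globals:
'    Private AudioOut As OutputStream

'Call once before starting recognition.
Public Sub OpenAudioFile(FileName As String)
    AudioOut = File.OpenOutput(File.DirInternal, FileName, False)
End Sub

'Call from the listener event sub when MethodName = "onBufferReceived".
'Args(0) is the byte array passed to onBufferReceived.
Private Sub SaveBuffer(Args() As Object)
    Dim Buffer() As Byte = Args(0)
    AudioOut.WriteBytes(Buffer, 0, Buffer.Length)
End Sub

'Call after onResults or onError.
Public Sub CloseAudioFile
    AudioOut.Close
End Sub
```

If the event is never raised on a given device, the file simply stays empty, so this cannot replace a separate recording path.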

BobsYourUncle · Mar 1, 2016

Steve, nice library! I noticed it works on some of my phones and not others. If the latest version of the Google App is installed, the first recognition attempt works, the second fails with a Client Error. If I then uninstall the update to the Google App, recognition works every time! Is the latest Google App forcing Recognizer to be fussier? Is there some object left open by the library that's causing the problem for the second recognition? Any thoughts?

stevel05 · Mar 1, 2016

The library is very light, just the code in the SpeechRecognition class. It appears to work OK with my Nexus 7 and Android 6.0.1. There are occasional clientside errors, but it generally works OK.

Which Google app did you update that causes the problem?

peacemaker · Mar 21, 2016

Thanks for this class!
Here is my modification for background speech recognition (SR) in a service. But after a week of debugging I still cannot get my app to work reliably. It should:

listen for silence
activate SR while suppressing the start/stop beeps
recognize speech to check for a command
give a message to the user by text-to-speech (TTS)
record a voice message from the user
play messages

All parts work well separately, but together one part interferes with another and breaks it.

Here is the class:

B4X:

'Class module
'v.2 by Pomelov Vlad aka Peacemaker
Sub Class_Globals
    Private JO As JavaObject
    Private RecognizerIntent As Intent
    Private Initialized As Boolean
    Public Busy As Boolean
    Private SpeechRecognizer As JavaObject
    Private Target As Object
    Private Lang As String
    Private SpeechRecognition_Name As String
End Sub

'Initializes the object. You can add parameters to this method if needed.
'ObjectName = name of this SpeechRecognition object
Public Sub Initialize(TargetModule As Object, RecognizeLanguage As String, ObjectName As String)

 
    SpeechRecognizer.InitializeStatic("android.speech.SpeechRecognizer")
    JO = SpeechRecognizer.RunMethod("createSpeechRecognizer",Array(JO.InitializeContext))
 
    If Not(IsRecognitionAvailable) Then
        Log("Speech Recognition Not Available")
        ToastMessageShow("Speech Recognition is not Available", True)
        Return
    End If
    SpeechRecognition_Name = ObjectName
    Lang = RecognizeLanguage
    Target = TargetModule
    RecognizerIntent.Initialize("android.speech.action.VOICE_SEARCH_HANDS_FREE", "")
    RecognizerIntent.PutExtra("calling_package",Application.PackageName)
    RecognizerIntent.PutExtra("android.speech.extra.LANGUAGE_MODEL", "free_form")
    RecognizerIntent.PutExtra("android.speech.extra.MAX_RESULTS",3)
    RecognizerIntent.PutExtra("android.speech.extras.SPEECH_INPUT_POSSIBLY_COMPLETE_SILENCE_LENGTH_MILLIS", 1000)
    'Key fixed: the constant's value has no "EXTRA_" prefix after "android.speech.extras."
    RecognizerIntent.PutExtra("android.speech.extras.SPEECH_INPUT_COMPLETE_SILENCE_LENGTH_MILLIS", 2000)
    RecognizerIntent.PutExtra("android.speech.extras.SPEECH_INPUT_MINIMUM_LENGTH_MILLIS", 1000 * 10)
 
    If Lang <> "" Then
        RecognizerIntent.PutExtra("android.speech.extra.LANGUAGE", Lang)
    End If
     
    Dim Event As Object = JO.CreateEvent("android.speech.RecognitionListener","Received","")
    JO.RunMethod("setRecognitionListener",Array(Event))
    Initialized = True
End Sub

Public Sub IsInitialized As Boolean
    Return Initialized
End Sub


Public Sub IsRecognitionAvailable As Boolean
    Dim JO1 As JavaObject
    JO1.InitializeContext
    Return JO.RunMethod("isRecognitionAvailable",Array(JO1))
End Sub
Public Sub StartListening
    Busy = True
    JO.RunMethod("startListening",Array(RecognizerIntent))
End Sub
Public Sub StopListening
    JO.RunMethod("stopListening",Null)
    Busy = False
End Sub

Public Sub Destroy
    JO.RunMethod("destroy",Null)
    Busy = False
End Sub

Public Sub cancel
    JO.RunMethod("cancel",Null)
    Busy = False
End Sub

Private Sub Received_Event (MethodName As String, Args() As Object) As Object
    Select MethodName
        Case "onBeginningOfSpeech"
        Case "onEndOfSpeech"
            Busy = False
            If SubExists(Target, SpeechRecognition_Name & "_onEndOfSpeech") Then
                CallSubDelayed(Target, SpeechRecognition_Name & "_onEndOfSpeech")
            End If
        Case "onError"
            Busy = False
            'Dim ErrorMsg As String = GetErrorText(Args(0))
            If SubExists(Target, SpeechRecognition_Name & "_onError") Then
                CallSubDelayed2(Target, SpeechRecognition_Name & "_onError", Args(0))
            End If
        Case "onResults"
            Busy = False
            Dim Results As JavaObject = Args(0)
            Dim Matches As List = Results.RunMethod("getStringArrayList",Array("results_recognition"))
            If SubExists(Target, SpeechRecognition_Name & "_onResults") Then
                CallSubDelayed2(Target, SpeechRecognition_Name & "_onResults", Matches)
            End If
        Case "onRmsChanged"
            Busy = True
            If SubExists(Target, SpeechRecognition_Name & "_onRmsChanged") Then
                CallSubDelayed2(Target, SpeechRecognition_Name & "_onRmsChanged", Args(0))
            End If
    End Select
 
End Sub

Sub GetErrorText(ErrorCode As Int) As String
    Select ErrorCode
        Case SpeechRecognizer.GetField("ERROR_AUDIO")
            Return "Audio Recording error"
        Case SpeechRecognizer.GetField("ERROR_CLIENT")
            Return "Client side error"
        Case SpeechRecognizer.GetField("ERROR_INSUFFICIENT_PERMISSIONS")
            Return "Insufficient permissions"
        Case SpeechRecognizer.GetField("ERROR_NETWORK")
            Return "Network error"
        Case SpeechRecognizer.GetField("ERROR_NETWORK_TIMEOUT")
            Return "Network timeout"
        Case SpeechRecognizer.GetField("ERROR_NO_MATCH")
            Return "No match"
        Case SpeechRecognizer.GetField("ERROR_RECOGNIZER_BUSY")
            Return "RecognitionService busy"
        Case SpeechRecognizer.GetField("ERROR_SERVER")
            Return "error from server"
        Case SpeechRecognizer.GetField("ERROR_SPEECH_TIMEOUT")
            Return "No speech input"
        Case Else
            Return "Didn't understand, please try again."
    End Select
End Sub

A working project is also attached.
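
A minimal usage sketch for the class above, assuming the class module is named SpeechRecognition and lives in a project whose manifest requests the RECORD_AUDIO permission (for example with AddPermission(android.permission.RECORD_AUDIO) in the Manifest Editor). The object name "sr" passed to Initialize determines the event sub names, because the class calls ObjectName & "_onResults" and ObjectName & "_onError". The Listen sub is a hypothetical helper.

```b4x
'Starter service (sketch). SpeechRecognition is the assumed class module name.
Sub Process_Globals
    Public sr As SpeechRecognition
End Sub

Sub Service_Create
    'An empty language string skips the LANGUAGE extra (see Initialize above).
    sr.Initialize(Me, "", "sr")
End Sub

'Hypothetical helper: start a recognition pass if none is running.
Public Sub Listen
    If sr.IsInitialized And sr.Busy = False Then sr.StartListening
End Sub

'Raised by the class with the list of candidate transcriptions.
Sub sr_onResults(Matches As List)
    If Matches.IsInitialized And Matches.Size > 0 Then
        Log("Best match: " & Matches.Get(0))
    End If
End Sub

'Raised by the class with the numeric SpeechRecognizer error code.
Sub sr_onError(ErrorCode As Int)
    Log("Recognition error " & ErrorCode & ": " & sr.GetErrorText(ErrorCode))
End Sub
```

The class only raises an event if a matching sub exists in the target module, so any handler that is not needed can be left out.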

canalrun · Apr 6, 2016

Hello,
I have put together an example app that tests stevel05's library.

Thanks stevel05

I have attached the source.

Barry.

JackKirk · Apr 12, 2016

stevel05,

Thanks for wrapping this API - looks like the most palatable way to get speech recognition.

Except for one problem - when I run your sr.zip project straight out of the box on my Samsung S5 I get "Speech Recognition Not Available". Do I have to fiddle with something in the phone? I have googled and generally rooted around without success.

Any advice would be appreciated...

peacemaker · Apr 12, 2016

JackKirk said:
"Speech Recognition Not Available"

Try my sample project. Above.

JackKirk · Apr 13, 2016

peacemaker,

In your sample project I get:

Starter.sr.IsRecognitionAvailable = False

in Main.butStart_Click

I must have some external setting wrong - but what?

Regards...

JackKirk · Apr 13, 2016

I've sorted it.

After some considerable googling it turns out that at some stage I must have uninstalled "Google voice typing" - if it ever was installed.

I went to the Play Store, searched for "google" and installed the first app (called, rather mysteriously, "Google").

Now all sample apps mentioned in this thread work!

Also, I now have a microphone icon on any keyboard that is raised - which is what I was really after - so I don't have to modify my app - bliss!

Sorry for inconveniencing everyone...

peacemaker · Apr 13, 2016

I meant: did butSetup not help you do this without googling?

JackKirk · Apr 13, 2016

peacemaker,

When I installed your example I tapped butSetup but it had no obvious effect.

Once I sorted it as per my post #11 everything worked.

Thanks for your interest in my problem...

ArminKH · Apr 16, 2016

Great job @stevel05, it works fine on my Huawei G610 with Android 4.2.2.
I tried it with 4 languages and the accuracy is awesome.
Thank you.

Rusty · Aug 9, 2016

Any solutions to this?
I'm having the same problems...
The first time works fast and well; the second and later attempts end with error 5 (client side error), then a very long delay, and finally my speech is returned as text.
Unusable with the delays/errors.
Thanks,
Rusty

rboeck · Sep 3, 2016

Today I got a Google app update on my device; the update is from yesterday - 2 Sept 2016 - and the pause after recognition is completely gone! Now it works for me as I always wished.

canalrun · Oct 20, 2016

In post #7 of this thread:
https://www.b4x.com/android/forum/threads/android-speech-recognition-api-wrapper.62959/#post-414575

I uploaded a test app for B4A that tests stevel05's library.

I have an App that is essentially not much more than this test app. When I use my App on all five of my test phones and my tablet it works almost 100% of the time, but when friends use it on their phones (having approximately the same capabilities on their phones), or if I use it on their phones, sometimes it works, sometimes it doesn't recognize the end of speech, sometimes it abruptly stops before the end of speech, and sometimes it does not return results.

My test devices range from Android 4.x to 5.x. My friends' phones are similar, or even the same model of phone. My tablet is 6.0.1.

It does not seem to be an Android version problem.

My test devices work whether connected to Wi-Fi or cellular data.

I have tested on my friends' phones and it doesn't work consistently on them even when I am the one speaking.

Any ideas of what the problem could be?
Could people download and quickly try the test app in post #7?

Thanks,
Barry.

canalrun · Oct 20, 2016

Syd Wright said:
The problem is caused by the Google App. ... Make sure that your users have at least Google App version 6.3.38 installed! That should solve most problems.

Thanks!

JackKirk · Jan 21, 2017

I have taken stevel05's class, totally tidied it up, added full beep management, made it able to handle offline speech recognition and documented the bejesus out of it - see attached zip containing class and ultra simple example of usage.

Some notes on beep management:

This is done via the XSpeechRecognizer_Beep_Mute_Time parameter of the class's Initialize method.
If set to 0 you will understand the need for beep muting.
If set to 800 (as suggested by Rick Harris in this post:
https://www.b4x.com/android/forum/t...chrecognizer-library.27409/page-3#post-220692)
then this will typically eliminate all but a minuscule beep every so often.
If set to a value of 4000-6000 or higher you can totally eliminate beeps - at this level restoration of music volume is occurring at a slower rate than occurrence of error number 6 meaning music volume is never restored until the final Flush in Activity_Pause.
In my view this is not responsible behaviour - if something else wants to use the music channel then the user won't hear it - not sure what that could be (any comment here?)
If you were OK with this behaviour then you could simplify the class by taking out all the music muting/restoration and just mute in the Start method and restore in the Flush method.
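
The simplified approach in the last note could be sketched as follows. This is not the code from the attached class: it assumes the Phone library is referenced, that the beeps play on the music channel (as the notes above imply), and MuteBeeps and RestoreBeeps are hypothetical sub names.

```b4x
'Requires the Phone library. Both sub names are hypothetical.
Sub Process_Globals
    Private ph As Phone
End Sub

'Call just before recognition is started.
Sub MuteBeeps
    ph.SetMute(ph.VOLUME_MUSIC, True)
End Sub

'Call when recognition is finished for good, e.g. from Activity_Pause.
Sub RestoreBeeps
    ph.SetMute(ph.VOLUME_MUSIC, False)
End Sub
```

The trade-off is the one described above: while muted, nothing else on the music channel is audible either.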

Some notes on offline speech recognition:

This was the whole reason for my embarking on this exercise.
It is simply achieved via the following statement in the class's Start method:
SpeechRecognizerIntent.PutExtra("android.speech.extra.PREFER_OFFLINE", True)
But to take advantage of it you have to make sure the phone is configured for offline speech recognition. The user notes in my app on this subject are as follows:

...you may be able to improve performance by configuring your phone for offline speech recognition - this stops Android offloading speech recognition to a remote server, eliminating reliance on (possibly flakey) mobile networks and significantly reducing latency.

Offline speech recognition is a fairly recent Android enhancement which may or may not be available on your phone - to implement it on a Samsung S5 running Android 6 (Marshmallow):

● Ensure you have the Google app installed - yes, there is an app simply called 'Google' - go to [Play Store] and search for 'Google' - it should be the first hit.
● Go to [Settings]/[Language and input]/[Google voice typing].
● Under [Languages] ensure the language you want speech recognition to be done in is checked.
● Under [Offline speech recognition] ensure you have downloaded the language pack that matches the one checked in the previous step (look under the [ALL] tab for downloadable language packs).
● Test by turning on [Flight mode] then ...

Happy coding...

EDIT

I have found that there has recently been a change of behaviour of speech recognition - at least on my Samsung S5 - it no longer raises ERROR_SPEECH_TIMEOUT onError event when a speech timeout actually occurs - as a consequence (using my original effort) you get about a 5 sec window to say what you have to say and then it goes dead.

I have updated my zip (XSpeechRecognizer2.zip), which solves this rather weird development by simulating an ERROR_SPEECH_TIMEOUT onError event at a user defined interval.

If you set this interval to less than about 5000 millisecs everything works as before.

EDIT 2

As of 12 Jun 17 I have found that the problem referred to in the above edit no longer applies - looks like they have fixed it.

So you can either rip out anything that refers to:

Obj_force_speech_timeout
XSpeechRecognizer_Speech_Timeout

Or, when initializing:

Obj_speech.Initialize(Me, "Event_obj_speech", 800, 0)

if the last parameter is set to 0 it turns the (now redundant) fix off.

A NEW OBSERVATION

As of 12 Jun 17 (and probably before) on my Samsung S5, if you use the statement:

RecognizeSpeech_Intent.PutExtra("android.speech.extra.PREFER_OFFLINE", True)

in the Start procedure, this now seems to mean "you can only do offline" rather than the previous "you can do either offline or online".
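
Given that change in behaviour, one option is to make the extra conditional so the caller decides per device. A minimal sketch, where ApplyOfflinePreference and PreferOffline are hypothetical names and not part of the attached class:

```b4x
'Hypothetical helper: only add the extra when offline-only behaviour is acceptable.
Public Sub ApplyOfflinePreference(RecognizeIntent As Intent, PreferOffline As Boolean)
    If PreferOffline Then
        RecognizeIntent.PutExtra("android.speech.extra.PREFER_OFFLINE", True)
    End If
End Sub
```

Calling it with False leaves the intent untouched, so the recognizer keeps its default online behaviour.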

Happy coding...

JohnC · Jan 21, 2017

Is there a way to use an audio file (of recorded speech) as the input audio/speech (to be translated) instead of using the microphone with this lib?
