Android Tutorial Voice Recognition Example

A large button is displayed. When the user presses the button, they are prompted to say something.
The voice recognition engine converts the audio to text.

Then the text is converted back to speech using the TTS (text to speech) library:
B4X:
'Activity module
Sub Process_Globals
    Dim VR As VoiceRecognition
    Dim TTS1 As TTS
End Sub

Sub Globals

End Sub

Sub Activity_Create(FirstTime As Boolean)
    If FirstTime Then
        VR.Initialize("VR")
        TTS1.Initialize("TTS1")
    End If
    Activity.LoadLayout("1")
    If VR.IsSupported Then
        ToastMessageShow("Voice recognition is supported.", False)
    Else
        ToastMessageShow("Voice recognition is not supported.", True)
    End If
    VR.Prompt = "Say your message"
End Sub

Sub Button1_Click
    VR.Listen 'calls the voice recognition external activity. Result event will be raised.
End Sub

Sub VR_Result (Success As Boolean, Texts As List)
    If Success = True Then
        ToastMessageShow(Texts.Get(0), True)
        TTS1.Speak(Texts.Get(0), True)
    End If
End Sub
VR is a VoiceRecognition object. Calling its Listen method launches the external voice recognition activity. When the result is ready, the Result event is raised.
We need to check the Success flag to make sure that at least one text is available.
The Texts list holds all the candidate results. We take the first one, which is supposedly the best.
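To see what the engine returned besides the first entry, the Result handler could log every candidate before speaking. This is only a sketch that reuses the VR and TTS1 objects from the example above. It assumes the entries in Texts are plain strings, and it is an alternative to the handler shown above, not an addition to it.
B4X:
Sub VR_Result (Success As Boolean, Texts As List)
    'Nothing was recognized, so there is nothing to log or speak.
    If Success = False Then Return
    'Log every candidate in the order the engine returned them.
    For Each Candidate As String In Texts
        Log(Candidate)
    Next
    'Speak the first entry, as in the original example.
    TTS1.Speak(Texts.Get(0), True)
End Sub
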
The program is attached.
 

Attachments

  • VoiceRecognition.zip
    5.6 KB

Syd Wright

Well-Known Member
Licensed User
Dear friends,

Do you have any experience with the terms of use for this voice recognition? As Syd Wright wrote, are there limits on free use of this function, or do we have to pay something? Is it 0-60 minutes free and 61-1,000,000 minutes per month at 0.006 dollars per 15 seconds? If we use voice recognition in our app via B4A, do we have to pay or not?

Best regards
p4ppc
Just to state that I do not yet have any experience with Google's new paid speech recognition API. I am pleased that finally there are some more people on this Forum who are interested in these new services. So far it has been very quiet.

Microsoft (Bing) also has a similar service which I have tried (using a library that you can find in this Forum). This library works well, but Bing does not yet offer a version in Dutch, so I did not proceed any further.
Bing is also cheaper than Google. If I remember correctly, it is US$ 0.004 per 15-second block of speech, whereas Google charges US$ 0.006.

What is also very unclear to me is WHO has to pay for these services. Logically speaking, I would think that the end user should pay and not the developer / seller of the app that uses the service. Neither Google nor Bing is at all clear about who has to pay and how this is handled.

What I do know is that, as a (registered!) developer, you can get a free trial with a limited amount of speech data. This is described on Bing and Google's API webpages.
 

petr4ppc

Well-Known Member
Licensed User
Syd Wright - thank you for your reply. I can't find on the Google website the date when this paid service begins.
How long will this service remain free? Do you have this information, please? Because I think that we can still use this B4A VR tutorial for free.

It is a natural step for Google and Bing to change from a free to a paid service, but I think that 15 seconds is a long interval. Some apps use only one or two words, and the price for two words is very high.

Thank you and have a good day
p4ppc
 

Syd Wright

Well-Known Member
Licensed User
Hi there.
You will find documentation about Google's cloud speech API here:
https://cloud.google.com/speech/docs/ and the tariffs here: https://cloud.google.com/speech/pricing

Regrettably they charge per block of 15 seconds. Each shorter speech burst is rounded up to 15 seconds.
This makes their service totally unaffordable for continuous speech recognition. If, for example, a user uses the API continuously for 10 hours a day, then the charge will be 10 (h) x 60 (min) x 4 (blocks/minute) = 2400 blocks x 0.006 US$ = 14.40 US$ a day! Clever software will be needed to suppress long periods of silence and/or background noise. In particular, pre-detecting whether the user is speaking or the microphone is merely picking up environmental sounds will be crucial. Only when speech by the user is detected should the microphone be allowed to send voice data to Google's service through the API.
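
The calculation above can be sketched as a small helper. BilledBlocks and EstimatedCost are hypothetical names, not part of any library. The sketch assumes that every request is rounded up to whole 15-second blocks at the 0.006 US$ per block quoted above, and it ignores any free monthly allowance.
B4X:
'Hypothetical helper: number of billed blocks for one request,
'assuming the duration is rounded up to whole 15-second blocks.
Sub BilledBlocks (Seconds As Double) As Int
    Return Ceil(Seconds / 15)
End Sub

'Hypothetical helper: estimated charge in US$ for one request,
'assuming 0.006 US$ per block.
Sub EstimatedCost (Seconds As Double) As Double
    Return BilledBlocks(Seconds) * 0.006
End Sub

With these assumptions, EstimatedCost(10 * 3600) covers the 2400 blocks of the 10-hour example, while EstimatedCost(2) shows that a two-second command is still billed as one full block.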

Also I repeat that I still don't know who has to pay Google: Is it the user or is it the developer/app seller?
If the latter is the case, then Google is forcing the developer to set up a billing system (and Google will probably retain 30% of revenue). Also, Google will have to provide the developer with usage data, otherwise billing will not be possible.
So far I have not found the answer to this fundamental question.
 

petr4ppc

Well-Known Member
Licensed User
Hi Syd Wright,

Thank you very much for the links to the web pages. I had read them before and know about them. I could not find the date when free use of voice recognition (for example, in this VR tutorial) ends.
What you wrote about continuous speech recognition and the prices is absolutely right.

I don't know either who has to pay Google, the user or the developer/app seller. It looks like the day will come when Google and Bing show us the final terms, with the date when the free services end and the paid services start.

Thank you very much, Syd
Best regards
p4ppc
 

Beja

Expert
Licensed User
Hi Erel,
Today I tried the example on the first page and it ran, but the TTS was slower than the voice recognition.
E.g. if you say "hello how are you", the TTS said only "are you", i.e. the last words. Not sure if this issue was raised on the previous pages because I didn't read them all.
I added a sleep call and the TTS then played the full recognized text.

B4X:
Sub VR_Result (Success As Boolean, Texts As List)
    If Success = True Then
        ToastMessageShow(Texts.Get(0), True)
        'Added delay: wait one second before speaking.
        Sleep(1000)
        TTS1.Speak(Texts.Get(0), True)
    End If
End Sub
 