B4A Library SpeechToText - Continuous Offline Voice Recognition

Biswajit · Sep 27, 2023

Jmu5667 said:
So, an update. I have a class that extracts the model. The async methods don't function correctly, but the non-async methods do. I believe I have solved the issue.

Ok. So that was the issue with the archiver library.

Biswajit · Sep 27, 2023

Ricks Film Restoration said:
If you have a "new example project" kindly make it available to the B4A community (especially me)

@Ricks Film Restoration please check the new update.

Ricks Film Restoration · Sep 27, 2023

Great, I will! Thanks! If it works you can expect another donation.

DonManfred · Sep 28, 2023

Jmu5667 said:
The async methods don't function correctly

Are you talking about the Archiver library? The async methods DO work fine.
You need to use

https://www.b4x.com/android/help/archiver.html#archiver_asyncunzip

instead of

B4A - Archiver

unzip is a synchronous method and returns the number of files unzipped. No event is raised when it finishes.

asyncunzip will raise an event...
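
A minimal sketch of the event-based pattern described above, not taken from the library's example project. It assumes the Archiver signature AsyncUnZip(ZipDir, ZipName, TargetDir, EventName) and an UnZipDone(CompletedWithoutError As Boolean, NbOfFiles As Int) event; check both against the Archiver help page linked above. ExtractModel and the file name "model.zip" are hypothetical names used only for illustration.

```b4x
' Sketch only: ExtractModel and "model.zip" are hypothetical names.
' Assumes the Archiver event is UnZipDone(CompletedWithoutError, NbOfFiles).
Sub ExtractModel As ResumableSub
    Dim ar As Archiver
    ' AsyncUnZip returns immediately; extraction continues in the background.
    ar.AsyncUnZip(File.DirInternal, "model.zip", File.DirInternal, "ar")
    ' This sub pauses here until the UnZipDone event is raised.
    Wait For ar_UnZipDone (CompletedWithoutError As Boolean, NbOfFiles As Int)
    Log("Files unzipped: " & NbOfFiles)
    Return CompletedWithoutError
End Sub
```

The caller would then wait for the result with Wait For (ExtractModel) Complete (Success As Boolean) and initialize the recognizer only when Success is True.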

Ricks Film Restoration · Sep 29, 2023

DonManfred said:
Are you talking about the Archiver library? The async methods DO work fine.
You need to use

https://www.b4x.com/android/help/archiver.html#archiver_asyncunzip
instead of

B4A - Archiver

unzip is a synchronous method and returns the number of files unzipped. No event is raised when it finishes.

asyncunzip will raise an event...

Why not just use the regular Archiver unzip without the resumable sub stuff? With the synchronous method the program simply does not continue until the unzipping is done, i.e. exactly what is needed in this speech recognition project. Or am I wrong?

Ricks Film Restoration · Oct 26, 2023

Biswajit said:
@Ricks Film Restoration please check the new update.

Hi Biswajit. Sorry for the delayed reply. Your new version of the STT speech recognizer works fine! Thank you very much. I will now make my second donation, as promised.

drgottjr · Oct 27, 2023

there is an updated library? i donated to you. don't the rest of us get to see it?

Ricks Film Restoration · Oct 27, 2023

Just use the new version that Biswajit uploaded to this thread! See #102. Nothing has been withheld from the B4A community.

drgottjr · Oct 27, 2023

i don't know what you see when you look at post #102 (or any around it), but there is nothing there but a message to you.

Ricks Film Restoration · Oct 27, 2023

drgottjr said:
i don't know what you see when you look at post #102 (or any around it), but there is nothing there but a message to you.

No, it is a message stating that the latest version (1.5) is included in this thread. Just go to #1 and click on 1 to 4 under "Downloads".

Biswajit · Oct 27, 2023

drgottjr said:
i don't know what you see when you look at post #102 (or any around it), but there is nothing there but a message to you.

Check the post #102 date and the #1 last edited date. Posting an update to the first post is the correct way. Else new users have to search for the update here and there. You can subscribe to B4A library update thread so that you can get a notification when someone posts an update.

Ricks Film Restoration · Oct 27, 2023

I am doing more tests with this STT. Sadly it crashes with the Dutch vosk-model-small-nl-0.22 when pressing the Text-to-speech button and B4A refuses to assemble the app with the large (1.8 GB) vosk-model-en-us-0.22 model. I've tried it many times and deleted the old version each time before re-assembling the app.

Ricks Film Restoration · Oct 27, 2023

PS: It also doesn't work with vosk-model-en-us-daanzu-20200905. In all cases it does unzip, but then I get an error log message: "java.io.IOException failed to create a model". (Note: The app now does install the large (1.8 GB) vosk-model-en-us-0.22 model when I don't use the B4A bridge, which keeps connecting and disconnecting, but the error still occurs.)

Ricks Film Restoration · Oct 28, 2023

Problem is solved: Immediately under Activity_Create I added:
model_zip_name = "vosk-model-en-us-0.22-lgraph.zip" (or any other model name)
model_file_name = model_zip_name.Replace(".zip","")
and changed ar.asyncunzip(file....) to ar.unzip(file....) because the asyncunzip call returns before the unzipping is complete, so the code that follows it ran too early!
The Dutch model version vosk-model-nl-spraakherkenning-0.6 is truly excellent!

Ricks Film Restoration · Oct 30, 2023

There are three more issues with this STT I'd like to report:
1. If I don't use the speech recognizer, but leave it on in a noisy room (i.e. with my TV on) for about half an hour, the STT does not immediately respond when I start to use it again. It takes dozens of spoken words before it catches up and works properly and at normal speed again. The noise in the room seems to keep filling a buffer, which slows down the STT's responsiveness. Any idea how this can be fixed?
2. The largest and most elaborate US English model, vosk-model-en-us-0.22 (1.8 GB), can be neither installed nor unzipped. I guess it is too big (?)
3. All models occupy three times their size in storage, probably because they reside in File.DirAssets, are copied to File.DirInternal and are then unzipped.
The only thing I can do is delete the zipped model from File.DirInternal after it has been successfully unzipped and installed.
I think the best way to minimize storage usage is to unzip a model beforehand on a computer, upload the entire folder structure to a website and then download all the files and folders directly to File.DirInternal, thus avoiding the unzip routine inside the app.
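
A minimal sketch of the cleanup step described above, assuming the Archiver signature UnZip(ZipDir, ZipName, TargetDir, EventName) returning the number of files unzipped. InstallModel is a hypothetical helper name, not part of the SpeechToText library. Note that the copy bundled in File.DirAssets is part of the installed APK and cannot be deleted at runtime, so this only removes the second of the three copies.

```b4x
' Sketch only: InstallModel is a hypothetical helper, not a library method.
Sub InstallModel (ZipName As String)
    ' Copy the bundled archive out of the read-only assets folder.
    File.Copy(File.DirAssets, ZipName, File.DirInternal, ZipName)
    Dim ar As Archiver
    ' Synchronous call: it returns only after extraction has finished.
    Dim count As Int = ar.UnZip(File.DirInternal, ZipName, File.DirInternal, "")
    Log("Files extracted: " & count)
    ' Delete the copied archive so only the extracted model stays in DirInternal.
    If File.Exists(File.DirInternal, ZipName) Then
        File.Delete(File.DirInternal, ZipName)
    End If
End Sub
```

Downloading the already-unzipped folder structure, as suggested above, would avoid both the bundled archive and the temporary copy.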

Biswajit · Oct 31, 2023

Ricks Film Restoration said:
1. If I don't use the speech recognizer, but leave it on in a noisy room

I think you are not supposed to leave this library running in a noisy room. It is for converting voice to text continuously. If you use it to listen for a wake-up word like "Hey Google", it might show unexpected behaviour.

Ricks Film Restoration said:
I think the best way to minimize storage usage is to unzip a model beforehand on a computer, upload the entire folder structure to a website

It's up to the developer how to optimise the app to use minimal storage.

Ricks Film Restoration · Oct 31, 2023

Biswajit said:
I think you are not supposed to leave this library running in a noisy room. It is for converting voice to text continuously. If you use it to listen for a wake-up word like "Hey Google", it might show unexpected behaviour.

It's up to the developer how to optimise the app to use minimal storage.

You don't seem to know what causes this problem?
I am indeed using it for an elaborate personal assistant which just listens all the time for questions and commands. I might also add a hotword to make it sleep and wake up again. For that purpose I have put the STT in a service, so that my assistant responds regardless of which app the user is using. It works great, apart from the problem I mentioned. I will probably make a function that destroys the service about every 10 minutes (after the user has not spoken for about a minute, assuming that he/she won't need it for the next couple of seconds) and then automatically restarts it in order to clear caches etc. What do you think?

Biswajit · Oct 31, 2023

Ricks Film Restoration said:
You don't seem to know what causes this problem?
I am indeed using it for an elaborate personal assistant which just listens all the time for questions and commands. I might also add a hotword to make it sleep and wake up again. For that purpose I have put the STT in a service, so that my assistant responds regardless of which app the user is using. It works great, apart from the problem I mentioned. I will probably make a function that destroys the service about every 10 minutes (after the user has not spoken for about a minute, assuming that he/she won't need it for the next couple of seconds) and then automatically restarts it in order to clear caches etc. What do you think?

That's what I mentioned in my previous comment. This library is not optimized for voice assistants. If you keep it running, it will consume your battery and system resources. It also doesn't support any sleep or wake-up on hotword functionality. You can add this functionality to your app so that you can process something when a hotword is detected while the app is running.

Ricks Film Restoration · Nov 29, 2023

Under Process Globals you should define:
Public model_folder_name As String = "model"
Private model_zip_name As String = "model.zip"
and make sure that you have renamed the chosen language model to "model.zip" (in lowercase)

PS: It might take some time before you read my reply: for some strange reason all my replies are "under moderation, awaiting approval". No idea why that is.
Is that usual behaviour in this forum as regards how new accounts are treated?

Paolo Pini · Dec 5, 2023

Hi,
this is a great library.

I have developed an app that recognises speech and responds to certain messages.
I am trying to change the speech recognition model at runtime in order to change the language being recognized. The change succeeds, but the STT engine does not restart until I restart the app.
I am renaming and reloading the speech models at runtime. I tried combinations of the following commands to restart the engine after changing the model:

B4X:

'test combination of:

        'STT.shutdown
        'STT.stop
        'STT.Initialize("STT", File.DirInternal & "/" & model_folder_name)
        'STT.startListening(-1)

'but  this sub is only called if I restart the application:

Sub STT_ReadyToListen
    Log("READY")
    STT.stop
    If STT.startListening(-1) Then
        Log("STT ready...")
    Else
        Log("Start failed...")
        MsgboxAsync("Start failed","")
    End If
End Sub

How can I restart the STT engine at runtime after reloading the voice model?

Thanks in advance

Paolo
