This is a wrapper for Alpha Cephei VOSK. With it, you can add continuous offline speech recognition to your application.

NOTE:
  1. As it works offline, the app must be compiled with the voice model. This will increase the app size by 30-40 MB.
  2. The accuracy depends on the voice model. You can train your own voice model. For more details check the models download link below.
  3. Remember to add RECORD_AUDIO permission.
How to use:
  1. Download the required voice model from here.
  2. Change the file name to a simple one like "model.zip".
  3. Copy it to the Files folder of your project.
  4. To use the model in your code, see the attached example.

SpeechToText

Author:
@Biswajit
Version: 1.5
  • SpeechToText
    • Events:
      • Error (message As String)
      • FinalResult (text As String)
      • MicrophoneBuffer (buffer() As Byte)
      • PartialResult (text As String)
      • Paused (paused As Boolean)
      • ReadyToListen
      • ReadyToListenEx (new)
      • ReadyToRead
      • Restarted
      • Result (text As String)
    • Fields:
      • sampleRate As Int
        Default 16000
    • Functions:
      • cancel As Boolean
        Cancel microphone recognition. Do not post any new events, simply cancel processing.
        Does nothing if recognition is not active.
        Returns: True if recognition was actually stopped.
      • FeedExternalBuffer (ExBuffer As Byte()) (new)
        For recognizing the external audio buffer, feed the buffer here.
        ExBuffer: The external audio byte buffer.
      • Initialize (eventName As String, modelPath As String)
        Initialize the object.
        eventName: The event name prefix.
        modelPath: The model folder path.
      • pause (pause As Boolean)
        Pause microphone recognition.
        pause: Pass true to pause and false to continue.
      • prepareAudioFile (audioPath As String, predefinedWords As String)
        Prepare the audio file for recognition. On success Eventname_ReadyToRead event will be raised.
        Call startReading to start reading the file.
        audioPath: Audio file path.
        predefinedWords: Add some predefined words/phrase as JSON string. Can be blank.
      • prepareListenerEx (predefinedWords As String) (new)
        Prepare the listener for external audio buffer. On success Eventname_ReadyToListenEx event will be raised.
        Call startListeningEx to start listening.
        predefinedWords: Add some predefined words/phrase as JSON string. Can be blank.
      • prepareMicrophone (predefinedWords As String)
        Prepare the microphone for listening. On success Eventname_ReadyToListen event will be raised.
        Call startListening to start listening.
        predefinedWords: Add some predefined words/phrase as JSON string. Can be blank.
      • reset
        Resets the microphone recognizer in a thread and starts microphone recognition over again.
      • shutdown
        Shutdown the microphone recognizer and release the recorder.
        Call this on activity or service closing event.
      • startListening (timeout As Int) As Boolean
        Starts microphone recognition. After the specified timeout, listening stops and
        endOfSpeech is signaled. Does nothing if recognition is already active.
        timeout: Timeout in milliseconds. Pass -1 to listen indefinitely.
        Returns: True if recognition was actually started.
      • startListeningEx As Boolean (new)
        Starts external audio buffer recognition.
        Returns: True if recognition was actually started.
      • startReading (timeout As Int) As Boolean
        Starts file recognition. After the specified timeout, reading stops and
        endOfSpeech is signaled. Does nothing if recognition is already active.
        timeout: Timeout in milliseconds. Pass -1 for no timeout.
        Returns: True if recognition was actually started.
      • stop As Boolean
        Stops microphone/file recognition. The listener should receive the final result if there is
        one. Does nothing if recognition is not active.
        Call this on the activity or service closing event.
        Returns: True if recognition was actually stopped.
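
The functions and events listed above could be combined roughly as follows. This is only a sketch, not the attached example: it assumes the event name prefix "STT", a model that has already been unpacked to a folder named "model" under File.DirInternal (the folder name and location are assumptions), and the RuntimePermissions library for the RECORD_AUDIO request. RecognizeFile is a hypothetical helper name.
B4X:
Sub Globals
    Private STT As SpeechToText
    Private rp As RuntimePermissions
End Sub

Sub Activity_Create(FirstTime As Boolean)
    'Assumption: the model was already unpacked into this folder
    Dim modelDir As String = File.Combine(File.DirInternal, "model")
    STT.Initialize("STT", modelDir)
    'RECORD_AUDIO must be granted before the microphone is prepared
    rp.CheckAndRequest(rp.PERMISSION_RECORD_AUDIO)
    Wait For Activity_PermissionResult (Permission As String, Result As Boolean)
    If Result Then STT.prepareMicrophone("") 'blank = no predefined words
End Sub

'Raised after prepareMicrophone succeeds
Sub STT_ReadyToListen
    '-1 = no timeout, keep listening until stop/cancel is called
    If STT.startListening(-1) = False Then Log("Recognition was already active")
End Sub

Sub STT_PartialResult (text As String)
    Log("Partial: " & text)
End Sub

Sub STT_FinalResult (text As String)
    Log("Final: " & text)
End Sub

Sub STT_Error (message As String)
    Log("STT error: " & message)
End Sub

'Hypothetical helper: recognize a WAV file instead of the microphone
Sub RecognizeFile (wavPath As String)
    STT.prepareAudioFile(wavPath, "")
End Sub

'Raised after prepareAudioFile succeeds
Sub STT_ReadyToRead
    STT.startReading(-1)
End Sub

Sub Activity_Pause (UserClosed As Boolean)
    'Release the recognizer and the recorder when the activity closes
    If UserClosed Then STT.shutdown
End Sub

Each prepare call only raises its Ready event; recognition starts when the matching start function is called from that event.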
Downloads:
  1. Library
  2. Example
  3. Voice Model
  4. Test app
Update:
  • Version 1.1:
    1. Added audio file to text functionality. (For now only WAV format is supported)
    2. Added predefined word/phrase detection functionality.
    3. Merged startListening and startListening2 together. Pass -1 for continuous recognition.
  • Version 1.2:
    1. Added MicrophoneBuffer event where you will receive the microphone audio buffer while using voice recognition.
  • Version 1.3:
    1. Added method to change the sampling rate.
  • Version 1.4:
    1. Fixed the app crash when calling shutdown without starting the recognizer.
  • Version 1.5:
    1. Added option to feed external audio buffer. Instead of using the internal audio recorder you can feed external audio buffer from another audio source.
      (Check the latest example project)
    2. Updated VOSK and JNA library. (Please delete old dependencies before copying the new ones.)
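
The external audio buffer option added in version 1.5 might be wired up along these lines. Treat it as a sketch under assumptions rather than the code from the example project: AudioSource_Chunk stands for a hypothetical callback from your own audio source, STT is assumed to be already initialized with the prefix "STT", and the buffer is assumed to be raw audio matching sampleRate (16000 by default), which the documentation above does not state.
B4X:
Sub Globals
    Private STT As SpeechToText
    Private listeningEx As Boolean
End Sub

Sub StartExternalRecognition
    'Blank = no predefined words
    STT.prepareListenerEx("")
End Sub

'Raised after prepareListenerEx succeeds
Sub STT_ReadyToListenEx
    'True only if recognition was actually started
    listeningEx = STT.startListeningEx
End Sub

'Hypothetical callback: your audio source delivers a chunk of audio bytes
Sub AudioSource_Chunk (chunk() As Byte)
    If listeningEx Then STT.FeedExternalBuffer(chunk)
End Sub

The flag simply keeps chunks from being fed before startListeningEx has reported success.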

If you like my work, please donate. Your donations will encourage me to add more features in the future.

 

Biswajit

Active Member
Licensed User
Longtime User

Why are you sharing these videos? If you want to add the functionality you have to think of a solution. If you are facing a problem post that to the forum directly. I am not gonna check the code from the video and do it for you as it's not related to this library.

This is a simple example of adding a space/dot when the library detects no speech:
B4X:
Sub STT_PartialResult(text As String)
    If text.Trim.Length > 0 Then
        'Speech detected: show it and stop the silence timer
        partialResultBox.Text = text
        timer.Enabled = False
    Else
        'No speech: run the silence timer while recognition is active
        If StopBtn.Enabled Then timer.Enabled = True
    End If
End Sub

Sub timer_Tick
    'Add a space/dot on each tick while the partial result is empty
    resultBox.Text = resultBox.Text & " . "
End Sub
 

Attachments

  • Screenshot_20221103-220549.jpg

Biswajit

Active Member
Licensed User
Longtime User
The Test-app APK file works very well! This is really great. However, I keep getting this error when implementing this speech recognizer in my app: java.lang.UnsatisfiedLinkError: Native library (com/sun/jna/android-aarch64/libjnidispatch.so) Not found in resource path (.). I have found both .so files and have tried to include them in dozens of ways, including adding the entire "lib" folder from the APK file, modifying the path in jna.jar, and using "#AdditionalRes: ..\lib", but nothing works. I have also read all the threads about including .so files, but I am still stuck. Please help!
 

DonManfred

Expert
Licensed User
Longtime User
looks like there is no 64bit libjnidispatch.so inside the library.

It needs to be added to the library jar file, in the right subfolder. Remember that a jar is a zip file.
Thanks. I tried to modify the jna.jar file (with zip) by adding the folder: android-aarch64 to: com/sun/jna/ and inserted both .so files there that I derived from the lib folder within the APK "Test App" file, but no success. PS: I also added this entire lib folder to my B4A project. The APK file works fine!
What I also find strange is that there is no reference anywhere to this jna.jar (nor to the vosk-adroid.aar) file. I guess it must be somewhere hidden in the SpeechToText.jar /xml library (?)
 
The needed .so is inside the AAR.
At least for
- arm64-v8a
- armeabi-v7a
- x86
- x86_64

What Device do you have?
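
If it helps, the ABIs a device supports can be logged from code. This is a small sketch using the JavaObject library and the standard android.os.Build.SUPPORTED_ABIS field (available from API 21); it is not part of the SpeechToText library, and LogSupportedAbis is just an illustrative name.
B4X:
Sub LogSupportedAbis
    Dim jo As JavaObject
    jo.InitializeStatic("android.os.Build")
    'SUPPORTED_ABIS is a String array, most preferred ABI first
    Dim abis() As String = jo.GetField("SUPPORTED_ABIS")
    For Each abi As String In abis
        Log(abi)
    Next
End Sub

A 64-bit ARM device should list arm64-v8a, which is the ABI that the android-aarch64 folder in the error message refers to.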
Samsung Galaxy TAB S7 with Android 13. Shouldn't the AAR be added to my B4A project (or is it maybe used within the SpeechToText.jar /xml library)? Also, why am I getting the aforementioned error, which seems to ignore the AAR: java.lang.UnsatisfiedLinkError: Native library (com/sun/jna/android-aarch64/libjnidispatch.so) Not found in resource path (.)?
 

drgottjr

Expert
Licensed User
Longtime User
i've checked my downloads:
SpeechToText_v1.3.zip and SpeechToText_v1.4.zip (the latest, as far as i am aware) both contain all the files you need to build an app. unzip the jar and put all the files in your additional libraries folder. when you build the project, just select SpeechToText from the libraries tab in the IDE.
i've had no trouble building. see https://www.b4x.com/android/forum/threads/talk-to-the-hand-the-final-frontier.143639/#content
for a comparison between android's speech recognition capabilities and vosk's. the vosk part was built with the library right out of the box.
 
Thank you very much. Obviously I did follow all the instructions by Biswajit. The only explanation I have for why it doesn't work for me is that I am still using an old B4A version (8.3). I have not done much with B4A for ages (lost interest). Sad to read that Vosk is not as good as Google. I mainly want to use Vosk for continuous speech recognition and/or as a hotword detector. I was using Snowboy for that, but it doesn't work on recent Android versions.
 