Android Question recognize a sound with the FFT library

Nkalampika · Aug 26, 2018

Hello, I would like to recognize a sound (a beep) that is in a WAV file and compare it with a live sound. How can I do it?

klaus · Aug 27, 2018

Not easy to do. I'm almost sure that it is not possible with the FFT library and an incoming sound in real time.

Nkalampika · Aug 27, 2018

MarkusR · Aug 27, 2018

Recording from the microphone is mentioned here:
https://www.b4x.com/android/forum/threads/b4x-fft-class.78797/

stevel05 · Aug 27, 2018

Identifying when a sound starts and ends is not too difficult: you would need to parse the incoming data and look for the sound start and end compared to the base silence level. Identifying what that sound is would be far more difficult. You would probably need to delve into the realms of artificial intelligence. I couldn't find a ready-made Java library that would do it.
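The start/end detection described above can be sketched roughly as follows. This is only an illustration in Python, not B4A code: the function names, the frame length, the factor of 4 over the silence level, and the assumption that the first frames contain only silence are all made up for the example.

```python
import math
import random

def frame_rms(samples, frame_len):
    """RMS level of each consecutive, non-overlapping frame."""
    levels = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        levels.append(math.sqrt(sum(s * s for s in frame) / frame_len))
    return levels

def find_sound_span(samples, sample_rate, frame_len=256, factor=4.0, noise_frames=10):
    """Return (start_s, end_s) of the first sound, or None if nothing is found.

    Assumption: the first `noise_frames` frames contain only background
    noise, so their average RMS serves as the base silence level.
    """
    levels = frame_rms(samples, frame_len)
    if not levels:
        return None
    noise_frames = min(noise_frames, len(levels))
    silence = sum(levels[:noise_frames]) / noise_frames
    threshold = max(silence * factor, 1e-9)
    loud = [i for i, level in enumerate(levels) if level > threshold]
    if not loud:
        return None
    start = end = loud[0]
    # Extend the span while the following frames stay above the threshold.
    while end + 1 < len(levels) and levels[end + 1] > threshold:
        end += 1
    return start * frame_len / sample_rate, (end + 1) * frame_len / sample_rate

# Demo signal: 1 s at 8 kHz, faint noise throughout,
# plus a 1 kHz beep from 0.50 s to 0.75 s.
SAMPLE_RATE = 8000
rng = random.Random(0)
signal = [rng.uniform(-0.01, 0.01) for _ in range(SAMPLE_RATE)]
for n in range(4000, 6000):
    signal[n] += 0.5 * math.sin(2 * math.pi * 1000 * n / SAMPLE_RATE)

print(find_sound_span(signal, SAMPLE_RATE))
```

The reported times are only accurate to one frame (32 ms with these values); a shorter frame gives finer timing at the cost of a noisier level estimate.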

nkala mpika · Aug 27, 2018

MarkusR · Aug 27, 2018

I think a beep is defined by some simple waveforms. The FFT should output the frequencies used, and you define the frequency range.
To compare, I would run your WAV file or the direct recording through this FFT and save the result, maybe in an SQLite database.
After that you can run a query that gives you the matching record sets.
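One possible reading of this suggestion, sketched in Python with the standard sqlite3 module. The table layout, the function names, and the 25 Hz tolerance in the usage lines are assumptions for illustration; the matching is done in Python after a simple query rather than in a single SQL statement.

```python
import sqlite3

def open_store(path=":memory:"):
    """Open (or create) a database with one row per stored peak frequency."""
    db = sqlite3.connect(path)
    db.execute("CREATE TABLE IF NOT EXISTS peaks (sound TEXT, freq_hz REAL)")
    return db

def save_sound(db, name, peak_freqs_hz):
    """Store the FFT peak frequencies found for one sound."""
    db.executemany("INSERT INTO peaks VALUES (?, ?)",
                   [(name, f) for f in peak_freqs_hz])
    db.commit()

def find_matches(db, peak_freqs_hz, tolerance_hz):
    """Names of stored sounds with the same number of peaks, each within tolerance."""
    query = sorted(peak_freqs_hz)
    matches = []
    names = [row[0] for row in
             db.execute("SELECT DISTINCT sound FROM peaks ORDER BY sound")]
    for name in names:
        stored = [row[0] for row in db.execute(
            "SELECT freq_hz FROM peaks WHERE sound = ? ORDER BY freq_hz", (name,))]
        if len(stored) == len(query) and all(
                abs(s - q) <= tolerance_hz for s, q in zip(stored, query)):
            matches.append(name)
    return matches

db = open_store()
save_sound(db, "beep_a", [500, 1000, 1500, 2000])
save_sound(db, "beep_b", [800, 1600])
print(find_matches(db, [505, 995, 1510, 2004], 25))
```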

canalrun · Aug 27, 2018

Nkalampika said:
Hello, I would like to recognize a sound (a beep) that is in a WAV file and compare it with a live sound. How can I do it?

I have done that before: recognized a tone in real time using FFTs. I believe I may have used Klaus's FFT library or I may have used my own. Search this forum for "Canalrun FFT"; maybe I uploaded it, I don't remember.

I captured a short period, maybe half a second, of raw, real-time microphone data, performed a 16-point FFT, computed the magnitude of each bin (the square root of the sum of the squared real and imaginary parts), and checked whether the bin corresponding to the expected tone frequency had a power level above the computed background noise. Doing this requires a little signal processing knowledge, but it's not too bad.

I used the first FFT that detected the tone and counted the number of consecutive FFTs that contained the tone to estimate the tone duration.
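A rough Python sketch of this kind of detector follows. It is not the original code, which is lost. Two things are swapped in and should be read as assumptions: the Goertzel algorithm evaluates only the one bin of interest instead of a full FFT, and the "background" is taken to be whatever energy in the block is not in the tone bin, since the post does not say how the noise level was computed. Block length, hop, and the 0.6 threshold are illustrative.

```python
import math
import random

def goertzel_power(block, sample_rate, freq_hz):
    """Squared magnitude of the single DFT bin nearest freq_hz (Goertzel)."""
    n = len(block)
    k = round(freq_hz * n / sample_rate)
    coeff = 2.0 * math.cos(2.0 * math.pi * k / n)
    s_prev = s_prev2 = 0.0
    for x in block:
        s = x + coeff * s_prev - s_prev2
        s_prev2, s_prev = s_prev, s
    return s_prev * s_prev + s_prev2 * s_prev2 - coeff * s_prev * s_prev2

def tone_flags(samples, sample_rate, freq_hz, block_len=128, hop=64, min_fraction=0.6):
    """One flag per sliding block: does the tone bin hold most of the energy?"""
    flags = []
    for start in range(0, len(samples) - block_len + 1, hop):
        block = samples[start:start + block_len]
        energy = sum(x * x for x in block)
        if energy <= 0.0:
            flags.append(False)
            continue
        # Parseval: all bins together hold block_len * energy. For a real
        # signal the tone bin and its mirror image together hold 2 * power.
        power = goertzel_power(block, sample_rate, freq_hz)
        flags.append(2.0 * power / (block_len * energy) > min_fraction)
    return flags

def longest_tone_seconds(flags, sample_rate, block_len=128, hop=64):
    """Duration covered by the longest run of consecutive detections."""
    best = run = 0
    for flag in flags:
        run = run + 1 if flag else 0
        best = max(best, run)
    if best == 0:
        return 0.0
    return ((best - 1) * hop + block_len) / sample_rate

# Demo: faint noise for 1 s at 8 kHz with a 1 kHz tone from 0.40 s to 0.64 s.
SAMPLE_RATE = 8000
TONE_HZ = 1000
rng = random.Random(1)
signal = [rng.uniform(-0.01, 0.01) for _ in range(SAMPLE_RATE)]
for n in range(3200, 5120):
    signal[n] += 0.5 * math.sin(2 * math.pi * TONE_HZ * n / SAMPLE_RATE)

flags = tone_flags(signal, SAMPLE_RATE, TONE_HZ)
print(longest_tone_seconds(flags, SAMPLE_RATE))
```

With a 0.6 threshold, blocks that are only half filled by the tone are not counted, so the estimate errs on the short side when the tone does not line up with the block boundaries.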

It can be done on a fairly recent device. I believe I was using hardware comparable to an LG G2 to do this.

Barry.

techknight · Aug 30, 2018

I did a quick search; you didn't post anything of yours anywhere. Could you? I need something similar to detect a coach whistle, which actually has 3 different frequencies and a beat frequency created by the 3 coupled together.

canalrun · Aug 30, 2018

I couldn't find it online in these forums either. I did this about four years ago, the software is on another computer, and unfortunately long gone.

Thinking about what I did:

I captured data from the microphone, guided by one of Erel's examples. I believe you are able to specify the number of samples you want, and you will receive an event with a buffer containing those samples. I chose some power-of-two number of points, probably 1024. Once I had the buffer of points, I added it to a global list.

I had a timer firing at somewhere between 1 and 5 ms. In the timer routine I would check if the global list contained a buffer. If it did, I would do multiple sliding FFTs, probably 128-point, on the data array. I would compute the magnitude of each bin (the square root of the sum of the squared real and imaginary parts) and look for the tones within the resulting bins.

The only tricky part is that the number of audio samples imposes a time constraint. If you're using 44.1K samples per second and 1024 points, the time constraint is 1024/44,100, or about 23 ms. You need to complete the timer FFT computations within those 23 ms.
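The time budget can be checked with a few lines. This is a Python sketch with made-up function names; `process` stands in for whatever FFT work is done on each buffer, and the measured time depends entirely on the machine it runs on.

```python
import time

def buffer_duration_ms(num_samples, sample_rate_hz):
    """How long it takes the microphone to deliver one buffer."""
    return 1000.0 * num_samples / sample_rate_hz

def time_against_budget(process, buffer, sample_rate_hz):
    """Run process(buffer) once and compare its run time with the buffer duration."""
    budget_ms = buffer_duration_ms(len(buffer), sample_rate_hz)
    start = time.perf_counter()
    process(buffer)
    elapsed_ms = 1000.0 * (time.perf_counter() - start)
    return elapsed_ms, budget_ms, elapsed_ms <= budget_ms

# Placeholder workload; a real program would run its FFTs here.
def dummy_process(buf):
    return sum(x * x for x in buf)

elapsed, budget, ok = time_against_budget(dummy_process, [0.0] * 1024, 44100)
print(f"budget {budget:.1f} ms, used {elapsed:.3f} ms, in time: {ok}")
```

If the processing regularly takes longer than the budget, buffers pile up in the list faster than they are consumed.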

I did get this working, but it did take some testing and tweaking.

Barry.

techknight · Aug 30, 2018

Whew. My brain is fried. That one went over my head like a fart in a fan factory.

I'm not a math guy, so it's one of those projects that just continues to sit on the back burner. I appreciate the details though.

canalrun · Aug 30, 2018

You might also have a look at the OpenCV B4A library. It will do FFTs, magnitude squared, and maybe microphone input. I've never used the B4A version of this library, but I have used OpenCV in projects.

Barry.

klaus · Sep 5, 2018

I had a deeper look into your problem. The attached project is a demonstrator for your request.

To test beeps, the program includes three beep MP3 files:

w_500_4.mp3 is the reference beep, composed of 4 frequencies (500, 1000, 1500, 2000 Hz)
w_520_4.mp3 is a comparison beep, composed of 4 frequencies (520, 1020, 1520, 2020 Hz)
w_530_4.mp3 is a comparison beep, composed of 4 frequencies (530, 1030, 1530, 2030 Hz)

Program flow:
1. Recording of a sound signal (the beeps).
The program reads 8192 time samples (can be changed).
2. FFT calculation.
3. Peak detection. There is a peak threshold, which means that only peaks with a magnitude higher than the threshold are taken into account.
The threshold level in the program is 15% of the max peak level (can be changed).
4. After a click on Beep, the beep is compared to the reference beep by the number of frequency components and their frequencies.
If the number of frequency components is different, the beeps are considered different.
If all frequencies of the corresponding components are within a limit (25 Hz in the program), the beeps are considered the same.
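The program flow above can be approximated in a few lines of Python. This is a sketch under stated assumptions, not the code of the attached B4A project: the beeps are synthesized with `make_beep` instead of being read from the MP3 files, the FFT is a plain recursive radix-2 implementation, a Hann window is added to limit spectral leakage (the post does not mention one), and all function names are hypothetical. The 8192 samples, the 15% threshold, and the 25 Hz limit are taken from the post.

```python
import cmath
import math

def fft(x):
    """Recursive radix-2 FFT; len(x) must be a power of 2."""
    n = len(x)
    if n == 1:
        return [complex(x[0])]
    even = fft(x[0::2])
    odd = fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(-2j * math.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out

def find_peaks(samples, sample_rate, threshold_ratio=0.15):
    """Frequencies (Hz) of spectral peaks above threshold_ratio * highest peak."""
    n = len(samples)
    # Hann window: an addition for this sketch, not part of the description.
    windowed = [s * (0.5 - 0.5 * math.cos(2 * math.pi * i / n))
                for i, s in enumerate(samples)]
    mags = [abs(c) for c in fft(windowed)[: n // 2]]
    threshold = threshold_ratio * max(mags)
    peaks = []
    for k in range(1, len(mags) - 1):
        # A peak is a local maximum that also clears the threshold.
        if mags[k] > threshold and mags[k] > mags[k - 1] and mags[k] >= mags[k + 1]:
            peaks.append(k * sample_rate / n)
    return peaks

def same_beep(ref_peaks, test_peaks, limit_hz=25.0):
    """Same number of components, and each pair of frequencies within limit_hz."""
    if len(ref_peaks) != len(test_peaks):
        return False
    return all(abs(r - t) <= limit_hz for r, t in zip(ref_peaks, test_peaks))

def make_beep(base_hz, sample_rate=44100, n=8192):
    """Synthetic stand-in for the test files: 4 tones spaced 500 Hz apart."""
    freqs = [base_hz + 500 * i for i in range(4)]
    return [sum(math.sin(2 * math.pi * f * t / sample_rate) for f in freqs)
            for t in range(n)]

reference = find_peaks(make_beep(500), 44100)
print(reference)
print(same_beep(reference, find_peaks(make_beep(520), 44100)))
print(same_beep(reference, find_peaks(make_beep(530), 44100)))
```

The detected frequencies are multiples of the bin spacing (about 5.4 Hz here), so the comparison against the 25 Hz limit is only as precise as that spacing.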

Test:

1. Click on Sound
This records the reference beep.
The time signal is shown.

2. Click on FFT
Shows the FFT graph.
You see a horizontal red line, which is the peak detector threshold level.
On the right you see the detected peaks with their frequency.

3. Click on Beep
You see a red FFT graph for the generated beep.
A toast message appears showing whether the beep is considered the same or not.

A click on REC records the mic input, like a spectrum analyser.

Some information about FFT.
You need to know the relationship between the sampling frequency, the number of time samples, the acquisition time and the frequency resolution.

The table below shows it:

In the first line we have 44100, which is the sampling frequency. I put it in the table only for comparison: it cannot be used for FFT calculations, because the number of samples for the FFT must be a power of 2.

I found that 8192 time signal samples is a good compromise: an acquisition time of less than 200 ms and a frequency resolution of about 5 Hz.
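The relationship can be recomputed directly: acquisition time is the number of samples divided by the sampling frequency, and frequency resolution is the sampling frequency divided by the number of samples. The Python lines below print such values for a few power-of-2 sizes; the sizes chosen and the function name are only for illustration and are not a copy of the original table.

```python
SAMPLE_RATE_HZ = 44100

def fft_tradeoff(num_samples, sample_rate_hz=SAMPLE_RATE_HZ):
    """Return (acquisition time in ms, frequency resolution in Hz)."""
    acquisition_ms = 1000.0 * num_samples / sample_rate_hz
    resolution_hz = sample_rate_hz / num_samples
    return acquisition_ms, resolution_hz

print("samples  time [ms]  resolution [Hz]")
for n in (1024, 2048, 4096, 8192, 16384):
    ms, hz = fft_tradeoff(n)
    print(f"{n:7d}  {ms:9.1f}  {hz:15.2f}")
```

Doubling the number of samples halves the bin width but doubles the time needed to collect them, which is why 8192 samples sits in the middle as a compromise.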

Nkalampika · Sep 5, 2018

thanks

Erel · Sep 6, 2018

Impressive!

jemajuca · Sep 20, 2018

klaus said:
I had a deeper look into your problem, the attached project is a demonstrator to your request.

...

In the first line we have 44100, which is the sampling frequency. I put it in the table only for comparison: it cannot be used for FFT calculations, because the number of samples for the FFT must be a power of 2.

I found that 8192 time signal samples is a good compromise: an acquisition time of less than 200 ms and a frequency resolution of about 5 Hz.

Hi klaus!
I need to detect a known frequency in a Morse code signal using your example.
The Morse code dot duration is 100 ms, the dash is 300 ms, and the frequency of the signal is 2 kHz.
For this application it is a must to measure how long the frequency is present, or the start and end of the pulse, or to take enough measurements to determine the pulse duration.
So I tested different combinations of sampling frequency and number of signal samples, but only these three run:
11025/512
22050/1024
44100/2048
All three take a similar time, around 160 ms, which is excessive.
I think that I should select a smaller number of signal samples, e.g. 512 at 44100, as you commented in your post https://www.b4x.com/android/forum/threads/fft-fast-fourier-transform-library.6989/page-3#post-296146 but then the app does not run at all.
Only the three combinations listed work.
Any idea?

canalrun · Sep 20, 2018

Klaus is the person to ask, but I'll chime in since I did a similar project.

I reduced the FFT size to 64 or 16. These should be somewhat faster.

I sampled the audio at 22050, mono, I think.

I acquired an audio buffer whose size was a power of two, about one half second long.

I then computed consecutive FFTs, "sliding" the start of the FFT along the buffer of data samples. The number of consecutive FFTs in which a signal was detected gave me the time length of the signal.

I also did something similar on an Arduino-type processor using B4R. For that I found an integer FFT on the web that was significantly faster. That took a lot of searching, however.
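Turning the per-FFT detections into pulse lengths is mostly bookkeeping. The Python sketch below assumes the detector has already produced one True/False flag per sliding FFT and that the FFT start positions are `hop_ms` apart; the function names, the 10 ms hop, and the dot/dash boundary at twice the dot length are illustrative choices, with the 100 ms dot taken from the question above.

```python
def pulse_durations_ms(flags, hop_ms):
    """Length in ms of each run of consecutive detections."""
    durations = []
    run = 0
    for flag in flags:
        if flag:
            run += 1
        elif run:
            durations.append(run * hop_ms)
            run = 0
    if run:
        durations.append(run * hop_ms)
    return durations

def classify_pulse(duration_ms, dot_ms=100.0):
    """'.' for pulses closer to one dot length, '-' for pulses closer to three."""
    return "." if duration_ms < 2 * dot_ms else "-"

# Demo: flags as they might come from a detector with a 10 ms hop:
# a 100 ms pulse, a gap, a 300 ms pulse, a gap, and another 100 ms pulse.
flags = [True] * 10 + [False] * 10 + [True] * 30 + [False] * 10 + [True] * 10
durations = pulse_durations_ms(flags, hop_ms=10.0)
print(durations, "".join(classify_pulse(d) for d in durations))
```

The run length times the hop ignores the width of the FFT block itself, so each duration is an estimate that can be off by roughly one block length.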

I did all this four or five years ago. Unfortunately, the software I developed is long gone.

Barry.

klaus · Sep 20, 2018

I'm not sure that FFT is the best solution for this.
Couldn't you test whether the amplitude of the time signal is above a given level to check if there is a signal or not?
Similar to what canalrun explained, but without the FFT.

jemajuca · Sep 21, 2018

Hi.
Do you think there is any reason why I cannot select e.g. 22050/512 or 44100/1024?
Klaus, yes, I was thinking about that, measuring amplitudes only, but using FFT I can reject some noise sources.

klaus · Sep 21, 2018

jemajuca said:
Do you think there is any reason why I cannot select e.g. 22050/512 or 44100/1024?

What exactly is the problem?
There could be a problem when the calculation time is longer than the acquisition time.
