I recently heard about MFCC ("Mel Frequency Cepstral Coefficients"), which represent audio in a way that is closer to human hearing and can be used for speech recognition or other audio feature analysis. They also have a much lower dimension: I was told, for example, 13 coefficients instead of the 2048 values of a normal FFT.
I found an implementation written in C called libmfcc and tried to port it to C# for use with the Bass library.
I now have some questions that I hope the Bass devs can answer:
- Their example FFT data seems to be in a different format: the values are very high, and the 8192 FFT values sum to 10739.24. How can that be?
- In their example they call the function as shown below. Why do they use 128 as the FFT array size when they have just loaded 8192 values?
mfcc_result = GetCoefficient(spectrum, 44100, 48, 128, coeff);
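To compare the libmfcc sample data with what I get from Bass, I am looking at the sum and the peak of each spectrum buffer. Below is a rough sketch of the helpers I mean. The names spectrum_sum and spectrum_normalize_peak are hypothetical (they are not part of libmfcc or Bass), and I am only assuming that peak scaling is a fair way to put the two data sets on a comparable range.

```c
#include <stddef.h>

/* Sum of all values in a spectrum buffer of n entries. */
double spectrum_sum(const double *spectrum, size_t n)
{
    double sum = 0.0;
    for (size_t i = 0; i < n; i++)
        sum += spectrum[i];
    return sum;
}

/* Scale the buffer in place so that its largest value becomes 1.0.
 * Does nothing if the buffer is empty or has no positive value. */
void spectrum_normalize_peak(double *spectrum, size_t n)
{
    double peak = 0.0;
    for (size_t i = 0; i < n; i++)
        if (spectrum[i] > peak)
            peak = spectrum[i];
    if (peak <= 0.0)
        return;
    for (size_t i = 0; i < n; i++)
        spectrum[i] /= peak;
}
```

Calling spectrum_sum on both buffers before and after spectrum_normalize_peak at least shows whether the two sources differ only by a constant factor or in shape as well.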
After some months I am still stuck on this topic. Has anyone already tried to implement MFCC with Bass, or can anyone give me any other tips?