Topic: Detect silence using DSPProc to find cuepoint

johan

  • Posts: 18
I've seen some code on the forums that detects how many milliseconds of silence there are before the sound goes above a dB threshold. However, that code loads the file from disk and uses ChannelGetData to process the samples.

How can I detect, in a DSPProc using 32-bit floating-point data, how many milliseconds/bytes of sound below a dB threshold have passed? I need to do it this way because I'm streaming the file from the network.

Ian @ un4seen

  • Administrator
  • Posts: 20437
Re: Detect silence using DSPProc to find cuepoint
« Reply #1 on: 2 Aug '11 - 17:57 »
Could you not just count the data that the DSPPROC has received up until its level hits the threshold? :)

I guess this is the old thread that you found on the subject of silence detection...

   www.un4seen.com/forum/?topic=784

To transfer that to a DSPPROC, you would take the code following the BASS_ChannelGetData call (no need for that call as you already have data via the DSPPROC parameters). If you would like to translate the byte count to milliseconds, you can use BASS_ChannelBytes2Seconds (and multiply by 1000).
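
For reference, here is a minimal sketch of the counting idea described above, assuming 32-bit floating-point samples. The helper names count_leading_silence and float_bytes_to_ms are hypothetical (they are not part of BASS). The second helper just does by hand roughly the arithmetic that BASS_ChannelBytes2Seconds would do for float data, so that the sketch compiles without the library.

```c
#include <math.h>
#include <stddef.h>

/* Hypothetical helper (not a BASS function): returns how many samples at the
   start of the block have an absolute value at or below the threshold. */
static size_t count_leading_silence(const float *samples, size_t count, float threshold)
{
    size_t i = 0;
    while (i < count && fabsf(samples[i]) <= threshold)
        i++;
    return i;
}

/* Hypothetical helper: converts a byte count of 32-bit float sample data to
   milliseconds, given the sample rate and channel count (assumed known). */
static double float_bytes_to_ms(unsigned long long bytes, unsigned freq, unsigned chans)
{
    return 1000.0 * (double)bytes / ((double)freq * chans * sizeof(float));
}
```

Inside a DSPPROC, one would presumably pass (float*)buffer and length/sizeof(float), add the returned count times sizeof(float) to a running byte total, and treat a return value smaller than the sample count as "sound has begun".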

johan

  • Posts: 18
Re: Detect silence using DSPProc to find cuepoint
« Reply #2 on: 3 Aug '11 - 15:10 »
I tried using the code below, but it triggers too early, missing about 30-40 ms of complete silence. Does it have something to do with buffer lengths etc.? The DSP is on a decoding channel. I want it to trigger at a rather high volume level, e.g. a kick. I'm trying to find the first beat of a house track.

Should I use a short instead of a byte for the buffer? What are some appropriate numbers for a threshold? Is something invalid in the code below?

Thanks so much for your reply!

Code:
void CALLBACK countSilenceCallback(HDSP handle, DWORD channel, void *buffer, DWORD length, void *user)
{
    Channel *self = user;
   
    BYTE *buf = (BYTE*)buffer;
    int threshold = 254;
    int a,b = length; // decode some data
    b /= 2; // bytes -> samples
    for (a=0; a < b && abs(buf[a]) <= threshold; a++) ; // count silent samples
    self.cueByte += a * 2; // add number of silent bytes

    if (a < b) { // sound has begun!
        self.firstBeatPosition = BASS_ChannelBytes2Seconds(channel, self.cueByte);
        NSLog(@"Cue at byte: %i", self.cueByte);
        NSLog(@"Cue at position: %f", self.firstBeatPosition);
        BASS_ChannelRemoveDSP(channel, handle);
    }
}

Ian @ un4seen

  • Administrator
  • Posts: 20437
Re: Detect silence using DSPProc to find cuepoint
« Reply #3 on: 3 Aug '11 - 17:55 »
It will depend on the sample format. If it's 16-bit (and BASS_CONFIG_FLOATDSP is disabled), then "short" would indeed be the type to use. If it's floating-point (or BASS_CONFIG_FLOATDSP is enabled), then you would use "float" instead. You mentioned floating-point data in the original post, so in that case, the code could be modified like this...

Code:
...
    float *buf = (float*)buffer;
    float threshold = ...;
    int a,b = length; // decode some data
    b /= sizeof(float); // bytes -> samples
    for (a=0; a < b && fabs(buf[a]) <= threshold; a++) ; // count silent samples
    self.cueByte += a * sizeof(float); // add number of silent bytes
...

johan

  • Posts: 18
Re: Detect silence using DSPProc to find cuepoint
« Reply #4 on: 4 Aug '11 - 09:59 »
These changes made it work exactly as I wanted. Many thanks Ian!  ;D

johan

  • Posts: 18
Re: Detect silence using DSPProc to find cuepoint
« Reply #5 on: 8 Aug '11 - 14:36 »
If I would like to measure the levels of a certain range of frequencies in the DSPProc, I assume I need the FFT data. How can I calculate the FFT data and measure the levels of just the lower frequencies in a DSPProc?

Ian @ un4seen

  • Administrator
  • Posts: 20437
Re: Detect silence using DSPProc to find cuepoint
« Reply #6 on: 8 Aug '11 - 16:40 »
To get frequency information within the DSPPROC, you could use a push stream, i.e. feed the sample data to it and request FFT data back from it. For example, it could look something like this...

Code:
BASS_CHANNELINFO ci;
BASS_ChannelGetInfo(stream, &ci); // get stream format info
// create push stream with same format (replace "ci.flags" with BASS_SAMPLE_FLOAT if FLOATDSP is enabled)
fftstream=BASS_StreamCreate(ci.freq, ci.chans, ci.flags|BASS_STREAM_DECODE, STREAMPROC_PUSH, 0);

...

void CALLBACK DspProc(HDSP handle, DWORD channel, void *buffer, DWORD length, void *user)
{
    float fft[512];
    // calculate how much data is needed for the FFT processing (just use "sizeof(float)" if FLOATDSP is enabled)
    DWORD fftneed=1024*ci.chans*(ci.flags&BASS_SAMPLE_FLOAT?sizeof(float):sizeof(short));
    BASS_StreamPutData(fftstream, buffer, min(fftneed, length)); // feed the data to the push stream
    BASS_ChannelGetData(fftstream, fft, BASS_DATA_FFT1024); // perform a 1024 sample FFT
    // do something with the FFT data in "fft"...
}

If you would like to process/inspect the entire buffer, you could use multiple calls...

Code:
void CALLBACK DspProc(HDSP handle, DWORD channel, void *buffer, DWORD length, void *user)
{
    BASS_StreamPutData(fftstream, buffer, length); // feed the data to the push stream
    float fft[512];
    while (1) {
        int r=BASS_ChannelGetData(fftstream, fft, BASS_DATA_FFT1024); // perform a 1024 sample FFT
        if (r<=0) break; // processed all of the data (or an error)
        // do something with the FFT data in "fft"...
    }
}
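
As for "do something with the FFT data" when only the lower frequencies matter, one possible approach is sketched below. It assumes that bin i of an N-sample FFT sits at i * samplerate / N Hz, with bin 0 being the DC component, so a 1024-sample FFT at 44100 Hz has bins about 43 Hz apart. The helper name low_band_peak and the choice of cutoff are hypothetical, not something from BASS or from the code above.

```c
#include <stddef.h>

/* Hypothetical helper (not a BASS function): returns the largest magnitude
   among the FFT bins at or below cutoff_hz. Bin 0 (DC) is skipped. */
static float low_band_peak(const float *fft, size_t bins, unsigned sample_rate,
                           unsigned fft_size, float cutoff_hz)
{
    float peak = 0.0f;
    for (size_t i = 1; i < bins; i++) {
        float hz = (float)i * sample_rate / fft_size; /* frequency of bin i */
        if (hz > cutoff_hz)
            break; /* bins are in ascending frequency order */
        if (fft[i] > peak)
            peak = fft[i];
    }
    return peak;
}
```

With the 512-value "fft" array filled by BASS_DATA_FFT1024, a call might look like low_band_peak(fft, 512, ci.freq, 1024, 150.0f), and the result could then be compared against a threshold in the same way as the sample levels earlier in the thread.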
« Last Edit: 9 Aug '11 - 17:43 by Ian @ un4seen »