I'm developing a rhythm game that includes singing, and I'm trying to monitor the microphone recording so that the user can hear themselves sing. I previously had trouble getting this to work on Linux, as there are some limitations compared to Windows (simultaneous recordings are hit or miss, and recording device info such as supported formats and channel count can't be retrieved), but after some redesign I finally have microphones working on Linux.
But now I've run into a problem with the monitoring stream. When the user is not singing or producing any meaningful sound input, a loud static crackling sound is heard from the monitoring channel. This only happens on Linux; Windows is not affected. I've narrowed the problem down to two things: a reverb effect applied to the monitoring channel, and fetching data (BASS_ChannelGetData) from a secondary decoding push stream.
The setup for this system goes as follows:
1. Create a push stream for the monitoring channel.
2. Apply BASS_FX Freeverb to the monitoring channel, and a DSP to apply gain.
3. Begin playing the monitoring channel.
4. Initialise a recording device and start recording from it.
5. Create a decoding push stream for batching data for pitch detection.
6. Apply 2 BASS_FX PeakEQ effects to the pitch detection channel, one to cut off low frequencies and the other to cut off high frequencies.
The RECORDPROC callback function for the recording stream works as follows:
1. Push the sample data to the monitoring channel so it is immediately fed back to the user.
2. Push the sample data to the pitch processing decoding channel.
3. Check if 40ms has passed by accumulating the recording period until it reaches 40ms.
4. If 40ms has passed, all the data buffered in the pitch processing channel is fetched with BASS_ChannelGetData and then passed into a pitch detection algorithm.
I "batch" the data every 40ms because the pitch detection algorithm is run 25 times a second. That is why a decoding push stream is used here: to hold the data until it is read back every 40ms (and to pass it through the EQ effects).
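For reference, the arithmetic behind the 40ms batching can be sketched as below. This is only an illustration under stated assumptions: 16-bit mono samples at 44100 Hz (matching the stream settings in the code that follows) and a callback that delivers exactly one 10ms period each time, which the real RECORDPROC may not do. `BatchMath` and `BytesForDuration` are hypothetical names, not part of BASS or ManagedBass.

```csharp
using System;

// Hypothetical helper for illustration only; not part of BASS or ManagedBass.
static class BatchMath
{
    // Number of bytes of PCM data that cover `milliseconds` of audio.
    public static int BytesForDuration(int sampleRate, int channels, int bytesPerSample, int milliseconds)
        => sampleRate * channels * bytesPerSample * milliseconds / 1000;
}

static class Program
{
    static void Main()
    {
        const int sampleRate = 44100;
        const int channels = 1;
        const int bytesPerSample = sizeof(short); // assumption: 16-bit samples

        // Data expected per callback with a 10ms recording period.
        int perCallback = BatchMath.BytesForDuration(sampleRate, channels, bytesPerSample, 10);
        // Data expected per pitch detection batch (40ms).
        int perBatch = BatchMath.BytesForDuration(sampleRate, channels, bytesPerSample, 40);

        Console.WriteLine(perCallback);               // 882
        Console.WriteLine(perBatch);                  // 3528
        Console.WriteLine(perBatch / bytesPerSample); // 1764 samples per batch
        Console.WriteLine(1000 / 40);                 // 25 batches per second
    }
}
```

Since the amount of data actually delivered per callback can differ from these ideal figures, the callback shown later accumulates the real `length` values rather than relying on a fixed size.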
Here's a sample of the code. It's written in C# using the ManagedBass wrapper library but it should make sense from a C perspective.
The stream setup code:
int monitorHandle = 0;
int monitorGainHandle = 0;
int recordingHandle = 0;
int pitchProcessingHandle = 0;
int timeAccumulated = 0;
int processingBufferLength = 0;

void Setup()
{
    monitorHandle = Bass.CreateStream(44100, 1, BassFlags.Default, StreamProcedureType.Push);

    ReverbParameters reverbParameters = new()
    {
        fDryMix = 0.3f, fWetMix = 1f, fRoomSize = 0.4f, fDamp = 0.7f
    };

    // Problematic effect. If this effect is not added, crackling is not heard on the monitoring stream.
    int reverbHandle = Bass.ChannelSetFX(monitorHandle, EffectType.Freeverb, 1);
    Bass.FXSetParameters(reverbHandle, reverbParameters);

    monitorGainHandle = Bass.ChannelSetDSP(monitorHandle, ApplyGain);
    Bass.ChannelPlay(monitorHandle);

    // Recording setup
    Bass.RecordInit(1);
    recordingHandle = Bass.RecordStart(44100, 1, BassFlags.Default, 10, ProcessRecordingData, IntPtr.Zero);

    pitchProcessingHandle = Bass.CreateStream(44100, 1, BassFlags.Decode, StreamProcedureType.Push);

    PeakEQParameters lowEqParameters = new()
    {
        fBandwidth = 2.5f, fCenter = 20f, fGain = -10f
    };
    PeakEQParameters highEqParameters = new()
    {
        fBandwidth = 2.5f, fCenter = 10_000f, fGain = -10f
    };

    int lowEqHandle = Bass.ChannelSetFX(pitchProcessingHandle, EffectType.PeakEQ, 0);
    int highEqHandle = Bass.ChannelSetFX(pitchProcessingHandle, EffectType.PeakEQ, 0);
    Bass.FXSetParameters(lowEqHandle, lowEqParameters);
    Bass.FXSetParameters(highEqHandle, highEqParameters);
}
And the code for the RECORDPROC callback:
bool ProcessRecordingData(int handle, IntPtr buffer, int length, IntPtr user)
{
    // Copies the data from the recording buffer to the monitor playback buffer.
    Bass.StreamPutData(monitorHandle, buffer, length);

    // Copy the data to the pitch processing handle to apply EQ FX
    Bass.StreamPutData(pitchProcessingHandle, buffer, length);

    timeAccumulated += 10;
    processingBufferLength += length;

    // 40ms has passed so get the data and run pitch detection
    if (timeAccumulated >= 40)
    {
        unsafe
        {
            // Allocate a buffer on the stack
            byte* procBuff = stackalloc byte[processingBufferLength];

            // This call produces crackling on the monitoring stream (only on Linux)
            Bass.ChannelGetData(pitchProcessingHandle, (IntPtr) procBuff, processingBufferLength);

            int shortLength = processingBufferLength / sizeof(short);
            var readOnlySpan = new ReadOnlySpan<short>(procBuff, shortLength);

            // Calculate pitch
            CalculatePitchAndAmplitude(readOnlySpan);
        }

        timeAccumulated = 0;
        processingBufferLength = 0;
    }

    return true;
}
I've done some testing and narrowed down where the crackling comes from. In the RECORDPROC callback, commenting out this line gets rid of the crackling sound:
// This call produces crackling on the monitoring stream (only on Linux)
Bass.ChannelGetData(pitchProcessingHandle, (IntPtr) procBuff, processingBufferLength);
In addition, commenting out the lines in the stream setup that add the Freeverb effect to the monitoring channel also removes the static sound:
// Problematic effect. If this effect is not added, crackling is not heard on the monitoring stream.
int reverbHandle = Bass.ChannelSetFX(monitorHandle, EffectType.Freeverb, 1);
Bass.FXSetParameters(reverbHandle, reverbParameters);
I've been trying to figure this out for around a day now and I'm stumped as to what could be causing the static. It's even weirder that this behavior happens only on Linux and not on Windows. I'm not sure why the call to BASS_ChannelGetData on the pitch processing push stream can cause static crackling on the monitoring channel, nor why removing the Freeverb effect makes it go away.
I've attached a short mp4 clip of what the static sounds like in a zip file. Not heard in the clip is the sound going away when you begin to talk or sing into the microphone. When the user does that, they can hear themselves normally, but shortly after they go quiet the static comes back.