You can have a mixer process 20ms blocks via the BASS_ChannelGetData "length" parameter. For example:
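DWORD need = (DWORD)BASS_ChannelSeconds2Bytes(mixer, 0.02); // bytes in 20ms (the function returns a QWORD, hence the cast)
BYTE *buf = (BYTE*)alloca(need); // allocate a buffer (it'd be better to pre-allocate and reuse a buffer)
DWORD got = BASS_ChannelGetData(mixer, buf, need); // process the mixer; "got" is the number of bytes written, or (DWORD)-1 on error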
The thread that a stream is created in doesn't really matter. What matters is where the stream is processed, e.g. where BASS_ChannelGetData is called. It's fine to have multiple threads processing multiple streams simultaneously.