Both of the above methods will take a negligible amount of CPU.
If you're dealing with a steady flow of sample data (e.g. when recording), another thing you could try is a custom decoding stream with a BPM callback set on it, so that as you feed sample data to the stream, you are notified of the BPM at regular intervals in the data. This method will probably take the least additional CPU of all.
HSTREAM dummystream;
DWORD CALLBACK dummystreamproc(HSTREAM handle, void *buffer, DWORD length, DWORD user)
{
	// the data is already in "buffer", so just return
	return length;
}
...
// create dummy stream
dummystream=BASS_StreamCreate(samplerate,BASS_STREAM_DECODE,&dummystreamproc,0); // add MONO/8BITS/FLOAT flags if appropriate
// set BPM callback on it
BASS_FX_BPM_CallbackSet(dummystream,...);
Then, to pass data through the dummy stream (and so through the BPM calculation too), put the sample data in "buffer" and simply do this...
BASS_ChannelGetData(dummystream,buffer,length);
This method can be used to apply DSP/FX to any sample data. I've not tried it with the BPM stuff myself, but it should work in theory.
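
If it helps to see why the dummy callback can get away with just returning "length", here is a rough stand-alone sketch of the idea in plain C. Nothing in it is BASS: fake_stream, passthrough_source, halve_and_track_peak and fake_get_data are all made-up names for illustration, and it assumes (as the comment in the callback above implies) that the buffer you pass to the "get data" call is the same one the stream callback sees, with any processing hooks then run over that buffer in place.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins for illustration only, NOT the BASS API. */
typedef unsigned (*source_proc)(void *buffer, unsigned length, void *user);
typedef void (*dsp_proc)(void *buffer, unsigned length, void *user);

typedef struct {
    source_proc source; /* normally fills the buffer with sample data */
    dsp_proc    dsp;    /* runs over whatever the source delivered */
    void       *user;   /* passed to both callbacks */
} fake_stream;

/* Dummy source: the caller has already put the samples in "buffer",
   so there is nothing to write; just report the whole length as delivered. */
static unsigned passthrough_source(void *buffer, unsigned length, void *user)
{
    (void)buffer;
    (void)user;
    return length;
}

/* Example hook: remember the largest 16-bit sample magnitude seen so far,
   then halve each sample in place. */
typedef struct { int peak; } peak_state;

static void halve_and_track_peak(void *buffer, unsigned length, void *user)
{
    short *samples = buffer;
    peak_state *st = user;
    unsigned count = length / (unsigned)sizeof(short);

    for (unsigned i = 0; i < count; i++) {
        int mag = abs(samples[i]);
        if (mag > st->peak)
            st->peak = mag;
        samples[i] /= 2;
    }
}

/* Stand-in for the "get data" call: ask the source for data,
   then run the hook over the bytes the source reported. */
static unsigned fake_get_data(fake_stream *s, void *buffer, unsigned length)
{
    assert(s != NULL && s->source != NULL && buffer != NULL);

    unsigned got = s->source(buffer, length, s->user);
    if (s->dsp != NULL && got > 0)
        s->dsp(buffer, got, s->user);
    return got;
}
```

To use it, fill a short array with samples, set up a fake_stream with passthrough_source, halve_and_track_peak and a peak_state, and call fake_get_data on the array. The samples come back processed in the same array, which is the same pattern as calling BASS_ChannelGetData on the dummy stream with a pre-filled buffer.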