I'm not sure it's possible to have sample accurate sync between different devices with 100% certainty.
Of course, using more than one audio card without word clock synchronization never gives perfect sync. Let me explain a little more: I mean using the WDM/MME drivers, not ASIO. My audio card has 18 mono outputs and I'd like to play multiple audio files to multiple outputs. The only solution I can see is a little bit complicated.

I already had a class for playing PCM wave files. I've tested that preparing the buffers for all the audio files (for many audio outputs) and then starting them simultaneously gives me perfect sample sync. The problem is playing non-PCM files ;-) Supporting all those formats is a lot of work, and BASS can already do it.

So I can decode the files, mix them if I need to, and then use get_data to fill my class's buffers. Maybe I'll try it, but... I wrote that post before thoroughly testing your BASS_ASIO. And bingo!

It does exactly what I need.

More about it in another post of mine.
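
For what it's worth, here is a minimal sketch of the "decode, mix if needed, fill the buffers" idea mentioned above. It is plain Python and does not call BASS at all: the decoded audio is assumed to be already available as lists of 16-bit mono samples, and mix_to_outputs is a hypothetical helper name, not part of any library. The point is only that every output buffer ends up with the same number of samples, so all of them can be queued and started together.

```python
from array import array

INT16_MIN, INT16_MAX = -32768, 32767

def mix_to_outputs(sources, num_outputs):
    """Build one equal-length 16-bit buffer per output.

    sources: (output_index, samples) pairs, where samples is a list of
    already-decoded 16-bit mono sample values (an assumption of this sketch).
    Sources routed to the same output are summed and clipped to 16 bits.
    """
    sources = list(sources)
    # Every buffer gets the length of the longest source; shorter ones
    # are padded with silence so all outputs stay sample-aligned.
    length = max((len(samples) for _, samples in sources), default=0)
    acc = [[0] * length for _ in range(num_outputs)]
    for out, samples in sources:
        if not 0 <= out < num_outputs:
            raise ValueError("output index out of range: %d" % out)
        row = acc[out]
        for i, value in enumerate(samples):
            row[i] += value
    # Clip the sums to the 16-bit range and pack them as signed shorts.
    return [
        array("h", (max(INT16_MIN, min(INT16_MAX, v)) for v in row))
        for row in acc
    ]
```

Each returned array can be handed to the playback class as the buffer for one output; an output with no source assigned simply gets silence of the same length.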
There's no limit, but I guess it could slightly affect performance if you have loads of them, as BASS looks through them when a syncable event occurs.
I asked about it because I had an idea to use sync events as a counter... But it seems I'll have to look for a better solution.
Thank you,
Jacek