Topic: SVP performance questions

Rah'Dick

  • Posts: 980
SVP performance questions
« on: 28 Sep '20 - 12:22 »
Hey!

I've been tinkering with a Vis for a while and a few questions popped up lately, maybe someone with C++ knowledge can help me? (Ian?)

Performance
My vis draws to each pixel of the Video array once per frame, in order to fill it with a color. This works reasonably for lower resolutions, but as soon as I cross the FullHD border on my 4K screen, the framerate drops dramatically. I'm not really familiar with C/C++, is there maybe some kind of memory trick that could accelerate filling the entire array with the same value?
I'm using this code currently:
Code: [Select]
unsigned long rgbVal = RGB(bVal, gVal, rVal);
int len = width*height;
for(i=0;i<len;i++) {
Video[i] = Video[i]|rgbVal;
}

(In before "buy a new PC, hurr hurr")

Resolution restriction by XMPlay
To overcome the performance issue, I'm currently using the "Restrict vis size" setting, but I noticed it doesn't go above 999 in width or height. You've probably guessed it - I'd like to restrict it to 1920x1080 to have perfect pixel doubling for the vis, but I had to settle for 960x540...

Song change/skip detection
I also want to reset a couple of variables when a song change or skip is happening. Is there a way for the vis to query this? I think the QueryInterface is the right thing to use, but I haven't seen code examples for this yet (and I'm not primarily a coder). Can someone help me with this?

Thanks in advance!
Many greets,
Thomas

Ian @ un4seen

  • Administrator
  • Posts: 23037
Re: SVP performance questions
« Reply #1 on: 28 Sep '20 - 15:53 »
Performance
My vis draws to each pixel of the Video array once per frame, in order to fill it with a color. This works reasonably for lower resolutions, but as soon as I cross the FullHD border on my 4K screen, the framerate drops dramatically. I'm not really familiar with C/C++, is there maybe some kind of memory trick that could accelerate filling the entire array with the same value?
I'm using this code currently:
Code: [Select]
unsigned long rgbVal = RGB(bVal, gVal, rVal);
int len = width*height;
for(i=0;i<len;i++) {
Video[i] = Video[i]|rgbVal;
}

Are you sure you need the "or" there? Perhaps you can just set like this:

Code: [Select]
unsigned long rgbVal = RGB(bVal, gVal, rVal);
int len = width*height;
for(i=0;i<len;i++) {
Video[i] = rgbVal;
}

To speed it up further, you could try using SSE2, something like this:

Code: [Select]
unsigned long rgbVal = RGB(bVal, gVal, rVal);
__m128i rgbVec = _mm_set1_epi32(rgbVal);
int len = width*height;
for(i=0; i<len-3; i+=4)
_mm_storeu_si128(Video + i, rgbVec);
for(; i<len; i++)
Video[i] = rgbVal;

Song change/skip detection
I also want to reset a couple of variables when a song change or skip is happening. Is there a way for the vis to query this? I think the QueryInterface is the right thing to use, but I haven't seen code examples for this yet (and I'm not primarily a coder). Can someone help me with this?

It isn't possible to get notifications of song changes via VisQueryInterface but you can poll for changes, something like this:

Code: [Select]
char *filename = queryinterface->QueryString("currentsongfilename"); // get playing filename
if (!currentfilename || strcmp(filename, currentfilename)) { // it's new
    queryinterface->FreeString(currentfilename); // free old filename
    currentfilename = filename; // update filename
    // do whatever else you need here...
} else {
    queryinterface->FreeString(filename); // free unchanged filename
}

You would also call queryinterface->FreeString(currentfilename) when your plugin is unloaded.
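
For readers who prefer to keep the comparison logic separate from the string ownership, here is a minimal sketch of the same polling idea. The class name TrackChangeDetector and its methods are hypothetical, not part of the Sonique or XMPlay API. It assumes the vis copies the filename into its own std::string, so the string returned by QueryString can be freed with FreeString right after each call, and it treats a null pointer as "nothing playing".

```cpp
#include <cassert>
#include <string>

// Hypothetical helper (not part of any vis SDK): remembers the last filename
// seen and reports when a newly polled filename differs from it.
class TrackChangeDetector {
public:
    // Returns true on the first call and whenever `filename` differs from the
    // previous one. A null pointer is treated as an empty filename.
    bool update(const char* filename) {
        const std::string now = filename ? filename : "";
        if (has_value_ && now == current_) {
            return false;  // same track as last poll
        }
        current_ = now;    // keep our own copy; the caller may free `filename`
        has_value_ = true;
        return true;
    }

    // Last filename seen; only meaningful after update() has been called.
    const std::string& current() const {
        assert(has_value_);
        return current_;
    }

private:
    std::string current_;
    bool has_value_ = false;
};
```

In a render callback this would be used roughly as: query the filename, pass it to update(), free the queried string, and reset the per-song variables whenever update() returned true. Because the detector owns its copy, there is no stored pointer to free when the plugin unloads.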

Another option is to make your plugin a general XMPlay plugin (instead of a Sonique vis plugin), like the Winamp vis wrapper does. This will allow you to receive notifications of track changes from XMPlay, and use OpenGL/DirectX acceleration if you like (you create your own vis window). For reference, the Winamp vis wrapper source is available here:

   https://github.com/schellingb/xmp-wavis

Rah'Dick

  • Posts: 980
Re: SVP performance questions
« Reply #2 on: 28 Sep '20 - 19:06 »
That's awesome advice, thank you!

The "or" in the draw loop is currently used to add the color to all pixels that haven't been drawn on in the current frame. I'm drawing a few graphs before filling, but that's a leftover from a (very) old project that this is based on. I intend to move the graph stuff to the right location later, but the current slowdowns are so substantially related to resolution that I wanted to find a fix for that first.

I'll try the SSE2 later, thanks!

[Edit]
I can't get the SSE2 part to work, something about being unable to convert unsigned long int* to __m128i* and I don't understand that stuff at all???

The "or" thing doesn't make a difference, by the way. I've added a frame time counter and don't notice a difference. The raw size of the vis makes the FPS drop by more than half if it gets larger than 1080p tho.
« Last Edit: 28 Sep '20 - 22:47 by Rah'Dick »

Ian @ un4seen

  • Administrator
  • Posts: 23037
Re: SVP performance questions
« Reply #3 on: 29 Sep '20 - 14:47 »
The "or" in the draw loop is currently used to add the color to all pixels that haven't been drawn on in the current frame. I'm drawing a few graphs before filling, but that's a leftover from a (very) old project that this is based on. I intend to move the graph stuff to the right location later, but the current slowdowns are so substantially related to resolution that I wanted to find a fix for that first.

Are you doing multiple passes per-frame? If so, perhaps you can combine them into a single pass to minimize memory accesses? Also make sure you have compiler optimizations enabled.
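
One possible reading of the single-pass suggestion, as a rough sketch only: instead of drawing the graphs and then OR-ing the color into every pixel (a read plus a write per pixel), write the background once and apply the OR only to the graph pixels. This assumes 32-bit pixels and that the buffer would have been black before the graphs were drawn, so the result matches the original "or" loop. GraphPixel and render_frame are hypothetical names for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical description of one graph pixel: buffer index plus its color.
struct GraphPixel {
    std::size_t index;
    std::uint32_t color;
};

// Fills the frame with `background` (write-only pass), then draws the graph
// pixels on top, OR-ed with the background as the original loop would have
// produced. Out-of-range graph pixels are skipped.
void render_frame(std::uint32_t* video, std::size_t count,
                  std::uint32_t background,
                  const std::vector<GraphPixel>& graph) {
    assert(video != nullptr || count == 0);
    for (std::size_t i = 0; i < count; ++i) {
        video[i] = background;  // no read of the old pixel value needed
    }
    for (const GraphPixel& p : graph) {
        if (p.index < count) {
            video[p.index] = p.color | background;
        }
    }
}
```

The full-frame pass now only writes, and the read-modify-write work is limited to the comparatively few graph pixels.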

For reference, are other Sonique vis plugins performing well for you at higher resolutions?

I can't get the SSE2 part to work, something about being unable to convert unsigned long int* to __m128i* and I don't understand that stuff at all???

Oh yeah, I forgot a type cast. Change that line to this:

Code: [Select]
_mm_storeu_si128((__m128i*)(Video + i), rgbVec);

If you want to switch back to using "or" then you can change that line to this:

Code: [Select]
_mm_storeu_si128((__m128i*)(Video + i), _mm_or_si128(_mm_loadu_si128((__m128i*)(Video + i)), rgbVec));
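
Putting the pieces from this thread together, a self-contained sketch might look like the following. It assumes the Video buffer holds 32-bit pixels (std::uint32_t here, since unsigned long is only 32 bits wide on Windows), and the function names fill_pixels and or_pixels are made up for illustration. The SSE2 path is only compiled when the compiler reports SSE2 as enabled; otherwise the plain loop handles everything.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#if defined(__SSE2__) || defined(_M_X64) || (defined(_M_IX86_FP) && _M_IX86_FP >= 2)
#include <emmintrin.h>  // SSE2 intrinsics
#define VIS_USE_SSE2 1
#else
#define VIS_USE_SSE2 0
#endif

// Hypothetical helper: sets `count` pixels to `color`.
void fill_pixels(std::uint32_t* video, std::size_t count, std::uint32_t color) {
    assert(video != nullptr || count == 0);
    std::size_t i = 0;
#if VIS_USE_SSE2
    const __m128i vec = _mm_set1_epi32(static_cast<int>(color));
    for (; i + 4 <= count; i += 4) {  // four pixels per unaligned store
        _mm_storeu_si128(reinterpret_cast<__m128i*>(video + i), vec);
    }
#endif
    for (; i < count; ++i) {          // remaining 0..3 pixels (or all of them)
        video[i] = color;
    }
}

// Hypothetical helper: ORs `color` into `count` existing pixels.
void or_pixels(std::uint32_t* video, std::size_t count, std::uint32_t color) {
    assert(video != nullptr || count == 0);
    std::size_t i = 0;
#if VIS_USE_SSE2
    const __m128i vec = _mm_set1_epi32(static_cast<int>(color));
    for (; i + 4 <= count; i += 4) {
        __m128i* p = reinterpret_cast<__m128i*>(video + i);
        _mm_storeu_si128(p, _mm_or_si128(_mm_loadu_si128(p), vec));
    }
#endif
    for (; i < count; ++i) {
        video[i] |= color;
    }
}
```

A call such as fill_pixels(Video, width * height, rgbVal) would then replace the original loop, provided Video really points at 32-bit pixels.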

Rah'Dick

  • Posts: 980
Re: SVP performance questions
« Reply #4 on: 29 Sep '20 - 18:13 »
Ok, the compiler no longer complains, but now the vis crashes instantly when I call "_mm_set1_epi32(rgbVal)". I'm using GCC 4.9.2 32-bit, which comes with Dev-C++. *sigh*

Ian @ un4seen

  • Administrator
  • Posts: 23037
Re: SVP performance questions
« Reply #5 on: 30 Sep '20 - 13:28 »
Strange. What CPU does your PC have? It looks like the _mm_set1_epi32 intrinsic may be converted to an AVX2 instruction if enabled in the compiler. Check that the compiler's CPU/instruction settings don't exceed your CPU. If there are no such options, try adding "-march=core2". If it still crashes, please check the assembly code at the crash location (in the debugger) to confirm what it is.
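
If the IDE does not show which instruction sets are enabled, one way to check from inside the build is to look at the macros the compiler predefines. The sketch below is only an aid for that: the function name enabled_simd_targets is hypothetical, and it relies on GCC and Clang defining __SSE2__, __AVX__ and __AVX2__ when those extensions are enabled (other compilers may not define all of them).

```cpp
#include <string>

// Hypothetical helper: reports which x86 SIMD extensions the compiler was
// allowed to use for this translation unit, based on predefined macros.
std::string enabled_simd_targets() {
    std::string out;
#if defined(__SSE2__)
    out += "SSE2 ";
#endif
#if defined(__AVX__)
    out += "AVX ";
#endif
#if defined(__AVX2__)
    out += "AVX2 ";
#endif
    if (out.empty()) {
        out = "none reported";
    }
    return out;
}
```

Logging this string once at plugin start would show whether AVX2 is listed in a build meant for a CPU that lacks it, which would point at the compiler settings rather than the code.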

Rah'Dick

  • Posts: 980
Re: SVP performance questions
« Reply #6 on: 1 Oct '20 - 11:24 »
It's an older Core i7, but I can try the different compiler options. Dev-C++ has a missing library for the debugger, I'd have to fix that first. Also, I have no experience with assembly whatsoever, so even if I get it running, I wouldn't know what to do with it. :-[

Reading my replies again, it sure sounds like I shouldn't be messing around with software development at all... :-X

Ian @ un4seen

  • Administrator
  • Posts: 23037
Re: SVP performance questions
« Reply #7 on: 1 Oct '20 - 15:45 »
I was just thinking that you would copy'n'paste the disassembly here to see if there is indeed an AVX2 instruction :) ... The older i7 CPUs don't support AVX2 instructions, so that's probably it. What CPU options does the Dev-C++ IDE give? If it doesn't have any, you can try adding "-march=core2" to the GCC compiler options (to limit it to instructions available on Core 2 CPUs).

Rah'Dick

  • Posts: 980
Re: SVP performance questions
« Reply #8 on: 3 Oct '20 - 14:12 »
I tried the march=core2 argument, but it didn't change anything. Guess I'll just switch to the free version of Visual Studio; Dev-C++ seems to be too limited and/or buggy.

Ian @ un4seen

  • Administrator
  • Posts: 23037
Re: SVP performance questions
« Reply #9 on: 5 Oct '20 - 14:04 »
Just to be sure, did you include the leading "-" in "-march=core2"? If so, perhaps it isn't an AVX2 instruction then, although I can't think of any other reason for _mm_set1_epi32 to fail. What type of crash is it, eg. illegal instruction or access violation or something else? If you can get the debugger working and copy'n'paste the disassembly, I'll check if it is an AVX2 instruction or something else.