Author Topic: Rounding Pitch Shift

Hanuman

  • Posts: 89
Rounding Pitch Shift
« on: 21 Aug '22 - 06:57 »
Regarding the 432hz Player, someone wrote to me suggesting that I round the frequency scaling ratio to the nearest simple fraction.

So I tried it. It does seem to give a noticeable quality improvement: clearer sound, at the cost of pitch and speed being very slightly off.

Code:
// C#
private void AdjustTempo(double speed, double rate, double pitch)
{
    if (BassActive)
    {
        // In BASS, Tempo is a percentage change: 2x speed is 100 (+100%), whereas our Speed property is 2. Need to convert.
        // speed 1=0, 2=100, 3=200, 4=300, .5=-50, .25=-75
        var freqSrc = ManagedBass.Bass.ChannelGetAttribute(_chan, ChannelAttribute.TempoFrequency);
        var freqDst = _chanInfo.Frequency * pitch * rate;
        if (true)
        {
            var freqRatio = freqDst / freqSrc;
            var freqFraction = GetFraction(freqRatio, 0.0001);
            freqDst = freqSrc * freqFraction.Key / freqFraction.Value;
        }

        ManagedBass.Bass.ChannelSetAttribute(_chan, ChannelAttribute.Tempo, (1.0 / pitch * speed - 1.0) * 100.0);
        ManagedBass.Bass.ChannelSetAttribute(_chan, ChannelAttribute.TempoFrequency, freqDst);
    }
}

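// Approximates value by a short continued fraction 1 / (a + 1 / (b + 1 / c)),
// trying the one-term and two-term forms before falling back to three terms.
// Returns the numerator as Key and the denominator as Value.
// Note: the three-term fallback is returned without being checked against tolerance.
private KeyValuePair<int, int> GetFraction(double value, double tolerance = 0.02)
{
    var f0 = 1 / value;                      // a + remainder
    var f1 = 1 / (f0 - Math.Truncate(f0));   // b + remainder

    var aT = (int)Math.Truncate(f0);
    var aR = (int)Math.Round(f0);
    var bT = (int)Math.Truncate(f1);
    var bR = (int)Math.Round(f1);
    var c = (int)Math.Round(1 / (f1 - Math.Truncate(f1)));

    // One term: 1 / a
    if (Math.Abs(1.0 / aR - value) <= tolerance)
        return new KeyValuePair<int, int>(1, aR);
    // Two terms: 1 / (a + 1 / b) = b / (a * b + 1)
    else if (Math.Abs(bR / (aT * bR + 1.0) - value) <= tolerance)
        return new KeyValuePair<int, int>(bR, aT * bR + 1);
    // Three terms: 1 / (a + 1 / (b + 1 / c))
    else
        return new KeyValuePair<int, int>(c * bT + 1, c * aT * bT + aT + c);
}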

I altered TempoFrequency. Now my question is: what should I do with Tempo? Should I round it too in the same way? Or use it to compensate for the speed alteration?

Would love some input from experts here.

Edit: I'm thinking, there are actually 2 samplerate conversions. One for pitch-shifting, and one for the output.

Let's take a sample song at 441.9hz
freqSrc = 44100
freqDst = 43111.909705760583
freqRatio = 0.977594324393664
freqFraction = 87 / 89
new freqDst = 43108.988764044945
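The numbers above can be reproduced with a generic continued-fraction approximation. This is only a sketch, not the GetFraction code from the post: `FractionSketch` and `Approximate` are hypothetical names, and it assumes the goal is the first convergent whose absolute error is within the tolerance.

```csharp
using System;

public static class FractionSketch
{
    // Returns the first continued-fraction convergent (Num / Den) of a positive
    // value whose absolute error is within tolerance.
    // Sketch only: no overflow protection for very small tolerances.
    public static (long Num, long Den) Approximate(double value, double tolerance, int maxTerms = 32)
    {
        if (!(value > 0) || double.IsInfinity(value))
            throw new ArgumentOutOfRangeException(nameof(value));

        long hPrev = 1, kPrev = 0;            // previous convergent
        long a = (long)Math.Floor(value);
        long h = a, k = 1;                    // current convergent
        double frac = value - a;

        for (int i = 0; i < maxTerms; i++)
        {
            // Stop once close enough, or when the expansion has terminated.
            if (Math.Abs((double)h / k - value) <= tolerance || frac < 1e-12)
                break;

            double x = 1.0 / frac;
            a = (long)Math.Floor(x);
            frac = x - a;

            // Standard recurrence: h(n) = a * h(n-1) + h(n-2), same for k.
            long hNext = a * h + hPrev;
            long kNext = a * k + kPrev;
            hPrev = h; kPrev = k;
            h = hNext; k = kNext;
        }
        return (h, k);
    }

    public static void Main()
    {
        var (num, den) = Approximate(0.977594324393664, 0.0001);
        Console.WriteLine($"{num}/{den}");         // prints "87/89"
        Console.WriteLine(44100.0 * num / den);    // the rounded freqDst
    }
}
```

With the 0.0001 tolerance used in AdjustTempo, the convergents visited are 1/1, 43/44, 44/45 and then 87/89, which is the first one within tolerance.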

Then, we need to resample from 43108.988764044945 to 48000 for playback. And there's also Tempo to take into the equation.

For resampling to 48000, I see 2 options.
A) Ignore it and let the system handle it
B) Find a fraction rounding that works for both conversions

For Tempo, I see 2 options.
A) Compensate for speed shift
B) Round it to a fraction

Are Tempo and TempoFrequency inter-connected in a way that needs to be taken into account?

Writing this, I'm thinking the best may be to
1. Find a fraction rounding that works for both sample rate conversions
2. Adjust tempo to cancel the speed difference
3. Round tempo to a fraction (optional, if slight speed shift is acceptable)

Does this whole idea make sense, or is there something I'm missing?
« Last Edit: 21 Aug '22 - 20:10 by Hanuman »

Hanuman

  • Posts: 89
Re: Rounding Pitch Shift
« Reply #1 on: 22 Aug '22 - 04:51 »
I'll bring in the discussion with the person who contacted me with this idea. Does his idea make sense? I just realized that he's doing a single resampling instead of 2; combined with 2 rate-shifts.

-----------------

OK so let's try this again... first let's start off with a single "magic numbers" equation broken down into a whole slew of examples, so please tell me if you follow before I introduce more things.

All of the following assume source and destination sample rates that are multiples of 22050, a source pitch of exactly 440.00hz and a destination pitch of exactly 432.00hz. The "magic number" equation shared by all of the following examples is:

[source sample rate] --change-speed-> [the fraction "48/49" multiplied by source sample rate] --resample-> [the fraction "440/441" multiplied by destination sample rate] --change-speed-> [destination sample rate]


NOTE: this might also work for sample rates that are a multiple of 11025; back when I did the examples with "magic numbers" there was one situation I found that resulted in a decimal when involving 11025 but not 22050 but I don't remember what it was...


----------------------------------------------------------------

Source sample rate: 22050Hz
Destination sample rate: 22050Hz

Process: 22050 --change-speed-> 21600 --resample-> 22000 --change-speed-> 22050

----------------

Source sample rate: 44100Hz
Destination sample rate: 44100Hz

Process: 44100 --change-speed-> 43200 --resample-> 44000 --change-speed-> 44100

----------------

Source sample rate: 88200Hz
Destination sample rate: 88200Hz

Process: 88200 --change-speed-> 86400 --resample-> 88000 --change-speed-> 88200

----------------

Source sample rate: 176400Hz
Destination sample rate: 176400Hz

Process: 176400 --change-speed-> 172800 --resample-> 176000 --change-speed-> 176400

----------------

Source sample rate: 352800Hz
Destination sample rate: 352800Hz

Process: 352800 --change-speed-> 345600 --resample-> 352000 --change-speed-> 352800

--------------------------------

Source sample rate: 22050Hz
Destination sample rate: 44100Hz

Process: 22050 --change-speed-> 21600 --resample-> 44000 --change-speed-> 44100

----------------

Source sample rate: 44100Hz
Destination sample rate: 88200Hz

Process: 44100 --change-speed-> 43200 --resample-> 88000 --change-speed-> 88200

----------------

Source sample rate: 88200Hz
Destination sample rate: 176400Hz

Process: 88200 --change-speed-> 86400 --resample-> 176000 --change-speed-> 176400

----------------

Source sample rate: 176400Hz
Destination sample rate: 352800Hz

Process: 176400 --change-speed-> 172800 --resample-> 352000 --change-speed-> 352800

--------------------------------

Source sample rate: 44100Hz
Destination sample rate: 22050Hz

Process: 44100 --change-speed-> 43200 --resample-> 22000 --change-speed-> 22050

----------------

Source sample rate: 88200Hz
Destination sample rate: 44100Hz

Process: 88200 --change-speed-> 86400 --resample-> 44000 --change-speed-> 44100

----------------

Source sample rate: 176400Hz
Destination sample rate: 88200Hz

Process: 176400 --change-speed-> 172800 --resample-> 88000 --change-speed-> 88200

----------------

Source sample rate: 352800Hz
Destination sample rate: 176400Hz

Process: 352800 --change-speed-> 345600 --resample-> 176000 --change-speed-> 176400

--------------------------------

Source sample rate: 22050Hz
Destination sample rate: 88200Hz

Process: 22050 --change-speed-> 21600 --resample-> 88000 --change-speed-> 88200

----------------

Source sample rate: 22050Hz
Destination sample rate: 176400Hz

Process: 22050 --change-speed-> 21600 --resample-> 176000 --change-speed-> 176400

----------------

Source sample rate: 22050Hz
Destination sample rate: 352800Hz

Process: 22050 --change-speed-> 21600 --resample-> 352000 --change-speed-> 352800

----------------

Source sample rate: 44100Hz
Destination sample rate: 176400Hz

Process: 44100 --change-speed-> 43200 --resample-> 176000 --change-speed-> 176400

----------------

Source sample rate: 44100Hz
Destination sample rate: 352800Hz

Process: 44100 --change-speed-> 43200 --resample-> 352000 --change-speed-> 352800
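The examples above all follow from two integer multiplications, and the two speed changes together give exactly 432/440. A small sketch to check this, assuming integer sample rates; `MagicChainSketch`, `Chain` and `NetPitch` are hypothetical names used for illustration only.

```csharp
using System;

public static class MagicChainSketch
{
    static long Gcd(long a, long b) => b == 0 ? a : Gcd(b, a % b);

    // Intermediate rates of the chain:
    // src --change-speed-> src * 48/49 --resample-> dst * 440/441 --change-speed-> dst
    public static (long AfterSpeedChange, long AfterResample) Chain(long srcRate, long dstRate)
    {
        if (srcRate % 49 != 0 || dstRate % 441 != 0)
            throw new ArgumentException("Rates must be divisible by 49 and 441 to stay whole numbers.");
        return (srcRate / 49 * 48, dstRate / 441 * 440);
    }

    // Net pitch factor of the two speed changes, (48/49) * (441/440), in lowest terms.
    public static (long Num, long Den) NetPitch()
    {
        long num = 48L * 441, den = 49L * 440;
        long g = Gcd(num, den);
        return (num / g, den / g);
    }

    public static void Main()
    {
        var (a, b) = Chain(22050, 44100);
        Console.WriteLine($"{a}, {b}");        // prints "21600, 44000"
        var (n, d) = NetPitch();
        Console.WriteLine($"{n}/{d}");         // prints "54/55", the same as 432/440
    }
}
```

Only the middle step is a real resampling, and it runs between two whole-number rates (for example 21600 to 44000).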

Hanuman

  • Posts: 89
Re: Rounding Pitch Shift
« Reply #2 on: 22 Aug '22 - 18:34 »
Thinking about it. There are 2 things he's doing.

1. Doing a single resampling that gives the exact output rate
2. Doing the tempo resampling on nearest fraction

I probably could do it in 2 operations
- Rate shift (lossless, precise value)
- Tempo shift (Tempo, TempoFrequency, round both to nearest fraction)

Just need to figure out the math. Let's identify the variables.

I = Input sample rate (44100 / 48000)
O = Output sample rate (48000)
P = Pitch shift (432/440)
R = Rate shift, precise value
T = Tempo shift, rounded to nearest fraction
F = Frequency shift, rounded to nearest fraction

Given I, O and P, I must find the math to calculate R, T and F.

Considering T and F are rounded, I probably have to calculate those first, and then compensate for the speed shift in R. The trade-off will be a slight difference in speed OR pitch; since R is an exact value, we can probably keep exact speed but with a slight pitch alteration. Source pitch is generally not exact anyway so it's no big deal.

The more I think of it, perhaps the idea makes sense after all. Nobody ever thought about this?

The challenge is that R directly affects T and F, so there's a circular dependency between these variables! It's very possible that there's not just 1 solution but multiple solutions per problem, and that we have to evaluate which one is the best... (which solution has lowest pitch alteration)
« Last Edit: 22 Aug '22 - 19:30 by Hanuman »

Hanuman

  • Posts: 89
Re: Rounding Pitch Shift
« Reply #3 on: 22 Aug '22 - 19:42 »
The reason he was doing 2 separate rate shifts is that he could only set the sample rate to whole-number values (eg. 42180 and not 42180.42686).

BASS does seem to allow precise frequency values, so that's not an issue here.

The only 2 variables that alter speed are R and T, so T has to cancel R: T = -R (the minus sign meaning the inverse ratio, 1/R). That simplifies things.

Hanuman

  • Posts: 89
Re: Rounding Pitch Shift
« Reply #4 on: 22 Aug '22 - 20:48 »
I'm getting closer.

Ian, I have this to do a resampling. How do I apply a lossless rate shift that alters both pitch and frequency?

Code:
var sampleRate = 48000;
var chanMix = BassMix.CreateMixerStream(sampleRate, chanInfo.Channels, BassFlags.MixerEnd | BassFlags.Decode).Valid();
BassMix.MixerAddChannel(chanMix, chan, BassFlags.MixerChanNoRampin | BassFlags.AutoFree);

I think I got it.

I can achieve the desired outcome in 3 steps:
1. Resample (rounded) from 44100 to 48888.888
2. Shift Rate (rounded) by 432/440 to get 48000 with desired pitch
3. Restore tempo (rounded)

Because the tempo restoration must be rounded, and Rate and Tempo are the only 2 settings that alter tempo, Rate must be rounded too. Thus I will not be able to maintain exact playback speed and there will be a slight difference. Unless I apply no rounding here, because we're starting from a fraction anyway. 432 / 441.6 becomes a pretty ugly fraction though...

Considering 432/440 is a fraction to begin with, does it make any difference at all? There is no rounding happening on that... but the resampling ratio from 44100 to 48000 with a 432/440 pitch is 0.902045 rounding to 46/51 = 0.901960. So yes rounding still occurs and it will make a difference.

Here's the actual math.
I-nput = 44100
O-utput = 48000
P-itch = 432/440
S-ampling = O / I / P = 1.108592 = 10/9 = 1.111111 (rounded Sr = 0.0025 error)
R-ate shift = P / (1 + Sr) = 0.981818 / 1.002520 = 0.979351
T-empo shift = -R = -0.979351 = -95 / 97 = -0.979381 (rounded Tr = -0.000031 error)

Here, the Sampling rounding will be absorbed by Rate shift, resulting in the speed and pitch being slightly off.

In this case, we get 2 roundings: at the resampling phase and at the tempo shift phase. Pitch is off by 0.25%, and Tempo is off by 0.0031%. For better quality, that's a good compromise. There should be a better sampling rounding ratio than 10/9 to reduce the error though; the fraction-seeking algorithm may be flawed.
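The arithmetic above can be re-checked with a few lines of code. This is a sketch only: it takes the rounded fractions (10/9 and 95/97) as given instead of searching for them, it carries Sr at full precision rather than the rounded 0.0025, and `RoundingErrorSketch` and `Compute` are hypothetical names.

```csharp
using System;

public static class RoundingErrorSketch
{
    // S    = O / I / P          (exact resampling ratio)
    // ErrS = roundedS - S       (error introduced by rounding S)
    // R    = P / (1 + ErrS)     (rate shift that absorbs the resampling error)
    // ErrT = roundedT - R       (error introduced by rounding the tempo ratio)
    public static (double S, double ErrS, double R, double ErrT) Compute(
        double input, double output, double pitch, double roundedS, double roundedT)
    {
        double s = output / input / pitch;
        double errS = roundedS - s;
        double r = pitch / (1 + errS);
        double errT = roundedT - r;
        return (s, errS, r, errT);
    }

    public static void Main()
    {
        var x = Compute(44100, 48000, 432.0 / 440.0, 10.0 / 9.0, 95.0 / 97.0);
        Console.WriteLine($"S = {x.S:F6}, Sr = {x.ErrS:F6}");
        Console.WriteLine($"R = {x.R:F6}, Tr = {x.ErrT:F6}");
    }
}
```

The two errors come out at roughly 0.0025 and 0.00003, in line with the 0.25% and 0.0031% figures quoted above.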
« Last Edit: 22 Aug '22 - 21:52 by Hanuman »

Hanuman

  • Posts: 89
Re: Rounding Pitch Shift
« Reply #5 on: 22 Aug '22 - 23:16 »
I got a better fraction-seeking algorithm now.

I = 44100
O = 48000
P = 432 / 441.6
Sampling = 1.112622 = 49 / 44 = 1.113636, Error = 0.001013
Rate = 0.980824
Tempo = -49 / 50 = -0.98, Error = 0.000824

If my math is right, this gives a pitch error of 0.101% and a speed error of 0.082%. Perfectly acceptable! Unless you want a drop-in replacement for existing audio.

Adding an option to skip the tempo correction is doable. I could have an option Tempo Correction: Exact | Optimized | None.

Hanuman

  • Posts: 89
Re: Rounding Pitch Shift
« Reply #6 on: 25 Aug '22 - 01:58 »
Got it working. The idea was not crazy after all, and the sound clarity improvement is impressive!

I create _chanSrc at 44100, _chanMix (BassMix) at 48000, and _chanOut (BassFx).

Apparently the result will be even better with a higher output sampling rate, since the rounding errors get smaller.

Code:
// Optimized pitch shifting: increased quality at the cost of pitch/speed rounding error.
// I = Input sample rate    (44100)
// O = Output sample rate   (48000)
// P = Pitch shift          (432/440)
// Pitch shifting steps:
// 1. Resample: O / I / P (round to closest fraction, eR = error)
// 2. Rate shift: P / (1 + eR), should give 48000 output
// 3. Tempo adjustment: -R (round to closest fraction, eT = error) -- skip if EffectsTempoCompensation = None
var freqSrc = _chanInfo.Frequency;
var freqOut = ManagedBass.Bass.GetInfo(out var info) ? info.SampleRate : 48000;
var s = Fraction.RoundToFraction((double)freqOut / freqSrc / Pitch, out var eS);
var r = Pitch / (1 + eS);
var t = Fraction.RoundToFraction(-r, out var eT);

// 1. Resampling to output in _chanMix constructor
// 2. Rate Shift
ManagedBass.Bass.ChannelSetAttribute(_chanOut, ChannelAttribute.Frequency, freqOut * r);
// 3. Tempo adjustment
ManagedBass.Bass.ChannelSetAttribute(_chanOut, ChannelAttribute.Tempo,
    EffectsTempoCompensation == TempoCompensation.Optimized ? (1.0 / -t - 1.0) * 100.0 : 0);

Hanuman

  • Posts: 89
Re: Rounding Pitch Shift
« Reply #7 on: 26 Aug '22 - 20:39 »
Yup I've done the tests.

The new algorithm without rounding gives a very slight improvement due to running Tempo directly at the output sample rate.

Adding fraction rounding makes a HUGE difference though! Remains to test how the rounding precision affects the quality.

EDIT

Well well... it seems resampling and tempo shift use different algorithms. Resampling benefits GREATLY from rounding to the closest fraction. Tempo shift? Not so much! At 1.4% tempo difference, I barely notice any difference; at 1.9% difference, I do detect a slight benefit... but really not much.

Perhaps best to limit fraction-rounding to the resampling phase only.

Tempo DOES benefit from running at 48000hz though! Instead of resampling from 44100 to 43xxx and then resampling to 48000.
« Last Edit: 27 Aug '22 - 01:51 by Hanuman »

Hanuman

  • Posts: 89
Re: Rounding Pitch Shift
« Reply #8 on: 27 Aug '22 - 21:05 »
My math was wrong; 's' wasn't even used at all! It kind-of worked but with wrong math.

Now that I understand the order of operations, it can be simplified. Pitch error is smaller, and the quality is better. No longer need to alter tempo as the resampling rounding is also the same ratio applied for tempo adjustment.

Code:
// Optimized pitch shifting for increased quality
// 1. Rate shift to Output * Pitch (rounded)
// 2. Resample to Output (48000hz)
// 3. Tempo adjustment: -Pitch
var freqOut = _deviceInfo.SampleRate;

var r = pitch * rate;
if (EffectsRoundPitch)
{
    r = Fraction.RoundToFraction(r, .005);
}
var t = r / speed;

// 1. Rate Shift (lossless)
ManagedBass.Bass.ChannelSetAttribute(_chanOut, ChannelAttribute.Frequency, freqOut * r);
// 2. Resampling to output in _chanMix constructor
// 3. Tempo adjustment
ManagedBass.Bass.ChannelSetAttribute(_chanOut, ChannelAttribute.Tempo,
    !EffectsSkipTempo ? (1.0 / t - 1.0) * 100.0 : 0);

I round to a fraction using this algorithm (the only one that worked).

Overall, I get less than 0.2hz of pitch difference in exchange for the increased quality.
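For reference, the two values passed to ChannelSetAttribute in the last code block can be computed separately from BASS. A minimal sketch, assuming the ratio r has already been rounded (or left as-is) by the caller; `PitchPlanSketch` and `Plan` are hypothetical names, not part of ManagedBass.

```csharp
using System;

public static class PitchPlanSketch
{
    // freqOut: output sample rate, r: pitch * rate (optionally rounded), speed: 1 = normal.
    // Returns the value for the Frequency attribute and the Tempo percentage.
    public static (double Frequency, double TempoPercent) Plan(double freqOut, double r, double speed)
    {
        if (r <= 0 || speed <= 0)
            throw new ArgumentOutOfRangeException(nameof(r), "r and speed must be positive.");

        double t = r / speed;                       // tempo ratio still to be undone
        double frequency = freqOut * r;             // lossless rate shift
        double tempoPercent = (1.0 / t - 1.0) * 100.0;
        return (frequency, tempoPercent);
    }

    public static void Main()
    {
        // 432/440 reduces to 54/55, so no rounding is needed in this case.
        var (freq, tempo) = Plan(48000, 54.0 / 55.0, 1.0);
        Console.WriteLine($"Frequency = {freq:F2}, Tempo = {tempo:F3}%");
    }
}
```

For a 48000 output and a 432/440 pitch at normal speed, this gives a frequency of about 47127.27 and a tempo change of about +1.85%.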