Author Topic: Detecting Pitch  (Read 446 times)

Hanuman

  • Posts: 58
Detecting Pitch
« on: 24 Jun '22 - 04:38 »
I'm looking at the possibility of auto-detecting the exact pitch of a piece of music so that I can tune it to 432hz more precisely -- looking to do it in .NET

Reading about FFT and various algorithms makes my head spin, as I have no experience in that regard.

It has been done here in Python, to detect pitch between 424hz and 448hz. Is it the best way to do it, and should I try to reproduce exactly this code?

Here there's a collection of algorithms in c++.

Could someone with experience in the field point me in the right direction as to which algorithm would be most suited for my needs? It's not for real-time, I just want to detect the pitch of a music file within a second.

And perhaps someone can explain in simple terms what it's doing so that I can convert it to .NET with BASS? The Python code is actually pretty simple, I just can't tell what it's really doing.

Code: [Select]
    # Compute FFT
    fft_real_result = np.fft.fft(samples)
    fft_freq = np.fft.fftfreq(samples_len, d=timestep)
    fft_real_normed = np.abs(fft_real_result) / len(fft_real_result)

    # Find the tuning frequency
    max_sum=0
    max_freq=0
    for frequency in np.arange(424.0, 448.1, 0.10):
        tone=tones[str(frequency)]
        sum=0
        for freq in tone:
            index = int(freq/freqstep)
            sum+=fft_real_normed[index]
        if sum > max_sum:
            max_sum=sum
            max_freq=frequency
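For anyone trying to follow the snippet above, here is a minimal sketch of what it appears to do, written in plain Python without NumPy. It assumes that `tones[...]` holds the equal-tempered note frequencies for each candidate reference pitch and that a magnitude spectrum is already available. The names `note_frequencies`, `detect_tuning` and `synthetic_spectrum` are hypothetical and are not taken from the original project.

```python
def note_frequencies(a4, semitones=range(-12, 13)):
    """Equal-tempered note frequencies within an octave either side of A4."""
    return [a4 * 2 ** (k / 12) for k in semitones]


def detect_tuning(spectrum, bin_width):
    """Return the A4 candidate (424.0..448.0 Hz) whose note bins sum highest."""
    best_score, best_a4 = 0.0, 0.0
    for step in range(241):  # 0.1 Hz steps, indexed by integer to avoid drift
        a4 = round(424.0 + 0.1 * step, 1)
        # Add up the spectrum magnitude found at every note of this tuning.
        score = sum(spectrum[int(f / bin_width)] for f in note_frequencies(a4))
        if score > best_score:
            best_score, best_a4 = score, a4
    return best_a4


def synthetic_spectrum(a4, bin_width, size=2048):
    """Hypothetical test input: unit peaks exactly on the notes tuned to a4."""
    spectrum = [0.0] * size
    for f in note_frequencies(a4):
        spectrum[int(f / bin_width)] = 1.0
    return spectrum


if __name__ == "__main__":
    print(detect_tuning(synthetic_spectrum(432.0, 0.5), 0.5))
```

In words: for every candidate tuning, look up the spectrum at each note that tuning would produce, add the magnitudes, and keep the candidate with the largest total. Real music will of course give a far messier spectrum than the synthetic one used here.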

Here's a library that achieves that task, but it's a Swift library, not .NET. A similar .NET library would be the ideal solution to avoid implementation errors or deficiencies, but I haven't found any.

Any help or pointers would be greatly appreciated!

What I understand so far: BASS_ChannelGetData can provide me the FFT with flag BASS_DATA_FFT1024 (not sure what FFT size is best here). FFT1024 gives 512 values, for 44100hz sample rate, first value is the magnitude of frequencies between 0 and 43 (44100 / 2 / 512), and each value represents the magnitude of the next 43hz audio range band. Are other flags like BASS_DATA_FFT_REMOVEDC useful here?
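To make the bin arithmetic above concrete, here is a small sketch. It relies only on the general DFT property that bin k of an N-point FFT sits at k * sample_rate / N; the function names are hypothetical and nothing here calls BASS.

```python
def fft_bin_width(sample_rate, fft_size):
    """Spacing in Hz between neighbouring bins of an fft_size-point FFT."""
    return sample_rate / fft_size


def fft_bin_index(freq, sample_rate, fft_size):
    """Bin index for freq, rounding down as the code in this thread does."""
    return int(freq / fft_bin_width(sample_rate, fft_size))


if __name__ == "__main__":
    for size in (1024, 8192, 32768):
        width = fft_bin_width(44100, size)
        # size // 2 is the number of values returned for a real-valued signal
        print(size, size // 2, round(width, 3), fft_bin_index(440.0, 44100, size))
```

At 44100 Hz, a 1024-point FFT puts the whole 424-448 Hz range in one or two bins, which is why a much larger FFT size is needed to separate tunings a fraction of a hertz apart.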
« Last Edit: 24 Jun '22 - 05:13 by Hanuman »

Hanuman

  • Posts: 58
Re: Detecting Pitch
« Reply #1 on: 25 Jun '22 - 07:33 »
OK when browsing various codes, each tries to do something different. Some are for knowing what note you are singing or playing on the guitar, others to react interactively to music.

The Python script does what I need; if I can't understand the algorithm, then I'll translate it one line at a time. That's my best route.

jpf

  • Posts: 120
Re: Detecting Pitch
« Reply #2 on: 25 Jun '22 - 19:16 »
I'd be interested in the evolution of your project. For some time I've been thinking about a way to get the perceived tuning of a poorly tuned pipe organ. I think both problems are somehow related.

I think the code you posted, though it seems incomplete, aims at detecting the tuning of a single A4 note. Real music can be a lot more complex. For instance, the main key may be G#. In that case the content of A4-like notes may mostly be odd harmonics and percussion noise, which will of course be detuned from the fundamental note's pitch. And vibrato / portamento etc. effects will blur adjacent FFT bands.

Audio-to-polyphonic-midi converters seem to address the problem in a clever way, especially those that allocate instruments to match the ADSR curve and harmonic signature of each fundamental tone in a song. I didn't dig into any code examples (I don't know if there are any), so I'm just guessing. But I've heard a few converted songs and the outcome is encouraging.

Hanuman

  • Posts: 58
Re: Detecting Pitch
« Reply #3 on: 25 Jun '22 - 19:31 »
Oops, I accidentally erased this post while updating the code.

Code: [Select]
using ManagedBass;
// ReSharper disable LocalizableElement

namespace HanumanInstitute.BassAudio;

public class PitchDetector : IPitchDetector
{
    private readonly IFileSystemService _fileSystem;

    public PitchDetector(IFileSystemService fileSystem)
    {
        _fileSystem = fileSystem;
    }

    public async Task<float> GetPitchAsync(string filePath) =>
        await Task.Run(() => GetPitch(filePath), default).ConfigureAwait(false);
   
    public float GetPitch(string filePath)
    {
        if (!_fileSystem.File.Exists(filePath))
        {
            throw new FileNotFoundException("Source audio file was not found.", filePath);
        }

        BassDevice.Instance.Init();
       
        var chan = Bass.CreateStream(filePath, Flags: BassFlags.Float | BassFlags.Decode).Valid();
        try
        {
            var chanInfo = Bass.ChannelGetInfo(chan);
            var fft = new float[(int)32768 / 2];
            var fftBuffer = new float[fft.Length];
            var freqStep = (float)chanInfo.Frequency / fft.Length;

            var read = 0;
            var readTotal = 0;
            var maxRead = (int)Bass.ChannelSeconds2Bytes(chan, 100);
            do
            {
                read = Bass.ChannelGetData(chan, fftBuffer, (int)DataFlags.FFT32768);
                if (read > 0)
                {
                    readTotal += read;
                    for (var i = 0; i < fft.Length; i++)
                    {
                        fft[i] += fftBuffer[i];
                    }
                }
            }
            while (read > 0 && readTotal < maxRead);

            // var toneFreq = new[]{16.35f,17.32f,18.35f,19.45f,20.6f,21.83f,23.12f,24.5f,25.96f,27.5f,29.14f,30.87f,32.7f,34.65f,36.71f,38.89f,41.2f,43.65f,46.25f,49f,51.91f,55f,58.27f,61.74f,65.41f,69.3f,73.42f,77.78f,82.41f,87.31f,92.5f,98f,103.83f,110f,116.54f,123.47f,130.81f,138.59f,146.83f,155.56f,164.81f,174.61f,185f,196f,207.65f,220f,233.08f,246.94f,261.63f,277.18f,293.66f,311.13f,329.63f,349.23f,369.99f,392f,415.3f,440f,466.16f,493.88f,523.25f,554.37f,587.33f,622.25f,659.25f,698.46f,739.99f,783.99f,830.61f,880f,932.33f,987.77f,1046.5f,1108.73f,1174.66f,1244.51f,1318.51f,1396.91f,1479.98f,1567.98f,1661.22f,1760f,1864.66f,1975.53f,2093f,2217.46f,2349.32f,2489.02f,2637.02f,2793.83f,2959.96f,3135.96f,3322.44f,3520f,3729.31f,3951.07f,4186.01f,4434.92f,4698.63f,4978.03f,5274.04f,5587.65f,5919.91f,6271.93f,6644.88f,7040f,7458.62f,7902.13f};
            var toneFreq = new float[60];
            for (var i = 0; i < 60; i++)
            {
                toneFreq[i] = GetFrequency(i + 20);
            }
           
            // Find the tuning frequency
            var maxSum = 0.0f;
            var maxFreq = 0.0f;
            for (var i = 424.0f; i < 448.1f; i += 0.1f)
            {
                var sum = 0.0f;
                var tones = Array.ConvertAll(toneFreq, x => x * i / 440f);
                var lastFreq = 0f;
                foreach (var freq in tones)
                {
                    if (lastFreq > 0)
                    {
                        var index = (int)(freq / freqStep);
                        var factor = (freq - lastFreq) / freqStep;
                        sum += fft[index] * factor;
                    }
                    lastFreq = freq;
                }
                if (sum > maxSum)
                {
                    maxSum = sum;
                    maxFreq = i;
                }
            }
            return maxFreq;
        }
        finally
        {
            Bass.StreamFree(chan);
        }
    }

    private float GetFrequency(int keyIndex) => (float)Math.Pow(2, (keyIndex - 49) / 12.0) * 440;
}
« Last Edit: 26 Jun '22 - 17:26 by Hanuman »

jpf

  • Posts: 120
Re: Detecting Pitch
« Reply #4 on: 25 Jun '22 - 23:25 »
I have this idea: if you accumulate all the bands with equal tuning (not just half a tone around A4 but all 8 octaves used in music) into separate groups, one of the groups will have more content than all the others. This group will correspond to the most likely tuning.

Let's say you make 100 groups spaced 1 comma apart. Then your first group will be the sum of the levels of all the bands whose frequency is any tempered note (there'll be 12*8 such notes in music) + 0 commas. Then your 2nd group will accumulate those + 1 comma, and so on.

Of course no FFT size will give you exactly one bin for each of the frequencies. You'll need to do some interpolation.

And you'll probably need to apply some weighting to each frequency because higher frequencies tend to be just harmonics that are not proper tempered notes (notably odd harmonics) and will make the groups dirtier.
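A rough sketch of the grouping idea described above, with several assumptions stated up front: "comma" is read as one cent (1/100 of a semitone), the 100 groups run from -50 to +49 cents around A4 = 440 Hz, interpolation between bins is linear, and the weighting function is a flat placeholder. All function names are hypothetical.

```python
def interpolated_magnitude(spectrum, freq, bin_width):
    """Linearly interpolate between the two bins on either side of freq."""
    position = freq / bin_width
    low = int(position)
    if low + 1 >= len(spectrum):
        return 0.0  # frequency lies beyond the spectrum, contributes nothing
    frac = position - low
    return spectrum[low] * (1 - frac) + spectrum[low + 1] * frac


def score_offset(spectrum, bin_width, offset_cents, keys, weight):
    """Sum the weighted, interpolated level at every key for one tuning offset."""
    a4 = 440.0 * 2 ** (offset_cents / 1200)
    return sum(
        weight(key)
        * interpolated_magnitude(spectrum, a4 * 2 ** ((key - 49) / 12), bin_width)
        for key in keys
    )


def best_offset(spectrum, bin_width, keys=range(1, 89), weight=lambda key: 1.0):
    """Return the offset in cents (-50..49) whose group collects the most level."""
    offsets = range(-50, 50)
    return max(offsets, key=lambda c: score_offset(spectrum, bin_width, c, keys, weight))


if __name__ == "__main__":
    # Toy spectrum with 1 Hz bins and a single peak at 445 Hz.
    toy = [0.0] * 500
    toy[445] = 1.0
    print(best_offset(toy, 1.0, keys=[49]))
```

Passing a `weight` that falls off for high keys would be one way to down-weight the harmonics mentioned above; what curve works best is an open question here.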

Hanuman

  • Posts: 58
Re: Detecting Pitch
« Reply #5 on: 26 Jun '22 - 03:11 »
Not sure I understand your idea; neither do I understand what the code is doing to begin with. What is the long list of frequencies?

One thing you just said makes sense though: mid-range values are a LOT more important than highs and lows, so applying a curve to weight it out would definitely make sense.

Hanuman

  • Posts: 58
Re: Detecting Pitch
« Reply #6 on: 26 Jun '22 - 06:08 »
OK now I get what the code does. Looking at an online pitch generator, he's looking for all the notes on the keyboard. Indeed, it's taking ridiculously low and high tones and counting them just the same!

Now I take only tones from 82.41 to 1396.91

I'm also calculating them manually to have more precision.

Code: [Select]
private float GetFrequency(int keyIndex) => (float)Math.Pow(2, (keyIndex - 49) / 12.0) * 440;
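For reference, the same formula as a Python sketch. Key 49 is A4, and each key is one equal-tempered semitone (a factor of 2^(1/12)) away from its neighbour. The `a4` parameter is an addition for illustration and is not part of the C# one-liner.

```python
def key_frequency(key_index, a4=440.0):
    """Frequency of a piano key, counting A4 as key 49."""
    return a4 * 2 ** ((key_index - 49) / 12)


if __name__ == "__main__":
    for key in (20, 49, 69):
        print(key, round(key_frequency(key), 2))
```

Keys 20 and 69 give the 82.41 and 1396.91 limits mentioned above.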
Now I get
Python: 441.5, 435.6, 443.4, 426.7, 439.5, 428.4
256: 424.0, 433.5, 433.5, 424.0, 433.5, 433.5
512: 433.5, 433.5, 424.0, 433.5, 424.0, 433.5
1024: 433.5, 433.5, 433.5, 433.5, 433.5, 433.5
2048: 433.5, 433.5, 425.7, 424.0, 440.9, 333.5
4096: 424.0, 434.5, 424.8, 433.9, 430.7, 424.0
8192: 434.6, 434.7, 432.1, 437.3, 434.6, 434.1
16384: 435.2, 434.1, 432.1, 434.2, 434.2, 432.2
32768: 437.6, 435.9, 433.8, 424.0, 437.7, 434.6

Still inconsistent... let me try with different types of music. I'm expecting it to be 440-442hz for most music.

Björk: I get stable 440.5, 440.2, 441.5 with FFT16384, and inconsistent results with anything lower.

DJ Project: at FFT16384, all comes out as 434.1, 434.4, 434.7hz; but at FFT32768, all comes out as 437.1, 437.6, 438.2

Elisa: FFT16384 gives 435.0, 434.1, 434.6; FFT32768 gives 438.0, 436.5, 437.9

Nickelback: FFT16384 gives 434.4, 434.2, 432.6; FFT32768 gives 437.6, 436.5, 436.0

I notice a lot of ~434hz being detected instead of 440hz.

An improvement would be to apply a weighting curve to favor middle tones in a smoother way... my math is a bit rusty, can anyone help me with that weighting function?

I've updated my code above.

Here's another thing. Starting at a tone frequency of 79.41, FFT index is 39, then 84.1 is index 42, then 89.1 is index 44. That doesn't leave much room for precision to detect subtle tuning! Higher up, tone 400.2 has index 200 and 424.0 has index 212, now that's more reasonable. So I definitely need FFT32768, and let me try shifting the 50 notes to detect 6 tones up, up to 1965hz

Thinking of it, FFT doesn't have enough precision even at 32768 for the kind of subtle analysis I'm trying to do here... and that alone can explain the flaky results.

With many audio files getting detected as 424hz or 430hz ... since lower frequencies measure larger bands, it is natural that it will generally return a lower pitch, where an FFT band is 1/3 of a tone instead of 1/4 of a tone for higher pitch.
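One way to put numbers on the precision concern is to express the width of a single FFT bin in cents at different note frequencies. The index figures quoted above (400.2 at index 200) correspond to a step of about 2 Hz; the sketch below instead assumes the usual bin spacing of sample_rate / fft_size at 44100 Hz, so the exact figures are illustrative only.

```python
import math


def bin_width_in_cents(freq, sample_rate=44100.0, fft_size=32768):
    """Musical width, in cents, of one FFT bin starting at freq."""
    width_hz = sample_rate / fft_size
    return 1200 * math.log2((freq + width_hz) / freq)


if __name__ == "__main__":
    for freq in (82.41, 220.0, 440.0, 1396.91):
        print(freq, round(bin_width_in_cents(freq), 2))
    # For comparison, the whole 424-448 Hz scan range in cents:
    print(round(1200 * math.log2(448.0 / 424.0), 1))
```

Under these assumptions a bin near 82 Hz spans roughly 28 cents, while a bin near 440 Hz spans about 5 cents, so the low notes say very little about a tuning difference of a few cents.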

The Python algorithm is definitely flawed...
« Last Edit: 26 Jun '22 - 08:16 by Hanuman »

Hanuman

  • Posts: 58
Re: Detecting Pitch
« Reply #7 on: 26 Jun '22 - 16:54 »
Since the Python development is active (published just 26 days ago), I've opened a ticket on his side with the issues.
https://github.com/CardLin/Exact432HzConverter/issues/1

Hanuman

  • Posts: 58
Re: Detecting Pitch
« Reply #8 on: 26 Jun '22 - 17:29 »
Updated the code above.
- Always use FFT32768
- Had an error where freqStep was INT instead of FLOAT.
- Added weighting to compensate for the variable band width.

Code: [Select]
var factor = (freq - lastFreq) / freqStep;
Ideally would add a weighting curve on top of that, but the math can be delicate to get right.
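As a worked example of what that factor evaluates to: it is the distance from a note to the note one semitone below, measured in FFT bins. The sketch below assumes equal temperament with A4 as key 49; `band_weight` is a hypothetical name.

```python
def band_weight(key_index, bin_width, a4=440.0):
    """Number of FFT bins between a key and the key one semitone below it."""
    freq = a4 * 2 ** ((key_index - 49) / 12)
    previous = a4 * 2 ** ((key_index - 50) / 12)
    return (freq - previous) / bin_width


if __name__ == "__main__":
    step = 44100 / 32768  # assumed bin spacing, for illustration only
    for key in (25, 37, 49, 61):
        print(key, round(band_weight(key, step), 2))
```

Because the gap between semitones grows in proportion to frequency, the factor doubles with every octave, so each note's single bin is scaled by how many bins its semitone band covers.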

Now I get pretty good results!

INNA: 440.5, 439.6, 439.1, 441.1, 440.1, 440.0
DJ Project: 438.4, 438.2, 437.9, 440.3, 437.5
Symphony X: 441.1, 441.3, 441.0, 440.7, 441.7

jpf

  • Posts: 120
Re: Detecting Pitch
« Reply #9 on: 26 Jun '22 - 17:46 »
Sorry I wasn't clear. I'll try to put some pseudocode together. Don't expect anything soon, though, I don't have much spare time these days.

But I see you're getting the idea (not from me, I guess).

Hanuman

  • Posts: 58
Re: Detecting Pitch
« Reply #10 on: 27 Jun '22 - 04:10 »
I refreshed my parabolic math skills and created a nice curve for 80 tones.

Code: [Select]
var curve = (float)-Math.Pow(j - 1 - 40, 2) / 2000 + 1;
Unfortunately, I get better results with 60 flat values than with 80 parabolic values. It was worth a try.
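For anyone curious about the shape of that curve, here is the same expression in Python. It assumes `j` runs from 1 to 80 over the tone list, which is a guess based on the "80 tones" above.

```python
def parabolic_weight(j):
    """Downward parabola peaking at 1.0 for j = 41, as in the C# expression."""
    return -((j - 1 - 40) ** 2) / 2000 + 1


if __name__ == "__main__":
    for j in (1, 21, 41, 61, 80):
        print(j, round(parabolic_weight(j), 4))
```

The weight is 1.0 in the middle of the list and drops to about 0.2 at either end, so the outermost tones still contribute roughly a fifth of their level.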

I also tried doing proper rounding instead of rounding down to find the FFT band, but rounding down gave better results somehow.
Code: [Select]
// var index = (int)Math.Round(tones[j] / freqStep, 0);
var index = (int)(tones[j] / freqStep);

I'm pretty satisfied with the results. Source code here.

Hanuman

  • Posts: 58
Re: Detecting Pitch
« Reply #11 on: 27 Jun '22 - 21:34 »
I think I understand your idea. I took each of the 12 tones of an octave and summed them up over 5 octaves, then looked for the highest sum among the specific tones.

Code: [Select]
// Find the tuning frequency
var maxSum = 0.0f;
var maxFreq = 440.0f;
for (var i = 424.0f; i < 448.1f; i += 0.1f)
{
    var tones = Array.ConvertAll(toneFreq, x => x * i / 440f);
    for (var tone = 0; tone < 12; tone++)
    {
        var sum = 0.0f;
        for (var octave = 0; octave < 5; octave++)
        {
            var j = tone + octave * 12 + 1;
            // We get more consistent results with rounding down (int) than with Math.Round
            // var index = (int)Math.Round(tones[j] / freqStep, 0);
            var index = (int)(tones[j] / freqStep);
            // FFT bands are larger at lower frequencies and smaller at higher frequencies, compensate for that.
            var factor = (tones[j] - tones[j - 1]) / freqStep;
            // Applying a parabolic curve to favor middle-tones is not improving the results.
            // var curve = (float)-Math.Pow(j - 1 - 40, 2) / 2000 + 1;
            sum += fft[index] * factor;
        }
        if (sum > maxSum)
        {
            maxSum = sum;
            maxFreq = i;
        }
    }
}
return maxFreq;

Here are the results.

All Tones: INNA
01 Heart Drop.mp3: 440.601
02 Bamboreea (ft. J-Son).mp3: 440.00098
03 Bad Boys.mp3: 439.10092
04 Too Sexy.mp3: 440.501
05 Bop Bop (ft. Eric Turner).mp3: 440.10098
06 Rendez Vous.mp3: 440.00098
07 Yalla.mp3: 440.10098
08 Walking On The Sun.mp3: 440.201
09 Fool Me.mp3: 440.501
10 Body And The Sun.mp3: 440.501
11 Salinas Skies.mp3: 440.90103
12 Devil's Paradise.mp3: 440.301
13 Diggy Down (ft. Marian Hill).mp3: 440.601
14 Low.mp3: 440.601
15 Tell Me.mp3: 440.601
16 Diggy Down (Piano Deluxe).mp3: 441.40106
17 Sun Goes Up.mp3: 440.201
18 Summer in December (ft. Morandi).mp3: 440.301

Separate tones: INNA
01 Heart Drop.mp3: 440.80103
02 Bamboreea (ft. J-Son).mp3: 440.10098
03 Bad Boys.mp3: 439.50095
04 Too Sexy.mp3: 441.10104
05 Bop Bop (ft. Eric Turner).mp3: 440.201
06 Rendez Vous.mp3: 437.60083
07 Yalla.mp3: 440.00098
08 Walking On The Sun.mp3: 440.601
09 Fool Me.mp3: 440.601
10 Body And The Sun.mp3: 440.00098
11 Salinas Skies.mp3: 440.90103
12 Devil's Paradise.mp3: 441.20105
13 Diggy Down (ft. Marian Hill).mp3: 440.70102
14 Low.mp3: 440.70102
15 Tell Me.mp3: 440.70102
16 Diggy Down (Piano Deluxe).mp3: 441.60107
17 Sun Goes Up.mp3: 440.601
18 Summer in December (ft. Morandi).mp3: 440.10098

All Tones: DJ Project
01 Te Chem (radio).mp3: 438.40088
02 Viseaza.mp3: 438.20087
03 Te chem (maxi).mp3: 437.90085
04 As vrea (Sa te pot uita) - (ra.mp3: 440.301
05 Experience.mp3: 437.50082

Separate Tones: DJ Project
01 Te Chem (radio).mp3: 439.80096
02 Viseaza.mp3: 440.10098
03 Te chem (maxi).mp3: 439.80096
04 As vrea (Sa te pot uita) - (ra.mp3: 441.00104
05 Experience.mp3: 437.90085

All Tones: Enigma
01 The Voice Of Enigma.mp3: 441.00104
02 Principles Of Lust SadenessFind LoveSadeness (Reprise).mp3: 439.10092
03 The Eyes Of Truth.mp3: 440.70102
04 Callas Went Away.mp3: 447.50143
05 Smell Of Desire.mp3: 440.301
06 Knocking On Forbidden Doors.mp3: 445.5013
07 Mea Culpa.mp3: 440.201
08 Morphing Thru Time.mp3: 441.00104
09 The Dream Of The Dolphin.mp3: 441.20105
10 Beyond The Invisible.mp3: 441.00104
11 Between Mind & Heart.mp3: 441.60107
12 Why!....mp3: 441.70108
13 Shadows In Silence.mp3: 440.201
14 The Child In Us.mp3: 441.70108
15 The Cross Of Changes.mp3: 441.30106
16 The Screen Behind The Mirror.mp3: 442.1011
17 T.N.T. For The Brain.mp3: 443.30118
18 Second Chapter.mp3: 441.50107

Separate Tones: Enigma
01 The Voice Of Enigma.mp3: 441.70108
02 Principles Of Lust SadenessFind LoveSadeness (Reprise).mp3: 439.40094
03 The Eyes Of Truth.mp3: 440.601
04 Callas Went Away.mp3: 424.30002
05 Smell Of Desire.mp3: 441.20105
06 Knocking On Forbidden Doors.mp3: 445.5013
07 Mea Culpa.mp3: 442.40112
08 Morphing Thru Time.mp3: 441.00104
09 The Dream Of The Dolphin.mp3: 441.9011
10 Beyond The Invisible.mp3: 440.00098
11 Between Mind & Heart.mp3: 440.501
12 Why!....mp3: 441.40106
13 Shadows In Silence.mp3: 441.10104
14 The Child In Us.mp3: 441.40106
15 The Cross Of Changes.mp3: 441.40106
16 The Screen Behind The Mirror.mp3: 446.7014
17 T.N.T. For The Brain.mp3: 443.7012
18 Second Chapter.mp3: 441.60107

This technique is sometimes slightly better, sometimes slightly worse... overall not much improvement.

I also tried another idea, to reduce "unclean" tones by subtracting the FFT band above/below, but that also didn't improve the results.

jpf

  • Posts: 120
Re: Detecting Pitch
« Reply #12 on: 28 Jun '22 - 06:41 »
Yeah, that was basically my idea, though I didn't think of taking just the stronger tone at each key. That may discard useful information. I was rather thinking of taking into account all of the contributions to each possible tuning, like: given one possible tuning (let's say n commas apart from a standard tuning), summing the contributions to the frequencies of all the keys at that specific tuning by each FFT bin, taking into account the bin's frequency nearness to that frequency and its level. This means interpolation. Then repeat for all the possible different tunings, and choose the one with the most contributions.

It seemed like a good idea, but of course if it doesn't give useful results you have to wonder why.

Now I think: how do we humans detect the tuning of a complex song?
First we isolate chords or solo notes from the background mess of noise and harmonics.
This takes some time, for me, at least. On some songs I don't have a clue until several seconds into the song. If a song begins with unpitched percussion, I have to wait until some pitched instrument breaks in. Even then, vibrato and glissandos can make the tuning not so evident. I suspect FFT32768's window of less than a second won't be enough. I know FFT relies on increasingly shorter spectrum analyses put together to make it a fast algorithm, but I don't know how it combines them. So, taking 10 or so FFT32768s and averaging them may not be equivalent to taking only one FFT327680 that would then have a 10 sec window. Maybe someone with more knowledge than me can confirm that.

I'm quite used to detecting the pitch of a single a capella singing voice so as to begin accompanying the song at the proper key "on the fly". This usually takes me a few seconds. (A different matter is to be able to transpose the song "on the fly" to that key!). But often the tuning turns out to be off by up to 25 commas high or low, and the pipe organ doesn't have a main tuning knob like digital organs do. So I sound detuned at least until the singer realizes that he/she is off and corrects his/her own tuning. Of course the untrained congregation thinks it was me who did wrong!

Imagine how difficult it could be to detect the exact tuning of untrained voices!

I can't even imagine the complexity of the process going on in my brain to detect the tuning of a song. But I know I can do it, given time. I do it when I tune my digital organ at home "on the fly" to accompany some song that's playing on the radio.

For an algorithm to outperform a human in detecting tuning it has to be clever. Since our ears/brain work by making a spectrum analysis of the sound, using FFT doesn't give the algorithm an advantage over humans.

I don't mean to discourage you. You've made great progress in just a few hours. At this time I think you are a lot more able than me to solve the problem!

Hanuman

  • Posts: 58
Re: Detecting Pitch
« Reply #13 on: 29 Jun '22 - 22:37 »
Most ideas so far were counter-productive (wasting time!), but you have yet another (good? bad?) idea.

Instead of analyzing 100s in a big chunk, what if we were to analyze block by block (or summing a few blocks together)? Block 1 we might think it's 442.1hz, block 2 we're not quite sure because of glissando, block 3 is silent, block 4 is a clear note on 441hz ... that's closer to how you'd do it with your ears.

Then the question is... how to aggregate those results to come to a conclusion? We'd need a "decisiveness" factor for each score. One weird tone detected at 424hz could be very decisive and mess up the conclusion though. Not sure it will be worth the effort.
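One possible way to aggregate per-block results, sketched here only as an idea and not something that was tried in this thread: take a weighted median of the block estimates, using the "decisiveness" as the weight. A median is less sensitive than a mean to a single confident outlier such as a stray 424hz block. How to compute the decisiveness itself is left open; the weights below are made up for illustration.

```python
def weighted_median(estimates):
    """Weighted median of (frequency, weight) pairs; zero-weight blocks are ignored."""
    ranked = sorted((freq, weight) for freq, weight in estimates if weight > 0)
    total = sum(weight for _, weight in ranked)
    running = 0.0
    for freq, weight in ranked:
        running += weight
        if running >= total / 2:
            return freq
    raise ValueError("no block has a positive weight")


if __name__ == "__main__":
    blocks = [
        (442.1, 0.6),  # fairly clear block
        (436.0, 0.1),  # glissando, not very decisive
        (440.0, 0.0),  # silent block, ignored
        (441.0, 1.0),  # clear note
        (424.0, 0.9),  # confident but out-of-context tone
        (441.2, 0.8),
    ]
    print(weighted_median(blocks))
```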

jpf

  • Posts: 120
Re: Detecting Pitch
« Reply #14 on: 30 Jun '22 - 02:18 »
At this point I'm rather out of ideas. I think you're right about splitting the audio into blocks, so as to work as much as possible with melodies (taking into account only the most intense tone in each block). But yeah, any out of context tone will lead to wrong conclusions.

I don't know how much effort you're willing to put into this project, but it seems that each small improvement would take lots of effort, so maybe you should settle for what you've got so far.

As for myself, I'm willing to put in as little effort as possible. I'm more interested in the "perceived" tuning of a poorly tuned assembly. I don't need to extract the tuning from real music, because I can make one pipe speak at a time, and so get a collection of accurate tunings (one for each pipe). Then the problem is rather finding what those different tunings have in common and how they contribute to the "perceived" tuning.

My naïve thought is: "if it were that simple, then someone may have done it already, and you just need to copy his project".

I wonder if this isn't a problem for Artificial Intelligence? But that's completely out of my league. I could maybe contribute to the creation of a knowledge base from which the AI can learn, but nothing else.

By the way, have you checked that the results you've got closely match the real tuning of each of those songs?

Edit:
Sorry if I made you waste your time. I just wanted to help.
« Last Edit: 30 Jun '22 - 03:12 by jpf »

Hanuman

  • Posts: 58
Re: Detecting Pitch
« Reply #15 on: 30 Jun '22 - 05:02 »
I got no idea how I would manually measure that.

jpf

  • Posts: 120
Re: Detecting Pitch
« Reply #16 on: 30 Jun '22 - 15:52 »
To "manually measure" the tuning of a recording I'd play along with it on my virtual pipe organ, and move its tuning knob until I sound in tune with the recording. The display indicates the detuning from 440 Hz, in cents, and that would be the tuning of the recording, too, since I sound in tune with it. I don't remember caring about what the actual tuning was, I just want to sound in tune with the recording, so I didn't really test this method.

I guess this doesn't help you. You probably wanted a scientific method. I don't know of one.

Once again let me slip an unrequested suggestion: instead of trying to measure the tuning of the recordings to later compare them against the output of your code, you could use recordings of which you know the tuning beforehand for testing.

Hanuman

  • Posts: 58
Re: Detecting Pitch
« Reply #17 on: 1 Jul '22 - 23:36 »
One easy way to reduce false results is to limit the range of the scan. I was scanning 424hz to 448hz. I can have an "extendedRange" parameter; when False, I scan 436hz to 444hz.
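A minimal sketch of such a switch, with the two ranges taken from the post above and everything else (function name, Python instead of C#) being illustrative. Building the candidates from an integer index also avoids the accumulated floating-point drift visible in results such as 440.00098.

```python
def candidate_tunings(extended_range=False):
    """Candidate A4 frequencies in 0.1 Hz steps for the chosen scan range."""
    low, high = (424.0, 448.0) if extended_range else (436.0, 444.0)
    count = round((high - low) / 0.1)
    return [round(low + 0.1 * i, 1) for i in range(count + 1)]


if __name__ == "__main__":
    narrow = candidate_tunings()
    wide = candidate_tunings(extended_range=True)
    print(len(narrow), narrow[0], narrow[-1])
    print(len(wide), wide[0], wide[-1])
```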
For all the songs being produced out there, what's the range that can really be expected?

I'm getting 447.9hz for some Black Eyed Peas song, possible? I reduce that tone score by 20% and it's still the highest match...

Can you tell what's the real pitch of this?
https://www.youtube.com/watch?v=Wqw1LoyN-BE
« Last Edit: 2 Jul '22 - 01:46 by Hanuman »

jpf

  • Posts: 120
Re: Detecting Pitch
« Reply #18 on: 2 Jul '22 - 06:31 »
This turned out to be a bit more work than I initially thought, but I got some results.

Unfortunately in the song there are no notes sustained long enough to compare the tuning of my virtual organ with that of the song with high precision (to get a 1 Hz resolution on a 440 Hz sine tone measurable using a frequency counter you'd need a 1 sec steady note!). Fortunately the human ear/brain works quite well on the harmonics, even those that are not really present but just perceived.

On this song I was barely able to choose between a 440 Hz and a 447.8 Hz tuning. 447.8 Hz sounds perfectly in tune to me, and 440 Hz sounds a little flat.

I'm attaching these play along files so you (and other users) can judge by yourself.

I couldn't find a pipe organ soundfont with precise tuning, so I used a pure sine wave soundfont:
https://github.com/datascopeanalytics/honk/blob/master/Sine%20Wave.sf2

For the conversion of cents to frequency ratio and back I used:
http://www.sengpielaudio.com/calculator-centsratio.htm
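The conversion itself is short enough to sketch. It uses the standard definition of a cent as 1/1200 of an octave; the function names are made up for this example.

```python
import math


def cents_between(freq, reference=440.0):
    """Interval from reference to freq, in cents."""
    return 1200 * math.log2(freq / reference)


def frequency_at(cents, reference=440.0):
    """Frequency that lies the given number of cents above reference."""
    return reference * 2 ** (cents / 1200)


if __name__ == "__main__":
    print(round(cents_between(447.8), 2))
    print(round(frequency_at(30), 2), round(frequency_at(31), 2))
```

A 447.8 Hz tuning is a little over 30 cents above 440 Hz, and one cent near that pitch is roughly a quarter of a hertz.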

I also verified that the central A was at 440 Hz and 447.8 Hz respectively using a frequency counter (a Cool Edit Pro .xfm filter coded by me).
« Last Edit: 2 Jul '22 - 11:57 by jpf »

Hanuman

  • Posts: 58
Re: Detecting Pitch
« Reply #19 on: 2 Jul '22 - 22:54 »
so 447.8 sounds about right then?

to get longer sounds, you can slow down playback :)

jpf

  • Posts: 120
Re: Detecting Pitch
« Reply #20 on: 4 Jul '22 - 04:45 »
Hanuman wrote: "so 447.8 sounds about right then?"

Yes, definitely. I wasn't able to try 447.9hz because the resolution of my tuning slider is just cents. I've now modified my code so I can get better resolution. I'll try 447.9hz soon, but I doubt my ear can notice any difference from 447.8 Hz.

That said, I guess your 447.9 Hz is indeed the right tuning. I'm amazed by your success! I had my doubts, but this proves you hit the nail on the head!

Hanuman wrote: "to get longer sounds, you can slow down playback :)"
That sounds like a good idea!

If you meant resampling, I tried that and the outcome is worse. Resampling to 1/5 of the original sample rate makes the pitch more than 2 octaves lower, and makes it impossible to recognize any harmony. Only central notes or their harmonics (440 Hz - 2200 Hz) help my ears tell the tuning. For tuning beyond those limits I resort to counting the beats between the original note and the reference note (diapason or electronic tuner).

But if you meant time stretching preserving the pitch, that looks promising.

I tried BASS_FX's BASS_FX_TempoCreate. The outcome has annoying artifacts making it unsuitable for the purpose.

Then I tried Rubberband. Still annoying artifacts, but some notes come out clear. I guess those notes fall exactly into one FFT bin.

I was trying to play along, but I'll have to get used to it; the song sounds weird. At normal speed the melody and harmony are easy to follow, they're intuitive, friendly. Making the recordings I attached took me just a few minutes. But at 1/5 speed everything sounds like isolated chords with no relation to each other. I'll keep trying, though.

Anyway, this is just one song. It would be advisable to run your code on some test songs. I mean songs whose tuning you know beforehand. I was thinking of rendering some midis using the "Sine wave" soundfont. Those will have a very precise tuning. You can then decorate them with percussion and other pitchless sounds to judge the robustness of your algorithm, if you want. Let me know what you think of this.
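As a simpler stand-in for rendered midis, a test file with a known tuning can be generated directly. The sketch below writes a pure sine tone to a mono 16-bit WAV file using only the Python standard library; the file name and helper names are hypothetical, and a single sine is of course far easier to analyse than a real song.

```python
import math
import struct
import wave


def sine_samples(freq, seconds, sample_rate=44100, amplitude=0.5):
    """Samples in the range -1..1 of a sine tone at exactly freq Hz."""
    count = int(seconds * sample_rate)
    return [
        amplitude * math.sin(2 * math.pi * freq * n / sample_rate)
        for n in range(count)
    ]


def write_wav(path, samples, sample_rate=44100):
    """Write samples as a mono, 16-bit PCM WAV file."""
    with wave.open(path, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)  # 2 bytes per sample = 16-bit
        wav.setframerate(sample_rate)
        frames = b"".join(struct.pack("<h", int(s * 32767)) for s in samples)
        wav.writeframes(frames)


if __name__ == "__main__":
    # A4 tuned to 447.8 Hz for ten seconds.
    write_wav("a4_447_8.wav", sine_samples(447.8, 10.0))
```

Summing several `sine_samples` lists for different notes of the same tuning, then adding noise, would give progressively harder test cases.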

I didn't download your code because I won't be able to compile or run it. I'm only handy with VB and some C. I may be able to translate it to VB but I won't do it if I can avoid it.

Edit:
Maybe other musicians in the forum can help with this. I feel so isolated, and I don't trust my ears so much. This is an interesting and now promising project!
« Last Edit: 4 Jul '22 - 05:19 by jpf »

Hanuman

  • Posts: 58
Re: Detecting Pitch
« Reply #21 on: 4 Jul '22 - 22:47 »
If 447.9 is right, then I think the algorithm is good enough as it is. Over-complicating it will likely not bring much benefit.

432hz Player with auto pitch detection is released!

jpf

  • Posts: 120
Re: Detecting Pitch
« Reply #22 on: 5 Jul '22 - 03:47 »
Well done!

If you're satisfied with it as is, I won't do further testing.