Fast, Accurate, and Robust Pitch Estimation: NordicSMC Winter School

SLIDE 1

Fast, Accurate, and Robust Pitch Estimation

NordicSMC Winter School 2019 March 7, 2019 Jesper Kjær Nielsen jkn@create.aau.dk

Audio Analysis Lab, CREATE Aalborg University, Denmark Website: http://audio.create.aau.dk YouTube: http://tinyurl.com/yd8mo55z

SLIDE 2

Jesper Kjær Nielsen | Fast, Accurate, and Robust Pitch Estimation

Motivation

[Figure: waveform of a periodic signal; x-axis: time [s]]

Periodic signals

A periodic signal repeats itself after some period τ or, equivalently, with some frequency ω0.

◮ We refer to ω0 as either the pitch (perceptual) or the fundamental frequency (physical).

◮ How do we estimate this value from possibly noisy and non-stationary data?

SLIDE 3

Motivation

Some examples of periodic signals and applications:

◮ Voiced speech and singing

  • Are people singing on-key?
  • Diagnosis of Parkinson's disease

◮ Many musical instruments (e.g., guitar, violin, flute, trumpet, piano)

  • Tuning of instruments
  • Music transcription

◮ Electrocardiographic (ECG) signals

  • Measure your heart rate or heart rate variability
  • Heart defect diagnosis

◮ Rotating machines

  • Vibration analysis
  • Rotation speed

SLIDE 4

Motivation

Example: RPM estimation from tachometer signal

[Figure: tachometer signal at an SNR of 40 dB; x-axis: time [s]; y-axis: voltage [V]]

SLIDE 5

Motivation

Example: RPM estimation from tachometer signal

Figure courtesy of A. Brandt, Noise and vibration analysis: signal analysis and experimental procedures. John Wiley & Sons, 2011.

SLIDE 6

Motivation

Example: RPM estimation from tachometer signal

[Figure: tachometer signal at an SNR of 0 dB; x-axis: time [s]]

SLIDE 7

Outline

◮ Correlation-based Methods

◮ Nonlinear Least Squares Methods

  • The Nonlinear Least Squares (NLS) Estimator
  • The Harmonic Summation (HS) estimator*

◮ Comparison of Methods

  • Robustness to noise
  • Time-frequency resolution
  • Summary

◮ Model Improvements

◮ Summary


SLIDE 9

Correlation-based Methods

[Figure: waveform of a periodic signal; x-axis: time [s]]

For a periodic signal x(n) with a period τ = 2π/ω0, we have that

x(n) = x(n − τ) = x(n − 2π/ω0) . (1)

◮ Unfortunately, τ is unknown, so we have to try out different τ’s (or ω0’s) to find one that satisfies the above equation.

◮ Real-world signals are not perfectly periodic, so we might never find one.

◮ Instead, the estimate of τ is the value which minimises some objective function.

SLIDE 10

Correlation-based Methods

Consider the objective function

$$J(a, \tau) = \sum_{n=\tau_{\mathrm{MAX}}}^{N-1} |e(n)|^2 \tag{2}$$

for a segment of data $\{x(n)\}_{n=0}^{N-1}$, where

$$e(n) = x(n) - a\,x(n - \tau), \quad a > 0 \,\wedge\, \tau \in [\tau_{\mathrm{MIN}}, \tau_{\mathrm{MAX}}] \tag{3}$$

This is often referred to as comb filtering.

[Block diagram: x(n) → filter 1 − a e^{−jωτ} → e(n)]
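As a rough illustration of objective (2), the sketch below evaluates the comb-filtering error on a synthetic sinusoid for a range of integer lags with a fixed at 1. The test signal, the lag range, and the function name `comb_filter_error` are assumptions made for this example only; they do not come from the slides.

```python
import math

def comb_filter_error(x, a, tau, tau_max):
    """Objective (2): sum over n = tau_max..N-1 of |x(n) - a*x(n - tau)|^2."""
    return sum((x[n] - a * x[n - tau]) ** 2 for n in range(tau_max, len(x)))

# Hypothetical test signal: a sinusoid with an integer period of 20 samples.
period = 20
x = [math.sin(2 * math.pi * n / period) for n in range(200)]

# Evaluate the objective for a = 1 on a grid of integer lags.
tau_max = 50
errors = {tau: comb_filter_error(x, 1.0, tau, tau_max)
          for tau in range(10, tau_max + 1)}
best_tau = min(errors, key=errors.get)
```

For this noise-free signal the error is numerically zero at the true period and also at every integer multiple of it inside the search range, so `best_tau` may come out as 20 or 40.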

SLIDE 11

Correlation-based Methods

[Figure: periodogram; x-axis: frequency [Hz], up to 2000 Hz; y-axis: amplitude]

SLIDE 12

Correlation-based Methods

[Figure: periodogram and comb filter; x-axis: frequency [Hz], up to 2000 Hz; y-axis: amplitude]

SLIDE 13

Correlation-based Methods

Conditioned on τ, the optimal value for a is

$$\hat{a}(\tau) = \max\left( \frac{\sum_{n=\tau_{\mathrm{MAX}}}^{N-1} x(n)\,x(n-\tau)}{\sum_{n=\tau_{\mathrm{MAX}}}^{N-1} x^2(n-\tau)},\; 0 \right) \tag{4}$$

Inserting this into the objective J(a, τ) yields the estimator

$$\hat{\tau} = \operatorname*{argmax}_{\tau \in [\tau_{\mathrm{MIN}}, \tau_{\mathrm{MAX}}]} \max\left(\phi(\tau), 0\right) \tag{5}$$

where φ(τ) ∈ [−1, 1] is the normalised cross-correlation function given by

$$\phi(\tau) = \frac{\sum_{n=\tau_{\mathrm{MAX}}}^{N-1} x(n)\,x(n-\tau)}{\sqrt{\sum_{n=\tau_{\mathrm{MAX}}}^{N-1} x^2(n) \, \sum_{n=\tau_{\mathrm{MAX}}}^{N-1} x^2(n-\tau)}} \tag{6}$$
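The estimator in (5) and (6) can be sketched in a few lines. This is a minimal illustration restricted to integer lags, not the implementation used by any of the published methods; the two-harmonic test signal and the names `ncc` and `estimate_period` are assumptions for the example.

```python
import math

def ncc(x, tau, tau_max):
    """Normalised cross-correlation phi(tau) of equation (6)."""
    idx = range(tau_max, len(x))
    num = sum(x[n] * x[n - tau] for n in idx)
    den = math.sqrt(sum(x[n] ** 2 for n in idx) *
                    sum(x[n - tau] ** 2 for n in idx))
    return num / den if den > 0 else 0.0

def estimate_period(x, tau_min, tau_max):
    """Estimator (5): the integer lag that maximises max(phi(tau), 0)."""
    return max(range(tau_min, tau_max + 1),
               key=lambda tau: max(ncc(x, tau, tau_max), 0.0))

# Hypothetical test signal: two harmonics with a period of 40 samples.
period = 40
x = [math.sin(2 * math.pi * n / period)
     + 0.5 * math.sin(4 * math.pi * n / period + 0.3)
     for n in range(400)]

tau_hat = estimate_period(x, tau_min=25, tau_max=70)
```

The search range here is chosen so that it contains the true period but not twice the period. A wider range would also include the lag 80, where φ is numerically just as large.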

SLIDE 14

Correlation-based Methods

[Figure: x(n) and the delayed signal x(n − 45) for τ = 45, with the summation range from τMAX to N − 1 marked; below, max(φ(τ), 0) versus τ with max(φ(45), 0) indicated]

SLIDE 15

Correlation-based Methods

[Figure: x(n) and the delayed signal x(n − 95) for τ = 95, with the summation range from τMAX to N − 1 marked; below, max(φ(τ), 0) versus τ with max(φ(95), 0) indicated]

SLIDE 16

Correlation-based Methods

[Figure: x(n) and the delayed signal x(n − 169) for τ = 169, with the summation range from τMAX to N − 1 marked; below, max(φ(τ), 0) versus τ with max(φ(169), 0) indicated]

SLIDE 17

Correlation-based Methods

[Figure: x(n) and the delayed signal x(n − 220) for τ = 220, with the summation range from τMAX to N − 1 marked; below, max(φ(τ), 0) versus τ with max(φ(220), 0) indicated]

SLIDE 18

Correlation-based Methods

[Figure: two panels; top: max(φ(τ), 0) versus τ [samples], 60 to 220; bottom: max(φ(f), 0) versus f [Hz], 200 to 1000]

SLIDE 19

Correlation-based Methods

... but is anyone actually using the comb filtering method?

◮ PRAAT (Boersma, 1993), well over 1000 citations (Google Scholar): maximises a windowed normalised cross-correlation function.

◮ RAPT (Talkin, 1995), nearly 1000 citations (Google Scholar): maximises a normalised cross-correlation function.

◮ YIN (Cheveigné, 2002), nearly 2000 citations (Google Scholar): minimises the comb filtering error for a = 1.

◮ Kaldi (Ghahremani et al., 2014), nearly 150 citations (Google Scholar): maximises a normalised cross-correlation function.

SLIDE 20

Correlation-based Methods

Was that really everything?

No! There are four problems with the correlation-based methods. They

  1. are prone to producing subharmonic errors,
  2. have a sub-optimal time-frequency resolution,
  3. are not robust to noise, and
  4. are not statistically efficient.

SLIDE 21

Correlation-based Methods

Subharmonic error

[Figure: max(φ(τ(f)), 0) versus frequency [Hz], 100 to 1000]

SLIDE 22

Correlation-based Methods

Subharmonic error

[Figure: periodogram and comb filter; x-axis: frequency [Hz], up to 2000 Hz; y-axis: amplitude]

SLIDE 23

Correlation-based Methods

What can we do about these problems?

◮ Hundreds of published pitch estimators try to solve these problems using various heuristics.

◮ A fundamental flaw of the comb-filtering principle?

SLIDE 24

Correlation-based Methods

Five minutes active break

Please complete the SMCNordic pitch survey.

◮ Go to http://tinyurl.com/y3ny4n4n

◮ Fill out the form to the best of your ability


SLIDE 26

Nonlinear Least Squares Methods

Harmonic Model

[Figure: waveform x(n) between 930 and 950 ms; x-axis: n [ms]]

SLIDE 27

Nonlinear Least Squares Methods

Harmonic Model

[Figure: the same waveform decomposed as x(n) = h1(n) + h2(n) + h3(n) + e(n); x-axis: n [ms]]

SLIDE 29

Nonlinear Least Squares Methods

Harmonic Model

Mathematical Model

The signal model for any periodic signal is

$$s(n) = \sum_{l=1}^{L} h_l(n) = \sum_{l=1}^{L} A_l \cos(\omega_0 l n + \phi_l) \tag{7}$$

where

  • $A_l$ is the real amplitude of the lth harmonic,
  • $\phi_l$ is the initial phase of the lth harmonic,
  • $\omega_0$ is the fundamental frequency in radians/sample, and
  • $L$ is the number of harmonics (the model order).
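To make the model in (7) concrete, the following sketch synthesises a short harmonic signal. The sampling frequency, pitch, amplitudes, and phases are arbitrary values chosen for illustration, and `harmonic_signal` is a hypothetical helper name.

```python
import math

def harmonic_signal(amplitudes, phases, omega0, n_samples):
    """Equation (7): s(n) = sum over l = 1..L of A_l * cos(omega0*l*n + phi_l)."""
    return [
        sum(A * math.cos(omega0 * l * n + phi)
            for l, (A, phi) in enumerate(zip(amplitudes, phases), start=1))
        for n in range(n_samples)
    ]

# Hypothetical example: a 200 Hz pitch sampled at 8000 Hz with L = 3 harmonics.
fs = 8000.0
f0 = 200.0
omega0 = 2 * math.pi * f0 / fs  # fundamental frequency in radians/sample
s = harmonic_signal([1.0, 0.5, 0.25], [0.0, 0.4, 1.1], omega0, 200)
```

With these values the period is fs/f0 = 40 samples, so s(n) and s(n + 40) agree up to rounding.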

SLIDE 31

Nonlinear Least Squares Methods

Harmonic Model

Can we actually use models?

In 1987, G. E. P. Box (a British statistician) wrote: "Essentially, all models are wrong, but some are useful."

◮ Do NOT think about models as exact physical representations of a phenomenon in the real world.

◮ Instead, think of models as an explicit way of stating your assumptions about the phenomenon.

◮ Models can be criticised (and improved on) since the assumptions are explicit.

◮ Models allow us to assert under which conditions a problem is optimally solved.

SLIDE 32

Nonlinear Least Squares Methods

Method of Least Squares

Instead of considering the comb-filtering error

$$e(n) = x(n) - a\,x(n - \tau), \tag{8}$$

we consider the least-squares error

$$e(n) = x(n) - s(n, \theta), \quad n = 0, 1, \ldots, N - 1 \tag{9}$$

where s(n, θ) is a harmonic model given by

$$s(n, \theta) = \sum_{l=1}^{L} A_l \cos(l\omega_0 n + \phi_l) \tag{10}$$

$$\theta = \begin{bmatrix} A_1 & \cdots & A_L & \phi_1 & \cdots & \phi_L & \omega_0 \end{bmatrix}^T \tag{11}$$

SLIDE 33

Nonlinear Least Squares Methods

Method of Least Squares

The method of least squares

[Block diagram: the data generation produces x(n); the signal model, driven by θ, produces s(n, θ); a summing node forms the error e(n) = x(n) − s(n, θ)]

◮ The vector θ contains the model parameters.

◮ The signal s(n, θ) is produced by the signal model.

◮ The signal x(n) is the observed data.

◮ The error consists of noise and model inaccuracies.

SLIDE 34

Nonlinear Least Squares Methods

Method of Least Squares

The nonlinear least squares (NLS) method is that of solving

$$\hat{\theta} = \operatorname*{argmin}_{\theta} J(\theta) \tag{12}$$

where J(θ) measures the squared error

$$J(\theta) = \sum_{n=0}^{N-1} |e(n)|^2 = \sum_{n=0}^{N-1} |x(n) - s(n, \theta)|^2 \tag{13}$$

◮ Solving this problem naïvely is very computationally demanding since the fundamental frequency is a nonlinear parameter.

◮ Asymptotically, however, an efficient solution exists which for historical reasons is called harmonic summation (Noll, 1969).


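SLIDE 36

The NLS Estimator

The harmonic model

$$x(n) = \sum_{l=1}^{L} \left[ a_l \cos(l\omega_0 n) - b_l \sin(l\omega_0 n) \right] + e(n) \tag{14}$$

for n = n0, n0 + 1, ... , n0 + N − 1 can be written as

$$x = Z_L(\omega_0)\,\alpha_L + e \tag{15}$$

where

$$Z_L(\omega) = \begin{bmatrix} c(\omega) & c(2\omega) & \cdots & c(L\omega) & s(\omega) & s(2\omega) & \cdots & s(L\omega) \end{bmatrix}$$

$$c(\omega) = \begin{bmatrix} \cos(\omega n_0) & \cdots & \cos(\omega(n_0 + N - 1)) \end{bmatrix}^T$$

$$s(\omega) = \begin{bmatrix} \sin(\omega n_0) & \cdots & \sin(\omega(n_0 + N - 1)) \end{bmatrix}^T$$

$$\alpha_L = \begin{bmatrix} a_L^T & -b_L^T \end{bmatrix}^T, \quad a_L = \begin{bmatrix} a_1 & \cdots & a_L \end{bmatrix}^T, \quad b_L = \begin{bmatrix} b_1 & \cdots & b_L \end{bmatrix}^T$$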

SLIDE 37

The NLS Estimator

The least squares error is

$$\sum_{n=0}^{N-1} e^2(n) = e^T e = \left[ x - Z_L(\omega_0)\alpha_L \right]^T \left[ x - Z_L(\omega_0)\alpha_L \right] \tag{16}$$

Conditioned on ω0, the estimate of αL is

$$\hat{\alpha}_L(\omega_0) = \left[ Z_L^T(\omega_0) Z_L(\omega_0) \right]^{-1} Z_L^T(\omega_0)\,x \tag{17}$$

Inserting this back into the objective yields the NLS estimator

$$\hat{\omega}_{0,L} = \operatorname*{argmax}_{\omega_0 \in [\omega_{\mathrm{MIN}}, \omega_{\mathrm{MAX}}]} x^T Z_L(\omega_0) \left[ Z_L^T(\omega_0) Z_L(\omega_0) \right]^{-1} Z_L^T(\omega_0)\,x \tag{18}$$

The NLS estimator has been known since (Quinn and Thomson, 1991), but is costly to compute.
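A brute-force evaluation of (17) and (18) on a coarse frequency grid can be sketched as follows. This is a plain illustration of the cost function for one fixed model order, assuming n0 = 0 and a noise-free test signal; it is not the fast algorithm of Nielsen et al. (2017), and the helper names `build_Z`, `solve`, and `nls_cost` are invented for this example.

```python
import math

def build_Z(omega, L, N):
    """Rows of Z_L(omega) for n = 0..N-1: L cosine columns, then L sine columns."""
    return [[math.cos(l * omega * n) for l in range(1, L + 1)]
            + [math.sin(l * omega * n) for l in range(1, L + 1)]
            for n in range(N)]

def solve(A, b):
    """Solve A y = b by Gaussian elimination with partial pivoting."""
    m = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(m):
        piv = max(range(col, m), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, m):
            f = M[r][col] / M[col][col]
            for c in range(col, m + 1):
                M[r][c] -= f * M[col][c]
    y = [0.0] * m
    for i in reversed(range(m)):
        y[i] = (M[i][m] - sum(M[i][c] * y[c] for c in range(i + 1, m))) / M[i][i]
    return y

def nls_cost(x, omega, L):
    """Return the objective in (18) and the linear-parameter estimate in (17)."""
    N, K = len(x), 2 * L
    Z = build_Z(omega, L, N)
    Ztx = [sum(Z[n][k] * x[n] for n in range(N)) for k in range(K)]
    ZtZ = [[sum(Z[n][i] * Z[n][j] for n in range(N)) for j in range(K)]
           for i in range(K)]
    alpha = solve(ZtZ, Ztx)
    return sum(v * a for v, a in zip(Ztx, alpha)), alpha

# Hypothetical noise-free test signal: 200 Hz pitch, two harmonics, fs = 8000 Hz.
fs, N = 8000.0, 200
w_true = 2 * math.pi * 200.0 / fs
x = [math.cos(w_true * n + 0.3) + 0.5 * math.cos(2 * w_true * n - 1.0)
     for n in range(N)]

# Grid search over candidate pitches in Hz (5 Hz spacing, model order L = 2).
f0_hat = max(range(100, 401, 5),
             key=lambda f: nls_cost(x, 2 * math.pi * f / fs, 2)[0])
```

Since A cos(θ + φ) = A cos(φ) cos(θ) − A sin(φ) sin(θ), the first entry of the estimated αL at the true frequency should be close to cos(0.3) and the third close to −sin(0.3) for this signal.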

SLIDE 38

The NLS Estimator

[Figure: cost function plotted against ω0N/(2π) [cycles/segment], 0.5 to 5]

  1. Compute the NLS cost function

$$\hat{\omega}_{0,L} = \operatorname*{argmax}_{\omega_0 \in [\omega_{\mathrm{MIN}}, \omega_{\mathrm{MAX}}]} x^T Z_L(\omega_0) \left[ Z_L^T(\omega_0) Z_L(\omega_0) \right]^{-1} Z_L^T(\omega_0)\,x \tag{19}$$

     on an F/L-point uniform grid for all model orders L ∈ {1, ... , LMAX}.

  2. Optionally refine the LMAX grid estimates.
  3. Do model comparison.

SLIDE 39

The NLS Estimator

Fast NLS Algorithm

A MATLAB implementation of the NLS estimator

    % create an estimator object (the data independent step is computed)
    f0Estimator = fastF0Nls(nData, maxNoHarmonics, f0Bounds);
    % analyse a segment of data
    [f0Estimate, estimatedNoHarmonics, estimatedLinParam] = ...
        f0Estimator.estimate(data);

◮ The algorithm also includes model comparison.

◮ The algorithm can also be set up to work for a model with a non-zero DC value.

◮ A C++ implementation is also available (although not as refined as the MATLAB implementation).

◮ Can be downloaded from https://github.com/jkjaer/fastF0Nls.


SLIDE 41

The Harmonic Summation (HS) estimator

Harmonic summation (HS) estimator

Asymptotically,

$$\lim_{N \to \infty} \frac{2}{N} Z_L^T(\omega_0) Z_L(\omega_0) = I_{2L} . \tag{20}$$

Using this limit as an approximation gives the harmonic summation estimator (Noll, 1969)

$$\hat{\omega}_{0,L} = \operatorname*{argmax}_{\omega_0 \in [\omega_{\mathrm{MIN}}, \omega_{\mathrm{MAX}}]} x^T Z_L(\omega_0) Z_L^T(\omega_0)\,x = \operatorname*{argmax}_{\omega_0 \in [\omega_{\mathrm{MIN}}, \omega_{\mathrm{MAX}}]} \sum_{l=1}^{L} |X(\omega_0 l)|^2$$

where X(ω) is the discrete-time Fourier transform of the data segment.

The HS estimator is also referred to as approximate NLS (aNLS).
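The harmonic summation objective can be sketched by evaluating the DTFT directly at the harmonic frequencies. This is only an illustration: it uses a direct sum rather than an FFT, a fixed model order, and a noise-free test signal, all of which are assumptions for the example, as are the names `dtft` and `harmonic_summation`.

```python
import cmath
import math

def dtft(x, omega):
    """X(omega) = sum over n of x(n) * exp(-j*omega*n)."""
    return sum(xn * cmath.exp(-1j * omega * n) for n, xn in enumerate(x))

def harmonic_summation(x, omega, L):
    """HS objective: sum over l = 1..L of |X(omega*l)|^2."""
    return sum(abs(dtft(x, omega * l)) ** 2 for l in range(1, L + 1))

# Hypothetical test signal: 200 Hz pitch, three harmonics, fs = 8000 Hz, N = 400.
fs, N = 8000.0, 400
w_true = 2 * math.pi * 200.0 / fs
x = [math.cos(w_true * n)
     + 0.5 * math.cos(2 * w_true * n + 0.4)
     + 0.25 * math.cos(3 * w_true * n + 1.1)
     for n in range(N)]

# Grid search over candidate pitches in Hz (5 Hz spacing, model order L = 3).
f0_hs = max(range(100, 401, 5),
            key=lambda f: harmonic_summation(x, 2 * math.pi * f / fs, 3))
```

Because the segment holds an integer number of periods, each harmonic contributes (N·A_l/2)² at the true pitch, which gives 40000 + 10000 + 2500 = 52500 for this signal.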

SLIDE 42

Harmonic summation (HS) estimator

NLS vs. HS

Some remarks:

◮ The HS method works very well, unless the fundamental frequency is low or the maximum harmonic component is close to the Nyquist frequency.

◮ The HS method can be implemented very efficiently using a single FFT.

◮ The order of complexity for NLS has recently been decreased to that of HS (Nielsen et al., 2017).

[Figure: HS and NLS cost functions plotted against ω0N/(2π) [cycles/segment], 0.5 to 5]


SLIDE 44

Comparison of Methods

What could be evaluated?

  1. Estimation accuracy
  2. Robustness to noise
  3. Time-frequency resolution
  4. Computational complexity


SLIDE 46

Comparison of Methods

Robustness to noise

Simulation setup

◮ Segment size of 25 ms at a sampling frequency of 8000 Hz.

◮ Estimate the pitch from 1000 Monte Carlo runs for every SNR.

◮ In each run, the true pitch is randomly selected from [90, 380] Hz and the true phases are also generated at random.

◮ The true amplitudes are exponentially decreasing.

◮ The true model order is 7.

◮ Each method searches for a pitch in the range [80, 400] Hz.

◮ The maximum model order in NLS is set to 15.

◮ The noise is white and Gaussian.

◮ No pitch tracking is used in any method.
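One Monte Carlo run of a setup like the one above might generate its data roughly as follows. The slides give neither the amplitude decay rate nor the phase distribution, so the decay factor 0.7 and the uniform phases are assumptions, and `simulate_segment` and `rmse` are hypothetical helper names.

```python
import math
import random

def simulate_segment(snr_db, rng, fs=8000.0, duration=0.025, order=7, decay=0.7):
    """Draw one noisy harmonic segment; returns (samples, true pitch in Hz)."""
    n_samples = int(round(fs * duration))  # 25 ms at 8000 Hz gives 200 samples
    f0 = rng.uniform(90.0, 380.0)          # true pitch in Hz
    omega0 = 2.0 * math.pi * f0 / fs
    # Assumed decay factor: the slides only say "exponentially decreasing".
    amplitudes = [decay ** (l - 1) for l in range(1, order + 1)]
    phases = [rng.uniform(-math.pi, math.pi) for _ in range(order)]
    clean = [sum(A * math.cos(l * omega0 * n + phi)
                 for l, (A, phi) in enumerate(zip(amplitudes, phases), start=1))
             for n in range(n_samples)]
    # Scale white Gaussian noise to reach the requested SNR.
    power = sum(v * v for v in clean) / n_samples
    noise_std = math.sqrt(power / 10.0 ** (snr_db / 10.0))
    noisy = [v + rng.gauss(0.0, noise_std) for v in clean]
    return noisy, f0

def rmse(estimates, truths):
    """Root-mean-square error between estimated and true pitches."""
    return math.sqrt(sum((e - t) ** 2 for e, t in zip(estimates, truths))
                     / len(truths))

rng = random.Random(0)
segment, true_f0 = simulate_segment(snr_db=10.0, rng=rng)
```

Repeating this for many runs per SNR and passing each estimator's outputs to `rmse` would give one point per SNR on an RMSE curve.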

SLIDE 51

Comparison of Methods

Robustness to noise

[Figure, built up over slides 47 to 51: RMSE [Hz] (logarithmic axis) versus SNR [dB], −10 to 30 dB; curves: CRLB, Fast NLS, Comb filtering, YIN]

Average computation times in MATLAB

Fast NLS: 7.6 ms, Comb filter: 2.4 ms, YIN: 0.7 ms

SLIDE 53

Comparison of Methods

Robustness to noise

No noise and window size of 25 ms.

SLIDE 54

Comparison of Methods

Robustness to noise

20 dB SNR and window size of 25 ms.

SLIDE 55

Comparison of Methods

Robustness to noise

15 dB SNR and window size of 25 ms.

SLIDE 56

Comparison of Methods

Robustness to noise

10 dB SNR and window size of 25 ms.

SLIDE 57

Comparison of Methods

Robustness to noise

5 dB SNR and window size of 25 ms.

SLIDE 58

Comparison of Methods

Robustness to noise

0 dB SNR and window size of 25 ms.

SLIDE 59

Comparison of Methods

Robustness to noise

−5 dB SNR and window size of 25 ms.

SLIDE 60

Comparison of Methods

Robustness to noise

−10 dB SNR and window size of 25 ms.


SLIDE 62

Comparison of Methods

Time-frequency resolution

Simulation setup

◮ SNR of 30 dB at a sampling frequency of 8000 Hz.

◮ Estimate the pitch from 1000 Monte Carlo runs for every segment time.

◮ In each run, the true pitch is randomly selected from [90, 380] Hz and the true phases are also generated at random.

◮ The true amplitudes are exponentially decreasing.

◮ The true model order is 7.

◮ Each method searches for a pitch in the range [80, 400] Hz.

◮ The maximum model order in NLS is set to 15.

◮ The noise is white and Gaussian.

◮ No pitch tracking is used in any method.

SLIDE 66

Comparison of Methods

Time-frequency resolution

[Figure, built up over slides 63 to 66: RMSE [Hz] (logarithmic axis) versus segment length [ms], 10 to 50 ms; curves: CRLB, Fast NLS, Comb filtering, YIN]

SLIDE 68

Comparison of Methods

Time-frequency resolution

Window size of 25 ms and no noise.

SLIDE 69

Comparison of Methods

Time-frequency resolution

Window size of 20 ms and no noise.

SLIDE 70

Comparison of Methods

Time-frequency resolution

Window size of 16 ms and no noise.

SLIDE 71

Comparison of Methods

Time-frequency resolution

Window size of 15 ms and no noise.

SLIDE 72

Comparison of Methods

Time-frequency resolution

Window size of 14 ms and no noise.

SLIDE 73

Comparison of Methods

Time-frequency resolution

Window size of 12 ms and no noise.

SLIDE 74

Comparison of Methods

Time-frequency resolution

Window size of 11 ms and no noise.

SLIDE 75

Comparison of Methods

Time-frequency resolution

Window size of 10 ms and no noise.

SLIDE 76

Comparison of Methods

Time-frequency resolution

Window size of 9 ms and no noise.


SLIDE 78

Comparison of Methods

Summary

Correlation-based Methods

A periodic signal satisfies

x(n) = x(n − τ) (21)

where τ = 2π/ω0 is the period.

  + Intuitive and simple
  + Low computational complexity
  + Mature and refined set of methods
  +/− No need to estimate the model order
  − Interpolation needed for fractional delay estimation
  − Poor time-frequency resolution
  − Sensitive to noise

SLIDE 79

Comparison of Methods

Summary

Parametric Methods

Estimate the parameters in

$$x(n) = \sum_{l=1}^{L} A_l \cos(l\omega_0 n + \phi_l) + e(n) \tag{22}$$

  + High estimation accuracy
  + Work very well even in noisy conditions
  + Good time-frequency resolution
  +/− The model order has to be estimated
  − Higher computational complexity
  − Early stage methods without fine tuning (yet)
  − Might produce over-optimistic results (e.g., due to non-stationarity)


SLIDE 81

Model Improvements

What is wrong with the harmonic model?

The harmonic model

So far, we have used the model

$$x(n) = s(n) + e(n) = \sum_{l=1}^{L} A_l \cos(\omega_0 l n + \phi_l) + e(n) \tag{23}$$

What could be improved?

◮ Noise model: Noise is typically not white, but coloured.

◮ Pitch tracking: The pitch is typically smoothly evolving between successive frames.

◮ Inharmonic pitch: For, e.g., stiff-stringed instruments, the frequencies of the harmonics {ωl} deviate (slightly) from whole multiples of the pitch ($\omega_l = \omega_0 l \sqrt{1 + B l^2}$).

◮ Non-stationary pitch: Within a segment, the pitch is typically not stationary, but time-varying.
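The inharmonicity relation quoted above is easy to tabulate. In the sketch below the value B = 1e-3 is an arbitrary illustrative coefficient, not a figure from the slides, and `partial_frequencies` is a hypothetical helper name.

```python
import math

def partial_frequencies(omega0, L, B=0.0):
    """omega_l = omega0 * l * sqrt(1 + B * l^2) for l = 1..L."""
    return [omega0 * l * math.sqrt(1.0 + B * l * l) for l in range(1, L + 1)]

# With B = 0 the partials are exact whole multiples of the pitch.
harmonic = partial_frequencies(1.0, 10)

# With an assumed B = 1e-3 the higher partials drift upwards.
stiff = partial_frequencies(1.0, 10, B=1e-3)
ratio_10 = stiff[9] / harmonic[9]  # equals sqrt(1 + 0.1) for l = 10
```

For this B, the tenth partial sits roughly 5 % above ten times the pitch, while the first partial is almost unchanged.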

SLIDE 82

Model Improvements

Non-stationary pitch estimation

◮ Real-world signals are non-stationary since the fundamental frequency is continuously changing.

◮ The harmonic model assumes that the fundamental frequency is constant in a segment of data.

◮ We can extend the model of the phase of the lth harmonic component to

$$\theta_l(n) \approx \phi_l + l\omega_0 n + l\beta_0 n^2/2 \tag{24}$$

  where β0 is the fundamental chirp rate.

◮ We refer to this model as the harmonic chirp model

$$s(n) = \sum_{l=1}^{L} A_l \cos(l\beta_0 n^2/2 + l\omega_0 n + \phi_l) \tag{25}$$
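The harmonic chirp model in (25) can be synthesised in the same way as the stationary model. The sketch below also differentiates the phase in (24) with respect to n to obtain the instantaneous frequency l(ω0 + β0 n). The parameter values and both function names are illustrative assumptions.

```python
import math

def harmonic_chirp(amplitudes, phases, omega0, beta0, n_samples):
    """Equation (25): sum over l of A_l * cos(l*beta0*n^2/2 + l*omega0*n + phi_l)."""
    return [
        sum(A * math.cos(l * beta0 * n * n / 2.0 + l * omega0 * n + phi)
            for l, (A, phi) in enumerate(zip(amplitudes, phases), start=1))
        for n in range(n_samples)
    ]

def instantaneous_frequency(omega0, beta0, n, l=1):
    """Derivative of the phase in (24) with respect to n: l * (omega0 + beta0 * n)."""
    return l * (omega0 + beta0 * n)

# Hypothetical values: the fundamental doubles from 0.1 to 0.2 rad/sample
# over the first 100 samples.
omega0, beta0 = 0.1, 0.001
chirp = harmonic_chirp([1.0, 0.5], [0.0, 0.4], omega0, beta0, 240)
stationary = harmonic_chirp([1.0], [0.0], omega0, 0.0, 5)  # beta0 = 0
```

Setting β0 = 0 recovers the stationary harmonic model, so `stationary[3]` equals cos(0.3) up to rounding.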

SLIDE 83

Model Improvements

Non-stationary Pitch Estimation

Nonlinear least squares (NLS) objective:

$$J_L(\omega_0, \beta_0) = x^T Z_L(\omega_0, \beta_0) \left[ Z_L^T(\omega_0, \beta_0) Z_L(\omega_0, \beta_0) \right]^{-1} Z_L^T(\omega_0, \beta_0)\,x \tag{26}$$

Harmonic chirp summation objective:

$$J_L(\omega_0, \beta_0) = x^T Z_L(\omega_0, \beta_0) Z_L^T(\omega_0, \beta_0)\,x \tag{27}$$

[Figure: objective over the (ω0, β0) plane; ω0 from 0.1 to 0.6, β0 from −5·10⁻³ to 5·10⁻³]

SLIDE 84

Model Improvements

Non-stationary Pitch Estimation

Window size of 30 ms, 75 % overlap, and no noise

SLIDE 85

Model Improvements

Non-stationary Pitch Estimation

Window size of 30 ms, 75 % overlap, and no noise


SLIDE 87

Summary

◮ Published correlation-based methods are more mature than published parametric methods in that they tend to include everything (pitch detection, estimation, and tracking) and are less computationally costly.

◮ However, parametric pitch estimation methods typically outperform correlation-based methods in terms of estimation accuracy, noise robustness, and time-frequency resolution.

◮ The modelling assumptions are explicit in parametric methods.

◮ Consequently, we can easily extend the model to take more complex phenomena into account.

◮ Besides NLS, examples of other parametric methods are subspace and filtering methods (Christensen and Jakobsson, 2009).

SLIDE 88

Resources

◮ Audio Analysis Lab: https://audio.create.aau.dk/

◮ Pitch Estimation for Dummies: http://madsgc.blog.aau.dk/resources/

◮ MATLAB code: https://github.com/jkjaer/fastF0Nls

◮ YouTube videos: http://tinyurl.com/yd8mo55z

[1] J. K. Nielsen, T. L. Jensen, J. R. Jensen, M. G. Christensen, and S. H. Jensen, “Fast fundamental frequency estimation: Making a statistically efficient estimator computationally efficient,” Elsevier Signal Processing, vol. 135, pp. 188–197, 2017.

[2] J. K. Nielsen, M. G. Christensen, and S. H. Jensen, “Default Bayesian estimation of the fundamental frequency,” IEEE Trans. Audio, Speech, Lang. Process., vol. 21, no. 3, pp. 598–610, Mar. 2013.

[3] M. G. Christensen and A. Jakobsson, Multi-Pitch Estimation, San Rafael, CA, USA: Morgan & Claypool, 2009.