Better Polynomial-Based Derivative Estimators

Numerical Methods for Derivative Estimation

We found that pre-filtering the input data to remove extraneous high frequency effects was effective at reducing noise sensitivity of derivative estimator filters. But filter designing can be difficult, and the results did not appear to be particularly economical in terms of computing effort.

In this section, we will consider modifying the polynomial methods to combine noise rejection and derivative estimation features, obtaining more promising results.


Rejecting high frequencies

When performing a analysis of a filter, it was helpful for the analysis to use a complex-valued sine-cosine wave combination


To calculate the response of a filter in discrete steps of length w T = π * m/M

   Σ ak · en(j π m/M)

Without going into too much theory about this, we can simplify somewhat using the notation

   en(j w m/M)  ≡ zn  

We can observe that when w = 0, z = 1, and when w m/M = π or -π, z = -1. These real-valued special cases are particularly easy to analyze.

Using this notation, we want to try to introduce new terms to our polynomial derivative estimator so that it vigorously rejects frequencies at the high frequency Nyquist limit. If we have a polynomial that does this, without affecting the low frequency terms, that could provide a combined filter form with all the right properties.

Let's consider the filter form

    L(z) ≡ z-K/2 (z+1)K / 2K

It is easy to verify that when frequency w T has the value 0, L(z) = 1, and when frequency w m/M = π, L(z) = 0. So this is promising.

The K parameter is selected to be a small even integer to keep the complexity small. Let us try the value K = 4. The pure z factors have the effect of "time shifting" to keep the filter centered. This filter expression can then be expanded into the form:

    L(z)  =  1/16 · (1 z-2 + 4 z-1 + 6 z0 + 4 z1 + 1 z2)

Here is the frequency response of this primitive filter.

rolloff from zeroes at inf

To apply this, we can multiply the high frequency noise rejection polynomial and the Central Difference polynomial, and collect terms. But we are not going to follow this further, because this result is not satisfactory. While the attenuation at high frequencies appears beneficial, there are side effects of excessive attenuation in the desired response range below "20% of the Nyquist limit."


Max flat filter

There is another class of filters that start with the simple filter above, but also include additional second-order terms that flatten the curve at the low end. The resulting filters have the following properties:

  1. The response curve starts very flat: multiple derivatives are exactly zero at frequency 0.
  2. The almost-flat response band provides a uniform almost-full response at low frequencies.

Filters with these properties are relatively complicated to design, so we will not consider the details here. But as an example, the coefficients below are for a typical maxflat filter of this class.

    L(z)  =  ( -0.010907 z-4 + -0.031250 z-3 + 0.043630 z-2 + 0.281250 z-1 + ...
                0.434555  z0 + 0.281250 z1 + 0.043630 z2 + -0.031250 z3 + -0.010907 z4 )

Here is its frequency response.

maxflat design

This is distinctly better. Near zero frequencies, the flat gain of 1.0 will not interfere with the derivative estimates, while very high frequencies are vigorously rejected. For purposes of comparison, here is an example of a combined max-flat filter with a Central Differences estimator. This one-pass estimator requires less than half of the computational effort of the experimental two-stage lowpass filtering.

    Combined max-flat / Central Differences derivative estimator 
   0.00018  -0.00112   0.00277   0.02529  -0.00596  -0.17224  
  -0.24723   0.00000   0.24723   0.17224   0.00596  -0.02529  
  -0.00277   0.00112  -0.00018

Here is its combined frequency response.

13-term combined estimator

This is a usable estimator filter. However, it still suffers from the typical limitations of all maxflat designs:

  1. Increasingly impaired accuracy as frequencies increase.
  2. Poor attenuation of noise in the mid-band "transition" region.


More efficient maxflat solution

Pavel Holoborodko devised a purely analytical method for directly designing very compact derivative estimator filters that have the same kind of max-flat property. The efficiency of these designs is remarkable, but you can see in the following plot, the designs have the usual accuracy and noise problems in the middle frequencies. Here is a plot of his "noise-robust" first derivative estimator of order 9.

robust low noise result

Be sure to check his site[1] to get complete information about the design method and results. If the mid-band accuracy and noise problems are of no concern to you, the efficiency and asymptotic low-frequency performance make these estimators appealing. If you want to experiment further, here for reference is a copy of the table of filter designs that he provides.

    Smooth noise-robust differentiators (Pavel Holoborodko)[1] 

   Order 5
     (1/8h)   ( -1.0000  -2.0000   0.0000  2.0000  1.0000 )

   Order 7
     (1/32h)  ( -1.0000  -4.0000  -5.0000   0.0000  5.0000  4.0000  1.0000 )

   Order 9
     (1/128h) (  -1.0000  -6.0000  -14.0000  -14.0000   0.0000  14.0000  ...
                 14.0000   6.0000   1.0000 )
   Order 11
     (1/512h) (  -1.0000  -8.0000  -27.0000  -48.0000 -42.0000   0.0000  ...
                 42.0000  48.0000  27.0000  8.0000  1.0000 )


Is that the best that can be done?

Though we're finally starting to get something useful, the maxflat polynomial designs seem to take us only part of the way. Maxflat polynomial design methods assume that it is sufficient to concentrate on flatness and attenuation at the limits of the representable frequency band, hoping that whatever occurs between is going to be acceptable.

The whole idea of trying to fit a polynomial curve to conform exactly to a number of specified points is a bit suspect — that is not how you would ordinarily try to fit a curve to noisy data. What you would (and should) most likely do is calculate some kind of "best fit" curve that represents the overall character of the desired response without necessarily matching individual data points exactly. It is reasonable that some kind of "best fit" strategy would be inherently less vulnerable to noise interference. We will examine this idea next, which will bring us to the conclusion of the series.



[1] See Pavel Holoborodko's Smooth Noise-Robust Differentiators page for details about these designs.


Contents of the "Numerical Estimation of Derivatives" section of the website, including this page, are licensed under a Creative Commons Attribution-ShareAlike 4.0 International License unless otherwise specifically noted. For complete information about the terms of this license, see The license allows usage and derivative works for individual or commercial purposes under certain restrictions. For permissions beyond the scope of this license, see the contact information page for this site.