Filtering Noise (Part I)

If you own a car, you have probably noticed that the speedometer needle's position varies only relatively slowly, regardless of how the car actually accelerates or decelerates. Unless your speedometer is some variation on the eddy current meter, chances are the noise from the speed sensor isn't filtered in the analog domain but numerically, by the dashboard's computer.

[Figure: a Ford Mondeo MK3 ST220 speedometer]

Let us have a look at how this filtering could be done.

The original signal coming out of the sensor is likely to be noisy. The nature of the noise depends on the device, and it may not always be safe to assume that the average of the noise is zero, nor that it is Gaussian. You can estimate the device's noise characteristics by submitting it to a constant input—more on this later. Suppose you take a number of readings from your sensor while in operation and plot the results:

[Figure: a sample time series]

Here, we have 200 samples of the signal. I have shown the samples as sticks to emphasize the discrete nature of the time series. If you're going to use this sensor's output directly in an algorithm, it may be quite fine; however, using these values for display will be, despite being an accurate rendering of the sensor's output, more confusing than informative. Especially when considering human/machine interfaces, displaying rapidly fluctuating numbers is never a good idea. What you want to display is a relatively slowly, always smoothly, varying value. A bit like your car's speedometer. You'd probably be worried if the needle didn't stop shaking. Instead, what you see is that it varies smoothly, maybe with a slight lag: it moves a fraction of a second after you feel the acceleration.

A possible way of smoothing readings for display is to use a moving average. Rather than displaying the last value read, it displays an average over a window of w values. The size of the window w depends on many things, but for now we can set it quite arbitrarily to, say, w=10 or w=20, or whatever value seems reasonable for your application. The moving average can be centered or tailed. In the centered flavor, the displayed value depends on the surrounding elements:

\displaystyle \bar{x}_t = \frac{1}{w}\sum_{i=t-\lfloor\frac{w}{2}\rfloor}^{t+\lfloor\frac{w}{2}\rfloor}x_i

for w odd. In the tailed version, the displayed value depends on the last w values and is given by:

\displaystyle \bar{x}_t = \frac{1}{w}\sum_{i=t-w+1}^{t}x_i
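As a direct, if naïve, translation of these two formulas, one could write something like the following sketch (the function names and the caller-managed bounds are my assumptions):

    // Tailed moving average: mean of the last w samples ending at t.
    // Assumes t >= w-1; bounds checking is left to the caller.
    double tailed_average(const double *x, int t, int w)
    {
     double sum=0;
     for (int i=t-w+1; i<=t; i++)
      sum+=x[i];
     return sum/w;
    }

    // Centered moving average: mean of the w samples around t, w odd.
    // Assumes w/2 <= t and that t+w/2 is within the series.
    double centered_average(const double *x, int t, int w)
    {
     double sum=0;
     for (int i=t-w/2; i<=t+w/2; i++)
      sum+=x[i];
     return sum/w;
    }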

Applying the centered moving average to our time series, you would get:

[Figure: the sample time series with its centered moving average (window of 20)]

Fortunately, we can compute either moving average incrementally, and we need not store more than w sensor readings in memory. To compute the sums incrementally, we first notice that:

\displaystyle \sum_{i=a}^b x_i = \left(\sum_{i=a-1}^{b-1}x_i\right) + x_b - x_{a-1} (✻)

That is, if we shift the window one item to the right, from starting position a-1 to new starting position a, the new sum is given by the previous sum, minus x_{a-1}, the item that leaves the window, plus x_b, the new item that enters it.

So clearly, a simple circular buffer suffices to hold the data in the window. Before overwriting a value, we subtract it from the sum, and we add the new value to the sum and into the buffer. The code would look something like:

    moving_sum+=new_value;                 // the new value enters the window
    moving_sum-=window[window_ptr];        // the oldest value leaves it
    window[window_ptr]=new_value;          // overwrite the oldest value
    window_ptr=(window_ptr+1)%window_size; // advance circularly

The only other thing you have to decide on is the initial conditions. When you have zero readings, the algorithm must return a value that's sensible for your application. Very often, filling the window with zeroes and having an initial sum of zero is quite reasonable. Maybe in your application it makes more sense if they're initialized to \pi or 100.
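Putting the circular buffer, the running sum, and the initialization together, a minimal self-contained version could look like this (the struct and function names are mine):

    #define window_size 20

    typedef struct
    {
     double window[window_size];
     double sum;
     int ptr;
    } moving_average_t;

    // fill the window with whatever initial value makes sense
    // for the application (0, pi, 100, ...)
    void ma_init(moving_average_t *ma, double initial_value)
    {
     for (int i=0; i<window_size; i++)
      ma->window[i]=initial_value;
     ma->sum=window_size*initial_value;
     ma->ptr=0;
    }

    // push a new reading and return the current moving average
    double ma_update(moving_average_t *ma, double new_value)
    {
     ma->sum+=new_value;
     ma->sum-=ma->window[ma->ptr];
     ma->window[ma->ptr]=new_value;
     ma->ptr=(ma->ptr+1)%window_size;
     return ma->sum/window_size;
    }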

*
* *

Estimating the sensor noise is a tricky thing. The noise estimate will determine the size of the window w, depending on the maximum variation you want to allow at any given time, and also how you ultimately compensate for the errors in the rest of the application.

One possible way to measure the spread of the sensor's output due to noise is to compute the variance of its readings for a stationary input. This gives you an idea of how much the noise perturbs the readings on average. The sample variance is given by:

\displaystyle \mathrm{Var}(X) = \frac{1}{n}\sum_{i=1}^n (x_i-\bar{x})^2

where \bar{x} is the sample average. Or, using Huygens' identity:

\displaystyle \mathrm{Var}(X) = \mathrm{E}[X^2]-\mathrm{E}^2[X]

Remember that one, as it'll be useful to you every time you compute a variance: the variance is the average of the squares minus the square of the average.
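This identity lends itself to the same incremental treatment as the moving average: maintain a running sum and a running sum of squares over the window, and the variance follows directly. A sketch, with names of my own choosing:

    #define window_size 20

    static double window[window_size]; // samples currently in the window
    static double sum=0, sum_sq=0;     // running sum and sum of squares
    static int ptr=0;

    // push a new reading and return the variance of the window's contents
    double update_variance(double new_value)
    {
     double old_value=window[ptr];

     sum   +=new_value-old_value;
     sum_sq+=new_value*new_value-old_value*old_value;

     window[ptr]=new_value;
     ptr=(ptr+1)%window_size;

     // Var(X) = E[X^2] - E^2[X]; rounding can push the result
     // slightly below zero, so clamp if that matters to you
     double mean=sum/window_size;
     return sum_sq/window_size-mean*mean;
    }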

However, the variance is very sensitive to minute differences because it is an L_2 measure: it deals with squares of values (or squares of differences) rather than with the values directly. Plotting the variance around the readings, centered on the moving average, we get:

[Figure: the sample time series with its moving average and variance (window of 20)]

The variance exhibits large variations when the slopes are large (the signal rises or falls rapidly), so it cannot be used alone as a measure of the noise. Probably a variance/slope ratio would be more indicative?

You may want to hide the noise from the user altogether. However, I'm not sure it's always a good thing. In some applications, showing that there's a large variance around the read values may be of some use; maybe to detect excessive vibration in a sensor, indicating some other, more important, problem? Whichever you decide, you should make sure that the information is presented in a clear, non-obtrusive way. Maybe not okudagram-like displays, but you might want to show something like:

[Figures: two mock-ups of a value displayed between noise brackets]

where the brackets would change geometry depending on the variance, or the standard deviation, which is the square root of the variance. Color could also play a major role. The interval between the brackets could vary in color depending on the variance: maybe green under a certain threshold, maybe red to indicate an inordinately large variance?
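For instance, a hypothetical display rule could derive the brackets' half-width from the standard deviation and their color from a variance threshold (the names and the threshold itself are my assumptions):

    #include <math.h>

    typedef enum { COLOR_GREEN, COLOR_RED } color_t;

    // green under the threshold, red for an inordinately large variance
    color_t bracket_color(double variance, double threshold)
    {
     return (variance<threshold) ? COLOR_GREEN : COLOR_RED;
    }

    // one standard deviation on each side of the displayed value
    double bracket_half_width(double variance)
    {
     return sqrt(variance);
    }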

*
* *

You may also have noticed that the incremental algorithm to compute the average (and thus the incremental algorithm to compute the variance) does not compute exactly the same thing as the naïve algorithm. Well, they are mathematically equivalent, but not equivalent when considering the implementation. In one case, we start with a sum of 0 and add a finite number of values to compute the average. In the second case, we start from the previous average (or previous sum) and, using the identity (✻), we compute the new average (or sum).

The problem with the second approach is, although it is much more efficient, it is susceptible to the accumulation of rounding errors. How the rounding errors accumulate depends on the nature of the noise and the magnitudes of the values you are adding. If the rounding errors are asymptotically symmetric around 0, they should more or less cancel each other out and yield an accurate average.

But the precision at which the computation takes place is also crucial. Let us consider this simple experiment. At each time step, the new value x_t is computed as x_t=x_{t-1}+U(-1,1), where U(-1,1) is a uniform random number in the interval [-1,1] (from -1 to 1, inclusively). I generated a series of 10 000 000 samples.

Using double, I find that the error between the complete average and the incremental average always lies between -0.0000000000227374 and 0.0000000000306386, so, roughly, "zero". Therefore, using double, you shouldn't expect major stability issues. Not so when using single-precision float. Now we find that the error lies between -0.0333557 and 0.0012207, which is significantly larger than with double. Maybe when using float, or some less than excellent implementation of IEEE 754, you could reset the incremental sum every once in a while. Maybe every million samples or so.
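Resetting is cheap, since the window already holds everything needed to rebuild the sum exactly. A sketch of the idea, reusing the earlier snippet's variables (the function name is mine):

    // rebuild the running sum from the window's contents to flush
    // accumulated rounding error; call every million updates or so
    double resync_sum(const double window[], int size)
    {
     double sum=0;
     for (int i=0; i<size; i++)
      sum+=window[i];
     return sum;
    }

Calling moving_sum=resync_sum(window,window_size); once in a while bounds the drift without giving up the incremental update the rest of the time.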

*
* *

You can test for window size, float type, and data source using this simple framework:

    #include <stdio.h>
    #include <stdlib.h>
    #include <time.h>

    #define nb_tries 10000000
    #define window_size 20
    #define float_t double

    // some random function between -1 and 1
    float_t random()
    {
     return (rand()-rand())/(float_t)RAND_MAX;
    }

    int main()
    {
     //srand(time(0));

     float_t x0=0; // the current sample, a random walk
     float_t s0=0; // the incrementally maintained window sum

     float_t min_diff=0;
     float_t max_diff=0;

     float_t window[window_size]={0};
     int window_ptr=0;

     for (int i=0; i<nb_tries; i++)
      {
       // x_t = x_{t-1} + U(-1,1)
       x0+=random();

       // incremental update of the window sum
       s0+=x0;
       s0-=window[window_ptr];
       window[window_ptr]=x0;
       window_ptr=(window_ptr+1)%window_size;

       // complete sum, recomputed from scratch from the window
       float_t s1=0;
       for (int j=0; j<window_size; j++)
        s1+=window[j];

       // track the extremes of the error between the two averages
       float_t diff=(s1-s0)/window_size;
       if (diff>max_diff) max_diff=diff;
       if (diff<min_diff) min_diff=diff;
      }

     printf("min diff: %.16f\n", (double)min_diff);
     printf("max diff: %.16f\n", (double)max_diff);

     return 0;
    }

Compile with gcc -Wall -Wextra -std=c99 -O3 -ffloat-store stability-test.c -o stability-test. The -ffloat-store option makes sure that you're not using a special feature of your processor, such as the x86's 80-bit internal float registers. If all the math is done on 80 bits internally, you'll get an artificially low drift.
