Computing the Moving Average of a sequence is a common task in games (and other applications). The motivation is simple: if something happened too far in the past, it probably does not matter anymore.
One of the main problems with the standard (global) average is that it “slows down” over time: each new sample moves it less and less. This can be a serious problem. Suppose that you are designing a game in which your player has to quickly press a button to keep a value (e.g., player speed) above a certain average. If you use the global average, after a certain amount of time your player can stop pressing the button entirely and still keep the average above the threshold.
The demonstration of this is quite intuitive. If you have an average a(t) at frame t and the player does not press anything in the current frame, the average at frame t+1 will be

a(t+1) = a(t) · t / (t + 1)
The factor t/(t+1) depends on the elapsed time and becomes “almost 1” very quickly. You don’t want that. You want to keep your player on the narrow land between “boredom” and “frustration”. You cannot give your player the possibility to win by doing nothing for 30 seconds.
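A quick numerical sketch (in Python, with made-up numbers: a hypothetical press rate of 10 after 30 seconds at 60 fps) of how slowly the global average decays once the player stops pressing:

```python
# Global average decay: with no new presses, the average at frame
# t+1 is a(t) * t / (t + 1), a factor that quickly approaches 1.
a = 10.0   # hypothetical average built up so far
t = 1800   # hypothetical frames already elapsed (30 s at 60 fps)

for _ in range(600):  # 10 more seconds at 60 fps with no input
    a = a * t / (t + 1)
    t += 1

print(round(a, 2))  # prints 7.5: after 10 idle seconds the average barely moved
```

The product telescopes to 10 · 1800/2400 = 7.5, so ten full seconds of inactivity only shave off a quarter of the average.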
The solution to this problem is simple: use a Moving Average. The player will still have to push the button fast enough to stay above the threshold, but the average is computed using only the data from the last 5 seconds (or any other time window you want).
Unfortunately, unlike the global average, computing the moving average requires additional information. For the global average you just need to keep in memory the current average and the elapsed time. To compute a moving average, instead, you have to attach a timestamp to every sample. You need the timestamp because you must know which samples are old enough to be removed from the average computation when the time window shifts.
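A minimal sketch of this exact, timestamped approach in Python (the class and method names are hypothetical; the average is expressed as “sum of samples per window”, matching the clicks-per-5-seconds use case):

```python
from collections import deque

class ExactMovingAverage:
    """Exact moving average over a sliding time window.

    Stores every sample with its timestamp so that samples older
    than the window can be dropped when the window shifts.
    """

    def __init__(self, window):
        self.window = window    # window length T in seconds
        self.samples = deque()  # (timestamp, value) pairs
        self.total = 0.0        # running sum of stored values

    def add(self, timestamp, value):
        self.samples.append((timestamp, value))
        self.total += value

    def average(self, now):
        # Evict samples that have left the window.
        while self.samples and self.samples[0][0] <= now - self.window:
            _, old_value = self.samples.popleft()
            self.total -= old_value
        return self.total / self.window
```

Note the memory cost: the deque holds every sample that arrived in the last T seconds, which is exactly the bookkeeping the rest of the post tries to get rid of.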
That’s BORING. Can we do something to avoid this unnecessary complexity? Yes, we can. Let’s write the theoretically optimal formula to update the moving average:

a(t + Δt) = a(t) + (S − X) / T

In this formula, T is the time window over which we are computing the average, S is the sum of the new samples measured in the interval between the current time and the last computation of the average, and X is the “amount of average” we have lost in the last window shift (that is, the sum of the samples we are going to remove). We know everything except the value of X. As we have seen before, we cannot know X without storing each sample with its corresponding timestamp.
However, if we don’t need the exact value of the moving average – and in games this is usually the case – we can compute its expected value much faster.
We have just moved the problem to computing the expected value of X. However, we can go completely blind and assume a uniform distribution of the samples over the window. If we compute the average frequently enough, this is “almost” true in any case. In this setting, the expected value is just the current average multiplied by the time increment:

E[X] = a(t) · Δt
Therefore, we can put this into the original formula:

a(t + Δt) = a(t) + (S − a(t) · Δt) / T
Look at the formula. You don’t need to know how many samples are gone, so you can avoid storing timestamps for them. The main “problem” with this formula is that it never reaches zero. Fortunately, you can work around this by assuming that the moving average is zero once it drops below a certain threshold. In general, this formula is a “smoother” (filtered) version of the real moving average. Fortunately, in games nobody will ever have a problem with that and, in the meanwhile, you will be spared a lot of effort and additional complexity.
Update: Here is a small demo on how this thing works! :D
This small applet computes the moving average of clicks over the last 5 seconds. Note that, if you stop clicking, it slowly decays to 0 (while the true moving average would be exactly 0 after 5 seconds of inactivity). However, in the average case, the average is good (oh, silly words).