Initial Variance
Developing Models for Kalman Filters
Last installment, we reviewed background information about variance and how it relates to a dynamic state transition model. However, variance is also used in the Kalman world in another, rather flaky manner, but it doesn't matter much that it is somewhat fictional. As long as the other long-term noise sources are well represented, the flakiness quickly corrects itself. And besides, there really isn't a choice: artificial or not, you have to start somewhere.
Variance as randomness, variance as uncertainty
The problem is that when the system starts, and when its Kalman model starts, there has to be an initial state estimate from which to calculate the first state projection. Though it is a bit of a stretch, suppose that the starting state is determined by a random lottery from the universe of possible starting states. This allows the admissible set of initial states to be described as a probability distribution. So far so good, but the initial state is not really determined by a random lottery, and there is no practical way to characterize the probability distributions accurately enough to make this idea a reality. Consequently, there is no rigorous way of determining a covariance matrix either. So fake it.
For multiple states, it is generally reasonable to say that each state value is selected by an independent lottery. So let all of the off-diagonal terms in the initial state covariance matrix be set to 0.0.
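As a concrete illustration, here is a minimal sketch of building such a diagonal initial covariance matrix, assuming a NumPy-based implementation; the number of states and the variance values are placeholders chosen only to show the shape of the calculation.

    import numpy as np

    # Hypothetical three-state example: each state's starting value is treated as
    # the outcome of its own independent lottery, so only the main diagonal of the
    # initial covariance matrix is nonzero.
    initial_variances = [1.0 / 3.0, 0.01, 1.0e-8]   # placeholder per-state variances
    P0 = np.diag(initial_variances)                 # all off-diagonal terms are 0.0

    # Presumed starting state values, one per state (here, the quiescent 0.0).
    x0 = np.zeros(len(initial_variances))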
Uniformly spread starting state
Suppose for example that you have no idea at all what the system's starting state will be. You expect the simulation to produce some rather huge errors at first. In the uniform spread case, a state variable could take any possible value in its range with equal probability, according to the lottery concept. For a uniform probability distribution of normalized variables spanning from -1 to +1, the theoretical variance is 1/3. Set the initial value of the state variable to its quiescent value of 0.0, and place the presumptive variance value of 1/3 in the corresponding main diagonal term of the covariance matrix.
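As a quick check on that 1/3 figure: the variance of a uniform distribution on an interval from a to b is (b - a) squared divided by 12, which for the normalized range -1 to +1 gives 4/12 = 1/3. Here is a minimal sketch, assuming NumPy is available, that confirms the value numerically.

    import numpy as np

    # Theoretical variance of a uniform distribution on [a, b] is (b - a)**2 / 12.
    a, b = -1.0, 1.0
    theoretical_variance = (b - a) ** 2 / 12.0       # = 1/3 for the normalized range

    # Monte Carlo check: draw many "lottery" samples and compare.
    samples = np.random.default_rng(0).uniform(a, b, size=1_000_000)
    print(theoretical_variance, samples.var())       # both approximately 0.333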
Clustered starting state
Suppose that the initial state is relatively easy to bound. For example, suppose the system always starts within 10 percent of its range, centered on the central point of the cluster. If the central point is the quiescent state, set the initial state value to 0.0; otherwise, if you expect a different typical starting state, specify that initial state value.
Taking inspiration from the Gaussian ("Normal") distribution, estimate a balanced range around the initial state that is deemed likely to contain about 90% of the feasible starting states in the cluster. The distance from the center to the limit of that range spans roughly "two standard deviations," so estimate the range limit, take half of it as the standard deviation, and square that to get the variance. Use this value for the corresponding main diagonal term of the covariance matrix.
For example, suppose it is believed that system operation will start with the first state somewhere in the range -0.2 to 0.2 about 90% of the time. The variance figure to represent this is obtained by taking 1/2 of the range limit, or 0.1, and squaring to get 0.01. This term goes into the initial state covariance estimate.
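The arithmetic is short enough to spell out in a sketch; the 0.2 range limit below is just the illustrative figure from the example above.

    # Clustered start: about 90% of starts fall within +/- range_limit of the center.
    range_limit = 0.2                 # illustrative bound from the example above
    std_dev = range_limit / 2.0       # treat the limit as roughly two standard deviations
    initial_variance = std_dev ** 2   # 0.1 squared = 0.01
    print(initial_variance)           # main diagonal term for this state: 0.01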
Accurately known starting state
When the state is accurately known, the variance for these estimates will be too small to matter. A typical example is when the system always starts at the quiescent 0.0 equilibrium value. There is very little offset in the starting state, so the corresponding main diagonal term of the initial state covariance estimate can be set arbitrarily small; 0.0 should be close enough. If for some reason you are worried about having a singular covariance matrix, there is no harm in setting the main diagonal term to a very tiny positive value such as 1.0e-8. After one update, this won't make any difference.
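A minimal sketch of that choice, again assuming a NumPy-based setup; the 1.0e-8 floor is just the value suggested above.

    import numpy as np

    # Accurately known start: the state begins at the quiescent 0.0 equilibrium value.
    x0 = np.array([0.0])

    # Use a tiny positive variance rather than exactly 0.0 if a singular covariance
    # matrix is a concern; after one update the difference is negligible.
    P0 = np.diag([1.0e-8])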
Initial variance as input disturbance
In some unknown prior history, the state was something, but we don't particularly know or care what. Suddenly, a massive impulse hits the system, resulting in a transition to a new state, which we consider the "unknown initial state" for further analysis.
Under this scenario, we could just as well treat the initial state behavior as ordinary impulse response behavior... except for the fact that we don't know what the impulse was. We still know the general character of the system's response from that point forward. Given a stable system, and given sufficient time, the initial state disturbance will decay exponentially to zero.
Propagation of noise variance
Starting with the very first update, the state noise terms are transformed along with the rest of the state information. In addition, the random noise sources do their dirty work, contributing new variability at each step. This means that the variance in the state variable vector changes at each step. This process is repeated with each new update, as past contributions of noise recirculate through the state equations.
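As a preview of that closer look, for a linear model one propagation step can be sketched as below; the names A (state transition matrix), P (state covariance), and Q (process noise covariance) follow the usual conventions rather than anything defined earlier in this installment.

    import numpy as np

    def propagate_covariance(P, A, Q):
        # The state transition matrix A reshapes the uncertainty already present
        # in P, and the process noise covariance Q adds the new variability that
        # the random noise sources inject at this step.
        return A @ P @ A.T + Q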
Next time, we will take a closer look at this noise propagation and how the state noise variance changes.