Initial Variance
Developing Models for Kalman Filters
Last installment, we reviewed background information about variance and how it relates to a dynamic state transition model. However, variance is also used in the Kalman world in another, rather flaky manner, but it doesn't matter much that it is somewhat fictional. As long as the other long-term noise sources are well represented, the flakiness quickly corrects itself. And besides, there really isn't a choice: artificial or not, you have to start somewhere.
Variance as randomness, variance as uncertainty
The problem is that when the system starts, and when its Kalman model starts, there has to be an initial state estimate from which to calculate the first state projection. Though it is a bit of a stretch, suppose that the starting state is determined by a random lottery from the universe of possible starting states. This allows the admissible set of initial states to be described as a probability distribution. So far so good, but the initial state is not really determined by a random lottery, and there is no practical way to characterize the probability distributions accurately enough to make this idea a reality. Consequently, there is no rigorous way of determining a covariance matrix either. So fake it.
For multiple states, it is generally reasonable to say that each state value is selected by an independent lottery. So let all of the off-diagonal terms in the initial state covariance matrix be set to 0.0.
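As a concrete illustration, here is a minimal sketch of building such a diagonal initial covariance matrix, assuming a NumPy-based implementation; the number of states and the variance values are placeholders chosen only to show the shape of the calculation.

    import numpy as np

    # Hypothetical three-state example: each state's starting value is treated as
    # the outcome of its own independent lottery, so only the main diagonal of the
    # initial covariance matrix is nonzero.
    initial_variances = [1.0 / 3.0, 0.01, 1.0e-8]   # placeholder per-state variances
    P0 = np.diag(initial_variances)                 # all off-diagonal terms are 0.0

    # Presumed starting state values, one per state (here, the quiescent 0.0).
    x0 = np.zeros(len(initial_variances))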
Uniformly spread starting state
Suppose for example that you have no idea at all what the system's starting state will be. You expect the simulation to produce some rather huge errors at first. In the uniform spread case, a state variable could take any possible value in its range with equal probability, according to the lottery concept. For a uniform probability distribution of normalized variables spanning from -1 to +1, the theoretical variance is 1/3. Set the initial value of the state variable to its quiescent value of 0.0, and place the presumptive variance value of 1/3 in the corresponding main diagonal term of the covariance matrix.
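As a quick check on that 1/3 figure: the variance of a uniform distribution on an interval from a to b is (b - a) squared divided by 12, which for the normalized range -1 to +1 gives 4/12 = 1/3. Here is a minimal sketch, assuming NumPy is available, that confirms the value numerically.

    import numpy as np

    # Theoretical variance of a uniform distribution on [a, b] is (b - a)**2 / 12.
    a, b = -1.0, 1.0
    theoretical_variance = (b - a) ** 2 / 12.0       # = 1/3 for the normalized range

    # Monte Carlo check: draw many "lottery" samples and compare.
    samples = np.random.default_rng(0).uniform(a, b, size=1_000_000)
    print(theoretical_variance, samples.var())       # both approximately 0.333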
Clustered starting state
Suppose that the initial state is relatively easy to bound. For example, suppose the system always starts within 10 percent of its range, centered on the central point of the cluster. If the central point is the quiescent state, set the initial state value to 0.0; otherwise, if you expect a different typical starting state, specify that initial state value.
Taking inspiration from the Gaussian ("Normal") distribution, estimate a balanced range around the initial state that is deemed likely to contain about 90% of the feasible starting states in the cluster. The distance from the center to the limit of that range spans roughly "two standard deviations," so estimate the range limit, take half of it as the standard deviation, and square that to get the variance. Use this value for the corresponding main diagonal term of the covariance matrix.
For example, suppose it is believed that system operation will start with the first state somewhere in the range -0.2 to 0.2 about 90% of the time. The variance figure to represent this is obtained by taking 1/2 of the range limit, or 0.1, and squaring to get 0.01. This term goes into the initial state covariance estimate.
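The arithmetic is short enough to spell out in a sketch; the 0.2 range limit below is just the illustrative figure from the example above.

    # Clustered start: about 90% of starts fall within +/- range_limit of the center.
    range_limit = 0.2                 # illustrative bound from the example above
    std_dev = range_limit / 2.0       # treat the limit as roughly two standard deviations
    initial_variance = std_dev ** 2   # 0.1 squared = 0.01
    print(initial_variance)           # main diagonal term for this state: 0.01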
Accurately known starting state
When the state is accurately known, the variance for these estimates will be too small to matter. A typical example is when the system always starts at the quiescent 0.0 equilibrium value. There is very little offset in the starting state, so the corresponding main diagonal term of the initial state covariance estimate can be set arbitrarily small; 0.0 should be close enough. If for some reason you are worried about having a singular covariance matrix, there is no harm in setting the main diagonal term to a very tiny positive value such as 1.0e-8. After one update, this won't make any difference.
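A minimal sketch of that choice, again assuming a NumPy-based setup; the 1.0e-8 floor is just the value suggested above.

    import numpy as np

    # Accurately known start: the state begins at the quiescent 0.0 equilibrium value.
    x0 = np.array([0.0])

    # Use a tiny positive variance rather than exactly 0.0 if a singular covariance
    # matrix is a concern; after one update the difference is negligible.
    P0 = np.diag([1.0e-8])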
Initial variance as input disturbance
In some unknown prior history, the state was something, but we don't particularly know or care what. Suddenly, a massive impulse hits the system, resulting in a transition to a new state, which we consider the "unknown initial state" for further analysis.
Under this scenario, we could just as well treat the initial state behavior as ordinary impulse response behavior... except for the fact that we don't know what the impulse was. We still know the general character of the system's response from that point forward. Given a stable system, and given sufficient time, the initial state disturbance will decay exponentially to zero.
Propagation of noise variance
Starting with the very first update, the state noise terms are transformed along with the rest of the state information. In addition, the random noise sources do their dirty work, contributing new variability at each step. This means that the variance in the state variable vector changes at each step. This process is repeated with each new update, as past contributions of noise recirculate through the state equations.
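As a preview of that closer look, for a linear model one propagation step can be sketched as below; the names A (state transition matrix), P (state covariance), and Q (process noise covariance) follow the usual conventions rather than anything defined earlier in this installment.

    import numpy as np

    def propagate_covariance(P, A, Q):
        # The state transition matrix A reshapes the uncertainty already present
        # in P, and the process noise covariance Q adds the new variability that
        # the random noise sources inject at this step.
        return A @ P @ A.T + Q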
Next time, we will take a closer look at this noise propagation and how the state noise variance changes.