Model Transformations

In the previous installment, we covered some relatively simple matrix transformation operations that can produce a state transition model completely different from the original one but having exactly the same input/output behavior as the original. We will continue this time to cover a particular kind of transformation that can be used to impose a desired structure on the state transformation equations.

Quick review

Starting with a state transition equation in the following typical form.

\begin{matrix} x^{k + 1} = A \cdot x^{k} + B \cdot u^{k} \\ y^{k} = C \cdot x^{k} \end{matrix}

we found that using a nonsingular matrix Q, we can define the following new operators

Q \cdot A \cdot Q^{- 1} \Leftrightarrow \overset{‸}{A} Q \cdot B \Leftrightarrow \overset{‸}{B} C \cdot Q^{- 1} \Leftrightarrow \overset{‸}{C}

and apply them to construct the transformed state equation system.

\begin{matrix} q^{k + 1} = \overset{‸}{A} \cdot q^{k} + \overset{‸}{B} \cdot u^{k} \\ y^{k} = \overset{‸}{C} \cdot q^{k} \end{matrix}

The goal this time is to find a special transformation Q that can impose a desired restructuring of the state transition equations. For example, you could force the transition matrix to have a a tridiagonal form in which each internal state interacts directly with at most two other internal states.

The Householder Transformation

Householder devised a transformation back in 1958 ^[1] to eliminate some number of terms below diagonal of a matrix. You probably do not want to eliminate the main diagonal terms — that would be the equivalent of saying "the value of the state is completely independent of what it was in the previous time step" which is highly unusual. As typically presented, you also can't clear terms above the main diagonal, but this is really not a limitation because you know from the previous installment how to build a permutation transformation that can reorder the transformation rows to locate them any way you want.

Let the number of rows in the state transition matrix be N, and let the location of the last term not set to zero be index t. The specific goal of the transformation is:

To force the desired changes in a selected column w.
Avoid changes to rows and columns before row t.
Let the transformation produce whatever changes it needs in row t.
Make the transformation force all of the other remaining terms in the w column, terms t+1 through N, to zero.

To achieve this, Householder started with a proposed operator matrix of the form

    I - u·u^T

where the column vector u is derived in some way from the initial w column. When this operation is applied to that matrix column w the result will be

    (I - u·u^T) · w

To avoid making changes to the terms near the top of the matrix, set the terms of u_i to zero for i = 1 to t-1.
To make sure the later product terms of u^T w can cancel I matrix terms, select u_i = w_i for i = t+1 to N.
Calculate S² = Σ w_i² for terms i = t to N. Let S then be sqrt(S²).
Disregarding motivation for a moment, select u_t = w_t + sign(w_t) S.

With these particular choices, let's examine what happens in the u · u^T · w product terms. The u^T · w dot product value is

Σ u_i^T w_i   for i=t to N

   =  0 + 0 + ... + (w_t + sign(w_t) S) w_t + w_t+1² + w_t+2² + ... + w_N² 

   =  S sign(w_t) w_t + S²

Now we can see how selecting the sign(w_t) for the square root term avoids some kind of freak accident that might otherwise cause this dot product to be zero or very nearly so. Then there is no hazard in inverting it. Define that inverse to be
β = 1 / (S sign(w_t) w_t + S²).

If we now consider the scaled dot product u_t² · w_t · β we can see how this has been forced to equal 1. Note that this condition happens only for the special case of vector w.

Now define the Householder Transformation to be the following.

    H ≡ (I - β u·u^T)

When this transformation matrix is applied to the original transition matrix, and it encounters the w column vector, this produces the special terms that we just examined. The product β · u^T · w yields the scalar value 1.0, so for this particular case,

    (I - β u·u^T) · w 

      = w - u

Now reconsidering how the u vector was constructed,

For the first t-1 terms, u was set to zero, consequently, the first t-1 terms of the transformed result remain unchanged.
For the last t-1 to N terms, u was set to match w, consequently, all of these terms are forced to zero.
For the t term, the w_t term cancels out the w_t part of the u_t vector, leaving the result sign(w_t) S at this location.

Whew! If you didn't follow all of that, it isn't really critical. What is important is the that you can construct this transformation in a formulaic way, and it forces the desired sub-diagonal terms to zero in the selected column.

But there is something even more marvelous about the Householder matrix. It is orthonormal, which means that it is trivially easy to calculate its inverse: you just calculate its transpose. No extra matrix inverses need to be calculated.

So now let's try building one.

Numerical example

Here is an example of building and applying a Householder matrix (using explicit Octave coding^[2]) that clears the term in row 4 of column 1 of in a 4x4 matrix mmat.

% Configure the special u vector
wvec = mmat(:,1)
uvec = wvec;
uvec(1)=0.0;
uvec(2)=0.0;

% Perform the scaling calculations
sig = sign(wvec(3));
s2 = uvec'*uvec
ss = sqrt(abs(s2))
uvec(3) = ss*sig + wvec(3)
beta = 1.0/(uvec(3)*sig*ss)

% Construct the householder matrix
hmat = eye(4,4) - beta*uvec*uvec'

% Verify the results by constructing the new state transition matrix
simmat = hmat*mmat*hmat'

Let's apply this and see what happens. Here is an arbitrary state transition matrix. The goal of our transformation is to eliminate that last 0.335284 term in the first column of the state transition matrix.

mmat =
   0.9500415  -0.0254670   0.0332051  -0.0066137
   0.0173741   0.9965266  -0.0115364   0.0013972
   0.0318813   0.0281944   0.9561729  -0.0325960
   0.0335284   0.0024825  -0.0067044   0.9826702

From the first column, the intermediate u vector and scaling value β are calculated.

wvec =
   0.950041
   0.017374
   0.031881
   0.033528

s2 =  0.0021406
ss =  0.046266
uvec =
   0.000000
   0.000000
   0.078148
   0.033528

Here is the corresponding scaling factor beta and the Householder transformation matrix.

beta =  276.58
hmat =
   1.00000   0.00000   0.00000   0.00000
   0.00000   1.00000   0.00000   0.00000
   0.00000   0.00000  -0.68908  -0.72468
   0.00000   0.00000  -0.72468   0.68908

This matrix is then used as the Q matrix in the state transformation formula.

simat = hmat * mmat * hmat'
simat =
   9.5004e-01  -2.5467e-02  -1.8088e-02  -2.8621e-02
   1.7374e-02   9.9653e-01   6.9370e-03   9.3231e-03
  -4.6266e-02  -2.1227e-02   9.5046e-01  -1.2750e-03
  -1.4586e-18  -1.8721e-02  -2.7167e-02   9.8838e-01

Note the extremely small bit of roundoff junk left over in the location that was supposed to become 0.0. You can clear that location to zero exactly if you wish, since there is no numerical significance.

There is another popular (and simpler) transformation called a Given's Transformation that is also able to "wipe out" terms below the matrix diagonal one term at a time. However, this doesn't work so well for transforming state transition models. When used for this purpose, the transformation and its inverse tend to fill in values that previously were sent to zero, failing the primary objective.

Footnotes:

[1] Householder, A. S., Unitary Transformations of a Nonsymmetric Matrix, JACM 5, 1958, 339-342.

[2] Once you are comfortable with doing this, you will probably find that the householder command in Octave provides an easier way.

Restructuring State Models, part 2

Quick review

The Householder Transformation

Numerical example