Restructuring State Models
Developing Models for Kalman Filters
In earlier installments, we mentioned that state transition models are not unique. While this sometimes causes problems, it also provides some extra degrees of freedom that can be used to restructure the state transition model without changing any external results.
General state transition transformations
Start with the original state transition equation system. Select an arbitrary nonsingular square matrix Q of the same size as the state transition matrix. By definition, because it is nonsingular, an inverse exists. Therefore, we can use this Q matrix as a transformation operator that maps every possible state vector x into a new vector q, or we can invert this mapping to restore the original x vector.
Using these new notations, we can replace references to the original x state vector in the state transition equations. Since the Q matrix is nonsingular, we lose no information multiplying both sides of the transition equations by Q.
Now we can define some new names that simplify the notation.
Substituting these simplified notations, we get the following equation system.
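In symbols, assuming the conventional discrete-time model used throughout this series, with transition matrix A, input coupling matrix B, and observation matrix C (the "hat" names for the transformed matrices are illustrative), the substitution works out as:

```latex
\begin{aligned}
x_{k+1} &= A\,x_k + B\,u_k, & y_k &= C\,x_k
  && \text{(original model)} \\
q_k &\equiv Q\,x_k, & x_k &= Q^{-1} q_k
  && \text{(new state definition)} \\
q_{k+1} &= \underbrace{Q A Q^{-1}}_{\hat{A}}\, q_k
          + \underbrace{Q B}_{\hat{B}}\, u_k, &
y_k &= \underbrace{C Q^{-1}}_{\hat{C}}\, q_k
  && \text{(transformed model)}
\end{aligned}
```

The input sequence u and output sequence y appear unchanged in the transformed system; only the internal state representation differs.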
The result is another discrete time state transition model, but in terms of different state variables. The input sequences are exactly the same. The output sequences are exactly the same. Clearly the model is an exact equivalent, even though the internal states and the internal model parameters are entirely different.
Same results. Who cares?
There are many possible reasons you might care. There can be important benefits from using the transformed state variables.
- The restructuring can be used to orthogonalize the state transition matrix for improved numerical accuracy.
- The transformations can impose useful scalings on the magnitudes of the internal state values.
- The restructuring can be used to impose a matrix structure in which certain terms exactly equal zero.
- The restructuring can factor some or all states into a modal form with reduced internal interactions between state variables.
- A model can be transformed into a conventional canonical form for which certain theoretical results are more easily applied.
Some useful transformations
Here are some particular transformations that are likely to prove useful.
Scaling transformation
You already know about this one from prior installments of this series.[1] Construct a square matrix with positive scaling values along the main diagonal. Calculating the inverse of the diagonal scaling matrix is trivial.
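As a quick sketch of the scaling case (in Python/NumPy here, rather than the Octave used elsewhere in this series; the matrix and scale values are made up for illustration):

```python
import numpy as np

# Hypothetical 2-state transition matrix and positive per-state scale factors.
A = np.array([[0.98, -0.03],
              [0.04,  0.97]])
scales = np.array([10.0, 0.5])

Q = np.diag(scales)            # diagonal scaling transformation
Qinv = np.diag(1.0 / scales)   # trivial inverse: reciprocal of each diagonal term

# Transformed state transition matrix, per A_new = Q A Q^-1.
A_new = Q @ A @ Qinv

# The transformation loses no information: inverting it restores A exactly.
assert np.allclose(Qinv @ A_new @ Q, A)
```

Notice that the diagonal entries of A are untouched by a diagonal scaling; only the off-diagonal coupling terms are rescaled.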
Row-permutation transformation
Start with an identity matrix, and swap any two columns i and j. When you left-multiply any square matrix (or compatible column vector) by this matrix, the effect is to swap the terms in the i-th and j-th rows; in effect, this repositions the state variables in the state vector, and repositions the associated model parameters, without changing any values.
For example, let's consider transforming the following arbitrary state transition matrix by moving the second state variable to the position of the third, and the third state variable to the position of the second.
mmat =

   9.8051e-01  -2.5922e-02  -2.8938e-02   3.8595e-02
  -3.2968e-02   1.0265e+00  -1.7103e-02  -1.4628e-03
  -2.5834e-02  -8.4132e-04   9.7193e-01  -1.5873e-02
   3.4865e-02  -1.7536e-02  -1.3940e-02   9.6629e-01

pmat =

   1   0   0   0
   0   0   1   0
   0   1   0   0
   0   0   0   1

qmat = pmat' * mmat * pmat

qmat =

   9.8051e-01  -2.8938e-02  -2.5922e-02   3.8595e-02
  -2.5834e-02   9.7193e-01  -8.4132e-04  -1.5873e-02
  -3.2968e-02  -1.7103e-02   1.0265e+00  -1.4628e-03
   3.4865e-02  -1.3940e-02  -1.7536e-02   9.6629e-01
The net result of the system transformation is that parameters related to the second and third rows and columns are swapped. No values are changed: only their locations. None of the other terms change value or position.
Transposition matrices play nicely with each other. You can do a major reordering by applying a cascade of transposition operations. You can also consolidate the permutation operations and apply them all in one lump.
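The swap and the consolidation of cascaded transpositions can be checked numerically. A sketch in Python/NumPy, reusing the mmat values from the example above (the `transposition` helper is a made-up name for illustration):

```python
import numpy as np

# State transition matrix from the example above.
mmat = np.array([
    [ 9.8051e-01, -2.5922e-02, -2.8938e-02,  3.8595e-02],
    [-3.2968e-02,  1.0265e+00, -1.7103e-02, -1.4628e-03],
    [-2.5834e-02, -8.4132e-04,  9.7193e-01, -1.5873e-02],
    [ 3.4865e-02, -1.7536e-02, -1.3940e-02,  9.6629e-01]])

def transposition(n, i, j):
    """Identity matrix of size n with rows (equivalently, columns) i and j swapped."""
    p = np.eye(n)
    p[[i, j]] = p[[j, i]]
    return p

# Swap the second and third state variables (0-based indices 1 and 2).
pmat = transposition(4, 1, 2)
qmat = pmat.T @ mmat @ pmat

# Parameters trade places but keep their values: the old (3,3) term
# now sits at position (2,2), and the old (2,2) term at (3,3).
assert np.isclose(qmat[1, 1], 9.7193e-01)
assert np.isclose(qmat[2, 2], 1.0265e+00)

# Cascaded transpositions consolidate into a single permutation matrix:
# applying pmat then p2 equals one transform by (pmat @ p2).
p2 = transposition(4, 0, 3)
combined = pmat @ p2
assert np.allclose(combined.T @ mmat @ combined,
                   p2.T @ (pmat.T @ mmat @ pmat) @ p2)
```

Because each transposition matrix is its own inverse and transpose, the cascade is easy to undo as well.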
The most likely reasons to use this kind of transformation:
- to isolate variables deemed secondary or inconsequential from other variables deemed important
- to isolate "clusters" of closely related variables
- to facilitate theoretical analysis of matrix operations
An output observation transformation
Suppose that in your state transition model the observation matrix C is a row vector having many nonzero terms. This says that the observed output is determined by a direct combination of several internal states. The goal is to define a new set of states such that under the new definitions the output is determined by a direct readout of one of the internal state variables.
How might this be done? Well, we need the Q^-1 matrix to have the property that C Q^-1 = [ 1 0 0 ... 0 ], so that the product picks out the first transformed state variable directly.
Let's consider how to do this for the state transition equations used previously in this installment, using a process known as orthogonalization.[2] Suppose that the original observation matrix is the following.
cmat = [ 8.2520e+00 2.3353e+00 0.0000e+00 -1.0449e+01 ];
Initialize a completely random transformation matrix with values in the range -0.5 to +0.5.
qinv = rand(4,4)-ones(4,4)*0.5;
Calculate a scaling factor from the initial C matrix.
cscale = 1.0 / (cmat*cmat');
Replace the first column of the random matrix with a vector that is a scaled version of the transpose of the C matrix.
qinv(1:4,1) = cscale * cmat';
For all of the other columns, modify the column by subtracting a certain multiple of the vector C^T such that the product of the modified column and the original C matrix is zero.
for icol=2:4
  colscale = cmat * qinv(1:4,icol) * cscale;
  qinv(1:4,icol) = qinv(1:4,icol) - colscale*cmat';
end
Now verify that the inverse transformation works as intended by forming the C Q^-1 matrix product.
newcmat = cmat * qinv

newcmat =

   1.0000e+00  -7.5460e-17  -8.4893e-17  -1.5213e-16
Knowing the Q^-1 matrix, you can invert it and complete the rest of the state equation transformation.
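The whole orthogonalization procedure can be sketched end to end in Python/NumPy (the random seed is an arbitrary choice for repeatability, not part of the original procedure):

```python
import numpy as np

rng = np.random.default_rng(1234)   # seeded only so the run is repeatable

# Original observation matrix from the example above.
cmat = np.array([8.2520e+00, 2.3353e+00, 0.0000e+00, -1.0449e+01])

# Completely random starting transformation, values in -0.5 .. +0.5.
qinv = rng.random((4, 4)) - 0.5

# Scaling factor from the initial C matrix.
cscale = 1.0 / (cmat @ cmat)

# First column: scaled transpose of C, so cmat @ qinv[:, 0] equals 1.
qinv[:, 0] = cscale * cmat

# Remaining columns: subtract the component along C^T so that the
# product of each modified column with the original C matrix is zero.
for icol in range(1, 4):
    colscale = (cmat @ qinv[:, icol]) * cscale
    qinv[:, icol] -= colscale * cmat

# Verify: the transformed observation matrix reads out state 1 directly.
newcmat = cmat @ qinv
assert np.allclose(newcmat, [1.0, 0.0, 0.0, 0.0])

# Knowing Q^-1, invert it to complete the state equation transformation.
# (A random start is nonsingular with probability 1.)
qmat = np.linalg.inv(qinv)
```

The loop is one step of a Gram-Schmidt-style projection: each column is made orthogonal to C^T, while the first column is scaled so its inner product with C is exactly one.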