Even More Householder

Mathematics
Linear Algebra
Householder Matrix
Author

Nick Appleton

Published

March 2, 2020

In several previous blogs (here and here), I've written about the orthonormal Householder matrix and how it can constructed such that it contains a particular unit-vector. A recent tweet from Eric Arnebäck posted a link to some code that was originally put together in 1999 by Tomas Akenine-Möller and John Hughes to find an orthogonal matrix which maps one unit-vector to another. This is a slightly more general problem which can be still solved in the same way using the Householder matrix. The vectors and matrices in this post are assumed to be real but the method can be modified to work with complex numbers.

In the previously linked posts, we described the method to find the Householder matrix \(\mathbf{P}\) given a unit-vector \(\mathbf{x}\) such that \(\mathbf{P} \mathbf{x}=\mathbf{e}_1\). Another way of wording this is: find the orthogonal matrix \(\mathbf{P}\) that contains \(\mathbf{x}\) as the first row. Now, we will generalise this to find an orthogonal matrix \(\mathbf{P}\) given a unit-vector \(\mathbf{x}\) such that \(\mathbf{P} \mathbf{x}=\mathbf{y}\) where \(\mathbf{y}\) is another unit-vector. We start by defining \(\mathbf{P}\) to be the Householder reflector \(\mathbf{P}=\mathbf{I}-2\mathbf{v}\mathbf{v}^H\) and attempt to solve for \(\mathbf{v}\):

\[ \begin{array}{@{}ll@{}} \mathbf{y} &= \left(\mathbf{I} - 2 \mathbf{v} \mathbf{v}^H\right) \mathbf{x} \\ \mathbf{x} - \mathbf{y} &= 2 \mathbf{v} \mathbf{v}^H \mathbf{x} \end{array} \]

As the quantity \(\mathbf{v}^H \mathbf{x}\) is a scalar, the vector \(\mathbf{v}\) must be a scaled version of \(\mathbf{x}-\mathbf{y}\). Define \(\mathbf{v}=\alpha\left(\mathbf{x}-\mathbf{y}\right)\) and make the substitution into the previous equation:

\[ \begin{array}{@{}ll@{}} \mathbf{x} - \mathbf{y} &= 2 \alpha^2 \left(\mathbf{x}-\mathbf{y}\right) \left(\mathbf{x}-\mathbf{y}\right)^H \mathbf{x} \\ &= 2 \alpha^2 \left(\mathbf{x}-\mathbf{y}\right) \left(1-\mathbf{y}^H \mathbf{x}\right) \\ \frac{1}{\sqrt{2 \left(1-\mathbf{y}^H \mathbf{x}\right)}} &= \alpha \end{array} \]

Observe that \(\alpha\) has a singularity as \(\mathbf{y}^H \mathbf{x}\) approaches one. This can be addressed (as has been done in the previous posts) by negating the definition of the reflector (i.e. set \(\mathbf{P}=2\mathbf{v}\mathbf{v}^H-\mathbf{I}\)) and solving for \(\mathbf{v}\) again:

\[ \begin{array}{@{}ll@{}} \mathbf{y} &= \left(2 \mathbf{v} \mathbf{v}^H - \mathbf{I}\right) \mathbf{x} \\ \mathbf{x} + \mathbf{y} &= 2 \mathbf{v} \mathbf{v}^H \mathbf{x} \end{array} \]

This time, \(\mathbf{v}\) will be defined as \(\mathbf{v} = \alpha\left(\mathbf{x}+\mathbf{y}\right)\) and the solution for \(\alpha\) becomes:

\[ \frac{1}{\sqrt{2 \left(1+\mathbf{y}^H \mathbf{x}\right)}} \]

This shifts the singularity to \(\mathbf{y}^H \mathbf{x}=-1\). If the quantity \(\mathbf{y}^H \mathbf{x}\) is positive, this latter solution will be more numerically robust; if it is negative, the former will be more numerically robust. We will build an algorithm that performs the flip based on the sign of \(\mathbf{y}^H \mathbf{x}\) which results in a discontinuity of the resulting matrix when the sign changes. This might be important to consider in some use-cases.

We can avoid computing the square root because \(\alpha\) always ends up being used squared. We define \(2 \mathbf{v}\mathbf{v}^H=\beta\mathbf{w}\mathbf{w}^H\) where \(\mathbf{w}\) is defined to be either \(\mathbf{x}-\mathbf{y}\) or \(\mathbf{x}+\mathbf{y}\) leading to \(\beta\) being defined as either \(\frac{1}{1-\mathbf{y}^H \mathbf{x}}\) or \(\frac{1}{1+\mathbf{y}^H \mathbf{x}}\) respectively. We choose which quantity to use based on the sign of \(\mathbf{y}^H \mathbf{x}\). The following C listing is not efficient but provides a reference in very little code for arbitrary dimension vectors.

/* Find the orthonormal symmetric reflection matrix which
 * maps the given unit-vector x to the given unit-vector y.
 * As the matrix is symmetric, it can be interpreted as
 * either row or column major.
 *
 * This isn\'t efficient. */
void make_reflector(const float *y, const float *x, unsigned N, float *mat) {
    unsigned i, j;
    float proj, sgn, beta;
    for (i = 0, proj = 0.0f; i < N; i++)
        proj += y[i] * x[i];
    sgn  = (proj >= 0.0f) ? 1.0 : -1.0;
    beta = 1.0f / (proj + sgn);
    for (i = 0; i < N; i++) {
        for (j = 0; j < N; j++) {
            float w_i = x[i] + y[i] * sgn;
            float w_j = x[j] + y[j] * sgn;
            mat[i*N+j] = w_i * beta * w_j;
        }
        mat[i*N+i] -= sgn;
    }
}

Because the resulting matrix is symmetric only \(n(n+1)/2\) elements need to be computed. The w vector could also be precomputed to avoid excessive repeated computation in the inner loop. The completely unrolled and branch-less 3-dimension case can be written as:

void make_reflector(float x[3], float y[3], float mat[3][3]) {
    float proj  = x[0]*y[0] + x[1]*y[1] + x[2]*y[2];
    float sgn   = copysignf(1.0f, proj);
    float beta  = 1.0f / (proj + sgn);
    float w0    = x[0] + y[0] * sgn;
    float w1    = x[1] + y[1] * sgn;
    float w2    = x[2] + y[2] * sgn;
    float w0a   = w0 * beta;
    float w1a   = w1 * beta;
    float w2a   = w2 * beta;
    mat[0][0]   = w0a * w0 - sgn;
    mat[0][1]   = w0a * w1;
    mat[0][2]   = w0a * w2;
    mat[1][0]   = mat[0][1];
    mat[1][1]   = w1a * w1 - sgn;
    mat[1][2]   = w1a * w2;
    mat[2][0]   = mat[0][2];
    mat[2][1]   = mat[1][2];
    mat[2][2]   = w2a * w2 - sgn;
}

This looks very similar to the listings used to reflect \(\mathbf{x}\) to \(\mathbf{e}_1\). None of this is new. The Householder matrix has been used for a very long time to solve this exact problem in the context of finding orthogonal matrices for performing the QR decomposition (google "QR decomposition using reflectors" to find less-verbose descriptions of what I've written on this page).