Category Archives: Uncategorized

Three-step digital Butterworth design.

This guide omits a fair amount of detail in order to stay reasonably brief.

1) Design in the s-domain

Butterworth filters have their poles evenly spaced on a circular arc in the left half-plane of the [s/Laplace/analogue]-domain.

For a low-pass filter of order N, the complex valued pole locations are:

 p_n = \omega_0 e^\frac{j \pi (2 n + N + 1)}{2 N}

Where  0 <= n < N and  \omega_0 is the cutoff frequency in radians per second. The transfer function in the s-domain is:

 H(s) = \frac{\omega_0^N}{\prod\limits_{n=0}^{N-1}(s - p_n)}

Because the poles of this kind of filter are either real or occur in complex-conjugate pairs, the denominator can always be expanded out into a polynomial with real-valued coefficients a_n:

 H(s) = \frac{\omega_0^N}{\sum\limits_{n=0}^{N} a_n s^n}
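
If it helps to see this step in code, here is a rough sketch that computes the pole locations for a 3rd-order filter with an example cutoff of 100 Hz and multiplies them out into the real denominator coefficients (C99 complex arithmetic; link with -lm):

/* Sketch of step 1: Butterworth poles for ORDER = 3 and an example cutoff,
 * expanded into the real denominator coefficients a_0..a_ORDER. */
#include <complex.h>
#include <math.h>
#include <stdio.h>

#ifndef M_PI
#define M_PI 3.14159265358979323846
#endif

#define ORDER 3

int main(void)
{
    const double w0 = 2.0 * M_PI * 100.0;   /* example cutoff: 100 Hz in rad/s */
    double complex poles[ORDER];
    double complex a[ORDER + 1] = {1.0};    /* start from the polynomial "1" */
    int n, k;

    for (n = 0; n < ORDER; n++)
        poles[n] = w0 * cexp(I * M_PI * (2 * n + ORDER + 1) / (2.0 * ORDER));

    /* Multiply out prod(s - p_n) one factor at a time. */
    for (n = 0; n < ORDER; n++) {
        for (k = n + 1; k > 0; k--)
            a[k] = a[k - 1] - poles[n] * a[k];
        a[0] = -poles[n] * a[0];
    }

    /* The imaginary parts cancel because the poles are real or conjugate
     * pairs, leaving the real coefficient a_k of s^k. */
    for (k = 0; k <= ORDER; k++)
        printf("a[%d] = %.6e\n", k, creal(a[k]));
    return 0;
}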

2) Convert the s-domain transfer function to the z-domain.

There are a few methods to convert an s-domain transfer function into a [z/digital/discrete]-domain transfer function. The most common is the bilinear transform, which makes the substitution:

 s = \alpha \frac{z - 1}{z + 1}

Where:

 \alpha = \frac{2}{T}

Or if you're performing "frequency warping":

 \alpha = \frac{\omega_0}{\tan(\frac{\omega_0 T}{2})}

I won't go into verbose details here, but you probably want frequency warping. The bilinear transform is an approximation to the exact mapping s=\frac{1}{T}\ln(z) between the s-domain and the digital z-domain; it corresponds to the first term of a series expansion of \frac{1}{T}\ln(z), which is most accurate around z=1 (i.e. DC). The frequency warping option moves the point where most accuracy is preserved to \omega_0, which becomes important for higher frequency cutoffs: \omega_0 can be set to the cutoff frequency of the filter to improve the frequency response near the transition.

For a 3rd-order system, the conversion yields (see the notes at the end of this post about the risks):

 \frac{b_3 s^3 + b_2 s^2 + b_1 s + b_0}{a_3 s^3 + a_2 s^2 + a_1 s + a_0} \leftrightarrow \frac{b_3 \alpha^3 (z - 1)^3 + b_2 \alpha^2 (z - 1)^2 (z + 1) + b_1 \alpha (z - 1) (z + 1)^2 + b_0 (z + 1)^3}{a_3 \alpha^3 (z - 1)^3 + a_2 \alpha^2 (z - 1)^2 (z + 1) + a_1 \alpha (z - 1) (z + 1)^2 + a_0 (z + 1)^3}

Which can be expanded into a rational polynomial transfer function in z.
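
Here is a rough sketch of how that expansion can be done numerically rather than by hand. It builds the z-domain numerator and denominator for a 3rd-order system by accumulating the  b_k \alpha^k (z - 1)^k (z + 1)^{3-k} terms; the example coefficients are the normalised Butterworth denominator  s^3 + 2s^2 + 2s + 1 with a numerator of 1, and  \alpha = 2/T with  T = 1 :

/* Sketch of step 2: apply the bilinear substitution to 3rd-order s-domain
 * coefficients by expanding each term's (z - 1)^k (z + 1)^(M-k) factor. */
#include <stdio.h>

#define M 3 /* filter order */

/* Multiply polynomial p (ascending powers, degree *deg) by (z + c) in place. */
static void mul_linear(double *p, int *deg, double c)
{
    int i;
    for (i = *deg + 1; i > 0; i--)
        p[i] = p[i - 1] + c * p[i];
    p[0] *= c;
    (*deg)++;
}

/* out[0..M] += scale * (z - 1)^k * (z + 1)^(M - k). */
static void accumulate_term(double *out, int k, double scale)
{
    double p[M + 1] = {1.0};
    int deg = 0, i;
    for (i = 0; i < k; i++)
        mul_linear(p, &deg, -1.0);
    for (i = 0; i < M - k; i++)
        mul_linear(p, &deg, 1.0);
    for (i = 0; i <= M; i++)
        out[i] += scale * p[i];
}

int main(void)
{
    /* Example s-domain coefficients in ascending powers of s. */
    const double b[M + 1] = { 1.0, 0.0, 0.0, 0.0 };
    const double a[M + 1] = { 1.0, 2.0, 2.0, 1.0 };
    const double alpha = 2.0;              /* 2/T with T = 1 */
    double bz[M + 1] = {0.0}, az[M + 1] = {0.0};
    double ak = 1.0;                       /* alpha^k */
    int k, i;

    for (k = 0; k <= M; k++) {
        accumulate_term(bz, k, b[k] * ak);
        accumulate_term(az, k, a[k] * ak);
        ak *= alpha;
    }
    for (i = 0; i <= M; i++)
        printf("z^%d: num %+.6f  den %+.6f\n", i, bz[i], az[i]);
    return 0;
}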

3) Convert from the z-domain to a difference equation.

First, you must make your z-domain expression causal: positive powers of z must be removed, otherwise your filter would need to know the future! This is simple: multiply the numerator and denominator of the transfer function by the inverse of the highest power of z. For example:

 \frac{Y(z)}{X(z)} = H(z) = \frac{z + 4}{2z^2 - 3z + 2}

Multiply the numerator and denominator by z^{-2}.

 \frac{Y(z)}{X(z)} = \frac{z^{-1} + 4z^{-2}}{2 - 3z^{-1} + 2z^{-2}}

Get  Y(z) as a function of  X(z) .

 Y(z)(2 - 3z^{-1} + 2z^{-2}) = X(z)(z^{-1} + 4z^{-2})

The difference equation can be found by reading each power  z^{-k} as a delay of  k samples applied to the signal it multiplies. i.e. for the above case:

 2y[n] = x[n-1] + 4x[n-2] + 3y[n-1] - 2y[n-2]

That is what you want to implement in your code.
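
For example, a direct (and deliberately unoptimised) sketch of the difference equation above might look like this:

/* Sketch: per-sample implementation of
 * 2 y[n] = x[n-1] + 4 x[n-2] + 3 y[n-1] - 2 y[n-2] */
#include <stdio.h>

static double filter_step(double x, double state[4])
{
    /* state[0] = x[n-1], state[1] = x[n-2], state[2] = y[n-1], state[3] = y[n-2] */
    double y = (state[0] + 4.0 * state[1] + 3.0 * state[2] - 2.0 * state[3]) / 2.0;
    state[1] = state[0]; state[0] = x;
    state[3] = state[2]; state[2] = y;
    return y;
}

int main(void)
{
    double state[4] = {0.0, 0.0, 0.0, 0.0};
    int n;
    /* Push a unit impulse through and print the first few output samples. */
    for (n = 0; n < 8; n++)
        printf("y[%d] = %f\n", n, filter_step(n == 0 ? 1.0 : 0.0, state));
    return 0;
}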

Things to keep in mind

High-order systems can start behaving very poorly when designed in this way. Even in floating point, once a system reaches 4th order or higher, if there are poles close to the real axis in the z-domain the output can become severely degraded. There are ways to mitigate this, the simplest of which is to break the implementation into separate 2nd-order sections (biquads), as sketched below.
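
A rough sketch of what that looks like, using the transposed direct form II structure for each section (the coefficients for each biquad would come from pairing up the conjugate pole pairs of the design above and converting each pair separately):

/* Sketch: a cascade of second-order sections (biquads). */
struct biquad {
    double b0, b1, b2;  /* numerator coefficients */
    double a1, a2;      /* denominator coefficients, with a0 normalised to 1 */
    double z1, z2;      /* state */
};

static double biquad_step(struct biquad *s, double x)
{
    /* Transposed direct form II. */
    double y = s->b0 * x + s->z1;
    s->z1 = s->b1 * x - s->a1 * y + s->z2;
    s->z2 = s->b2 * x - s->a2 * y;
    return y;
}

static double cascade_step(struct biquad *sections, unsigned count, double x)
{
    unsigned i;
    for (i = 0; i < count; i++)
        x = biquad_step(&sections[i], x);
    return x;
}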

Windows Phone, ownCloud and CardDAV and CalDAV

This is a combination of information from the two pages linked below in this post.

What is this?

This page gives an outline of how to get your Windows Phone to sync calendars and contacts with an ownCloud instance running on a server using a self-signed certificate. This whole process is a hack and I'm incredibly disappointed that Windows Phone does not support this natively - especially since they have a CalDAV and CardDAV implementation already available (used by iCloud and Google accounts). If this process stops working at some point, that should be expected.

I successfully got this working on a Lumia 735.

Process

I'm assuming that you've got ownCloud installed on a server using SSL. When the certificate was set up, the FQDN must really be the FQDN. i.e. if your ownCloud instance is hosted at "a.b.com", the certificate must be for "a.b.com" - not "b.com". I had this wrong on my home server and was getting the "80072F0D" error on the phone. If you have not yet set up ownCloud or have not set up SSL for your server yet, there are sites already documenting this process, e.g. How to Create and Install an Apache Self Signed Certificate.

You then need to get the certificate installed on your phone. If you do not do this, you will get errors when the phone tries to start syncing. You can do this by opening the certificate in Internet Explorer on the phone (an easy way is to copy the certificate to your ownCloud files, log into your cloud in IE on the phone and open the file).

You should then be able to follow the instructions here: Setting up CardDAV and CalDAV on Windows Phone 8.1.

Which roughly reads:

  1. Create a fake iCloud account. Put garbage information in the fields and create it. The phone won't check anything.
  2. Modify the account, edit the advanced settings then change the servers for CalDAV and CardDAV (respectively) as:
    [domain]/remote.php/caldav/principals/[username]
    [domain]/remote.php/carddav/principals/[username]

Understanding the Modified Discrete Cosine Transform (MDCT)

After playing around with discrete cosine transforms, I thought I would implement an MDCT and document my understanding of how everything works. I use techniques similar to those used on the Wikipedia page, as they are helpful for understanding, but will add some scanned drawings which I think help (I'm not even close to being clever enough to get a computer to draw these for me).

Prerequisites

The only real background knowledge which I think is relevant to understanding the MDCT is the data extension which the DCT-4 transform assumes.

First DCT-4 Basis Function with Shifted 2N Sample Input

I'll refer to the above image in the Forward Transform overview, but for the time being, only pay attention to the solid quarter wave. This is the first basis function (i.e.  k=0 ) for an  N length DCT-4. If the basis is continued past  N , it has a repeating symmetrical pattern (the dashed line in the image) which repeats every  4N samples. The symmetry is even around  -0.5 and odd around  N-0.5 and holds for every basis function of the DCT-4. i.e. the DCT-4 assumes that the input data continues on forever, repeating itself in the following manner:  x_n ,  -x_{N-n-1} ,  -x_n ,  x_{N-n-1} .
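
In code, the implied extension can be written down directly (a throwaway sketch, only to make the pattern concrete):

/* Sketch: the input extension implied by the DCT-4. Given x[0..N-1], the
 * transform behaves as if the signal continued as
 * x_n, -x_{N-n-1}, -x_n, x_{N-n-1}, repeating every 4N samples. */
static double dct4_extended(const double *x, unsigned N, unsigned long n)
{
    unsigned long m = n % (4ul * N);
    if (m < N)
        return x[m];
    if (m < 2ul * N)
        return -x[2ul * N - 1 - m];
    if (m < 3ul * N)
        return -x[m - 2ul * N];
    return x[4ul * N - 1 - m];
}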

Forward Transform

The MDCT takes in  2N real data points and produces  N real outputs. These inputs are designed to overlap, so the first half of the input data should be the second half of the input data of the previous call. The definition is:

 X_k = \displaystyle\sum\limits_{n=0}^{2N-1} x_n \cos \frac{ \pi \left( n + 0.5 + N/2 \right) \left( k + 0.5 \right) }{N}

It should be trivial to see from the above that the MDCT can be computed using a DCT-4 with an extended number of input data points, all of which have been shifted by half a basis. Go back to the crappy drawing and notice the concatenated  N/2 length sequences  a ,  b ,  c and  d . The total length of this sequence is  2N and it begins at  N/2 (or half the length of a basis function). We need to get  b ,  c and  d back into the  N point region if we want to compute the MDCT using a DCT-4; this can be achieved with the following concatenated sequence (I will subscript these sequences with  r to denote a reversal of the sequence):

 - c_r - d , a - b_r

If we take the DCT-4 of this concatenated sequence, we have found the MDCT of the input sequence.

Inverse Transform

The inverse MDCT or IMDCT takes in  N real data points and produces  2N real outputs. In this transform, the outputs should overlap such that the first half of the output should be added to the second half of the output data in the previous call. The definition is:

 x_n = \frac{1}{N} \displaystyle\sum\limits_{k=0}^{N-1} X_k \cos \frac{ \pi \left( n + 0.5 + N/2 \right) \left( k + 0.5 \right) }{N}

Because we know how the DCT-4 assumes the input and output data repeats in a symmetric pattern, we can get this data trivially in exactly the same fashion as we did in the forward transform. In the following illustration, we take the output from the forward transform and extend it along the basis:

Extended Projection of the MDCT Output on the First DCT-4 Basis

In output row zero, we can see how to extend the input sequence to obtain the 2N points required. We then see in rows two and three how summing the overlapping blocks causes the aliased sequences to cancel in subsequent calls to the IMDCT.

World's Dumbest C MDCT Implementation

I validated that all this actually works with a small C program. Following are the MDCT/IMDCT implementations I came up with. Ignore the "twid" input: I cache the modulation factors for the FFT which gets called in the dct4 routine:

/* state should contain twice as many elements as the input buffer (i.e. 2N,
 * where N is the input length) and should have all elements initialized to
 * zero prior to the first call only; state carries data between calls. The
 * output buffer is actually the first N elements of state after calling. */
void mdct(double *state, const double *input, double *twid, unsigned lenbits)
{
    unsigned rl = 1u << lenbits;
    unsigned i;
    /* Alias the input data with the previous block. */
    for (i = 0; i < rl / 2; i++) {
        state[i]        = - input[rl/2+i]      - input[rl/2-i-1];
        state[rl/2+i]   =   state[rl+i]        - state[rl+rl-i-1];
    }
    /* Save the input block */
    for (i = 0; i < rl; i++)
        state[rl+i]     = input[i];
    /* DCT-4 */
    dct4(state, lenbits, twid);
}
 
/* state should contain twice as many elements as the input buffer (i.e. 2N,
 * where N is the input length) and should have all elements initialized to
 * zero prior to the first call only; state carries data between calls. The
 * output buffer is actually the first N elements of state after calling. */
void imdct(double *state, const double *input, double *twid, unsigned lenbits)
{
    unsigned rl = 1u << lenbits;
    unsigned i;
    /* Collect contributions from the previous frame to the output buffer */
    for (i = 0; i < rl / 2; i++) {
        state[i]        = - state[rl+rl/2-i-1];
        state[rl/2+i]   = - state[rl+i];
    }
    /* Load the input and run the DCT-4 */
    for (i = 0; i < rl; i++)
        state[rl+i]     = input[i];
    dct4(state + rl, lenbits, twid);
    /* Sum contributions from this frame to the output buffer and perform the
     * required scaling. */
    for (i = 0; i < rl / 2; i++) {
        state[i]        = (state[i]      + state[rl+rl/2+i]) / rl;
        state[rl/2+i]   = (state[rl/2+i] - state[rl+rl-i-1]) / rl;
    }
}
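
To convince myself the scaling works out, a throwaway harness like the sketch below can be pasted together with the two functions above. The dct4() here is a naive O(N^2) stand-in for the FFT-based routine the real code uses: I'm assuming an unscaled, in-place DCT-4 ( X_k = \sum x_n \cos \frac{ \pi (n + 0.5)(k + 0.5) }{N} ) with the same argument order, and the twid argument is ignored. With mdct() and imdct() pasted in where marked, each IMDCT call should reproduce the block fed to the MDCT one call earlier:

#include <math.h>
#include <stdio.h>
#include <string.h>

#ifndef M_PI
#define M_PI 3.14159265358979323846
#endif

#define LENBITS 4u
#define N       (1u << LENBITS)

/* Naive stand-in for the FFT-based dct4(): unscaled, in-place, O(N^2).
 * The twid argument is unused here and the buffer is sized for LENBITS. */
static void dct4(double *x, unsigned lenbits, double *twid)
{
    unsigned n = 1u << lenbits, i, j;
    double tmp[N];
    (void)twid;
    for (i = 0; i < n; i++) {
        tmp[i] = 0.0;
        for (j = 0; j < n; j++)
            tmp[i] += x[j] * cos(M_PI * (j + 0.5) * (i + 0.5) / n);
    }
    memcpy(x, tmp, n * sizeof(*x));
}

/* ... paste mdct() and imdct() from above here ... */

int main(void)
{
    double st_m[2 * N] = {0.0}, st_i[2 * N] = {0.0};
    double blocks[4][N], coeffs[N], maxerr = 0.0;
    unsigned t, i;

    for (t = 0; t < 4; t++)
        for (i = 0; i < N; i++)
            blocks[t][i] = sin(0.3 * (t * N + i)) + 0.1 * i;

    for (t = 0; t < 4; t++) {
        mdct(st_m, blocks[t], NULL, LENBITS);
        memcpy(coeffs, st_m, sizeof(coeffs));
        imdct(st_i, coeffs, NULL, LENBITS);
        /* The IMDCT output at call t reconstructs the block given to the
         * MDCT at call t-1, so skip the very first call. */
        if (t >= 1) {
            for (i = 0; i < N; i++) {
                double e = fabs(st_i[i] - blocks[t - 1][i]);
                if (e > maxerr)
                    maxerr = e;
            }
        }
    }
    printf("max reconstruction error: %g\n", maxerr);
    return 0;
}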

Windowed MDCT Implementation

Typical MDCT implementations will window the input and output data (this can also be thought of as windowing the basis functions - which I think is a more helpful way to understand what is happening). It is really important to note that the window function must be carefully chosen to ensure that the basis functions remain orthogonal! The window makes the basis functions always begin and end near zero. The process has the side effect of de-normalising the basis functions (unless the window is rectangular) and means there will be a window-dependent scaling factor which will need to be applied at the output to achieve perfect reconstruction. The following images show the second basis function of the MDCT both un-windowed and windowed with a half-sine window (given at the end of the post).

Second MDCT Basis Function

Sine Windowed Second MDCT Basis Function

In a lossy codec this windowing process is somewhat necessary because if the start and end points are not close to zero, the output is likely to periodically glitch for even the slightest errors in the reconstructed MDCT data. This glitching will occur at the boundaries of the transform (i.e. every N points).

We can work out the necessary conditions for the window to obtain perfect reconstruction using the previous drawings (I'd steer away from equations for this one - it's easier to validate the results visually) by applying a window function split into 4 segments to each of the input blocks. I'll do the generic case for a symmetrical window which is applied to both the input and the output. We split the window (which has a length  2N ) into four segments which will be applied to our original input segments  a ,  b ,  c and  d . Because we are defining this window to be symmetric, we can call the pieces:

 u, v, v_r, u_r

Symmetrical Window Impact on MDCT

The above illustration shows how our window segments are applied to the input data and the impact that has on the DCT-4 analysed data blob. Following that are the output segments from two sequential IMDCT calls with the windows applied to the output here as well.

We need to make the overlapping terms equal the required output segment i.e.

 c = v_r \left( d_r u + c v_r \right) + u \left( c u - d_r v_r \right)


 d = u_r \left( d u_r + c_r v \right) + v \left( d v - c_r u_r \right)

It is clear from the above that the necessary condition to achieve reconstruction is for  v_r^2 + u^2 = 1 (which implies in this case that  v^2 + u_r^2 = 1 must also be true).

A simple solution to this is:

 w_n = \sin \frac{ \pi \left( n + 0.5 \right) }{2N}

The output requires a scaling factor of 2 for this window.
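
As a quick sanity check: for a symmetric window, the  v_r^2 + u^2 = 1 condition is the same statement as the usual Princen-Bradley condition  w_n^2 + w_{n+N}^2 = 1 , and the half-sine window satisfies it exactly because  \sin^2 \theta + \cos^2 \theta = 1 . A few lines of C confirm it numerically:

/* Sketch: verify w_n^2 + w_(n+N)^2 = 1 for the half-sine window. */
#include <math.h>
#include <stdio.h>

#ifndef M_PI
#define M_PI 3.14159265358979323846
#endif

int main(void)
{
    const unsigned N = 256;  /* half the window length */
    double maxdev = 0.0;
    unsigned n;
    for (n = 0; n < N; n++) {
        double wn  = sin(M_PI * (n + 0.5)     / (2.0 * N));
        double wnN = sin(M_PI * (n + N + 0.5) / (2.0 * N));
        double dev = fabs(wn * wn + wnN * wnN - 1.0);
        if (dev > maxdev)
            maxdev = dev;
    }
    printf("max |w_n^2 + w_(n+N)^2 - 1| = %g\n", maxdev);
    return 0;
}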

Why you should not use C99 exact-width integer types

Is there really ever a time when you need an integer type containing exactly N bits? There are C99 types which guarantee at least N bits. There are even C90 types which guarantee at least 8, 16 and 32 bits (the standard C integer types). Why not use one of those?

I never use C99 exact-width types in code... ever. Chances are that you shouldn't either because:

Exact width integer types reduce portability

This is because:

1) Exact width integer types do not exist before C99

Sure, you could create an abstraction that detects whether the standard is older than C99 and introduces the types, but then you would be intruding on the POSIX namespace by defining your own integer types suffixed with "_t". POSIX.1-2008 - The System Interfaces: 2.2.2 The Name Space

GCC also will not like you:

The names of all library types, macros, variables and functions that come from the ISO C standard are reserved unconditionally; your program may not redefine these names.
GNU libc manual: 1.3.3 Reserved Names

From my own experience using GCC on OS X, the fixed width types are defined even when using --std=c90, meaning you'll just get errors if you try to redefine them. Bummer.

2) Exact width integer types are not guaranteed to exist at all:

These types are optional. However, if an implementation provides integer types with widths of 8, 16, 32 or 64 bits, it shall define the corresponding typedef names.
ISO/IEC 9899:1999 - 7.18.1.1 Exact-width integer types

Even in C99, the (u)intN_t types do not need to exist unless there is a native integer type of that width. You may argue that there are not many platforms which lack these types - there are: DSPs. If you start using these types, you limit the platforms on which your software can run - and are probably also developing bad habits.

Using exact width integer types could have a negative performance impact

If you need at least N bits and it does not matter if there are more, why restrict yourself to a type which could require additional overhead? If you are writing C99 code, use one of the (u)int_fastN_t types. Maybe you could even use a standard C integer type!
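
For example (a trivial sketch), a counter that only ever needs to hold values up to a known bound does not need an exact-width type at all:

#include <stdint.h>
#include <stdio.h>

int main(void)
{
    /* uint_fast32_t guarantees at least 32 bits but lets the implementation
     * pick whatever width is fastest; uint32_t would demand exactly 32 bits
     * even where that is slower or simply absent. A plain unsigned long
     * would also do the job, even in C90. */
    uint_fast32_t i, sum = 0;
    for (i = 0; i < 10000; i++)
        sum += i;
    printf("%lu\n", (unsigned long)sum);
    return 0;
}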

The endianness of exact width integer types is unspecified

I am not implying that the endianness is specified for other C types. I am just trying to make a point: you cannot even use these types for portable serialisation/de-serialisation without feral-octet-swapping-macro-garbage, as the underlying layout of the type is system dependent.
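
For example (a sketch of my own, not taken from any particular library), serialising with shifts side-steps the in-memory layout of the type entirely:

#include <stdio.h>

/* Sketch: store a 32-bit value as little-endian octets using shifts, which
 * works regardless of how the host lays the integer out in memory. */
static void store_u32_le(unsigned char *out, unsigned long v)
{
    out[0] = (unsigned char)(v & 0xffu);
    out[1] = (unsigned char)((v >> 8) & 0xffu);
    out[2] = (unsigned char)((v >> 16) & 0xffu);
    out[3] = (unsigned char)((v >> 24) & 0xffu);
}

int main(void)
{
    unsigned char buf[4];
    store_u32_le(buf, 0x11223344ul);
    printf("%02x %02x %02x %02x\n", buf[0], buf[1], buf[2], buf[3]);
    return 0;
}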

If you are interested in the conditions for when memcpy can be used to copy memory into a particular type, maybe you should check out the abstraction which is part of my digest program. It contains a heap of checks to ensure that memcpy is only used on systems where it is known that it will do the right thing. It tries to deal with potential padding, non 8-bit chars and endianness in a clean way that isn't broken.

This article deliberately did not discuss the signed variants of these types...