Fourier Transform for Time Series: Fast Convolution Explained with numpy | by Yoann Mocquin | Jul, 2023


Implementation from scratch vs numpy

The Fourier transform algorithm is considered one of the greatest discoveries in all of mathematics. French mathematician Jean-Baptiste Joseph Fourier laid the foundation for harmonic analysis in his book “Théorie analytique de la chaleur” in 1822. Today, the Fourier transform and all its variants form the basis of our modern world, powering technologies like compression, communication, image processing.

Engraved portrait of French mathematician Jean Baptiste Joseph Fourier (1768–1830), early 19th century. [souce: wikipedia, image from public domain]

This wonderful framework also provides great tools for analysing time-series… and that’s why we’re here!

This post is part of a series on the Fourier transform. Today we will talk about convolution and how the Fourier transform provides the fastest way you can do it.

All figures and equations are made by the author.

Let’s start with basic definitions. The discrete Fourier transform for a discrete time sequence x of N elements is :

Discrete Fourier Transform (DFT) definition. Other definitions exist, you just need to choose one and stick to it (made by author)

where k denotes the k-th frequency of the spectrum of x. Note that some author add a scaling factor of 1/N to that definition, but is not of great importance for this post — all in all it is just a matter of definition and sticking it to it.

The inverse Fourier transform is then (given the definition for the forward Fourier transform):

Inverse Discrete Fourier Transform, based on the forward definition mentioned above (made by author).

That being said, one of the most important theorem on Fourier transform is that convolution in one space is equivalent to multiplication in the other. In other words, the Fourier transform of a product is the convolution of the respective Fourier spectra, and the Fourier transform of a convolution is the product of the respective Fourier spectra.

Multiplication in the time-domain corresponds to circular convolution in Fourier domain (made by author).

and

Circular convolution in time domain corresponds to multiplication in Fourier domain (made by author).

where the dot represents the standard product (multiplication) and the circled star represents circular convolution.

Two important notes:

  • Periodic signal : Fourier analysis framework consider that the signals we handle are periodic. In other words, they repeat from minus infinity to infinity. However it is not always practical to handle such signals with a finite memory computer, so we only “play” with one period, as we will see hereafter.
  • Circular convolution : The convolution theorem states that multiplication is equivalent to circulation convolution, which is a bit different from linear convolution, which we are more accustomed to. As we will see, it is not that different, and not that complicated either.

If you’re familiar with linear convolution, often simply referred to as ‘convolution’, you won’t be confused by circular convolution. Basically, circular convolution is just the way to convolve periodic signals. As you can guess, linear convolution only makes sense for finite length signals, that do not span from minus infinity to infinity. In our case, in the context of Fourier analysis, our signals are periodic therefore do not satisfy this condition. We can’t talk about (linear) convolution.

Yet we still can intuit a linear-convolution-like operation on our periodic signals : just convolve the periodic signal on a one-period length. That’s what circular convolution does : it convolves 2 periodic signals of the same length along a one-period span.

To further convince yourself of the difference, compare both formulas for discrete linear convolution and discrete circular convolution :

Equation for linear convolution : most of the time in signal processing, with use this formula, by padding with zeros (made by author).
Circular convolution : this is the convolution used when dealing with periodic signals, as in Fourier analysis (made by author).

Notice the differences :

bounds : linear convolution uses samples from minus-infinity to plus infinity — as stated previously, in this context x and y have finite energy the sum makes sense. For the circular convolution, we only need to what happened in a period span, so the sum only spans one period

circular indexes : in the circular convolution, we “wrap” the index of y using a modulo operation with a length of N. That’s just a way to ensure that y is considered periodic with a period of N : when we want to know the value of y at position k, we just use the value of y at position k%N — since y is N periodic, we get the right value. Again, this is just a mathematical way to handle periodic infinite-length sample sequences.

Numpy provides great tools for finite length signals: and that’s a good news because as we just saw, our infinite-length periodic signal can be represented with just a period.

Let’s create a simple class to represent such signals. We add a method to quickly plot the array, as well as an additional period before and after the “base” array, to keep in mind that we are dealing with a periodic sequence.

Let’s look at two examples : first with a sampled sinus sequence, then with a linear sequence. Both are considered to be N-periodic (N=10 in this case).

2 examples of PeriodicArray : the “base” period is plotted from 0 to N in dark blue, while 2 other periods are added before and after to represent the fact that we are dealing with periodic sequences (made by author).

Let’s now implement the circular convolution equation seen above. Using indexing and the modulo operator, it is pretty straightforward:

The circular convolution between the above 2 periodic sequences (made by author).

That’s great, we can now see what the circular convolution between two signals looks like. Putting everything in a single figure :

Left : first periodic array. Middle : second periodic array. Right : circular convolution of the 2 periodic arrays, which also is a periodic array (made by author).

Now this solution works really well, but it has a major flaw : it is slow. As you can see, we have to go through 2 nested loops to compute the result : one for each position in the result array, and one to compute the result at that position : we say that the algorithm is O(N²), as N increases the number of operations will increase as the square of N.

For small arrays like those in the example, it is not a problem, but as your array will grow it’ll become a major problem.

Furthermore, loops on numerical data is most of the time considered a bad practice in python. There must be a better way…

And that’s where the Fourier transform and the convolution theorem come into play. Because of the way the discrete Fourier transform is implemented, in a very fast and optimized way using the Fast Fourier Transform (FFT), the operation is **very** fast (we say the FFT is O(N log N), which is way better than O(N²)).

Using the convolution theorem, we can use the fact the product of the DFT of 2 sequences, when transformed back into the time-domain using the inverse DFT, we get the convolution of the input time sequences. In other words, we have :

Circular convolution between x and y using direct and inverse Fourier transform (made by author).

where DFT represents the discrete Fourier transform, and IDFT the inverse operation.

We can then implement this algorithm to compute the convolution of x and y using numpy very easily :

To finish, let’s verify that both approaches yield the same results, and compare the time it take python to compute the circular convolution using both techniques:

Comparison of both approaches to compute the circular convolution between 2 periodic sequences : “slow way” is the simple algebraic using loops and additions in blue, which is superimposed with the “Fourier way” in orange. Both methods give exactly equal results (down to numerical precision) (made by author).

That’s a perfect match! Both are rigorously equivalent in term of numerical values.

Now for the time comparison:

And the results are:

  • for N=10 samples, the DFT is faster by a factor of 6
  • for N=1000 samples, the DFT is faster a factor of about 10000

That’s huge! Consider now what it can brings you when you analyze your time-series with thousands and thousands of samples!

We have seen in this post that the Fourier transform is powerful tool, especially thanks to the convolution theorem that allows us to compute convolution in a very efficient manner. We’ve seen that linear and circular are not exactly the same operation but are both based on a convolution.

Subscribe to get future posts about Fourier transform directly onto your feed!

Also, check out my other post and if you like any of them, please subscribe it helps me a lot to reach my goal of 100 subscribers:

300-Times Faster Resolution of Finite-Difference Method Using NumPy | by Yoann Mocquin | Towards Data Science (medium.com)

PCA/LDA/ICA : a components analysis algorithms comparison | by Yoann Mocquin | Towards Data Science (medium.com)

Wrapping numpy’s array. The container approach. | by Yoann Mocquin | Towards Data Science (medium.com)

Deep Dive into Seaborn: Color Palettes | by Yoann Mocquin | Analytics Vidhya | Medium



Source link

Leave a Comment