Logo   Information, Signal, Images et ViSion C.N.R.S.   GdR   M.E.S.R.

The discrete STFT

To reduce the redundancy of the continuous STFT, we can sample it in the time-frequency plane. Since the atoms used can be deduced from the window $h(t)$ by translation in time and in frequency, it is natural to sample the STFT on a rectangular grid:

\begin{displaymath}F_x[n,m;h]=F_x(nt_0,m\nu_0;h)=\int_{-\infty}^{+\infty} x(u)\
h^*(u-nt_0) \exp{[-j2\pi m \nu_0 u]}\ du \end{displaymath}

$m,n \in {\cal Z}$. The problem is then to choose the values of $t_0$ and $\nu_0$ so as to minimize the redundancy without loosing any information. For that, we must have

\begin{displaymath}t_0\times \nu_0\ \leq\ 1.\end{displaymath}

Then, the atoms ${h_{nt_0,m\nu_0}}$ constitute a discrete over-sampled family of non orthonormal elements, which is called a frame : when $t_0\times \nu_0>1$, the time-frequency plane is not sufficiently "covered" by the atoms $h_{nt_0,m\nu_0}$, i.e. there are "gaps" between adjacent atoms.

When $t_0\times \nu_0 = 1$, the family of atoms ${h_{nt_0,m\nu_0}}$ can constitute an orthonormal basis for an appropriate choice of the window. But it can be shown that it is impossible to obtain such a basis with a window $h$ which is well localized in time and in frequency (this property is known as the Balian-Low obstruction [Dau92]). Therefore, for a well localized window $h$ (for example a gaussian window), the reconstruction formula will not be numerically stable.

In the discrete case, the reconstruction (synthesis) formula of the signal from the STFT is then given by

\begin{displaymath}x(t) = \sum_n \sum_m F_x[n,m;h]\ g_{n,m}(t)\end{displaymath}

where $g_{n,m}(t)=g(t-nt_0)\ \exp{[j2\pi m \nu_0 t]}$.

This relation is valid provided that the sampling periods $t_0$ and $\nu_0$, the analysis window $h$ and the synthesis window $g$ are chosen such that

\begin{displaymath}\frac{1}{\nu_0} \sum_n g(t+\frac{k}{\nu_0}-nt_0)\ h^*(t-nt_0)\ =\
\delta_k\ \ \forall t\end{displaymath}

with $\delta_k$ defined as $\delta_0=1$ and $\delta_k=0$ for $k\neq
0$. This condition is far more restrictive than the condition $\int_{-\infty}^{+\infty} g(t)\ h^*(t)\ dt=1$ required in the continuous case.

For a sampled signal $x[n]$ whose sampling period is noted $T$, $t_0$ has to be chosen so that $t_0 = kT$, $k\in {\cal N}^*$. We then have the following analysis and synthesis formulae

$\displaystyle F_x[n,m;h]$ $\textstyle =$ $\displaystyle \sum_k x[k]\ h^*[k-n]\ \exp{[-j2\pi m k]} \mbox{ for }
-\frac{1}{2} \leq m \leq \frac{1}{2}$ (3.1)
$\displaystyle x[k]$ $\textstyle =$ $\displaystyle \sum_n \sum_m F_x[n,m;h]\ g[k-n]\ \exp{[j2\pi m k]}.$ (3.2)

These two relations can be implemented efficiently using a Fast Fourier Transform (FFT) algorithm.

Eric Chassande-Mottin 2005-10-26

© GdR ISIS - Contact