Caution
You're reading an old version of this documentation. If you want up-to-date information, please have a look at 0.11.0.
librosa.istft
- librosa.istft(stft_matrix, *, hop_length=None, win_length=None, n_fft=None, window='hann', center=True, dtype=None, length=None, out=None)
Inverse short-time Fourier transform (ISTFT).
Converts a complex-valued spectrogram stft_matrix to a time series y by minimizing the mean squared error between stft_matrix and the STFT of y, as described in [1] up to Section 2 (reconstruction from MSTFT).
In general, the window function, hop length, and other parameters should be the same as in stft, which mostly leads to perfect reconstruction of a signal from an unmodified stft_matrix.
- Parameters:
- stft_matrix : np.ndarray [shape=(…, 1 + n_fft//2, t)]
  STFT matrix from stft
- hop_length : int > 0 [scalar]
  Number of audio samples between adjacent STFT columns. If unspecified, defaults to win_length // 4.
- win_length : int <= n_fft = 2 * (stft_matrix.shape[-2] - 1)
  When reconstructing the time series, each frame is windowed and each sample is normalized by the sum of squared window according to the window function (see below).
  If unspecified, defaults to n_fft.
- n_fft : int > 0 or None
  The number of samples per frame in the input spectrogram. By default, this will be inferred from the shape of stft_matrix. However, if an odd frame length was used, you can specify the correct length by setting n_fft.
- window : string, tuple, number, function, np.ndarray [shape=(n_fft,)]
  - a window specification (string, tuple, or number); see scipy.signal.get_window
  - a window function, such as scipy.signal.windows.hann
  - a user-specified window vector of length n_fft
- center : boolean
  If True, stft_matrix is assumed to have centered frames.
  If False, stft_matrix is assumed to have left-aligned frames.
- dtype : numeric type
  Real numeric type for y. Default is to match the numerical precision of the input spectrogram.
- length : int > 0, optional
  If provided, the output y is zero-padded or clipped to exactly length samples.
- out : np.ndarray or None
  A pre-allocated, complex-valued array to store the reconstructed signal y. This must be of the correct shape for the given input parameters.
  If not provided, a new array is allocated and returned.
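The note on n_fft and odd frame lengths can be checked with a little arithmetic. The sketch below is illustrative only: the helper names n_bins_for and inferred_n_fft are hypothetical and not part of librosa. They encode the shape relation 1 + n_fft//2 given for stft_matrix and the relation n_fft = 2 * (number of frequency bins - 1) given for win_length.

```python
def n_bins_for(n_fft):
    # Number of frequency rows for a frame of n_fft samples (hypothetical helper)
    return 1 + n_fft // 2


def inferred_n_fft(n_bins):
    # Frame length implied by the number of frequency rows (hypothetical helper)
    return 2 * (n_bins - 1)


results = {}
for n_fft in (2048, 2047):
    bins = n_bins_for(n_fft)
    guess = inferred_n_fft(bins)
    results[n_fft] = (bins, guess)
    # An odd n_fft cannot be recovered, because the inferred value is always even
    print(n_fft, bins, guess, guess == n_fft)
```

Under these assumptions the even frame length is recovered exactly, while the odd one comes back two samples short, which is the case where n_fft would need to be passed explicitly.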
- Returns:
- y : np.ndarray [shape=(…, n)]
  time domain signal reconstructed from stft_matrix. If stft_matrix contains more than two axes (e.g., from a stereo input signal), then y will match shape on the leading dimensions.
See also
stft : Short-time Fourier Transform
Notes
This function caches at level 30.
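The reconstruction described under win_length (window each frame, overlap-add, then divide each sample by the sum of squared windows) can be sketched in plain NumPy. This is a minimal illustration, not librosa's implementation: toy_stft, toy_istft, and hann_periodic are hypothetical names, and the sketch assumes a mono signal, centered frames with zero padding, a periodic Hann window, and win_length equal to n_fft.

```python
import numpy as np


def hann_periodic(n):
    # Periodic Hann window, written out so the sketch only needs NumPy
    return 0.5 - 0.5 * np.cos(2.0 * np.pi * np.arange(n) / n)


def toy_stft(y, n_fft=512, hop_length=128):
    """Centered, Hann-windowed STFT of a 1-D signal; shape (1 + n_fft//2, t)."""
    window = hann_periodic(n_fft)
    y_pad = np.pad(y, n_fft // 2)  # zero padding (an assumption of this sketch)
    n_frames = 1 + (len(y_pad) - n_fft) // hop_length
    frames = np.stack(
        [y_pad[t * hop_length : t * hop_length + n_fft] * window
         for t in range(n_frames)],
        axis=1,
    )
    return np.fft.rfft(frames, axis=0)


def toy_istft(stft_matrix, hop_length=128, length=None):
    """Overlap-add inverse with sum-of-squared-window normalization."""
    n_fft = 2 * (stft_matrix.shape[-2] - 1)
    window = hann_periodic(n_fft)
    n_frames = stft_matrix.shape[-1]

    # Back to the time domain, then apply the window to every frame
    frames = np.fft.irfft(stft_matrix, n=n_fft, axis=0) * window[:, np.newaxis]

    y = np.zeros(n_fft + hop_length * (n_frames - 1))
    window_sumsq = np.zeros_like(y)
    for t in range(n_frames):
        start = t * hop_length
        y[start : start + n_fft] += frames[:, t]
        window_sumsq[start : start + n_fft] += window ** 2

    # Normalize only where the window envelope is not (numerically) zero
    ok = window_sumsq > np.finfo(y.dtype).tiny
    y[ok] /= window_sumsq[ok]

    y = y[n_fft // 2 :]  # drop the leading center padding
    if length is None:
        length = len(y) - n_fft // 2  # drop the trailing padding as well
    if len(y) >= length:
        return y[:length]
    return np.pad(y, (0, length - len(y)))


rng = np.random.default_rng(0)
y = rng.standard_normal(4000)
D = toy_stft(y)
y_hat = toy_istft(D, length=len(y))
print(D.shape, np.max(np.abs(y - y_hat)))
```

Because the analysis and synthesis windows are the same, each output sample is the input sample times the sum of squared windows, so the division undoes the windowing wherever that sum is nonzero.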
Examples
>>> import librosa
>>> y, sr = librosa.load(librosa.ex('trumpet'))
>>> D = librosa.stft(y)
>>> y_hat = librosa.istft(D)
>>> y_hat
array([-1.407e-03, -4.461e-04, ..., 5.131e-06, -1.417e-05], dtype=float32)
Exactly preserving length of the input signal requires explicit padding. Otherwise, a partial frame at the end of y will not be represented.
>>> import numpy as np
>>> n = len(y)
>>> n_fft = 2048
>>> y_pad = librosa.util.fix_length(y, size=n + n_fft // 2)
>>> D = librosa.stft(y_pad, n_fft=n_fft)
>>> y_out = librosa.istft(D, length=n)
>>> np.max(np.abs(y - y_out))
8.940697e-08
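The reason padding is needed can be seen from frame-count arithmetic alone. The sketch below is a rough model, not a call into librosa: it assumes centered frames, an even n_fft, and that the inverse trims n_fft // 2 samples of padding from each end when length is not given. The helper names are hypothetical.

```python
def centered_frame_count(n_samples, hop_length):
    # Assumes centered frames and an even n_fft, so padding adds n_fft samples total
    return 1 + n_samples // hop_length


def reconstructed_length(n_frames, hop_length):
    # Samples left after trimming the center padding from both ends
    return hop_length * (n_frames - 1)


n, n_fft = 10000, 2048
hop_length = n_fft // 4

plain = reconstructed_length(centered_frame_count(n, hop_length), hop_length)
padded = reconstructed_length(
    centered_frame_count(n + n_fft // 2, hop_length), hop_length
)
print(plain, padded)  # 9728 10752
```

Under these assumptions the unpadded round trip comes back 272 samples short of n, while the padded one overshoots n and can then be clipped to exactly n samples with the length parameter.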