
librosa.core.ifgram

librosa.core.ifgram(y, sr=22050, n_fft=2048, hop_length=None, win_length=None, window='hann', norm=False, center=True, ref_power=1e-06, clip=True, dtype=<class 'numpy.complex64'>, pad_mode='reflect')[source]

Compute the instantaneous frequency, obtained as the time-derivative of the phase of the complex spectrum as described by [1]. The estimate is scaled by the sampling rate sr, so if_gram is expressed in the same units as sr (Hz in the example below).

The regular STFT is computed as a by-product and returned as D.

[1] Abe, Toshihiko, Takao Kobayashi, and Satoshi Imai. “Harmonics tracking and pitch extraction based on instantaneous frequency.” International Conference on Acoustics, Speech, and Signal Processing, ICASSP-95, Vol. 1. IEEE, 1995.
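As a rough illustration of the phase-derivative idea, the NumPy sketch below estimates the instantaneous frequency of a single frame of a 440 Hz tone. It is not librosa's implementation: it assumes a periodic Hann window and the analytic derivative of that window, and the names frame, d_window and inst_freq are illustrative only.

```python
import numpy as np

sr = 22050       # sampling rate in Hz
n_fft = 2048     # frame length
f0 = 440.0       # frequency of the test tone in Hz

n = np.arange(n_fft)
frame = np.cos(2 * np.pi * f0 * n / sr)

# Periodic Hann window and its analytic derivative with respect to n.
window = 0.5 - 0.5 * np.cos(2 * np.pi * n / n_fft)
d_window = (np.pi / n_fft) * np.sin(2 * np.pi * n / n_fft)

# Spectrum of the frame under the window and under the derivative window.
X = np.fft.rfft(frame * window)
X_d = np.fft.rfft(frame * d_window)

# Center frequency of each FFT bin, in Hz.
bin_freqs = np.fft.rfftfreq(n_fft, d=1.0 / sr)

# Phase-derivative correction at the strongest bin, converted from
# radians per sample to Hz.
peak = int(np.argmax(np.abs(X)))
offset = -np.imag(X_d[peak] / X[peak]) * sr / (2 * np.pi)
inst_freq = bin_freqs[peak] + offset
```

In this setup the strongest bin is centered near 441.4 Hz, while inst_freq lands close to the true 440 Hz, which is the kind of sub-bin refinement the method is meant to provide.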

Warning

This function is deprecated in version 0.7.1, and will be removed in version 0.8.0. The function reassigned_spectrogram provides comparable functionality, and should be used instead of ifgram.

Parameters
y : np.ndarray [shape=(n,)]

audio time series

sr : number > 0 [scalar]

sampling rate of y

n_fft : int > 0 [scalar]

FFT window size

hop_length : int > 0 [scalar]

Hop length: the number of samples between successive frames. If not supplied, defaults to win_length / 4.

win_length : int > 0, <= n_fft

Window length. Defaults to n_fft. See stft for details.
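The way the hop_length and win_length defaults chain together can be sketched as follows. resolve_frame_params is a hypothetical helper, not a librosa function, and the use of floor division for win_length / 4 is an assumption made so that the hop is an integer.

```python
def resolve_frame_params(n_fft=2048, hop_length=None, win_length=None):
    """Fill in the documented defaults (illustrative helper, not part of librosa)."""
    if win_length is None:
        # win_length falls back to the FFT size.
        win_length = n_fft
    if hop_length is None:
        # Assumption: the quotient is floored to obtain an integer hop.
        hop_length = win_length // 4
    return win_length, hop_length


print(resolve_frame_params())                 # (2048, 512)
print(resolve_frame_params(win_length=1024))  # (1024, 256)
```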

window : string, tuple, number, function, or np.ndarray [shape=(n_fft,)]
  • a window specification (string, tuple, number); see scipy.signal.get_window

  • a window function, such as scipy.signal.windows.hann

  • a user-specified window vector of length n_fft

See stft for details.

norm : bool

Normalize the STFT.

center : boolean
  • If True, the signal y is padded so that frame D[:, t] (and if_gram) is centered at y[t * hop_length].

  • If False, then D[:, t] begins at y[t * hop_length].
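The two framing conventions can be checked on a toy signal. This is a minimal NumPy sketch with deliberately small, made-up values for n_fft and hop_length. It pads by n_fft // 2 samples on each side with reflection, which is assumed here to mirror what center=True does with the default pad_mode.

```python
import numpy as np

n_fft, hop_length = 8, 2
y = np.arange(20, dtype=float)   # toy signal where y[i] == i

# center=True: pad n_fft // 2 samples on each side before slicing frames.
y_padded = np.pad(y, n_fft // 2, mode="reflect")

t = 3
centered_frame = y_padded[t * hop_length : t * hop_length + n_fft]
# The middle sample of the centered frame is y[t * hop_length].
assert centered_frame[n_fft // 2] == y[t * hop_length]

# center=False: no padding, so frame t begins at y[t * hop_length].
uncentered_frame = y[t * hop_length : t * hop_length + n_fft]
assert uncentered_frame[0] == y[t * hop_length]
```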

ref_power : float >= 0 or callable

Minimum power threshold for estimating instantaneous frequency. Any bin with np.abs(D[f, t])**2 < ref_power will receive the default frequency estimate, i.e. the center frequency of bin f.

If callable, the threshold is set to ref_power(np.abs(D)**2).
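The thresholding rule above, for both the scalar and the callable form of ref_power, might look like the following. low_power_mask is a hypothetical helper written for illustration and is not taken from the librosa source.

```python
import numpy as np


def low_power_mask(D, ref_power=1e-6):
    """Return True for bins whose power falls below the threshold (illustrative)."""
    power = np.abs(D) ** 2
    # A callable ref_power derives the threshold from the power spectrogram.
    threshold = ref_power(power) if callable(ref_power) else ref_power
    return power < threshold


# Toy 2x2 "STFT" with powers 1e-8, 1.0, 0.25 and 4.0.
D = np.array([[1e-4 + 0j, 1.0 + 0j],
              [0.5j, 2.0 + 0j]])

fixed = low_power_mask(D)                        # scalar threshold 1e-6
adaptive = low_power_mask(D, ref_power=np.median)  # threshold = median power
```

With the scalar default only the first bin is masked, whereas the median-based threshold (0.625 for this toy input) also masks the bin with power 0.25.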

clip : boolean
  • If True, clip estimated frequencies to the range [0, 0.5 * sr].

  • If False, estimated frequencies can be negative or exceed 0.5 * sr.
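The effect of clip=True can be reproduced on a handful of made-up frequency estimates with np.clip. The values in if_gram below are invented for illustration.

```python
import numpy as np

sr = 22050
# Hypothetical raw estimates in Hz: one negative, one in range, one above Nyquist.
if_gram = np.array([-12.5, 440.0, 11030.0])

# Restrict the estimates to [0, 0.5 * sr], as clip=True is documented to do.
clipped = np.clip(if_gram, 0, 0.5 * sr)
print(clipped)
```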

dtype : numeric type

Complex numeric type for D. Default is 64-bit complex.

pad_mode : string

If center=True, the padding mode to use at the edges of the signal. By default, STFT uses reflection padding.

Returns
if_gram : np.ndarray [shape=(1 + n_fft/2, t), dtype=real]

Instantaneous frequency spectrogram: if_gram[f, t] is the frequency at bin f, time t

D : np.ndarray [shape=(1 + n_fft/2, t), dtype=complex]

Short-time Fourier transform
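To anticipate the shape (1 + n_fft/2, t) of both outputs, a back-of-the-envelope calculation such as the one below may help. expected_shape is a hypothetical helper that assumes conventional STFT framing (padding of n_fft // 2 samples per side when center=True, and only complete frames kept). The exact frame count may differ between librosa versions.

```python
def expected_shape(n_samples, n_fft=2048, hop_length=512, center=True):
    """Estimate the (bins, frames) shape of if_gram and D (illustrative)."""
    n_bins = 1 + n_fft // 2
    if center:
        # Centering pads n_fft // 2 samples on each side of the signal.
        n_samples += 2 * (n_fft // 2)
    # Count the complete frames that fit into the (padded) signal.
    n_frames = 1 + (n_samples - n_fft) // hop_length
    return n_bins, n_frames


print(expected_shape(22050))                # (1025, 44)
print(expected_shape(22050, center=False))  # (1025, 40)
```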

See also

stft

Short-time Fourier Transform

reassigned_spectrogram

Time-frequency reassigned spectrogram

Examples

>>> y, sr = librosa.load(librosa.util.example_audio_file())
>>> frequencies, D = librosa.ifgram(y, sr=sr)
>>> frequencies
array([[  0.000e+00,   0.000e+00, ...,   0.000e+00,   0.000e+00],
       [  3.150e+01,   3.070e+01, ...,   1.077e+01,   1.077e+01],
       ...,
       [  1.101e+04,   1.101e+04, ...,   1.101e+04,   1.101e+04],
       [  1.102e+04,   1.102e+04, ...,   1.102e+04,   1.102e+04]])