

librosa.yin(y, fmin, fmax, sr=22050, frame_length=2048, win_length=None, hop_length=None, trough_threshold=0.1, center=True, pad_mode='reflect')

Fundamental frequency (F0) estimation using the YIN algorithm.

YIN is an autocorrelation-based method for fundamental frequency estimation [1]. First, a normalized difference function is computed over short (overlapping) frames of audio. Next, the first minimum of the difference function that falls below trough_threshold is selected as an estimate of the signal’s period. Finally, the estimated period is refined using parabolic interpolation and then converted to the corresponding frequency.
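
The three steps above can be sketched for a single frame with NumPy. This is a simplified illustration, not librosa's implementation: the function name yin_frame_sketch, the choice of comparison window, and the fallback to the global minimum when no trough is below the threshold are all assumptions made for this example.

```python
import numpy as np


def yin_frame_sketch(frame, sr, fmin, fmax, trough_threshold=0.1):
    """Estimate F0 (Hz) for one frame. Hypothetical helper, illustration only."""
    frame = np.asarray(frame, dtype=float)
    min_lag = max(int(np.floor(sr / fmax)), 1)   # shortest period considered
    max_lag = int(np.ceil(sr / fmin))            # longest period considered
    win = len(frame) - max_lag - 1               # samples compared at every lag
    if win <= 0:
        raise ValueError("frame is too short for the requested fmin")

    # Step 1a: difference function d[tau] = sum_j (x[j] - x[j + tau])**2
    lags = np.arange(max_lag + 2)
    diff = np.array(
        [np.sum((frame[:win] - frame[tau:tau + win]) ** 2) for tau in lags]
    )

    # Step 1b: normalize each d[tau] by the running mean of d[1..tau]
    cmnd = np.ones_like(diff)
    running_sum = np.maximum(np.cumsum(diff[1:]), np.finfo(float).tiny)
    cmnd[1:] = diff[1:] * lags[1:] / running_sum

    # Step 2: first local minimum below the threshold
    tau = None
    for t in range(min_lag, max_lag + 1):
        is_trough = cmnd[t] <= cmnd[t - 1] and cmnd[t] <= cmnd[t + 1]
        if is_trough and cmnd[t] < trough_threshold:
            tau = t
            break
    if tau is None:
        # Assumption of this sketch: fall back to the global minimum
        tau = min_lag + int(np.argmin(cmnd[min_lag:max_lag + 1]))

    # Step 3: parabolic interpolation around the selected lag
    a, b, c = cmnd[tau - 1], cmnd[tau], cmnd[tau + 1]
    curvature = a - 2.0 * b + c
    shift = 0.5 * (a - c) / curvature if curvature > 0 else 0.0

    # Convert the refined period (in samples) to a frequency in Hertz
    return sr / (tau + shift)


# Usage: one 2048-sample frame of a 220 Hz sine at 22050 Hz
sr = 22050
t = np.arange(2048) / sr
f0_estimate = yin_frame_sketch(np.sin(2 * np.pi * 220.0 * t), sr, fmin=65.0, fmax=2093.0)
```

For a clean sinusoid the estimate should land very close to the true frequency; real recordings generally need the framing, padding and thresholding described in the parameters.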

Parameters

y: np.ndarray [shape=(n,)]

audio time series.

fmin: number > 0 [scalar]

minimum frequency in Hertz. The recommended minimum is librosa.note_to_hz('C2') (~65 Hz) though lower values may be feasible.

fmax: number > 0 [scalar]

maximum frequency in Hertz. The recommended maximum is librosa.note_to_hz('C7') (~2093 Hz) though higher values may be feasible.
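
As a rough illustration of what the recommended bounds mean, the sketch below reproduces the quoted note frequencies with the standard equal-temperament formula and converts them to periods in samples. The helper midi_to_hz_sketch is a hypothetical stand-in written for this example, and the assumption that fmin and fmax translate into the longest and shortest candidate period follows from the period-to-frequency conversion described above rather than from the library source.

```python
def midi_to_hz_sketch(midi_note):
    """Equal temperament with A4 (MIDI note 69) tuned to 440 Hz."""
    return 440.0 * 2.0 ** ((midi_note - 69) / 12.0)


C2_HZ = midi_to_hz_sketch(36)  # C2 is MIDI note 36, roughly 65 Hz
C7_HZ = midi_to_hz_sketch(96)  # C7 is MIDI note 96, roughly 2093 Hz

sr = 22050
shortest_period = sr / C7_HZ   # samples per cycle at the recommended fmax
longest_period = sr / C2_HZ    # samples per cycle at the recommended fmin

print(f"C2 = {C2_HZ:.2f} Hz, period = {longest_period:.1f} samples")
print(f"C7 = {C7_HZ:.2f} Hz, period = {shortest_period:.1f} samples")
```

Lowering fmin lengthens the longest period that has to fit inside a frame, which is one reason very low values may need a larger frame_length.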

sr: number > 0 [scalar]

sampling rate of y in Hertz.

frame_length: int > 0 [scalar]

length of the frames in samples. By default, frame_length=2048 corresponds to a time scale of about 93 ms at a sampling rate of 22050 Hz.
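
The quoted time scale is simply frame_length divided by the sampling rate. A small sketch of that arithmetic, using a hypothetical helper name:

```python
def frame_duration_ms(frame_length, sr):
    """Duration of one analysis frame in milliseconds."""
    return 1000.0 * frame_length / sr


default_ms = frame_duration_ms(2048, 22050)   # the default configuration
at_44100_ms = frame_duration_ms(2048, 44100)  # same frame length, higher rate

print(f"{default_ms:.1f} ms at 22050 Hz, {at_44100_ms:.1f} ms at 44100 Hz")
```

Doubling the sampling rate halves the time span of a frame, so frame_length may need to grow with sr to cover the same number of cycles of a low-pitched signal.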

win_length: None or int > 0 [scalar]

length of the window for calculating autocorrelation in samples. If None, defaults to frame_length // 2.

hop_length: None or int > 0 [scalar]

number of audio samples between adjacent YIN predictions. If None, defaults to frame_length // 4.
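
A minimal sketch of how the None defaults resolve, and of how many predictions to expect for a given signal length. Both helper names are hypothetical, and the frame count assumes ordinary sliding-window framing with frame_length // 2 samples of padding on each side when center=True.

```python
def resolve_yin_lengths(frame_length=2048, win_length=None, hop_length=None):
    """Apply the documented defaults for win_length and hop_length."""
    if win_length is None:
        win_length = frame_length // 2
    if hop_length is None:
        hop_length = frame_length // 4
    return win_length, hop_length


def count_frames(n_samples, frame_length, hop_length, center=True):
    """Number of frames under plain sliding-window framing (assumed here)."""
    if center:
        n_samples += 2 * (frame_length // 2)  # padding added on both sides
    return 1 + (n_samples - frame_length) // hop_length


win_length, hop_length = resolve_yin_lengths()          # defaults for 2048
n_frames = count_frames(5 * 22050, 2048, hop_length)    # 5 seconds of audio
```

With the defaults, a 5-second signal at 22050 Hz yields one prediction every 512 samples.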

trough_threshold: number > 0 [scalar]

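absolute threshold for peak estimation. The first minimum of the normalized difference function that falls below this value is taken as the period estimate.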


center: boolean

If True, the signal y is padded so that frame D[:, t] is centered at y[t * hop_length]. If False, then D[:, t] begins at y[t * hop_length]. Defaults to True, which simplifies the alignment of D onto a time grid by means of librosa.core.frames_to_samples.
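
The difference between the two settings can be made concrete by computing which samples of the original signal a frame covers. This is a sketch with a hypothetical helper; negative indices stand for samples that come from padding rather than from y itself.

```python
def frame_span(t, frame_length=2048, hop_length=512, center=True):
    """Return (start, stop) sample indices of frame t, relative to unpadded y."""
    anchor = t * hop_length
    start = anchor - frame_length // 2 if center else anchor
    return start, start + frame_length


centered = frame_span(3)                  # centered at sample 3 * 512 = 1536
uncentered = frame_span(3, center=False)  # begins at sample 1536
first_centered = frame_span(0)            # first half lies in the padding

# Time stamp of frame 3 in seconds at the default sampling rate
time_of_frame_3 = 3 * 512 / 22050
```

With center=True, the value reported for frame t can be read as describing the signal around time t * hop_length / sr.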

pad_mode: string or function

If center=True, this argument is passed to np.pad for padding the edges of the signal y. By default (pad_mode="reflect"), y is padded on both sides with its own reflection, mirrored around its first and last sample respectively. If center=False, this argument is ignored. See also: np.pad.
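
To see what reflection padding does, here is a small np.pad sketch on a toy signal. The tiny frame_length of 4 is chosen only to keep the arrays readable, and padding by frame_length // 2 on each side is an assumption about how centering is applied.

```python
import numpy as np

y = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
frame_length = 4  # toy value for illustration

# "reflect" mirrors around the first and last sample without repeating them
reflected = np.pad(y, frame_length // 2, mode="reflect")

# "edge" repeats the boundary samples instead, shown for comparison
edged = np.pad(y, frame_length // 2, mode="edge")

print(reflected)
print(edged)
```

Reflection keeps the padded region similar in character to the signal itself, which tends to disturb the first and last frames less than padding with a constant.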

Returns

f0: np.ndarray [shape=(n_frames,)]

time series of fundamental frequencies in Hertz.

See also


librosa.pyin

Fundamental frequency (F0) estimation using probabilistic YIN (pYIN).


Examples

Computing a fundamental frequency (F0) curve from an audio input:

>>> y = librosa.chirp(440, 880, duration=5.0)
>>> librosa.yin(y, 440, 880)
array([442.66354675, 441.95299983, 441.58010963, ...,
    871.161732  , 873.99001454, 877.04297681])
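
To judge output like the above, it can help to compute the frequency the chirp is expected to have at each frame time. The sketch below uses NumPy only and rests on two assumptions: that the chirp sweeps exponentially from 440 Hz to 880 Hz over its duration, and that frames are centered with the default hop of 512 samples.

```python
import numpy as np

sr = 22050
hop_length = 512           # default: frame_length // 4
duration = 5.0
fmin, fmax = 440.0, 880.0

# One frame per hop, plus the frame centered on the first sample
n_frames = 1 + int(duration * sr) // hop_length
times = np.arange(n_frames) * hop_length / sr

# Instantaneous frequency of an exponential sweep (assumed sweep shape)
expected_f0 = fmin * (fmax / fmin) ** (times / duration)

print(n_frames, expected_f0[0], expected_f0[-1])
```

Comparing an estimated f0 array against expected_f0 frame by frame gives a quick sanity check; small deviations at the edges are plausible because the first and last frames contain padded samples.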