librosa.yin(y, *, fmin, fmax, sr=22050, frame_length=2048, win_length=None, hop_length=None, trough_threshold=0.1, center=True, pad_mode='constant')

Fundamental frequency (F0) estimation using the YIN algorithm.

YIN is an autocorrelation-based method for fundamental frequency estimation [1]. First, a normalized difference function is computed over short (overlapping) frames of audio. Next, the first minimum of the difference function that falls below trough_threshold is selected as an estimate of the signal’s period. Finally, the estimated period is refined using parabolic interpolation and converted into the corresponding frequency.


[1] De Cheveigné, Alain, and Hideki Kawahara. “YIN, a fundamental frequency estimator for speech and music.” The Journal of the Acoustical Society of America 111.4 (2002): 1917-1930.
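The three steps described above can be sketched for a single frame in plain NumPy. This is a simplified illustration and not librosa's implementation: the function name `yin_frame_sketch`, the direct (non-FFT) difference computation, the way the comparison window is chosen, and the fallback to the global minimum when no trough is below the threshold are all assumptions made for this example.

```python
import numpy as np

def yin_frame_sketch(frame, sr, fmin, fmax, trough_threshold=0.1):
    """Estimate F0 (Hz) for one frame. Hypothetical helper, not librosa's code."""
    frame = np.asarray(frame, dtype=float)
    # Candidate periods (lags, in samples) implied by the frequency range.
    min_period = max(int(np.floor(sr / fmax)), 1)
    max_period = min(int(np.ceil(sr / fmin)), len(frame) // 2)
    win = len(frame) - max_period  # samples compared at every lag

    # Step 1a: difference function d(tau) = sum_j (x[j] - x[j + tau])^2
    d = np.array([
        np.sum((frame[:win] - frame[tau:tau + win]) ** 2)
        for tau in range(max_period + 1)
    ])

    # Step 1b: cumulative mean normalization, with d'(0) defined as 1.
    cmnd = np.ones_like(d)
    lags = np.arange(1, max_period + 1)
    cmnd[1:] = d[1:] * lags / np.maximum(np.cumsum(d[1:]), 1e-12)

    # Step 2: first local minimum below the threshold (assumed fallback:
    # the global minimum over the searched lags).
    troughs = [
        tau for tau in range(min_period, max_period)
        if cmnd[tau] < trough_threshold
        and cmnd[tau] <= cmnd[tau - 1]
        and cmnd[tau] <= cmnd[tau + 1]
    ]
    if troughs:
        tau = troughs[0]
    else:
        tau = min_period + int(np.argmin(cmnd[min_period:max_period]))

    # Step 3: parabolic interpolation around the selected lag.
    a, b, c = cmnd[tau - 1], cmnd[tau], cmnd[tau + 1]
    denom = a - 2.0 * b + c
    shift = 0.5 * (a - c) / denom if denom != 0 else 0.0
    return sr / (tau + shift)

# Usage: one 2048-sample frame of a 220 Hz sine at 22050 Hz.
sr = 22050
t = np.arange(2048) / sr
f0_estimate = yin_frame_sketch(np.sin(2 * np.pi * 220.0 * t), sr, fmin=65.0, fmax=2093.0)
```

For a clean sine wave the estimate should land close to the true frequency. Real signals generally need the full framing and windowing described in the parameters below.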

y: np.ndarray [shape=(…, n)]

audio time series. Multi-channel is supported.

fmin: number > 0 [scalar]

minimum frequency in Hertz. The recommended minimum is librosa.note_to_hz('C2') (~65 Hz) though lower values may be feasible.

fmax: number > 0 [scalar]

maximum frequency in Hertz. The recommended maximum is librosa.note_to_hz('C7') (~2093 Hz) though higher values may be feasible.
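The approximate values quoted for the recommended bounds can be checked with the standard equal-temperament formula. The sketch below assumes A4 = 440 Hz and the MIDI convention in which C4 is note 60 (so C2 is 36 and C7 is 96); `midi_to_hz_sketch` is a hypothetical helper written for this illustration.

```python
def midi_to_hz_sketch(midi_note):
    """Equal-temperament frequency in Hz, assuming A4 (MIDI 69) = 440 Hz."""
    return 440.0 * 2.0 ** ((midi_note - 69) / 12.0)

c2_hz = midi_to_hz_sketch(36)  # C2, the recommended fmin
c7_hz = midi_to_hz_sketch(96)  # C7, the recommended fmax
print(f"C2 = {c2_hz:.1f} Hz, C7 = {c7_hz:.1f} Hz")  # C2 = 65.4 Hz, C7 = 2093.0 Hz
```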

sr: number > 0 [scalar]

sampling rate of y in Hertz.

frame_length: int > 0 [scalar]

length of the frames in samples. By default, frame_length=2048 corresponds to a time scale of about 93 ms at a sampling rate of 22050 Hz.
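The quoted time scale follows directly from dividing the frame length by the sampling rate. As a rough guide only, the same arithmetic shows how many periods of a given fmin fit into one frame. The value 65.4 Hz below is the approximate frequency of C2 and is used purely for illustration.

```python
sr = 22050
frame_length = 2048

frame_ms = 1000.0 * frame_length / sr
print(f"{frame_ms:.1f} ms")  # 92.9 ms

# How many periods of a 65.4 Hz tone (roughly C2) fit in one frame?
fmin = 65.4
periods_per_frame = frame_length / (sr / fmin)
print(f"{periods_per_frame:.1f} periods")
```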

win_length: None or int > 0 [scalar]

length of the window for calculating autocorrelation in samples. If None, defaults to frame_length // 2.

hop_length: None or int > 0 [scalar]

number of audio samples between adjacent YIN predictions. If None, defaults to frame_length // 4.
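The two None defaults can be written out explicitly. The helper below, `resolve_defaults`, is a hypothetical name for this sketch. It mirrors only the rules stated above and is not taken from the library source.

```python
def resolve_defaults(frame_length=2048, win_length=None, hop_length=None):
    """Apply the documented fallbacks for win_length and hop_length."""
    if win_length is None:
        win_length = frame_length // 2
    if hop_length is None:
        hop_length = frame_length // 4
    return win_length, hop_length

win_length, hop_length = resolve_defaults()
print(win_length, hop_length)  # 1024 512

# Time between adjacent F0 estimates at the default sampling rate.
hop_ms = 1000.0 * hop_length / 22050
print(f"{hop_ms:.1f} ms")  # 23.2 ms
```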

trough_threshold: number > 0 [scalar]

absolute threshold for peak estimation.


center: boolean

If True, the signal y is padded so that frame t is centered at y[t * hop_length]. If False, frame t begins at y[t * hop_length]. Defaults to True, which simplifies the alignment of the output onto a time grid by means of librosa.core.frames_to_samples.
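The effect of center on frame count and frame position can be sketched with integer arithmetic. The example assumes that centering pads frame_length // 2 samples on each side and that frames are counted as 1 + (n - frame_length) // hop_length. Both helper names are hypothetical.

```python
def count_frames(n_samples, frame_length=2048, hop_length=512, center=True):
    """Number of frames under the framing rule assumed in this sketch."""
    if center:
        n_samples += 2 * (frame_length // 2)  # padding on both sides
    return 1 + (n_samples - frame_length) // hop_length

def frame_span(t, frame_length=2048, hop_length=512, center=True):
    """Start and end (exclusive) of frame t, in sample indices of the unpadded y."""
    start = t * hop_length - (frame_length // 2 if center else 0)
    return start, start + frame_length

# One second of audio at 22050 Hz.
print(count_frames(22050, center=True))   # 44
print(count_frames(22050, center=False))  # 40

# Frame 3 is centered at sample 1536 when center=True, and begins there otherwise.
print(frame_span(3, center=True))   # (512, 2560)
print(frame_span(3, center=False))  # (1536, 3584)
```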

pad_mode: string or function

If center=True, this argument is passed to np.pad for padding the edges of the signal y. By default (pad_mode="constant"), y is padded on both sides with zeros. If center=False, this argument is ignored. See also: np.pad.
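To make the difference between padding modes concrete, the short example below calls np.pad directly on a toy array. The pad width of 2 is arbitrary and chosen for readability. In the function itself the amount of padding depends on frame_length.

```python
import numpy as np

y = np.array([1.0, 2.0, 3.0, 4.0])

# "constant" pads with zeros by default.
padded_constant = np.pad(y, 2, mode="constant")
print(padded_constant)  # [0. 0. 1. 2. 3. 4. 0. 0.]

# "reflect" mirrors the signal around its edge samples.
padded_reflect = np.pad(y, 2, mode="reflect")
print(padded_reflect)  # [3. 2. 1. 2. 3. 4. 3. 2.]
```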

f0: np.ndarray [shape=(…, n_frames)]

time series of fundamental frequencies in Hertz.

If multi-channel input is provided, f0 curves are estimated separately for each channel.

See also


Fundamental frequency (F0) estimation using probabilistic YIN (pYIN).


Computing a fundamental frequency (F0) curve from an audio input

>>> import librosa
>>> y = librosa.chirp(fmin=440, fmax=880, duration=5.0)
>>> librosa.yin(y, fmin=440, fmax=880)
array([442.66354675, 441.95299983, 441.58010963, ...,
       871.161732  , 873.99001454, 877.04297681])
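To plot or inspect the result, each F0 value needs a timestamp. The sketch below assumes the defaults used in the example (sr=22050, hop_length=512, center=True) and a 5-second input of 110250 samples. `frame_times` is a hypothetical helper that maps frame index t to t * hop_length / sr.

```python
def frame_times(n_frames, sr=22050, hop_length=512):
    """Timestamp in seconds of each frame, assuming center=True."""
    return [t * hop_length / sr for t in range(n_frames)]

n_samples = 5 * 22050              # 5 seconds at the default sampling rate
n_frames = 1 + n_samples // 512    # frame count assumed for center=True
times = frame_times(n_frames)

print(n_frames)             # 216
print(round(times[-1], 3))  # 4.992
```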