You're reading the documentation for a development version. For the latest released version, please have a look at 0.10.2.


librosa.yin(y, *, fmin, fmax, sr=22050, frame_length=2048, win_length=None, hop_length=None, trough_threshold=0.1, center=True, pad_mode='constant')

Fundamental frequency (F0) estimation using the YIN algorithm.

YIN is an autocorrelation-based method for fundamental frequency estimation [1]. First, a normalized difference function is computed over short (overlapping) frames of audio. Next, the first minimum of the difference function that falls below trough_threshold is selected as an estimate of the signal’s period. Finally, the estimated period is refined using parabolic interpolation and converted into the corresponding frequency.
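The three steps above can be sketched for a single frame as follows. This is a simplified illustration, not librosa's implementation: `yin_frame` is a hypothetical helper, the normalization is assumed to be the cumulative-mean normalization commonly associated with YIN, and the difference function is computed by a direct loop rather than via autocorrelation.

```python
import numpy as np


def yin_frame(frame, sr, fmin, fmax, trough_threshold=0.1):
    """Estimate F0 in Hz for one frame (hypothetical helper, illustration only)."""
    frame = np.asarray(frame, dtype=float)
    min_period = max(int(np.floor(sr / fmax)), 1)
    max_period = int(np.ceil(sr / fmin))
    max_lag = max_period + 1  # one extra lag so interpolation has a right neighbour
    win_length = len(frame) - max_lag  # number of samples compared at every lag
    if win_length < 1:
        raise ValueError("frame is too short for the requested fmin")

    # Step 1a: difference function d[tau] = sum_j (x[j] - x[j + tau]) ** 2
    head = frame[:win_length]
    diff = np.array(
        [np.sum((head - frame[tau:tau + win_length]) ** 2) for tau in range(max_lag + 1)]
    )

    # Step 1b: cumulative-mean normalization (assumed), with the lag-0 value set to 1
    cmnd = np.ones_like(diff)
    taus = np.arange(1, max_lag + 1)
    running_sum = np.maximum(np.cumsum(diff[1:]), np.finfo(float).tiny)
    cmnd[1:] = diff[1:] * taus / running_sum

    # Step 2: first lag below the threshold, followed down to the bottom of its trough;
    # if no lag is below the threshold, fall back to the global minimum
    search = cmnd[min_period:max_period + 1]
    below = np.flatnonzero(search < trough_threshold)
    if below.size:
        tau = int(below[0]) + min_period
        while tau < max_period and cmnd[tau + 1] < cmnd[tau]:
            tau += 1
    else:
        tau = int(np.argmin(search)) + min_period

    # Step 3: parabolic interpolation through the trough and its two neighbours
    a, b, c = cmnd[tau - 1], cmnd[tau], cmnd[tau + 1]
    denom = a - 2.0 * b + c
    shift = 0.5 * (a - c) / denom if denom > 0 else 0.0
    shift = float(np.clip(shift, -0.5, 0.5))

    # Convert the refined period (in samples) into a frequency in Hertz
    return sr / (tau + shift)


sr = 22050
t = np.arange(2048) / sr
f0 = yin_frame(np.sin(2 * np.pi * 220.0 * t), sr=sr, fmin=65.0, fmax=2093.0)
print(f0)  # expected to be close to 220 Hz for this pure tone
```

For real signals, librosa.yin applies the same idea to every overlapping frame; the sketch handles only one frame and ignores windowing and multi-channel input.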

Parameters

y: np.ndarray [shape=(…, n)]

audio time series. Multi-channel is supported.

fmin: number > 0 [scalar]

minimum frequency in Hertz. The recommended minimum is librosa.note_to_hz('C2') (~65 Hz) though lower values may be feasible.

fmax: number > fmin, <= sr/2 [scalar]

maximum frequency in Hertz. The recommended maximum is librosa.note_to_hz('C7') (~2093 Hz) though higher values may be feasible.

sr: number > 0 [scalar]

sampling rate of y in Hertz.

frame_length: int > 0 [scalar]

length of the frames in samples. By default, frame_length=2048 corresponds to a time scale of about 93 ms at a sampling rate of 22050 Hz.

win_length: None or int > 0 [scalar]

length of the window for calculating autocorrelation in samples. If None, defaults to frame_length // 2.

hop_length: None or int > 0 [scalar]

number of audio samples between adjacent YIN predictions. If None, defaults to frame_length // 4.

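trough_threshold: number > 0 [scalar]

absolute threshold for peak estimation. The first minimum of the normalized difference function that falls below this value is taken as the period estimate.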


center: boolean

If True, the signal y is padded so that frame t (the frame that yields f0[..., t]) is centered at y[t * hop_length]. If False, frame t begins at y[t * hop_length]. Defaults to True, which simplifies the alignment of f0 onto a time grid by means of librosa.core.frames_to_samples.

pad_mode: string or function

If center=True, this argument is passed to np.pad for padding the edges of the signal y. By default (pad_mode="constant"), y is padded on both sides with zeros. If center=False, this argument is ignored. See also: np.pad
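The numbers quoted in the parameter descriptions can be checked with a little arithmetic. The sketch below is illustrative only: `midi_to_hz` is a hypothetical stand-in for librosa.note_to_hz that assumes equal temperament with A4 = 440 Hz, and the padding width of frame_length // 2 on each side is an assumption about how centering is typically done, not something stated above.

```python
import numpy as np

sr = 22050
frame_length = 2048


def midi_to_hz(midi_note):
    """Equal-tempered frequency of a MIDI note number (A4 = MIDI 69 = 440 Hz)."""
    return 440.0 * 2.0 ** ((midi_note - 69) / 12)


# Recommended search range: C2 (MIDI 36) up to C7 (MIDI 96)
fmin = midi_to_hz(36)  # about 65 Hz
fmax = midi_to_hz(96)  # about 2093 Hz

# The same range expressed as periods in samples
shortest_period = sr / fmax
longest_period = sr / fmin

# Time scale of one frame and the documented defaults for the other lengths
frame_ms = 1000.0 * frame_length / sr  # about 93 ms
win_length = frame_length // 2         # default when win_length=None
hop_length = frame_length // 4         # default when hop_length=None
hop_ms = 1000.0 * hop_length / sr

# pad_mode="constant": zeros are added on both sides of the signal
y = np.ones(sr)  # one second of dummy audio
y_padded = np.pad(y, frame_length // 2, mode="constant")

print(round(fmin, 2), round(fmax, 2))
print(round(shortest_period, 1), round(longest_period, 1))
print(round(frame_ms, 1), win_length, hop_length, round(hop_ms, 1))
print(len(y), len(y_padded))
```

Note that the longest period (roughly 337 samples at C2) must fit comfortably inside a frame, which is one reason very low fmin values call for a larger frame_length.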

Returns

f0: np.ndarray [shape=(…, n_frames)]

time series of fundamental frequencies in Hertz.

If multi-channel input is provided, f0 curves are estimated separately for each channel.
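To relate the length of f0 to the input, the sketch below estimates n_frames and the sample position of each frame. It assumes the usual framing arithmetic (1 + n // hop_length frames when center=True, and 1 + (n - frame_length) // hop_length when center=False); `expected_n_frames` and `frame_positions` are hypothetical helpers written for this illustration, not librosa functions.

```python
def expected_n_frames(n_samples, frame_length=2048, hop_length=None, center=True):
    """Number of frames under the assumed framing arithmetic (illustration only)."""
    if hop_length is None:
        hop_length = frame_length // 4  # documented default
    if center:
        # Padding by frame_length // 2 on each side is assumed here
        return 1 + n_samples // hop_length
    return 1 + (n_samples - frame_length) // hop_length


def frame_positions(n_frames, hop_length):
    """Sample index associated with each frame: t * hop_length."""
    return [t * hop_length for t in range(n_frames)]


sr = 22050
n_samples = 5 * sr  # a 5-second signal at 22050 Hz

n_centered = expected_n_frames(n_samples, center=True)
n_uncentered = expected_n_frames(n_samples, center=False)
positions = frame_positions(n_centered, hop_length=512)
times = [p / sr for p in positions]  # frame times in seconds

# A stereo input of shape (2, n_samples) would give f0 of shape (2, n_frames)
stereo_f0_shape = (2, n_centered)

print(n_centered, n_uncentered)
print(positions[:3], round(times[-1], 3))
print(stereo_f0_shape)
```

With center=True the positions are frame centers; with center=False the same indices mark where each frame begins.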

See also


librosa.pyin: Fundamental frequency (F0) estimation using probabilistic YIN (pYIN).


Examples

Computing a fundamental frequency (F0) curve from an audio input

>>> import librosa
>>> y = librosa.chirp(fmin=440, fmax=880, duration=5.0, sr=22050)
>>> librosa.yin(y, fmin=440, fmax=880, sr=22050)
array([442.66354675, 441.95299983, 441.58010963, ...,
       871.161732  , 873.99001454, 877.04297681])