librosa.yin

librosa.yin(y, *, fmin, fmax, sr=22050, frame_length=2048, win_length=None, hop_length=None, trough_threshold=0.1, center=True, pad_mode='constant')

Fundamental frequency (F0) estimation using the YIN algorithm.

YIN is an autocorrelation-based method for fundamental frequency estimation [1]. First, a normalized difference function is computed over short (overlapping) frames of audio. Next, the first minimum in the difference function below trough_threshold is selected as an estimate of the signal’s period. Finally, the estimated period is refined using parabolic interpolation before being converted into the corresponding frequency.

[1] De Cheveigné, Alain, and Hideki Kawahara. “YIN, a fundamental frequency estimator for speech and music.” The Journal of the Acoustical Society of America 111.4 (2002): 1917-1930.
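
The three steps described above can be sketched for a single frame in plain NumPy. This is an illustrative simplification, not librosa's implementation: the helper name yin_single_frame is hypothetical, the window is assumed to be half the frame, the difference function is computed with a direct loop, and the fallback to the global minimum when no trough is below the threshold is an assumption of this sketch.

```python
import numpy as np


def yin_single_frame(frame, sr, fmin, fmax, trough_threshold=0.1):
    """Hypothetical helper: simplified YIN estimate (in Hz) for one frame."""
    frame = np.asarray(frame, dtype=float)
    win_length = len(frame) // 2  # assumption: window is half the frame

    # Candidate periods (in samples) implied by the frequency bounds.
    min_period = max(int(np.floor(sr / fmax)), 1)
    max_period = min(int(np.ceil(sr / fmin)), len(frame) - win_length - 1)

    # Step 1a: difference function d(tau) = sum_j (x[j] - x[j + tau])^2.
    # One extra lag is computed so the last candidate has a right neighbor.
    taus = np.arange(max_period + 2)
    ref = frame[:win_length]
    diff = np.array(
        [np.sum((ref - frame[tau:tau + win_length]) ** 2) for tau in taus]
    )

    # Step 1b: cumulative mean normalization, with d'(0) defined as 1.
    cmnd = np.ones_like(diff)
    running_sum = np.cumsum(diff[1:])
    cmnd[1:] = diff[1:] * np.arange(1, len(diff)) / np.maximum(running_sum, 1e-12)

    # Step 2: first local minimum (trough) below the threshold.
    candidate = None
    for tau in range(min_period, max_period + 1):
        is_trough = cmnd[tau] <= cmnd[tau - 1] and cmnd[tau] < cmnd[tau + 1]
        if is_trough and cmnd[tau] < trough_threshold:
            candidate = tau
            break
    if candidate is None:
        # Assumption of this sketch: fall back to the global minimum.
        candidate = min_period + int(np.argmin(cmnd[min_period:max_period + 1]))

    # Step 3: parabolic interpolation around the selected lag.
    a, b, c = cmnd[candidate - 1], cmnd[candidate], cmnd[candidate + 1]
    denom = a - 2.0 * b + c
    shift = 0.5 * (a - c) / denom if denom != 0 else 0.0
    period = candidate + float(np.clip(shift, -1.0, 1.0))

    return sr / period


# Usage: one 2048-sample frame of a 220 Hz sine at 22050 Hz.
sr = 22050
t = np.arange(2048) / sr
frame = np.sin(2 * np.pi * 220.0 * t)
f0_estimate = yin_single_frame(frame, sr=sr, fmin=65.0, fmax=2093.0)
print(f0_estimate)
```

For a clean sine, the interpolated period should land very close to the true period of sr / 220 samples. Real recordings are noisier, which is where the choice of trough_threshold matters.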

Parameters

y: np.ndarray [shape=(…, n)]: audio time series. Multi-channel is supported.
fmin: number > 0 [scalar]: minimum frequency in Hertz. The recommended minimum is librosa.note_to_hz('C2') (~65 Hz), though lower values may be feasible.
fmax: number > 0 [scalar]: maximum frequency in Hertz. The recommended maximum is librosa.note_to_hz('C7') (~2093 Hz), though higher values may be feasible.
sr: number > 0 [scalar]: sampling rate of y in Hertz.
frame_length: int > 0 [scalar]: length of the frames in samples. By default, frame_length=2048 corresponds to a time scale of about 93 ms at a sampling rate of 22050 Hz.
win_length: None or int > 0 [scalar]: length of the window for calculating autocorrelation in samples. If None, defaults to frame_length // 2.
hop_length: None or int > 0 [scalar]: number of audio samples between adjacent YIN predictions. If None, defaults to frame_length // 4.
trough_threshold: number > 0 [scalar]: absolute threshold for peak estimation.
center: boolean: If True, the signal y is padded so that frame D[:, t] is centered at y[t * hop_length]. If False, then D[:, t] begins at y[t * hop_length]. Defaults to True, which simplifies the alignment of D onto a time grid by means of librosa.core.frames_to_samples.
pad_mode: string or function: If center=True, this argument is passed to np.pad for padding the edges of the signal y. By default (pad_mode="constant"), y is padded on both sides with zeros. If center=False, this argument is ignored. See also: np.pad
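
The documented defaults for win_length and hop_length, and the time scale quoted for frame_length, follow from simple arithmetic. The helper below is a hypothetical illustration of that arithmetic only; it is not part of librosa.

```python
def resolve_yin_framing(sr=22050, frame_length=2048, win_length=None, hop_length=None):
    """Hypothetical helper: apply the documented defaults and report time scales."""
    if win_length is None:
        win_length = frame_length // 2  # documented default
    if hop_length is None:
        hop_length = frame_length // 4  # documented default
    return {
        "win_length": win_length,
        "hop_length": hop_length,
        # Duration of one frame and of one hop, in milliseconds.
        "frame_ms": 1000.0 * frame_length / sr,
        "hop_ms": 1000.0 * hop_length / sr,
    }


framing = resolve_yin_framing()
print(framing)
```

With the default arguments this gives win_length=1024 and hop_length=512, and a frame duration of about 93 ms, matching the figures stated in the parameter descriptions.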

Returns

f0: np.ndarray [shape=(…, n_frames)]

time series of fundamental frequencies in Hertz.

If multi-channel input is provided, f0 curves are estimated separately for each channel.