librosa.piptrack
- librosa.piptrack(*, y=None, sr=22050, S=None, n_fft=2048, hop_length=None, fmin=150.0, fmax=4000.0, threshold=0.1, win_length=None, window='hann', center=True, pad_mode='constant', ref=None)
Pitch tracking on thresholded parabolically-interpolated STFT.
This implementation uses the parabolic interpolation method described by [1].
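As an illustration of the idea, the following is a minimal sketch of a standard three-point parabolic fit around a spectral peak. It is not librosa's own code: the helper names parabolic_peak and bin_to_hz are hypothetical, and the sketch assumes the three samples are equally spaced FFT bins with the peak at the middle one.

```python
import numpy as np


def parabolic_peak(left, center, right):
    """Fit a parabola through three equally spaced samples around a peak.

    Returns (offset, height). offset is the vertex position relative to the
    center sample, in bins (within [-0.5, 0.5] when center is the largest of
    the three). height is the value of the parabola at its vertex.
    """
    denom = left - 2.0 * center + right
    if denom == 0:
        # Collinear samples have no curvature, so there is nothing to refine.
        return 0.0, float(center)
    offset = 0.5 * (left - right) / denom
    height = center - 0.25 * (left - right) * offset
    return float(offset), float(height)


def bin_to_hz(k, offset, sr=22050, n_fft=2048):
    """Convert a fractional FFT bin index (k + offset) to a frequency in Hz."""
    return (k + offset) * sr / n_fft


# Three samples of the parabola y = 5 - (x - 3.3)**2 at x = 2, 3, 4.
# The true vertex is at x = 3.3 with height 5, i.e. 0.3 bins above bin 3.
xs = np.array([2.0, 3.0, 4.0])
left, center, right = 5.0 - (xs - 3.3) ** 2
offset, height = parabolic_peak(left, center, right)
freq = bin_to_hz(3, offset)
```

Because the test samples lie on an exact parabola, the fit recovers the vertex exactly (up to rounding). Real spectral peaks are only approximately parabolic, so the refined frequency is an estimate.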
- Parameters:
- y np.ndarray [shape=(…, n)] or None
audio signal. Multi-channel is supported.
- sr number > 0 [scalar]
audio sampling rate of y
- S np.ndarray [shape=(…, d, t)] or None
magnitude or power spectrogram
- n_fft int > 0 [scalar] or None
number of FFT bins to use, if y is provided.
- hop_length int > 0 [scalar] or None
number of samples to hop
- threshold float in (0, 1)
A bin in spectrum S is considered a pitch when it is greater than threshold * ref(S).
By default, ref(S) is taken to be max(S, axis=0) (the maximum value in each column).
- fmin float > 0 [scalar]
lower frequency cutoff.
- fmax float > 0 [scalar]
upper frequency cutoff.
- win_length int <= n_fft [scalar]
Each frame of audio is windowed by window. The window will be of length win_length and then padded with zeros to match n_fft.
If unspecified, defaults to win_length = n_fft.
- window string, tuple, number, function, or np.ndarray [shape=(n_fft,)]
a window specification (string, tuple, or number); see scipy.signal.get_window
a window function, such as scipy.signal.windows.hann
a vector or array of length n_fft
- center boolean
If True, the signal y is padded so that frame t is centered at y[t * hop_length].
If False, then frame t begins at y[t * hop_length].
- pad_mode string
If center=True, the padding mode to use at the edges of the signal. By default, STFT uses zero-padding.
See also: np.pad.
- ref scalar or callable [default=np.max]
If scalar, the reference value against which S is compared for determining pitches.
If callable, the reference value is computed as ref(S, axis=0).
- Returns:
- pitches, magnitudes np.ndarray [shape=(…, d, t)]
Where d is the subset of FFT bins within fmin and fmax.
pitches[..., f, t] contains instantaneous frequency at bin f, time t.
magnitudes[..., f, t] contains the corresponding magnitudes.
Both pitches and magnitudes take value 0 at bins of non-maximal magnitude.
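The thresholding rule described for threshold and ref can be sketched in plain NumPy. This is only an illustration of the comparison S > threshold * ref(S, axis=0), not the library's implementation: the helper above_threshold is hypothetical, it assumes a 2-D array of shape (d, t) and a callable ref, and it leaves out the local-maximum selection and the fmin/fmax restriction that the full function also applies.

```python
import numpy as np


def above_threshold(S, threshold=0.1, ref=np.max):
    """Boolean mask of bins that exceed threshold * ref(S, axis=0).

    Assumes S has shape (d, t) and that ref is a callable accepting axis=0.
    """
    S = np.abs(np.asarray(S, dtype=float))
    ref_value = threshold * ref(S, axis=0)  # one reference value per frame
    return S > ref_value[np.newaxis, :]     # broadcast over the frequency axis


# Toy spectrogram with 3 frequency bins (rows) and 2 frames (columns).
S_demo = np.array([[1.00, 0.15],
                   [0.05, 2.00],
                   [0.50, 0.10]])

# Default rule: keep bins above 10% of each frame's maximum.
mask_default = above_threshold(S_demo)

# Alternate rule: keep bins above each frame's mean magnitude.
mask_mean = above_threshold(S_demo, threshold=1, ref=np.mean)
```

With the mean as reference and threshold=1, fewer bins pass than with the default 10%-of-maximum rule, which is the trade-off the ref parameter exposes.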
Notes
This function caches at level 30.
One of S or y must be provided. If S is not given, it is computed from y using the default parameters of librosa.stft.
Examples
Computing pitches from a waveform input
>>> import numpy as np
>>> import librosa
>>> y, sr = librosa.load(librosa.ex('trumpet'))
>>> pitches, magnitudes = librosa.piptrack(y=y, sr=sr)
Or from a spectrogram input
>>> S = np.abs(librosa.stft(y))
>>> pitches, magnitudes = librosa.piptrack(S=S, sr=sr)
Or with an alternate reference value for pitch detection, where values above the mean spectral energy in each frame are counted as pitches
>>> pitches, magnitudes = librosa.piptrack(S=S, sr=sr, threshold=1,
...                                        ref=np.mean)
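Since both returned arrays are zero everywhere except at detected peaks, a common follow-up step is to reduce them to one pitch per frame. The sketch below shows one way this might be done, by picking the bin with the largest magnitude in each frame. The helper strongest_pitch_per_frame is hypothetical (not part of librosa), and the small arrays are made-up stand-ins for real piptrack output so the block runs without audio data.

```python
import numpy as np


def strongest_pitch_per_frame(pitches, magnitudes):
    """Return the pitch (Hz) of the largest-magnitude bin in each frame.

    Inputs have shape (..., d, t); the result has shape (..., t).
    Frames with no detected pitch (all magnitudes zero) yield 0.
    """
    idx = np.argmax(magnitudes, axis=-2)  # strongest bin index per frame
    best = np.take_along_axis(pitches, np.expand_dims(idx, -2), axis=-2)
    return np.squeeze(best, axis=-2)


# With librosa available, the inputs would come from a call such as:
#   pitches, magnitudes = librosa.piptrack(y=y, sr=sr)
# Here, synthetic arrays with 4 bins and 3 frames are used instead.
demo_pitches = np.array([[0.0,   0.0,   0.0],
                         [220.5, 0.0,   0.0],
                         [0.0,   441.2, 0.0],
                         [880.0, 0.0,   0.0]])
demo_magnitudes = np.array([[0.0, 0.0, 0.0],
                            [0.9, 0.0, 0.0],
                            [0.0, 0.4, 0.0],
                            [0.3, 0.0, 0.0]])

f0 = strongest_pitch_per_frame(demo_pitches, demo_magnitudes)
```

Taking the single strongest peak is a simplification. It can jump between harmonics from frame to frame, so it is better treated as a quick inspection tool than as a robust fundamental-frequency estimate.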