You're reading an old version of this documentation. If you want up-to-date information, please have a look at 0.10.2.

Core IO and DSP

Audio loading

load(path[, sr, mono, offset, duration, ...])

Load an audio file as a floating point time series.

stream(path, block_length, frame_length, ...)

Stream audio in fixed-length buffers.


Convert an audio signal to mono by averaging samples across channels.
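The averaging described above can be sketched in plain Python. This is only an illustration, not librosa's implementation: `to_mono_sketch` is a hypothetical helper, and the channel-first list-of-lists layout is an assumption made for the example.

```python
def to_mono_sketch(channels):
    """Average equal-length channels sample by sample (hypothetical helper)."""
    n_channels = len(channels)
    # zip(*channels) groups the samples that share a time index.
    return [sum(frame) / n_channels for frame in zip(*channels)]


# Two channels, three samples each.
stereo = [[1.0, 0.5, -1.0], [0.0, 0.5, 1.0]]
mono = to_mono_sketch(stereo)
```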

resample(y, orig_sr, target_sr[, res_type, ...])

Resample a time series from orig_sr to target_sr

get_duration([y, sr, S, n_fft, hop_length, ...])

Compute the duration (in seconds) of an audio time series, feature matrix, or filename.


Get the sampling rate for a given file.

Time-domain processing

autocorrelate(y[, max_size, axis])

Bounded-lag auto-correlation

lpc(y, order)

Linear Prediction Coefficients via Burg's method

zero_crossings(y[, threshold, ...])

Find the zero-crossings of a signal y: indices i such that sign(y[i]) != sign(y[i-1]).
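As a rough sketch of the idea, the snippet below compares the sign of each sample with the sign of the sample before it. `zero_crossings_sketch` is a hypothetical name, it returns plain indices, and it ignores the extra options listed in the signature above (such as threshold), so treat it as an illustration rather than a drop-in replacement.

```python
def zero_crossings_sketch(y):
    """Return indices i >= 1 where the sign of y[i] differs from y[i - 1]."""
    # Treat a sample as "negative" only when it is strictly below zero.
    negative = [value < 0 for value in y]
    return [i for i in range(1, len(y)) if negative[i] != negative[i - 1]]


signal = [1.0, 0.5, -0.5, -1.0, 2.0]
crossings = zero_crossings_sketch(signal)
```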

mu_compress(x[, mu, quantize])

mu-law compression

mu_expand(x[, mu, quantize])

mu-law expansion
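For readers unfamiliar with mu-law companding, here is a minimal sketch of the continuous (non-quantized) formulas for a single sample in [-1, 1]. The function names are hypothetical, the default mu=255 is an assumption chosen for the example, and the quantization step is left out.

```python
import math


def mu_compress_sketch(x, mu=255):
    """Continuous mu-law compression: sign(x) * ln(1 + mu|x|) / ln(1 + mu)."""
    return math.copysign(math.log1p(mu * abs(x)) / math.log1p(mu), x)


def mu_expand_sketch(y, mu=255):
    """Inverse mapping: sign(y) * ((1 + mu)^|y| - 1) / mu."""
    return math.copysign(((1 + mu) ** abs(y) - 1) / mu, y)


compressed = mu_compress_sketch(0.3)
restored = mu_expand_sketch(compressed)
```

Expanding a compressed value should return the original sample up to floating-point error, which is a quick way to check that the two formulas are consistent.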

Signal generation

clicks([times, frames, sr, hop_length, ...])

Construct a "click track".

tone(frequency[, sr, length, duration, phi])

Construct a pure tone (cosine) signal at a given frequency.
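A pure cosine tone is simple enough to sketch directly. The helper below is hypothetical and uses only the standard library; the phi=0.0 default is an assumption for the example and is not necessarily the library's default.

```python
import math


def tone_sketch(frequency, sr, length, phi=0.0):
    """Sample cos(2*pi*frequency*t + phi) at t = n / sr for n in [0, length)."""
    return [
        math.cos(2.0 * math.pi * frequency * n / sr + phi)
        for n in range(length)
    ]


# 100 samples of a 440 Hz tone at a 22050 Hz sampling rate.
signal = tone_sketch(440.0, sr=22050, length=100)
```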

chirp(fmin, fmax[, sr, length, duration, ...])

Construct a "chirp" or "sine-sweep" signal.

Spectral representations

stft(y[, n_fft, hop_length, win_length, ...])

Short-time Fourier transform (STFT).

istft(stft_matrix[, hop_length, win_length, ...])

Inverse short-time Fourier transform (ISTFT).

reassigned_spectrogram(y[, sr, S, n_fft, ...])

Time-frequency reassigned spectrogram.

cqt(y[, sr, hop_length, fmin, n_bins, ...])

Compute the constant-Q transform of an audio signal.

icqt(C[, sr, hop_length, fmin, ...])

Compute the inverse constant-Q transform.

hybrid_cqt(y[, sr, hop_length, fmin, ...])

Compute the hybrid constant-Q transform of an audio signal.

pseudo_cqt(y[, sr, hop_length, fmin, ...])

Compute the pseudo constant-Q transform of an audio signal.

vqt(y[, sr, hop_length, fmin, n_bins, ...])

Compute the variable-Q transform of an audio signal.

iirt(y[, sr, win_length, hop_length, ...])

Time-frequency representation using IIR filters

fmt(y[, t_min, n_fmt, kind, beta, ...])

The fast Mellin transform (FMT)

magphase(D[, power])

Separate a complex-valued spectrogram D into its magnitude (S) and phase (P) components, so that D = S * P.
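The decomposition D = S * P can be checked on a single complex value. This is a sketch with a hypothetical helper name; it handles one number rather than a whole spectrogram, and the identity is only expected to hold when power is 1.

```python
import cmath


def magphase_sketch(d, power=1):
    """Split one complex value into magnitude**power and a unit-modulus phase."""
    magnitude = abs(d)
    # exp(1j * angle) has modulus 1 and carries only the phase of d.
    phase = cmath.exp(1j * cmath.phase(d))
    return magnitude ** power, phase


d = 3 + 4j
S, P = magphase_sketch(d)
reconstructed = S * P
```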

Phase recovery

griffinlim(S[, n_iter, hop_length, ...])

Approximate magnitude spectrogram inversion using the "fast" Griffin-Lim algorithm.

griffinlim_cqt(C[, n_iter, sr, hop_length, ...])

Approximate constant-Q magnitude spectrogram inversion using the "fast" Griffin-Lim algorithm.


interp_harmonics(x, freqs, h_range[, kind, ...])

Compute the energy at harmonics of a time-frequency representation.

salience(S, freqs, h_range[, weights, ...])

Harmonic salience function.

phase_vocoder(D, rate[, hop_length])

Phase vocoder.

Magnitude scaling

amplitude_to_db(S[, ref, amin, top_db])

Convert an amplitude spectrogram to dB-scaled spectrogram.

db_to_amplitude(S_db[, ref])

Convert a dB-scaled spectrogram to an amplitude spectrogram.

power_to_db(S[, ref, amin, top_db])

Convert a power spectrogram (amplitude squared) to decibel (dB) units

db_to_power(S_db[, ref])

Convert a dB-scale spectrogram to a power spectrogram.
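The four conversions above come down to the usual 20*log10 (amplitude) and 10*log10 (power) relations. The scalar sketch below illustrates them under stated assumptions: the helper names are hypothetical, the amin floors are values chosen for the example, and the top_db clipping from the signatures is omitted.

```python
import math


def amplitude_to_db_sketch(s, ref=1.0, amin=1e-5):
    """20 * log10(|s| / ref), with both terms floored at amin."""
    return 20.0 * math.log10(max(amin, abs(s))) - 20.0 * math.log10(max(amin, ref))


def db_to_amplitude_sketch(s_db, ref=1.0):
    """Inverse of the amplitude conversion: ref * 10^(s_db / 20)."""
    return ref * 10.0 ** (s_db / 20.0)


def power_to_db_sketch(s, ref=1.0, amin=1e-10):
    """10 * log10(s / ref), with both terms floored at amin."""
    return 10.0 * math.log10(max(amin, s)) - 10.0 * math.log10(max(amin, ref))


def db_to_power_sketch(s_db, ref=1.0):
    """Inverse of the power conversion: ref * 10^(s_db / 10)."""
    return ref * 10.0 ** (s_db / 10.0)


amp_db = amplitude_to_db_sketch(10.0)   # amplitude 10 relative to ref 1
pow_db = power_to_db_sketch(100.0)      # the same signal expressed as power
```

Because power is amplitude squared, an amplitude of 10 and a power of 100 map to the same dB value.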

perceptual_weighting(S, frequencies[, kind])

Perceptual weighting of a power spectrogram.

frequency_weighting(frequencies[, kind])

Compute the weighting of a set of frequencies.

multi_frequency_weighting(frequencies[, kinds])

Compute multiple weightings of a set of frequencies.

A_weighting(frequencies[, min_db])

Compute the A-weighting of a set of frequencies.

B_weighting(frequencies[, min_db])

Compute the B-weighting of a set of frequencies.

C_weighting(frequencies[, min_db])

Compute the C-weighting of a set of frequencies.

D_weighting(frequencies[, min_db])

Compute the D-weighting of a set of frequencies.

pcen(S[, sr, hop_length, gain, bias, power, ...])

Per-channel energy normalization (PCEN)

Time unit conversion

frames_to_samples(frames[, hop_length, n_fft])

Converts frame indices to audio sample indices.

frames_to_time(frames[, sr, hop_length, n_fft])

Converts frame counts to time (seconds).

samples_to_frames(samples[, hop_length, n_fft])

Converts sample indices into STFT frames.

samples_to_time(samples[, sr])

Convert sample indices to time (in seconds).

time_to_frames(times[, sr, hop_length, n_fft])

Converts time stamps into STFT frames.

time_to_samples(times[, sr])

Convert timestamps (in seconds) to sample indices.
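The frame, sample, and time conversions listed above are related by simple arithmetic. The sketch below works on single integers and floats; the helper names are hypothetical, and the defaults (sr=22050, hop_length=512) and the n_fft // 2 centering offset are assumptions made for the example.

```python
def frames_to_samples_sketch(frame, hop_length=512, n_fft=None):
    """Frame index -> sample index, optionally offset by half an FFT window."""
    offset = n_fft // 2 if n_fft is not None else 0
    return frame * hop_length + offset


def samples_to_frames_sketch(sample, hop_length=512, n_fft=None):
    """Sample index -> index of the frame that contains it."""
    offset = n_fft // 2 if n_fft is not None else 0
    return (sample - offset) // hop_length


def samples_to_time_sketch(sample, sr=22050):
    """Sample index -> time in seconds."""
    return sample / sr


def time_to_samples_sketch(time, sr=22050):
    """Time in seconds -> sample index (truncated toward zero)."""
    return int(time * sr)


def frames_to_time_sketch(frame, sr=22050, hop_length=512, n_fft=None):
    """Frame index -> time in seconds, by way of sample indices."""
    return samples_to_time_sketch(
        frames_to_samples_sketch(frame, hop_length, n_fft), sr
    )


sample_index = frames_to_samples_sketch(10)
frame_time = frames_to_time_sketch(43)
```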

blocks_to_frames(blocks, block_length)

Convert block indices to frame indices

blocks_to_samples(blocks, block_length, ...)

Convert block indices to sample indices

blocks_to_time(blocks, block_length, ...)

Convert block indices to time (in seconds)

Frequency unit conversion

hz_to_note(frequencies, **kwargs)

Convert one or more frequencies (in Hz) to the nearest note names.


Get MIDI note number(s) for given frequencies

hz_to_svara_h(frequencies, Sa[, abbr, ...])

Convert frequencies (in Hz) to Hindustani svara

hz_to_svara_c(frequencies, Sa, mela[, abbr, ...])

Convert frequencies (in Hz) to Carnatic svara


Get the frequency (Hz) of MIDI note(s)
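The mapping between frequency in Hz and MIDI note numbers follows the standard equal-temperament relation with A4 = 440 Hz = MIDI 69. A minimal sketch for single values, using hypothetical helper names:

```python
import math


def hz_to_midi_sketch(frequency):
    """Fractional MIDI note number for a frequency in Hz (A440 reference)."""
    return 12.0 * (math.log2(frequency) - math.log2(440.0)) + 69.0


def midi_to_hz_sketch(midi):
    """Frequency in Hz for a (possibly fractional) MIDI note number."""
    return 440.0 * 2.0 ** ((midi - 69.0) / 12.0)


a4_midi = hz_to_midi_sketch(440.0)
middle_c_hz = midi_to_hz_sketch(60)
```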

midi_to_note(midi[, octave, cents, key, unicode])

Convert one or more MIDI numbers to note strings.

midi_to_svara_h(midi, Sa[, abbr, octave, ...])

Convert MIDI numbers to Hindustani svara

midi_to_svara_c(midi, Sa, mela[, abbr, ...])

Convert MIDI numbers to Carnatic svara within a given melakarta raga

note_to_hz(note, **kwargs)

Convert one or more note names to frequency (Hz)

note_to_midi(note[, round_midi])

Convert one or more spelled notes to MIDI number(s).

note_to_svara_h(notes, Sa[, abbr, octave, ...])

Convert western notes to Hindustani svara

note_to_svara_c(notes, Sa, mela[, abbr, ...])

Convert western notes to Carnatic svara

hz_to_mel(frequencies[, htk])

Convert Hz to Mels

hz_to_octs(frequencies[, tuning, ...])

Convert frequencies (Hz) to (fractional) octave numbers.

mel_to_hz(mels[, htk])

Convert mel bin numbers to frequencies
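The mel scale has more than one formulation. The sketch below shows only the HTK-style formula, which is presumably what the htk flag in the signatures above selects; the behaviour without that flag is not covered here, and the helper names are hypothetical.

```python
import math


def hz_to_mel_htk_sketch(frequency):
    """HTK-style mel value: 2595 * log10(1 + f / 700)."""
    return 2595.0 * math.log10(1.0 + frequency / 700.0)


def mel_to_hz_htk_sketch(mel):
    """Inverse of the HTK-style formula: 700 * (10^(mel / 2595) - 1)."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


mel_1k = hz_to_mel_htk_sketch(1000.0)
round_trip = mel_to_hz_htk_sketch(mel_1k)
```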

octs_to_hz(octs[, tuning, bins_per_octave])

Convert octave numbers to frequencies.

A4_to_tuning(A4[, bins_per_octave])

Convert a reference pitch frequency (e.g., A4=435) to a tuning estimation, in fractions of a bin per octave.

tuning_to_A4(tuning[, bins_per_octave])

Convert a tuning deviation (from 0) in fractions of a bin per octave (e.g., tuning=-0.1) to a reference pitch frequency relative to A440.
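The relationship between a reference pitch and a tuning deviation can be sketched as a base-2 logarithm relative to 440 Hz. The helper names below are hypothetical, and bins_per_octave=12 is an assumed default for the example.

```python
import math


def a4_to_tuning_sketch(a4, bins_per_octave=12):
    """Tuning deviation, in fractions of a bin, of a reference pitch vs. 440 Hz."""
    return bins_per_octave * (math.log2(a4) - math.log2(440.0))


def tuning_to_a4_sketch(tuning, bins_per_octave=12):
    """Reference pitch (Hz) implied by a tuning deviation in fractions of a bin."""
    return 440.0 * 2.0 ** (tuning / bins_per_octave)


tuning_435 = a4_to_tuning_sketch(435.0)
a4_back = tuning_to_a4_sketch(tuning_435)
```

A reference pitch below 440 Hz gives a negative deviation, and converting back recovers the original pitch up to rounding.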

Music notation

key_to_notes(key[, unicode])

Lists all 12 note names in the chromatic scale, as spelled according to a given key (major or minor).


Construct the diatonic scale degrees for a given key.

mela_to_svara(mela[, abbr, unicode])

Spell the Carnatic svara names for a given melakarta raga


Construct the svara indices (degrees) for a given melakarta raga


Construct the svara indices (degrees) for a given thaat


List melakarta ragas by name and index.


List supported thaats by name.

Frequency range generation

fft_frequencies([sr, n_fft])

Alternative implementation of np.fft.fftfreq
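As an illustration of how FFT bin indices map to frequencies, the sketch below lists the non-negative bin center frequencies for a real-valued signal. The helper name is hypothetical, and the defaults sr=22050 and n_fft=2048 are assumptions for the example.

```python
def fft_frequencies_sketch(sr=22050, n_fft=2048):
    """Center frequencies (Hz) of the 1 + n_fft // 2 non-negative FFT bins."""
    return [k * sr / n_fft for k in range(1 + n_fft // 2)]


freqs = fft_frequencies_sketch()
```

The first bin sits at 0 Hz and the last at the Nyquist frequency, sr / 2.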

cqt_frequencies(n_bins, fmin[, ...])

Compute the center frequencies of Constant-Q bins.

mel_frequencies([n_mels, fmin, fmax, htk])

Compute an array of acoustic frequencies tuned to the mel scale.

tempo_frequencies(n_bins[, hop_length, sr])

Compute the frequencies (in beats per minute) corresponding to an onset auto-correlation or tempogram matrix.
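For an auto-correlation tempogram, each bin corresponds to a lag measured in frames, and a lag converts to beats per minute as 60 * sr / (hop_length * lag). The following sketch uses a hypothetical helper name and assumed defaults; it treats lag 0 as an infinite tempo, since a zero-length period has no finite rate.

```python
import math


def tempo_frequencies_sketch(n_bins, hop_length=512, sr=22050):
    """Tempo (BPM) for each lag bin; bin 0 (zero lag) is set to infinity."""
    bpm = [math.inf]
    for lag in range(1, n_bins):
        # One lag step spans hop_length / sr seconds.
        bpm.append(60.0 * sr / (hop_length * lag))
    return bpm


tempi = tempo_frequencies_sketch(4)
```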

fourier_tempo_frequencies([sr, win_length, ...])

Compute the frequencies (in beats per minute) corresponding to a Fourier tempogram matrix.

Pitch and tuning

pyin(y, fmin, fmax[, sr, frame_length, ...])

Fundamental frequency (F0) estimation using probabilistic YIN (pYIN).

yin(y, fmin, fmax[, sr, frame_length, ...])

Fundamental frequency (F0) estimation using the YIN algorithm.

estimate_tuning([y, sr, S, n_fft, ...])

Estimate the tuning of an audio time series or spectrogram input.

pitch_tuning(frequencies[, resolution, ...])

Given a collection of pitches, estimate its tuning offset (in fractions of a bin) relative to A440=440.0Hz.

piptrack([y, sr, S, n_fft, hop_length, ...])

Pitch tracking on thresholded parabolically-interpolated STFT.


samples_like(X[, hop_length, n_fft, axis])

Return an array of sample indices to match the time axis from a feature matrix.

times_like(X[, sr, hop_length, n_fft, axis])

Return an array of time values to match the time axis from a feature matrix.


Get the FFT library currently used by librosa


Set the FFT library used by librosa.