librosa.vqt

librosa.vqt(y, sr=22050, hop_length=512, fmin=None, n_bins=84, gamma=None, bins_per_octave=12, tuning=0.0, filter_scale=1, norm=1, sparsity=0.01, window='hann', scale=True, pad_mode='reflect', res_type=None, dtype=None)

Compute the variable-Q transform of an audio signal.

This implementation is based on the recursive sub-sampling method described by [1].

Parameters:

y : np.ndarray [shape=(n,)]

audio time series

sr : number > 0 [scalar]

sampling rate of y

hop_length : int > 0 [scalar]

number of samples between successive VQT columns.

fmin : float > 0 [scalar]

Minimum frequency. Defaults to C1 ~= 32.70 Hz

n_bins : int > 0 [scalar]

Number of frequency bins, starting at fmin

gamma : number > 0 [scalar]

Bandwidth offset for determining filter lengths.

If gamma=0, produces the constant-Q transform.

If gamma=None, gamma will be calculated such that filter bandwidths are equal to a constant fraction of the equivalent rectangular bandwidths (ERB). This is accomplished by solving for the gamma which gives:

B_k = alpha * f_k + gamma = C * ERB(f_k),

where B_k is the bandwidth of filter k with center frequency f_k, alpha is the inverse of what would be the constant Q-factor, and C = alpha / 0.108 is the constant fraction across all filters.

Here we use ERB(f_k) = 24.7 + 0.108 * f_k, the best-fit curve derived from experimental data in [2].
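The gamma=None rule above can be worked through numerically. Substituting ERB(f_k) = 24.7 + 0.108 * f_k and C = alpha / 0.108 into the equation, the f_k terms cancel and leave gamma = 24.7 * alpha / 0.108. The sketch below checks that result. The helper names erb and erb_matched_gamma are hypothetical, and the choice alpha = 2**(1 / bins_per_octave) - 1 is an assumption made for illustration; it is not necessarily the value the library uses internally.

```python
def erb(f):
    """ERB best-fit curve quoted above, in Hz."""
    return 24.7 + 0.108 * f


def erb_matched_gamma(alpha):
    """Gamma that solves alpha * f + gamma = (alpha / 0.108) * ERB(f) for every f."""
    return 24.7 * alpha / 0.108


bins_per_octave = 12
# Assumption: alpha is taken as the relative spacing between adjacent bins.
alpha = 2.0 ** (1.0 / bins_per_octave) - 1.0
gamma = erb_matched_gamma(alpha)
C = alpha / 0.108

for f_k in (32.70, 440.0, 4000.0):
    bandwidth = alpha * f_k + gamma
    # Both columns should agree up to floating-point rounding.
    print(f"{f_k:8.2f} Hz  B_k = {bandwidth:8.3f}  C * ERB = {C * erb(f_k):8.3f}")
```

Because gamma is a constant offset, it dominates the bandwidth at low center frequencies and becomes negligible relative to alpha * f_k at high ones.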

bins_per_octave : int > 0 [scalar]

Number of bins per octave

tuning : None or float

Tuning offset in fractions of a bin.

If None, tuning will be automatically estimated from the signal.

The minimum frequency of the resulting VQT will be modified to fmin * 2**(tuning / bins_per_octave).
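As a quick check of the formula above, the following sketch computes the adjusted minimum frequency for a few tuning offsets. The function name adjusted_fmin is hypothetical and only illustrates the arithmetic; it is not part of the library.

```python
def adjusted_fmin(fmin, tuning, bins_per_octave=12):
    """Minimum frequency after applying a tuning offset given in fractions of a bin."""
    return fmin * 2.0 ** (tuning / bins_per_octave)


fmin = 32.70  # approximately C1, the documented default
for tuning in (-0.5, 0.0, 0.25):
    # Negative offsets lower fmin, positive offsets raise it.
    print(tuning, adjusted_fmin(fmin, tuning))
```

A tuning offset equal to bins_per_octave shifts fmin by exactly one octave, which is a convenient sanity check on the units.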

filter_scale : float > 0

Filter scale factor. Small values (<1) use shorter windows for improved time resolution.

norm : {inf, -inf, 0, float > 0}

Type of norm to use for basis function normalization. See librosa.util.normalize.

sparsity : float in [0, 1)

Sparsify the VQT basis by discarding up to sparsity fraction of the energy in each basis.

Set sparsity=0 to disable sparsification.

window : str, tuple, number, or function

Window specification for the basis filters. See librosa.filters.get_window for details.

scale : bool

If True, scale the VQT response by the square root of the length of each channel’s filter. This is analogous to norm='ortho' in FFT.

If False, do not scale the VQT. This is analogous to norm=None in FFT.

pad_mode : string

Padding mode for centered frame analysis.

See also: librosa.stft and numpy.pad.

res_type : string [optional]

The resampling mode for recursive downsampling.

By default, vqt will adaptively select a resampling mode which trades off accuracy at high frequencies for efficiency at low frequencies.

You can override this by specifying a resampling mode as supported by librosa.resample. For example, res_type='fft' will use a high-quality, but potentially slow FFT-based down-sampling, while res_type='polyphase' will use a fast, but potentially inaccurate down-sampling.

dtype : np.dtype

The dtype of the output array. By default, this is inferred to match the numerical precision of the input signal.

Returns:

VQT : np.ndarray [shape=(n_bins, t), dtype=np.complex or np.float]

Variable-Q value of each frequency at each time.

Raises:

ParameterError

If hop_length is not an integer multiple of 2**(n_bins / bins_per_octave)

Or if y is too short to support the frequency range of the VQT.
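The hop_length condition can be checked before calling the function. The sketch below follows the rule as stated above, reading the exponent n_bins / bins_per_octave as a number of octaves rounded up to an integer. That rounding is an assumption, the helper name hop_is_compatible is hypothetical, and the library's own check may be less strict than this one.

```python
import math


def hop_is_compatible(hop_length, n_bins=84, bins_per_octave=12):
    """Return True if hop_length is a multiple of 2**n_octaves.

    Assumption: n_octaves is n_bins / bins_per_octave rounded up.
    """
    n_octaves = math.ceil(n_bins / bins_per_octave)
    return hop_length % (2 ** n_octaves) == 0


# With the defaults, 84 bins at 12 bins per octave span 7 octaves, so 2**7 = 128.
for hop in (512, 500, 256, 64):
    print(hop, hop_is_compatible(hop))
```

If a desired hop length fails the check, rounding it to the nearest multiple of 2**n_octaves, or reducing n_bins, are two simple ways to satisfy the stated condition.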