Compute the variable-Q transform of an audio signal.
This implementation is based on the recursive sub-sampling method
described by [1].
Parameters:
y : np.ndarray [shape=(n,)]
audio time series
sr : number > 0 [scalar]
sampling rate of y
hop_length : int > 0 [scalar]
Number of samples between successive VQT columns.
fmin : float > 0 [scalar]
Minimum frequency. Defaults to C1 ~= 32.70 Hz.
n_bins : int > 0 [scalar]
Number of frequency bins, starting at fmin
gamma : number >= 0 [scalar] or None
Bandwidth offset for determining filter lengths.
If gamma=0, produces the constant-Q transform.
If gamma=None, gamma is chosen so that every filter bandwidth is a
constant fraction of the equivalent rectangular bandwidth (ERB). This is done
by solving for the gamma that satisfies:
B_k = alpha * f_k + gamma = C * ERB(f_k),
where B_k is the bandwidth of filter k with center frequency f_k, alpha
is the inverse of what would be the constant Q-factor, and C = alpha / 0.108 is the
constant fraction across all filters.
Here we use ERB(f_k) = 24.7 + 0.108 * f_k, the best-fit curve derived
from experimental data in [2]. Substituting this curve gives
gamma = 24.7 * alpha / 0.108.
bins_per_octave : int > 0 [scalar]
Number of bins per octave
tuning : None or float
Tuning offset in fractions of a bin.
If None, tuning will be automatically estimated from the signal.
The minimum frequency of the resulting VQT will be modified to
fmin*2**(tuning/bins_per_octave).
filter_scale : float > 0
Filter scale factor. Small values (<1) use shorter windows
for improved time resolution.
res_type : string
The resampling mode used for recursive down-sampling.
By default, vqt adaptively selects a resampling mode
that trades off accuracy at high frequencies for efficiency at low frequencies.
You can override this by specifying any resampling mode supported by
librosa.resample. For example, res_type='fft' uses a high-quality
but potentially slow FFT-based down-sampling, while res_type='polyphase'
uses a fast but potentially inaccurate down-sampling.
dtype : np.dtype
The dtype of the output array. By default, this is inferred to match the
numerical precision of the input signal.
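The relationships among fmin, n_bins, bins_per_octave, tuning, and gamma described above can be sketched in plain Python. This is a minimal illustration, not the library's implementation: the helper names (erb, erb_gamma, bin_frequencies) are hypothetical, the center frequencies are assumed to be geometrically spaced from the tuned fmin, and alpha is assumed to be the relative spacing between adjacent bins.

```python
import math


def erb(freq_hz):
    """Equivalent rectangular bandwidth, ERB(f) = 24.7 + 0.108 * f."""
    return 24.7 + 0.108 * freq_hz


def erb_gamma(alpha):
    """Solve alpha * f + gamma = (alpha / 0.108) * ERB(f) for gamma.

    The alpha * f terms cancel, leaving gamma = 24.7 * alpha / 0.108.
    """
    return 24.7 * alpha / 0.108


def bin_frequencies(fmin, n_bins, bins_per_octave, tuning=0.0):
    """Center frequencies, assumed geometrically spaced from the tuned fmin."""
    # The tuning offset (in fractions of a bin) shifts the minimum frequency.
    fmin_tuned = fmin * 2.0 ** (tuning / bins_per_octave)
    return [fmin_tuned * 2.0 ** (k / bins_per_octave) for k in range(n_bins)]


bins_per_octave = 12
# Assumption: alpha is taken as the relative spacing between adjacent bins.
alpha = 2.0 ** (1.0 / bins_per_octave) - 1.0
gamma = erb_gamma(alpha)

freqs = bin_frequencies(fmin=32.70, n_bins=84, bins_per_octave=bins_per_octave)
# Bandwidth of each filter: B_k = alpha * f_k + gamma.
bandwidths = [alpha * f + gamma for f in freqs]
```

With gamma chosen this way, every bandwidth equals the same fraction C = alpha / 0.108 of the ERB at its center frequency; setting gamma to 0 instead would leave B_k = alpha * f_k, the constant-Q case.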
Returns:
VQT : np.ndarray [shape=(n_bins, t), dtype=np.complex or np.float]
Variable-Q value for each frequency at each time.
Raises:
ParameterError
If hop_length is not an integer multiple of
2**(n_bins/bins_per_octave),
or if y is too short to support the frequency range of the VQT.
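The hop_length condition above can be checked before calling the transform. The sketch below follows the condition as documented here; the helper name check_hop_length is hypothetical, ValueError stands in for ParameterError, and rounding the octave count up to a whole number is an assumption, so the library's internal check may differ in detail.

```python
import math


def check_hop_length(hop_length, n_bins, bins_per_octave):
    """Return the required divisor, or raise if hop_length is incompatible."""
    # Assumption: a partial octave counts as a whole one.
    n_octaves = math.ceil(n_bins / bins_per_octave)
    factor = 2 ** n_octaves
    if hop_length % factor != 0:
        # ValueError is a stand-in for the library's ParameterError.
        raise ValueError(
            f"hop_length={hop_length} is not an integer multiple of {factor}"
        )
    return factor


# 84 bins at 12 bins per octave span 7 octaves, so the divisor is 2**7 = 128.
divisor = check_hop_length(hop_length=512, n_bins=84, bins_per_octave=12)
```

Each stage of the recursive sub-sampling halves the sampling rate, which is why the hop length has to remain an integer number of samples after repeated division by two.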