librosa.pseudo_cqt(y, *, sr=22050, hop_length=512, fmin=None, n_bins=84, bins_per_octave=12, tuning=0.0, filter_scale=1, norm=1, sparsity=0.01, window='hann', scale=True, pad_mode='constant', dtype=None)

Compute the pseudo constant-Q transform of an audio signal.

This uses a single FFT size: the smallest power of 2 that is greater than or equal to the larger of:

  1. The longest CQT filter

  2. 2x the hop_length
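The FFT size rule above can be sketched in a few lines. This is only an illustration of the stated rule, not the library's internal code; the helper name `pseudo_cqt_fft_size` and its arguments are hypothetical, and the longest filter length is assumed to be known in samples.

```python
import math


def pseudo_cqt_fft_size(longest_filter_len, hop_length):
    """Smallest power of 2 >= max(longest filter length, 2 * hop_length).

    Hypothetical helper: both arguments are lengths in samples.
    """
    # Filter lengths may be fractional, so round up to whole samples first.
    target = max(math.ceil(longest_filter_len), 2 * hop_length)
    # (target - 1).bit_length() is the exponent of the next power of 2.
    return 1 << (target - 1).bit_length()


# The hop length dominates: max(700, 1024) = 1024, already a power of 2.
small = pseudo_cqt_fft_size(700, 512)
# The longest filter dominates: 5000 rounds up to 8192.
large = pseudo_cqt_fft_size(5000, 512)
```

With the default hop_length of 512, the FFT size is therefore never smaller than 1024 under this rule.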

Parameters:

y : np.ndarray [shape=(…, n)]

audio time series. Multi-channel is supported.

sr : number > 0 [scalar]

sampling rate of y

hop_length : int > 0 [scalar]

number of samples between successive CQT columns.

fmin : float > 0 [scalar]

Minimum frequency. Defaults to C1 ~= 32.70 Hz

n_bins : int > 0 [scalar]

Number of frequency bins, starting at fmin

bins_per_octave : int > 0 [scalar]

Number of bins per octave
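To see how fmin, n_bins, and bins_per_octave interact, the sketch below lists bin center frequencies under the usual constant-Q assumption that bins are geometrically spaced, so that every bins_per_octave bins the frequency doubles. The helper name `cqt_center_frequencies` is hypothetical and the fmin value is the C1 default quoted above.

```python
def cqt_center_frequencies(fmin, n_bins=84, bins_per_octave=12):
    """Geometrically spaced center frequencies, starting at fmin (in Hz).

    Hypothetical helper, assuming f_k = fmin * 2 ** (k / bins_per_octave).
    """
    return [fmin * 2.0 ** (k / bins_per_octave) for k in range(n_bins)]


freqs = cqt_center_frequencies(32.70)

# One octave above fmin sits exactly bins_per_octave bins higher.
octave_up = freqs[12]

# 84 bins at 12 bins per octave span 7 octaves above fmin.
n_octaves = len(freqs) / 12
```

With the defaults, the 84 bins cover seven octaves, one bin per semitone.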

tuning : None or float

Tuning offset in fractions of a bin.

If None, tuning will be automatically estimated from the signal.

The minimum frequency of the resulting CQT will be modified to fmin * 2**(tuning / bins_per_octave).
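As a small worked example of the formula above, the following sketch computes the adjusted minimum frequency for a few tuning values. The function name `tuned_fmin` is hypothetical; only the formula itself comes from the description.

```python
def tuned_fmin(fmin, tuning, bins_per_octave=12):
    """Apply a tuning offset (in fractions of a bin) to fmin (in Hz)."""
    return fmin * 2.0 ** (tuning / bins_per_octave)


base = 32.70  # the C1 default, in Hz

# No tuning offset leaves fmin unchanged.
unchanged = tuned_fmin(base, 0.0)

# A quarter-bin sharp offset raises fmin slightly.
sharp = tuned_fmin(base, 0.25)

# A quarter-bin flat offset lowers it by the same ratio.
flat = tuned_fmin(base, -0.25)
```

Because the offset enters through the exponent, equal and opposite tuning values shift fmin by reciprocal ratios rather than by equal amounts in Hz.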

filter_scale : float > 0

Filter scale factor. Larger values use longer windows.

norm : {inf, -inf, 0, float > 0}

Type of norm to use for basis function normalization. See librosa.util.normalize.

sparsity : float in [0, 1)

Sparsify the CQT basis by discarding up to a fraction sparsity of the energy in each basis filter.

Set sparsity=0 to disable sparsification.
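One way to picture this sparsification is the sketch below, which zeroes the smallest coefficients of a single basis row until just before the discarded share would reach the sparsity threshold. It rests on assumptions: the helper `sparsify_row` is hypothetical, "energy" is interpreted here as summed coefficient magnitude, and the library's actual criterion may differ.

```python
def sparsify_row(row, sparsity=0.01):
    """Zero the smallest entries of one basis row.

    Hypothetical helper: discards strictly less than `sparsity` of the
    row's total magnitude, removing the smallest entries first.
    """
    mags = [abs(v) for v in row]
    total = sum(mags)
    if total == 0 or sparsity <= 0:
        return list(row)  # nothing to discard

    out = list(row)
    running = 0.0
    # Visit coefficients from smallest to largest magnitude.
    for i in sorted(range(len(row)), key=lambda j: mags[j]):
        running += mags[i]
        if running / total >= sparsity:
            break  # dropping this entry would exceed the budget
        out[i] = 0
    return out


row = [0.001, 0.5, 0.002, 0.497]
sparse = sparsify_row(row, sparsity=0.01)   # the two tiny entries are zeroed
dense = sparsify_row(row, sparsity=0.0)     # sparsity=0 leaves the row intact
```

Zeroing near-silent coefficients makes the basis cheaper to store and apply, at the cost of a small, bounded approximation error per filter.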

window : str, tuple, number, or function

Window specification for the basis filters. See librosa.filters.get_window for details.


scale : bool

If True, scale the CQT response by the square root of the length of each channel’s filter. This is analogous to norm='ortho' in FFT.

If False, do not scale the CQT. This is analogous to norm=None in FFT.


pad_mode : string

Padding mode for centered frame analysis.

See also: librosa.stft and numpy.pad.

dtype : np.dtype, optional

The complex data type for CQT calculations. By default, this is inferred to match the precision of the input signal.

Returns:

CQT : np.ndarray [shape=(…, n_bins, t), dtype=np.float]

Pseudo Constant-Q energy for each frequency at each time.


This function caches at level 20.