librosa.feature.inverse.mfcc_to_audio
- librosa.feature.inverse.mfcc_to_audio(mfcc, *, n_mels=128, dct_type=2, norm='ortho', ref=1.0, lifter=0, **kwargs)
- Convert Mel-frequency cepstral coefficients to a time-domain audio signal.
- This function is primarily a convenience wrapper for the following steps:
  - Convert mfcc to Mel power spectrum (mfcc_to_mel)
  - Convert Mel power spectrum to time-domain audio (mel_to_audio)
- Parameters:
- mfcc : np.ndarray [shape=(…, n_mfcc, n)]
- The Mel-frequency cepstral coefficients
- n_mels : int > 0
- The number of Mel frequencies
- dct_type : {1, 2, 3}
- Discrete cosine transform (DCT) type. By default, DCT type-2 is used.
- norm : None or 'ortho'
- If dct_type is 2 or 3, setting norm='ortho' uses an orthonormal DCT basis. Normalization is not supported for dct_type=1.
- ref : float
- Reference power for (inverse) decibel calculation
- lifter : number >= 0
- If lifter>0, apply inverse liftering (inverse cepstral filtering):
- M[n, :] <- M[n, :] / (1 + sin(pi * (n + 1) / lifter) * lifter / 2)
 
- If 
- **kwargs : additional keyword arguments to pass through to mel_to_audio
- M : np.ndarray [shape=(…, n_mels, n), non-negative]
- The spectrogram as produced by feature.melspectrogram
- sr : number > 0 [scalar]
- Sampling rate of the underlying signal
- n_fft : int > 0 [scalar]
- Number of FFT components in the resulting STFT
- hop_length : None or int > 0
- The hop length of the STFT. If not provided, it will default to n_fft // 4.
- win_length : None or int > 0
- The window length of the STFT. By default, it will equal n_fft.
- window : string, tuple, number, function, or np.ndarray [shape=(n_fft,)]
- A window specification as supported by stft or istft
- center : boolean
- If True, the STFT is assumed to use centered frames. If False, the STFT is assumed to use left-aligned frames.
- pad_mode : string
- If center=True, the padding mode to use at the edges of the signal. By default, STFT uses zero padding.
- power : float > 0 [scalar]
- Exponent for the magnitude melspectrogram
- n_iter : int > 0
- The number of iterations for Griffin-Lim
- length : None or int > 0
- If provided, the output y is zero-padded or clipped to exactly length samples.
- dtype : np.dtype
- Real numeric type for the time-domain signal. Default is 32-bit float.
- **kwargs : additional keyword arguments for Mel filter bank parameters
- fmin : float >= 0 [scalar]
- Lowest frequency (in Hz)
- fmax : float >= 0 [scalar]
- Highest frequency (in Hz). If None, use fmax = sr / 2.0.
- htk : bool [scalar]
- Use HTK formula instead of Slaney
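The inverse liftering described for the lifter parameter can be illustrated with plain NumPy. The sketch below assumes the forward liftering weight is `1 + (lifter / 2) * sin(pi * (n + 1) / lifter)` for zero-based coefficient index `n`, as used when computing MFCCs with a non-zero lifter; the helper name `lifter_weights` and the random test matrix are hypothetical and exist only for this example.

```python
import numpy as np

def lifter_weights(n_mfcc, lifter):
    """Per-coefficient liftering weights (hypothetical helper for illustration)."""
    # n runs over 1..n_mfcc, i.e. (n + 1) for a zero-based coefficient index n.
    n = np.arange(1, n_mfcc + 1)
    return 1 + (lifter / 2) * np.sin(np.pi * n / lifter)

rng = np.random.default_rng(0)
M = rng.standard_normal((13, 5))      # stand-in for an (n_mfcc, n) MFCC matrix
lifter = 22

w = lifter_weights(M.shape[0], lifter)[:, np.newaxis]

liftered = M * w                      # forward liftering (cepstral filtering)
recovered = liftered / w              # inverse liftering, as in the formula above

print(np.allclose(recovered, M))
```

The division is only well defined where the weights are non-zero, which holds here because the sine term is positive for all 13 coefficients when lifter is 22.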
 
- Returns:
- y : np.ndarray [shape=(…, n)]
- A time-domain signal reconstructed from mfcc