librosa.effects.trim
- librosa.effects.trim(y, *, top_db=60, ref=<function amax>, frame_length=2048, hop_length=512, aggregate=<function amax>)
Trim leading and trailing silence from an audio signal.
Silence is defined as segments of the audio signal that are top_db decibels (or more) quieter than a reference level, ref. By default, ref is set to the signal’s maximum RMS value. If the entire signal has a uniform RMS value, no segment is quieter than the maximum, so nothing is trimmed. In particular, a completely silent signal remains untrimmed with the default ref setting. In these situations, an explicit value for ref should be used instead.
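The thresholding rule described above can be sketched in plain NumPy. This is a simplified illustration and not librosa's implementation: the helper name `nonsilent_mask`, the floor `amin`, and the treatment of a numeric ref as a reference amplitude are assumptions made for the sketch.

```python
import numpy as np

def nonsilent_mask(rms, top_db=60.0, ref=np.max, amin=1e-5):
    """Hypothetical helper: True where a frame is within top_db dB of the reference."""
    rms = np.abs(np.asarray(rms, dtype=float))
    # A callable ref is evaluated on the frame levels; a number is used as-is.
    ref_value = ref(rms) if callable(ref) else float(ref)
    # Floor both terms at amin so that log10(0) never occurs.
    level_db = 20.0 * np.log10(np.maximum(amin, rms))
    ref_db = 20.0 * np.log10(max(amin, abs(ref_value)))
    return (level_db - ref_db) > -top_db

# Quiet edges around a loud middle: the edges fall more than 60 dB below the maximum.
frame_rms = np.array([0.0001, 0.2, 0.5, 0.3, 0.0002])
mask_default = nonsilent_mask(frame_rms)

# An all-zero signal: every frame equals the maximum, so nothing counts as silent.
silent = np.zeros(5)
mask_silent_default = nonsilent_mask(silent)

# With an explicit reference, the same frames fall below the threshold.
mask_silent_explicit = nonsilent_mask(silent, ref=1.0)
```

In this sketch the default reference adapts to the signal, which is why a uniformly quiet signal is never marked silent until a fixed reference is supplied.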
- Parameters:
- y : np.ndarray, shape=(…, n)
Audio signal. Multi-channel is supported.
- top_db : number > 0
The threshold (in decibels) below reference to consider as silence
- ref : number or callable
The reference amplitude. By default, it uses np.max and compares to the peak amplitude in the signal.
- frame_length : int > 0
The number of samples per analysis frame
- hop_length : int > 0
The number of samples between analysis frames
- aggregate : callable [default: np.max]
Function to aggregate across channels (if y.ndim > 1)
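To show how frame_length, hop_length, and aggregate interact, here is a rough NumPy sketch of frame-wise trimming bounds for a multi-channel signal. It is a simplified stand-in and not librosa's code: `frame_rms` and `trim_bounds` are hypothetical helpers, frames are taken without padding or centering, and the frame-to-sample mapping (frame index times hop_length) is an assumption of the sketch.

```python
import numpy as np

def frame_rms(y, frame_length, hop_length):
    """RMS per frame along the last axis (no padding, a simplification)."""
    n = y.shape[-1]
    starts = range(0, max(n - frame_length, 0) + 1, hop_length)
    frames = [np.sqrt(np.mean(y[..., s:s + frame_length] ** 2, axis=-1)) for s in starts]
    return np.stack(frames, axis=-1)

def trim_bounds(y, top_db=60.0, frame_length=2048, hop_length=512,
                aggregate=np.max, amin=1e-5):
    """Hypothetical helper: (start, end) sample bounds of the non-silent region."""
    rms = frame_rms(y, frame_length, hop_length)
    # Level of each frame in dB relative to the loudest frame over all channels.
    db = 20.0 * np.log10(np.maximum(amin, rms) / max(amin, rms.max()))
    mask = db > -top_db
    if mask.ndim > 1:
        # Collapse the channel axes into a single per-frame decision.
        mask = aggregate(mask, axis=tuple(range(mask.ndim - 1)))
    hits = np.flatnonzero(mask)
    if hits.size == 0:
        return 0, 0
    start = int(hits[0]) * hop_length
    end = min(y.shape[-1], (int(hits[-1]) + 1) * hop_length)
    return start, end

# Two channels that are active at different times.
tone = np.sin(2 * np.pi * 440 * np.arange(2048) / 22050)
y = np.zeros((2, 8192))
y[0, 2048:4096] = tone
y[1, 4096:6144] = tone

# np.max keeps frames where any channel is active; np.min requires all channels.
bounds_any = trim_bounds(y, frame_length=1024, hop_length=512)
bounds_all = trim_bounds(y, frame_length=1024, hop_length=512, aggregate=np.min)
```

With np.max as the aggregate, the retained region spans both channels' activity; swapping in np.min narrows it to the frames where every channel is above the threshold.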
- Returns:
- y_trimmed : np.ndarray, shape=(…, m)
The trimmed signal
- index : np.ndarray, shape=(2,)
The interval of y corresponding to the non-silent region: y_trimmed = y[index[0]:index[1]] (for mono) or y_trimmed = y[:, index[0]:index[1]] (for stereo).
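The slicing rule for index can be written once for any number of channels by indexing the last axis. The sketch below uses plain NumPy arrays and a made-up index value; `apply_trim_index` is a hypothetical helper, not part of librosa.

```python
import numpy as np

def apply_trim_index(y, index):
    """Slice the last (time) axis of y with a (start, end) pair."""
    start, end = int(index[0]), int(index[1])
    # The ellipsis leaves any leading channel axes untouched.
    return y[..., start:end]

mono = np.arange(10.0)            # shape (10,)
stereo = np.stack([mono, -mono])  # shape (2, 10)
index = np.array([2, 7])          # illustrative value, not a real trim result

mono_trimmed = apply_trim_index(mono, index)
stereo_trimmed = apply_trim_index(stereo, index)
```

Using `y[..., start:end]` gives the same result as the mono and stereo forms above, and also covers arrays with more than one leading axis.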
Examples
>>> # Load some audio
>>> y, sr = librosa.load(librosa.ex('choice'))
>>> # Trim the beginning and ending silence
>>> yt, index = librosa.effects.trim(y)
>>> # Print the durations
>>> print(librosa.get_duration(y=y, sr=sr), librosa.get_duration(y=yt, sr=sr))
25.025986394557822 25.007891156462584
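The two durations printed in the example can be converted back to sample counts to see how much was removed. This is a small arithmetic sketch that assumes the sampling rate is 22050 Hz (the default of librosa.load); the variable names are illustrative.

```python
# Durations copied from the example output above.
duration_before = 25.025986394557822
duration_after = 25.007891156462584
sr = 22050  # assumed: the default sampling rate of librosa.load

# Duration in seconds is the sample count divided by the sampling rate.
n_before = round(duration_before * sr)
n_after = round(duration_after * sr)

removed_samples = n_before - n_after
removed_seconds = removed_samples / sr
```

Under that assumption, only a few hundred samples (well under a tenth of a second) were trimmed from this recording.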