librosa.effects.trim(y, *, top_db=60, ref=<function amax>, frame_length=2048, hop_length=512, aggregate=<function amax>)

Trim leading and trailing silence from an audio signal.

y : np.ndarray, shape=(…, n)

Audio signal. Multi-channel is supported.

top_db : number > 0

The threshold (in decibels) below the reference at which the signal is considered silent.

ref : number or callable

The reference amplitude. By default, np.max is used, which compares against the peak amplitude in the signal.

frame_length : int > 0

The number of samples per analysis frame.

hop_length : int > 0

The number of samples between analysis frames.

aggregate : callable [default: np.max]

Function to aggregate across channels (if y.ndim > 1).
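
To illustrate how top_db, ref, frame_length and hop_length interact, here is a minimal NumPy sketch of the general idea: compute a frame-wise RMS, express it in decibels relative to the loudest frame, and keep everything between the first and last frame that lies above -top_db. It is not librosa's implementation: it assumes a mono signal at least one frame long, does no padding or centering, and always uses the maximum as the reference. The helper name `sketch_trim` is hypothetical.

```python
import numpy as np

def sketch_trim(y, top_db=60, frame_length=2048, hop_length=512):
    """Hypothetical helper: approximate silence trimming for a mono signal."""
    # Number of complete frames (no padding; assumes len(y) >= frame_length).
    n_frames = 1 + (len(y) - frame_length) // hop_length
    # Frame-wise root-mean-square energy.
    rms = np.array([
        np.sqrt(np.mean(y[i * hop_length : i * hop_length + frame_length] ** 2))
        for i in range(n_frames)
    ])
    # Decibels relative to the loudest frame (the role played by ref=np.max).
    ref = max(rms.max(), 1e-10)
    db = 20.0 * np.log10(np.maximum(rms, 1e-10) / ref)
    # Frames more than top_db below the reference count as silence.
    loud = np.flatnonzero(db > -top_db)
    if loud.size == 0:
        return y[:0], np.array([0, 0])
    start = loud[0] * hop_length
    end = min(len(y), loud[-1] * hop_length + frame_length)
    return y[start:end], np.array([start, end])

# One second of silence, one second of a 440 Hz tone, one second of silence.
sr = 22050
t = np.arange(sr) / sr
y = np.concatenate([np.zeros(sr), 0.5 * np.sin(2 * np.pi * 440 * t), np.zeros(sr)])
y_trimmed, index = sketch_trim(y)
print(index, len(y_trimmed) / sr)
```

Because the decision is made per frame, the trimmed boundaries in this sketch can only be located to within roughly one frame_length of the true onset and offset; a smaller hop_length gives finer boundaries at the cost of more frames to analyze.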

y_trimmed : np.ndarray, shape=(…, m)

The trimmed signal.

index : np.ndarray, shape=(2,)

The interval of y corresponding to the non-silent region: y_trimmed = y[index[0]:index[1]] (for mono) or y_trimmed = y[:, index[0]:index[1]] (for stereo). For any number of channels, this is y_trimmed = y[..., index[0]:index[1]].


>>> # Load some audio
>>> y, sr = librosa.load(librosa.ex('choice'))
>>> # Trim the beginning and ending silence
>>> yt, index = librosa.effects.trim(y)
>>> # Print the durations
>>> print(librosa.get_duration(y=y, sr=sr), librosa.get_duration(y=yt, sr=sr))
25.025986394557822 25.007891156462584