librosa.util.frame

librosa.util.frame(x, *, frame_length, hop_length, axis=-1, writeable=False, subok=False)

Slice a data array into (overlapping) frames.

This implementation uses low-level stride manipulation to avoid making a copy of the data. The resulting frame representation is a new view of the same input data.

For example, a one-dimensional input x = [0, 1, 2, 3, 4, 5, 6] can be framed with frame length 3 and hop length 2 in two ways. The first (axis=-1) results in the array x_frames:

[[0, 2, 4],
 [1, 3, 5],
 [2, 4, 6]]

where each column x_frames[:, i] contains a contiguous slice of the input x[i * hop_length : i * hop_length + frame_length].

The second way (axis=0) results in the array x_frames:

[[0, 1, 2],
 [2, 3, 4],
 [4, 5, 6]]

where each row x_frames[i] contains a contiguous slice of the input.
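The two layouts above can be reproduced with plain NumPy. The sketch below is an illustration only: it uses numpy.lib.stride_tricks.sliding_window_view as a stand-in for librosa.util.frame, and is not meant to describe how librosa implements framing.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

x = np.arange(7)
frame_length, hop_length = 3, 2

# Take every length-3 window (one per start index), then keep every
# hop_length-th window. This matches the axis=0 layout: one frame per row.
frames_rows = sliding_window_view(x, frame_length)[::hop_length]

# Transposing gives the axis=-1 layout: one frame per column.
frames_cols = frames_rows.T

print(frames_rows)
print(frames_cols)
```

Both arrays are views of x, so no sample data is copied.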

This generalizes to higher-dimensional inputs, as shown in the examples below. In general, the framing operation increases the number of dimensions by 1, adding a new “frame axis” either before the framing axis (if axis < 0) or after the framing axis (if axis >= 0).
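The shape rule can be sketched as a small pure-Python helper. The name framed_shape is hypothetical (it is not part of librosa), and the frame-count formula 1 + (n - frame_length) // hop_length is an assumption chosen to agree with the shapes reported in the examples on this page.

```python
def framed_shape(shape, frame_length, hop_length, axis=-1):
    """Predict the shape of a framed array (hypothetical helper, not a librosa API)."""
    ax = axis % len(shape)  # position of the framing axis in the input
    # Assumed frame count: only complete frames are kept.
    n_frames = 1 + (shape[ax] - frame_length) // hop_length
    out = list(shape)
    out[ax] = n_frames
    # The new axis of length frame_length is placed before the framing axis
    # when axis < 0, and after it when axis >= 0.
    out.insert(ax if axis < 0 else ax + 1, frame_length)
    return tuple(out)


print(framed_shape((117601,), 2048, 64))          # (2048, 1806)
print(framed_shape((117601,), 2048, 64, axis=0))  # (1806, 2048)
print(framed_shape((2, 117601), 2048, 64))        # (2, 2048, 1806)
```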

Parameters
x : np.ndarray

Array to frame

frame_length : int > 0 [scalar]

Length of the frame

hop_length : int > 0 [scalar]

Number of steps to advance between frames

axis : int

The axis along which to frame.

writeable : bool

If False, then the framed view of x is read-only. If True, then the framed view is read-write. Note that writing to the framed view will also write to the input array x in this case.
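The aliasing between the framed view and the input can be seen with plain NumPy. This is a sketch only: sliding_window_view (which has its own writeable flag) stands in for librosa.util.frame, on the assumption that the two behave alike with respect to sharing memory with the input.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

x = np.arange(7)

# Read-only view, the analogue of writeable=False.
ro = sliding_window_view(x, 3)[::2]
print(np.shares_memory(ro, x))  # the frames alias x; no copy was made
print(ro.flags.writeable)

try:
    ro[0, 0] = -1
    write_failed = False
except ValueError:
    write_failed = True  # NumPy refuses to write through a read-only view

# Read-write view, the analogue of writeable=True.
rw = sliding_window_view(x, 3, writeable=True)[::2]
rw[0, 2] = 99  # frame 0, sample 2 is x[2], which is also frame 1, sample 0
print(x)
print(rw[1, 0])
```

Because overlapping frames share samples, a single write through a read-write view can change several frames at once, as well as the input itself.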

subok : bool

If True, sub-classes will be passed-through, otherwise the returned array will be forced to be a base-class array (default).

Returns
x_frames : np.ndarray [shape=(…, frame_length, N_FRAMES, …)]

A framed view of x, for example with axis=-1 (framing on the last dimension):

x_frames[..., j] == x[..., j * hop_length : j * hop_length + frame_length]

If axis=0 (framing on the first dimension), then:

x_frames[j] == x[j * hop_length : j * hop_length + frame_length]

Raises
ParameterError

If x.shape[axis] < frame_length, there is not enough data to fill one frame.

If hop_length < 1, frames cannot advance.
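The two conditions above can be checked before framing. The helper below is a hypothetical sketch, not librosa code: check_frame_args is an invented name, and it raises ValueError as a stand-in for ParameterError.

```python
def check_frame_args(shape, frame_length, hop_length, axis=-1):
    """Hypothetical pre-check mirroring the two documented error conditions."""
    if shape[axis] < frame_length:
        raise ValueError(
            f"Input is too short (n={shape[axis]}) for frame_length={frame_length}"
        )
    if hop_length < 1:
        raise ValueError(f"Invalid hop_length: {hop_length}")


# Valid arguments pass silently.
check_frame_args((117601,), frame_length=2048, hop_length=64)

# Each of these violates one of the two conditions.
bad_cases = [
    dict(shape=(1000,), frame_length=2048, hop_length=64),   # not enough data
    dict(shape=(117601,), frame_length=2048, hop_length=0),  # frames cannot advance
]
errors = []
for kwargs in bad_cases:
    try:
        check_frame_args(**kwargs)
    except ValueError as err:
        errors.append(str(err))

print(errors)
```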

Examples

Extract 2048-sample frames from a monophonic signal with a hop of 64 samples per frame:

>>> y, sr = librosa.load(librosa.ex('trumpet'))
>>> frames = librosa.util.frame(y, frame_length=2048, hop_length=64)
>>> frames
array([[-1.407e-03, -2.604e-02, ..., -1.795e-05, -8.108e-06],
       [-4.461e-04, -3.721e-02, ..., -1.573e-05, -1.652e-05],
       ...,
       [ 7.960e-02, -2.335e-01, ..., -6.815e-06,  1.266e-05],
       [ 9.568e-02, -1.252e-01, ...,  7.397e-06, -1.921e-05]],
      dtype=float32)
>>> y.shape
(117601,)
>>> frames.shape
(2048, 1806)

Or frame along the first axis instead of the last:

>>> frames = librosa.util.frame(y, frame_length=2048, hop_length=64, axis=0)
>>> frames.shape
(1806, 2048)

Frame a stereo signal:

>>> y, sr = librosa.load(librosa.ex('trumpet', hq=True), mono=False)
>>> y.shape
(2, 117601)
>>> frames = librosa.util.frame(y, frame_length=2048, hop_length=64)
>>> frames.shape
(2, 2048, 1806)

Carve an STFT into fixed-length patches of 32 frames with 50% overlap:

>>> y, sr = librosa.load(librosa.ex('trumpet'))
>>> S = np.abs(librosa.stft(y))
>>> S.shape
(1025, 230)
>>> S_patch = librosa.util.frame(S, frame_length=32, hop_length=16)
>>> S_patch.shape
(1025, 32, 13)
>>> # The first patch contains the first 32 frames of S
>>> np.allclose(S_patch[:, :, 0], S[:, :32])
True
>>> # The second patch contains frames 16 to 16+32=48, and so on
>>> np.allclose(S_patch[:, :, 1], S[:, 16:48])
True
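The same patch layout can be checked without audio data. The sketch below uses a random array in place of the spectrogram and NumPy's sliding_window_view in place of librosa.util.frame; both substitutions are assumptions made for illustration.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

rng = np.random.default_rng(0)
S = rng.random((1025, 230))  # synthetic stand-in for a magnitude spectrogram
frame_length, hop_length = 32, 16

# All length-32 windows along the last axis; NumPy appends the window axis
# at the end, giving shape (1025, 199, 32).
windows = sliding_window_view(S, frame_length, axis=-1)

# Keep every hop_length-th window, then move the window axis in front of the
# patch-index axis to obtain the (..., frame_length, n_frames) layout.
S_patch = np.moveaxis(windows[:, ::hop_length, :], -1, -2)

print(S_patch.shape)
print(np.array_equal(S_patch[:, :, 0], S[:, :32]))
print(np.array_equal(S_patch[:, :, 1], S[:, 16:48]))
```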