librosa.stream

librosa.stream(path, *, block_length, frame_length, hop_length, mono=True, offset=0.0, duration=None, fill_value=None, dtype=<class 'numpy.float32'>)

Stream audio in fixed-length buffers.

This is primarily useful for processing large files that won’t fit entirely in memory at once.

Instead of loading the entire audio signal into memory (as in load), this function produces blocks of audio spanning a fixed number of frames at a specified frame length and hop length.

While this function strives for similar behavior to load, there are a few caveats that users should be aware of:

This function does not return audio buffers directly. It returns a generator, which you can iterate over to produce blocks of audio. A block, in this context, refers to a buffer of audio which spans a given number of (potentially overlapping) frames.

Automatic sample-rate conversion is not supported. Audio will be streamed in its native sample rate, so no default values are provided for frame_length and hop_length. It is recommended that you first get the sampling rate for the file in question, using get_samplerate, and set these parameters accordingly.

Many analyses require access to the entire signal to behave correctly, such as resample, cqt, or beat_track, so these methods will not be appropriate for streamed data.

The block_length parameter specifies how many frames of audio will be produced per block. Larger values will consume more memory, but will be more efficient to process down-stream. The best value will ultimately depend on your application and other system constraints.

By default, most librosa analyses (e.g., short-time Fourier transform) assume centered frames, which requires padding the signal at the beginning and end. This will not work correctly when the signal is carved into blocks, because it would introduce padding in the middle of the signal. To disable this feature, use center=False in all frame-based analyses.
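To make the centering caveat concrete, the sketch below counts how many frames a single block would produce with and without centering. It uses ordinary framing arithmetic; `n_frames` is a hypothetical helper written for this illustration, not a librosa function.

```python
def n_frames(n_samples, frame_length, hop_length, center):
    """Number of frames obtained from n_samples samples.

    With center=True the signal is padded by frame_length // 2 on both
    sides before framing; with center=False frames are left-aligned.
    """
    if center:
        n_samples += 2 * (frame_length // 2)
    return 1 + (n_samples - frame_length) // hop_length


block_length, frame_length, hop_length = 256, 4096, 1024
# Size of one full block, in samples
block_samples = (block_length - 1) * hop_length + frame_length

left_aligned = n_frames(block_samples, frame_length, hop_length, center=False)
centered = n_frames(block_samples, frame_length, hop_length, center=True)
print(left_aligned, centered)
```

With centering, every block would gain extra frames that contain padding rather than signal, and those frames would land in the middle of the stream.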

See the examples below for proper usage of this function.
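Because audio is streamed at its native sample rate, one way to keep frame sizes comparable across files is to choose them as durations and convert to samples. The following is a minimal sketch; `frame_params` is a hypothetical helper, and the sample rates are fixed here for illustration (in practice the rate would come from get_samplerate).

```python
def frame_params(sr, frame_seconds, hop_seconds):
    """Convert frame and hop durations (seconds) to whole sample counts."""
    frame_length = int(round(sr * frame_seconds))
    hop_length = int(round(sr * hop_seconds))
    return frame_length, hop_length


# Assumed sample rates; normally sr = librosa.get_samplerate(path)
params_44k = frame_params(44100, frame_seconds=0.04, hop_seconds=0.01)
params_48k = frame_params(48000, frame_seconds=0.04, hop_seconds=0.01)
print(params_44k, params_48k)
```

The resulting values can then be passed as frame_length and hop_length, and reused as n_fft and hop_length in any frame-based analysis of the blocks.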

Parameters:

path : string, int, sf.SoundFile, or file-like object

Path to the input file to stream.

Any codec supported by soundfile is permitted here.

An existing soundfile.SoundFile object may also be provided.

block_length : int > 0

The number of frames to include in each block.

Note that at the end of the file, there may not be enough data to fill an entire block, resulting in a shorter block by default. To pad the signal out so that blocks are always full length, set fill_value (see below).

frame_length : int > 0

The number of samples per frame.

hop_length : int > 0

The number of samples to advance between frames.

Note that when hop_length < frame_length, neighboring frames will overlap. Similarly, the last frame of one block will overlap with the first frame of the next block.

mono : bool

Convert the signal to mono during streaming.

offset : float

Start reading after this time (in seconds).

duration : float

Only load up to this much audio (in seconds).

fill_value : float [optional]

If padding the signal to produce constant-length blocks, this value will be used at the end of the signal.

In most cases, fill_value=0 (silence) is expected, but you may specify any value here.

dtype : numeric type

Data type of audio buffers to be produced.

Yields:

y : np.ndarray

An audio buffer of (at most) (block_length - 1) * hop_length + frame_length samples.
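The buffer size formula, and the overlap between consecutive blocks, can be checked with a small NumPy simulation. This is only a sketch of the blocking arithmetic under the assumption that blocks advance by block_length * hop_length samples; `simulate_blocks` and `frame` are hypothetical helpers, not how librosa reads files.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def frame(y, frame_length, hop_length):
    """Slice y into left-aligned frames, one row per frame."""
    return sliding_window_view(y, frame_length)[::hop_length]


def simulate_blocks(y, block_length, frame_length, hop_length):
    """Yield overlapping buffers sized by the formula above (illustration only)."""
    block_samples = (block_length - 1) * hop_length + frame_length
    step = block_length * hop_length  # samples advanced between blocks
    for start in range(0, len(y) - frame_length + 1, step):
        yield y[start:start + block_samples]


y = np.arange(100, dtype=np.float32)
blocks = list(simulate_blocks(y, block_length=4, frame_length=8, hop_length=4))

# Framing each block separately yields the same frames as framing the whole signal
streamed = np.concatenate([frame(b, 8, 4) for b in blocks])
whole = frame(y, 8, 4)
print([len(b) for b in blocks], streamed.shape)
```

In this toy case every block is full. With real files the final block may be shorter unless fill_value is set.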

See also

load
get_samplerate
soundfile.blocks

Examples

Apply a short-time Fourier transform to blocks of 256 frames at a time. Note that streaming operation requires left-aligned frames, so we must set center=False to avoid padding artifacts.

>>> filename = librosa.ex('brahms')
>>> sr = librosa.get_samplerate(filename)
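>>> stream = librosa.stream(filename,
...                         block_length=256,
...                         frame_length=4096,
...                         hop_length=1024)
>>> for y_block in stream:
...     # n_fft and hop_length match the stream's frame_length and hop_length
...     D_block = librosa.stft(y_block, n_fft=4096, hop_length=1024,
...                            center=False)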

Or compute a mel spectrogram over a stream, using a shorter frame and non-overlapping windows:

>>> filename = librosa.ex('brahms')
>>> sr = librosa.get_samplerate(filename)
>>> stream = librosa.stream(filename,
...                         block_length=256,
...                         frame_length=2048,
...                         hop_length=2048)
>>> for y_block in stream:
...     m_block = librosa.feature.melspectrogram(y=y_block, sr=sr,
...                                              n_fft=2048,
...                                              hop_length=2048,
...                                              center=False)
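The loops above overwrite each block's result. If the per-block outputs fit in memory, one option is to join them along the frame axis. The sketch below assumes that approach; `collect_frames` and `frame_energy` are hypothetical helpers, with `frame_energy` standing in for a frame-based analysis such as stft with center=False, and the blocks are synthetic arrays rather than streamed audio.

```python
import numpy as np


def collect_frames(blocks, analyze):
    """Apply analyze to each block and join results along the last (frame) axis."""
    return np.concatenate([analyze(y_block) for y_block in blocks], axis=-1)


def frame_energy(y_block, frame_length=8, hop_length=4):
    """Stand-in analysis: energy of each left-aligned frame, shape (1, n_frames)."""
    n = 1 + (len(y_block) - frame_length) // hop_length
    starts = [i * hop_length for i in range(n)]
    return np.array([[np.sum(y_block[s:s + frame_length] ** 2) for s in starts]])


# Synthetic blocks: two full blocks and a shorter final block
blocks = [np.ones(20, dtype=np.float32),
          np.ones(20, dtype=np.float32),
          np.ones(12, dtype=np.float32)]
energy = collect_frames(blocks, frame_energy)
print(energy.shape)
```

Since `collect_frames` accepts any iterable, the generator returned by this function could be passed in place of the synthetic list.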