librosa.stream
- librosa.stream(path, *, block_length, frame_length, hop_length, mono=True, offset=0.0, duration=None, fill_value=None, dtype=<class 'numpy.float32'>)
Stream audio in fixed-length buffers.
This is primarily useful for processing large files that won’t fit entirely in memory at once.
Instead of loading the entire audio signal into memory (as in `load`), this function produces blocks of audio spanning a fixed number of frames at a specified frame length and hop length.
While this function strives for similar behavior to `load`, there are a few caveats that users should be aware of:
- This function does not return audio buffers directly. It returns a generator, which you can iterate over to produce blocks of audio. A block, in this context, refers to a buffer of audio which spans a given number of (potentially overlapping) frames.
- Automatic sample-rate conversion is not supported. Audio will be streamed in its native sample rate, so no default values are provided for `frame_length` and `hop_length`. It is recommended that you first get the sampling rate for the file in question, using `get_samplerate`, and set these parameters accordingly.
- Many analyses, such as `resample`, `cqt`, or `beat_track`, require access to the entire signal to behave correctly, so these methods are not appropriate for streamed data.
- The `block_length` parameter specifies how many frames of audio will be produced per block. Larger values will consume more memory, but will be more efficient to process downstream. The best value will ultimately depend on your application and other system constraints.
- By default, most librosa analyses (e.g., short-time Fourier transform) assume centered frames, which requires padding the signal at the beginning and end. This will not work correctly when the signal is carved into blocks, because it would introduce padding in the middle of the signal. To disable this feature, use `center=False` in all frame-based analyses.
See the examples below for proper usage of this function.
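Because `frame_length` and `hop_length` have no defaults, one option is to derive them from the file's native sampling rate. The following is a minimal sketch under stated assumptions: `frame_params` and `open_stream` are hypothetical helpers (not part of librosa), and the 25 ms frame and 10 ms hop durations are illustrative choices rather than recommended values.

```python
def frame_params(sr, frame_seconds=0.025, hop_seconds=0.010):
    """Convert frame and hop durations (seconds) into sample counts at rate sr.

    Hypothetical helper; the default durations are assumptions for illustration.
    """
    frame_length = max(1, int(round(frame_seconds * sr)))
    hop_length = max(1, int(round(hop_seconds * sr)))
    return frame_length, hop_length


def open_stream(path, block_length=256):
    """Open a block stream whose framing follows the file's native rate.

    Hypothetical wrapper; librosa is imported lazily so that frame_params
    can be used on its own.
    """
    import librosa

    sr = librosa.get_samplerate(path)  # native rate; no resampling occurs
    frame_length, hop_length = frame_params(sr)
    stream = librosa.stream(path,
                            block_length=block_length,
                            frame_length=frame_length,
                            hop_length=hop_length)
    return sr, stream
```

Note that `open_stream` returns a generator; no audio is read until it is iterated.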
- Parameters
- path : string, int, sf.SoundFile, or file-like object
Path to the input file to stream.
Any codec supported by `soundfile` is permitted here.
An existing `soundfile.SoundFile` object may also be provided.
- block_length : int > 0
The number of frames to include in each block.
Note that at the end of the file, there may not be enough data to fill an entire block, resulting in a shorter block by default. To pad the signal out so that blocks are always full length, set `fill_value` (see below).
- frame_length : int > 0
The number of samples per frame.
- hop_length : int > 0
The number of samples to advance between frames.
Note that when `hop_length < frame_length`, neighboring frames will overlap. Similarly, the last frame of one block will overlap with the first frame of the next block.
- mono : bool
Convert the signal to mono during streaming.
- offset : float
Start reading after this time (in seconds).
- duration : float
Only load up to this much audio (in seconds).
- fill_value : float [optional]
If padding the signal to produce constant-length blocks, this value will be used at the end of the signal.
In most cases, `fill_value=0` (silence) is expected, but you may specify any value here.
- dtype : numeric type
Data type of audio buffers to be produced.
- Yields
- y : np.ndarray
An audio buffer of (at most) `(block_length-1) * hop_length + frame_length` samples.
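The buffer size formula can be checked with a little arithmetic. The sketch below uses hypothetical helper names (`block_samples`, `frames_in_buffer`, `block_overlap`), none of which are librosa functions. The overlap calculation assumes that consecutive blocks advance by `block_length * hop_length` samples, which is what the frame overlap described for `hop_length` implies.

```python
def block_samples(block_length, frame_length, hop_length):
    """Maximum number of samples in one block of block_length frames."""
    return (block_length - 1) * hop_length + frame_length


def frames_in_buffer(n_samples, frame_length, hop_length):
    """Number of full, left-aligned frames (center=False) in a buffer."""
    if n_samples < frame_length:
        return 0
    return 1 + (n_samples - frame_length) // hop_length


def block_overlap(frame_length, hop_length):
    """Samples shared by consecutive blocks (assumed advance: block_length * hop_length)."""
    return frame_length - hop_length


n = block_samples(block_length=256, frame_length=4096, hop_length=1024)
print(n)                                    # 265216
print(frames_in_buffer(n, 4096, 1024))      # 256
print(block_overlap(4096, 1024))            # 3072
```

With non-overlapping frames (`hop_length == frame_length`), the formula reduces to `block_length * frame_length` and the overlap is zero.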
Examples
Apply a short-time Fourier transform to blocks of 256 frames at a time. Note that streaming operation requires left-aligned frames, so we must set `center=False` to avoid padding artifacts. The STFT frame and hop lengths are set to match those of the stream.
>>> import librosa
>>> filename = librosa.ex('brahms')
>>> sr = librosa.get_samplerate(filename)
>>> stream = librosa.stream(filename,
...                         block_length=256,
...                         frame_length=4096,
...                         hop_length=1024)
>>> for y_block in stream:
...     D_block = librosa.stft(y_block, n_fft=4096, hop_length=1024,
...                            center=False)
Or compute a mel spectrogram over a stream, using a shorter frame and non-overlapping windows:
>>> filename = librosa.ex('brahms')
>>> sr = librosa.get_samplerate(filename)
>>> stream = librosa.stream(filename,
...                         block_length=256,
...                         frame_length=2048,
...                         hop_length=2048)
>>> for y_block in stream:
...     m_block = librosa.feature.melspectrogram(y=y_block, sr=sr,
...                                              n_fft=2048,
...                                              hop_length=2048,
...                                              center=False)
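To see why left-aligned frames (`center=False`) let per-block results be concatenated, the block layout can be emulated on an in-memory signal. This is a sketch only: `fake_stream`, `frame_starts`, and `frame_energy` are hypothetical stand-ins written with plain Python lists, not librosa's implementation, and the per-frame energy is just a placeholder for any frame-based analysis.

```python
def frame_starts(n_samples, frame_length, hop_length):
    """Start indices of all full, left-aligned frames in a buffer."""
    return list(range(0, n_samples - frame_length + 1, hop_length))


def frame_energy(y, frame_length, hop_length):
    """Sum of squares per frame; stands in for any frame-based analysis."""
    return [sum(v * v for v in y[s:s + frame_length])
            for s in frame_starts(len(y), frame_length, hop_length)]


def fake_stream(y, block_length, frame_length, hop_length):
    """Yield overlapping blocks of y, each spanning up to block_length frames."""
    block_size = (block_length - 1) * hop_length + frame_length
    step = block_length * hop_length  # assumed advance between blocks
    for start in range(0, len(y), step):
        yield y[start:start + block_size]  # the final block may be shorter


# A synthetic signal in place of audio read from disk
y = [float(i % 7) for i in range(1000)]

whole = frame_energy(y, frame_length=64, hop_length=16)

streamed = []
for block in fake_stream(y, block_length=8, frame_length=64, hop_length=16):
    streamed.extend(frame_energy(block, frame_length=64, hop_length=16))

# Block-wise frames line up with whole-signal frames when nothing is padded
print(streamed == whole)
```

If each block were instead padded at both ends, as centered framing would do, every block would contribute extra frames containing padding, and the concatenated result would no longer match the whole-signal analysis.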