Audio playback
This notebook demonstrates how to use IPython’s audio playback to play audio signals through your web browser.
# Code source: Brian McFee
# License: ISC
We’ll need numpy and matplotlib for this example.
import numpy as np
import matplotlib.pyplot as plt
import librosa
# We'll need IPython.display's Audio widget
from IPython.display import Audio
# We'll also use `mir_eval` to synthesize a signal for us
import mir_eval.sonify
Playing a synthetic sound
The IPython Audio widget accepts raw numpy data as audio signals. This means we can synthesize signals directly and play them back in the browser.
For example, we can make a sine sweep from C3 to C5:
sr = 22050
y_sweep = librosa.chirp(fmin=librosa.note_to_hz('C3'),
                        fmax=librosa.note_to_hz('C5'),
                        sr=sr,
                        duration=1)
Audio(data=y_sweep, rate=sr)
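Because the widget takes plain arrays, a signal can also be built with numpy alone. The sketch below is an illustrative addition: the 440 Hz frequency, the 0.5 amplitude, the 10 ms fade length, and the names `y_tone` and `fade` are arbitrary choices, not part of the original example.

```python
import numpy as np

sr = 22050                      # sampling rate in Hz
duration = 1.0                  # length of the tone in seconds
t = np.arange(int(sr * duration)) / sr

# A 440 Hz sine tone at half of full scale
y_tone = 0.5 * np.sin(2 * np.pi * 440.0 * t)

# Short linear fade-in and fade-out (about 10 ms each) to avoid edge clicks
n_fade = int(0.01 * sr)
fade = np.ones_like(y_tone)
fade[:n_fade] = np.linspace(0.0, 1.0, n_fade)
fade[-n_fade:] = np.linspace(1.0, 0.0, n_fade)
y_tone = y_tone * fade

# In a notebook, play it with: Audio(data=y_tone, rate=sr)
```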
Playing a real sound
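Of course, we can also play back real recorded sounds in the same way. The remaining examples assume that a recording has been loaded into an array y with sampling rate sr, for example with librosa.load.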
Sonifying pitch estimates
As a slightly more advanced example, we can use sonification to directly observe the output of a fundamental frequency estimator.
We’ll do this using librosa.pyin for analysis, and mir_eval.sonify.pitch_contour for synthesis.
# Using fill_na=None retains the best-guess f0 at unvoiced frames
f0, voiced_flag, voiced_probs = librosa.pyin(y,
                                             sr=sr,
                                             fmin=librosa.note_to_hz('C2'),
                                             fmax=librosa.note_to_hz('C7'),
                                             fill_na=None)
# To synthesize the f0, we'll need sample times
times = librosa.times_like(f0)
mir_eval’s synthesizer uses negative f0 values to indicate unvoiced regions.
We’ll make an array vneg which is 1 for voiced frames, and -1 for unvoiced frames. This way, f0 * vneg will leave voiced estimates unchanged, and negate the frequency for unvoiced frames.
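The sign trick described above can be sketched with small hypothetical arrays standing in for the output of librosa.pyin. The values of `f0_demo` and `voiced_demo` are made up for illustration, and `np.where` is only one possible way to build `vneg`.

```python
import numpy as np

# Hypothetical stand-ins for the f0 and voiced_flag arrays returned by pyin
f0_demo = np.array([220.0, 221.5, 180.0, 440.0])
voiced_demo = np.array([True, True, False, True])

# +1 for voiced frames, -1 for unvoiced frames
vneg = np.where(voiced_demo, 1.0, -1.0)

# Voiced estimates are unchanged; unvoiced estimates become negative
f0_signed = f0_demo * vneg
print(f0_signed)
```

With the real arrays, the signed contour f0 * vneg, together with times and sr, is what would be handed to mir_eval.sonify.pitch_contour, and the synthesized signal can then be passed to Audio like any other array.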
Sonifying mixtures
Finally, we can also use the Audio widget to listen to combinations of signals.
This example runs the onset detector over the original test clip, and then synthesizes a click at each detection.
We can then overlay the click track on the original signal and hear them both.
For this to work, we need to ensure that both the synthesized click track and the original signal are of the same length.
# Compute the onset strength envelope, using a max filter of 5 frequency bins
# to cut down on false positives
onset_env = librosa.onset.onset_strength(y=y, sr=sr, max_size=5)
# Detect onset times from the strength envelope
onset_times = librosa.onset.onset_detect(onset_envelope=onset_env, sr=sr, units='time')
# Sonify onset times as clicks
y_clicks = librosa.clicks(times=onset_times, length=len(y), sr=sr)
Audio(data=y+y_clicks, rate=sr)
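Passing length=len(y) to librosa.clicks takes care of the length constraint in this case. When the second signal comes from elsewhere, the lengths may differ. The helper below, a hypothetical `match_length` written with numpy only, is one minimal way to trim or zero-pad a signal before mixing; the toy arrays are placeholders for real audio.

```python
import numpy as np

def match_length(x, n):
    """Trim x to n samples, or zero-pad it at the end (hypothetical helper)."""
    if len(x) >= n:
        return x[:n]
    return np.pad(x, (0, n - len(x)))

# Toy signals of different lengths standing in for y and a click track
y_demo = np.zeros(1000)
clicks_demo = np.ones(800)

# Both operands now have len(y_demo) samples, so they can be added directly
mix = y_demo + match_length(clicks_demo, len(y_demo))
```

librosa.util.fix_length offers similar pad-or-trim behavior if you would rather not write a helper.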
Caveats
Finally, some important things to note when using interactive playback:
IPython.display.Audio works by serializing the entire audio signal and sending it to the browser as base64-encoded data embedded in the page. This may be inefficient for long signals.
IPython.display.Audio can also work directly with filenames and URLs. If you’re working with long signals, or do not want to load the signal into Python directly, it may be better to use one of these modes.
Audio playback, by default, will normalize the amplitude of the signal being played. Most of the time this is what you will want, but sometimes it may not be, so be aware that normalization can be disabled by passing normalize=False.
If you’re working in a Jupyter notebook and want to show multiple Audio widgets in the same cell, you can use IPython.display.display(IPython.display.Audio(...)) to explicitly render each widget. This is helpful when playing back multiple related signals.
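To see why default normalization matters, the sketch below applies simple peak normalization to a quiet and a loud copy of the same tone. It is an approximation written with numpy, not IPython’s actual code, and `peak_normalize` and the signal names are hypothetical.

```python
import numpy as np

def peak_normalize(x):
    """Scale x so its largest absolute sample is 1 (hypothetical helper)."""
    peak = np.max(np.abs(x))
    return x / peak if peak > 0 else x

sr = 22050
t = np.arange(sr) / sr
tone = np.sin(2 * np.pi * 440.0 * t)

quiet = 0.05 * tone   # low-level copy
loud = 0.8 * tone     # high-level copy

# After peak normalization the two copies are numerically the same signal
same = np.allclose(peak_normalize(quiet), peak_normalize(loud))
print(same)
```

In other words, level differences between clips are lost under default playback. To preserve them, pass normalize=False to Audio; in that mode the samples are expected to lie within [-1, 1].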