Go to the end to download the full example code
This notebook demonstrates how to use IPython’s audio playback to play audio signals through your web browser.
# Code source: Brian McFee # License: ISC
We’ll need numpy and matplotlib for this example
import numpy as np import matplotlib.pyplot as plt # sphinx_gallery_thumbnail_path = '_static/playback-thumbnail.png' import librosa # We'll need IPython.display's Audio widget from IPython.display import Audio # We'll also use `mir_eval` to synthesize a signal for us import mir_eval.sonify
Playing a synthetic sound
The IPython Audio widget accepts raw numpy data as audio signals. This means we can synthesize signals directly and play them back in the browser.
For example, we can make a sine sweep from C3 to C5:
sr = 22050 y_sweep = librosa.chirp(fmin=librosa.note_to_hz('C3'), fmax=librosa.note_to_hz('C5'), sr=sr, duration=1) Audio(data=y_sweep, rate=sr)
Playing a real sound
Of course, we can also play back real recorded sounds in the same way.
Sonifying pitch estimates
As a slightly more advanced example, we can use sonification to directly observe the output of a fundamental frequency estimator.
We’ll do this using
librosa.pyin for analysis,
mir_eval.sonify.pitch_contour for synthesis.
# Using fill_na=None retains the best-guess f0 at unvoiced frames f0, voiced_flag, voiced_probs = librosa.pyin(y, sr=sr, fmin=librosa.note_to_hz('C2'), fmax=librosa.note_to_hz('C7'), fill_na=None) # To synthesize the f0, we'll need sample times times = librosa.times_like(f0)
mir_eval’s synthesizer uses negative f0 values to indicate unvoiced regions.
We’ll make an array vneg which is 1 for voiced frames, and -1 for unvoiced frames. This way, f0 * vneg will leave voiced estimates unchanged, and negate the frequency for unvoiced frames.
Finally, we can also use the Audio widget to listen to combinations of signals.
This example runs the onset detector over the original test clip, and then synthesizes a click at each detection.
We can then overlay the click track on the original signal and hear them both.
For this to work, we need to ensure that both the synthesized click track and the original signal are of the same length.
# Compute the onset strength envelope, using a max filter of 5 frequency bins # to cut down on false positives onset_env = librosa.onset.onset_strength(y=y, sr=sr, max_size=5) # Detect onset times from the strength envelope onset_times = librosa.onset.onset_detect(onset_envelope=onset_env, sr=sr, units='time') # Sonify onset times as clicks y_clicks = librosa.clicks(times=onset_times, length=len(y), sr=sr) Audio(data=y+y_clicks, rate=sr)
Finally, some important things to note when using interactive playback:
IPython.display.Audioworks by serializing the entire audio signal and sending it to the browser in a UUEncoded stream. This may be inefficient for long signals.
IPython.display.Audiocan also work directly with filenames and URLs. If you’re working with long signals, or do not want to load the signal into python directly, it may be better to use one of these modes.
Audio playback, by default, will normalize the amplitude of the signal being played. Most of the time this is what you will want, but sometimes it may not be, so be aware that normalization can be disabled.
If you’re working in a Jupyter notebook and want to show multiple Audio widgets in the same cell, you can use
IPython.display.display(IPython.display.Audio(...))to explicitly render each widget. This is helpful when playing back multiple related signals.
Total running time of the script: ( 0 minutes 4.345 seconds)