
.. DO NOT EDIT.
.. THIS FILE WAS AUTOMATICALLY GENERATED BY SPHINX-GALLERY.
.. TO MAKE CHANGES, EDIT THE SOURCE PYTHON FILE:
.. "auto_examples/plot_hprss.py"
.. LINE NUMBERS ARE GIVEN BELOW.

.. only:: html

    .. note::
        :class: sphx-glr-download-link-note

        :ref:`Go to the end <sphx_glr_download_auto_examples_plot_hprss.py>`
        to download the full example code

.. rst-class:: sphx-glr-example-title

.. _sphx_glr_auto_examples_plot_hprss.py:


=====================================
Harmonic-percussive source separation
=====================================

This notebook illustrates how to separate an audio signal into
its harmonic and percussive components.

We'll compare the original median-filtering based approach of
FitzGerald (2010) and its margin-based extension due to
Driedger, Müller and Disch (2014).

.. GENERATED FROM PYTHON SOURCE LINES 17-25

.. code-block:: Python


    import numpy as np
    import matplotlib.pyplot as plt
    from IPython.display import Audio

    import librosa


.. GENERATED FROM PYTHON SOURCE LINES 26-27

Load an example clip with harmonics and percussives

.. GENERATED FROM PYTHON SOURCE LINES 27-31

.. code-block:: Python


    y, sr = librosa.load(librosa.ex('fishin'), duration=5, offset=10)
    Audio(data=y, rate=sr)


.. GENERATED FROM PYTHON SOURCE LINES 32-33

Compute the short-time Fourier transform of y

.. GENERATED FROM PYTHON SOURCE LINES 33-35

.. code-block:: Python

    D = librosa.stft(y)


.. GENERATED FROM PYTHON SOURCE LINES 36-39

Decompose D into harmonic and percussive components

:math:`D = D_\text{harmonic} + D_\text{percussive}`

.. GENERATED FROM PYTHON SOURCE LINES 39-42

.. code-block:: Python

    D_harmonic, D_percussive = librosa.decompose.hpss(D)


.. GENERATED FROM PYTHON SOURCE LINES 43-44

We can plot the two components along with the original spectrogram

.. GENERATED FROM PYTHON SOURCE LINES 44-65

.. code-block:: Python


    # Pre-compute a global reference amplitude (the peak magnitude of the input spectrum)
    rp = np.max(np.abs(D))

    fig, ax = plt.subplots(nrows=3, sharex=True, sharey=True)

    img = librosa.display.specshow(librosa.amplitude_to_db(np.abs(D), ref=rp),
                                   y_axis='log', x_axis='time', ax=ax[0])
    ax[0].set(title='Full spectrogram')
    ax[0].label_outer()

    librosa.display.specshow(librosa.amplitude_to_db(np.abs(D_harmonic), ref=rp),
                             y_axis='log', x_axis='time', ax=ax[1])
    ax[1].set(title='Harmonic spectrogram')
    ax[1].label_outer()

    librosa.display.specshow(librosa.amplitude_to_db(np.abs(D_percussive), ref=rp),
                             y_axis='log', x_axis='time', ax=ax[2])
    ax[2].set(title='Percussive spectrogram')
    fig.colorbar(img, ax=ax)


.. image-sg:: /auto_examples/images/sphx_glr_plot_hprss_001.png
   :alt: Full spectrogram, Harmonic spectrogram, Percussive spectrogram
   :srcset: /auto_examples/images/sphx_glr_plot_hprss_001.png
   :class: sphx-glr-single-img


.. GENERATED FROM PYTHON SOURCE LINES 66-68

We can also invert the separated spectrograms to play back the audio.
First the harmonic signal:

.. GENERATED FROM PYTHON SOURCE LINES 68-72

.. code-block:: Python


    y_harmonic = librosa.istft(D_harmonic, length=len(y))
    Audio(data=y_harmonic, rate=sr)


.. GENERATED FROM PYTHON SOURCE LINES 73-74

And next the percussive signal:

.. GENERATED FROM PYTHON SOURCE LINES 74-79

.. code-block:: Python


    y_percussive = librosa.istft(D_percussive, length=len(y))
    Audio(data=y_percussive, rate=sr)


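The idea behind the median-filtering decomposition can be pictured with a small NumPy-only sketch on a toy magnitude spectrogram. This is an illustration under stated assumptions, not librosa's implementation: the helper names ``sliding_median`` and ``toy_hpss_masks``, the window width of 9, and the squared soft-mask form are all choices made here for the example. A median taken across time keeps horizontal (sustained) ridges, a median taken across frequency keeps vertical (transient) ridges, and each bin is then shared between the two in proportion to the filter responses.

```python
import numpy as np


def sliding_median(S, width, axis):
    """Median over an odd-length sliding window along `axis` (reflect-padded)."""
    pad = width // 2
    pad_widths = [(0, 0)] * S.ndim
    pad_widths[axis] = (pad, pad)
    padded = np.pad(S, pad_widths, mode="reflect")
    # The window axis is appended as the last dimension of the view
    windows = np.lib.stride_tricks.sliding_window_view(padded, width, axis=axis)
    return np.median(windows, axis=-1)


def toy_hpss_masks(S, width=9, power=2.0):
    """Hypothetical helper: soft masks from horizontal and vertical median filters."""
    harm = sliding_median(S, width, axis=1)  # smooth across time (columns)
    perc = sliding_median(S, width, axis=0)  # smooth across frequency (rows)
    h, p = harm**power, perc**power
    total = h + p
    safe_total = np.where(total > 0, total, 1.0)
    # Where both filters respond with zero, split the bin evenly
    mask_h = np.where(total > 0, h / safe_total, 0.5)
    return mask_h, 1.0 - mask_h


# Toy magnitude spectrogram: 64 frequency bins x 80 frames with a flat floor,
# one sustained tone (a horizontal line) and one click (a vertical line)
S = np.full((64, 80), 0.01)
S[20, :] = 1.0   # sustained tone in frequency bin 20
S[:, 40] = 1.0   # broadband click at frame 40

mask_h, mask_p = toy_hpss_masks(S)
S_harmonic, S_percussive = S * mask_h, S * mask_p

print("harmonic share on the tone: ", float(mask_h[20, 10]))
print("percussive share on the click:", float(mask_p[5, 40]))
print("components sum to the input:  ", np.allclose(S_harmonic + S_percussive, S))
```

Because the two masks sum to one in every bin, the two components add back up to the input, which mirrors the identity :math:`D = D_\text{harmonic} + D_\text{percussive}` stated earlier. With a complex STFT, the same masks would be applied to ``D`` directly so that each component keeps the original phase.
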
.. GENERATED FROM PYTHON SOURCE LINES 80-94

The default HPSS above assigns energy to each time-frequency bin according to
whether a horizontal (harmonic) or vertical (percussive) filter responds more
strongly at that position.

This assumes that all energy belongs to either a harmonic or a percussive source,
but it does not handle "noise" well.  Noise energy ends up being spread between
D_harmonic and D_percussive.  Unfortunately, this often also includes vocals
and other sounds that are not purely harmonic or percussive.

If we instead require that the horizontal filter responds more than the vertical
filter *by at least some margin*, and vice versa, then noise can be removed
from both components.

Note: the default (above) corresponds to margin=1

.. GENERATED FROM PYTHON SOURCE LINES 94-102

.. code-block:: Python


    # Let's compute separations for a few different margins and compare the results below
    D_harmonic2, D_percussive2 = librosa.decompose.hpss(D, margin=2)
    D_harmonic4, D_percussive4 = librosa.decompose.hpss(D, margin=4)
    D_harmonic8, D_percussive8 = librosa.decompose.hpss(D, margin=8)
    D_harmonic16, D_percussive16 = librosa.decompose.hpss(D, margin=16)


.. GENERATED FROM PYTHON SOURCE LINES 103-105

In the plots below, note that vibrato has been suppressed from the harmonic
components, and vocals have been suppressed in the percussive components.

.. GENERATED FROM PYTHON SOURCE LINES 105-144

.. code-block:: Python

    fig, ax = plt.subplots(nrows=5, ncols=2, sharex=True, sharey=True, figsize=(10, 10))
    librosa.display.specshow(librosa.amplitude_to_db(np.abs(D_harmonic), ref=rp),
                             y_axis='log', x_axis='time', ax=ax[0, 0])
    ax[0, 0].set(title='Harmonic')

    librosa.display.specshow(librosa.amplitude_to_db(np.abs(D_percussive), ref=rp),
                             y_axis='log', x_axis='time', ax=ax[0, 1])
    ax[0, 1].set(title='Percussive')

    librosa.display.specshow(librosa.amplitude_to_db(np.abs(D_harmonic2), ref=rp),
                             y_axis='log', x_axis='time', ax=ax[1, 0])

    librosa.display.specshow(librosa.amplitude_to_db(np.abs(D_percussive2), ref=rp),
                             y_axis='log', x_axis='time', ax=ax[1, 1])

    librosa.display.specshow(librosa.amplitude_to_db(np.abs(D_harmonic4), ref=rp),
                             y_axis='log', x_axis='time', ax=ax[2, 0])

    librosa.display.specshow(librosa.amplitude_to_db(np.abs(D_percussive4), ref=rp),
                             y_axis='log', x_axis='time', ax=ax[2, 1])

    librosa.display.specshow(librosa.amplitude_to_db(np.abs(D_harmonic8), ref=rp),
                             y_axis='log', x_axis='time', ax=ax[3, 0])

    librosa.display.specshow(librosa.amplitude_to_db(np.abs(D_percussive8), ref=rp),
                             y_axis='log', x_axis='time', ax=ax[3, 1])

    librosa.display.specshow(librosa.amplitude_to_db(np.abs(D_harmonic16), ref=rp),
                             y_axis='log', x_axis='time', ax=ax[4, 0])

    librosa.display.specshow(librosa.amplitude_to_db(np.abs(D_percussive16), ref=rp),
                             y_axis='log', x_axis='time', ax=ax[4, 1])

    # Label each row with its margin (1, 2, 4, 8, 16) and hide inner tick labels
    for i in range(5):
        ax[i, 0].set(ylabel='margin={:d}'.format(2**i))
        ax[i, 0].label_outer()
        ax[i, 1].label_outer()


.. image-sg:: /auto_examples/images/sphx_glr_plot_hprss_002.png
   :alt: Harmonic, Percussive
   :srcset: /auto_examples/images/sphx_glr_plot_hprss_002.png
   :class: sphx-glr-single-img


.. GENERATED FROM PYTHON SOURCE LINES 145-150

In the plots above, it looks like margins of 4 or greater are sufficient to
produce strictly harmonic and percussive components.

We can invert and play those components back just as before.
Again, starting with the harmonic component:

.. GENERATED FROM PYTHON SOURCE LINES 150-154

.. code-block:: Python


    y_harmonic4 = librosa.istft(D_harmonic4, length=len(y))
    Audio(data=y_harmonic4, rate=sr)


.. GENERATED FROM PYTHON SOURCE LINES 155-156

And the percussive component:

.. GENERATED FROM PYTHON SOURCE LINES 156-159

.. code-block:: Python

    y_percussive4 = librosa.istft(D_percussive4, length=len(y))
    Audio(data=y_percussive4, rate=sr)


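To see why a margin larger than 1 removes energy from both components, here is a minimal sketch of margin-scaled soft masks. It assumes masks of the form :math:`H^p / (H^p + (mP)^p)` and :math:`P^p / (P^p + (mH)^p)`, where :math:`H` and :math:`P` are the horizontal and vertical filter responses, :math:`m` is the margin, and :math:`p` is the mask exponent. The function ``margin_masks`` and the three filter-response values are hypothetical and chosen only for illustration; the exact masking used by ``librosa.decompose.hpss`` may differ in detail.

```python
import numpy as np


def margin_masks(harm, perc, margin=1.0, power=2.0):
    """Hypothetical helper: each response competes against the other scaled by `margin`."""
    h = harm**power
    p = perc**power
    m = margin**power
    tiny = np.finfo(float).tiny  # guards against division by zero
    mask_h = h / np.maximum(h + m * p, tiny)
    mask_p = p / np.maximum(p + m * h, tiny)
    return mask_h, mask_p


# Made-up filter responses for three bins:
# clearly harmonic, clearly percussive, and ambiguous ("noise"-like)
harm = np.array([1.0, 0.01, 0.5])
perc = np.array([0.01, 1.0, 0.5])

residuals = {}
for margin in (1, 2, 4, 8, 16):
    mask_h, mask_p = margin_masks(harm, perc, margin=margin)
    # Fraction of each bin's energy assigned to neither component
    residuals[margin] = 1.0 - (mask_h + mask_p)
    print("margin={:2d}  residual fraction per bin: {}".format(
        margin, np.round(residuals[margin], 3)))
```

With margin=1 the two masks sum to one, so nothing is left over. As the margin grows, the ambiguous bin is increasingly rejected by both masks while the clearly harmonic and clearly percussive bins are almost unaffected. For the spectrograms above, the corresponding leftover would be ``D - (D_harmonic4 + D_percussive4)``.
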
.. rst-class:: sphx-glr-timing

   **Total running time of the script:** (0 minutes 3.963 seconds)


.. _sphx_glr_download_auto_examples_plot_hprss.py:

.. only:: html

  .. container:: sphx-glr-footer sphx-glr-footer-example

    .. container:: sphx-glr-download sphx-glr-download-jupyter

      :download:`Download Jupyter notebook: plot_hprss.ipynb <plot_hprss.ipynb>`

    .. container:: sphx-glr-download sphx-glr-download-python

      :download:`Download Python source code: plot_hprss.py <plot_hprss.py>`


.. only:: html

 .. rst-class:: sphx-glr-signature

    Gallery generated by Sphinx-Gallery