.. DO NOT EDIT.
.. THIS FILE WAS AUTOMATICALLY GENERATED BY SPHINX-GALLERY.
.. TO MAKE CHANGES, EDIT THE SOURCE PYTHON FILE:
.. "auto_examples/plot_hprss.py"
.. LINE NUMBERS ARE GIVEN BELOW.

.. only:: html

    .. note::
        :class: sphx-glr-download-link-note

        :ref:`Go to the end `
        to download the full example code.

.. rst-class:: sphx-glr-example-title

.. _sphx_glr_auto_examples_plot_hprss.py:


=====================================
Harmonic-percussive source separation
=====================================

This notebook illustrates how to separate an audio signal into
its harmonic and percussive components.

We'll compare the original median-filtering based approach of
`Fitzgerald, 2010 `_
and its margin-based extension due to `Driedger, Mueller and Disch, 2014
`_.

.. GENERATED FROM PYTHON SOURCE LINES 17-26

.. code-block:: Python


    import numpy as np
    import matplotlib.pyplot as plt
    from IPython.display import Audio

    import librosa
    import librosa.display

.. GENERATED FROM PYTHON SOURCE LINES 27-28

Load an example clip containing both harmonic and percussive content

.. GENERATED FROM PYTHON SOURCE LINES 28-32

.. code-block:: Python

    y, sr = librosa.load(librosa.ex('fishin'), duration=5, offset=10)
    Audio(data=y, rate=sr)


.. GENERATED FROM PYTHON SOURCE LINES 33-34

Compute the short-time Fourier transform of y

.. GENERATED FROM PYTHON SOURCE LINES 34-36

.. code-block:: Python

    D = librosa.stft(y)

.. GENERATED FROM PYTHON SOURCE LINES 37-40

Decompose D into harmonic and percussive components

:math:`D = D_\text{harmonic} + D_\text{percussive}`

.. GENERATED FROM PYTHON SOURCE LINES 40-43

.. code-block:: Python

    D_harmonic, D_percussive = librosa.decompose.hpss(D)

.. GENERATED FROM PYTHON SOURCE LINES 44-45

We can plot the two components along with the original spectrogram

.. GENERATED FROM PYTHON SOURCE LINES 45-66

.. code-block:: Python

    # Pre-compute a global reference amplitude from the input spectrum,
    # so that all three panels share the same dB scale
    rp = np.max(np.abs(D))

    fig, ax = plt.subplots(nrows=3, sharex=True, sharey=True)

    img = librosa.display.specshow(librosa.amplitude_to_db(np.abs(D), ref=rp),
                                   y_axis='log', x_axis='time', ax=ax[0])
    ax[0].set(title='Full spectrogram')
    ax[0].label_outer()

    librosa.display.specshow(librosa.amplitude_to_db(np.abs(D_harmonic), ref=rp),
                             y_axis='log', x_axis='time', ax=ax[1])
    ax[1].set(title='Harmonic spectrogram')
    ax[1].label_outer()

    librosa.display.specshow(librosa.amplitude_to_db(np.abs(D_percussive), ref=rp),
                             y_axis='log', x_axis='time', ax=ax[2])
    ax[2].set(title='Percussive spectrogram')
    fig.colorbar(img, ax=ax)

.. image-sg:: /auto_examples/images/sphx_glr_plot_hprss_001.png
   :alt: Full spectrogram, Harmonic spectrogram, Percussive spectrogram
   :srcset: /auto_examples/images/sphx_glr_plot_hprss_001.png
   :class: sphx-glr-single-img

.. GENERATED FROM PYTHON SOURCE LINES 67-69

We can also invert the separated spectrograms to play back the audio.
First the harmonic signal:

.. GENERATED FROM PYTHON SOURCE LINES 69-73

.. code-block:: Python

    y_harmonic = librosa.istft(D_harmonic, length=len(y))
    Audio(data=y_harmonic, rate=sr)


.. GENERATED FROM PYTHON SOURCE LINES 74-75

And next the percussive signal:

.. GENERATED FROM PYTHON SOURCE LINES 75-80

.. code-block:: Python

    y_percussive = librosa.istft(D_percussive, length=len(y))
    Audio(data=y_percussive, rate=sr)


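The median-filtering idea behind the decomposition above can be illustrated
on a toy magnitude spectrogram. The following is a minimal NumPy-only sketch,
not librosa's implementation: the helper ``sliding_median``, the function
``hpss_binary_sketch``, the kernel length of 17 and the hard (binary) masks
are all assumptions made for illustration, and ``librosa.decompose.hpss``
differs in its details. A median filter along time enhances steady (harmonic)
energy, a median filter along frequency enhances transient (percussive)
energy, and each bin is assigned to whichever response is larger, so the two
outputs add back up to the input.

.. code-block:: Python

    import numpy as np
    from numpy.lib.stride_tricks import sliding_window_view


    def sliding_median(S, size, axis):
        """Median over a sliding window of odd length `size` along `axis`."""
        half = size // 2
        pad = [(0, 0)] * S.ndim
        pad[axis] = (half, half)
        # Reflect-pad so that the output has the same shape as the input
        padded = np.pad(S, pad, mode="reflect")
        windows = sliding_window_view(padded, size, axis=axis)
        return np.median(windows, axis=-1)


    def hpss_binary_sketch(S, size=17):
        """Split a magnitude spectrogram (frequency x time) with binary masks."""
        harm = sliding_median(S, size, axis=1)  # smooth along time
        perc = sliding_median(S, size, axis=0)  # smooth along frequency
        mask_harm = harm >= perc                # ties go to the harmonic part
        return S * mask_harm, S * ~mask_harm


    # Toy spectrogram: a steady tone (horizontal line) plus a click (vertical line)
    S = np.zeros((64, 80))
    S[20, :] += 1.0
    S[:, 40] += 1.0

    H, P = hpss_binary_sketch(S)
    print(np.allclose(H + P, S))  # the two parts add up to the input
    print(H[20, 10], P[20, 10])   # a bin on the tone, away from the click
    print(P[5, 40], H[5, 40])     # a bin on the click, away from the tone

In this toy case the tone ends up in ``H`` and the click in ``P``. The single
bin where they cross is a tie, which this sketch arbitrarily gives to the
harmonic part.
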
.. GENERATED FROM PYTHON SOURCE LINES 81-95

The default HPSS above assigns energy to each time-frequency bin according to
whether a horizontal (harmonic) or vertical (percussive) filter responds more
strongly at that position.

This assumes that all energy belongs to either a harmonic or a percussive
source, and it does not handle "noise" well. Noise energy ends up spread
between ``D_harmonic`` and ``D_percussive``. Unfortunately, this often also
affects vocals and other sounds that are not purely harmonic or percussive.

If we instead require that the horizontal filter responds more than the
vertical filter *by at least some margin*, and vice versa, then noise can be
removed from both components.

Note: the default (above) corresponds to ``margin=1``

.. GENERATED FROM PYTHON SOURCE LINES 95-103

.. code-block:: Python


    # Let's compute separations for a few different margins and compare the results below
    D_harmonic2, D_percussive2 = librosa.decompose.hpss(D, margin=2)
    D_harmonic4, D_percussive4 = librosa.decompose.hpss(D, margin=4)
    D_harmonic8, D_percussive8 = librosa.decompose.hpss(D, margin=8)
    D_harmonic16, D_percussive16 = librosa.decompose.hpss(D, margin=16)

.. GENERATED FROM PYTHON SOURCE LINES 104-106

In the plots below, note that vibrato has been suppressed from the harmonic
components, and vocals have been suppressed in the percussive components.

.. GENERATED FROM PYTHON SOURCE LINES 106-145

.. code-block:: Python

    fig, ax = plt.subplots(nrows=5, ncols=2, sharex=True, sharey=True, figsize=(10, 10))

    # Top row: the default separation (margin=1)
    librosa.display.specshow(librosa.amplitude_to_db(np.abs(D_harmonic), ref=rp),
                             y_axis='log', x_axis='time', ax=ax[0, 0])
    ax[0, 0].set(title='Harmonic')

    librosa.display.specshow(librosa.amplitude_to_db(np.abs(D_percussive), ref=rp),
                             y_axis='log', x_axis='time', ax=ax[0, 1])
    ax[0, 1].set(title='Percussive')

    # Remaining rows: margins of 2, 4, 8 and 16
    librosa.display.specshow(librosa.amplitude_to_db(np.abs(D_harmonic2), ref=rp),
                             y_axis='log', x_axis='time', ax=ax[1, 0])

    librosa.display.specshow(librosa.amplitude_to_db(np.abs(D_percussive2), ref=rp),
                             y_axis='log', x_axis='time', ax=ax[1, 1])

    librosa.display.specshow(librosa.amplitude_to_db(np.abs(D_harmonic4), ref=rp),
                             y_axis='log', x_axis='time', ax=ax[2, 0])

    librosa.display.specshow(librosa.amplitude_to_db(np.abs(D_percussive4), ref=rp),
                             y_axis='log', x_axis='time', ax=ax[2, 1])

    librosa.display.specshow(librosa.amplitude_to_db(np.abs(D_harmonic8), ref=rp),
                             y_axis='log', x_axis='time', ax=ax[3, 0])

    librosa.display.specshow(librosa.amplitude_to_db(np.abs(D_percussive8), ref=rp),
                             y_axis='log', x_axis='time', ax=ax[3, 1])

    librosa.display.specshow(librosa.amplitude_to_db(np.abs(D_harmonic16), ref=rp),
                             y_axis='log', x_axis='time', ax=ax[4, 0])

    librosa.display.specshow(librosa.amplitude_to_db(np.abs(D_percussive16), ref=rp),
                             y_axis='log', x_axis='time', ax=ax[4, 1])

    # Label each row with its margin: 1, 2, 4, 8, 16
    for i in range(5):
        ax[i, 0].set(ylabel='margin={:d}'.format(2**i))
        ax[i, 0].label_outer()
        ax[i, 1].label_outer()

.. image-sg:: /auto_examples/images/sphx_glr_plot_hprss_002.png
   :alt: Harmonic, Percussive
   :srcset: /auto_examples/images/sphx_glr_plot_hprss_002.png
   :class: sphx-glr-single-img

.. GENERATED FROM PYTHON SOURCE LINES 146-151

In the plots above, it looks like margins of 4 or greater are sufficient to
produce strictly harmonic and percussive components.

We can invert and play those components back just as before.
Again, starting with the harmonic component:

.. GENERATED FROM PYTHON SOURCE LINES 151-155

.. code-block:: Python

    y_harmonic4 = librosa.istft(D_harmonic4, length=len(y))
    Audio(data=y_harmonic4, rate=sr)


.. GENERATED FROM PYTHON SOURCE LINES 156-157

And the percussive component:

.. GENERATED FROM PYTHON SOURCE LINES 157-160

.. code-block:: Python

    y_percussive4 = librosa.istft(D_percussive4, length=len(y))
    Audio(data=y_percussive4, rate=sr)


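The margin rule described earlier can be sketched on a handful of made-up
filter responses. The function ``margin_masks`` and the example values below
are hypothetical and chosen only for illustration. This is a simplified
hard-mask reading of the rule, not the code path librosa uses. With
``margin=1`` every bin lands in exactly one component, while a larger margin
leaves ambiguous bins in neither, so the margin-based components no longer
have to sum to the original spectrogram.

.. code-block:: Python

    import numpy as np


    def margin_masks(harm, perc, margin=1.0):
        """Binary masks from (hypothetical) harmonic and percussive filter responses."""
        mask_harm = harm > margin * perc       # horizontal filter wins by the margin
        mask_perc = perc >= margin * harm      # vertical filter wins by the margin
        mask_resid = ~(mask_harm | mask_perc)  # neither filter wins clearly
        return mask_harm, mask_perc, mask_resid


    # Made-up filter responses for four time-frequency bins
    harm = np.array([8.0, 1.0, 2.0, 0.5])
    perc = np.array([1.0, 8.0, 1.5, 0.5])

    for margin in (1.0, 4.0):
        mh, mp, mr = margin_masks(harm, perc, margin)
        print(margin, mh.astype(int), mp.astype(int), mr.astype(int))

In this example the last two bins, where the two responses are within a
factor of 4 of each other, move from the harmonic and percussive masks into
the residual once the margin is raised from 1 to 4.
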
.. rst-class:: sphx-glr-timing

   **Total running time of the script:** (0 minutes 3.876 seconds)


.. _sphx_glr_download_auto_examples_plot_hprss.py:

.. only:: html

  .. container:: sphx-glr-footer sphx-glr-footer-example

    .. container:: sphx-glr-download sphx-glr-download-jupyter

      :download:`Download Jupyter notebook: plot_hprss.ipynb `

    .. container:: sphx-glr-download sphx-glr-download-python

      :download:`Download Python source code: plot_hprss.py `


.. only:: html

 .. rst-class:: sphx-glr-signature

    `Gallery generated by Sphinx-Gallery `_