.. DO NOT EDIT. .. THIS FILE WAS AUTOMATICALLY GENERATED BY SPHINX-GALLERY. .. TO MAKE CHANGES, EDIT THE SOURCE PYTHON FILE: .. "auto_examples/plot_segmentation.py" .. LINE NUMBERS ARE GIVEN BELOW. .. only:: html .. note:: :class: sphx-glr-download-link-note Click :ref:`here ` to download the full example code .. rst-class:: sphx-glr-example-title .. _sphx_glr_auto_examples_plot_segmentation.py: ====================== Laplacian segmentation ====================== This notebook implements the laplacian segmentation method of `McFee and Ellis, 2014 `_, with a couple of minor stability improvements. Throughout the example, we will refer to equations in the paper by number, so it will be helpful to read along. .. GENERATED FROM PYTHON SOURCE LINES 14-19 .. code-block:: default # Code source: Brian McFee # License: ISC .. GENERATED FROM PYTHON SOURCE LINES 20-26 Imports - numpy for basic functionality - scipy for graph Laplacian - matplotlib for visualization - sklearn.cluster for K-Means .. GENERATED FROM PYTHON SOURCE LINES 26-37 .. code-block:: default from __future__ import print_function import numpy as np import scipy import matplotlib.pyplot as plt import sklearn.cluster import librosa import librosa.display .. GENERATED FROM PYTHON SOURCE LINES 38-39 First, we'll load in a song .. GENERATED FROM PYTHON SOURCE LINES 39-42 .. code-block:: default y, sr = librosa.load('audio/Karissa_Hobbs_-_09_-_Lets_Go_Fishin.mp3') .. rst-class:: sphx-glr-script-out Out: .. code-block:: none /tmp/tmpfl4ra6qp/b0064fe7dbe8048b1d4148e61a568b6fe3fca91b/librosa/core/audio.py:161: UserWarning: PySoundFile failed. Trying audioread instead. warnings.warn('PySoundFile failed. Trying audioread instead.') .. GENERATED FROM PYTHON SOURCE LINES 43-44 Next, we'll compute and plot a log-power CQT .. GENERATED FROM PYTHON SOURCE LINES 44-58 .. code-block:: default BINS_PER_OCTAVE = 12 * 3 N_OCTAVES = 7 C = librosa.amplitude_to_db(np.abs(librosa.cqt(y=y, sr=sr, bins_per_octave=BINS_PER_OCTAVE, n_bins=N_OCTAVES * BINS_PER_OCTAVE)), ref=np.max) plt.figure(figsize=(12, 4)) librosa.display.specshow(C, y_axis='cqt_hz', sr=sr, bins_per_octave=BINS_PER_OCTAVE, x_axis='time') plt.tight_layout() .. image-sg:: /auto_examples/images/sphx_glr_plot_segmentation_001.png :alt: plot segmentation :srcset: /auto_examples/images/sphx_glr_plot_segmentation_001.png :class: sphx-glr-single-img .. rst-class:: sphx-glr-script-out Out: .. code-block:: none /tmp/tmpfl4ra6qp/b0064fe7dbe8048b1d4148e61a568b6fe3fca91b/librosa/display.py:862: MatplotlibDeprecationWarning: The 'basey' parameter of __init__() has been renamed 'base' since Matplotlib 3.3; support for the old name will be dropped two minor releases later. scaler(mode, **kwargs) .. GENERATED FROM PYTHON SOURCE LINES 59-60 To reduce dimensionality, we'll beat-synchronous the CQT .. GENERATED FROM PYTHON SOURCE LINES 60-77 .. code-block:: default tempo, beats = librosa.beat.beat_track(y=y, sr=sr, trim=False) Csync = librosa.util.sync(C, beats, aggregate=np.median) # For plotting purposes, we'll need the timing of the beats # we fix_frames to include non-beat frames 0 and C.shape[1] (final frame) beat_times = librosa.frames_to_time(librosa.util.fix_frames(beats, x_min=0, x_max=C.shape[1]), sr=sr) plt.figure(figsize=(12, 4)) librosa.display.specshow(Csync, bins_per_octave=12*3, y_axis='cqt_hz', x_axis='time', x_coords=beat_times) plt.tight_layout() .. image-sg:: /auto_examples/images/sphx_glr_plot_segmentation_002.png :alt: plot segmentation :srcset: /auto_examples/images/sphx_glr_plot_segmentation_002.png :class: sphx-glr-single-img .. rst-class:: sphx-glr-script-out Out: .. code-block:: none /tmp/tmpfl4ra6qp/b0064fe7dbe8048b1d4148e61a568b6fe3fca91b/librosa/display.py:862: MatplotlibDeprecationWarning: The 'basey' parameter of __init__() has been renamed 'base' since Matplotlib 3.3; support for the old name will be dropped two minor releases later. scaler(mode, **kwargs) .. GENERATED FROM PYTHON SOURCE LINES 78-82 Let's build a weighted recurrence matrix using beat-synchronous CQT (Equation 1) width=3 prevents links within the same bar mode='affinity' here implements S_rep (after Eq. 8) .. GENERATED FROM PYTHON SOURCE LINES 82-90 .. code-block:: default R = librosa.segment.recurrence_matrix(Csync, width=3, mode='affinity', sym=True) # Enhance diagonals with a median filter (Equation 2) df = librosa.segment.timelag_filter(scipy.ndimage.median_filter) Rf = df(R, size=(1, 7)) .. GENERATED FROM PYTHON SOURCE LINES 91-97 Now let's build the sequence matrix (S_loc) using mfcc-similarity :math:`R_\text{path}[i, i\pm 1] = \exp(-\|C_i - C_{i\pm 1}\|^2 / \sigma^2)` Here, we take :math:`\sigma` to be the median distance between successive beats. .. GENERATED FROM PYTHON SOURCE LINES 97-107 .. code-block:: default mfcc = librosa.feature.mfcc(y=y, sr=sr) Msync = librosa.util.sync(mfcc, beats) path_distance = np.sum(np.diff(Msync, axis=1)**2, axis=0) sigma = np.median(path_distance) path_sim = np.exp(-path_distance / sigma) R_path = np.diag(path_sim, k=1) + np.diag(path_sim, k=-1) .. GENERATED FROM PYTHON SOURCE LINES 108-109 And compute the balanced combination (Equations 6, 7, 9) .. GENERATED FROM PYTHON SOURCE LINES 109-118 .. code-block:: default deg_path = np.sum(R_path, axis=1) deg_rec = np.sum(Rf, axis=1) mu = deg_path.dot(deg_path + deg_rec) / np.sum((deg_path + deg_rec)**2) A = mu * Rf + (1 - mu) * R_path .. GENERATED FROM PYTHON SOURCE LINES 119-120 Plot the resulting graphs (Figure 1, left and center) .. GENERATED FROM PYTHON SOURCE LINES 120-134 .. code-block:: default plt.figure(figsize=(8, 4)) plt.subplot(1, 3, 1) librosa.display.specshow(Rf, cmap='inferno_r', y_axis='time', y_coords=beat_times) plt.title('Recurrence similarity') plt.subplot(1, 3, 2) librosa.display.specshow(R_path, cmap='inferno_r') plt.title('Path similarity') plt.subplot(1, 3, 3) librosa.display.specshow(A, cmap='inferno_r') plt.title('Combined graph') plt.tight_layout() .. image-sg:: /auto_examples/images/sphx_glr_plot_segmentation_003.png :alt: Recurrence similarity, Path similarity, Combined graph :srcset: /auto_examples/images/sphx_glr_plot_segmentation_003.png :class: sphx-glr-single-img .. GENERATED FROM PYTHON SOURCE LINES 135-136 Now let's compute the normalized Laplacian (Eq. 10) .. GENERATED FROM PYTHON SOURCE LINES 136-174 .. code-block:: default L = scipy.sparse.csgraph.laplacian(A, normed=True) # and its spectral decomposition evals, evecs = scipy.linalg.eigh(L) # We can clean this up further with a median filter. # This can help smooth over small discontinuities evecs = scipy.ndimage.median_filter(evecs, size=(9, 1)) # cumulative normalization is needed for symmetric normalize laplacian eigenvectors Cnorm = np.cumsum(evecs**2, axis=1)**0.5 # If we want k clusters, use the first k normalized eigenvectors. # Fun exercise: see how the segmentation changes as you vary k k = 5 X = evecs[:, :k] / Cnorm[:, k-1:k] # Plot the resulting representation (Figure 1, center and right) plt.figure(figsize=(8, 4)) plt.subplot(1, 2, 2) librosa.display.specshow(Rf, cmap='inferno_r') plt.title('Recurrence matrix') plt.subplot(1, 2, 1) librosa.display.specshow(X, y_axis='time', y_coords=beat_times) plt.title('Structure components') plt.tight_layout() .. image-sg:: /auto_examples/images/sphx_glr_plot_segmentation_004.png :alt: Recurrence matrix, Structure components :srcset: /auto_examples/images/sphx_glr_plot_segmentation_004.png :class: sphx-glr-single-img .. GENERATED FROM PYTHON SOURCE LINES 175-177 Let's use these k components to cluster beats into segments (Algorithm 1) .. GENERATED FROM PYTHON SOURCE LINES 177-201 .. code-block:: default KM = sklearn.cluster.KMeans(n_clusters=k) seg_ids = KM.fit_predict(X) # and plot the results plt.figure(figsize=(12, 4)) colors = plt.get_cmap('Paired', k) plt.subplot(1, 3, 2) librosa.display.specshow(Rf, cmap='inferno_r') plt.title('Recurrence matrix') plt.subplot(1, 3, 1) librosa.display.specshow(X, y_axis='time', y_coords=beat_times) plt.title('Structure components') plt.subplot(1, 3, 3) librosa.display.specshow(np.atleast_2d(seg_ids).T, cmap=colors) plt.title('Estimated segments') plt.colorbar(ticks=range(k)) plt.tight_layout() .. image-sg:: /auto_examples/images/sphx_glr_plot_segmentation_005.png :alt: Recurrence matrix, Structure components, Estimated segments :srcset: /auto_examples/images/sphx_glr_plot_segmentation_005.png :class: sphx-glr-single-img .. GENERATED FROM PYTHON SOURCE LINES 202-203 Locate segment boundaries from the label sequence .. GENERATED FROM PYTHON SOURCE LINES 203-219 .. code-block:: default bound_beats = 1 + np.flatnonzero(seg_ids[:-1] != seg_ids[1:]) # Count beat 0 as a boundary bound_beats = librosa.util.fix_frames(bound_beats, x_min=0) # Compute the segment label for each boundary bound_segs = list(seg_ids[bound_beats]) # Convert beat indices to frames bound_frames = beats[bound_beats] # Make sure we cover to the end of the track bound_frames = librosa.util.fix_frames(bound_frames, x_min=None, x_max=C.shape[1]-1) .. GENERATED FROM PYTHON SOURCE LINES 220-221 And plot the final segmentation over original CQT .. GENERATED FROM PYTHON SOURCE LINES 221-247 .. code-block:: default # sphinx_gallery_thumbnail_number = 5 import matplotlib.patches as patches plt.figure(figsize=(12, 4)) bound_times = librosa.frames_to_time(bound_frames) freqs = librosa.cqt_frequencies(n_bins=C.shape[0], fmin=librosa.note_to_hz('C1'), bins_per_octave=BINS_PER_OCTAVE) librosa.display.specshow(C, y_axis='cqt_hz', sr=sr, bins_per_octave=BINS_PER_OCTAVE, x_axis='time') ax = plt.gca() for interval, label in zip(zip(bound_times, bound_times[1:]), bound_segs): ax.add_patch(patches.Rectangle((interval[0], freqs[0]), interval[1] - interval[0], freqs[-1], facecolor=colors(label), alpha=0.50)) plt.tight_layout() plt.show() .. image-sg:: /auto_examples/images/sphx_glr_plot_segmentation_006.png :alt: plot segmentation :srcset: /auto_examples/images/sphx_glr_plot_segmentation_006.png :class: sphx-glr-single-img .. rst-class:: sphx-glr-script-out Out: .. code-block:: none /tmp/tmpfl4ra6qp/b0064fe7dbe8048b1d4148e61a568b6fe3fca91b/librosa/display.py:862: MatplotlibDeprecationWarning: The 'basey' parameter of __init__() has been renamed 'base' since Matplotlib 3.3; support for the old name will be dropped two minor releases later. scaler(mode, **kwargs) .. rst-class:: sphx-glr-timing **Total running time of the script:** ( 0 minutes 12.621 seconds) .. _sphx_glr_download_auto_examples_plot_segmentation.py: .. only :: html .. container:: sphx-glr-footer :class: sphx-glr-footer-example .. container:: sphx-glr-download sphx-glr-download-python :download:`Download Python source code: plot_segmentation.py ` .. container:: sphx-glr-download sphx-glr-download-jupyter :download:`Download Jupyter notebook: plot_segmentation.ipynb ` .. only:: html .. rst-class:: sphx-glr-signature `Gallery generated by Sphinx-Gallery `_