

librosa.sequence.viterbi_binary(prob, transition, *, p_state=None, p_init=None, return_logp=False)

Viterbi decoding from binary (multi-label), discriminative state predictions.

Given a sequence of state predictions prob[s, t], indicating the conditional likelihood of state s being active given the observation at time t, and a 2x2 transition matrix transition, which encodes the conditional probability of moving from state s to state ~s (not-s), the Viterbi algorithm computes the most likely sequence of states from the observations.

This function differs from viterbi_discriminative in that it does not assume the states to be mutually exclusive. viterbi_binary is implemented by transforming the multi-label decoding problem to a collection of binary Viterbi problems (one for each state or label).

The output is a binary matrix states[s, t] indicating whether each state s is active at time t.
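The reduction described above can be sketched in plain NumPy. This is a minimal illustration, not librosa's implementation: the helper names binary_viterbi_1d and multilabel_viterbi are hypothetical, and dividing each prediction by the marginal p_state to obtain an unnormalized likelihood is an assumption made for this sketch.

```python
import numpy as np


def binary_viterbi_1d(p_active, transition, p_state=0.5, p_init=0.5):
    """Decode one label's activation sequence with a two-state Viterbi pass.

    Hypothetical helper for illustration only.
    """
    eps = np.finfo(float).tiny  # guards against log(0)
    p_active = np.asarray(p_active, dtype=float)
    n_steps = p_active.shape[0]

    # Row 0 = inactive, row 1 = active.
    prob = np.stack([1.0 - p_active, p_active])
    marginal = np.array([1.0 - p_state, p_state])
    # Assumption: predictions are divided by the marginal to obtain an
    # unnormalized observation likelihood.
    log_like = np.log(prob + eps) - np.log(marginal + eps)[:, None]
    log_trans = np.log(np.asarray(transition, dtype=float) + eps)
    log_init = np.log(np.array([1.0 - p_init, p_init]) + eps)

    value = np.zeros((n_steps, 2))
    back = np.zeros((n_steps, 2), dtype=int)
    value[0] = log_like[:, 0] + log_init
    for t in range(1, n_steps):
        # scores[i, j]: best log-score ending in i at t-1, then moving to j.
        scores = value[t - 1][:, None] + log_trans
        back[t] = np.argmax(scores, axis=0)
        value[t] = log_like[:, t] + np.max(scores, axis=0)

    # Trace the best path backwards from the final step.
    states = np.zeros(n_steps, dtype=int)
    states[-1] = np.argmax(value[-1])
    for t in range(n_steps - 2, -1, -1):
        states[t] = back[t + 1, states[t + 1]]
    return states


def multilabel_viterbi(prob, transition):
    """Decode each row (label) of prob independently and stack the results."""
    prob = np.atleast_2d(prob)
    return np.stack([binary_viterbi_1d(row, transition) for row in prob])


trans = np.array([[0.9, 0.1], [0.3, 0.7]])
prob = np.array([0.1, 0.7, 0.4, 0.3, 0.8, 0.9, 0.8, 0.2, 0.6, 0.3])
states = multilabel_viterbi(prob, trans)
print(states)
```

With the transition matrix and predictions used in the example at the end of this page, the sketch yields [[0, 0, 0, 0, 1, 1, 1, 0, 0, 0]]. Because each label is decoded on its own, several labels may be active at the same time step.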

Like viterbi_discriminative, the probabilities of the optimal state sequences are not normalized here. If using the return_logp=True option (see below), be aware that the “probabilities” may not sum to (and may exceed) 1.

Parameters

prob : np.ndarray [shape=(…, n_steps,) or (…, n_states, n_steps)], non-negative

prob[s, t] is the probability of state s being active conditional on the observation at time t. Values must lie between 0 and 1.

If prob is 1-dimensional, it is expanded to shape (1, n_steps).

If prob contains multiple input channels, then each channel is decoded independently.

transition : np.ndarray [shape=(2, 2) or (n_states, 2, 2)], non-negative

If 2-dimensional, the same transition matrix is applied to each sub-problem. transition[0, i] is the probability of the state going from inactive to i, transition[1, i] is the probability of the state going from active to i. Each row must sum to 1.

If 3-dimensional, transition[s] is interpreted as the 2x2 transition matrix for state label s.
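The two accepted shapes can be illustrated with plain NumPy. This is a sketch under stated assumptions: self_loop_transition is a hypothetical helper (not part of librosa), and the self-transition probabilities are made up for illustration.

```python
import numpy as np


def self_loop_transition(p_stay_inactive, p_stay_active):
    """Build a 2x2 matrix: row 0 = from inactive, row 1 = from active.

    Hypothetical helper for illustration only.
    """
    return np.array(
        [
            [p_stay_inactive, 1.0 - p_stay_inactive],
            [1.0 - p_stay_active, p_stay_active],
        ]
    )


# One matrix shared by every label: shape (2, 2).
shared = self_loop_transition(0.9, 0.7)

# One matrix per label: shape (n_states, 2, 2), here with n_states = 3.
per_label = np.stack(
    [
        self_loop_transition(0.9, 0.7),
        self_loop_transition(0.8, 0.8),
        self_loop_transition(0.95, 0.5),
    ]
)

# Every row must sum to 1 in both layouts.
assert np.allclose(shared.sum(axis=-1), 1.0)
assert np.allclose(per_label.sum(axis=-1), 1.0)
```

Either array could then be passed as transition, provided the leading dimension of the 3-dimensional form matches the number of state labels in prob.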

p_state : np.ndarray [shape=(n_states,)]

Optional: marginal probability for each state (between [0,1]). If not provided, a uniform distribution (0.5 for each state) is assumed.

p_init : np.ndarray [shape=(n_states,)]

Optional: initial state distribution. If not provided, it is assumed to be uniform.


return_logp : bool

If True, return the (unnormalized) log-likelihood of the state sequences.

Returns

Either states or (states, logp):
states : np.ndarray [shape=(…, n_states, n_steps)]

The most likely state sequence.

logp : np.ndarray [shape=(…, n_states,)]

If return_logp=True, the (unnormalized) log probability of each state activation sequence in states.

See also


viterbi : Viterbi decoding from observation likelihoods

viterbi_discriminative : Viterbi decoding for discriminative (mutually exclusive) state predictions


Examples

In this example, we have a sequence of binary state likelihoods that we want to de-noise under the assumption that state changes are relatively uncommon. Positive predictions should only be retained if they persist for multiple steps, and any transient predictions should be considered as errors. This use case arises frequently in problems such as instrument recognition, where state activations tend to be stable over time, but subject to abrupt changes (e.g., when an instrument joins the mix).

We assume that the 0 state has a self-transition probability of 90%, and the 1 state has a self-transition probability of 70%. We assume the marginal and initial probability of either state is 50%.

>>> import numpy as np
>>> import librosa
>>> trans = np.array([[0.9, 0.1], [0.3, 0.7]])
>>> prob = np.array([0.1, 0.7, 0.4, 0.3, 0.8, 0.9, 0.8, 0.2, 0.6, 0.3])
>>> librosa.sequence.viterbi_binary(prob, trans, p_state=0.5, p_init=0.5)
array([[0, 0, 0, 0, 1, 1, 1, 0, 0, 0]])