ssspy.bss.iva

In this module, we separate multichannel signals using independent vector analysis (IVA). We denote the number of sources and microphones as \(N\) and \(M\), respectively. We also denote short-time Fourier transforms of source, observed, and separated signals as \(\boldsymbol{s}_{ij}\), \(\boldsymbol{x}_{ij}\), and \(\boldsymbol{y}_{ij}\), respectively.

\[\begin{split}\boldsymbol{s}_{ij} &= (s_{ij1},\ldots,s_{ijn},\ldots,s_{ijN})^{\mathsf{T}}\in\mathbb{C}^{N}, \\ \boldsymbol{x}_{ij} &= (x_{ij1},\ldots,x_{ijm},\ldots,x_{ijM})^{\mathsf{T}}\in\mathbb{C}^{M}, \\ \boldsymbol{y}_{ij} &= (y_{ij1},\ldots,y_{ijn},\ldots,y_{ijN})^{\mathsf{T}}\in\mathbb{C}^{N},\end{split}\]

where \(i=1,\ldots,I\) and \(j=1,\ldots,J\) are indices of frequency bins and time frames, respectively. We also define the following vector:

\[\vec{\boldsymbol{y}}_{jn} = (y_{1jn},\ldots,y_{ijn},\ldots,y_{Ijn})^{\mathsf{T}}\in\mathbb{C}^{I}.\]

When a mixing system is time-invariant, \(\boldsymbol{x}_{ij}\) is represented as follows:

\[\boldsymbol{x}_{ij} = \boldsymbol{A}_{i}\boldsymbol{s}_{ij},\]

where \(\boldsymbol{A}_{i}=(\boldsymbol{a}_{i1},\ldots,\boldsymbol{a}_{in},\ldots,\boldsymbol{a}_{iN})\in\mathbb{C}^{M\times N}\) is a mixing matrix. If \(M=N\) and \(\boldsymbol{A}_{i}\) is non-singular, a demixing system is represented as

\[\boldsymbol{y}_{ij} = \boldsymbol{W}_{i}\boldsymbol{x}_{ij},\]

where \(\boldsymbol{W}_{i}=(\boldsymbol{w}_{i1},\ldots,\boldsymbol{w}_{in},\ldots,\boldsymbol{w}_{iN})^{\mathsf{H}}\in\mathbb{C}^{N\times M}\) is a demixing matrix. The negative log-likelihood of observed signals (divided by \(J\)) is computed as follows:

\[\begin{split}\mathcal{L} &= -\frac{1}{J}\log p(\mathcal{X}) \\ &= -\frac{1}{J}\left(\log p(\mathcal{Y}) \ + \sum_{i}\log|\det\boldsymbol{W}_{i}|^{2J} \right) \\ &= -\frac{1}{J}\sum_{j,n}\log p(\vec{\boldsymbol{y}}_{jn}) - 2\sum_{i}\log|\det\boldsymbol{W}_{i}| \\ &= \frac{1}{J}\sum_{j,n}G(\vec{\boldsymbol{y}}_{jn}) - 2\sum_{i}\log|\det\boldsymbol{W}_{i}|, \\ G(\vec{\boldsymbol{y}}_{jn}) &= -\log p(\vec{\boldsymbol{y}}_{jn}),\end{split}\]

where \(G(\vec{\boldsymbol{y}}_{jn})\) is a contrast function. The derivative of \(G(\vec{\boldsymbol{y}}_{jn})\) with respect to \(y_{ijn}^{*}\) is called the score function:

\[\phi_{i}(\vec{\boldsymbol{y}}_{jn}) = \frac{\partial G(\vec{\boldsymbol{y}}_{jn})}{\partial y_{ijn}^{*}}.\]
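
The model above can be sketched numerically. The following is a minimal NumPy illustration, not part of ssspy: the helper crandn and the array layouts are assumptions chosen to match the shapes used on this page, and the contrast function is the \(G(\vec{\boldsymbol{y}}_{jn})=2\|\vec{\boldsymbol{y}}_{jn}\|_{2}\) used in the examples further down. It mixes random sources with \(\boldsymbol{A}_{i}\), demixes with the ideal \(\boldsymbol{W}_{i}=\boldsymbol{A}_{i}^{-1}\), and evaluates the contrast and score functions.

```python
import numpy as np

rng = np.random.default_rng(0)
n_sources, n_bins, n_frames = 2, 4, 16  # N (= M), I, J


def crandn(*shape):
    """Draw complex Gaussian samples (illustrative helper, not an ssspy function)."""
    return rng.standard_normal(shape) + 1j * rng.standard_normal(shape)


# s_ij stacked as (n_sources, n_bins, n_frames)
s = crandn(n_sources, n_bins, n_frames)
# A_i stacked as (n_bins, n_channels, n_sources); square because M = N
A = crandn(n_bins, n_sources, n_sources)

# x_ij = A_i s_ij for every bin i and frame j
x = np.einsum("imn,nij->mij", A, s)

# With the ideal demixing matrix W_i = A_i^{-1}, y_ij = W_i x_ij equals s_ij
W = np.linalg.inv(A)
y = np.einsum("inm,mij->nij", W, x)
recovered = np.allclose(y, s)


def contrast_fn(y):
    # G(y_jn) = 2 ||y_jn||_2, with the norm taken over frequency bins
    return 2 * np.linalg.norm(y, axis=1)


def score_fn(y):
    # dG / d(y_ijn^*) = y_ijn / ||y_jn||_2 for the contrast above
    norm = np.linalg.norm(y, axis=1, keepdims=True)
    return y / np.maximum(norm, 1e-10)


phi = score_fn(y)
print(recovered, contrast_fn(y).shape, phi.shape)
```

In practice \(\boldsymbol{A}_{i}\) is unknown, which is why the algorithms below estimate \(\boldsymbol{W}_{i}\) by minimizing \(\mathcal{L}\).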

Algorithms

class ssspy.bss.iva.IVABase(flooring_fn=functools.partial(max_flooring, eps=1e-10), callbacks=None, scale_restoration=True, record_loss=True, reference_id=0)

Base class of independent vector analysis (IVA).

Parameters:
  • flooring_fn (callable, optional) – A flooring function for numerical stability. This function is expected to return the same shape tensor as the input. If you explicitly set flooring_fn=None, the identity function (lambda x: x) is used. Default: functools.partial(max_flooring, eps=1e-10).

  • callbacks (callable or list[callable], optional) – Callback functions. Each function is called before separation and at each iteration. Default: None.

  • scale_restoration (bool or str) – Technique to restore scale ambiguity. If scale_restoration=True, the projection back technique is applied to estimated spectrograms. You can also specify projection_back or minimal_distortion_principle. Default: True.

  • record_loss (bool) – Record the loss at each iteration of the update algorithm if record_loss=True. Default: True.

  • reference_id (int) – Reference channel for projection back and minimal distortion principle. Default: 0.
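
The flooring_fn contract described above (same shape in, same shape out) can be illustrated as follows. This is a sketch under assumptions: max_flooring_sketch is a hypothetical stand-in written for this example, since the implementation of the library's max_flooring is not shown on this page.

```python
import functools

import numpy as np


def max_flooring_sketch(x, eps=1e-10):
    """Clamp values from below by eps (hypothetical stand-in for max_flooring)."""
    return np.maximum(x, eps)


# Same construction as the documented default, applied to the stand-in
flooring_fn = functools.partial(max_flooring_sketch, eps=1e-10)

# A norm that may contain zeros, e.g. for silent frames
norm = np.array([0.0, 1e-12, 0.5])
floored = flooring_fn(norm)

# Division is now safe: no inf or nan appears
ratio = 1.0 / floored
print(floored.shape, np.isfinite(ratio).all())
```

Passing flooring_fn=None disables this protection, so divisions by a zero norm are then left to the caller to avoid.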

__call__(input, n_iter=100, initial_call=True, **kwargs)

Separate a frequency-domain multichannel signal.

Parameters:
  • input (numpy.ndarray) – Mixture signal in frequency-domain. The shape is (n_channels, n_bins, n_frames).

  • n_iter (int) – Number of iterations of demixing filter updates. Default: 100.

  • initial_call (bool) – If True, perform callbacks (and computation of loss if necessary) before iterations.

Return type:

ndarray

Returns:

numpy.ndarray of the separated signal in frequency-domain. The shape is (n_channels, n_bins, n_frames).

apply_projection_back()

Apply projection back technique to estimated spectrograms.

Return type:

None

compute_logdet(demix_filter)

Compute log-determinant of demixing filter.

Parameters:

demix_filter (numpy.ndarray) – Demixing filters with shape of (n_bins, n_sources, n_channels).

Return type:

ndarray

Returns:

numpy.ndarray of computed log-determinant values.
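
A minimal sketch of how such a log-determinant could be computed per frequency bin, assuming the quantity of interest is \(\log|\det\boldsymbol{W}_{i}|\) as it appears in \(\mathcal{L}\). The function name compute_logdet_sketch is hypothetical and the library's actual implementation may differ.

```python
import numpy as np


def compute_logdet_sketch(demix_filter):
    """Return log|det W_i| for each bin; demix_filter is (n_bins, N, N)."""
    # slogdet is numerically safer than log(abs(det(W))) for small or large determinants
    _, logabsdet = np.linalg.slogdet(demix_filter)
    return logabsdet


rng = np.random.default_rng(0)
n_bins, n_sources = 5, 3
W = rng.standard_normal((n_bins, n_sources, n_sources)) \
    + 1j * rng.standard_normal((n_bins, n_sources, n_sources))

logdet = compute_logdet_sketch(W)
print(logdet.shape)
```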

compute_loss()

Compute loss \(\mathcal{L}\).

\(\mathcal{L}\) is given as follows:

\[\begin{split}\mathcal{L} &= \frac{1}{J}\sum_{j,n}G(\vec{\boldsymbol{y}}_{jn}) - 2\sum_{i}\log|\det\boldsymbol{W}_{i}|, \\ G(\vec{\boldsymbol{y}}_{jn}) &= -\log p(\vec{\boldsymbol{y}}_{jn}).\end{split}\]

Return type:

float

Returns:

Computed loss.
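
The loss formula above can be evaluated directly with NumPy. The sketch below is an illustration, not the library's code: compute_loss_sketch is a hypothetical standalone function (the documented method takes no arguments and reads its inputs from the object), and the contrast function is the one used in the examples on this page.

```python
import numpy as np


def contrast_fn(y):
    # G(y_jn) = 2 ||y_jn||_2, mapping (N, n_bins, n_frames) to (N, n_frames)
    return 2 * np.linalg.norm(y, axis=1)


def compute_loss_sketch(y, demix_filter, contrast_fn):
    """L = (1/J) sum_{j,n} G(y_jn) - 2 sum_i log|det W_i|."""
    G = contrast_fn(y)                              # (N, n_frames)
    _, logdet = np.linalg.slogdet(demix_filter)     # (n_bins,)
    return float(np.mean(G, axis=-1).sum() - 2 * logdet.sum())


rng = np.random.default_rng(0)
n_sources, n_bins, n_frames = 2, 4, 16
y = rng.standard_normal((n_sources, n_bins, n_frames)) \
    + 1j * rng.standard_normal((n_sources, n_bins, n_frames))
W_identity = np.tile(np.eye(n_sources, dtype=complex), (n_bins, 1, 1))

# y is held fixed here to isolate the effect of the log-determinant term
loss_identity = compute_loss_sketch(y, W_identity, contrast_fn)
loss_scaled = compute_loss_sketch(y, 2 * W_identity, contrast_fn)
print(loss_identity, loss_scaled)
```

The log-determinant term is what prevents the trivial solution \(\boldsymbol{W}_{i}=\boldsymbol{0}\): shrinking \(\boldsymbol{W}_{i}\) lowers the contrast term but drives \(-2\log|\det\boldsymbol{W}_{i}|\) up.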

restore_scale()

Restore scale ambiguity.

If self.scale_restoration=projection_back, we use projection back technique. If self.scale_restoration=minimal_distortion_principle, we use minimal distortion principle.

Return type:

None
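
As an illustration of the projection back technique, the sketch below rescales each separated source by the matching entry of \(\boldsymbol{W}_{i}^{-1}\) in the reference channel row. This is the commonly used formulation and is an assumption here; projection_back_sketch is a hypothetical standalone function, not the method documented above.

```python
import numpy as np


def projection_back_sketch(y, demix_filter, reference_id=0):
    """Scale y (N, n_bins, n_frames) using W^{-1}; demix_filter is (n_bins, N, N)."""
    A_est = np.linalg.inv(demix_filter)     # estimated mixing matrices
    scale = A_est[:, reference_id, :]       # (n_bins, N)
    return scale.T[:, :, None] * y


rng = np.random.default_rng(0)
n_sources, n_bins, n_frames = 2, 3, 8


def crandn(*shape):
    return rng.standard_normal(shape) + 1j * rng.standard_normal(shape)


s = crandn(n_sources, n_bins, n_frames)
A = crandn(n_bins, n_sources, n_sources)
x = np.einsum("imn,nij->mij", A, s)

# A demixing matrix that separates correctly but with arbitrary per-source scales
d = rng.uniform(0.5, 2.0, size=(n_bins, n_sources))
W = d[:, :, None] * np.linalg.inv(A)
y = np.einsum("inm,mij->nij", W, x)

y_scaled = projection_back_sketch(y, W, reference_id=0)

# Each source as observed at the reference microphone: a_{i,ref,n} s_ijn
expected = A[:, 0, :].T[:, :, None] * s
print(np.allclose(y_scaled, expected))
```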

separate(input, demix_filter)

Separate input using demix_filter.

\[\boldsymbol{y}_{ij} = \boldsymbol{W}_{i}\boldsymbol{x}_{ij}\]

Parameters:
  • input (numpy.ndarray) – The mixture signal in frequency-domain. The shape is (n_channels, n_bins, n_frames).

  • demix_filter (numpy.ndarray) – The demixing filters to separate input. The shape is (n_bins, n_sources, n_channels).

Return type:

ndarray

Returns:

numpy.ndarray of the separated signal in frequency-domain. The shape is (n_sources, n_bins, n_frames).
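
The shape convention can be sketched with a batched matrix product. The function name separate_sketch is hypothetical; only the documented shapes are taken from this page.

```python
import numpy as np


def separate_sketch(x, demix_filter):
    """Apply y_ij = W_i x_ij for all bins and frames.

    x: (n_channels, n_bins, n_frames)
    demix_filter: (n_bins, n_sources, n_channels)
    returns: (n_sources, n_bins, n_frames)
    """
    X = x.transpose(1, 0, 2)        # (n_bins, n_channels, n_frames)
    Y = demix_filter @ X            # (n_bins, n_sources, n_frames)
    return Y.transpose(1, 0, 2)


rng = np.random.default_rng(0)
n_sources, n_channels, n_bins, n_frames = 2, 3, 5, 7
x = rng.standard_normal((n_channels, n_bins, n_frames)) \
    + 1j * rng.standard_normal((n_channels, n_bins, n_frames))
W = rng.standard_normal((n_bins, n_sources, n_channels)) \
    + 1j * rng.standard_normal((n_bins, n_sources, n_channels))

y = separate_sketch(x, W)
print(y.shape)
```

The frequency axis is moved to the front so that the matrix product is batched over bins, one \(\boldsymbol{W}_{i}\) per bin.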

update_once()

Update demixing filters once.

Return type:

None

class ssspy.bss.iva.GradIVABase(step_size=0.1, contrast_fn=None, score_fn=None, flooring_fn=functools.partial(max_flooring, eps=1e-10), callbacks=None, is_holonomic=False, scale_restoration=True, record_loss=True, reference_id=0)

Base class of independent vector analysis (IVA) using gradient descent.

Parameters:
  • step_size (float) – A step size of the gradient descent. Default: 1e-1.

  • contrast_fn (callable) – A contrast function which corresponds to \(-\log p(\vec{\boldsymbol{y}}_{jn})\). This function is expected to receive (n_channels, n_bins, n_frames) and return (n_channels, n_frames).

  • score_fn (callable) – A score function which corresponds to the partial derivative of the contrast function. This function is expected to receive (n_channels, n_bins, n_frames) and return (n_channels, n_bins, n_frames).

  • flooring_fn (callable, optional) – A flooring function for numerical stability. This function is expected to return the same shape tensor as the input. If you explicitly set flooring_fn=None, the identity function (lambda x: x) is used. Default: functools.partial(max_flooring, eps=1e-10).

  • callbacks (callable or list[callable], optional) – Callback functions. Each function is called before separation and at each iteration. Default: None.

  • is_holonomic (bool) – If is_holonomic=True, Holonomic-type update is used. Otherwise, Nonholonomic-type update is used. Default: False.

  • scale_restoration (bool or str) – Technique to restore scale ambiguity. If scale_restoration=True, the projection back technique is applied to estimated spectrograms. You can also specify projection_back or minimal_distortion_principle. Default: True.

  • record_loss (bool) – Record the loss at each iteration of the gradient descent if record_loss=True. Default: True.

  • reference_id (int) – Reference channel for projection back and minimal distortion principle. Default: 0.

class ssspy.bss.iva.FastIVABase(flooring_fn=functools.partial(max_flooring, eps=1e-10), callbacks=None, scale_restoration=True, record_loss=True, reference_id=0)

Base class of fast independent vector analysis (FastIVA).

Parameters:
  • flooring_fn (callable, optional) – A flooring function for numerical stability. This function is expected to return the same shape tensor as the input. If you explicitly set flooring_fn=None, the identity function (lambda x: x) is used. Default: functools.partial(max_flooring, eps=1e-10).

  • callbacks (callable or list[callable], optional) – Callback functions. Each function is called before separation and at each iteration. Default: None.

  • scale_restoration (bool or str) – Technique to restore scale ambiguity. If scale_restoration=True, the projection back technique is applied to estimated spectrograms. You can also specify projection_back or minimal_distortion_principle. Default: True.

  • record_loss (bool) – Record the loss at each iteration of the update algorithm if record_loss=True. Default: True.

  • reference_id (int) – Reference channel for projection back and minimal distortion principle. Default: 0.

apply_projection_back()

Apply projection back technique to estimated spectrograms.

Return type:

None

compute_loss()

Compute loss \(\mathcal{L}\).

\(\mathcal{L}\) is given as follows:

\[\begin{split}\mathcal{L} &= \frac{1}{J}\sum_{j,n}G(\vec{\boldsymbol{y}}_{jn}), \\ G(\vec{\boldsymbol{y}}_{jn}) &= -\log p(\vec{\boldsymbol{y}}_{jn}).\end{split}\]

Return type:

float

Returns:

Computed loss.

separate(input, demix_filter, use_whitening=True)

Separate input using demix_filter.

\[\boldsymbol{y}_{ij} = \boldsymbol{W}_{i}\boldsymbol{x}_{ij}\]

Parameters:
  • input (numpy.ndarray) – The mixture signal in frequency-domain. The shape is (n_channels, n_bins, n_frames).

  • demix_filter (numpy.ndarray) – The demixing filters to separate input. The shape is (n_bins, n_sources, n_channels).

  • use_whitening (bool) – If use_whitening=True, whitening (sphering) is applied to input. Default: True.

Return type:

ndarray

Returns:

numpy.ndarray of the separated signal in frequency-domain. The shape is (n_sources, n_bins, n_frames).
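
Whitening (sphering) can be sketched as a per-bin transform that makes the sample covariance of the channels equal to the identity. The following is one standard eigendecomposition-based construction, written under the assumption that this is what use_whitening refers to; whiten_sketch is a hypothetical name and the library's own routine is not shown on this page.

```python
import numpy as np


def whiten_sketch(x):
    """Whiten x of shape (n_channels, n_bins, n_frames) independently per bin."""
    X = x.transpose(1, 0, 2)                                # (n_bins, M, J)
    n_frames = X.shape[-1]
    cov = X @ X.conj().transpose(0, 2, 1) / n_frames        # (n_bins, M, M)
    eigvals, eigvecs = np.linalg.eigh(cov)                  # cov = E diag(eigvals) E^H
    # V = diag(eigvals)^{-1/2} E^H, so that V cov V^H = I
    V = eigvecs.conj().transpose(0, 2, 1) / np.sqrt(eigvals)[:, :, None]
    return (V @ X).transpose(1, 0, 2)


rng = np.random.default_rng(0)
n_channels, n_bins, n_frames = 2, 4, 64
x = rng.standard_normal((n_channels, n_bins, n_frames)) \
    + 1j * rng.standard_normal((n_channels, n_bins, n_frames))
x[1] += 0.8 * x[0]  # introduce correlation between channels

z = whiten_sketch(x)
Z = z.transpose(1, 0, 2)
cov_z = Z @ Z.conj().transpose(0, 2, 1) / n_frames
print(np.allclose(cov_z, np.eye(n_channels)))
```

After whitening, the demixing matrix can be constrained to be unitary, which is consistent with the FastIVABase loss omitting the log-determinant term.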

class ssspy.bss.iva.AuxIVABase(contrast_fn=None, d_contrast_fn=None, flooring_fn=functools.partial(max_flooring, eps=1e-10), callbacks=None, scale_restoration=True, record_loss=True, reference_id=0)

Base class of auxiliary-function-based independent vector analysis (IVA).

Parameters:
  • contrast_fn (callable) – A contrast function which corresponds to \(-\log p(\vec{\boldsymbol{y}}_{jn})\). This function is expected to receive (n_channels, n_bins, n_frames) and return (n_channels, n_frames).

  • d_contrast_fn (callable) – A derivative of the contrast function. This function is expected to receive (n_channels, n_frames) and return (n_channels, n_frames).

  • flooring_fn (callable, optional) – A flooring function for numerical stability. This function is expected to return the same shape tensor as the input. If you explicitly set flooring_fn=None, the identity function (lambda x: x) is used. Default: functools.partial(max_flooring, eps=1e-10).

  • callbacks (callable or list[callable], optional) – Callback functions. Each function is called before separation and at each iteration. Default: None.

  • scale_restoration (bool or str) – Technique to restore scale ambiguity. If scale_restoration=True, the projection back technique is applied to estimated spectrograms. You can also specify projection_back or minimal_distortion_principle. Default: True.

  • record_loss (bool) – Record the loss at each iteration of the update algorithm if record_loss=True. Default: True.

  • reference_id (int) – Reference channel for projection back and minimal distortion principle. Default: 0.

__call__(input, n_iter=100, initial_call=True, **kwargs)

Separate a frequency-domain multichannel signal.

Parameters:
  • input (numpy.ndarray) – The mixture signal in frequency-domain. The shape is (n_channels, n_bins, n_frames).

  • n_iter (int) – The number of iterations of demixing filter updates. Default: 100.

  • initial_call (bool) – If True, perform callbacks (and computation of loss if necessary) before iterations.

Return type:

ndarray

Returns:

numpy.ndarray of the separated signal in frequency-domain. The shape is (n_channels, n_bins, n_frames).

apply_projection_back()

Apply projection back technique to estimated spectrograms.

Return type:

None

compute_loss()

Compute loss \(\mathcal{L}\).

\(\mathcal{L}\) is given as follows:

\[\begin{split}\mathcal{L} &= \frac{1}{J}\sum_{j,n}G(\vec{\boldsymbol{y}}_{jn}) - 2\sum_{i}\log|\det\boldsymbol{W}_{i}|, \\ G(\vec{\boldsymbol{y}}_{jn}) &= -\log p(\vec{\boldsymbol{y}}_{jn}).\end{split}\]

Return type:

float

Returns:

Computed loss.

separate(input, demix_filter)

Separate input using demix_filter.

\[\boldsymbol{y}_{ij} = \boldsymbol{W}_{i}\boldsymbol{x}_{ij}\]

Parameters:
  • input (numpy.ndarray) – The mixture signal in frequency-domain. The shape is (n_channels, n_bins, n_frames).

  • demix_filter (numpy.ndarray) – The demixing filters to separate input. The shape is (n_bins, n_sources, n_channels).

Return type:

ndarray

Returns:

numpy.ndarray of the separated signal in frequency-domain. The shape is (n_sources, n_bins, n_frames).

class ssspy.bss.iva.GradIVA(step_size=0.1, contrast_fn=None, score_fn=None, flooring_fn=functools.partial(max_flooring, eps=1e-10), callbacks=None, is_holonomic=True, scale_restoration=True, record_loss=True, reference_id=0)

Independent vector analysis (IVA) [1] using gradient descent.

Parameters:
  • step_size (float) – A step size of the gradient descent. Default: 1e-1.

  • contrast_fn (callable) – A contrast function which corresponds to \(-\log p(\vec{\boldsymbol{y}}_{jn})\). This function is expected to receive (n_channels, n_bins, n_frames) and return (n_channels, n_frames).

  • score_fn (callable) – A score function which corresponds to the partial derivative of the contrast function. This function is expected to receive (n_channels, n_bins, n_frames) and return (n_channels, n_bins, n_frames).

  • flooring_fn (callable, optional) – A flooring function for numerical stability. This function is expected to return the same shape tensor as the input. If you explicitly set flooring_fn=None, the identity function (lambda x: x) is used. Default: functools.partial(max_flooring, eps=1e-10).

  • callbacks (callable or list[callable], optional) – Callback functions. Each function is called before separation and at each iteration. Default: None.

  • is_holonomic (bool) – If is_holonomic=True, Holonomic-type update is used. Otherwise, Nonholonomic-type update is used. Default: True.

  • scale_restoration (bool or str) – Technique to restore scale ambiguity. If scale_restoration=True, the projection back technique is applied to estimated spectrograms. You can also specify projection_back or minimal_distortion_principle. Default: True.

  • record_loss (bool) – Record the loss at each iteration of the gradient descent if record_loss=True. Default: True.

  • reference_id (int) – Reference channel for projection back and minimal distortion principle. Default: 0.

Examples

Update demixing filters using Holonomic-type update:

>>> import numpy as np
>>> from ssspy.bss.iva import GradIVA

>>> def contrast_fn(y):
...     return 2 * np.linalg.norm(y, axis=1)

>>> def score_fn(y):
...     norm = np.linalg.norm(y, axis=1, keepdims=True)
...     return y / np.maximum(norm, 1e-10)

>>> n_channels, n_bins, n_frames = 2, 2049, 128
>>> spectrogram_mix = np.random.randn(n_channels, n_bins, n_frames) \
...     + 1j * np.random.randn(n_channels, n_bins, n_frames)

>>> iva = GradIVA(
...     contrast_fn=contrast_fn,
...     score_fn=score_fn,
...     is_holonomic=True,
... )
>>> spectrogram_est = iva(spectrogram_mix, n_iter=5000)
>>> print(spectrogram_mix.shape, spectrogram_est.shape)
(2, 2049, 128) (2, 2049, 128)

Update demixing filters using Nonholonomic-type update:

>>> def contrast_fn(y):
...     return 2 * np.linalg.norm(y, axis=1)

>>> def score_fn(y):
...     norm = np.linalg.norm(y, axis=1, keepdims=True)
...     return y / np.maximum(norm, 1e-10)

>>> n_channels, n_bins, n_frames = 2, 2049, 128
>>> spectrogram_mix = np.random.randn(n_channels, n_bins, n_frames) \
...     + 1j * np.random.randn(n_channels, n_bins, n_frames)

>>> iva = GradIVA(
...     contrast_fn=contrast_fn,
...     score_fn=score_fn,
...     is_holonomic=False,
... )
>>> spectrogram_est = iva(spectrogram_mix, n_iter=5000)
>>> print(spectrogram_mix.shape, spectrogram_est.shape)
(2, 2049, 128) (2, 2049, 128)

update_once()

Update demixing filters once using gradient descent.

Return type:

None

If is_holonomic=True, demixing filters are updated as follows:

\[\boldsymbol{W}_{i} \leftarrow\boldsymbol{W}_{i} - \eta\left(\frac{1}{J}\sum_{j} \boldsymbol{\phi}_{i}(\vec{\boldsymbol{Y}}_{j})\boldsymbol{y}_{ij}^{\mathsf{H}} -\boldsymbol{I}\right)\boldsymbol{W}_{i}^{-\mathsf{H}},\]

where

\[\begin{split}\boldsymbol{\phi}_{i}(\vec{\boldsymbol{Y}}_{j}) &= \left(\phi_{i}(\vec{\boldsymbol{y}}_{j1}),\ldots, \phi_{i}(\vec{\boldsymbol{y}}_{jn}),\ldots, \phi_{i}(\vec{\boldsymbol{y}}_{jN})\right)^{\mathsf{T}}\in\mathbb{C}^{N}, \\ \phi_{i}(\vec{\boldsymbol{y}}_{jn}) &= \frac{\partial G(\vec{\boldsymbol{y}}_{jn})}{\partial y_{ijn}^{*}}, \\ G(\vec{\boldsymbol{y}}_{jn}) &= -\log p(\vec{\boldsymbol{y}}_{jn}).\end{split}\]

Otherwise (is_holonomic=False),

\[\boldsymbol{W}_{i} \leftarrow\boldsymbol{W}_{i} - \eta\cdot\mathrm{offdiag}\left(\frac{1}{J}\sum_{j} \boldsymbol{\phi}_{i}(\vec{\boldsymbol{Y}}_{j})\boldsymbol{y}_{ij}^{\mathsf{H}}\right) \boldsymbol{W}_{i}^{-\mathsf{H}}.\]
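
The two update rules above can be written as a single NumPy step. This is a minimal sketch rather than the library's implementation: grad_update_sketch is a hypothetical standalone function, the square case \(M=N\) is assumed, and the score function is the one used in the examples.

```python
import numpy as np


def score_fn(y):
    norm = np.linalg.norm(y, axis=1, keepdims=True)
    return y / np.maximum(norm, 1e-10)


def grad_update_sketch(W, x, score_fn, step_size=0.1, is_holonomic=True):
    """One gradient step. W: (n_bins, N, N), x: (N, n_bins, n_frames)."""
    n_sources, n_frames = W.shape[1], x.shape[-1]
    y = np.einsum("inm,mij->nij", W, x)                 # y_ij = W_i x_ij
    phi = score_fn(y)
    # (1/J) sum_j phi_i(Y_j) y_ij^H, stacked as (n_bins, N, N)
    phi_y = np.einsum("nij,kij->ink", phi, y.conj()) / n_frames
    eye = np.eye(n_sources)
    if is_holonomic:
        delta = phi_y - eye
    else:
        delta = phi_y * (1 - eye)                       # offdiag(.): zero the diagonal
    W_inv_H = np.linalg.inv(W).conj().transpose(0, 2, 1)
    return W - step_size * (delta @ W_inv_H)


rng = np.random.default_rng(0)
n_sources, n_bins, n_frames = 2, 4, 32
x = rng.standard_normal((n_sources, n_bins, n_frames)) \
    + 1j * rng.standard_normal((n_sources, n_bins, n_frames))
W0 = np.tile(np.eye(n_sources, dtype=complex), (n_bins, 1, 1))

W_holonomic = grad_update_sketch(W0, x, score_fn, is_holonomic=True)
W_nonholonomic = grad_update_sketch(W0, x, score_fn, is_holonomic=False)
print(W_holonomic.shape, W_nonholonomic.shape)
```

In the nonholonomic variant the diagonal of the bracketed term is discarded, so starting from the identity the diagonal of \(\boldsymbol{W}_{i}\) is left unchanged by the step.
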
class ssspy.bss.iva.NaturalGradIVA(step_size=0.1, contrast_fn=None, score_fn=None, flooring_fn=functools.partial(max_flooring, eps=1e-10), callbacks=None, is_holonomic=True, scale_restoration=True, record_loss=True, reference_id=0)

Independent vector analysis (IVA) using natural gradient descent.

Parameters:
  • step_size (float) – A step size of the gradient descent. Default: 1e-1.

  • contrast_fn (callable) – A contrast function which corresponds to \(-\log p(\vec{\boldsymbol{y}}_{jn})\). This function is expected to receive (n_channels, n_bins, n_frames) and return (n_channels, n_frames).

  • score_fn (callable) – A score function which corresponds to the partial derivative of the contrast function. This function is expected to receive (n_channels, n_bins, n_frames) and return (n_channels, n_bins, n_frames).

  • flooring_fn (callable, optional) – A flooring function for numerical stability. This function is expected to return the same shape tensor as the input. If you explicitly set flooring_fn=None, the identity function (lambda x: x) is used. Default: functools.partial(max_flooring, eps=1e-10).

  • callbacks (callable or list[callable], optional) – Callback functions. Each function is called before separation and at each iteration. Default: None.

  • is_holonomic (bool) – If is_holonomic=True, Holonomic-type update is used. Otherwise, Nonholonomic-type update is used. Default: True.

  • scale_restoration (bool or str) – Technique to restore scale ambiguity. If scale_restoration=True, the projection back technique is applied to estimated spectrograms. You can also specify projection_back or minimal_distortion_principle. Default: True.

  • record_loss (bool) – Record the loss at each iteration of the update algorithm if record_loss=True. Default: True.

  • reference_id (int) – Reference channel for projection back and minimal distortion principle. Default: 0.

Examples

Update demixing filters using Holonomic-type update:

>>> import numpy as np
>>> from ssspy.bss.iva import NaturalGradIVA

>>> def contrast_fn(y):
...     return 2 * np.linalg.norm(y, axis=1)

>>> def score_fn(y):
...     norm = np.linalg.norm(y, axis=1, keepdims=True)
...     return y / np.maximum(norm, 1e-10)

>>> n_channels, n_bins, n_frames = 2, 2049, 128
>>> spectrogram_mix = np.random.randn(n_channels, n_bins, n_frames) \
...     + 1j * np.random.randn(n_channels, n_bins, n_frames)

>>> iva = NaturalGradIVA(
...     contrast_fn=contrast_fn,
...     score_fn=score_fn,
...     is_holonomic=True,
... )
>>> spectrogram_est = iva(spectrogram_mix, n_iter=500)
>>> print(spectrogram_mix.shape, spectrogram_est.shape)
(2, 2049, 128) (2, 2049, 128)

Update demixing filters using Nonholonomic-type update:

>>> def contrast_fn(y):
...     return 2 * np.linalg.norm(y, axis=1)

>>> def score_fn(y):
...     norm = np.linalg.norm(y, axis=1, keepdims=True)
...     return y / np.maximum(norm, 1e-10)

>>> n_channels, n_bins, n_frames = 2, 2049, 128
>>> spectrogram_mix = np.random.randn(n_channels, n_bins, n_frames) \
...     + 1j * np.random.randn(n_channels, n_bins, n_frames)

>>> iva = NaturalGradIVA(
...     contrast_fn=contrast_fn,
...     score_fn=score_fn,
...     is_holonomic=False,
... )
>>> spectrogram_est = iva(spectrogram_mix, n_iter=500)
>>> print(spectrogram_mix.shape, spectrogram_est.shape)
(2, 2049, 128) (2, 2049, 128)

update_once()

Update demixing filters once using natural gradient descent.

Return type:

None

If is_holonomic=True, demixing filters are updated as follows:

\[\boldsymbol{W}_{i} \leftarrow\boldsymbol{W}_{i} - \eta\left(\frac{1}{J}\sum_{j} \boldsymbol{\phi}_{i}(\vec{\boldsymbol{Y}}_{j})\boldsymbol{y}_{ij}^{\mathsf{H}} -\boldsymbol{I}\right)\boldsymbol{W}_{i},\]

where

\[\begin{split}\boldsymbol{\phi}_{i}(\vec{\boldsymbol{Y}}_{j}) &= \left(\phi_{i}(\vec{\boldsymbol{y}}_{j1}),\ldots, \phi_{i}(\vec{\boldsymbol{y}}_{jn}),\ldots, \phi_{i}(\vec{\boldsymbol{y}}_{jN})\right)^{\mathsf{T}}\in\mathbb{C}^{N}, \\ \phi_{i}(\vec{\boldsymbol{y}}_{jn}) &= \frac{\partial G(\vec{\boldsymbol{y}}_{jn})}{\partial y_{ijn}^{*}}, \\ G(\vec{\boldsymbol{y}}_{jn}) &= -\log p(\vec{\boldsymbol{y}}_{jn}).\end{split}\]

Otherwise (is_holonomic=False),

\[\boldsymbol{W}_{i} \leftarrow\boldsymbol{W}_{i} - \eta\cdot\mathrm{offdiag}\left(\frac{1}{J}\sum_{j} \boldsymbol{\phi}_{i}(\vec{\boldsymbol{Y}}_{j})\boldsymbol{y}_{ij}^{\mathsf{H}}\right) \boldsymbol{W}_{i}.\]
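
A minimal NumPy sketch of the natural gradient step, under the same assumptions as the formulas above (\(M=N\)); natural_grad_update_sketch is a hypothetical standalone function, not the library method. The bracketed term is right-multiplied by \(\boldsymbol{W}_{i}\), so no matrix inverse is needed.

```python
import numpy as np


def score_fn(y):
    norm = np.linalg.norm(y, axis=1, keepdims=True)
    return y / np.maximum(norm, 1e-10)


def natural_grad_update_sketch(W, x, score_fn, step_size=0.1, is_holonomic=True):
    """One natural gradient step. W: (n_bins, N, N), x: (N, n_bins, n_frames)."""
    n_sources, n_frames = W.shape[1], x.shape[-1]
    y = np.einsum("inm,mij->nij", W, x)
    phi = score_fn(y)
    # (1/J) sum_j phi_i(Y_j) y_ij^H for every bin
    phi_y = np.einsum("nij,kij->ink", phi, y.conj()) / n_frames
    eye = np.eye(n_sources)
    delta = phi_y - eye if is_holonomic else phi_y * (1 - eye)
    # Right-multiplication by W replaces the W^{-H} factor of the plain gradient
    return W - step_size * (delta @ W)


rng = np.random.default_rng(0)
n_sources, n_bins, n_frames = 2, 4, 32
x = rng.standard_normal((n_sources, n_bins, n_frames)) \
    + 1j * rng.standard_normal((n_sources, n_bins, n_frames))
W0 = np.tile(np.eye(n_sources, dtype=complex), (n_bins, 1, 1))

W = W0
for _ in range(10):
    W = natural_grad_update_sketch(W, x, score_fn, step_size=0.1)
print(W.shape)
```
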
class ssspy.bss.iva.FastIVA(contrast_fn=None, d_contrast_fn=None, dd_contrast_fn=None, flooring_fn=functools.partial(max_flooring, eps=1e-10), callbacks=None, scale_restoration=True, record_loss=True, reference_id=0)

Fast independent vector analysis (Fast IVA) [2].

Parameters:
  • contrast_fn (callable) – A contrast function which corresponds to \(-\log p(\vec{\boldsymbol{y}}_{jn})\). This function is expected to receive (n_channels, n_bins, n_frames) and return (n_channels, n_frames).

  • d_contrast_fn (callable) – A derivative of the contrast function. This function is expected to receive (n_channels, n_frames) and return (n_channels, n_frames).

  • dd_contrast_fn (callable) – Second order derivative of the contrast function. This function is expected to receive (n_channels, n_frames) and return (n_channels, n_frames).

  • flooring_fn (callable, optional) – A flooring function for numerical stability. This function is expected to return the same shape tensor as the input. If you explicitly set flooring_fn=None, the identity function (lambda x: x) is used. Default: functools.partial(max_flooring, eps=1e-10).

  • callbacks (callable or list[callable], optional) – Callback functions. Each function is called before separation and at each iteration. Default: None.

  • scale_restoration (bool or str) – Technique to restore scale ambiguity. If scale_restoration=True, the projection back technique is applied to estimated spectrograms. You can also specify projection_back or minimal_distortion_principle. Default: True.

  • record_loss (bool) – Record the loss at each iteration of the update algorithm if record_loss=True. Default: True.

  • reference_id (int) – Reference channel for projection back and minimal distortion principle. Default: 0.

Examples

>>> import numpy as np
>>> from ssspy.bss.iva import FastIVA
>>> from ssspy.transform import whiten
>>> from ssspy.algorithm import projection_back

>>> def contrast_fn(y):
...     return 2 * np.linalg.norm(y, axis=1)

>>> def d_contrast_fn(y):
...     return 2 * np.ones_like(y)

>>> def dd_contrast_fn(y):
...     return 2 * np.zeros_like(y)

>>> n_channels, n_bins, n_frames = 2, 2049, 128
>>> spectrogram_mix = np.random.randn(n_channels, n_bins, n_frames) \
...     + 1j * np.random.randn(n_channels, n_bins, n_frames)

>>> iva = FastIVA(
...     contrast_fn=contrast_fn,
...     d_contrast_fn=d_contrast_fn,
...     dd_contrast_fn=dd_contrast_fn,
...     scale_restoration=False,
... )
>>> spectrogram_mix_whitened = whiten(spectrogram_mix)
>>> spectrogram_est = iva(spectrogram_mix_whitened, n_iter=100)
>>> spectrogram_est = projection_back(spectrogram_est, reference=spectrogram_mix)
>>> print(spectrogram_mix.shape, spectrogram_est.shape)
(2, 2049, 128) (2, 2049, 128)

__call__(input, n_iter=100, initial_call=True, **kwargs)

Separate a frequency-domain multichannel signal.

Parameters:
  • input (numpy.ndarray) – The mixture signal in frequency-domain. The shape is (n_channels, n_bins, n_frames).

  • n_iter (int) – The number of iterations of demixing filter updates. Default: 100.

  • initial_call (bool) – If True, perform callbacks (and computation of loss if necessary) before iterations.

Return type:

ndarray

Returns:

numpy.ndarray of the separated signal in frequency-domain. The shape is (n_channels, n_bins, n_frames).

update_once(flooring_fn='self')

Update demixing filters once.

Parameters:

flooring_fn (callable or str, optional) – A flooring function for numerical stability. This function is expected to return the same shape tensor as the input. If you explicitly set flooring_fn=None, the identity function (lambda x: x) is used. If the string 'self' is given, self.flooring_fn is used. Default: 'self'.

Return type:

None

Demixing filters are updated as follows:

\[\begin{split}\boldsymbol{w}_{in} \leftarrow&\frac{1}{J}\sum_{j} \frac{G'_{\mathbb{R}}(\|\vec{\boldsymbol{y}}_{jn}\|_{2})} {2\|\vec{\boldsymbol{y}}_{jn}\|_{2}} \left(\boldsymbol{w}_{in}-y_{ijn}^{*}\boldsymbol{x}_{ij}\right) \notag \\ &-\frac{1}{J}\sum_{j}\frac{|y_{ijn}|^{2}}{2\|\vec{\boldsymbol{y}}_{jn}\|_{2}}\left( \frac{G'_{\mathbb{R}}(\|\vec{\boldsymbol{y}}_{jn}\|_{2})} {\|\vec{\boldsymbol{y}}_{jn}\|_{2}} - G''_{\mathbb{R}}(\|\vec{\boldsymbol{y}}_{jn}\|_{2}) \right)\boldsymbol{w}_{in} \\ \boldsymbol{W}_{i} \leftarrow&\left(\boldsymbol{W}_{i}\boldsymbol{W}_{i}^{\mathsf{H}}\right)^{-\frac{1}{2}} \boldsymbol{W}_{i}.\end{split}\]
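
The second line of the update, \(\boldsymbol{W}_{i}\leftarrow(\boldsymbol{W}_{i}\boldsymbol{W}_{i}^{\mathsf{H}})^{-\frac{1}{2}}\boldsymbol{W}_{i}\), is a symmetric orthogonalization. Below is a minimal sketch of that step alone, computing the inverse matrix square root through an eigendecomposition; symmetric_orthogonalize is a hypothetical name and the library may compute it differently.

```python
import numpy as np


def symmetric_orthogonalize(W):
    """Return (W W^H)^{-1/2} W for W of shape (n_bins, N, N)."""
    WWH = W @ W.conj().transpose(0, 2, 1)                # Hermitian, positive definite
    eigvals, eigvecs = np.linalg.eigh(WWH)
    # (W W^H)^{-1/2} = E diag(eigvals^{-1/2}) E^H
    inv_sqrt = (eigvecs / np.sqrt(eigvals)[:, None, :]) @ eigvecs.conj().transpose(0, 2, 1)
    return inv_sqrt @ W


rng = np.random.default_rng(0)
n_bins, n_sources = 4, 3
W = rng.standard_normal((n_bins, n_sources, n_sources)) \
    + 1j * rng.standard_normal((n_bins, n_sources, n_sources))

W_orth = symmetric_orthogonalize(W)
gram = W_orth @ W_orth.conj().transpose(0, 2, 1)
print(np.allclose(gram, np.eye(n_sources)))
```

After this step each \(\boldsymbol{W}_{i}\) is unitary, so \(\log|\det\boldsymbol{W}_{i}|=0\).
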
class ssspy.bss.iva.FasterIVA(contrast_fn=None, d_contrast_fn=None, flooring_fn=functools.partial(max_flooring, eps=1e-10), callbacks=None, scale_restoration=True, record_loss=True, reference_id=0)

Faster independent vector analysis (Faster IVA) [3].

Parameters:
  • contrast_fn (callable) – A contrast function which corresponds to \(-\log p(\vec{\boldsymbol{y}}_{jn})\). This function is expected to receive (n_channels, n_bins, n_frames) and return (n_channels, n_frames).

  • d_contrast_fn (callable) – A derivative of the contrast function. This function is expected to receive (n_channels, n_frames) and return (n_channels, n_frames).

  • flooring_fn (callable, optional) – A flooring function for numerical stability. This function is expected to return the same shape tensor as the input. If you explicitly set flooring_fn=None, the identity function (lambda x: x) is used. Default: functools.partial(max_flooring, eps=1e-10).

  • callbacks (callable or list[callable], optional) – Callback functions. Each function is called before separation and at each iteration. Default: None.

  • scale_restoration (bool or str) – Technique to restore scale ambiguity. If scale_restoration=True, the projection back technique is applied to estimated spectrograms. You can also specify projection_back or minimal_distortion_principle. Default: True.

  • record_loss (bool) – Record the loss at each iteration of the update algorithm if record_loss=True. Default: True.

  • reference_id (int) – Reference channel for projection back and minimal distortion principle. Default: 0.

Examples

>>> import numpy as np
>>> from ssspy.bss.iva import FasterIVA
>>> from ssspy.transform import whiten
>>> from ssspy.algorithm import projection_back

>>> def contrast_fn(y):
...     return 2 * np.linalg.norm(y, axis=1)

>>> def d_contrast_fn(y):
...     return 2 * np.ones_like(y)

>>> n_channels, n_bins, n_frames = 2, 2049, 128
>>> spectrogram_mix = np.random.randn(n_channels, n_bins, n_frames) \
...     + 1j * np.random.randn(n_channels, n_bins, n_frames)

>>> iva = FasterIVA(
...     contrast_fn=contrast_fn,
...     d_contrast_fn=d_contrast_fn,
...     scale_restoration=False,
... )
>>> spectrogram_mix_whitened = whiten(spectrogram_mix)
>>> spectrogram_est = iva(spectrogram_mix_whitened, n_iter=100)
>>> spectrogram_est = projection_back(spectrogram_est, reference=spectrogram_mix)
>>> print(spectrogram_mix.shape, spectrogram_est.shape)
(2, 2049, 128) (2, 2049, 128)

__call__(input, n_iter=100, initial_call=True, **kwargs)

Separate a frequency-domain multichannel signal.

Parameters:
  • input (numpy.ndarray) – The mixture signal in frequency-domain. The shape is (n_channels, n_bins, n_frames).

  • n_iter (int) – The number of iterations of demixing filter updates. Default: 100.

  • initial_call (bool) – If True, perform callbacks (and computation of loss if necessary) before iterations.

Return type:

ndarray

Returns:

numpy.ndarray of the separated signal in frequency-domain. The shape is (n_channels, n_bins, n_frames).

update_once(flooring_fn='self')

Update demixing filters once.

Parameters:

flooring_fn (callable or str, optional) – A flooring function for numerical stability. This function is expected to return the same shape tensor as the input. If you explicitly set flooring_fn=None, the identity function (lambda x: x) is used. If the string 'self' is given, self.flooring_fn is used. Default: 'self'.

Return type:

None

In FasterIVA, we compute the eigenvector of \(\boldsymbol{U}_{in}\) which corresponds to the largest eigenvalue by solving

\[\boldsymbol{U}_{in}\boldsymbol{w}_{in} = \lambda_{in}\boldsymbol{w}_{in}.\]

Then,

\[\boldsymbol{W}_{i} \leftarrow\left(\boldsymbol{W}_{i}\boldsymbol{W}_{i}^{\mathsf{H}}\right)^{-\frac{1}{2}} \boldsymbol{W}_{i}.\]
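
The eigenvector step can be sketched as follows. Since \(\boldsymbol{U}_{in}\) is not defined on this page, the example treats it as a generic Hermitian positive semi-definite matrix built from random data; that construction and the name principal_eigenvector are assumptions made for illustration only.

```python
import numpy as np


def principal_eigenvector(U):
    """Return the largest eigenvalue and its eigenvector for Hermitian U (..., M, M)."""
    eigvals, eigvecs = np.linalg.eigh(U)     # eigenvalues in ascending order
    return eigvals[..., -1], eigvecs[..., :, -1]


rng = np.random.default_rng(0)
n_bins, n_channels, n_frames = 4, 3, 32
X = rng.standard_normal((n_bins, n_channels, n_frames)) \
    + 1j * rng.standard_normal((n_bins, n_channels, n_frames))

# Placeholder for U_in: a Hermitian positive semi-definite matrix per bin
U = X @ X.conj().transpose(0, 2, 1) / n_frames

lam, w = principal_eigenvector(U)
print(lam.shape, w.shape)
```

The eigenvectors returned by np.linalg.eigh have unit norm, so the only remaining coupling between sources is removed by the orthogonalization of \(\boldsymbol{W}_{i}\) shown above.
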
class ssspy.bss.iva.AuxIVA(spatial_algorithm='IP', contrast_fn=None, d_contrast_fn=None, flooring_fn=functools.partial(max_flooring, eps=1e-10), pair_selector=None, callbacks=None, scale_restoration=True, record_loss=True, reference_id=0, **kwargs)

Auxiliary-function-based independent vector analysis (IVA) [4].

Parameters:
  • spatial_algorithm (str) – Algorithm for demixing filter updates. Choose IP, IP1, IP2, ISS, ISS1, ISS2, or IPA. Default: IP.

  • contrast_fn (callable) – A contrast function which corresponds to \(-\log p(\vec{\boldsymbol{y}}_{jn})\). This function is expected to receive (n_channels, n_bins, n_frames) and return (n_channels, n_frames).

  • d_contrast_fn (callable) – A derivative of the contrast function. This function is expected to receive (n_channels, n_frames) and return (n_channels, n_frames).

  • flooring_fn (callable, optional) – A flooring function for numerical stability. This function is expected to return the same shape tensor as the input. If you explicitly set flooring_fn=None, the identity function (lambda x: x) is used. Default: functools.partial(max_flooring, eps=1e-10).

  • pair_selector (callable, optional) – Selector to choose the pair to update in IP2 and ISS2. If None is given, sequential_pair_selector is used. Default: None.

  • callbacks (callable or list[callable], optional) – Callback functions. Each function is called before separation and at each iteration. Default: None.

  • scale_restoration (bool or str) – Technique to restore scale ambiguity. If scale_restoration=True, the projection back technique is applied to estimated spectrograms. You can also specify projection_back or minimal_distortion_principle. Default: True.

  • record_loss (bool) – Record the loss at each iteration of the demixing filter update if record_loss=True. Default: True.

  • reference_id (int) – Reference channel for projection back and minimal distortion principle. Default: 0.

  • lqpqm_normalization (bool) – This keyword argument can be specified when spatial_algorithm='IPA'. If True, normalization by trace is applied to positive semi-definite matrix in LQPQM. Default: True.

  • newton_iter (int) – This keyword argument can be specified when spatial_algorithm='IPA'. Number of iterations in Newton's method. Default: 1.

Examples

Update demixing filters by IP:

>>> import numpy as np
>>> from ssspy.bss.iva import AuxIVA

>>> def contrast_fn(y):
...     return 2 * np.linalg.norm(y, axis=1)

>>> def d_contrast_fn(y):
...     return 2 * np.ones_like(y)

>>> n_channels, n_bins, n_frames = 2, 2049, 128
>>> spectrogram_mix = np.random.randn(n_channels, n_bins, n_frames) \
...     + 1j * np.random.randn(n_channels, n_bins, n_frames)

>>> iva = AuxIVA(
...     spatial_algorithm="IP",
...     contrast_fn=contrast_fn,
...     d_contrast_fn=d_contrast_fn,
... )
>>> spectrogram_est = iva(spectrogram_mix, n_iter=100)
>>> print(spectrogram_mix.shape, spectrogram_est.shape)
(2, 2049, 128) (2, 2049, 128)

Update demixing filters by IP2:

>>> from ssspy.utils.select_pair import sequential_pair_selector

>>> def contrast_fn(y):
...     return 2 * np.linalg.norm(y, axis=1)

>>> def d_contrast_fn(y):
...     return 2 * np.ones_like(y)

>>> n_channels, n_bins, n_frames = 2, 2049, 128
>>> spectrogram_mix = np.random.randn(n_channels, n_bins, n_frames) \
...     + 1j * np.random.randn(n_channels, n_bins, n_frames)

>>> iva = AuxIVA(
...     spatial_algorithm="IP2",
...     contrast_fn=contrast_fn,
...     d_contrast_fn=d_contrast_fn,
...     pair_selector=sequential_pair_selector,
... )
>>> spectrogram_est = iva(spectrogram_mix, n_iter=100)
>>> print(spectrogram_mix.shape, spectrogram_est.shape)
(2, 2049, 128) (2, 2049, 128)

Update demixing filters by ISS:

>>> def contrast_fn(y):
...     return 2 * np.linalg.norm(y, axis=1)

>>> def d_contrast_fn(y):
...     return 2 * np.ones_like(y)

>>> n_channels, n_bins, n_frames = 2, 2049, 128
>>> spectrogram_mix = np.random.randn(n_channels, n_bins, n_frames) \
...     + 1j * np.random.randn(n_channels, n_bins, n_frames)

>>> iva = AuxIVA(
...     spatial_algorithm="ISS",
...     contrast_fn=contrast_fn,
...     d_contrast_fn=d_contrast_fn,
... )
>>> spectrogram_est = iva(spectrogram_mix, n_iter=100)
>>> print(spectrogram_mix.shape, spectrogram_est.shape)
(2, 2049, 128) (2, 2049, 128)

Update demixing filters by ISS2:

>>> import functools
>>> from ssspy.utils.select_pair import sequential_pair_selector

>>> def contrast_fn(y):
...     return 2 * np.linalg.norm(y, axis=1)

>>> def d_contrast_fn(y):
...     return 2 * np.ones_like(y)

>>> n_channels, n_bins, n_frames = 2, 2049, 128
>>> spectrogram_mix = np.random.randn(n_channels, n_bins, n_frames) \
...     + 1j * np.random.randn(n_channels, n_bins, n_frames)

>>> iva = AuxIVA(
...     spatial_algorithm="ISS2",
...     contrast_fn=contrast_fn,
...     d_contrast_fn=d_contrast_fn,
...     pair_selector=functools.partial(sequential_pair_selector, step=2),
... )
>>> spectrogram_est = iva(spectrogram_mix, n_iter=100)
>>> print(spectrogram_mix.shape, spectrogram_est.shape)
(2, 2049, 128) (2, 2049, 128)

Update demixing filters by IPA:

>>> def contrast_fn(y):
...     return 2 * np.linalg.norm(y, axis=1)

>>> def d_contrast_fn(y):
...     return 2 * np.ones_like(y)

>>> n_channels, n_bins, n_frames = 2, 2049, 128
>>> spectrogram_mix = np.random.randn(n_channels, n_bins, n_frames) \
...     + 1j * np.random.randn(n_channels, n_bins, n_frames)

>>> iva = AuxIVA(
...     spatial_algorithm="IPA",
...     contrast_fn=contrast_fn,
...     d_contrast_fn=d_contrast_fn,
... )
>>> spectrogram_est = iva(spectrogram_mix, n_iter=100)
>>> print(spectrogram_mix.shape, spectrogram_est.shape)
(2, 2049, 128) (2, 2049, 128)
__call__(input, n_iter=100, initial_call=True, **kwargs)#

Separate a frequency-domain multichannel signal.

Parameters:
  • input (numpy.ndarray) – The mixture signal in frequency-domain. The shape is (n_channels, n_bins, n_frames).

  • n_iter (int) – The number of iterations of demixing filter updates. Default: 100.

  • initial_call (bool) – If True, perform callbacks (and computation of loss if necessary) before iterations.

Return type:

ndarray

Returns:

numpy.ndarray of the separated signal in frequency-domain. The shape is (n_channels, n_bins, n_frames).

update_once(flooring_fn='self')#

Update demixing filters once.

  • If self.spatial_algorithm is IP or IP1, update_once_ip1 is called.

  • If self.spatial_algorithm is IP2, update_once_ip2 is called.

  • If self.spatial_algorithm is ISS or ISS1, update_once_iss1 is called.

  • If self.spatial_algorithm is ISS2, update_once_iss2 is called.

  • If self.spatial_algorithm is IPA, update_once_ipa is called.

Parameters:

flooring_fn (callable or str, optional) – A flooring function for numerical stability. This function is expected to return the same shape tensor as the input. If you explicitly set flooring_fn=None, the identity function (lambda x: x) is used. If self is given as str, self.flooring_fn is used. Default: self.

Return type:

None

update_once_ip1(flooring_fn='self')#

Update demixing filters once using iterative projection.

Parameters:

flooring_fn (callable or str, optional) – A flooring function for numerical stability. This function is expected to return the same shape tensor as the input. If you explicitly set flooring_fn=None, the identity function (lambda x: x) is used. If self is given as str, self.flooring_fn is used. Default: self.

Return type:

None

Compute auxiliary variables:

\[\bar{r}_{jn} \leftarrow\|\vec{\boldsymbol{y}}_{jn}\|_{2}\]

Then, demixing filters are updated sequentially for \(n=1,\ldots,N\) as follows:

\[\begin{split}\boldsymbol{w}_{in} &\leftarrow\left(\boldsymbol{W}_{i}\boldsymbol{U}_{in}\right)^{-1}\boldsymbol{e}_{n}, \\ \boldsymbol{w}_{in} &\leftarrow\frac{\boldsymbol{w}_{in}} {\sqrt{\boldsymbol{w}_{in}^{\mathsf{H}}\boldsymbol{U}_{in}\boldsymbol{w}_{in}}},\end{split}\]

where

\[\begin{split}\boldsymbol{U}_{in} &= \frac{1}{J}\sum_{j} \varphi(\bar{r}_{jn})\boldsymbol{x}_{ij}\boldsymbol{x}_{ij}^{\mathsf{H}}, \\ \varphi(\bar{r}_{jn}) &= \frac{G'_{\mathbb{R}}(\bar{r}_{jn})}{2\bar{r}_{jn}}, \\ G(\vec{\boldsymbol{y}}_{jn}) &= -\log p(\vec{\boldsymbol{y}}_{jn}), \\ G_{\mathbb{R}}(\|\vec{\boldsymbol{y}}_{jn}\|_{2}) &= G(\vec{\boldsymbol{y}}_{jn}).\end{split}\]
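
The two-step update above can be sketched in NumPy as follows. This is an illustrative sketch, not the library's implementation: it assumes the first step solves \((\boldsymbol{W}_{i}\boldsymbol{U}_{in})\boldsymbol{w}_{in}=\boldsymbol{e}_{n}\), that W is stored as (n_bins, n_sources, n_channels) with row \(n\) equal to \(\boldsymbol{w}_{in}^{\mathsf{H}}\), and that no flooring is applied. The helper names weighted_covariance and ip1_update_source are hypothetical.

```python
import numpy as np


def weighted_covariance(W, X, n, d_contrast_fn):
    """U_in for source n. X: (n_channels, n_bins, n_frames)."""
    X_t = X.transpose(1, 0, 2)                       # (n_bins, n_channels, n_frames)
    y_n = (W[:, n:n + 1, :] @ X_t)[:, 0, :]          # (n_bins, n_frames)
    r = np.sqrt(np.sum(np.abs(y_n) ** 2, axis=0))    # (n_frames,)
    phi = d_contrast_fn(r) / (2 * r)                 # phi(r) = G'_R(r) / (2 r)
    return (phi * X_t) @ X_t.conj().transpose(0, 2, 1) / X.shape[-1]


def ip1_update_source(W, U, n):
    """Update the n-th demixing filter of every frequency bin."""
    n_bins, n_sources, _ = W.shape
    e_n = np.zeros((n_bins, n_sources, 1))
    e_n[:, n, 0] = 1
    w = np.linalg.solve(W @ U, e_n)[..., 0]          # (n_bins, n_channels)
    scale = np.einsum("im,imk,ik->i", w.conj(), U, w).real
    w = w / np.sqrt(scale)[:, None]                  # so that w^H U w = 1
    W_new = W.copy()
    W_new[:, n, :] = w.conj()                        # row n stores w_in^H
    return W_new


def d_contrast_fn(r):
    return 2 * np.ones_like(r)


rng = np.random.default_rng(0)
n_channels, n_bins, n_frames = 2, 4, 16
X = rng.standard_normal((n_channels, n_bins, n_frames)) \
    + 1j * rng.standard_normal((n_channels, n_bins, n_frames))
W = np.tile(np.eye(n_channels, dtype=complex), (n_bins, 1, 1))

# One sweep over all sources, updating the filters sequentially.
for n in range(n_channels):
    U = weighted_covariance(W, X, n, d_contrast_fn)
    W = ip1_update_source(W, U, n)
```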
update_once_ip2(flooring_fn='self')#

Update demixing filters once using pairwise iterative projection.

Parameters:

flooring_fn (callable or str, optional) – A flooring function for numerical stability. This function is expected to return the same shape tensor as the input. If you explicitly set flooring_fn=None, the identity function (lambda x: x) is used. If self is given as str, self.flooring_fn is used. Default: self.

Return type:

None

For \(n_{1}\) and \(n_{2}\) (\(n_{1}\neq n_{2}\)), compute auxiliary variables:

\[\begin{split}\bar{r}_{jn_{1}} &\leftarrow\|\vec{\boldsymbol{y}}_{jn_{1}}\|_{2} \\ \bar{r}_{jn_{2}} &\leftarrow\|\vec{\boldsymbol{y}}_{jn_{2}}\|_{2}\end{split}\]

Then, for \(n=n_{1},n_{2}\), compute weighted covariance matrix as follows:

\[\begin{split}\boldsymbol{U}_{in_{1}} &= \frac{1}{J}\sum_{j} \varphi(\bar{r}_{jn_{1}})\boldsymbol{x}_{ij}\boldsymbol{x}_{ij}^{\mathsf{H}}, \\ \boldsymbol{U}_{in_{2}} &= \frac{1}{J}\sum_{j} \varphi(\bar{r}_{jn_{2}})\boldsymbol{x}_{ij}\boldsymbol{x}_{ij}^{\mathsf{H}},\end{split}\]

where

\[\varphi(\bar{r}_{jn}) = \frac{G'_{\mathbb{R}}(\bar{r}_{jn})}{2\bar{r}_{jn}}.\]

Using \(\boldsymbol{U}_{in_{1}}\) and \(\boldsymbol{U}_{in_{2}}\), we compute generalized eigenvectors.

\[\left({\boldsymbol{P}_{in_{1}}^{(n_{1},n_{2})}}^{\mathsf{H}}\boldsymbol{U}_{in_{1}} \boldsymbol{P}_{in_{1}}^{(n_{1},n_{2})}\right)\boldsymbol{h}_{i} = \lambda_{i} \left({\boldsymbol{P}_{in_{2}}^{(n_{1},n_{2})}}^{\mathsf{H}}\boldsymbol{U}_{in_{2}} \boldsymbol{P}_{in_{2}}^{(n_{1},n_{2})}\right)\boldsymbol{h}_{i},\]

where

\[\begin{split}\boldsymbol{P}_{in_{1}}^{(n_{1},n_{2})} &= (\boldsymbol{W}_{i}\boldsymbol{U}_{in_{1}})^{-1} ( \begin{array}{cc} \boldsymbol{e}_{n_{1}} & \boldsymbol{e}_{n_{2}} \end{array} ), \\ \boldsymbol{P}_{in_{2}}^{(n_{1},n_{2})} &= (\boldsymbol{W}_{i}\boldsymbol{U}_{in_{2}})^{-1} ( \begin{array}{cc} \boldsymbol{e}_{n_{1}} & \boldsymbol{e}_{n_{2}} \end{array} ).\end{split}\]

After that, we standardize two eigenvectors \(\boldsymbol{h}_{in_{1}}\) and \(\boldsymbol{h}_{in_{2}}\).

\[\begin{split}\boldsymbol{h}_{in_{1}} &\leftarrow\frac{\boldsymbol{h}_{in_{1}}} {\sqrt{\boldsymbol{h}_{in_{1}}^{\mathsf{H}} \left({\boldsymbol{P}_{in_{1}}^{(n_{1},n_{2})}}^{\mathsf{H}}\boldsymbol{U}_{in_{1}} \boldsymbol{P}_{in_{1}}^{(n_{1},n_{2})}\right) \boldsymbol{h}_{in_{1}}}}, \\ \boldsymbol{h}_{in_{2}} &\leftarrow\frac{\boldsymbol{h}_{in_{2}}} {\sqrt{\boldsymbol{h}_{in_{2}}^{\mathsf{H}} \left({\boldsymbol{P}_{in_{2}}^{(n_{1},n_{2})}}^{\mathsf{H}}\boldsymbol{U}_{in_{2}} \boldsymbol{P}_{in_{2}}^{(n_{1},n_{2})}\right) \boldsymbol{h}_{in_{2}}}}.\end{split}\]

Then, update \(\boldsymbol{w}_{in_{1}}\) and \(\boldsymbol{w}_{in_{2}}\) simultaneously.

\[\begin{split}\boldsymbol{w}_{in_{1}} &\leftarrow \boldsymbol{P}_{in_{1}}^{(n_{1},n_{2})}\boldsymbol{h}_{in_{1}} \\ \boldsymbol{w}_{in_{2}} &\leftarrow \boldsymbol{P}_{in_{2}}^{(n_{1},n_{2})}\boldsymbol{h}_{in_{2}}.\end{split}\]

At each iteration, we update pairs of \(n_{1}\) and \(n_{2}\) for \(n_{1}\neq n_{2}\).
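
The generalized eigenvalue step and the subsequent standardization can be sketched with NumPy alone. The sketch below is illustrative: A and B stand in for the two \(2\times 2\) Hermitian matrices of the generalized eigenproblem, the helper names (generalized_eigh, normalize, random_hpd) are hypothetical, and the assignment of the two eigenvectors to \(n_{1}\) and \(n_{2}\) is deliberately left out because it is not specified here.

```python
import numpy as np


def generalized_eigh(A, B):
    """Solve A h = lambda B h for batches of Hermitian positive definite A, B."""
    L = np.linalg.cholesky(B)                        # B = L L^H
    L_inv = np.linalg.inv(L)
    L_inv_H = L_inv.conj().swapaxes(-2, -1)
    C = L_inv @ A @ L_inv_H                          # Hermitian, same eigenvalues
    lam, Z = np.linalg.eigh(C)
    H = L_inv_H @ Z                                  # columns are the vectors h
    return lam, H


def normalize(h, V):
    """Scale h (n_bins, 2) so that h^H V h = 1 in every bin."""
    scale = np.einsum("im,imk,ik->i", h.conj(), V, h).real
    return h / np.sqrt(scale)[:, None]


def random_hpd(rng, n_bins, size):
    # Random Hermitian positive definite matrices, used as stand-in data.
    Z = rng.standard_normal((n_bins, size, 2 * size)) \
        + 1j * rng.standard_normal((n_bins, size, 2 * size))
    return Z @ Z.conj().swapaxes(-2, -1) / (2 * size)


rng = np.random.default_rng(0)
A = random_hpd(rng, n_bins=4, size=2)
B = random_hpd(rng, n_bins=4, size=2)
lam, H = generalized_eigh(A, B)
h_first = normalize(H[:, :, 0], A)
h_second = normalize(H[:, :, 1], B)
```

Whitening by the Cholesky factor of B turns the generalized problem into an ordinary Hermitian one, so np.linalg.eigh can be used without SciPy.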

update_once_ipa(flooring_fn='self')#

Update estimated spectrograms once using iterative projection with adjustment [5].

Parameters:

flooring_fn (callable or str, optional) – A flooring function for numerical stability. This function is expected to return the same shape tensor as the input. If you explicitly set flooring_fn=None, the identity function (lambda x: x) is used. If self is given as str, self.flooring_fn is used. Default: self.

Return type:

None

First, we compute auxiliary variables:

\[\bar{r}_{jn} \leftarrow\|\vec{\boldsymbol{y}}_{jn}\|_{2},\]

where

\[\begin{split}G(\vec{\boldsymbol{y}}_{jn}) &= -\log p(\vec{\boldsymbol{y}}_{jn}), \\ G_{\mathbb{R}}(\|\vec{\boldsymbol{y}}_{jn}\|_{2}) &= G(\vec{\boldsymbol{y}}_{jn}).\end{split}\]

Then, we define \(\tilde{\boldsymbol{U}}_{in'}\), \(\boldsymbol{A}_{in}\in\mathbb{R}^{(N-1)\times(N-1)}\), \(\boldsymbol{b}_{in}\in\mathbb{C}^{N-1}\), \(\boldsymbol{C}_{in}\in\mathbb{C}^{(N-1)\times(N-1)}\), \(\boldsymbol{d}_{in}\in\mathbb{C}^{N-1}\), and \(z_{in}\in\mathbb{R}_{\geq 0}\) as follows:

\[\begin{split}\tilde{\boldsymbol{U}}_{in'} &= \frac{1}{J}\sum_{j}\frac{G'_{\mathbb{R}}(\bar{r}_{jn'})}{2\bar{r}_{jn'}} \boldsymbol{y}_{ij}\boldsymbol{y}_{ij}^{\mathsf{H}}, \\ \boldsymbol{A}_{in} &= \mathrm{diag}(\ldots, \boldsymbol{e}_{n}^{\mathsf{T}}\tilde{\boldsymbol{U}}_{in'}\boldsymbol{e}_{n} ,\ldots)~~(n'\neq n), \\ \boldsymbol{b}_{in} &= (\ldots, \boldsymbol{e}_{n}^{\mathsf{T}}\tilde{\boldsymbol{U}}_{in'}\boldsymbol{e}_{n'} ,\ldots)^{\mathsf{T}}~~(n'\neq n), \\ \boldsymbol{C}_{in} &= \bar{\boldsymbol{E}}_{n}^{\mathsf{T}}(\tilde{\boldsymbol{U}}_{in}^{-1})^{*} \bar{\boldsymbol{E}}_{n}, \\ \boldsymbol{d}_{in} &= \bar{\boldsymbol{E}}_{n}^{\mathsf{T}}(\tilde{\boldsymbol{U}}_{in}^{-1})^{*} \boldsymbol{e}_{n}, \\ z_{in} &= \boldsymbol{e}_{n}^{\mathsf{T}}\tilde{\boldsymbol{U}}_{in}^{-1}\boldsymbol{e}_{n} - \boldsymbol{d}_{in}^{\mathsf{H}}\boldsymbol{C}_{in}^{-1}\boldsymbol{d}_{in}.\end{split}\]

Using these quantities, \(\boldsymbol{y}_{ij}\) is updated via log-quadratically penalized quadratic minimization (LQPQM):

\[\begin{split}\check{\boldsymbol{q}}_{in} &\leftarrow \mathrm{LQPQM2}(\boldsymbol{H}_{in},\boldsymbol{v}_{in},z_{in}), \\ \boldsymbol{q}_{in} &\leftarrow \boldsymbol{G}_{in}^{-1}\check{\boldsymbol{q}}_{in} - \boldsymbol{A}_{in}^{-1}\boldsymbol{b}_{in}, \\ \tilde{\boldsymbol{q}}_{in} &\leftarrow \boldsymbol{e}_{n} - \bar{\boldsymbol{E}}_{n}\boldsymbol{q}_{in}, \\ \boldsymbol{p}_{in} &\leftarrow \frac{\tilde{\boldsymbol{U}}_{in}^{-1}\tilde{\boldsymbol{q}}_{in}^{*}} {\sqrt{(\tilde{\boldsymbol{q}}_{in}^{*})^{\mathsf{H}}\tilde{\boldsymbol{U}}_{in}^{-1} \tilde{\boldsymbol{q}}_{in}^{*}}}, \\ \boldsymbol{\Upsilon}_{i}^{(n)} &\leftarrow \boldsymbol{I} + \boldsymbol{e}_{n}(\boldsymbol{p}_{in} - \boldsymbol{e}_{n})^{\mathsf{H}} + \bar{\boldsymbol{E}}_{n}\boldsymbol{q}_{in}^{*}\boldsymbol{e}_{n}^{\mathsf{T}}, \\ \boldsymbol{y}_{ij} &\leftarrow \boldsymbol{\Upsilon}_{i}^{(n)}\boldsymbol{y}_{ij},\end{split}\]
update_once_iss1(flooring_fn='self')#

Update estimated spectrograms once using iterative source steering [6].

Parameters:

flooring_fn (callable or str, optional) – A flooring function for numerical stability. This function is expected to return the same shape tensor as the input. If you explicitly set flooring_fn=None, the identity function (lambda x: x) is used. If self is given as str, self.flooring_fn is used. Default: self.

Return type:

None

First, update auxiliary variables

\[\bar{r}_{jn} \leftarrow\|\vec{\boldsymbol{y}}_{jn}\|_{2}.\]

Then, update \(y_{ijn}\) as follows:

\[\begin{split}\boldsymbol{y}_{ij} & \leftarrow\boldsymbol{y}_{ij} - \boldsymbol{d}_{in}y_{ijn}, \\ d_{inn'} &= \begin{cases} \dfrac{\sum_{j}\dfrac{G'_{\mathbb{R}}(\bar{r}_{jn'})}{2\bar{r}_{jn'}} y_{ijn'}y_{ijn}^{*}}{\sum_{j}\dfrac{G'_{\mathbb{R}}(\bar{r}_{jn'})} {2\bar{r}_{jn'}}|y_{ijn}|^{2}} & (n'\neq n) \\ 1 - \dfrac{1}{\sqrt{\dfrac{1}{J}\sum_{j}\dfrac{G'_{\mathbb{R}}(\bar{r}_{jn'})} {2\bar{r}_{jn'}} |y_{ijn}|^{2}}} & (n'=n) \end{cases}.\end{split}\]
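
The update for a single source index \(n\) can be sketched in NumPy as follows. This is an illustrative sketch rather than the library's code: it assumes Y is stored as (n_sources, n_bins, n_frames), applies no flooring, and uses the hypothetical helper name iss1_update_source.

```python
import numpy as np


def iss1_update_source(Y, n, d_contrast_fn):
    """Apply y_ij <- y_ij - d_in * y_ijn for one source index n."""
    r = np.sqrt(np.sum(np.abs(Y) ** 2, axis=1))                 # (n_sources, n_frames)
    phi = d_contrast_fn(r) / (2 * r)                            # G'_R(r) / (2 r)
    y_n = Y[n]                                                  # (n_bins, n_frames)
    # Frame averages; the 1/J factors cancel in the ratio for n' != n.
    num = np.mean(phi[:, None, :] * Y * y_n.conj(), axis=-1)    # (n_sources, n_bins)
    den = np.mean(phi[:, None, :] * np.abs(y_n) ** 2, axis=-1)  # (n_sources, n_bins)
    d = num / den
    d[n] = 1 - 1 / np.sqrt(den[n])                              # case n' = n
    return Y - d[:, :, None] * y_n


def d_contrast_fn(r):
    return 2 * np.ones_like(r)


rng = np.random.default_rng(0)
n_sources, n_bins, n_frames = 3, 4, 16
Y = rng.standard_normal((n_sources, n_bins, n_frames)) \
    + 1j * rng.standard_normal((n_sources, n_bins, n_frames))

# One sweep: steer each source in turn, recomputing r after every step.
Y_est = Y
for n in range(n_sources):
    Y_est = iss1_update_source(Y_est, n, d_contrast_fn)
```

After the step for source \(n\), the weighted power of \(y_{ijn}\) equals one and the weighted correlation between \(y_{ijn'}\) and \(y_{ijn}\) vanishes, both measured with the auxiliary variables computed before the step.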
update_once_iss2(flooring_fn='self')#

Update estimated spectrograms once using pairwise iterative source steering [7].

Parameters:

flooring_fn (callable or str, optional) – A flooring function for numerical stability. This function is expected to return the same shape tensor as the input. If you explicitly set flooring_fn=None, the identity function (lambda x: x) is used. If self is given as str, self.flooring_fn is used. Default: self.

Return type:

None

First, we compute auxiliary variables:

\[\bar{r}_{jn} \leftarrow\|\vec{\boldsymbol{y}}_{jn}\|_{2},\]

where

\[\begin{split}G(\vec{\boldsymbol{y}}_{jn}) &= -\log p(\vec{\boldsymbol{y}}_{jn}), \\ G_{\mathbb{R}}(\|\vec{\boldsymbol{y}}_{jn}\|_{2}) &= G(\vec{\boldsymbol{y}}_{jn}).\end{split}\]

Then, we compute \(\boldsymbol{G}_{in}^{(n_{1},n_{2})}\) and \(\boldsymbol{f}_{in}^{(n_{1},n_{2})}\) for \(n_{1}\neq n_{2}\):

\[\begin{split}\begin{array}{rclc} \boldsymbol{G}_{in}^{(n_{1},n_{2})} &=& {\displaystyle\frac{1}{J}\sum_{j}}\varphi(\bar{r}_{jn}) \boldsymbol{y}_{ij}^{(n_{1},n_{2})}{\boldsymbol{y}_{ij}^{(n_{1},n_{2})}}^{\mathsf{H}} &(n=1,\ldots,N), \\ \boldsymbol{f}_{in}^{(n_{1},n_{2})} &=& {\displaystyle\frac{1}{J}\sum_{j}} \varphi(\bar{r}_{jn})y_{ijn}^{*}\boldsymbol{y}_{ij}^{(n_{1},n_{2})} &(n\neq n_{1},n_{2}), \\ \varphi(\bar{r}_{jn}) &=&\dfrac{G'_{\mathbb{R}}(\bar{r}_{jn})}{2\bar{r}_{jn}}. \end{array}\end{split}\]

Using \(\boldsymbol{G}_{in}^{(n_{1},n_{2})}\) and \(\boldsymbol{f}_{in}^{(n_{1},n_{2})}\), we compute

\[\begin{split}\begin{array}{rclc} \boldsymbol{p}_{in} &=& \dfrac{\boldsymbol{h}_{in}} {\sqrt{\boldsymbol{h}_{in}^{\mathsf{H}}\boldsymbol{G}_{in}^{(n_{1},n_{2})} \boldsymbol{h}_{in}}} & (n=n_{1},n_{2}), \\ \boldsymbol{q}_{in} &=& -{\boldsymbol{G}_{in}^{(n_{1},n_{2})}}^{-1}\boldsymbol{f}_{in}^{(n_{1},n_{2})} & (n\neq n_{1},n_{2}), \end{array}\end{split}\]

where \(\boldsymbol{h}_{in}\) (\(n=n_{1},n_{2}\)) is a generalized eigenvector obtained from

\[\boldsymbol{G}_{in_{1}}^{(n_{1},n_{2})}\boldsymbol{h}_{i} = \lambda_{i}\boldsymbol{G}_{in_{2}}^{(n_{1},n_{2})}\boldsymbol{h}_{i}.\]

Separated signal \(y_{ijn}\) is updated as follows:

\[\begin{split}y_{ijn} &\leftarrow\begin{cases} &\boldsymbol{p}_{in}^{\mathsf{H}}\boldsymbol{y}_{ij}^{(n_{1},n_{2})} & (n=n_{1},n_{2}) \\ &\boldsymbol{q}_{in}^{\mathsf{H}}\boldsymbol{y}_{ij}^{(n_{1},n_{2})} + y_{ijn} & (n\neq n_{1},n_{2}) \end{cases}.\end{split}\]
class ssspy.bss.iva.GradLaplaceIVA(step_size=0.1, flooring_fn=functools.partial(<function max_flooring>, eps=1e-10), callbacks=None, is_holonomic=True, scale_restoration=True, record_loss=True, reference_id=0)#

Independent vector analysis (IVA) using the gradient descent on a Laplace distribution.

We assume \(\vec{\boldsymbol{y}}_{jn}\) follows a Laplace distribution.

\[p(\vec{\boldsymbol{y}}_{jn})\propto\exp(-\|\vec{\boldsymbol{y}}_{jn}\|_{2})\]
Parameters:
  • step_size (float) – A step size of the gradient descent. Default: 1e-1.

  • flooring_fn (callable, optional) – A flooring function for numerical stability. This function is expected to return the same shape tensor as the input. If you explicitly set flooring_fn=None, the identity function (lambda x: x) is used. Default: functools.partial(max_flooring, eps=1e-10).

  • callbacks (callable or list[callable], optional) – Callback functions. Each function is called before separation and at each iteration. Default: None.

  • is_holonomic (bool) – If is_holonomic=True, the holonomic-type update is used. Otherwise, the nonholonomic-type update is used. Default: True.

  • scale_restoration (bool or str) – Technique to restore scale ambiguity. If scale_restoration=True, the projection back technique is applied to estimated spectrograms. You can also specify projection_back or minimal_distortion_principle. Default: True.

  • record_loss (bool) – Record the loss at each iteration of the gradient descent if record_loss=True. Default: True.

  • reference_id (int) – Reference channel for projection back and minimal distortion principle. Default: 0.
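
The projection back technique named in scale_restoration can be illustrated with a short NumPy sketch. It follows the textbook definition (each separated source is rescaled by the corresponding entry of the estimated mixing matrix at the reference channel) and is not taken from the library; the function name projection_back and the array layouts are assumptions.

```python
import numpy as np


def projection_back(Y, W, reference_id=0):
    """Rescale separated signals to the reference microphone.

    Y: (n_sources, n_bins, n_frames), W: (n_bins, n_sources, n_channels).
    """
    A = np.linalg.inv(W)                 # estimated mixing matrices per bin
    scale = A[:, reference_id, :]        # (n_bins, n_sources)
    return scale.T[:, :, None] * Y


rng = np.random.default_rng(0)
n_channels, n_bins, n_frames = 2, 4, 8
X = rng.standard_normal((n_channels, n_bins, n_frames)) \
    + 1j * rng.standard_normal((n_channels, n_bins, n_frames))
W = rng.standard_normal((n_bins, n_channels, n_channels)) \
    + 1j * rng.standard_normal((n_bins, n_channels, n_channels))

# Separate, then restore the scale of every source image.
Y = (W @ X.transpose(1, 0, 2)).transpose(1, 0, 2)
Z = projection_back(Y, W, reference_id=0)
```

With this scaling, the rescaled sources sum to the observation at the reference channel, which removes the scale ambiguity of \(\boldsymbol{W}_{i}\).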

Examples

Update demixing filters using Holonomic-type update:

>>> import numpy as np
>>> from ssspy.bss.iva import GradLaplaceIVA

>>> n_channels, n_bins, n_frames = 2, 2049, 128
>>> spectrogram_mix = np.random.randn(n_channels, n_bins, n_frames) \
...     + 1j * np.random.randn(n_channels, n_bins, n_frames)

>>> iva = GradLaplaceIVA(is_holonomic=True)
>>> spectrogram_est = iva(spectrogram_mix, n_iter=5000)
>>> print(spectrogram_mix.shape, spectrogram_est.shape)
(2, 2049, 128) (2, 2049, 128)

Update demixing filters using Nonholonomic-type update:

>>> n_channels, n_bins, n_frames = 2, 2049, 128
>>> spectrogram_mix = np.random.randn(n_channels, n_bins, n_frames) \
...     + 1j * np.random.randn(n_channels, n_bins, n_frames)

>>> iva = GradLaplaceIVA(is_holonomic=False)
>>> spectrogram_est = iva(spectrogram_mix, n_iter=5000)
>>> print(spectrogram_mix.shape, spectrogram_est.shape)
(2, 2049, 128) (2, 2049, 128)
compute_loss()#

Compute loss \(\mathcal{L}\).

\(\mathcal{L}\) is given as follows:

\[\mathcal{L} \ = \frac{2}{J}\sum_{j,n}\|\vec{\boldsymbol{y}}_{jn}\|_{2} \ - 2\sum_{i}\log|\det\boldsymbol{W}_{i}|.\]
Return type:

float

Returns:

Computed loss.
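
The loss above can be evaluated directly from the separated spectrogram and the demixing matrices. The following NumPy sketch assumes Y has shape (n_sources, n_bins, n_frames) and W has shape (n_bins, n_sources, n_channels); the function name laplace_iva_loss is hypothetical.

```python
import numpy as np


def laplace_iva_loss(Y, W):
    """L = (2 / J) * sum_{j,n} ||y_jn||_2 - 2 * sum_i log|det W_i|."""
    n_frames = Y.shape[-1]
    norm = np.sqrt(np.sum(np.abs(Y) ** 2, axis=1))   # (n_sources, n_frames)
    _, logabsdet = np.linalg.slogdet(W)              # log|det W_i| for each bin
    return float(2 * norm.sum() / n_frames - 2 * logabsdet.sum())


# Toy input: all-ones spectrogram and W_i = 2 I in every bin.
n_sources, n_bins, n_frames = 2, 4, 3
Y = np.ones((n_sources, n_bins, n_frames), dtype=complex)
W = np.tile(2 * np.eye(n_sources, dtype=complex), (n_bins, 1, 1))
loss = laplace_iva_loss(Y, W)
```

np.linalg.slogdet is used instead of taking the logarithm of np.linalg.det so that very small or very large determinants do not underflow or overflow.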

update_once()#

Update demixing filters once using the gradient descent.

Return type:

None

If is_holonomic=True, demixing filters are updated as follows:

\[\boldsymbol{W}_{i} \leftarrow\boldsymbol{W}_{i} - \eta\left(\frac{1}{J}\sum_{j} \ \boldsymbol{\phi}_{i}(\vec{\boldsymbol{Y}}_{j})\boldsymbol{y}_{ij}^{\mathsf{H}} \ -\boldsymbol{I}\right)\boldsymbol{W}_{i}^{-\mathsf{H}},\]

where

\[\begin{split}\boldsymbol{\phi}_{i}(\vec{\boldsymbol{Y}}_{j}) &= \left(\phi_{i}(\vec{\boldsymbol{y}}_{j1}),\ldots, \phi_{i}(\vec{\boldsymbol{y}}_{jn}),\ldots, \phi_{i}(\vec{\boldsymbol{y}}_{jN}) \right)^{\mathsf{T}}\in\mathbb{C}^{N}, \\ \phi_{i}(\vec{\boldsymbol{y}}_{jn}) &= \frac{y_{ijn}}{\|\vec{\boldsymbol{y}}_{jn}\|_{2}}.\end{split}\]

Otherwise (is_holonomic=False),

\[\boldsymbol{W}_{i} \leftarrow\boldsymbol{W}_{i} - \eta\cdot\mathrm{offdiag}\left(\frac{1}{J}\sum_{j} \boldsymbol{\phi}_{i}(\vec{\boldsymbol{Y}}_{j})\boldsymbol{y}_{ij}^{\mathsf{H}}\right) \boldsymbol{W}_{i}^{-\mathsf{H}}.\]
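
Both update rules can be sketched in a few lines of NumPy. The sketch is illustrative only: it assumes W is stored as (n_bins, n_sources, n_channels) and X as (n_channels, n_bins, n_frames), applies no flooring to the norm, and uses the hypothetical function name grad_laplace_step.

```python
import numpy as np


def grad_laplace_step(W, X, step_size=0.1, is_holonomic=True):
    """One gradient-descent step on the demixing matrices of all bins."""
    n_frames = X.shape[-1]
    Y = W @ X.transpose(1, 0, 2)                      # (n_bins, n_sources, n_frames)
    norm = np.sqrt(np.sum(np.abs(Y) ** 2, axis=0))    # (n_sources, n_frames)
    Phi = Y / norm                                    # phi_i(y_jn) = y_ijn / ||y_jn||_2
    G = Phi @ Y.conj().transpose(0, 2, 1) / n_frames  # (1/J) sum_j phi y^H
    eye = np.eye(W.shape[1])
    if is_holonomic:
        delta = G - eye
    else:
        delta = G * (1 - eye)                         # offdiag: zero the diagonal
    W_inv_H = np.linalg.inv(W).conj().transpose(0, 2, 1)
    return W - step_size * delta @ W_inv_H


rng = np.random.default_rng(0)
n_channels, n_bins, n_frames = 2, 4, 16
X = rng.standard_normal((n_channels, n_bins, n_frames)) \
    + 1j * rng.standard_normal((n_channels, n_bins, n_frames))
W = np.tile(np.eye(n_channels, dtype=complex), (n_bins, 1, 1))

W_holonomic = grad_laplace_step(W, X, is_holonomic=True)
W_nonholonomic = grad_laplace_step(W, X, is_holonomic=False)
```

Starting from the identity, the nonholonomic step leaves the diagonal of \(\boldsymbol{W}_{i}\) untouched, which illustrates how it avoids changing the scale of each separated source.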
class ssspy.bss.iva.GradGaussIVA(step_size=0.1, flooring_fn=functools.partial(<function max_flooring>, eps=1e-10), callbacks=None, is_holonomic=True, scale_restoration=True, record_loss=True, reference_id=0)#

Independent vector analysis (IVA) using the gradient descent on a time-varying Gaussian distribution.

We assume \(\vec{\boldsymbol{y}}_{jn}\) follows a time-varying Gaussian distribution.

\[p(\vec{\boldsymbol{y}}_{jn}) \propto\frac{1}{\alpha_{jn}^{I}} \exp\left(-\frac{\|\vec{\boldsymbol{y}}_{jn}\|_{2}^{2}}{\alpha_{jn}}\right).\]
Parameters:
  • step_size (float) – A step size of the gradient descent. Default: 1e-1.

  • flooring_fn (callable, optional) – A flooring function for numerical stability. This function is expected to return the same shape tensor as the input. If you explicitly set flooring_fn=None, the identity function (lambda x: x) is used. Default: functools.partial(max_flooring, eps=1e-10).

  • callbacks (callable or list[callable], optional) – Callback functions. Each function is called before separation and at each iteration. Default: None.

  • is_holonomic (bool) – If is_holonomic=True, the holonomic-type update is used. Otherwise, the nonholonomic-type update is used. Default: True.

  • scale_restoration (bool or str) – Technique to restore scale ambiguity. If scale_restoration=True, the projection back technique is applied to estimated spectrograms. You can also specify projection_back or minimal_distortion_principle. Default: True.

  • record_loss (bool) – Record the loss at each iteration of the gradient descent if record_loss=True. Default: True.

  • reference_id (int) – Reference channel for projection back and minimal distortion principle. Default: 0.

Examples

Update demixing filters using Holonomic-type update:

>>> import numpy as np
>>> from ssspy.bss.iva import GradGaussIVA

>>> n_channels, n_bins, n_frames = 2, 2049, 128
>>> spectrogram_mix = np.random.randn(n_channels, n_bins, n_frames) \
...     + 1j * np.random.randn(n_channels, n_bins, n_frames)

>>> iva = GradGaussIVA(is_holonomic=True)
>>> spectrogram_est = iva(spectrogram_mix, n_iter=5000)
>>> print(spectrogram_mix.shape, spectrogram_est.shape)
(2, 2049, 128) (2, 2049, 128)

Update demixing filters using Nonholonomic-type update:

>>> n_channels, n_bins, n_frames = 2, 2049, 128
>>> spectrogram_mix = np.random.randn(n_channels, n_bins, n_frames) \
...     + 1j * np.random.randn(n_channels, n_bins, n_frames)

>>> iva = GradGaussIVA(is_holonomic=False)
>>> spectrogram_est = iva(spectrogram_mix, n_iter=5000)
>>> print(spectrogram_mix.shape, spectrogram_est.shape)
(2, 2049, 128) (2, 2049, 128)
update_once()#

Update variance and demixing filters once.

Return type:

None

update_source_model()#

Update variance of Gaussian distribution.

Return type:

None

class ssspy.bss.iva.NaturalGradLaplaceIVA(step_size=0.1, flooring_fn=functools.partial(<function max_flooring>, eps=1e-10), callbacks=None, is_holonomic=True, scale_restoration=True, record_loss=True, reference_id=0)#

Independent vector analysis (IVA) using the natural gradient descent on a Laplace distribution.

We assume \(\vec{\boldsymbol{y}}_{jn}\) follows a Laplace distribution.

\[p(\vec{\boldsymbol{y}}_{jn})\propto\exp(-\|\vec{\boldsymbol{y}}_{jn}\|_{2})\]
Parameters:
  • step_size (float) – A step size of the gradient descent. Default: 1e-1.

  • flooring_fn (callable, optional) – A flooring function for numerical stability. This function is expected to return the same shape tensor as the input. If you explicitly set flooring_fn=None, the identity function (lambda x: x) is used. Default: functools.partial(max_flooring, eps=1e-10).

  • callbacks (callable or list[callable], optional) – Callback functions. Each function is called before separation and at each iteration. Default: None.

  • is_holonomic (bool) – If is_holonomic=True, the holonomic-type update is used. Otherwise, the nonholonomic-type update is used. Default: True.

  • scale_restoration (bool or str) – Technique to restore scale ambiguity. If scale_restoration=True, the projection back technique is applied to estimated spectrograms. You can also specify projection_back or minimal_distortion_principle. Default: True.

  • record_loss (bool) – Record the loss at each iteration of the gradient descent if record_loss=True. Default: True.

  • reference_id (int) – Reference channel for projection back and minimal distortion principle. Default: 0.

Examples

Update demixing filters using Holonomic-type update:

>>> import numpy as np
>>> from ssspy.bss.iva import NaturalGradLaplaceIVA

>>> n_channels, n_bins, n_frames = 2, 2049, 128
>>> spectrogram_mix = np.random.randn(n_channels, n_bins, n_frames) \
...     + 1j * np.random.randn(n_channels, n_bins, n_frames)

>>> iva = NaturalGradLaplaceIVA(is_holonomic=True)
>>> spectrogram_est = iva(spectrogram_mix, n_iter=500)
>>> print(spectrogram_mix.shape, spectrogram_est.shape)
(2, 2049, 128) (2, 2049, 128)

Update demixing filters using Nonholonomic-type update:

>>> n_channels, n_bins, n_frames = 2, 2049, 128
>>> spectrogram_mix = np.random.randn(n_channels, n_bins, n_frames) \
...     + 1j * np.random.randn(n_channels, n_bins, n_frames)

>>> iva = NaturalGradLaplaceIVA(is_holonomic=False)
>>> spectrogram_est = iva(spectrogram_mix, n_iter=500)
>>> print(spectrogram_mix.shape, spectrogram_est.shape)
(2, 2049, 128) (2, 2049, 128)
compute_loss()#

Compute loss \(\mathcal{L}\).

\(\mathcal{L}\) is given as follows:

\[\mathcal{L} \ = \frac{2}{J}\sum_{j,n}\|\vec{\boldsymbol{y}}_{jn}\|_{2} \ - 2\sum_{i}\log|\det\boldsymbol{W}_{i}|.\]
Return type:

float

Returns:

Computed loss.

update_once()#

Update demixing filters once using the natural gradient descent.

Return type:

None

If is_holonomic=True, demixing filters are updated as follows:

\[\boldsymbol{W}_{i} \leftarrow\boldsymbol{W}_{i} - \eta\left(\frac{1}{J}\sum_{j} \ \boldsymbol{\phi}_{i}(\vec{\boldsymbol{Y}}_{j})\boldsymbol{y}_{ij}^{\mathsf{H}} \ -\boldsymbol{I}\right)\boldsymbol{W}_{i},\]

where

\[\begin{split}\boldsymbol{\phi}_{i}(\vec{\boldsymbol{Y}}_{j}) &= \left(\phi_{i}(\vec{\boldsymbol{y}}_{j1}),\ldots, \phi_{i}(\vec{\boldsymbol{y}}_{jn}),\ldots, \phi_{i}(\vec{\boldsymbol{y}}_{jN}) \right)^{\mathsf{T}}\in\mathbb{C}^{N}, \\ \phi_{i}(\vec{\boldsymbol{y}}_{jn}) &= \frac{y_{ijn}}{\|\vec{\boldsymbol{y}}_{jn}\|_{2}}.\end{split}\]

Otherwise (is_holonomic=False),

\[\boldsymbol{W}_{i} \leftarrow\boldsymbol{W}_{i} - \eta\cdot\mathrm{offdiag}\left(\frac{1}{J}\sum_{j} \boldsymbol{\phi}_{i}(\vec{\boldsymbol{Y}}_{j})\boldsymbol{y}_{ij}^{\mathsf{H}}\right) \boldsymbol{W}_{i}.\]
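
Compared with the plain gradient update, the natural gradient update replaces the factor \(\boldsymbol{W}_{i}^{-\mathsf{H}}\) by \(\boldsymbol{W}_{i}\), which corresponds to right-multiplying the gradient direction by \(\boldsymbol{W}_{i}^{\mathsf{H}}\boldsymbol{W}_{i}\). The following NumPy check illustrates this identity on random matrices; G is a stand-in for \(\frac{1}{J}\sum_{j}\boldsymbol{\phi}_{i}(\vec{\boldsymbol{Y}}_{j})\boldsymbol{y}_{ij}^{\mathsf{H}}\), and all variable names are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
n_bins, n_sources = 3, 2
shape = (n_bins, n_sources, n_sources)

# Random stand-ins for the demixing matrices and the correlation term.
W = rng.standard_normal(shape) + 1j * rng.standard_normal(shape)
G = rng.standard_normal(shape) + 1j * rng.standard_normal(shape)

eye = np.eye(n_sources)
W_H = W.conj().transpose(0, 2, 1)

# Gradient direction: (G - I) W^{-H}.
grad = (G - eye) @ np.linalg.inv(W_H)

# Natural gradient direction: right-multiply the gradient by W^H W.
natural_grad = grad @ W_H @ W

# Direct form that appears in the natural gradient update rule: (G - I) W.
direct = (G - eye) @ W
```

Because the natural gradient form needs no matrix inverse, each step is cheaper and less sensitive to an ill-conditioned \(\boldsymbol{W}_{i}\).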
class ssspy.bss.iva.NaturalGradGaussIVA(step_size=0.1, flooring_fn=functools.partial(<function max_flooring>, eps=1e-10), callbacks=None, is_holonomic=True, scale_restoration=True, record_loss=True, reference_id=0)#

Independent vector analysis (IVA) using the natural gradient descent on a time-varying Gaussian distribution.

We assume \(\vec{\boldsymbol{y}}_{jn}\) follows a time-varying Gaussian distribution.

\[p(\vec{\boldsymbol{y}}_{jn}) \propto\frac{1}{\alpha_{jn}^{I}} \exp\left(-\frac{\|\vec{\boldsymbol{y}}_{jn}\|_{2}^{2}}{\alpha_{jn}}\right).\]
Parameters:
  • step_size (float) – A step size of the gradient descent. Default: 1e-1.

  • flooring_fn (callable, optional) – A flooring function for numerical stability. This function is expected to return the same shape tensor as the input. If you explicitly set flooring_fn=None, the identity function (lambda x: x) is used. Default: functools.partial(max_flooring, eps=1e-10).

  • callbacks (callable or list[callable], optional) – Callback functions. Each function is called before separation and at each iteration. Default: None.

  • is_holonomic (bool) – If is_holonomic=True, the holonomic-type update is used. Otherwise, the nonholonomic-type update is used. Default: True.

  • scale_restoration (bool or str) – Technique to restore scale ambiguity. If scale_restoration=True, the projection back technique is applied to estimated spectrograms. You can also specify projection_back or minimal_distortion_principle. Default: True.

  • record_loss (bool) – Record the loss at each iteration of the gradient descent if record_loss=True. Default: True.

  • reference_id (int) – Reference channel for projection back and minimal distortion principle. Default: 0.

Examples

Update demixing filters using Holonomic-type update:

>>> import numpy as np
>>> from ssspy.bss.iva import NaturalGradGaussIVA

>>> n_channels, n_bins, n_frames = 2, 2049, 128
>>> spectrogram_mix = np.random.randn(n_channels, n_bins, n_frames) \
...     + 1j * np.random.randn(n_channels, n_bins, n_frames)

>>> iva = NaturalGradGaussIVA(is_holonomic=True)
>>> spectrogram_est = iva(spectrogram_mix, n_iter=500)
>>> print(spectrogram_mix.shape, spectrogram_est.shape)
(2, 2049, 128) (2, 2049, 128)

Update demixing filters using Nonholonomic-type update:

>>> n_channels, n_bins, n_frames = 2, 2049, 128
>>> spectrogram_mix = np.random.randn(n_channels, n_bins, n_frames) \
...     + 1j * np.random.randn(n_channels, n_bins, n_frames)

>>> iva = NaturalGradGaussIVA(is_holonomic=False)
>>> spectrogram_est = iva(spectrogram_mix, n_iter=500)
>>> print(spectrogram_mix.shape, spectrogram_est.shape)
(2, 2049, 128) (2, 2049, 128)
compute_loss()#

Compute loss \(\mathcal{L}\).

\(\mathcal{L}\) is given as follows:

\[\begin{split}\mathcal{L} \ &= \frac{1}{J}\sum_{j,n}G(\vec{\boldsymbol{y}}_{jn}) \ - 2\sum_{i}\log|\det\boldsymbol{W}_{i}|, \\ G(\vec{\boldsymbol{y}}_{jn}) \ &= - \log p(\vec{\boldsymbol{y}}_{jn})\end{split}\]
Return type:

float

Returns:

Computed loss.

update_once()#

Update variance and demixing filters once.

Return type:

None

class ssspy.bss.iva.AuxLaplaceIVA(spatial_algorithm='IP', flooring_fn=functools.partial(<function max_flooring>, eps=1e-10), pair_selector=None, callbacks=None, scale_restoration=True, record_loss=True, reference_id=0, **kwargs)#

Auxiliary-function-based independent vector analysis (IVA) on a Laplace distribution.

We assume \(\vec{\boldsymbol{y}}_{jn}\) follows a Laplace distribution.

\[p(\vec{\boldsymbol{y}}_{jn})\propto\exp(-\|\vec{\boldsymbol{y}}_{jn}\|_{2})\]
Parameters:
  • spatial_algorithm (str) – Algorithm for demixing filter updates. Choose IP, IP1, IP2, ISS, ISS1, or ISS2. Default: IP.

  • flooring_fn (callable, optional) – A flooring function for numerical stability. This function is expected to return the same shape tensor as the input. If you explicitly set flooring_fn=None, the identity function (lambda x: x) is used.

  • pair_selector (callable, optional) – Selector that chooses the pair to update in IP2 and ISS2. If None is given, sequential_pair_selector is used. Default: None.

  • callbacks (callable or list[callable], optional) – Callback functions. Each function is called before separation and at each iteration. Default: None.

  • scale_restoration (bool or str) – Technique to restore scale ambiguity. If scale_restoration=True, the projection back technique is applied to estimated spectrograms. You can also specify projection_back or minimal_distortion_principle. Default: True.

  • record_loss (bool) – Record the loss at each iteration of the demixing filter update if record_loss=True. Default: True.

  • reference_id (int) – Reference channel for projection back and minimal distortion principle. Default: 0.
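
A flooring function only needs to return an array of the same shape as its input. The sketch below is a hypothetical stand-in for the default shown in the class signature: it assumes that max_flooring clips values from below at eps, which is an assumption about the library and not confirmed here. The name max_flooring_sketch is illustrative.

```python
import functools

import numpy as np


def max_flooring_sketch(x, eps=1e-10):
    # Assumed behavior: clip from below so later divisions stay finite.
    return np.maximum(x, eps)


flooring_fn = functools.partial(max_flooring_sketch, eps=1e-10)

# Norms of separated signals; the first two frames are (nearly) silent.
r = np.array([[0.0, 1e-12, 0.5]])
r_floored = flooring_fn(r)

# Weight 1 / r for the Laplace model, now free of division by zero.
phi = 1 / r_floored
```

Passing flooring_fn=None instead would leave r unchanged, so a silent frame would produce an infinite weight.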

Examples

Update demixing filters by IP:

>>> import numpy as np
>>> from ssspy.bss.iva import AuxLaplaceIVA

>>> n_channels, n_bins, n_frames = 2, 2049, 128
>>> spectrogram_mix = np.random.randn(n_channels, n_bins, n_frames) \
...     + 1j * np.random.randn(n_channels, n_bins, n_frames)

>>> iva = AuxLaplaceIVA(spatial_algorithm="IP")
>>> spectrogram_est = iva(spectrogram_mix, n_iter=100)
>>> print(spectrogram_mix.shape, spectrogram_est.shape)
(2, 2049, 128) (2, 2049, 128)

Update demixing filters by IP2:

>>> from ssspy.utils.select_pair import sequential_pair_selector

>>> n_channels, n_bins, n_frames = 2, 2049, 128
>>> spectrogram_mix = np.random.randn(n_channels, n_bins, n_frames) \
...     + 1j * np.random.randn(n_channels, n_bins, n_frames)

>>> iva = AuxLaplaceIVA(
...     spatial_algorithm="IP2",
...     pair_selector=sequential_pair_selector,
... )
>>> spectrogram_est = iva(spectrogram_mix, n_iter=100)
>>> print(spectrogram_mix.shape, spectrogram_est.shape)
(2, 2049, 128) (2, 2049, 128)

Update demixing filters by ISS:

>>> n_channels, n_bins, n_frames = 2, 2049, 128
>>> spectrogram_mix = np.random.randn(n_channels, n_bins, n_frames) \
...     + 1j * np.random.randn(n_channels, n_bins, n_frames)

>>> iva = AuxLaplaceIVA(spatial_algorithm="ISS")
>>> spectrogram_est = iva(spectrogram_mix, n_iter=100)
>>> print(spectrogram_mix.shape, spectrogram_est.shape)
(2, 2049, 128) (2, 2049, 128)

Update demixing filters by ISS2:

>>> import functools
>>> from ssspy.utils.select_pair import sequential_pair_selector

>>> n_channels, n_bins, n_frames = 2, 2049, 128
>>> spectrogram_mix = np.random.randn(n_channels, n_bins, n_frames) \
...     + 1j * np.random.randn(n_channels, n_bins, n_frames)

>>> iva = AuxLaplaceIVA(
...     spatial_algorithm="ISS2",
...     pair_selector=functools.partial(sequential_pair_selector, step=2),
... )
>>> spectrogram_est = iva(spectrogram_mix, n_iter=100)
>>> print(spectrogram_mix.shape, spectrogram_est.shape)
(2, 2049, 128) (2, 2049, 128)

class ssspy.bss.iva.AuxGaussIVA(spatial_algorithm='IP', flooring_fn=functools.partial(<function max_flooring>, eps=1e-10), pair_selector=None, callbacks=None, scale_restoration=True, record_loss=True, reference_id=0, **kwargs)#

Auxiliary-function-based independent vector analysis (IVA) on a time-varying Gaussian distribution [8].

We assume \(\vec{\boldsymbol{y}}_{jn}\) follows a time-varying Gaussian distribution.

\[p(\vec{\boldsymbol{y}}_{jn}) \propto\frac{1}{\alpha_{jn}^{I}} \exp\left(-\frac{\|\vec{\boldsymbol{y}}_{jn}\|_{2}^{2}}{\alpha_{jn}}\right).\]

Parameters:
  • spatial_algorithm (str) – Algorithm for demixing filter updates. Choose IP, IP1, IP2, ISS, ISS1, or ISS2. Default: IP.

  • flooring_fn (callable, optional) – A flooring function for numerical stability. This function is expected to return a tensor of the same shape as its input. If you explicitly set flooring_fn=None, the identity function (lambda x: x) is used. Default: functools.partial(max_flooring, eps=1e-10).

  • pair_selector (callable, optional) – Selector that chooses the pair to update in IP2 and ISS2. If None is given, sequential_pair_selector is used. Default: None.

  • callbacks (callable or list[callable], optional) – Callback functions. Each function is called before separation and at each iteration. Default: None.

  • scale_restoration (bool or str) – Technique to restore scale ambiguity. If scale_restoration=True, the projection back technique is applied to estimated spectrograms. You can also specify projection_back or minimal_distortion_principle. Default: True.

  • record_loss (bool) – Record the loss at each iteration of the demixing filter update if record_loss=True. Default: True.

  • reference_id (int) – Reference channel for projection back and minimal distortion principle. Default: 0.
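
As an illustration of the contract that flooring_fn is expected to satisfy (same shape in, same shape out, small values clipped), here is a minimal sketch. The function max_flooring_sketch is a hypothetical stand-in written for this example; it is assumed, not confirmed, to behave like the library's default flooring.

```python
import functools

import numpy as np


def max_flooring_sketch(x, eps=1e-10):
    """Clip values below eps up to eps, keeping the input shape (illustrative)."""
    return np.maximum(x, eps)


# Bind eps in the same way as the default shown in the class signature.
flooring_fn = functools.partial(max_flooring_sketch, eps=1e-10)

variance = np.array([[0.0, 1e-12], [0.5, 2.0]])
floored = flooring_fn(variance)
```

A callable built this way could be passed as flooring_fn; flooring matters mainly where a variance estimate is later used as a divisor.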

Examples

Update demixing filters by IP:

>>> import numpy as np
>>> from ssspy.bss.iva import AuxGaussIVA

>>> n_channels, n_bins, n_frames = 2, 2049, 128
>>> spectrogram_mix = np.random.randn(n_channels, n_bins, n_frames) \
...     + 1j * np.random.randn(n_channels, n_bins, n_frames)

>>> iva = AuxGaussIVA(spatial_algorithm="IP")
>>> spectrogram_est = iva(spectrogram_mix, n_iter=100)
>>> print(spectrogram_mix.shape, spectrogram_est.shape)
(2, 2049, 128) (2, 2049, 128)

Update demixing filters by IP2:

>>> from ssspy.utils.select_pair import sequential_pair_selector

>>> n_channels, n_bins, n_frames = 2, 2049, 128
>>> spectrogram_mix = np.random.randn(n_channels, n_bins, n_frames) \
...     + 1j * np.random.randn(n_channels, n_bins, n_frames)

>>> iva = AuxGaussIVA(
...     spatial_algorithm="IP2",
...     pair_selector=sequential_pair_selector,
... )
>>> spectrogram_est = iva(spectrogram_mix, n_iter=100)
>>> print(spectrogram_mix.shape, spectrogram_est.shape)
(2, 2049, 128) (2, 2049, 128)

Update demixing filters by ISS:

>>> n_channels, n_bins, n_frames = 2, 2049, 128
>>> spectrogram_mix = np.random.randn(n_channels, n_bins, n_frames) \
...     + 1j * np.random.randn(n_channels, n_bins, n_frames)

>>> iva = AuxGaussIVA(spatial_algorithm="ISS")
>>> spectrogram_est = iva(spectrogram_mix, n_iter=100)
>>> print(spectrogram_mix.shape, spectrogram_est.shape)
(2, 2049, 128) (2, 2049, 128)

Update demixing filters by ISS2:

>>> import functools
>>> from ssspy.utils.select_pair import sequential_pair_selector

>>> n_channels, n_bins, n_frames = 2, 2049, 128
>>> spectrogram_mix = np.random.randn(n_channels, n_bins, n_frames) \
...     + 1j * np.random.randn(n_channels, n_bins, n_frames)

>>> iva = AuxGaussIVA(
...     spatial_algorithm="ISS2",
...     pair_selector=functools.partial(sequential_pair_selector, step=2),
... )
>>> spectrogram_est = iva(spectrogram_mix, n_iter=100)
>>> print(spectrogram_mix.shape, spectrogram_est.shape)
(2, 2049, 128) (2, 2049, 128)

update_once(flooring_fn='self')#

Update variance and demixing filters once.

Parameters:

flooring_fn (callable or str, optional) – A flooring function for numerical stability. This function is expected to return a tensor of the same shape as its input. If you explicitly set flooring_fn=None, the identity function (lambda x: x) is used. If the string "self" is given, self.flooring_fn is used. Default: "self".

Return type:

None
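
The following NumPy sketch shows what one such iteration could look like for spatial_algorithm="IP". It assumes the textbook auxiliary-function IVA update with a time-varying Gaussian source model: re-estimate the variances, then update each demixing filter by iterative projection. The function names, array layouts, and the fixed eps flooring are assumptions made for this example and do not describe the library's internal code.

```python
import numpy as np


def loss_sketch(X, W):
    """Loss L with each variance alpha_jn set to its minimizer ||y_jn||^2 / I."""
    n_bins = X.shape[1]
    Y = np.einsum("inm,mij->nij", W, X)  # y_ij = W_i x_ij
    alpha = np.mean(np.abs(Y) ** 2, axis=1)  # (n_sources, n_frames)
    # G(y_jn) = I log(alpha_jn) + ||y_jn||^2 / alpha_jn = I log(alpha_jn) + I.
    G = n_bins * np.log(alpha) + n_bins
    logdet = np.log(np.abs(np.linalg.det(W)))  # (n_bins,)
    return np.mean(G.sum(axis=0)) - 2 * logdet.sum()


def update_once_ip_sketch(X, W, eps=1e-10):
    """One iteration: variance update followed by an IP sweep over all sources.

    X: (n_channels, n_bins, n_frames), W: (n_bins, n_sources, n_channels), complex.
    """
    n_bins, n_sources, _ = W.shape
    W = W.copy()
    Y = np.einsum("inm,mij->nij", W, X)
    # Source model: alpha_jn = ||y_jn||^2 / I, floored for numerical stability.
    alpha = np.maximum(np.mean(np.abs(Y) ** 2, axis=1), eps)
    # Outer products x_ij x_ij^H, shape (n_bins, n_frames, n_channels, n_channels).
    XX = np.einsum("mij,kij->ijmk", X, X.conj())
    for n in range(n_sources):
        # Weighted covariance U_in = (1/J) sum_j x_ij x_ij^H / alpha_jn.
        weight = alpha[n][np.newaxis, :, np.newaxis, np.newaxis]
        U = np.mean(XX / weight, axis=1)  # (n_bins, n_channels, n_channels)
        # w_in = (W_i U_in)^{-1} e_n, using the filters already updated in this sweep.
        e_n = np.zeros((n_bins, n_sources, 1), dtype=W.dtype)
        e_n[:, n, 0] = 1.0
        w = np.linalg.solve(W @ U, e_n)[..., 0]  # (n_bins, n_channels)
        # Normalize so that w_in^H U_in w_in = 1.
        norm = np.einsum("im,imk,ik->i", w.conj(), U, w).real
        w = w / np.sqrt(np.maximum(norm, eps))[:, np.newaxis]
        # Row n of W_i is w_in^H.
        W[:, n, :] = w.conj()
    return W


rng = np.random.default_rng(0)
n_channels, n_bins, n_frames = 2, 4, 50
shape = (n_channels, n_bins, n_frames)
X = rng.standard_normal(shape) + 1j * rng.standard_normal(shape)
W0 = np.tile(np.eye(n_channels, dtype=complex), (n_bins, 1, 1))
W1 = update_once_ip_sketch(X, W0)
loss_before, loss_after = loss_sketch(X, W0), loss_sketch(X, W1)
```

Under these assumptions each iteration should not increase the loss, which is the usual motivation for recording it per iteration with record_loss=True.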

update_source_model()#

Update variance of Gaussian distribution.

Return type:

None
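
One way to read this update, assuming the variance is chosen to minimize the loss for fixed demixing filters (the text above does not spell this out): taking \(p(\vec{\boldsymbol{y}}_{jn})\propto\alpha_{jn}^{-I}\exp(-\|\vec{\boldsymbol{y}}_{jn}\|_{2}^{2}/\alpha_{jn})\), the contrast function is, up to an additive constant,

\[G(\vec{\boldsymbol{y}}_{jn}) = I\log\alpha_{jn} + \frac{\|\vec{\boldsymbol{y}}_{jn}\|_{2}^{2}}{\alpha_{jn}}.\]

Setting its derivative with respect to \(\alpha_{jn}\) to zero gives

\[\frac{\partial G(\vec{\boldsymbol{y}}_{jn})}{\partial\alpha_{jn}} = \frac{I}{\alpha_{jn}} - \frac{\|\vec{\boldsymbol{y}}_{jn}\|_{2}^{2}}{\alpha_{jn}^{2}} = 0 \quad\Longrightarrow\quad \alpha_{jn} = \frac{1}{I}\|\vec{\boldsymbol{y}}_{jn}\|_{2}^{2} = \frac{1}{I}\sum_{i}|y_{ijn}|^{2},\]

that is, the variance of source \(n\) at frame \(j\) is the power of the separated signal averaged over frequency bins.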