ssspy.bss.iva#

In this module, we separate multichannel signals using independent vector analysis (IVA). We denote the number of sources and microphones as \(N\) and \(M\), respectively. We also denote short-time Fourier transforms of source, observed, and separated signals as \(\boldsymbol{s}_{ij}\), \(\boldsymbol{x}_{ij}\), and \(\boldsymbol{y}_{ij}\), respectively.

\[\begin{split}\boldsymbol{s}_{ij} &= (s_{ij1},\ldots,s_{ijn},\ldots,s_{ijN})^{\mathsf{T}}\in\mathbb{C}^{N}, \\ \boldsymbol{x}_{ij} &= (x_{ij1},\ldots,x_{ijm},\ldots,x_{ijM})^{\mathsf{T}}\in\mathbb{C}^{M}, \\ \boldsymbol{y}_{ij} &= (y_{ij1},\ldots,y_{ijn},\ldots,y_{ijN})^{\mathsf{T}}\in\mathbb{C}^{N},\end{split}\]

where \(i=1,\ldots,I\) and \(j=1,\ldots,J\) are indices of frequency bins and time frames, respectively. We also define the following vector:

\[\vec{\boldsymbol{y}}_{jn} = (y_{1jn},\ldots,y_{ijn},\ldots,y_{Ijn})^{\mathsf{T}}\in\mathbb{C}^{I}.\]

When a mixing system is time-invariant, \(\boldsymbol{x}_{ij}\) is represented as follows:

\[\boldsymbol{x}_{ij} = \boldsymbol{A}_{i}\boldsymbol{s}_{ij},\]

where \(\boldsymbol{A}_{i}=(\boldsymbol{a}_{i1},\ldots,\boldsymbol{a}_{in},\ldots,\boldsymbol{a}_{iN})\in\mathbb{C}^{M\times N}\) is a mixing matrix. If \(M=N\) and \(\boldsymbol{A}_{i}\) is non-singular, a demixing system is represented as

\[\boldsymbol{y}_{ij} = \boldsymbol{W}_{i}\boldsymbol{x}_{ij},\]

where \(\boldsymbol{W}_{i}=(\boldsymbol{w}_{i1},\ldots,\boldsymbol{w}_{in},\ldots,\boldsymbol{w}_{iN})^{\mathsf{H}}\in\mathbb{C}^{N\times M}\) is a demixing matrix. The negative log-likelihood of observed signals (divided by \(J\)) is computed as follows:

\[\begin{split}\mathcal{L} &= -\frac{1}{J}\log p(\mathcal{X}) \\ &= -\frac{1}{J}\left(\log p(\mathcal{Y}) \ + \sum_{i}\log|\det\boldsymbol{W}_{i}|^{2J} \right) \\ &= -\frac{1}{J}\sum_{j,n}\log p(\vec{\boldsymbol{y}}_{jn}) - 2\sum_{i}\log|\det\boldsymbol{W}_{i}| \\ &= \frac{1}{J}\sum_{j,n}G(\vec{\boldsymbol{y}}_{jn}) - 2\sum_{i}\log|\det\boldsymbol{W}_{i}|, \\ G(\vec{\boldsymbol{y}}_{jn}) &= -\log p(\vec{\boldsymbol{y}}_{jn}),\end{split}\]

where \(G(\vec{\boldsymbol{y}}_{jn})\) is a contrast function. The derivative of \(G(\vec{\boldsymbol{y}}_{jn})\) with respect to \(y_{ijn}^{*}\) is called a score function:

\[\phi_{i}(\vec{\boldsymbol{y}}_{jn}) = \frac{\partial G(\vec{\boldsymbol{y}}_{jn})}{\partial y_{ijn}^{*}}.\]
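As a rough illustration of the notation above, the following NumPy sketch builds a toy determined mixture, demixes it with \(\boldsymbol{W}_{i}=\boldsymbol{A}_{i}^{-1}\), and evaluates \(\mathcal{L}\). The contrast function \(G(\vec{\boldsymbol{y}}_{jn})=2\|\vec{\boldsymbol{y}}_{jn}\|_{2}\) is an assumed choice (the same one used in the examples on this page), and the helper name negative_log_likelihood is hypothetical, not part of ssspy.

```python
import numpy as np

rng = np.random.default_rng(0)
n_sources, n_bins, n_frames = 2, 4, 16  # N (= M), I, J

# Hypothetical toy data: random sources s_ij and random mixing matrices A_i.
S = rng.standard_normal((n_sources, n_bins, n_frames)) \
    + 1j * rng.standard_normal((n_sources, n_bins, n_frames))
A = rng.standard_normal((n_bins, n_sources, n_sources)) \
    + 1j * rng.standard_normal((n_bins, n_sources, n_sources))

# x_ij = A_i s_ij for every frequency bin i and time frame j.
X = np.einsum("imn,nij->mij", A, S)

# With W_i = A_i^{-1}, y_ij = W_i x_ij recovers the sources.
W = np.linalg.inv(A)
Y = np.einsum("inm,mij->nij", W, X)


def contrast_fn(y):
    # G(y_jn) = 2 ||y_jn||_2: (n_sources, n_bins, n_frames) -> (n_sources, n_frames)
    return 2 * np.linalg.norm(y, axis=1)


def score_fn(y):
    # phi_i(y_jn) = dG / d(y_ijn^*) = y_ijn / ||y_jn||_2 for the contrast above.
    norm = np.linalg.norm(y, axis=1, keepdims=True)
    return y / np.maximum(norm, 1e-10)


def negative_log_likelihood(Y, W):
    # (1/J) sum_{j,n} G(y_jn) - 2 sum_i log|det W_i|
    _, logabsdet = np.linalg.slogdet(W)
    return contrast_fn(Y).mean(axis=-1).sum() - 2 * logabsdet.sum()


loss = negative_log_likelihood(Y, W)
```

The einsum subscripts follow the index names of this page: i for frequency bins, j for time frames, m for microphones, and n for sources.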

Algorithms#

class ssspy.bss.iva.IVAbase(flooring_fn=functools.partial(<function max_flooring>, eps=1e-10), callbacks=None, scale_restoration=True, record_loss=True, reference_id=0)#

Base class of independent vector analysis (IVA).

Parameters:
  • flooring_fn (callable, optional) – A flooring function for numerical stability. This function is expected to return the same shape tensor as the input. If you explicitly set flooring_fn=None, the identity function (lambda x: x) is used. Default: functools.partial(max_flooring, eps=1e-10).

  • callbacks (callable or list[callable], optional) – Callback functions. Each function is called before separation and at each iteration. Default: None.

  • scale_restoration (bool or str) – Technique to restore scale ambiguity. If scale_restoration=True, the projection back technique is applied to estimated spectrograms. You can also specify projection_back explicitly. Default: True.

  • record_loss (bool) – Record the loss at each iteration of the update algorithm if record_loss=True. Default: True.

  • reference_id (int) – Reference channel for projection back. Default: 0.
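The flooring_fn contract (same shape in and out, None meaning the identity) can be sketched as follows. Here max_flooring_sketch is a hypothetical stand-in written for this illustration; the body of the real max_flooring is not shown on this page, so clamping with np.maximum is an assumption.

```python
import functools

import numpy as np


def max_flooring_sketch(x, eps=1e-10):
    # Assumed behavior: clamp values from below so that later divisions are safe.
    return np.maximum(x, eps)


# A custom flooring function with a larger threshold than the documented default.
flooring_fn = functools.partial(max_flooring_sketch, eps=1e-8)


def identity_fn(x):
    # What flooring_fn=None is documented to fall back to.
    return x


norm = np.array([0.0, 1e-12, 0.5])
floored = flooring_fn(norm)
```

Any callable with this shape-preserving behavior could presumably be passed as flooring_fn.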

__call__(input, n_iter=100, **kwargs)#

Separate a frequency-domain multichannel signal.

Parameters:
  • input (numpy.ndarray) – Mixture signal in frequency-domain. The shape is (n_channels, n_bins, n_frames).

  • n_iter (int) – Number of iterations of demixing filter updates. Default: 100.

Return type:

ndarray

Returns:

numpy.ndarray of the separated signal in frequency-domain. The shape is (n_channels, n_bins, n_frames).

apply_projection_back()#

Apply projection back technique to estimated spectrograms.

Return type:

None

compute_logdet(demix_filter)#

Compute log-determinant of demixing filter.

Parameters:

demix_filter (numpy.ndarray) – Demixing filters with shape of (n_bins, n_sources, n_channels).

Return type:

ndarray

Returns:

numpy.ndarray of computed log-determinant values.
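A minimal sketch of a per-bin log-determinant, assuming that the quantity of interest is \(\log|\det\boldsymbol{W}_{i}|\) as it appears in \(\mathcal{L}\). The function name compute_logdet_sketch is hypothetical, and the library's own implementation may differ.

```python
import numpy as np


def compute_logdet_sketch(demix_filter):
    # demix_filter: (n_bins, n_sources, n_channels) with n_sources == n_channels.
    # slogdet avoids overflow/underflow compared with log(abs(det(W))).
    _, logabsdet = np.linalg.slogdet(demix_filter)
    return logabsdet  # (n_bins,)


W = np.stack([np.eye(2, dtype=complex), 2j * np.eye(2)])
logdet = compute_logdet_sketch(W)
```

For the second bin, \(\det(2\mathrm{j}\boldsymbol{I})=-4\), so the sketch yields \(\log 4\) there and \(0\) for the identity.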

compute_loss()#

Compute loss \(\mathcal{L}\).

\(\mathcal{L}\) is given as follows:

\[\begin{split}\mathcal{L} \ &= \frac{1}{J}\sum_{j,n}G(\vec{\boldsymbol{y}}_{jn}) \ - 2\sum_{i}\log|\det\boldsymbol{W}_{i}|, \\ G(\vec{\boldsymbol{y}}_{jn}) \ &= - \log p(\vec{\boldsymbol{y}}_{jn})\end{split}\]
Return type:

float

Returns:

Computed loss.

restore_scale()#

Restore scale ambiguity.

If self.scale_restoration=projection_back, the projection back technique is used.

Return type:

None
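Projection back is not spelled out on this page. As a minimal sketch, assuming a determined system with invertible \(\boldsymbol{W}_{i}\), each separated source can be rescaled by the entry of \(\boldsymbol{W}_{i}^{-1}\) that maps it to the reference channel. The name projection_back_sketch and the toy data are hypothetical; ssspy's own implementation may be formulated differently.

```python
import numpy as np


def projection_back_sketch(Y, W, reference_id=0):
    # Y: (n_sources, n_bins, n_frames), W: (n_bins, n_sources, n_channels)
    A_est = np.linalg.inv(W)               # estimated mixing matrices
    scale = A_est[:, reference_id, :]      # (n_bins, n_sources)
    return scale.T[:, :, np.newaxis] * Y   # source images at the reference channel


rng = np.random.default_rng(0)
n_sources, n_bins, n_frames = 2, 3, 8
S = rng.standard_normal((n_sources, n_bins, n_frames)) \
    + 1j * rng.standard_normal((n_sources, n_bins, n_frames))
A = rng.standard_normal((n_bins, n_sources, n_sources)) \
    + 1j * rng.standard_normal((n_bins, n_sources, n_sources))
X = (A @ S.transpose(1, 0, 2)).transpose(1, 0, 2)

# Demixing matrices that separate the sources but leave an arbitrary scale D.
D = rng.uniform(0.5, 2.0, size=(n_bins, n_sources))
W = D[:, :, np.newaxis] * np.linalg.inv(A)
Y = (W @ X.transpose(1, 0, 2)).transpose(1, 0, 2)

Y_scaled = projection_back_sketch(Y, W, reference_id=0)
```

After rescaling, the separated signals sum to the observation at the reference channel, which removes the arbitrary scale D.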

separate(input, demix_filter)#

Separate input using demix_filter.

\[\boldsymbol{y}_{ij} = \boldsymbol{W}_{i}\boldsymbol{x}_{ij}\]
Parameters:
  • input (numpy.ndarray) – The mixture signal in frequency-domain. The shape is (n_channels, n_bins, n_frames).

  • demix_filter (numpy.ndarray) – The demixing filters to separate input. The shape is (n_bins, n_sources, n_channels).

Return type:

ndarray

Returns:

numpy.ndarray of the separated signal in frequency-domain. The shape is (n_sources, n_bins, n_frames).
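The shape convention of separate can be sketched with a batched matrix product. This is an illustrative re-implementation under the documented shapes, with the hypothetical name separate_sketch, not the library's code.

```python
import numpy as np


def separate_sketch(input, demix_filter):
    # input: (n_channels, n_bins, n_frames)
    # demix_filter: (n_bins, n_sources, n_channels)
    X = input.transpose(1, 0, 2)   # (n_bins, n_channels, n_frames)
    Y = demix_filter @ X           # y_ij = W_i x_ij for all i, j
    return Y.transpose(1, 0, 2)    # (n_sources, n_bins, n_frames)


rng = np.random.default_rng(0)
n_channels, n_bins, n_frames = 2, 5, 7
spectrogram_mix = rng.standard_normal((n_channels, n_bins, n_frames)) \
    + 1j * rng.standard_normal((n_channels, n_bins, n_frames))
demix_filter = np.tile(np.eye(n_channels, dtype=complex), (n_bins, 1, 1))

spectrogram_est = separate_sketch(spectrogram_mix, demix_filter)
```

With identity demixing filters, the output equals the input, which is a convenient sanity check for the axis order.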

update_once()#

Update demixing filters once.

Return type:

None

class ssspy.bss.iva.GradIVAbase(step_size=0.1, contrast_fn=None, score_fn=None, flooring_fn=functools.partial(<function max_flooring>, eps=1e-10), callbacks=None, is_holonomic=False, scale_restoration=True, record_loss=True, reference_id=0)#

Base class of independent vector analysis (IVA) using gradient descent.

Parameters:
  • step_size (float) – A step size of the gradient descent. Default: 1e-1.

  • contrast_fn (callable) – A contrast function which corresponds to \(-\log p(\vec{\boldsymbol{y}}_{jn})\). This function is expected to receive (n_channels, n_bins, n_frames) and return (n_channels, n_frames).

  • score_fn (callable) – A score function which corresponds to the partial derivative of the contrast function. This function is expected to receive (n_channels, n_bins, n_frames) and return (n_channels, n_bins, n_frames).

  • flooring_fn (callable, optional) – A flooring function for numerical stability. This function is expected to return the same shape tensor as the input. If you explicitly set flooring_fn=None, the identity function (lambda x: x) is used. Default: functools.partial(max_flooring, eps=1e-10).

  • callbacks (callable or list[callable], optional) – Callback functions. Each function is called before separation and at each iteration. Default: None.

  • is_holonomic (bool) – If is_holonomic=True, Holonomic-type update is used. Otherwise, Nonholonomic-type update is used. Default: False.

  • scale_restoration (bool or str) – Technique to restore scale ambiguity. If scale_restoration=True, the projection back technique is applied to estimated spectrograms. You can also specify projection_back explicitly. Default: True.

  • record_loss (bool) – Record the loss at each iteration of the update algorithm if record_loss=True. Default: True.

  • reference_id (int) – Reference channel for projection back. Default: 0.

class ssspy.bss.iva.FastIVAbase(flooring_fn=functools.partial(<function max_flooring>, eps=1e-10), callbacks=None, scale_restoration=True, record_loss=True, reference_id=0)#

Base class of fast independent vector analysis (FastIVA).

Parameters:
  • flooring_fn (callable, optional) – A flooring function for numerical stability. This function is expected to return the same shape tensor as the input. If you explicitly set flooring_fn=None, the identity function (lambda x: x) is used. Default: functools.partial(max_flooring, eps=1e-10).

  • callbacks (callable or list[callable], optional) – Callback functions. Each function is called before separation and at each iteration. Default: None.

  • scale_restoration (bool or str) – Technique to restore scale ambiguity. If scale_restoration=True, the projection back technique is applied to estimated spectrograms. You can also specify projection_back explicitly. Default: True.

  • record_loss (bool) – Record the loss at each iteration of the update algorithm if record_loss=True. Default: True.

  • reference_id (int) – Reference channel for projection back. Default: 0.

apply_projection_back()#

Apply projection back technique to estimated spectrograms.

Return type:

None

compute_loss()#

Compute loss \(\mathcal{L}\).

\(\mathcal{L}\) is given as follows:

\[\begin{split}\mathcal{L} \ &= \frac{1}{J}\sum_{j,n}G(\vec{\boldsymbol{y}}_{jn}), \\ G(\vec{\boldsymbol{y}}_{jn}) \ &= - \log p(\vec{\boldsymbol{y}}_{jn})\end{split}\]
Return type:

float

Returns:

Computed loss.

separate(input, demix_filter, use_whitening=True)#

Separate input using demix_filter.

\[\boldsymbol{y}_{ij} = \boldsymbol{W}_{i}\boldsymbol{x}_{ij}\]
Parameters:
  • input (numpy.ndarray) – The mixture signal in frequency-domain. The shape is (n_channels, n_bins, n_frames).

  • demix_filter (numpy.ndarray) – The demixing filters to separate input. The shape is (n_bins, n_sources, n_channels).

  • use_whitening (bool) – If use_whitening=True, whitening (sphering) is applied to input. Default: True.

Return type:

ndarray

Returns:

numpy.ndarray of the separated signal in frequency-domain. The shape is (n_sources, n_bins, n_frames).

class ssspy.bss.iva.AuxIVAbase(contrast_fn=None, d_contrast_fn=None, flooring_fn=functools.partial(<function max_flooring>, eps=1e-10), callbacks=None, scale_restoration=True, record_loss=True, reference_id=0)#

Base class of auxiliary-function-based independent vector analysis (IVA).

Parameters:
  • contrast_fn (callable) – A contrast function which corresponds to \(-\log p(\vec{\boldsymbol{y}}_{jn})\). This function is expected to receive (n_channels, n_bins, n_frames) and return (n_channels, n_frames).

  • d_contrast_fn (callable) – A derivative of the contrast function. This function is expected to receive (n_channels, n_frames) and return (n_channels, n_frames).

  • flooring_fn (callable, optional) – A flooring function for numerical stability. This function is expected to return the same shape tensor as the input. If you explicitly set flooring_fn=None, the identity function (lambda x: x) is used. Default: functools.partial(max_flooring, eps=1e-10).

  • callbacks (callable or list[callable], optional) – Callback functions. Each function is called before separation and at each iteration. Default: None.

  • scale_restoration (bool or str) – Technique to restore scale ambiguity. If scale_restoration=True, the projection back technique is applied to estimated spectrograms. You can also specify projection_back explicitly. Default: True.

  • record_loss (bool) – Record the loss at each iteration of the update algorithm if record_loss=True. Default: True.

  • reference_id (int) – Reference channel for projection back. Default: 0.

__call__(input, n_iter=100, **kwargs)#

Separate a frequency-domain multichannel signal.

Parameters:
  • input (numpy.ndarray) – The mixture signal in frequency-domain. The shape is (n_channels, n_bins, n_frames).

  • n_iter (int) – The number of iterations of demixing filter updates. Default: 100.

Return type:

ndarray

Returns:

numpy.ndarray of the separated signal in frequency-domain. The shape is (n_channels, n_bins, n_frames).

apply_projection_back()#

Apply projection back technique to estimated spectrograms.

Return type:

None

compute_loss()#

Compute loss \(\mathcal{L}\).

\(\mathcal{L}\) is given as follows:

\[\begin{split}\mathcal{L} \ &= \frac{1}{J}\sum_{j,n}G(\vec{\boldsymbol{y}}_{jn}) \ - 2\sum_{i}\log|\det\boldsymbol{W}_{i}|, \\ G(\vec{\boldsymbol{y}}_{jn}) \ &= - \log p(\vec{\boldsymbol{y}}_{jn})\end{split}\]
Return type:

float

Returns:

Computed loss.

separate(input, demix_filter)#

Separate input using demix_filter.

\[\boldsymbol{y}_{ij} = \boldsymbol{W}_{i}\boldsymbol{x}_{ij}\]
Parameters:
  • input (numpy.ndarray) – The mixture signal in frequency-domain. The shape is (n_channels, n_bins, n_frames).

  • demix_filter (numpy.ndarray) – The demixing filters to separate input. The shape is (n_bins, n_sources, n_channels).

Return type:

ndarray

Returns:

numpy.ndarray of the separated signal in frequency-domain. The shape is (n_sources, n_bins, n_frames).

class ssspy.bss.iva.GradIVA(step_size=0.1, contrast_fn=None, score_fn=None, flooring_fn=functools.partial(<function max_flooring>, eps=1e-10), callbacks=None, is_holonomic=True, scale_restoration=True, record_loss=True, reference_id=0)#

Independent vector analysis (IVA) [1] using gradient descent.

Parameters:
  • step_size (float) – A step size of the gradient descent. Default: 1e-1.

  • contrast_fn (callable) – A contrast function which corresponds to \(-\log p(\vec{\boldsymbol{y}}_{jn})\). This function is expected to receive (n_channels, n_bins, n_frames) and return (n_channels, n_frames).

  • score_fn (callable) – A score function which corresponds to the partial derivative of the contrast function. This function is expected to receive (n_channels, n_bins, n_frames) and return (n_channels, n_bins, n_frames).

  • flooring_fn (callable, optional) – A flooring function for numerical stability. This function is expected to return the same shape tensor as the input. If you explicitly set flooring_fn=None, the identity function (lambda x: x) is used. Default: functools.partial(max_flooring, eps=1e-10).

  • callbacks (callable or list[callable], optional) – Callback functions. Each function is called before separation and at each iteration. Default: None.

  • is_holonomic (bool) – If is_holonomic=True, Holonomic-type update is used. Otherwise, Nonholonomic-type update is used. Default: True.

  • scale_restoration (bool or str) – Technique to restore scale ambiguity. If scale_restoration=True, the projection back technique is applied to estimated spectrograms. You can also specify projection_back explicitly. Default: True.

  • record_loss (bool) – Record the loss at each iteration of the update algorithm if record_loss=True. Default: True.

  • reference_id (int) – Reference channel for projection back. Default: 0.

Examples

Update demixing filters using Holonomic-type update:

>>> import numpy as np
>>> from ssspy.bss.iva import GradIVA

>>> def contrast_fn(y):
...     return 2 * np.linalg.norm(y, axis=1)

>>> def score_fn(y):
...     norm = np.linalg.norm(y, axis=1, keepdims=True)
...     return y / np.maximum(norm, 1e-10)

>>> n_channels, n_bins, n_frames = 2, 2049, 128
>>> spectrogram_mix = np.random.randn(n_channels, n_bins, n_frames) \
...     + 1j * np.random.randn(n_channels, n_bins, n_frames)

>>> iva = GradIVA(
...     contrast_fn=contrast_fn,
...     score_fn=score_fn,
...     is_holonomic=True,
... )
>>> spectrogram_est = iva(spectrogram_mix, n_iter=5000)
>>> print(spectrogram_mix.shape, spectrogram_est.shape)
(2, 2049, 128) (2, 2049, 128)

Update demixing filters using Nonholonomic-type update:

>>> def contrast_fn(y):
...     return 2 * np.linalg.norm(y, axis=1)

>>> def score_fn(y):
...     norm = np.linalg.norm(y, axis=1, keepdims=True)
...     return y / np.maximum(norm, 1e-10)

>>> n_channels, n_bins, n_frames = 2, 2049, 128
>>> spectrogram_mix = np.random.randn(n_channels, n_bins, n_frames) \
...     + 1j * np.random.randn(n_channels, n_bins, n_frames)

>>> iva = GradIVA(
...     contrast_fn=contrast_fn,
...     score_fn=score_fn,
...     is_holonomic=False,
... )
>>> spectrogram_est = iva(spectrogram_mix, n_iter=5000)
>>> print(spectrogram_mix.shape, spectrogram_est.shape)
(2, 2049, 128) (2, 2049, 128)

update_once()#

Update demixing filters once using the gradient descent.

If is_holonomic=True, demixing filters are updated as follows:

\[\boldsymbol{W}_{i} \leftarrow\boldsymbol{W}_{i} - \eta\left(\frac{1}{J}\sum_{j} \ \boldsymbol{\phi}_{i}(\vec{\boldsymbol{Y}}_{j})\boldsymbol{y}_{ij}^{\mathsf{H}} \ -\boldsymbol{I}\right)\boldsymbol{W}_{i}^{-\mathsf{H}},\]

where

\[\begin{split}\boldsymbol{\phi}_{i}(\vec{\boldsymbol{Y}}_{j}) &= \left(\phi_{i}(\vec{\boldsymbol{y}}_{j1}),\ldots,\ \phi_{i}(\vec{\boldsymbol{y}}_{jn}),\ldots,\ \phi_{i}(\vec{\boldsymbol{y}}_{jN})\right)^{\mathsf{T}}\in\mathbb{C}^{N}, \\ \phi_{i}(\vec{\boldsymbol{y}}_{jn}) &= \frac{\partial G(\vec{\boldsymbol{y}}_{jn})}{\partial y_{ijn}^{*}}, \\ G(\vec{\boldsymbol{y}}_{jn}) &= -\log p(\vec{\boldsymbol{y}}_{jn}).\end{split}\]

Otherwise (is_holonomic=False),

\[\boldsymbol{W}_{i} \leftarrow\boldsymbol{W}_{i} - \eta\cdot\mathrm{offdiag}\left(\frac{1}{J}\sum_{j} \boldsymbol{\phi}_{i}(\vec{\boldsymbol{Y}}_{j})\boldsymbol{y}_{ij}^{\mathsf{H}}\right) \boldsymbol{W}_{i}^{-\mathsf{H}}.\]
Return type:

None
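The two update rules above can be sketched in NumPy as follows. This is an illustrative re-implementation, not the library's code: the name grad_update_sketch and the toy data are hypothetical, and the score function assumes the contrast \(G(\vec{\boldsymbol{y}}_{jn})=2\|\vec{\boldsymbol{y}}_{jn}\|_{2}\) used in the examples.

```python
import numpy as np


def score_fn(y):
    # phi_i(y_jn) = y_ijn / ||y_jn||_2 for the assumed contrast G = 2 ||y_jn||_2.
    norm = np.linalg.norm(y, axis=1, keepdims=True)
    return y / np.maximum(norm, 1e-10)


def grad_update_sketch(W, X, step_size=0.1, is_holonomic=True):
    # W: (n_bins, n_sources, n_channels), X: (n_channels, n_bins, n_frames)
    n_sources, n_frames = W.shape[1], X.shape[-1]
    Y = np.einsum("inm,mij->nij", W, X)
    Phi = score_fn(Y)

    # E_i = (1/J) sum_j phi_i(Y_j) y_ij^H, one (N, N) matrix per bin.
    E = np.einsum("nij,mij->inm", Phi, Y.conj()) / n_frames

    eye = np.eye(n_sources)
    if is_holonomic:
        delta = E - eye
    else:
        delta = E * (1 - eye)  # offdiag(.): zero out the diagonal

    W_inv_H = np.linalg.inv(W).conj().swapaxes(-2, -1)  # W_i^{-H}
    return W - step_size * delta @ W_inv_H


rng = np.random.default_rng(0)
n_channels, n_bins, n_frames = 2, 4, 64
X = rng.standard_normal((n_channels, n_bins, n_frames)) \
    + 1j * rng.standard_normal((n_channels, n_bins, n_frames))
W = np.tile(np.eye(n_channels, dtype=complex), (n_bins, 1, 1))

W_new = grad_update_sketch(W, X, step_size=0.01, is_holonomic=True)
```

The only difference between the two variants is the matrix in parentheses: the nonholonomic update drops the diagonal of the score correlation instead of subtracting \(\boldsymbol{I}\).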

class ssspy.bss.iva.NaturalGradIVA(step_size=0.1, contrast_fn=None, score_fn=None, flooring_fn=functools.partial(<function max_flooring>, eps=1e-10), callbacks=None, is_holonomic=True, scale_restoration=True, record_loss=True, reference_id=0)#

Independent vector analysis (IVA) using natural gradient descent.

Parameters:
  • step_size (float) – A step size of the gradient descent. Default: 1e-1.

  • contrast_fn (callable) – A contrast function which corresponds to \(-\log p(\vec{\boldsymbol{y}}_{jn})\). This function is expected to receive (n_channels, n_bins, n_frames) and return (n_channels, n_frames).

  • score_fn (callable) – A score function which corresponds to the partial derivative of the contrast function. This function is expected to receive (n_channels, n_bins, n_frames) and return (n_channels, n_bins, n_frames).

  • flooring_fn (callable, optional) – A flooring function for numerical stability. This function is expected to return the same shape tensor as the input. If you explicitly set flooring_fn=None, the identity function (lambda x: x) is used. Default: functools.partial(max_flooring, eps=1e-10).

  • callbacks (callable or list[callable], optional) – Callback functions. Each function is called before separation and at each iteration. Default: None.

  • is_holonomic (bool) – If is_holonomic=True, Holonomic-type update is used. Otherwise, Nonholonomic-type update is used. Default: True.

  • scale_restoration (bool or str) – Technique to restore scale ambiguity. If scale_restoration=True, the projection back technique is applied to estimated spectrograms. You can also specify projection_back explicitly. Default: True.

  • record_loss (bool) – Record the loss at each iteration of the update algorithm if record_loss=True. Default: True.

  • reference_id (int) – Reference channel for projection back. Default: 0.

Examples

Update demixing filters using Holonomic-type update:

>>> import numpy as np
>>> from ssspy.bss.iva import NaturalGradIVA

>>> def contrast_fn(y):
...     return 2 * np.linalg.norm(y, axis=1)

>>> def score_fn(y):
...     norm = np.linalg.norm(y, axis=1, keepdims=True)
...     return y / np.maximum(norm, 1e-10)

>>> n_channels, n_bins, n_frames = 2, 2049, 128
>>> spectrogram_mix = np.random.randn(n_channels, n_bins, n_frames) \
...     + 1j * np.random.randn(n_channels, n_bins, n_frames)

>>> iva = NaturalGradIVA(
...     contrast_fn=contrast_fn,
...     score_fn=score_fn,
...     is_holonomic=True,
... )
>>> spectrogram_est = iva(spectrogram_mix, n_iter=500)
>>> print(spectrogram_mix.shape, spectrogram_est.shape)
(2, 2049, 128) (2, 2049, 128)

Update demixing filters using Nonholonomic-type update:

>>> def contrast_fn(y):
...     return 2 * np.linalg.norm(y, axis=1)

>>> def score_fn(y):
...     norm = np.linalg.norm(y, axis=1, keepdims=True)
...     return y / np.maximum(norm, 1e-10)

>>> n_channels, n_bins, n_frames = 2, 2049, 128
>>> spectrogram_mix = np.random.randn(n_channels, n_bins, n_frames) \
...     + 1j * np.random.randn(n_channels, n_bins, n_frames)

>>> iva = NaturalGradIVA(
...     contrast_fn=contrast_fn,
...     score_fn=score_fn,
...     is_holonomic=False,
... )
>>> spectrogram_est = iva(spectrogram_mix, n_iter=500)
>>> print(spectrogram_mix.shape, spectrogram_est.shape)
(2, 2049, 128) (2, 2049, 128)

update_once()#

Update demixing filters once using the natural gradient descent.

If is_holonomic=True, demixing filters are updated as follows:

\[\boldsymbol{W}_{i} \leftarrow\boldsymbol{W}_{i} - \eta\left(\frac{1}{J}\sum_{j} \ \boldsymbol{\phi}_{i}(\vec{\boldsymbol{Y}}_{j})\boldsymbol{y}_{ij}^{\mathsf{H}} \ -\boldsymbol{I}\right)\boldsymbol{W}_{i},\]

where

\[\begin{split}\boldsymbol{\phi}_{i}(\vec{\boldsymbol{Y}}_{j}) &= \left(\phi_{i}(\vec{\boldsymbol{y}}_{j1}),\ldots,\ \phi_{i}(\vec{\boldsymbol{y}}_{jn}),\ldots,\ \phi_{i}(\vec{\boldsymbol{y}}_{jN})\right)^{\mathsf{T}}\in\mathbb{C}^{N}, \\ \phi_{i}(\vec{\boldsymbol{y}}_{jn}) &= \frac{\partial G(\vec{\boldsymbol{y}}_{jn})}{\partial y_{ijn}^{*}}, \\ G(\vec{\boldsymbol{y}}_{jn}) &= -\log p(\vec{\boldsymbol{y}}_{jn}).\end{split}\]

Otherwise (is_holonomic=False),

\[\boldsymbol{W}_{i} \leftarrow\boldsymbol{W}_{i} - \eta\cdot\mathrm{offdiag}\left(\frac{1}{J}\sum_{j} \boldsymbol{\phi}_{i}(\vec{\boldsymbol{Y}}_{j})\boldsymbol{y}_{ij}^{\mathsf{H}}\right) \boldsymbol{W}_{i}.\]
Return type:

None
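In the natural gradient updates, the trailing factor is \(\boldsymbol{W}_{i}\) rather than \(\boldsymbol{W}_{i}^{-\mathsf{H}}\), so no matrix inversion is needed. The sketch below checks numerically that the natural gradient direction equals the plain gradient direction right-multiplied by \(\boldsymbol{W}_{i}^{\mathsf{H}}\boldsymbol{W}_{i}\). The function update_direction and the random matrices standing in for \(\frac{1}{J}\sum_{j}\boldsymbol{\phi}_{i}(\vec{\boldsymbol{Y}}_{j})\boldsymbol{y}_{ij}^{\mathsf{H}}\) are hypothetical.

```python
import numpy as np


def update_direction(E, W, is_holonomic=True, natural=True):
    # E: stand-in for (1/J) sum_j phi_i(Y_j) y_ij^H, shape (n_bins, N, N)
    eye = np.eye(W.shape[-1])
    delta = E - eye if is_holonomic else E * (1 - eye)
    if natural:
        return delta @ W  # natural gradient: no inverse required
    W_inv_H = np.linalg.inv(W).conj().swapaxes(-2, -1)
    return delta @ W_inv_H  # plain gradient


rng = np.random.default_rng(0)
n_bins, n_sources = 3, 2
shape = (n_bins, n_sources, n_sources)
E = rng.standard_normal(shape) + 1j * rng.standard_normal(shape)
W = rng.standard_normal(shape) + 1j * rng.standard_normal(shape)

d_nat = update_direction(E, W, is_holonomic=True, natural=True)
d_grad = update_direction(E, W, is_holonomic=True, natural=False)

# (.) W^{-H} W^H W = (.) W, so both directions agree after the rescaling.
W_H = W.conj().swapaxes(-2, -1)
max_diff = np.abs(d_grad @ W_H @ W - d_nat).max()
```

The same identity holds for the nonholonomic variant, since only the matrix in parentheses changes.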

class ssspy.bss.iva.FastIVA(contrast_fn=None, d_contrast_fn=None, dd_contrast_fn=None, flooring_fn=functools.partial(<function max_flooring>, eps=1e-10), callbacks=None, scale_restoration=True, record_loss=True, reference_id=0)#

Fast independent vector analysis (Fast IVA) [2].

Parameters:
  • contrast_fn (callable) – A contrast function which corresponds to \(-\log p(\vec{\boldsymbol{y}}_{jn})\). This function is expected to receive (n_channels, n_bins, n_frames) and return (n_channels, n_frames).

  • d_contrast_fn (callable) – A derivative of the contrast function. This function is expected to receive (n_channels, n_frames) and return (n_channels, n_frames).

  • dd_contrast_fn (callable) – Second order derivative of the contrast function. This function is expected to receive (n_channels, n_frames) and return (n_channels, n_frames).

  • flooring_fn (callable, optional) – A flooring function for numerical stability. This function is expected to return the same shape tensor as the input. If you explicitly set flooring_fn=None, the identity function (lambda x: x) is used. Default: functools.partial(max_flooring, eps=1e-10).

  • callbacks (callable or list[callable], optional) – Callback functions. Each function is called before separation and at each iteration. Default: None.

  • scale_restoration (bool or str) – Technique to restore scale ambiguity. If scale_restoration=True, the projection back technique is applied to estimated spectrograms. You can also specify projection_back explicitly. Default: True.

  • record_loss (bool) – Record the loss at each iteration of the update algorithm if record_loss=True. Default: True.

  • reference_id (int) – Reference channel for projection back. Default: 0.

Examples

>>> import numpy as np
>>> from ssspy.algorithm import projection_back
>>> from ssspy.bss.iva import FastIVA
>>> from ssspy.transform import whiten

>>> def contrast_fn(y):
...     return 2 * np.linalg.norm(y, axis=1)

>>> def d_contrast_fn(y):
...     return 2 * np.ones_like(y)

>>> def dd_contrast_fn(y):
...     return 2 * np.zeros_like(y)

>>> n_channels, n_bins, n_frames = 2, 2049, 128
>>> spectrogram_mix = np.random.randn(n_channels, n_bins, n_frames) \
...     + 1j * np.random.randn(n_channels, n_bins, n_frames)

>>> iva = FastIVA(
...     contrast_fn=contrast_fn,
...     d_contrast_fn=d_contrast_fn,
...     dd_contrast_fn=dd_contrast_fn,
...     scale_restoration=False,
... )
>>> spectrogram_mix_whitened = whiten(spectrogram_mix)
>>> spectrogram_est = iva(spectrogram_mix_whitened, n_iter=100)
>>> spectrogram_est = projection_back(spectrogram_est, reference=spectrogram_mix)
>>> print(spectrogram_mix.shape, spectrogram_est.shape)
(2, 2049, 128) (2, 2049, 128)

__call__(input, n_iter=100, **kwargs)#

Separate a frequency-domain multichannel signal.

Parameters:
  • input (numpy.ndarray) – The mixture signal in frequency-domain. The shape is (n_channels, n_bins, n_frames).

  • n_iter (int) – The number of iterations of demixing filter updates. Default: 100.

Return type:

ndarray

Returns:

numpy.ndarray of the separated signal in frequency-domain. The shape is (n_channels, n_bins, n_frames).

update_once()#

Update demixing filters once.

Demixing filters are updated as follows:

\[\begin{split}\boldsymbol{w}_{in} \leftarrow&\frac{1}{J}\sum_{j} \frac{G'_{\mathbb{R}}(\|\vec{\boldsymbol{y}}_{jn}\|_{2})} {2\|\vec{\boldsymbol{y}}_{jn}\|_{2}} \left(\boldsymbol{w}_{in}-y_{ijn}^{*}\boldsymbol{x}_{ij}\right) \notag \\ &-\frac{1}{J}\sum_{j}\frac{|y_{ijn}|^{2}}{2\|\vec{\boldsymbol{y}}_{jn}\|_{2}}\left( \frac{G'_{\mathbb{R}}(\|\vec{\boldsymbol{y}}_{jn}\|_{2})} {\|\vec{\boldsymbol{y}}_{jn}\|_{2}} - G''_{\mathbb{R}}(\|\vec{\boldsymbol{y}}_{jn}\|_{2}) \right)\boldsymbol{w}_{in} \\ \boldsymbol{W}_{i} \leftarrow&\left(\boldsymbol{W}_{i}\boldsymbol{W}_{i}^{\mathsf{H}}\right)^{-\frac{1}{2}} \boldsymbol{W}_{i}.\end{split}\]
Return type:

None
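The last step of the update, \(\boldsymbol{W}_{i}\leftarrow(\boldsymbol{W}_{i}\boldsymbol{W}_{i}^{\mathsf{H}})^{-\frac{1}{2}}\boldsymbol{W}_{i}\), is a symmetric decorrelation. A minimal sketch, assuming \(\boldsymbol{W}_{i}\) is non-singular and computing the inverse square root through an eigendecomposition (the function name is hypothetical, and the library may compute it differently):

```python
import numpy as np


def symmetric_decorrelation(W):
    # W: (n_bins, n_sources, n_channels)
    G = W @ W.conj().swapaxes(-2, -1)        # Hermitian, positive definite
    eigvals, eigvecs = np.linalg.eigh(G)
    # G^{-1/2} = V diag(lambda^{-1/2}) V^H
    scaled = eigvecs / np.sqrt(eigvals)[..., np.newaxis, :]
    inv_sqrt = scaled @ eigvecs.conj().swapaxes(-2, -1)
    return inv_sqrt @ W


rng = np.random.default_rng(0)
n_bins, n_sources = 4, 3
shape = (n_bins, n_sources, n_sources)
W = rng.standard_normal(shape) + 1j * rng.standard_normal(shape)

W_orth = symmetric_decorrelation(W)
gram = W_orth @ W_orth.conj().swapaxes(-2, -1)  # identity for every bin
```

After this step each \(\boldsymbol{W}_{i}\) is unitary, which keeps the demixing filters mutually orthogonal for whitened input.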

class ssspy.bss.iva.FasterIVA(contrast_fn=None, d_contrast_fn=None, flooring_fn=functools.partial(<function max_flooring>, eps=1e-10), callbacks=None, scale_restoration=True, record_loss=True, reference_id=0)#

Faster independent vector analysis (Faster IVA) [3].

Parameters:
  • contrast_fn (callable) – A contrast function which corresponds to \(-\log p(\vec{\boldsymbol{y}}_{jn})\). This function is expected to receive (n_channels, n_bins, n_frames) and return (n_channels, n_frames).

  • d_contrast_fn (callable) – A derivative of the contrast function. This function is expected to receive (n_channels, n_frames) and return (n_channels, n_frames).

  • flooring_fn (callable, optional) – A flooring function for numerical stability. This function is expected to return the same shape tensor as the input. If you explicitly set flooring_fn=None, the identity function (lambda x: x) is used. Default: functools.partial(max_flooring, eps=1e-10).

  • callbacks (callable or list[callable], optional) – Callback functions. Each function is called before separation and at each iteration. Default: None.

  • scale_restoration (bool or str) – Technique to restore scale ambiguity. If scale_restoration=True, the projection back technique is applied to estimated spectrograms. You can also specify projection_back explicitly. Default: True.

  • record_loss (bool) – Record the loss at each iteration of the update algorithm if record_loss=True. Default: True.

  • reference_id (int) – Reference channel for projection back. Default: 0.

Examples

>>> import numpy as np
>>> from ssspy.algorithm import projection_back
>>> from ssspy.bss.iva import FasterIVA
>>> from ssspy.transform import whiten

>>> def contrast_fn(y):
...     return 2 * np.linalg.norm(y, axis=1)

>>> def d_contrast_fn(y):
...     return 2 * np.ones_like(y)

>>> n_channels, n_bins, n_frames = 2, 2049, 128
>>> spectrogram_mix = np.random.randn(n_channels, n_bins, n_frames) \
...     + 1j * np.random.randn(n_channels, n_bins, n_frames)

>>> iva = FasterIVA(
...     contrast_fn=contrast_fn,
...     d_contrast_fn=d_contrast_fn,
...     scale_restoration=False,
... )
>>> spectrogram_mix_whitened = whiten(spectrogram_mix)
>>> spectrogram_est = iva(spectrogram_mix_whitened, n_iter=100)
>>> spectrogram_est = projection_back(spectrogram_est, reference=spectrogram_mix)
>>> print(spectrogram_mix.shape, spectrogram_est.shape)
(2, 2049, 128) (2, 2049, 128)

__call__(input, n_iter=100, **kwargs)#

Separate a frequency-domain multichannel signal.

Parameters:
  • input (numpy.ndarray) – The mixture signal in frequency-domain. The shape is (n_channels, n_bins, n_frames).

  • n_iter (int) – The number of iterations of demixing filter updates. Default: 100.

Return type:

ndarray

Returns:

numpy.ndarray of the separated signal in frequency-domain. The shape is (n_channels, n_bins, n_frames).

update_once()#

Update demixing filters once.

In FasterIVA, we compute the eigenvector of \(\boldsymbol{U}_{in}\) which corresponds to the largest eigenvalue by solving

\[\boldsymbol{U}_{in}\boldsymbol{w}_{in} = \lambda_{in}\boldsymbol{w}_{in}.\]

Then,

\[\boldsymbol{W}_{i} \leftarrow\left(\boldsymbol{W}_{i}\boldsymbol{W}_{i}^{\mathsf{H}}\right)^{-\frac{1}{2}} \boldsymbol{W}_{i}.\]
Return type:

None
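The eigenvector step can be sketched with numpy.linalg.eigh, which returns eigenvalues in ascending order, so the last column is the eigenvector of the largest eigenvalue. The sketch assumes \(\boldsymbol{U}_{in}\) is Hermitian; the function name and the random test matrices are hypothetical.

```python
import numpy as np


def principal_eigenvector(U):
    # U: (n_bins, n_channels, n_channels), Hermitian for every bin.
    eigvals, eigvecs = np.linalg.eigh(U)         # ascending eigenvalues
    return eigvals[..., -1], eigvecs[..., :, -1]  # largest eigenvalue and its vector


rng = np.random.default_rng(0)
n_bins, n_channels = 4, 3
B = rng.standard_normal((n_bins, n_channels, n_channels)) \
    + 1j * rng.standard_normal((n_bins, n_channels, n_channels))
U = B @ B.conj().swapaxes(-2, -1)  # Hermitian positive semidefinite stand-in

lam, w = principal_eigenvector(U)
residual = np.einsum("imk,ik->im", U, w) - lam[:, np.newaxis] * w
```

The residual of \(\boldsymbol{U}_{in}\boldsymbol{w}_{in}=\lambda_{in}\boldsymbol{w}_{in}\) should vanish up to rounding error.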

class ssspy.bss.iva.AuxIVA(spatial_algorithm='IP', contrast_fn=None, d_contrast_fn=None, flooring_fn=functools.partial(<function max_flooring>, eps=1e-10), pair_selector=None, callbacks=None, scale_restoration=True, record_loss=True, reference_id=0)#

Auxiliary-function-based independent vector analysis (IVA) [4].

Parameters:
  • spatial_algorithm (str) – Algorithm for demixing filter updates. Choose IP, IP1, IP2, ISS, ISS1, or ISS2. Default: IP.

  • contrast_fn (callable) – A contrast function which corresponds to \(-\log p(\vec{\boldsymbol{y}}_{jn})\). This function is expected to receive (n_channels, n_bins, n_frames) and return (n_channels, n_frames).

  • d_contrast_fn (callable) – A derivative of the contrast function. This function is expected to receive (n_channels, n_frames) and return (n_channels, n_frames).

  • flooring_fn (callable, optional) – A flooring function for numerical stability. This function is expected to return the same shape tensor as the input. If you explicitly set flooring_fn=None, the identity function (lambda x: x) is used. Default: functools.partial(max_flooring, eps=1e-10).

  • pair_selector (callable, optional) – Selector to choose the updating pair in IP2 and ISS2. If None is given, sequential_pair_selector is used. Default: None.

  • callbacks (callable or list[callable], optional) – Callback functions. Each function is called before separation and at each iteration. Default: None.

  • scale_restoration (bool or str) – Technique to restore scale ambiguity. If scale_restoration=True, the projection back technique is applied to estimated spectrograms. You can also specify projection_back explicitly. Default: True.

  • record_loss (bool) – Record the loss at each iteration of the update algorithm if record_loss=True. Default: True.

  • reference_id (int) – Reference channel for projection back. Default: 0.

Examples

Update demixing filters by IP:

>>> import numpy as np
>>> from ssspy.bss.iva import AuxIVA

>>> def contrast_fn(y):
...     return 2 * np.linalg.norm(y, axis=1)

>>> def d_contrast_fn(y):
...     return 2 * np.ones_like(y)

>>> n_channels, n_bins, n_frames = 2, 2049, 128
>>> spectrogram_mix = np.random.randn(n_channels, n_bins, n_frames) \
...     + 1j * np.random.randn(n_channels, n_bins, n_frames)

>>> iva = AuxIVA(
...     spatial_algorithm="IP",
...     contrast_fn=contrast_fn,
...     d_contrast_fn=d_contrast_fn,
... )
>>> spectrogram_est = iva(spectrogram_mix, n_iter=100)
>>> print(spectrogram_mix.shape, spectrogram_est.shape)
(2, 2049, 128) (2, 2049, 128)

Update demixing filters by IP2:

>>> from ssspy.bss._select_pair import sequential_pair_selector

>>> def contrast_fn(y):
...     return 2 * np.linalg.norm(y, axis=1)

>>> def d_contrast_fn(y):
...     return 2 * np.ones_like(y)

>>> n_channels, n_bins, n_frames = 2, 2049, 128
>>> spectrogram_mix = np.random.randn(n_channels, n_bins, n_frames) \
...     + 1j * np.random.randn(n_channels, n_bins, n_frames)

>>> iva = AuxIVA(
...     spatial_algorithm="IP2",
...     contrast_fn=contrast_fn,
...     d_contrast_fn=d_contrast_fn,
...     pair_selector=sequential_pair_selector,
... )
>>> spectrogram_est = iva(spectrogram_mix, n_iter=100)
>>> print(spectrogram_mix.shape, spectrogram_est.shape)
(2, 2049, 128) (2, 2049, 128)

Update demixing filters by ISS:

>>> def contrast_fn(y):
...     return 2 * np.linalg.norm(y, axis=1)

>>> def d_contrast_fn(y):
...     return 2 * np.ones_like(y)

>>> n_channels, n_bins, n_frames = 2, 2049, 128
>>> spectrogram_mix = np.random.randn(n_channels, n_bins, n_frames) \
...     + 1j * np.random.randn(n_channels, n_bins, n_frames)

>>> iva = AuxIVA(
...     spatial_algorithm="ISS",
...     contrast_fn=contrast_fn,
...     d_contrast_fn=d_contrast_fn,
... )
>>> spectrogram_est = iva(spectrogram_mix, n_iter=100)
>>> print(spectrogram_mix.shape, spectrogram_est.shape)
(2, 2049, 128) (2, 2049, 128)

Update demixing filters by ISS2:

>>> import functools
>>> from ssspy.bss._select_pair import sequential_pair_selector

>>> def contrast_fn(y):
...     return 2 * np.linalg.norm(y, axis=1)

>>> def d_contrast_fn(y):
...     return 2 * np.ones_like(y)

>>> n_channels, n_bins, n_frames = 2, 2049, 128
>>> spectrogram_mix = np.random.randn(n_channels, n_bins, n_frames) \
...     + 1j * np.random.randn(n_channels, n_bins, n_frames)

>>> iva = AuxIVA(
...     spatial_algorithm="ISS2",
...     contrast_fn=contrast_fn,
...     d_contrast_fn=d_contrast_fn,
...     pair_selector=functools.partial(sequential_pair_selector, step=2),
... )
>>> spectrogram_est = iva(spectrogram_mix, n_iter=100)
>>> print(spectrogram_mix.shape, spectrogram_est.shape)
(2, 2049, 128) (2, 2049, 128)
__call__(input, n_iter=100, **kwargs)#

Separate a frequency-domain multichannel signal.

Parameters:
  • input (numpy.ndarray) – The mixture signal in frequency-domain. The shape is (n_channels, n_bins, n_frames).

  • n_iter (int) – The number of iterations of demixing filter updates. Default: 100.

Return type:

ndarray

Returns:

numpy.ndarray of the separated signal in frequency-domain. The shape is (n_channels, n_bins, n_frames).

update_once()#

Update demixing filters once.

  • If self.spatial_algorithm is IP or IP1, update_once_ip1 is called.

  • If self.spatial_algorithm is IP2, update_once_ip2 is called.

  • If self.spatial_algorithm is ISS or ISS1, update_once_iss1 is called.

  • If self.spatial_algorithm is ISS2, update_once_iss2 is called.

Return type:

None

update_once_ip1()#

Update demixing filters once using iterative projection.

Compute auxiliary variables:

\[\bar{r}_{jn} \leftarrow\|\vec{\boldsymbol{y}}_{jn}\|_{2}\]

Then, demixing filters are updated sequentially for \(n=1,\ldots,N\) as follows:

\[\begin{split}\boldsymbol{w}_{in} &\leftarrow\left(\boldsymbol{W}_{i}\boldsymbol{U}_{in}\right)^{-1} \ \boldsymbol{e}_{n}, \\ \boldsymbol{w}_{in} &\leftarrow\frac{\boldsymbol{w}_{in}} {\sqrt{\boldsymbol{w}_{in}^{\mathsf{H}}\boldsymbol{U}_{in}\boldsymbol{w}_{in}}},\end{split}\]

where

\[\begin{split}\boldsymbol{U}_{in} &= \frac{1}{J}\sum_{j} \varphi(\bar{r}_{jn})\boldsymbol{x}_{ij}\boldsymbol{x}_{ij}^{\mathsf{H}}, \\ \varphi(\bar{r}_{jn}) &= \frac{G'_{\mathbb{R}}(\bar{r}_{jn})}{2\bar{r}_{jn}}, \\ G(\vec{\boldsymbol{y}}_{jn}) &= -\log p(\vec{\boldsymbol{y}}_{jn}), \\ G_{\mathbb{R}}(\|\vec{\boldsymbol{y}}_{jn}\|_{2}) &= G(\vec{\boldsymbol{y}}_{jn}).\end{split}\]
Return type:

None
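
As a rough illustration of the update above, the following NumPy sketch performs one IP step for a single source \(n\) under the Laplace contrast \(G_{\mathbb{R}}(r)=2r\), for which \(\varphi(r)=1/r\). The helper name ip1_update_source, the array layouts, and the flooring constant eps are assumptions made for this example. It is not the library's implementation.

```python
import numpy as np


def ip1_update_source(x, w, n, eps=1e-10):
    """One IP update of source n (hypothetical helper, not part of ssspy).

    x: mixture, shape (n_channels, n_bins, n_frames).
    w: demixing matrices W_i, shape (n_bins, n_sources, n_channels);
       row n of W_i stores w_in^H.
    Returns an updated copy of w and the weighted covariance U_in.
    """
    w = w.copy()
    n_frames = x.shape[-1]
    x_t = x.transpose(1, 0, 2)  # (n_bins, n_channels, n_frames)

    # Auxiliary variable r_jn = ||y_jn||_2 (norm over frequency bins).
    y_n = np.einsum("im,imj->ij", w[:, n, :], x_t)
    r = np.maximum(np.linalg.norm(y_n, axis=0), eps)
    phi = 1.0 / r  # G'_R(r) / (2 r) with G_R(r) = 2 r (assumed contrast)

    # U_in = (1 / J) sum_j phi(r_jn) x_ij x_ij^H
    u = np.einsum("j,imj,ikj->imk", phi, x_t, x_t.conj()) / n_frames

    # w_in <- (W_i U_in)^{-1} e_n, i.e. the n-th column of the inverse.
    w_n = np.linalg.inv(w @ u)[:, :, n]

    # w_in <- w_in / sqrt(w_in^H U_in w_in)
    scale = np.sqrt(np.einsum("im,imk,ik->i", w_n.conj(), u, w_n).real)
    w_n = w_n / scale[:, None]

    w[:, n, :] = w_n.conj()
    return w, u
```

After the step, the updated filter satisfies \(\boldsymbol{w}_{in}^{\mathsf{H}}\boldsymbol{U}_{in}\boldsymbol{w}_{in}=1\) and \(\boldsymbol{w}_{in'}^{\mathsf{H}}\boldsymbol{U}_{in}\boldsymbol{w}_{in}=0\) for \(n'\neq n\), which gives a quick way to check the sketch numerically.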

update_once_ip2()#

Update demixing filters once using pairwise iterative projection.

Warning

The current implementation of IP2 is based on “Auxiliary-function-based independent component analysis for super-Gaussian sources,” but this is not what is actually known as IP2. See https://github.com/tky823/ssspy/issues/178 for more details.

For \(n_{1}\) and \(n_{2}\) (\(n_{1}\neq n_{2}\)), compute auxiliary variables:

\[\begin{split}\bar{r}_{jn_{1}} &\leftarrow\|\vec{\boldsymbol{y}}_{jn_{1}}\|_{2} \\ \bar{r}_{jn_{2}} &\leftarrow\|\vec{\boldsymbol{y}}_{jn_{2}}\|_{2}\end{split}\]

Then, compute weighted covariance matrix as follows:

\[\begin{split}\boldsymbol{G}_{in_{1}}^{(n_{1},n_{2})} &= \frac{1}{J}\sum_{j}\varphi(\bar{r}_{jn_{1}}) \boldsymbol{y}_{ij}^{(n_{1},n_{2})}{\boldsymbol{y}_{ij}^{(n_{1},n_{2})}}^{\mathsf{H}} \\ \boldsymbol{G}_{in_{2}}^{(n_{1},n_{2})} &= \frac{1}{J}\sum_{j}\varphi(\bar{r}_{jn_{2}}) \boldsymbol{y}_{ij}^{(n_{1},n_{2})}{\boldsymbol{y}_{ij}^{(n_{1},n_{2})}}^{\mathsf{H}},\end{split}\]

where

\[\begin{split}\varphi(\bar{r}_{jn}) &= \frac{G'_{\mathbb{R}}(\bar{r}_{jn})}{2\bar{r}_{jn}} \\ \boldsymbol{y}_{ij}^{(n_{1},n_{2})} &= \left( \begin{array}{c} \boldsymbol{w}_{in_{1}}^{\mathsf{H}}\boldsymbol{x}_{ij} \\ \boldsymbol{w}_{in_{2}}^{\mathsf{H}}\boldsymbol{x}_{ij} \end{array} \right).\end{split}\]

Using \(\boldsymbol{G}_{in_{1}}^{(n_{1},n_{2})}\) and \(\boldsymbol{G}_{in_{2}}^{(n_{1},n_{2})}\), we compute generalized eigenvectors.

\[\boldsymbol{G}_{in_{1}}^{(n_{1},n_{2})}\boldsymbol{h}_{i} = \lambda_{i}^{(n_{1},n_{2})}\boldsymbol{G}_{in_{2}}^{(n_{1},n_{2})}\boldsymbol{h}_{i}.\]

After that, we update two eigenvectors \(\boldsymbol{h}_{in_{1}}\) and \(\boldsymbol{h}_{in_{2}}\).

\[\begin{split}\boldsymbol{h}_{in_{1}} &\leftarrow\frac{\boldsymbol{h}_{in_{1}}} {\sqrt{\boldsymbol{h}_{in_{1}}^{\mathsf{H}}\boldsymbol{G}_{in_{1}}^{(n_{1},n_{2})} \boldsymbol{h}_{in_{1}}}}, \\ \boldsymbol{h}_{in_{2}} &\leftarrow\frac{\boldsymbol{h}_{in_{2}}} {\sqrt{\boldsymbol{h}_{in_{2}}^{\mathsf{H}}\boldsymbol{G}_{in_{2}}^{(n_{1},n_{2})} \boldsymbol{h}_{in_{2}}}}.\end{split}\]

Then, update \(\boldsymbol{w}_{in_{1}}\) and \(\boldsymbol{w}_{in_{2}}\) simultaneously.

\[( \begin{array}{cc} \boldsymbol{w}_{in_{1}} & \boldsymbol{w}_{in_{2}} \end{array} )\leftarrow( \begin{array}{cc} \boldsymbol{w}_{in_{1}} & \boldsymbol{w}_{in_{2}} \end{array} )( \begin{array}{cc} \boldsymbol{h}_{in_{1}} & \boldsymbol{h}_{in_{2}} \end{array} )\]

At each iteration, we update pairs of \(n_{1}\) and \(n_{2}\) for \(n_{1}\neq n_{2}\).

Return type:

None
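
The generalized eigenvector step above can be sketched in NumPy for a single frequency bin as follows. The helper name normalized_generalized_eigvecs is hypothetical, and the assignment of the two eigenvectors to \(n_{1}\) and \(n_{2}\) is an assumption of this example, since the text above does not specify an ordering.

```python
import numpy as np


def normalized_generalized_eigvecs(g1, g2):
    """Solve G1 h = lambda G2 h for 2x2 Hermitian positive definite G1, G2.

    Hypothetical helper, not part of ssspy. Returns (h1, h2), where h1 is
    normalized with respect to G1 and h2 with respect to G2.
    """
    # G1 h = lambda G2 h  is equivalent to  (G2^{-1} G1) h = lambda h.
    _, h = np.linalg.eig(np.linalg.solve(g2, g1))

    # Which eigenvector goes with which source is an assumption here.
    h1, h2 = h[:, 0], h[:, 1]

    h1 = h1 / np.sqrt((h1.conj() @ g1 @ h1).real)
    h2 = h2 / np.sqrt((h2.conj() @ g2 @ h2).real)
    return h1, h2


rng = np.random.default_rng(0)
a = rng.standard_normal((2, 8)) + 1j * rng.standard_normal((2, 8))
b = rng.standard_normal((2, 8)) + 1j * rng.standard_normal((2, 8))
g1 = a @ a.conj().T / 8
g2 = b @ b.conj().T / 8

h1, h2 = normalized_generalized_eigvecs(g1, g2)

# Given current filters as columns of w_pair (n_channels, 2), the
# simultaneous update is w_pair <- w_pair @ [h1 h2].
w_pair = np.eye(2, dtype=complex)
w_pair = w_pair @ np.stack([h1, h2], axis=1)
```

For distinct eigenvalues, the two eigenvectors are mutually orthogonal with respect to both weighted covariance matrices, so the normalized pair satisfies the stationarity conditions regardless of the ordering chosen.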

update_once_iss1()#

Update estimated spectrograms once using iterative source steering [5].

First, update auxiliary variables

\[\bar{r}_{jn} \leftarrow\|\vec{\boldsymbol{y}}_{jn}\|_{2}.\]

Then, update \(y_{ijn}\) as follows:

\[\begin{split}\boldsymbol{y}_{ij} & \leftarrow\boldsymbol{y}_{ij} - \boldsymbol{d}_{in}y_{ijn}, \\ d_{inn'} &= \begin{cases} \dfrac{\sum_{j}\dfrac{G'_{\mathbb{R}}(\bar{r}_{jn'})}{2\bar{r}_{jn'}} y_{ijn'}y_{ijn}^{*}}{\sum_{j}\dfrac{G'_{\mathbb{R}}(\bar{r}_{jn'})} {2\bar{r}_{jn'}}|y_{ijn}|^{2}} & (n'\neq n) \\ 1 - \dfrac{1}{\sqrt{\dfrac{1}{J}\sum_{j}\dfrac{G'_{\mathbb{R}}(\bar{r}_{jn'})} {2\bar{r}_{jn'}} |y_{ijn}|^{2}}} & (n'=n) \end{cases}.\end{split}\]
Return type:

None
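
A minimal NumPy sketch of one ISS step for a single source \(n\) is given below, assuming the Laplace contrast \(G_{\mathbb{R}}(r)=2r\) so that \(G'_{\mathbb{R}}(r)/(2r)=1/r\). The helper name iss1_update_source and the flooring constant eps are assumptions of this example, not names from the library.

```python
import numpy as np


def iss1_update_source(y, n, eps=1e-10):
    """One ISS update with respect to source n (hypothetical helper).

    y: separated spectrograms, shape (n_sources, n_bins, n_frames).
    Returns the updated spectrograms with the same shape.
    """
    n_frames = y.shape[-1]

    # Auxiliary variables r_jn' = ||y_jn'||_2 for every source n'.
    r = np.maximum(np.linalg.norm(y, axis=1), eps)  # (n_sources, n_frames)
    phi = 1.0 / r  # G'_R(r) / (2 r) with G_R(r) = 2 r (assumed contrast)

    y_n = y[n]  # (n_bins, n_frames)

    # Numerator and denominator of d_inn' for all n' at once.
    num = np.einsum("mj,mij,ij->mi", phi, y, y_n.conj())
    den = np.einsum("mj,ij->mi", phi, np.abs(y_n) ** 2)

    d = num / den  # case n' != n
    d[n] = 1 - 1 / np.sqrt(den[n] / n_frames)  # case n' == n

    # y_ij <- y_ij - d_in y_ijn
    return y - d[:, :, None] * y_n[None, :, :]
```

With the weights held fixed, the updated source \(n\) has unit weighted power, and every other updated source is uncorrelated (in the weighted sense) with the previous \(y_{ijn}\).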

update_once_iss2()#

Update estimated spectrograms once using pairwise iterative source steering [6].

First, we compute auxiliary variables:

\[\bar{r}_{jn} \leftarrow\|\vec{\boldsymbol{y}}_{jn}\|_{2},\]

where

\[\begin{split}G(\vec{\boldsymbol{y}}_{jn}) &= -\log p(\vec{\boldsymbol{y}}_{jn}), \\ G_{\mathbb{R}}(\|\vec{\boldsymbol{y}}_{jn}\|_{2}) &= G(\vec{\boldsymbol{y}}_{jn}).\end{split}\]

Then, we compute \(\boldsymbol{G}_{in}^{(n_{1},n_{2})}\) and \(\boldsymbol{f}_{in}^{(n_{1},n_{2})}\) for \(n_{1}\neq n_{2}\):

\[\begin{split}\begin{array}{rclc} \boldsymbol{G}_{in}^{(n_{1},n_{2})} &=& {\displaystyle\frac{1}{J}\sum_{j}}\varphi(\bar{r}_{jn}) \boldsymbol{y}_{ij}^{(n_{1},n_{2})}{\boldsymbol{y}_{ij}^{(n_{1},n_{2})}}^{\mathsf{H}} &(n=1,\ldots,N), \\ \boldsymbol{f}_{in}^{(n_{1},n_{2})} &=& {\displaystyle\frac{1}{J}\sum_{j}} \varphi(\bar{r}_{jn})y_{ijn}^{*}\boldsymbol{y}_{ij}^{(n_{1},n_{2})} &(n\neq n_{1},n_{2}), \\ \varphi(\bar{r}_{jn}) &=&\dfrac{G'_{\mathbb{R}}(\bar{r}_{jn})}{2\bar{r}_{jn}}. \end{array}\end{split}\]

Using \(\boldsymbol{G}_{in}^{(n_{1},n_{2})}\) and \(\boldsymbol{f}_{in}^{(n_{1},n_{2})}\), we compute

\[\begin{split}\begin{array}{rclc} \boldsymbol{p}_{in} &=& \dfrac{\boldsymbol{h}_{in}} {\sqrt{\boldsymbol{h}_{in}^{\mathsf{H}}\boldsymbol{G}_{in}^{(n_{1},n_{2})} \boldsymbol{h}_{in}}} & (n=n_{1},n_{2}), \\ \boldsymbol{q}_{in} &=& -{\boldsymbol{G}_{in}^{(n_{1},n_{2})}}^{-1}\boldsymbol{f}_{in}^{(n_{1},n_{2})} & (n\neq n_{1},n_{2}), \end{array}\end{split}\]

where \(\boldsymbol{h}_{in}\) (\(n=n_{1},n_{2}\)) is a generalized eigenvector obtained from

\[\boldsymbol{G}_{in_{1}}^{(n_{1},n_{2})}\boldsymbol{h}_{i} = \lambda_{i}\boldsymbol{G}_{in_{2}}^{(n_{1},n_{2})}\boldsymbol{h}_{i}.\]

Separated signal \(y_{ijn}\) is updated as follows:

\[\begin{split}y_{ijn} &\leftarrow\begin{cases} \boldsymbol{p}_{in}^{\mathsf{H}}\boldsymbol{y}_{ij}^{(n_{1},n_{2})} & (n=n_{1},n_{2}) \\ \boldsymbol{q}_{in}^{\mathsf{H}}\boldsymbol{y}_{ij}^{(n_{1},n_{2})} + y_{ijn} & (n\neq n_{1},n_{2}) \end{cases}.\end{split}\]
Return type:

None

class ssspy.bss.iva.GradLaplaceIVA(step_size=0.1, flooring_fn=functools.partial(<function max_flooring>, eps=1e-10), callbacks=None, is_holonomic=True, scale_restoration=True, record_loss=True, reference_id=0)#

Independent vector analysis (IVA) using the gradient descent on a Laplace distribution.

We assume \(\vec{\boldsymbol{y}}_{jn}\) follows a Laplace distribution.

\[p(\vec{\boldsymbol{y}}_{jn})\propto\exp(-\|\vec{\boldsymbol{y}}_{jn}\|_{2})\]
Parameters:
  • step_size (float) – A step size of the gradient descent. Default: 1e-1.

  • flooring_fn (callable, optional) – A flooring function for numerical stability. This function is expected to return the same shape tensor as the input. If you explicitly set flooring_fn=None, the identity function (lambda x: x) is used. Default: functools.partial(max_flooring, eps=1e-10).

  • callbacks (callable or list[callable], optional) – Callback functions. Each function is called before separation and at each iteration. Default: None.

  • is_holonomic (bool) – If is_holonomic=True, Holonomic-type update is used. Otherwise, Nonholonomic-type update is used. Default: True.

  • scale_restoration (bool or str) – Technique to restore scale ambiguity. If scale_restoration=True, the projection back technique is applied to estimated spectrograms. You can also specify projection_back explicitly. Default: True.

  • record_loss (bool) – Record the loss at each iteration of the update algorithm if record_loss=True. Default: True.

  • reference_id (int) – Reference channel for projection back. Default: 0.

Examples

Update demixing filters using Holonomic-type update:

>>> import numpy as np
>>> from ssspy.bss.iva import GradLaplaceIVA

>>> n_channels, n_bins, n_frames = 2, 2049, 128
>>> spectrogram_mix = np.random.randn(n_channels, n_bins, n_frames) \
...     + 1j * np.random.randn(n_channels, n_bins, n_frames)

>>> iva = GradLaplaceIVA(is_holonomic=True)
>>> spectrogram_est = iva(spectrogram_mix, n_iter=5000)
>>> print(spectrogram_mix.shape, spectrogram_est.shape)
(2, 2049, 128) (2, 2049, 128)

Update demixing filters using Nonholonomic-type update:

>>> n_channels, n_bins, n_frames = 2, 2049, 128
>>> spectrogram_mix = np.random.randn(n_channels, n_bins, n_frames) \
...     + 1j * np.random.randn(n_channels, n_bins, n_frames)

>>> iva = GradLaplaceIVA(is_holonomic=False)
>>> spectrogram_est = iva(spectrogram_mix, n_iter=5000)
>>> print(spectrogram_mix.shape, spectrogram_est.shape)
(2, 2049, 128) (2, 2049, 128)
compute_loss()#

Compute loss \(\mathcal{L}\).

\(\mathcal{L}\) is given as follows:

\[\mathcal{L} \ = \frac{2}{J}\sum_{j,n}\|\vec{\boldsymbol{y}}_{jn}\|_{2} \ - 2\sum_{i}\log|\det\boldsymbol{W}_{i}|.\]
Return type:

float

Returns:

Computed loss.
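
The loss above can be evaluated directly from the mixture and the demixing matrices. The following is a small NumPy sketch under assumed array layouts. The function name laplace_iva_loss is hypothetical and not part of the library.

```python
import numpy as np


def laplace_iva_loss(x, w):
    """Evaluate (2 / J) sum_{j,n} ||y_jn||_2 - 2 sum_i log|det W_i|.

    x: mixture, shape (n_channels, n_bins, n_frames).
    w: demixing matrices W_i, shape (n_bins, n_sources, n_channels).
    """
    y = w @ x.transpose(1, 0, 2)  # (n_bins, n_sources, n_frames)
    n_frames = y.shape[-1]

    # ||y_jn||_2: norm over frequency bins for each source and frame.
    norm = np.linalg.norm(y, axis=0)  # (n_sources, n_frames)

    # slogdet returns (sign, log|det|) for each frequency bin.
    _, logabsdet = np.linalg.slogdet(w)

    return 2 * np.sum(norm) / n_frames - 2 * np.sum(logabsdet)
```

Scaling every \(\boldsymbol{W}_{i}\) by a constant \(c>0\) multiplies the first term by \(c\) and subtracts \(2IN\log c\) from the second, which shows how the log-determinant term keeps the filters from collapsing to zero.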

update_once()#

Update demixing filters once using the gradient descent.

If is_holonomic=True, demixing filters are updated as follows:

\[\boldsymbol{W}_{i} \leftarrow\boldsymbol{W}_{i} - \eta\left(\frac{1}{J}\sum_{j} \ \boldsymbol{\phi}_{i}(\vec{\boldsymbol{Y}}_{j})\boldsymbol{y}_{ij}^{\mathsf{H}} \ -\boldsymbol{I}\right)\boldsymbol{W}_{i}^{-\mathsf{H}},\]

where

\[\begin{split}\boldsymbol{\phi}_{i}(\vec{\boldsymbol{Y}}_{j}) &= \left(\phi_{i}(\vec{\boldsymbol{y}}_{j1}),\ldots,\ \phi_{i}(\vec{\boldsymbol{y}}_{jn}),\ldots,\ \phi_{i}(\vec{\boldsymbol{y}}_{jN})\ \right)^{\mathsf{T}}\in\mathbb{C}^{N}, \\ \phi_{i}(\vec{\boldsymbol{y}}_{jn}) &= \frac{y_{ijn}}{\|\vec{\boldsymbol{y}}_{jn}\|_{2}}.\end{split}\]

Otherwise (is_holonomic=False),

\[\boldsymbol{W}_{i} \leftarrow\boldsymbol{W}_{i} - \eta\cdot\mathrm{offdiag}\left(\frac{1}{J}\sum_{j} \boldsymbol{\phi}_{i}(\vec{\boldsymbol{Y}}_{j})\boldsymbol{y}_{ij}^{\mathsf{H}}\right) \boldsymbol{W}_{i}^{-\mathsf{H}}.\]
Return type:

None
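
The two update rules above differ only in whether the diagonal of the correlation term is kept. The following NumPy sketch shows one step under assumed array layouts. The function name grad_laplace_step and the flooring constant eps are assumptions of this example rather than the library's code.

```python
import numpy as np


def grad_laplace_step(x, w, step_size=0.1, is_holonomic=True, eps=1e-10):
    """One gradient-descent step for Laplace IVA (hypothetical helper).

    x: mixture, shape (n_channels, n_bins, n_frames).
    w: demixing matrices W_i, shape (n_bins, n_sources, n_channels).
    """
    n_sources = w.shape[1]
    y = w @ x.transpose(1, 0, 2)  # (n_bins, n_sources, n_frames)
    n_frames = y.shape[-1]

    # phi_i(y_jn) = y_ijn / ||y_jn||_2, with the norm floored by eps.
    norm = np.maximum(np.linalg.norm(y, axis=0), eps)
    phi = y / norm

    # (1 / J) sum_j phi_i(Y_j) y_ij^H, shape (n_bins, n_sources, n_sources)
    corr = phi @ y.conj().transpose(0, 2, 1) / n_frames

    eye = np.eye(n_sources)
    if is_holonomic:
        delta = corr - eye
    else:
        delta = corr * (1 - eye)  # offdiag(.)

    w_inv_h = np.linalg.inv(w).conj().transpose(0, 2, 1)  # W_i^{-H}
    return w - step_size * (delta @ w_inv_h)
```

Starting from identity demixing matrices, the nonholonomic step leaves the diagonal of each \(\boldsymbol{W}_{i}\) unchanged, whereas the holonomic step also rescales it.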

class ssspy.bss.iva.GradGaussIVA(step_size=0.1, flooring_fn=functools.partial(<function max_flooring>, eps=1e-10), callbacks=None, is_holonomic=True, scale_restoration=True, record_loss=True, reference_id=0)#

Independent vector analysis (IVA) using the gradient descent on a time-varying Gaussian distribution.

We assume \(\vec{\boldsymbol{y}}_{jn}\) follows a time-varying Gaussian distribution.

\[p(\vec{\boldsymbol{y}}_{jn}) \propto\frac{1}{\alpha_{jn}^{I}} \exp\left(-\frac{\|\vec{\boldsymbol{y}}_{jn}\|_{2}^{2}}{\alpha_{jn}}\right).\]
Parameters:
  • step_size (float) – A step size of the gradient descent. Default: 1e-1.

  • flooring_fn (callable, optional) – A flooring function for numerical stability. This function is expected to return the same shape tensor as the input. If you explicitly set flooring_fn=None, the identity function (lambda x: x) is used. Default: functools.partial(max_flooring, eps=1e-10).

  • callbacks (callable or list[callable], optional) – Callback functions. Each function is called before separation and at each iteration. Default: None.

  • is_holonomic (bool) – If is_holonomic=True, Holonomic-type update is used. Otherwise, Nonholonomic-type update is used. Default: True.

  • scale_restoration (bool or str) – Technique to restore scale ambiguity. If scale_restoration=True, the projection back technique is applied to estimated spectrograms. You can also specify projection_back explicitly. Default: True.

  • record_loss (bool) – Record the loss at each iteration of the update algorithm if record_loss=True. Default: True.

  • reference_id (int) – Reference channel for projection back. Default: 0.

Examples

Update demixing filters using Holonomic-type update:

>>> import numpy as np
>>> from ssspy.bss.iva import GradGaussIVA

>>> n_channels, n_bins, n_frames = 2, 2049, 128
>>> spectrogram_mix = np.random.randn(n_channels, n_bins, n_frames) \
...     + 1j * np.random.randn(n_channels, n_bins, n_frames)

>>> iva = GradGaussIVA(is_holonomic=True)
>>> spectrogram_est = iva(spectrogram_mix, n_iter=5000)
>>> print(spectrogram_mix.shape, spectrogram_est.shape)
(2, 2049, 128) (2, 2049, 128)

Update demixing filters using Nonholonomic-type update:

>>> n_channels, n_bins, n_frames = 2, 2049, 128
>>> spectrogram_mix = np.random.randn(n_channels, n_bins, n_frames) \
...     + 1j * np.random.randn(n_channels, n_bins, n_frames)

>>> iva = GradGaussIVA(is_holonomic=False)
>>> spectrogram_est = iva(spectrogram_mix, n_iter=5000)
>>> print(spectrogram_mix.shape, spectrogram_est.shape)
(2, 2049, 128) (2, 2049, 128)
update_once()#

Update variance and demixing filters once.

Return type:

None

update_source_model()#

Update variance of Gaussian distribution.

Return type:

None
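
The variance update is not spelled out here. As a sketch only, assuming the density \(p(\vec{\boldsymbol{y}}_{jn})\propto\alpha_{jn}^{-I}\exp(-\|\vec{\boldsymbol{y}}_{jn}\|_{2}^{2}/\alpha_{jn})\) and a maximum-likelihood update, setting the derivative of \(I\log\alpha_{jn}+\|\vec{\boldsymbol{y}}_{jn}\|_{2}^{2}/\alpha_{jn}\) to zero gives \(\alpha_{jn}=\|\vec{\boldsymbol{y}}_{jn}\|_{2}^{2}/I\). The helper names and the flooring constant below are hypothetical, and the library's actual update may differ.

```python
import numpy as np


def update_variance(y, eps=1e-10):
    """Maximum-likelihood variance alpha_jn = ||y_jn||_2^2 / I (assumed rule).

    y: separated spectrograms, shape (n_sources, n_bins, n_frames).
    Returns alpha with shape (n_sources, n_frames).
    """
    n_bins = y.shape[1]
    alpha = np.sum(np.abs(y) ** 2, axis=1) / n_bins
    return np.maximum(alpha, eps)  # floor for numerical stability


def gauss_neg_log_likelihood(y, alpha):
    """sum_{j,n} (I log alpha_jn + ||y_jn||_2^2 / alpha_jn), up to a constant."""
    n_bins = y.shape[1]
    power = np.sum(np.abs(y) ** 2, axis=1)
    return np.sum(n_bins * np.log(alpha) + power / alpha)
```

Perturbing the returned variance in either direction increases the negative log-likelihood, which is a simple numerical check of the closed form.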

class ssspy.bss.iva.NaturalGradLaplaceIVA(step_size=0.1, flooring_fn=functools.partial(<function max_flooring>, eps=1e-10), callbacks=None, is_holonomic=True, scale_restoration=True, record_loss=True, reference_id=0)#

Independent vector analysis (IVA) using the natural gradient descent on a Laplace distribution.

We assume \(\vec{\boldsymbol{y}}_{jn}\) follows a Laplace distribution.

\[p(\vec{\boldsymbol{y}}_{jn})\propto\exp(-\|\vec{\boldsymbol{y}}_{jn}\|_{2})\]
Parameters:
  • step_size (float) – A step size of the gradient descent. Default: 1e-1.

  • flooring_fn (callable, optional) – A flooring function for numerical stability. This function is expected to return the same shape tensor as the input. If you explicitly set flooring_fn=None, the identity function (lambda x: x) is used. Default: functools.partial(max_flooring, eps=1e-10).

  • callbacks (callable or list[callable], optional) – Callback functions. Each function is called before separation and at each iteration. Default: None.

  • is_holonomic (bool) – If is_holonomic=True, Holonomic-type update is used. Otherwise, Nonholonomic-type update is used. Default: True.

  • scale_restoration (bool or str) – Technique to restore scale ambiguity. If scale_restoration=True, the projection back technique is applied to estimated spectrograms. You can also specify projection_back explicitly. Default: True.

  • record_loss (bool) – Record the loss at each iteration of the update algorithm if record_loss=True. Default: True.

  • reference_id (int) – Reference channel for projection back. Default: 0.

Examples

Update demixing filters using Holonomic-type update:

>>> import numpy as np
>>> from ssspy.bss.iva import NaturalGradLaplaceIVA

>>> n_channels, n_bins, n_frames = 2, 2049, 128
>>> spectrogram_mix = np.random.randn(n_channels, n_bins, n_frames) \
...     + 1j * np.random.randn(n_channels, n_bins, n_frames)

>>> iva = NaturalGradLaplaceIVA(is_holonomic=True)
>>> spectrogram_est = iva(spectrogram_mix, n_iter=500)
>>> print(spectrogram_mix.shape, spectrogram_est.shape)
(2, 2049, 128) (2, 2049, 128)

Update demixing filters using Nonholonomic-type update:

>>> n_channels, n_bins, n_frames = 2, 2049, 128
>>> spectrogram_mix = np.random.randn(n_channels, n_bins, n_frames) \
...     + 1j * np.random.randn(n_channels, n_bins, n_frames)

>>> iva = NaturalGradLaplaceIVA(is_holonomic=False)
>>> spectrogram_est = iva(spectrogram_mix, n_iter=500)
>>> print(spectrogram_mix.shape, spectrogram_est.shape)
(2, 2049, 128) (2, 2049, 128)
compute_loss()#

Compute loss \(\mathcal{L}\).

\(\mathcal{L}\) is given as follows:

\[\mathcal{L} \ = \frac{2}{J}\sum_{j,n}\|\vec{\boldsymbol{y}}_{jn}\|_{2} \ - 2\sum_{i}\log|\det\boldsymbol{W}_{i}|.\]
Return type:

float

Returns:

Computed loss.

update_once()#

Update demixing filters once using the natural gradient descent.

If is_holonomic=True, demixing filters are updated as follows:

\[\boldsymbol{W}_{i} \leftarrow\boldsymbol{W}_{i} - \eta\left(\frac{1}{J}\sum_{j} \ \boldsymbol{\phi}_{i}(\vec{\boldsymbol{Y}}_{j})\boldsymbol{y}_{ij}^{\mathsf{H}} \ -\boldsymbol{I}\right)\boldsymbol{W}_{i},\]

where

\[\begin{split}\boldsymbol{\phi}_{i}(\vec{\boldsymbol{Y}}_{j}) &= \left(\phi_{i}(\vec{\boldsymbol{y}}_{j1}),\ldots,\ \phi_{i}(\vec{\boldsymbol{y}}_{jn}),\ldots,\ \phi_{i}(\vec{\boldsymbol{y}}_{jN})\ \right)^{\mathsf{T}}\in\mathbb{C}^{N}, \\ \phi_{i}(\vec{\boldsymbol{y}}_{jn}) &= \frac{y_{ijn}}{\|\vec{\boldsymbol{y}}_{jn}\|_{2}}.\end{split}\]

Otherwise (is_holonomic=False),

\[\boldsymbol{W}_{i} \leftarrow\boldsymbol{W}_{i} - \eta\cdot\mathrm{offdiag}\left(\frac{1}{J}\sum_{j} \boldsymbol{\phi}_{i}(\vec{\boldsymbol{Y}}_{j})\boldsymbol{y}_{ij}^{\mathsf{H}}\right) \boldsymbol{W}_{i}.\]
Return type:

None
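
Compared with the plain gradient update, the natural gradient update replaces the right factor \(\boldsymbol{W}_{i}^{-\mathsf{H}}\) with \(\boldsymbol{W}_{i}\), which amounts to right-multiplying the gradient direction by \(\boldsymbol{W}_{i}^{\mathsf{H}}\boldsymbol{W}_{i}\) and avoids a matrix inverse. The sketch below checks this relation numerically for a single frequency bin. The function names and the random stand-in for the correlation term are assumptions of this example.

```python
import numpy as np


def gradient_direction(corr, w):
    """(corr - I) W^{-H}: direction used by the plain gradient update."""
    eye = np.eye(w.shape[0])
    return (corr - eye) @ np.linalg.inv(w).conj().T


def natural_gradient_direction(corr, w):
    """(corr - I) W: direction used by the natural gradient update."""
    eye = np.eye(w.shape[0])
    return (corr - eye) @ w


rng = np.random.default_rng(0)
# Stand-ins for (1 / J) sum_j phi_i(Y_j) y_ij^H and W_i at one bin.
corr = rng.standard_normal((3, 3)) + 1j * rng.standard_normal((3, 3))
w = rng.standard_normal((3, 3)) + 1j * rng.standard_normal((3, 3))

grad = gradient_direction(corr, w)
nat_grad = natural_gradient_direction(corr, w)

# Natural gradient = gradient right-multiplied by W^H W.
relation_holds = np.allclose(nat_grad, grad @ w.conj().T @ w)
```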

class ssspy.bss.iva.NaturalGradGaussIVA(step_size=0.1, flooring_fn=functools.partial(<function max_flooring>, eps=1e-10), callbacks=None, is_holonomic=True, scale_restoration=True, record_loss=True, reference_id=0)#

Independent vector analysis (IVA) using the natural gradient descent on a time-varying Gaussian distribution.

We assume \(\vec{\boldsymbol{y}}_{jn}\) follows a time-varying Gaussian distribution.

\[p(\vec{\boldsymbol{y}}_{jn}) \propto\frac{1}{\alpha_{jn}^{I}} \exp\left(-\frac{\|\vec{\boldsymbol{y}}_{jn}\|_{2}^{2}}{\alpha_{jn}}\right).\]
Parameters:
  • step_size (float) – A step size of the gradient descent. Default: 1e-1.

  • flooring_fn (callable, optional) – A flooring function for numerical stability. This function is expected to return the same shape tensor as the input. If you explicitly set flooring_fn=None, the identity function (lambda x: x) is used. Default: functools.partial(max_flooring, eps=1e-10).

  • callbacks (callable or list[callable], optional) – Callback functions. Each function is called before separation and at each iteration. Default: None.

  • is_holonomic (bool) – If is_holonomic=True, Holonomic-type update is used. Otherwise, Nonholonomic-type update is used. Default: True.

  • scale_restoration (bool or str) – Technique to restore scale ambiguity. If scale_restoration=True, the projection back technique is applied to estimated spectrograms. You can also specify projection_back explicitly. Default: True.

  • record_loss (bool) – Record the loss at each iteration of the update algorithm if record_loss=True. Default: True.

  • reference_id (int) – Reference channel for projection back. Default: 0.

Examples

Update demixing filters using Holonomic-type update:

>>> import numpy as np
>>> from ssspy.bss.iva import NaturalGradGaussIVA

>>> n_channels, n_bins, n_frames = 2, 2049, 128
>>> spectrogram_mix = np.random.randn(n_channels, n_bins, n_frames) \
...     + 1j * np.random.randn(n_channels, n_bins, n_frames)

>>> iva = NaturalGradGaussIVA(is_holonomic=True)
>>> spectrogram_est = iva(spectrogram_mix, n_iter=500)
>>> print(spectrogram_mix.shape, spectrogram_est.shape)
(2, 2049, 128) (2, 2049, 128)

Update demixing filters using Nonholonomic-type update:

>>> n_channels, n_bins, n_frames = 2, 2049, 128
>>> spectrogram_mix = np.random.randn(n_channels, n_bins, n_frames) \
...     + 1j * np.random.randn(n_channels, n_bins, n_frames)

>>> iva = NaturalGradGaussIVA(is_holonomic=False)
>>> spectrogram_est = iva(spectrogram_mix, n_iter=500)
>>> print(spectrogram_mix.shape, spectrogram_est.shape)
(2, 2049, 128) (2, 2049, 128)
compute_loss()#

Compute loss \(\mathcal{L}\).

\(\mathcal{L}\) is given as follows:

\[\begin{split}\mathcal{L} \ &= \frac{1}{J}\sum_{j,n}G(\vec{\boldsymbol{y}}_{jn}) \ - 2\sum_{i}\log|\det\boldsymbol{W}_{i}|, \\ G(\vec{\boldsymbol{y}}_{jn}) \ &= - \log p(\vec{\boldsymbol{y}}_{jn})\end{split}\]
Return type:

float

Returns:

Computed loss.

update_once()#

Update variance and demixing filters once.

Return type:

None

class ssspy.bss.iva.AuxLaplaceIVA(spatial_algorithm='IP', flooring_fn=functools.partial(<function max_flooring>, eps=1e-10), pair_selector=None, callbacks=None, scale_restoration=True, record_loss=True, reference_id=0)#

Auxiliary-function-based independent vector analysis (IVA) on a Laplace distribution.

We assume \(\vec{\boldsymbol{y}}_{jn}\) follows a Laplace distribution.

\[p(\vec{\boldsymbol{y}}_{jn})\propto\exp(-\|\vec{\boldsymbol{y}}_{jn}\|_{2})\]
Parameters:
  • spatial_algorithm (str) – Algorithm for demixing filter updates. Choose IP, IP1, IP2, ISS, ISS1, or ISS2. Default: IP.

  • flooring_fn (callable, optional) – A flooring function for numerical stability. This function is expected to return the same shape tensor as the input. If you explicitly set flooring_fn=None, the identity function (lambda x: x) is used. Default: functools.partial(max_flooring, eps=1e-10).

  • pair_selector (callable, optional) – Selector to choose the pair to update in IP2 and ISS2. If None is given, sequential_pair_selector is used. Default: None.

  • callbacks (callable or list[callable], optional) – Callback functions. Each function is called before separation and at each iteration. Default: None.

  • scale_restoration (bool or str) – Technique to restore scale ambiguity. If scale_restoration=True, the projection back technique is applied to estimated spectrograms. You can also specify projection_back explicitly. Default: True.

  • record_loss (bool) – Record the loss at each iteration of the update algorithm if record_loss=True. Default: True.

  • reference_id (int) – Reference channel for projection back. Default: 0.

Examples

Update demixing filters by IP:

>>> import numpy as np
>>> from ssspy.bss.iva import AuxLaplaceIVA

>>> n_channels, n_bins, n_frames = 2, 2049, 128
>>> spectrogram_mix = np.random.randn(n_channels, n_bins, n_frames) \
...     + 1j * np.random.randn(n_channels, n_bins, n_frames)

>>> iva = AuxLaplaceIVA(spatial_algorithm="IP")
>>> spectrogram_est = iva(spectrogram_mix, n_iter=100)
>>> print(spectrogram_mix.shape, spectrogram_est.shape)
(2, 2049, 128) (2, 2049, 128)

Update demixing filters by IP2:

>>> from ssspy.bss._select_pair import sequential_pair_selector

>>> n_channels, n_bins, n_frames = 2, 2049, 128
>>> spectrogram_mix = np.random.randn(n_channels, n_bins, n_frames) \
...     + 1j * np.random.randn(n_channels, n_bins, n_frames)

>>> iva = AuxLaplaceIVA(
...     spatial_algorithm="IP2",
...     pair_selector=sequential_pair_selector,
... )
>>> spectrogram_est = iva(spectrogram_mix, n_iter=100)
>>> print(spectrogram_mix.shape, spectrogram_est.shape)
(2, 2049, 128) (2, 2049, 128)

Update demixing filters by ISS:

>>> n_channels, n_bins, n_frames = 2, 2049, 128
>>> spectrogram_mix = np.random.randn(n_channels, n_bins, n_frames) \
...     + 1j * np.random.randn(n_channels, n_bins, n_frames)

>>> iva = AuxLaplaceIVA(spatial_algorithm="ISS")
>>> spectrogram_est = iva(spectrogram_mix, n_iter=100)
>>> print(spectrogram_mix.shape, spectrogram_est.shape)
(2, 2049, 128) (2, 2049, 128)

Update demixing filters by ISS2:

>>> import functools
>>> from ssspy.bss._select_pair import sequential_pair_selector

>>> n_channels, n_bins, n_frames = 2, 2049, 128
>>> spectrogram_mix = np.random.randn(n_channels, n_bins, n_frames) \
...     + 1j * np.random.randn(n_channels, n_bins, n_frames)

>>> iva = AuxLaplaceIVA(
...     spatial_algorithm="ISS2",
...     pair_selector=functools.partial(sequential_pair_selector, step=2),
... )
>>> spectrogram_est = iva(spectrogram_mix, n_iter=100)
>>> print(spectrogram_mix.shape, spectrogram_est.shape)
(2, 2049, 128) (2, 2049, 128)
class ssspy.bss.iva.AuxGaussIVA(spatial_algorithm='IP', flooring_fn=functools.partial(<function max_flooring>, eps=1e-10), pair_selector=None, callbacks=None, scale_restoration=True, record_loss=True, reference_id=0)#

Auxiliary-function-based independent vector analysis (IVA) on a time-varying Gaussian distribution [7].

We assume \(\vec{\boldsymbol{y}}_{jn}\) follows a time-varying Gaussian distribution.

\[p(\vec{\boldsymbol{y}}_{jn}) \propto\frac{1}{\alpha_{jn}^{I}} \exp\left(-\frac{\|\vec{\boldsymbol{y}}_{jn}\|_{2}^{2}}{\alpha_{jn}}\right).\]
Parameters:
  • spatial_algorithm (str) – Algorithm for demixing filter updates. Choose IP, IP1, IP2, ISS, ISS1, or ISS2. Default: IP.

  • flooring_fn (callable, optional) – A flooring function for numerical stability. This function is expected to return the same shape tensor as the input. If you explicitly set flooring_fn=None, the identity function (lambda x: x) is used. Default: functools.partial(max_flooring, eps=1e-10).

  • pair_selector (callable, optional) – Selector to choose the pair to update in IP2 and ISS2. If None is given, sequential_pair_selector is used. Default: None.

  • callbacks (callable or list[callable], optional) – Callback functions. Each function is called before separation and at each iteration. Default: None.

  • scale_restoration (bool or str) – Technique to restore scale ambiguity. If scale_restoration=True, the projection back technique is applied to estimated spectrograms. You can also specify projection_back explicitly. Default: True.

  • record_loss (bool) – Record the loss at each iteration of the update algorithm if record_loss=True. Default: True.

  • reference_id (int) – Reference channel for projection back. Default: 0.

Examples

Update demixing filters by IP:

>>> import numpy as np
>>> from ssspy.bss.iva import AuxGaussIVA

>>> n_channels, n_bins, n_frames = 2, 2049, 128
>>> spectrogram_mix = np.random.randn(n_channels, n_bins, n_frames) \
...     + 1j * np.random.randn(n_channels, n_bins, n_frames)

>>> iva = AuxGaussIVA(spatial_algorithm="IP")
>>> spectrogram_est = iva(spectrogram_mix, n_iter=100)
>>> print(spectrogram_mix.shape, spectrogram_est.shape)
(2, 2049, 128) (2, 2049, 128)

Update demixing filters by IP2:

>>> from ssspy.bss._select_pair import sequential_pair_selector

>>> n_channels, n_bins, n_frames = 2, 2049, 128
>>> spectrogram_mix = np.random.randn(n_channels, n_bins, n_frames) \
...     + 1j * np.random.randn(n_channels, n_bins, n_frames)

>>> iva = AuxGaussIVA(
...     spatial_algorithm="IP2",
...     pair_selector=sequential_pair_selector,
... )
>>> spectrogram_est = iva(spectrogram_mix, n_iter=100)
>>> print(spectrogram_mix.shape, spectrogram_est.shape)
(2, 2049, 128) (2, 2049, 128)

Update demixing filters by ISS:

>>> n_channels, n_bins, n_frames = 2, 2049, 128
>>> spectrogram_mix = np.random.randn(n_channels, n_bins, n_frames) \
...     + 1j * np.random.randn(n_channels, n_bins, n_frames)

>>> iva = AuxGaussIVA(spatial_algorithm="ISS")
>>> spectrogram_est = iva(spectrogram_mix, n_iter=100)
>>> print(spectrogram_mix.shape, spectrogram_est.shape)
(2, 2049, 128) (2, 2049, 128)

Update demixing filters by ISS2:

>>> import functools
>>> from ssspy.bss._select_pair import sequential_pair_selector

>>> n_channels, n_bins, n_frames = 2, 2049, 128
>>> spectrogram_mix = np.random.randn(n_channels, n_bins, n_frames) \
...     + 1j * np.random.randn(n_channels, n_bins, n_frames)

>>> iva = AuxGaussIVA(
...     spatial_algorithm="ISS2",
...     pair_selector=functools.partial(sequential_pair_selector, step=2),
... )
>>> spectrogram_est = iva(spectrogram_mix, n_iter=100)
>>> print(spectrogram_mix.shape, spectrogram_est.shape)
(2, 2049, 128) (2, 2049, 128)
update_once()#

Update variance and demixing filters once.

Return type:

None

update_source_model()#

Update variance of Gaussian distribution.

Return type:

None