ssspy.bss.fdica#
In this module, we separate multichannel signals using frequency-domain independent component analysis (FDICA). We denote the number of sources and microphones as \(N\) and \(M\), respectively. We also denote short-time Fourier transforms of source, observed, and separated signals as \(\boldsymbol{s}_{ij}\), \(\boldsymbol{x}_{ij}\), and \(\boldsymbol{y}_{ij}\), respectively.
where \(i=1,\ldots,I\) and \(j=1,\ldots,J\) are indices of frequency bins and time frames, respectively. When a mixing system is time-invariant, \(\boldsymbol{x}_{ij}\) is represented as follows:
where \(\boldsymbol{A}_{i}=(\boldsymbol{a}_{i1},\ldots,\boldsymbol{a}_{in},\ldots,\boldsymbol{a}_{iN})\in\mathbb{C}^{M\times N}\) is a mixing matrix. If \(M=N\) and \(\boldsymbol{A}_{i}\) is non-singular, a demixing system is represented as
where \(\boldsymbol{W}_{i}=(\boldsymbol{w}_{i1},\ldots,\boldsymbol{w}_{in},\ldots,\boldsymbol{w}_{iN})^{\mathsf{H}}\in\mathbb{C}^{N\times M}\) is a demixing matrix. The negative log-likelihood of observed signals (divided by \(J\)) is computed as follows:
where \(G(y_{ijn})\) is a contrast function. The derivative of \(G(y_{ijn})\) is called a score function.
Algorithms#
- class ssspy.bss.fdica.FDICABase(contrast_fn=None, flooring_fn=functools.partial(<function max_flooring>, eps=1e-10), callbacks=None, permutation_alignment=True, scale_restoration=True, record_loss=True, reference_id=0)#
Base class of frequency-domain independent component analysis (FDICA).
- Parameters:
contrast_fn (callable) – A contrast function which corresponds to \(-\log p(y_{ijn})\). This function is expected to receive (n_channels, n_bins, n_frames) and return (n_channels, n_bins, n_frames).
flooring_fn (callable, optional) – A flooring function for numerical stability. This function is expected to receive (n_channels, n_bins, n_frames) and return (n_channels, n_bins, n_frames). If you explicitly set
flooring_fn=None
, the identity function (lambda x: x
) is used. Default:partial(max_flooring, eps=1e-10)
.callbacks (callable or list[callable], optional) – Callback functions. Each function is called before separation and at each iteration. Default:
None
.permutation_alignment (bool) – If
permutation_alignment=True
, a permutation solver is used to align estimated spectrograms. Default:True
.scale_restoration (bool or str) – Technique to restore scale ambiguity. If
scale_restoration=True
, the projection back technique is applied to estimated spectrograms. You can also specifyprojection_back
orminimal_distortion_principle
. Default:True
.record_loss (bool) – Record the loss at each iteration of the update algorithm if
record_loss=True
. Default:True
.reference_id (int) – Reference channel for projection back and minimal distortion principle. Default:
0
.
- __call__(input, n_iter=100, initial_call=True, **kwargs)#
Separate a frequency-domain multichannel signal.
- Parameters:
input (numpy.ndarray) – Mixture signal in frequency-domain. The shape is (n_channels, n_bins, n_frames).
n_iter (int) – Number of iterations of demixing filter updates. Default:
100
.initial_call (bool) – If
True
, perform callbacks (and computation of loss if necessary) before iterations.
- Return type:
ndarray
- Returns:
numpy.ndarray of the separated signal in frequency-domain. The shape is (n_channels, n_bins, n_frames).
- apply_projection_back()#
Apply projection back technique to estimated spectrograms.
- Return type:
None
- compute_logdet(demix_filter)#
Compute log-determinant of demixing filter.
- Parameters:
demix_filter (numpy.ndarray) – Demixing filters with shape of (n_bins, n_sources, n_channels).
- Return type:
ndarray
- Returns:
numpy.ndarray of computed log-determinant values.
- compute_loss()#
Compute loss \(\mathcal{L}\).
\(\mathcal{L}\) is given as follows:
\[\begin{split}\mathcal{L} &= \sum_{i}\mathcal{L}^{[i]}, \\ \mathcal{L}^{[i]} &= \frac{1}{J}\sum_{j,n}G(y_{ijn}) - 2\log|\det\boldsymbol{W}_{i}|, \\ G(y_{ijn}) \ &= - \log p(y_{ijn})\end{split}\]- Return type:
float
- Returns:
Computed loss.
- restore_scale()#
Restore scale ambiguity.
If
self.scale_restoration=projection_back
, we use projection back technique. Ifself.scale_restoration=minimal_distortion_principle
, we use minimal distortion principle.- Return type:
None
- separate(input, demix_filter)#
Separate
input
usingdemixing_filter
.\[\boldsymbol{y}_{ij} = \boldsymbol{W}_{i}\boldsymbol{x}_{ij}\]- Parameters:
input (numpy.ndarray) – The mixture signal in frequency-domain. The shape is (n_channels, n_bins, n_frames).
demix_filter (numpy.ndarray) – The demixing filters to separate
input
. The shape is (n_bins, n_sources, n_channels).
- Return type:
ndarray
- Returns:
numpy.ndarray of the separated signal in frequency-domain. The shape is (n_sources, n_bins, n_frames).
- solve_permutation()#
Align demixing filters and separated spectrograms
- Return type:
None
- class ssspy.bss.fdica.GradFDICABase(step_size=0.1, contrast_fn=None, score_fn=None, flooring_fn=functools.partial(<function max_flooring>, eps=1e-10), callbacks=None, permutation_alignment=True, scale_restoration=True, record_loss=True, reference_id=0)#
Base class of frequency-domain independent component analysis (FDICA) using the gradient descent.
- Parameters:
step_size (float) – A step size of the gradient descent. Default:
1e-1
.contrast_fn (callable) – A contrast function which corresponds to \(-\log p(y_{ijn})\). This function is expected to receive (n_channels, n_bins, n_frames) and return (n_channels, n_bins, n_frames).
score_fn (callable) – A score function which corresponds to the partial derivative of the contrast function. This function is expected to receive (n_channels, n_bins, n_frames) and return (n_channels, n_bins, n_frames).
flooring_fn (callable, optional) – A flooring function for numerical stability. This function is expected to receive (n_channels, n_bins, n_frames) and return (n_channels, n_bins, n_frames). If you explicitly set
flooring_fn=None
, the identity function (lambda x: x
) is used. Default:partial(max_flooring, eps=1e-10)
.callbacks (callable or list[callable], optional) – Callback functions. Each function is called before separation and at each iteration. Default:
None
.permutation_alignment (bool) – If
permutation_alignment=True
, a permutation solver is used to align estimated spectrograms. Default:True
.scale_restoration (bool or str) – Technique to restore scale ambiguity. If
scale_restoration=True
, the projection back technique is applied to estimated spectrograms. You can also specifyprojection_back
orminimal_distortion_principle
. Default:True
.record_loss (bool) – Record the loss at each iteration of the gradient descent if
record_loss=True
. Default:True
.reference_id (int) – Reference channel for projection back and minimal distortion principle. Default:
0
.
- __call__(input, n_iter=100, initial_call=True, **kwargs)#
Separate a frequency-domain multichannel signal.
- Parameters:
input (numpy.ndarray) – The mixture signal in frequency-domain. The shape is (n_channels, n_bins, n_frames).
n_iter (int) – The number of iterations of demixing filter updates. Default:
100
.initial_call (bool) – If
True
, perform callbacks (and computation of loss if necessary) before iterations.
- Return type:
ndarray
- Returns:
numpy.ndarray of the separated signal in frequency-domain. The shape is (n_channels, n_bins, n_frames).
- class ssspy.bss.fdica.GradFDICA(step_size=0.1, contrast_fn=None, score_fn=None, flooring_fn=functools.partial(<function max_flooring>, eps=1e-10), callbacks=None, is_holonomic=False, permutation_alignment=True, scale_restoration=True, record_loss=True, reference_id=0)#
Frequency-domain independent component analysis (FDICA) using the gradient descent.
- Parameters:
step_size (float) – A step size of the gradient descent. Default:
1e-1
.contrast_fn (callable) – A contrast function corresponds to \(-\log p(y_{ijn})\). This function is expected to receive (n_channels, n_bins, n_frames) and return (n_channels, n_bins, n_frames).
score_fn (callable) – A score function corresponds to the partial derivative of the contrast function. This function is expected to receive (n_channels, n_bins, n_frames) and return (n_channels, n_bins, n_frames).
flooring_fn (callable, optional) – A flooring function for numerical stability. This function is expected to receive (n_channels, n_bins, n_frames) and return (n_channels, n_bins, n_frames). If you explicitly set
flooring_fn=None
, the identity function (lambda x: x
) is used. Default:partial(max_flooring, eps=1e-10)
.callbacks (callable or list[callable], optional) – Callback functions. Each function is called before separation and at each iteration. Default:
None
.is_holonomic (bool) – If
is_holonomic=True
, Holonomic-type update is used. Otherwise, Nonholonomic-type update is used. Default:False
.permutation_alignment (bool) – If
permutation_alignment=True
, a permutation solver is used to align estimated spectrograms. Default:True
.scale_restoration (bool or str) – Technique to restore scale ambiguity. If
scale_restoration=True
, the projection back technique is applied to estimated spectrograms. You can also specifyprojection_back
orminimal_distortion_principle
. Default:True
.record_loss (bool) – Record the loss at each iteration of the gradient descent if
record_loss=True
. Default:True
.reference_id (int) – Reference channel for projection back and minimal distortion principle. Default:
0
.
Examples
Update demixing filters using Holonomic-type update:
>>> def contrast_fn(y): ... return 2 * np.abs(y) >>> def score_fn(y): ... denom = np.maximum(np.abs(y), 1e-10) ... return y / denom >>> n_channels, n_bins, n_frames = 2, 2049, 128 >>> spectrogram_mix = \ ... np.random.randn(n_channels, n_bins, n_frames) \ ... + 1j * np.random.randn(n_channels, n_bins, n_frames) >>> fdica = GradFDICA( ... contrast_fn=contrast_fn, ... score_fn=score_fn, ... is_holonomic=True, ... ) >>> spectrogram_est = fdica(spectrogram_mix, n_iter=1000) >>> print(spectrogram_mix.shape, spectrogram_est.shape) (2, 2049, 128), (2, 2049, 128)
Update demixing filters using Nonholonomic-type update:
>>> def contrast_fn(y): ... return 2 * np.abs(y) >>> def score_fn(y): ... denom = np.maximum(np.abs(y), 1e-10) ... return y / denom >>> n_channels, n_bins, n_frames = 2, 2049, 128 >>> spectrogram_mix = \ ... np.random.randn(n_channels, n_bins, n_frames) \ ... + 1j * np.random.randn(n_channels, n_bins, n_frames) >>> fdica = GradFDICA( ... contrast_fn=contrast_fn, ... score_fn=score_fn, ... is_holonomic=False, ... ) >>> spectrogram_est = fdica(spectrogram_mix, n_iter=1000) >>> print(spectrogram_mix.shape, spectrogram_est.shape) (2, 2049, 128), (2, 2049, 128)
- update_once()#
Update demixing filters once using the gradient descent.
If
is_holonomic=True
, demixing filters are updated as follows: :rtype:None
\[\boldsymbol{W}_{i} \leftarrow\boldsymbol{W}_{i} - \eta\left(\frac{1}{J}\sum_{j} \boldsymbol{\phi}(\boldsymbol{y}_{ij})\boldsymbol{y}_{ij}^{\mathsf{H}} -\boldsymbol{I}\right)\boldsymbol{W}_{i}^{-\mathsf{H}},\]where
\[\begin{split}\boldsymbol{\phi}(\boldsymbol{y}_{ij}) &= \left(\phi(y_{ij1}),\ldots,\phi(y_{ijn}),\ldots,\phi(y_{ijN}) \right)^{\mathsf{T}}\in\mathbb{C}^{N}, \\ \phi(y_{ijn}) &= \frac{\partial G(y_{ijn})}{\partial y_{ijn}^{*}}, \\ G(y_{ijn}) &= -\log p(y_{ijn}).\end{split}\]Otherwise (
is_holonomic=False
),\[\boldsymbol{W}_{i} \leftarrow\boldsymbol{W}_{i} - \eta\cdot\mathrm{offdiag}\left(\frac{1}{J}\sum_{j} \boldsymbol{\phi}(\boldsymbol{y}_{ij})\boldsymbol{y}_{ij}^{\mathsf{H}}\right) \boldsymbol{W}_{i}^{-\mathsf{H}}.\]
- class ssspy.bss.fdica.NaturalGradFDICA(step_size=0.1, contrast_fn=None, score_fn=None, flooring_fn=functools.partial(<function max_flooring>, eps=1e-10), callbacks=None, is_holonomic=False, permutation_alignment=True, scale_restoration=True, record_loss=True, reference_id=0)#
Frequency-domain independent component analysis (FDICA) using the natural gradient descent.
- Parameters:
step_size (float) – A step size of the gradient descent. Default:
1e-1
.contrast_fn (callable) – A contrast function corresponds to \(-\log p(y_{ijn})\). This function is expected to receive (n_channels, n_bins, n_frames) and return (n_channels, n_bins, n_frames).
score_fn (callable) – A score function corresponds to the partial derivative of the contrast function. This function is expected to receive (n_channels, n_bins, n_frames) and return (n_channels, n_bins, n_frames).
flooring_fn (callable, optional) – A flooring function for numerical stability. This function is expected to receive (n_channels, n_bins, n_frames) and return (n_channels, n_bins, n_frames). If you explicitly set
flooring_fn=None
, the identity function (lambda x: x
) is used. Default:partial(max_flooring, eps=1e-10)
.callbacks (callable or list[callable], optional) – Callback functions. Each function is called before separation and at each iteration. Default:
None
.is_holonomic (bool) – If
is_holonomic=True
, Holonomic-type update is used. Otherwise, Nonholonomic-type update is used. Default:False
.permutation_alignment (bool) – If
permutation_alignment=True
, a permutation solver is used to align estimated spectrograms. Default:True
.scale_restoration (bool or str) – Technique to restore scale ambiguity. If
scale_restoration=True
, the projection back technique is applied to estimated spectrograms. You can also specifyprojection_back
orminimal_distortion_principle
. Default:True
.record_loss (bool) – Record the loss at each iteration of the gradient descent if
record_loss=True
. Default:True
.reference_id (int) – Reference channel for projection back and minimal distortion principle. Default:
0
.
Examples
Update demixing filters using Holonomic-type update:
>>> def contrast_fn(y): ... return 2 * np.abs(y) >>> def score_fn(y): ... denom = np.maximum(np.abs(y), 1e-10) ... return y / denom >>> n_channels, n_bins, n_frames = 2, 2049, 128 >>> spectrogram_mix = \ ... np.random.randn(n_channels, n_bins, n_frames) \ ... + 1j * np.random.randn(n_channels, n_bins, n_frames) >>> fdica = NaturalGradFDICA( ... contrast_fn=contrast_fn, ... score_fn=score_fn, ... is_holonomic=True, ... ) >>> spectrogram_est = fdica(spectrogram_mix, n_iter=1000) >>> print(spectrogram_mix.shape, spectrogram_est.shape) (2, 2049, 128), (2, 2049, 128)
Update demixing filters using Nonholonomic-type update:
>>> def contrast_fn(y): ... return 2 * np.abs(y) >>> def score_fn(y): ... denom = np.maximum(np.abs(y), 1e-10) ... return y / denom >>> n_channels, n_bins, n_frames = 2, 2049, 128 >>> spectrogram_mix = \ ... np.random.randn(n_channels, n_bins, n_frames) \ ... + 1j * np.random.randn(n_channels, n_bins, n_frames) >>> fdica = NaturalGradFDICA( ... contrast_fn=contrast_fn, ... score_fn=score_fn, ... is_holonomic=False, ... ) >>> spectrogram_est = fdica(spectrogram_mix, n_iter=1000) >>> print(spectrogram_mix.shape, spectrogram_est.shape) (2, 2049, 128), (2, 2049, 128)
- update_once()#
Update demixing filters once using the gradient descent.
If
is_holonomic=True
, demixing filters are updated as follows: :rtype:None
\[\boldsymbol{W}_{i} \leftarrow\boldsymbol{W}_{i} - \eta\left(\frac{1}{J}\sum_{j} \boldsymbol{\phi}(\boldsymbol{y}_{ij})\boldsymbol{y}_{ij}^{\mathsf{H}} -\boldsymbol{I}\right)\boldsymbol{W}_{i},\]where
\[\begin{split}\boldsymbol{\phi}(\boldsymbol{y}_{ij}) &= \left(\phi(y_{ij1}),\ldots,\phi(y_{ijn}),\ldots,\phi(y_{ijN}) \right)^{\mathsf{T}}\in\mathbb{C}^{N}, \\ \phi(y_{ijn}) &= \frac{\partial G(y_{ijn})}{\partial y_{ijn}^{*}}, \\ G(y_{ijn}) &= -\log p(y_{ijn}).\end{split}\]Otherwise (
is_holonomic=False
),\[\boldsymbol{W}_{i} \leftarrow\boldsymbol{W}_{i} - \eta\cdot\mathrm{offdiag}\left(\frac{1}{J}\sum_{j} \boldsymbol{\phi}(\boldsymbol{y}_{ij})\boldsymbol{y}_{ij}^{\mathsf{H}}\right) \boldsymbol{W}_{i}.\]
- class ssspy.bss.fdica.AuxFDICA(spatial_algorithm='IP', contrast_fn=None, d_contrast_fn=None, flooring_fn=functools.partial(<function max_flooring>, eps=1e-10), pair_selector=None, callbacks=None, permutation_alignment=True, scale_restoration=True, record_loss=True, reference_id=0)#
Auxiliary-function-based frequency-domain independent component analysis (AuxFDICA) [1].
- Parameters:
spatial_algorithm (str) – Algorithm to update demixing filters. Choose
IP
,IP1
, orIP2
. Default:IP
.contrast_fn (callable) – A contrast function corresponds to \(-\log p(y_{ijn})\). This function is expected to receive (n_channels, n_bins, n_frames) and return (n_channels, n_bins, n_frames).
d_contrast_fn (callable) – A partial derivative of the real contrast function. This function is expected to receive (n_channels, n_bins, n_frames) and return (n_channels, n_bins, n_frames).
flooring_fn (callable, optional) – A flooring function for numerical stability. This function is expected to receive (n_channels, n_bins, n_frames) and return (n_channels, n_bins, n_frames). If you explicitly set
flooring_fn=None
, the identity function (lambda x: x
) is used. Default:partial(max_flooring, eps=1e-10)
.pair_selector (callable, optional) – Selector to choose updaing pair in
IP2
andISS2
. IfNone
is given,partial(sequential_pair_selector, sort=True)
is used. Default:None
.callbacks (callable or list[callable], optional) – Callback functions. Each function is called before separation and at each iteration. Default:
None
.permutation_alignment (bool) – If
permutation_alignment=True
, a permutation solver is used to align estimated spectrograms. Default:True
.scale_restoration (bool or str) – Technique to restore scale ambiguity. If
scale_restoration=True
, the projection back technique is applied to estimated spectrograms. You can also specifyprojection_back
orminimal_distortion_principle
. Default:True
.record_loss (bool) – Record the loss at each iteration of the demixing filter update if
record_loss=True
. Default:True
.reference_id (int) – Reference channel for projection back and minimal distortion principle. Default:
0
.
Examples
Update demixing filters by IP:
>>> def contrast_fn(y): ... return 2 * np.abs(y) >>> def d_contrast_fn(y): ... return 2 * np.ones_like(y) >>> n_channels, n_bins, n_frames = 2, 2049, 128 >>> spectrogram_mix = np.random.randn(n_channels, n_bins, n_frames) \ ... + 1j * np.random.randn(n_channels, n_bins, n_frames) >>> fdica = AuxFDICA( ... spatial_algorithm="IP", ... contrast_fn=contrast_fn, ... d_contrast_fn=d_contrast_fn, ... ) >>> spectrogram_est = fdica(spectrogram_mix, n_iter=100) >>> print(spectrogram_mix.shape, spectrogram_est.shape) (2, 2049, 128), (2, 2049, 128)
Update demixing filters by IP2:
>>> from ssspy.utils.select_pair import sequential_pair_selector >>> def contrast_fn(y): ... return 2 * np.abs(y) >>> def d_contrast_fn(y): ... return 2 * np.ones_like(y) >>> n_channels, n_bins, n_frames = 2, 2049, 128 >>> spectrogram_mix = np.random.randn(n_channels, n_bins, n_frames) \ ... + 1j * np.random.randn(n_channels, n_bins, n_frames) >>> fdica = AuxFDICA( ... spatial_algorithm="IP2", ... contrast_fn=contrast_fn, ... d_contrast_fn=d_contrast_fn, ... pair_selector=sequential_pair_selector, ... ) >>> spectrogram_est = fdica(spectrogram_mix, n_iter=100) >>> print(spectrogram_mix.shape, spectrogram_est.shape) (2, 2049, 128), (2, 2049, 128)
- __call__(input, n_iter=100, initial_call=True, **kwargs)#
Separate a frequency-domain multichannel signal.
- Parameters:
input (numpy.ndarray) – The mixture signal in frequency-domain. The shape is (n_channels, n_bins, n_frames).
n_iter (int) – The number of iterations of demixing filter updates. Default:
100
.initial_call (bool) – If
True
, perform callbacks (and computation of loss if necessary) before iterations.
- Return type:
ndarray
- Returns:
numpy.ndarray of the separated signal in frequency-domain. The shape is (n_channels, n_bins, n_frames).
- update_once(flooring_fn='self')#
Update demixing filters once.
If
self.spatial_algorithm
isIP
orIP1
,update_once_ip1
is called.If
self.spatial_algorithm
isIP2
,update_once_ip2
is called.
- Parameters:
flooring_fn (callable or str, optional) – A flooring function for numerical stability. This function is expected to return the same shape tensor as the input. If you explicitly set
flooring_fn=None
, the identity function (lambda x: x
) is used. Ifself
is given as str,self.flooring_fn
is used. Default:self
.- Return type:
None
- update_once_ip1(flooring_fn='self')#
Update demixing filters once using iterative projection.
- Parameters:
flooring_fn (callable or str, optional) – A flooring function for numerical stability. This function is expected to return the same shape tensor as the input. If you explicitly set
flooring_fn=None
, the identity function (lambda x: x
) is used. Ifself
is given as str,self.flooring_fn
is used. Default:self
.- Return type:
None
Demixing filters are updated sequentially for \(n=1,\ldots,N\) as follows:
\[\begin{split}\boldsymbol{w}_{in} &\leftarrow\left(\boldsymbol{W}_{in}^{\mathsf{H}}\boldsymbol{U}_{in}\right)^{-1} \boldsymbol{e}_{n}, \\ \boldsymbol{w}_{in} &\leftarrow\frac{\boldsymbol{w}_{in}} {\sqrt{\boldsymbol{w}_{in}^{\mathsf{H}}\boldsymbol{U}_{in}\boldsymbol{w}_{in}}}, \\\end{split}\]where
\[\begin{split}\boldsymbol{U}_{in} &= \frac{1}{J}\sum_{j} \frac{G'_{\mathbb{R}}(|y_{ijn}|)}{2|y_{ijn}|} \boldsymbol{x}_{ij}\boldsymbol{x}_{ij}^{\mathsf{H}}, \\ G(y_{ijn}) &= -\log p(y_{ijn}), \\ G_{\mathbb{R}}(|y_{ijn}|) &= G(y_{ijn}).\end{split}\]
- update_once_ip2(flooring_fn='self')#
Update demixing filters once using pairwise iterative projection.
- Parameters:
flooring_fn (callable or str, optional) – A flooring function for numerical stability. This function is expected to return the same shape tensor as the input. If you explicitly set
flooring_fn=None
, the identity function (lambda x: x
) is used. Ifself
is given as str,self.flooring_fn
is used. Default:self
.- Return type:
None
For \(n_{1}\) and \(n_{2}\) (\(n_{1}\neq n_{2}\)), compute auxiliary variables:
\[\begin{split}\bar{r}_{ijn_{1}} &\leftarrow|y_{ijn_{1}}| \\ \bar{r}_{ijn_{2}} &\leftarrow|y_{ijn_{2}}|\end{split}\]Then, for \(n=n_{1},n_{2}\), compute weighted covariance matrix as follows:
\[\begin{split}\boldsymbol{U}_{in_{1}} &= \frac{1}{J}\sum_{j} \frac{G'_{\mathbb{R}}(\bar{r}_{ijn_{1}})}{2\bar{r}_{ijn_{1}}} \boldsymbol{x}_{ij}\boldsymbol{x}_{ij}^{\mathsf{H}}, \\ \boldsymbol{U}_{in_{2}} &= \frac{1}{J}\sum_{j} \frac{G'_{\mathbb{R}}(\bar{r}_{ijn_{2}})}{2\bar{r}_{ijn_{2}}} \boldsymbol{x}_{ij}\boldsymbol{x}_{ij}^{\mathsf{H}},\end{split}\]where
\[\begin{split}G(y_{ijn}) &= -\log p(y_{ijn}), \\ G_{\mathbb{R}}(|y_{ijn}|) &= G(y_{ijn}).\end{split}\]Using \(\boldsymbol{U}_{in_{1}}\) and \(\boldsymbol{U}_{in_{2}}\), we compute generalized eigenvectors.
\[\left({\boldsymbol{P}_{in_{1}}^{(n_{1},n_{2})}}^{\mathsf{H}}\boldsymbol{U}_{in_{1}} \boldsymbol{P}_{in_{1}}^{(n_{1},n_{2})}\right)\boldsymbol{h}_{i} = \lambda_{i} \left({\boldsymbol{P}_{in_{2}}^{(n_{1},n_{2})}}^{\mathsf{H}}\boldsymbol{U}_{in_{2}} \boldsymbol{P}_{in_{2}}^{(n_{1},n_{2})}\right)\boldsymbol{h}_{i},\]where
\[\begin{split}\boldsymbol{P}_{in_{1}}^{(n_{1},n_{2})} &= (\boldsymbol{W}_{i}\boldsymbol{U}_{in_{1}})^{-1} ( \begin{array}{cc} \boldsymbol{e}_{n_{1}} & \boldsymbol{e}_{n_{2}} \end{array} ), \\ \boldsymbol{P}_{in_{2}}^{(n_{1},n_{2})} &= (\boldsymbol{W}_{i}\boldsymbol{U}_{in_{2}})^{-1} ( \begin{array}{cc} \boldsymbol{e}_{n_{1}} & \boldsymbol{e}_{n_{2}} \end{array} ).\end{split}\]After that, we standardize two eigenvectors \(\boldsymbol{h}_{in_{1}}\) and \(\boldsymbol{h}_{in_{2}}\).
\[\begin{split}\boldsymbol{h}_{in_{1}} &\leftarrow\frac{\boldsymbol{h}_{in_{1}}} {\sqrt{\boldsymbol{h}_{in_{1}}^{\mathsf{H}} \left({\boldsymbol{P}_{in_{1}}^{(n_{1},n_{2})}}^{\mathsf{H}}\boldsymbol{U}_{in_{1}} \boldsymbol{P}_{in_{1}}^{(n_{1},n_{2})}\right) \boldsymbol{h}_{in_{1}}}}, \\ \boldsymbol{h}_{in_{2}} &\leftarrow\frac{\boldsymbol{h}_{in_{2}}} {\sqrt{\boldsymbol{h}_{in_{2}}^{\mathsf{H}} \left({\boldsymbol{P}_{in_{2}}^{(n_{1},n_{2})}}^{\mathsf{H}}\boldsymbol{U}_{in_{2}} \boldsymbol{P}_{in_{2}}^{(n_{1},n_{2})}\right) \boldsymbol{h}_{in_{2}}}}.\end{split}\]Then, update \(\boldsymbol{w}_{in_{1}}\) and \(\boldsymbol{w}_{in_{2}}\) simultaneously.
\[\begin{split}\boldsymbol{w}_{in_{1}} &\leftarrow \boldsymbol{P}_{in_{1}}^{(n_{1},n_{2})}\boldsymbol{h}_{in_{1}} \\ \boldsymbol{w}_{in_{2}} &\leftarrow \boldsymbol{P}_{in_{2}}^{(n_{1},n_{2})}\boldsymbol{h}_{in_{2}}\end{split}\]At each iteration, we update pairs of \(n_{1}\) and \(n_{1}\) for \(n_{1}\neq n_{2}\).
- class ssspy.bss.fdica.GradLaplaceFDICA(step_size=0.1, flooring_fn=functools.partial(<function max_flooring>, eps=1e-10), callbacks=None, is_holonomic=False, permutation_alignment=True, scale_restoration=True, record_loss=True, reference_id=0)#
Frequency-domain independent component analysis (FDICA) using the gradient descent on a Laplace distribution.
We assume \(y_{ijn}\) follows a Laplace distribution.
\[p(y_{ijn})\propto\exp(|y_{ijn}|)\]- Parameters:
step_size (float) – A step size of the gradient descent. Default:
1e-1
.flooring_fn (callable, optional) – A flooring function for numerical stability. This function is expected to receive (n_channels, n_bins, n_frames) and return (n_channels, n_bins, n_frames). If you explicitly set
flooring_fn=None
, the identity function (lambda x: x
) is used. Default:partial(max_flooring, eps=1e-10)
.callbacks (callable or list[callable], optional) – Callback functions. Each function is called before separation and at each iteration. Default:
None
.is_holonomic (bool) – If
is_holonomic=True
, Holonomic-type update is used. Otherwise, Nonholonomic-type update is used. Default:False
.permutation_alignment (bool) – If
permutation_alignment=True
, a permutation solver is used to align estimated spectrograms. Default:True
.scale_restoration (bool or str) – Technique to restore scale ambiguity. If
scale_restoration=True
, the projection back technique is applied to estimated spectrograms. You can also specifyprojection_back
orminimal_distortion_principle
. Default:True
.record_loss (bool) – Record the loss at each iteration of the gradient descent if
record_loss=True
. Default:True
.reference_id (int) – Reference channel for projection back and minimal distortion principle. Default:
0
.
Examples
Update demixing filters using Holonomic-type update:
>>> n_channels, n_bins, n_frames = 2, 2049, 128 >>> spectrogram_mix = \ ... np.random.randn(n_channels, n_bins, n_frames) \ ... + 1j * np.random.randn(n_channels, n_bins, n_frames) >>> fdica = GradLaplaceFDICA(is_holonomic=True) >>> spectrogram_est = fdica(spectrogram_mix, n_iter=1000) >>> print(spectrogram_mix.shape, spectrogram_est.shape) (2, 2049, 128), (2, 2049, 128)
Update demixing filters using Nonholonomic-type update:
>>> n_channels, n_bins, n_frames = 2, 2049, 128 >>> spectrogram_mix = \ ... np.random.randn(n_channels, n_bins, n_frames) \ ... + 1j * np.random.randn(n_channels, n_bins, n_frames) >>> fdica = GradLaplaceFDICA(is_holonomic=False) >>> spectrogram_est = fdica(spectrogram_mix, n_iter=1000) >>> print(spectrogram_mix.shape, spectrogram_est.shape) (2, 2049, 128), (2, 2049, 128)
- class ssspy.bss.fdica.NaturalGradLaplaceFDICA(step_size=0.1, flooring_fn=functools.partial(<function max_flooring>, eps=1e-10), callbacks=None, is_holonomic=False, permutation_alignment=True, scale_restoration=True, record_loss=True, reference_id=0)#
Frequency-domain independent component analysis (FDICA) using the natural gradient descent on a Laplace distribution.
We assume \(y_{ijn}\) follows a Laplace distribution.
\[p(y_{ijn})\propto\exp(|y_{ijn}|)\]- Parameters:
step_size (float) – A step size of the gradient descent. Default:
1e-1
.flooring_fn (callable, optional) – A flooring function for numerical stability. This function is expected to receive (n_channels, n_bins, n_frames) and return (n_channels, n_bins, n_frames). If you explicitly set
flooring_fn=None
, the identity function (lambda x: x
) is used. Default:partial(max_flooring, eps=1e-10)
.callbacks (callable or list[callable], optional) – Callback functions. Each function is called before separation and at each iteration. Default:
None
.is_holonomic (bool) – If
is_holonomic=True
, Holonomic-type update is used. Otherwise, Nonholonomic-type update is used. Default:False
.permutation_alignment (bool) – If
permutation_alignment=True
, a permutation solver is used to align estimated spectrograms. Default:True
.scale_restoration (bool or str) – Technique to restore scale ambiguity. If
scale_restoration=True
, the projection back technique is applied to estimated spectrograms. You can also specifyprojection_back
orminimal_distortion_principle
. Default:True
.record_loss (bool) – Record the loss at each iteration of the gradient descent if
record_loss=True
. Default:True
.reference_id (int) – Reference channel for projection back and minimal distortion principle. Default:
0
.
Examples
Update demixing filters using Holonomic-type update:
>>> n_channels, n_bins, n_frames = 2, 2049, 128 >>> spectrogram_mix = \ ... np.random.randn(n_channels, n_bins, n_frames) \ ... + 1j * np.random.randn(n_channels, n_bins, n_frames) >>> fdica = NaturalGradLaplaceFDICA(is_holonomic=True) >>> spectrogram_est = fdica(spectrogram_mix, n_iter=1000) >>> print(spectrogram_mix.shape, spectrogram_est.shape) (2, 2049, 128), (2, 2049, 128)
Update demixing filters using Nonholonomic-type update:
>>> n_channels, n_bins, n_frames = 2, 2049, 128 >>> spectrogram_mix = \ ... np.random.randn(n_channels, n_bins, n_frames) \ ... + 1j * np.random.randn(n_channels, n_bins, n_frames) >>> fdica = NaturalGradLaplaceFDICA(is_holonomic=False) >>> spectrogram_est = fdica(spectrogram_mix, n_iter=1000) >>> print(spectrogram_mix.shape, spectrogram_est.shape) (2, 2049, 128), (2, 2049, 128)
- class ssspy.bss.fdica.AuxLaplaceFDICA(spatial_algorithm='IP', flooring_fn=functools.partial(<function max_flooring>, eps=1e-10), pair_selector=None, callbacks=None, permutation_alignment=True, scale_restoration=True, record_loss=True, reference_id=0)#
Auxiliary-function-based frequency-domain independent component analysis on a Laplace distribution.
We assume \(y_{ijn}\) follows a Laplace distribution.
\[p(y_{ijn})\propto\exp(|y_{ijn}|)\]- Parameters:
spatial_algorithm (str) – Algorithm to update demixing filters. Choose
IP
,IP1
, orIP2
. Default:IP
.flooring_fn (callable, optional) – A flooring function for numerical stability. This function is expected to receive (n_channels, n_bins, n_frames) and return (n_channels, n_bins, n_frames). If you explicitly set
flooring_fn=None
, the identity function (lambda x: x
) is used. Default:partial(max_flooring, eps=1e-10)
.pair_selector (callable, optional) – Selector to choose updaing pair in
IP2
andISS2
. IfNone
is given,partial(sequential_pair_selector, sort=True)
is used. Default:None
.callbacks (callable or list[callable], optional) – Callback functions. Each function is called before separation and at each iteration. Default:
None
.permutation_alignment (bool) – If
permutation_alignment=True
, a permutation solver is used to align estimated spectrograms. Default:True
.scale_restoration (bool or str) – Technique to restore scale ambiguity. If
scale_restoration=True
, the projection back technique is applied to estimated spectrograms. You can also specifyprojection_back
orminimal_distortion_principle
. Default:True
.record_loss (bool) – Record the loss at each iteration of the demixing filter update if
record_loss=True
. Default:True
.reference_id (int) – Reference channel for projection back and minimal distortion principle. Default:
0
.
Examples
Update demixing filters by IP:
>>> n_channels, n_bins, n_frames = 2, 2049, 128 >>> spectrogram_mix = \ ... np.random.randn(n_channels, n_bins, n_frames) \ ... + 1j * np.random.randn(n_channels, n_bins, n_frames) >>> fdica = AuxLaplaceFDICA(spatial_algorithm="IP") >>> spectrogram_est = fdica(spectrogram_mix, n_iter=1000) >>> print(spectrogram_mix.shape, spectrogram_est.shape) (2, 2049, 128), (2, 2049, 128)
Update demixing filters by IP2:
>>> from ssspy.utils.select_pair import sequential_pair_selector >>> n_channels, n_bins, n_frames = 2, 2049, 128 >>> spectrogram_mix = \ ... np.random.randn(n_channels, n_bins, n_frames) \ ... + 1j * np.random.randn(n_channels, n_bins, n_frames) >>> fdica = AuxLaplaceFDICA( ... spatial_algorithm="IP2", ... pair_selector=sequential_pair_selector, ... ) >>> spectrogram_est = fdica(spectrogram_mix, n_iter=1000) >>> print(spectrogram_mix.shape, spectrogram_est.shape) (2, 2049, 128), (2, 2049, 128)