ssspy.bss.ilrma#
In this module, we separate multichannel signals using independent low-rank matrix analysis (ILRMA). We denote the number of sources and microphones as \(N\) and \(M\), respectively. We also denote short-time Fourier transforms of source, observed, and separated signals as \(\boldsymbol{s}_{ij}\), \(\boldsymbol{x}_{ij}\), and \(\boldsymbol{y}_{ij}\), respectively.
where \(i=1,\ldots,I\) and \(j=1,\ldots,J\) are indices of frequency bins and time frames, respectively. When a mixing system is time-invariant, \(\boldsymbol{x}_{ij}\) is represented as follows:
where \(\boldsymbol{A}_{i}=(\boldsymbol{a}_{i1},\ldots,\boldsymbol{a}_{in},\ldots,\boldsymbol{a}_{iN})\in\mathbb{C}^{M\times N}\) is a mixing matrix. If \(M=N\) and \(\boldsymbol{A}_{i}\) is non-singular, a demixing system is represented as
where \(\boldsymbol{W}_{i}=(\boldsymbol{w}_{i1},\ldots,\boldsymbol{w}_{in},\ldots,\boldsymbol{w}_{iN})^{\mathsf{H}}\in\mathbb{C}^{N\times M}\) is a demixing matrix. The negative log-likelihood of observed signals (divided by \(J\)) is computed as follows:
Algorithms#
- class ssspy.bss.ilrma.ILRMAbase(n_basis, partitioning=False, flooring_fn=functools.partial(<function max_flooring>, eps=1e-10), callbacks=None, scale_restoration=True, record_loss=True, reference_id=0, rng=None)#
Base class of independent low-rank matrix analysis (ILRMA).
- Parameters:
n_basis (int) – Number of NMF bases.
partitioning (bool) – Whether to use partioning function. Default:
False.flooring_fn (callable, optional) – A flooring function for numerical stability. This function is expected to return the same shape tensor as the input. If you explicitly set
flooring_fn=None, the identity function (lambda x: x) is used. Default:functools.partial(max_flooring, eps=1e-10).callbacks (callable or list[callable], optional) – Callback functions. Each function is called before separation and at each iteration. Default:
None.scale_restoration (bool or str) – Technique to restore scale ambiguity. If
scale_restoration=True, the projection back technique is applied to estimated spectrograms. You can also specifyprojection_backexplicitly. Default:True.record_loss (bool) – Record the loss at each iteration of the update algorithm if
record_loss=True. Default:True.reference_id (int) – Reference channel for projection back. Default:
0.rng (numpy.random.Generator, optioinal) – Random number generator. This is mainly used to randomly initialize NMF. If
Noneis given,np.random.default_rng()is used. Default:None.
- __call__(input, n_iter=100, initial_call=True, **kwargs)#
Separate a frequency-domain multichannel signal.
- Parameters:
input (numpy.ndarray) – The mixture signal in frequency-domain. The shape is (n_channels, n_bins, n_frames).
n_iter (int) – The number of iterations of demixing filter updates. Default:
100.initial_call (bool) – If
True, perform callbacks (and computation of loss if necessary) before iterations.
- Return type:
ndarray- Returns:
numpy.ndarray of the separated signal in frequency-domain. The shape is (n_channels, n_bins, n_frames).
- _init_nmf(rng=None)#
Initialize NMF.
- Parameters:
rng (numpy.random.Generator, optional) – Random number generator. If
Noneis given,np.random.default_rng()is used. Default:None.- Return type:
None
- apply_projection_back()#
Apply projection back technique to estimated spectrograms.
- Return type:
None
- compute_logdet(demix_filter)#
Compute log-determinant of demixing filter
- Parameters:
demix_filter (numpy.ndarray) – Demixing filters with shape of (n_bins, n_sources, n_channels).
- Return type:
ndarray- Returns:
numpy.ndarray of computed log-determinant values.
- compute_loss()#
Compute loss \(\mathcal{L}\).
- Return type:
float- Returns:
Computed loss.
- normalize()#
Normalize demixing filters and NMF parameters.
- Return type:
None
- normalize_by_power()#
Normalize demixing filters and NMF parameters by power.
Demixing filters are normalized by
\[\boldsymbol{w}_{in} \leftarrow\frac{\boldsymbol{w}_{in}}{\psi_{in}},\]where
\[\psi_{in} = \sqrt{\frac{1}{IJ}|\boldsymbol{w}_{in}^{\mathsf{H}} \boldsymbol{x}_{ij}|^{2}}.\]For NMF parameters,
\[\begin{split}t_{ik} &\leftarrow t_{ik}\sum_{n}\frac{z_{nk}}{\psi_{in}^{p}}, \\ z_{nk} &\leftarrow \frac{\frac{z_{nk}}{\psi_{in}^{p}}} {\sum_{n'}\frac{z_{n'k}}{\psi_{in'}^{p}}},\end{split}\]if
self.partitioning=True. Otherwise,\[t_{ikn} \leftarrow\frac{t_{ikn}}{\psi_{in}^{p}}.\]- Return type:
None
- normalize_by_projection_back()#
Normalize demixing filters and NMF parameters by projection back.
Demixing filters are normalized by
\[\boldsymbol{w}_{in} \leftarrow\frac{\boldsymbol{w}_{in}}{\psi_{in}},\]where
\[\boldsymbol{\psi}_{i} = \boldsymbol{W}_{i}^{-1}\boldsymbol{e}_{m_{\mathrm{ref}}}.\]For NMF parameters,
\[t_{ikn} \leftarrow\frac{t_{ikn}}{\psi_{in}^{p}}.\]- Return type:
None
- reconstruct_nmf(basis, activation, latent=None)#
Reconstruct NMF.
- Parameters:
basis (numpy.ndarray) – Basis matrix. The shape is (n_sources, n_basis, n_frames) if latent is given. Otherwise, (n_basis, n_frames).
activation (numpy.ndarray) – Activation matrix. The shape is (n_sources, n_bins, n_basis) if latent is given. Otherwise, (n_bins, n_basis).
latent (numpy.ndarray, optional) – Latent variable that determines number of bases per source.
- Return type:
ndarray- Returns:
numpy.ndarray of theconstructed NMF. The shape is (n_sources, n_bins, n_frames).
- restore_scale()#
Restore scale ambiguity.
If
self.scale_restoration="projection_back, we use projection back technique.- Return type:
None
- separate(input, demix_filter)#
Separate
inputusingdemixing_filter.\[\boldsymbol{y}_{ij} = \boldsymbol{W}_{i}\boldsymbol{x}_{ij}\]- Parameters:
input (numpy.ndarray) – The mixture signal in frequency-domain. The shape is (n_channels, n_bins, n_frames).
demix_filter (numpy.ndarray) – The demixing filters to separate
input. The shape is (n_bins, n_sources, n_channels).
- Return type:
ndarray- Returns:
numpy.ndarray of the separated signal in frequency-domain. The shape is (n_sources, n_bins, n_frames).
- update_once()#
Update demixing filters once.
- Return type:
None
- class ssspy.bss.ilrma.GaussILRMA(n_basis, spatial_algorithm='IP', domain=2, partitioning=False, flooring_fn=functools.partial(<function max_flooring>, eps=1e-10), pair_selector=None, callbacks=None, normalization=True, scale_restoration=True, record_loss=True, reference_id=0, rng=None)#
Independent low-rank matrix analysis (ILRMA) [1] on Gaussian distribution.
We assume \(y_{ijn}\) follows a Gaussian distribution.
\[p(y_{ijn}) = \frac{1}{\pi r_{ijn}}\exp\left(-\frac{|y_{ijn}|^{2}}{r_{ijn}}\right),\]where
\[r_{ijn} = \left(\sum_{k}z_{nk}t_{ik}v_{kj}\right)^{\frac{2}{p}},\]if
partitioning=True. Otherwise,\[r_{ijn} = \left(\sum_{k}t_{ikn}v_{kjn}\right)^{\frac{2}{p}}.\]- Parameters:
n_basis (int) – Number of NMF bases.
spatial_algorithm (str) – Algorithm for demixing filter updates. Choose
IP,IP1,IP2,ISS,ISS1, orISS2. Default:IP.domain (float) – Domain parameter. Default:
2.partitioning (bool) – Whether to use partioning function. Default:
False.flooring_fn (callable, optional) – A flooring function for numerical stability. This function is expected to return the same shape tensor as the input. If you explicitly set
flooring_fn=None, the identity function (lambda x: x) is used. Default:functools.partial(max_flooring, eps=1e-10).pair_selector (callable, optional) – Selector to choose updaing pair in
IP2andISS2. IfNoneis given,sequential_pair_selectoris used. Default:None.callbacks (callable or list[callable], optional) – Callback functions. Each function is called before separation and at each iteration. Default:
None.normalization (bool or str, optional) – Normalization of demixing filters and NMF parameters. Choose
powerorprojection_back. Default:power.scale_restoration (bool or str) – Technique to restore scale ambiguity. If
scale_restoration=True, the projection back technique is applied to estimated spectrograms. You can also specifyprojection_backorminimal_distortion_principle. Default:True.record_loss (bool) – Record the loss at each iteration of the update algorithm if
record_loss=True. Default:True.reference_id (int) – Reference channel for projection back and minimal distortion principle. Default:
0.rng (numpy.random.Generator, optioinal) – Random number generator. This is mainly used to randomly initialize NMF. If
Noneis given,np.random.default_rng()is used. Default:None.
Examples
Update demixing filters by IP:
>>> n_channels, n_bins, n_frames = 2, 2049, 128 >>> spectrogram_mix = np.random.randn(n_channels, n_bins, n_frames) \ ... + 1j * np.random.randn(n_channels, n_bins, n_frames) >>> ilrma = GaussILRMA( ... n_basis=2, ... spatial_algorithm="IP", ... rng=np.random.default_rng(42), ... ) >>> spectrogram_est = ilrma(spectrogram_mix, n_iter=100) >>> print(spectrogram_mix.shape, spectrogram_est.shape) (2, 2049, 128), (2, 2049, 128)
Update demixing filters by IP2:
>>> from ssspy.bss._select_pair import sequential_pair_selector >>> n_channels, n_bins, n_frames = 2, 2049, 128 >>> spectrogram_mix = np.random.randn(n_channels, n_bins, n_frames) \ ... + 1j * np.random.randn(n_channels, n_bins, n_frames) >>> ilrma = GaussILRMA( ... n_basis=2, ... spatial_algorithm="IP2", ... pair_selector=sequential_pair_selector, ... rng=np.random.default_rng(42), ... ) >>> spectrogram_est = ilrma(spectrogram_mix, n_iter=100) >>> print(spectrogram_mix.shape, spectrogram_est.shape) (2, 2049, 128), (2, 2049, 128)
Update demixing filters by ISS:
>>> n_channels, n_bins, n_frames = 2, 2049, 128 >>> spectrogram_mix = np.random.randn(n_channels, n_bins, n_frames) \ ... + 1j * np.random.randn(n_channels, n_bins, n_frames) >>> ilrma = GaussILRMA( ... n_basis=2, ... spatial_algorithm="ISS", ... rng=np.random.default_rng(42), ... ) >>> spectrogram_est = ilrma(spectrogram_mix, n_iter=100) >>> print(spectrogram_mix.shape, spectrogram_est.shape) (2, 2049, 128), (2, 2049, 128)
Update demixing filters by ISS2:
>>> import functools >>> from ssspy.bss._select_pair import sequential_pair_selector >>> n_channels, n_bins, n_frames = 2, 2049, 128 >>> spectrogram_mix = np.random.randn(n_channels, n_bins, n_frames) \ ... + 1j * np.random.randn(n_channels, n_bins, n_frames) >>> ilrma = GaussILRMA( ... n_basis=2, ... spatial_algorithm="ISS2", ... pair_selector=functools.partial(sequential_pair_selector, step=2), ... rng=np.random.default_rng(42), ... ) >>> spectrogram_est = ilrma(spectrogram_mix, n_iter=100) >>> print(spectrogram_mix.shape, spectrogram_est.shape) (2, 2049, 128), (2, 2049, 128)
- __call__(input, n_iter=100, initial_call=True, **kwargs)#
Separate a frequency-domain multichannel signal.
- Parameters:
input (numpy.ndarray) – The mixture signal in frequency-domain. The shape is (n_channels, n_bins, n_frames).
n_iter (int) – The number of iterations of demixing filter updates. Default:
100.initial_call (bool) – If
True, perform callbacks (and computation of loss if necessary) before iterations.
- Return type:
ndarray- Returns:
numpy.ndarray of the separated signal in frequency-domain. The shape is (n_channels, n_bins, n_frames).
- apply_projection_back()#
Apply projection back technique to estimated spectrograms.
- Return type:
None
- compute_loss()#
Compute loss \(\mathcal{L}\).
\(\mathcal{L}\) is given as follows:
\[\mathcal{L} = \frac{1}{J}\sum_{i,j,n}\left(\frac{|y_{ijn}|^{2}}{r_{ijn}} + \log r_{ijn}\right) - 2\sum_{i}\log|\det\boldsymbol{W}_{i}|,\]where
\[r_{ijn} = \left(\sum_{k}z_{nk}t_{ik}v_{kj}\right)^{\frac{2}{p}},\]if
partitioning=True. Otherwise\[r_{ijn} = \left(\sum_{k}t_{ikn}v_{kjn}\right)^{\frac{2}{p}}.\]- Return type:
float- Returns:
Computed loss.
- update_activation()#
Update NMF activations.
Update \(t_{ikn}\) as follows:
\[v_{kj} \leftarrow\left[\frac{\displaystyle\sum_{i,n}\frac{z_{nk}t_{ik}} {(\sum_{k'}z_{nk'}t_{ik'}v_{k'j})^{\frac{p+2}{p}}} |y_{ijn}|^{2}}{\displaystyle\sum_{i,n}\dfrac{z_{nk}t_{ik}}{\sum_{k'}z_{nk'}t_{ik'}v_{k'j}}} \right]^{\frac{p}{p+2}}v_{kj},\]if
partitioning=True. Otherwise\[v_{kjn} \leftarrow \left[\frac{\displaystyle\sum_{j} \dfrac{t_{ikn}}{(\sum_{k'}t_{ik'n}v_{k'jn})^{\frac{p+2}{p}}}|y_{ijn}|^{2}} {\displaystyle\sum_{i}\frac{t_{ikn}}{\sum_{k'}t_{ik'n}v_{k'jn}}} \right]^{\frac{p}{p+2}}v_{kjn}.\]- Return type:
None
- update_basis()#
Update NMF bases.
Update \(t_{ikn}\) as follows:
\[t_{ik} \leftarrow\left[ \frac{\displaystyle\sum_{j,n}\frac{z_{nk}v_{kj}} {(\sum_{k'}z_{nk'}t_{ik'}v_{k'j})^{\frac{p+2}{p}}} |y_{ijn}|^{2}}{\displaystyle\sum_{j,n} \dfrac{z_{nk}v_{kj}}{\sum_{k'}z_{nk'}t_{ik'}v_{k'j}}} \right]^{\frac{p}{p+2}}t_{ik},\]if
partitioning=True. Otherwise\[t_{ikn} \leftarrow \left[\frac{\displaystyle\sum_{j} \dfrac{v_{kjn}}{(\sum_{k'}t_{ik'n}v_{k'jn})^{\frac{p+2}{p}}}|y_{ijn}|^{2}} {\displaystyle\sum_{j}\frac{v_{kjn}}{\sum_{k'}t_{ik'n}v_{k'jn}}}\right] ^{\frac{p}{p+2}}t_{ikn}.\]- Return type:
None
- update_latent()#
Update latent variables in NMF.
Update \(t_{ikn}\) as follows:
\[\begin{split}z_{nk} &\leftarrow\left[\frac{\displaystyle\sum_{i,j}\frac{t_{ik}v_{kj}} {(\sum_{k'}z_{nk'}t_{ik'}v_{k'j})^{\frac{p+2}{p}}} |y_{ijn}|^{2}}{\displaystyle\sum_{i,j}\dfrac{t_{ik}v_{kj}}{\sum_{k'}z_{nk'}t_{ik'}v_{k'j}}} \right]^{\frac{p}{p+2}}z_{nk} \\ z_{nk} &\leftarrow\frac{z_{nk}}{\sum_{n'}z_{n'k}}.\end{split}\]- Return type:
None
- update_once()#
Update NMF parameters and demixing filters once.
- Return type:
None
- update_source_model()#
Update NMF bases, activations, and latent variables.
- Return type:
None
- update_spatial_model()#
Update demixing filters once.
If
spatial_algorithmisIPorIP1,update_spatial_model_ip1is called.If
spatial_algorithmisISSorISS1,update_spatial_model_iss1is called.If
spatial_algorithmisIP2,update_spatial_model_ip2is called.If
spatial_algorithmisISS2,update_spatial_model_iss2is called.
- Return type:
None
- update_spatial_model_ip1()#
Update demixing filters once using iterative projection.
Demixing filters are updated sequentially for \(n=1,\ldots,N\) as follows:
\[\begin{split}\boldsymbol{w}_{in} &\leftarrow\left(\boldsymbol{W}_{in}^{\mathsf{H}}\boldsymbol{U}_{in}\right)^{-1} \ \boldsymbol{e}_{n}, \\ \boldsymbol{w}_{in} &\leftarrow\frac{\boldsymbol{w}_{in}} {\sqrt{\boldsymbol{w}_{in}^{\mathsf{H}}\boldsymbol{U}_{in}\boldsymbol{w}_{in}}},\end{split}\]where
\[\boldsymbol{U}_{in} = \frac{1}{J}\sum_{j} \frac{1}{\left(\sum_{k}z_{nk}t_{ik}v_{kj}\right)^{\frac{2}{p}}} \boldsymbol{x}_{ij}\boldsymbol{x}_{ij}^{\mathsf{H}}\]if
partitioning=True, otherwise\[\boldsymbol{U}_{in} = \frac{1}{J}\sum_{j} \frac{1}{\left(\sum_{k}t_{ikn}v_{kjn}\right)^{\frac{2}{p}}} \boldsymbol{x}_{ij}\boldsymbol{x}_{ij}^{\mathsf{H}}.\]- Return type:
None
- update_spatial_model_ip2()#
Update demixing filters once using pairwise iterative projection following [2].
For \(n_{1}\) and \(n_{2}\) (\(n_{1}\neq n_{2}\)), compute weighted covariance matrix as follows:
\[\boldsymbol{U}_{in} = \frac{1}{J}\sum_{j} \frac{1}{r_{ijn}}\boldsymbol{x}_{ij}\boldsymbol{x}_{ij}^{\mathsf{H}},\]\(r_{ijn}\) is computed by
\[r_{ijn} = \left(\sum_{k}z_{nk}t_{ik}v_{kj}\right)^{\frac{2}{p}}\]if
partitioning=True. Otherwise,\[r_{ijn} = \left(\sum_{k}t_{ikn}v_{kjn}\right)^{\frac{2}{p}}.\]Using \(\boldsymbol{U}_{in_{1}}\) and \(\boldsymbol{U}_{in_{2}}\), we compute generalized eigenvectors.
\[\left({\boldsymbol{P}_{in_{1}}^{(n_{1},n_{2})}}^{\mathsf{H}}\boldsymbol{U}_{in_{1}} \boldsymbol{P}_{in_{1}}^{(n_{1},n_{2})}\right)\boldsymbol{h}_{i} = \lambda_{i} \left({\boldsymbol{P}_{in_{2}}^{(n_{1},n_{2})}}^{\mathsf{H}}\boldsymbol{U}_{in_{2}} \boldsymbol{P}_{in_{2}}^{(n_{1},n_{2})}\right)\boldsymbol{h}_{i},\]where
\[\begin{split}\boldsymbol{P}_{in_{1}}^{(n_{1},n_{2})} &= (\boldsymbol{W}_{i}\boldsymbol{U}_{in_{1}})^{-1} ( \begin{array}{cc} \boldsymbol{e}_{n_{1}} & \boldsymbol{e}_{n_{2}} \end{array} ), \\ \boldsymbol{P}_{in_{2}}^{(n_{1},n_{2})} &= (\boldsymbol{W}_{i}\boldsymbol{U}_{in_{2}})^{-1} ( \begin{array}{cc} \boldsymbol{e}_{n_{1}} & \boldsymbol{e}_{n_{2}} \end{array} ).\end{split}\]After that, we standardize two eigenvectors \(\boldsymbol{h}_{in_{1}}\) and \(\boldsymbol{h}_{in_{2}}\).
\[\begin{split}\boldsymbol{h}_{in_{1}} &\leftarrow\frac{\boldsymbol{h}_{in_{1}}} {\sqrt{\boldsymbol{h}_{in_{1}}^{\mathsf{H}} \left({\boldsymbol{P}_{in_{1}}^{(n_{1},n_{2})}}^{\mathsf{H}}\boldsymbol{U}_{in_{1}} \boldsymbol{P}_{in_{1}}^{(n_{1},n_{2})}\right) \boldsymbol{h}_{in_{1}}}}, \\ \boldsymbol{h}_{in_{2}} &\leftarrow\frac{\boldsymbol{h}_{in_{2}}} {\sqrt{\boldsymbol{h}_{in_{2}}^{\mathsf{H}} \left({\boldsymbol{P}_{in_{2}}^{(n_{1},n_{2})}}^{\mathsf{H}}\boldsymbol{U}_{in_{2}} \boldsymbol{P}_{in_{2}}^{(n_{1},n_{2})}\right) \boldsymbol{h}_{in_{2}}}}.\end{split}\]Then, update \(\boldsymbol{w}_{in_{1}}\) and \(\boldsymbol{w}_{in_{2}}\) simultaneously.
\[\begin{split}\boldsymbol{w}_{in_{1}} &\leftarrow \boldsymbol{P}_{in_{1}}^{(n_{1},n_{2})}\boldsymbol{h}_{in_{1}} \\ \boldsymbol{w}_{in_{2}} &\leftarrow \boldsymbol{P}_{in_{2}}^{(n_{1},n_{2})}\boldsymbol{h}_{in_{2}}\end{split}\]At each iteration, we update pairs of \(n_{1}\) and \(n_{1}\) for \(n_{1}\neq n_{2}\).
[2] T. Nakashima, R. Scheibler, Y. Wakabayashi, and N. Ono, “Faster independent low-rank matrix analysis with pairwise updates of demixing vectors,” in Proc. EUSIPCO, 2021, pp. 301-305.
- Return type:
None
- update_spatial_model_iss1()#
Update estimated spectrograms once using iterative source steering.
Update \(y_{ijn}\) as follows:
\[\begin{split}\boldsymbol{y}_{ij} & \leftarrow\boldsymbol{y}_{ij} - \boldsymbol{d}_{in}y_{ijn} \\ d_{inn'} &= \begin{cases} \dfrac{\displaystyle\sum_{j}\dfrac{1}{r_{ijn}} y_{ijn'}y_{ijn}^{*}}{\displaystyle\sum_{j}\dfrac{1} {r_{ijn}}|y_{ijn}|^{2}} & (n'\neq n) \\ 1 - \dfrac{1}{\sqrt{\displaystyle\dfrac{1}{J}\sum_{j}\dfrac{1} {r_{ijn}} |y_{ijn}|^{2}}} & (n'=n) \end{cases},\end{split}\]where
\[r_{ijn} = \left(\sum_{k}z_{nk}t_{ik}v_{kj}\right)^{\frac{2}{p}},\]if
partitioning=True. Otherwise\[r_{ijn} = \left(\sum_{k}t_{ikn}v_{kjn}\right)^{\frac{2}{p}}.\]- Return type:
None
- update_spatial_model_iss2()#
Update estimated spectrograms once using pairwise iterative source steering.
Compute \(\boldsymbol{G}_{in}^{(n_{1},n_{2})}\) and \(\boldsymbol{f}_{in}^{(n_{1},n_{2})}\) for \(n_{1}\neq n_{2}\):
\[\begin{split}\begin{array}{rclc} \boldsymbol{G}_{in}^{(n_{1},n_{2})} &=& {\displaystyle\frac{1}{J}\sum_{j}}\dfrac{1}{r_{ijn}} \boldsymbol{y}_{ij}^{(n_{1},n_{2})}{\boldsymbol{y}_{ij}^{(n_{1},n_{2})}}^{\mathsf{H}} &(n=1,\ldots,N), \\ \boldsymbol{f}_{in}^{(n_{1},n_{2})} &=& {\displaystyle\frac{1}{J}\sum_{j}} \dfrac{1}{r_{ijn}}y_{ijn}^{*}\boldsymbol{y}_{ij}^{(n_{1},n_{2})} &(n\neq n_{1},n_{2}), \end{array}\end{split}\]where
\[r_{ijn} = \left(\sum_{k}z_{nk}t_{ik}v_{kj}\right)^{\frac{2}{p}}\]if
partitioning=True. Otherwise,\[r_{ijn} = \left(\sum_{k}t_{ikn}v_{kjn}\right)^{\frac{2}{p}}.\]Using \(\boldsymbol{G}_{in}^{(n_{1},n_{2})}\) and \(\boldsymbol{f}_{in}^{(n_{1},n_{2})}\), we compute
\[\begin{split}\begin{array}{rclc} \boldsymbol{p}_{in} &=& \dfrac{\boldsymbol{h}_{in}} {\sqrt{\boldsymbol{h}_{in}^{\mathsf{H}}\boldsymbol{G}_{in}^{(n_{1},n_{2})} \boldsymbol{h}_{in}}} & (n=n_{1},n_{2}), \\ \boldsymbol{q}_{in} &=& -{\boldsymbol{G}_{in}^{(n_{1},n_{2})}}^{-1}\boldsymbol{f}_{in}^{(n_{1},n_{2})} & (n\neq n_{1},n_{2}), \end{array}\end{split}\]where \(\boldsymbol{h}_{in}\) (\(n=n_{1},n_{2}\)) is a generalized eigenvector obtained from
\[\boldsymbol{G}_{in_{1}}^{(n_{1},n_{2})}\boldsymbol{h}_{i} = \lambda_{i}\boldsymbol{G}_{in_{2}}^{(n_{1},n_{2})}\boldsymbol{h}_{i}.\]Separated signal \(y_{ijn}\) is updated as follows:
\[\begin{split}y_{ijn} &\leftarrow\begin{cases} &\boldsymbol{p}_{in}^{\mathsf{H}}\boldsymbol{y}_{ij}^{(n_{1},n_{2})} & (n=n_{1},n_{2}) \\ &\boldsymbol{q}_{in}^{\mathsf{H}}\boldsymbol{y}_{ij}^{(n_{1},n_{2})} + y_{ijn} & (n\neq n_{1},n_{2}) \end{cases}.\end{split}\]- Return type:
None
- class ssspy.bss.ilrma.TILRMA(n_basis, dof, spatial_algorithm='IP', domain=2, partitioning=False, flooring_fn=functools.partial(<function max_flooring>, eps=1e-10), pair_selector=None, callbacks=None, normalization=True, scale_restoration=True, record_loss=True, reference_id=0, rng=None)#
Independent low-rank matrix analysis (ILRMA) on Student’s t distribution.
We assume \(y_{ijn}\) follows a Student’s t distribution.
\[p(y_{ijn}) = \frac{1}{\pi r_{ijn}} \left(1+\frac{2}{\nu}\frac{|y_{ijn}|^{2}}{r_{ijn}}\right)^{-\frac{2+\nu}{2}},\]where
\[r_{ijn} = \left(\sum_{k}z_{nk}t_{ik}v_{kj}\right)^{\frac{2}{p}},\]if
partitioning=True. Otherwise,\[r_{ijn} = \left(\sum_{k}t_{ikn}v_{kjn}\right)^{\frac{2}{p}}.\]\(\nu\) is a degree of freedom parameter.
- Parameters:
n_basis (int) – Number of NMF bases.
dof (float) – Degree of freedom parameter in student’s-t distribution.
spatial_algorithm (str) – Algorithm for demixing filter updates. Choose
IP,IP1,IP2,ISS,ISS1, orISS2. Default:IP.domain (float) – Domain parameter. Default:
2.partitioning (bool) – Whether to use partioning function. Default:
False.flooring_fn (callable, optional) – A flooring function for numerical stability. This function is expected to return the same shape tensor as the input. If you explicitly set
flooring_fn=None, the identity function (lambda x: x) is used. Default:functools.partial(max_flooring, eps=1e-10).pair_selector (callable, optional) – Selector to choose updaing pair in
IP2andISS2. IfNoneis given,sequential_pair_selectoris used. Default:None.callbacks (callable or list[callable], optional) – Callback functions. Each function is called before separation and at each iteration. Default:
None.normalization (bool or str, optional) – Normalization of demixing filters and NMF parameters. Choose
powerorprojection_back. Default:power.scale_restoration (bool or str) – Technique to restore scale ambiguity. If
scale_restoration=True, the projection back technique is applied to estimated spectrograms. You can also specifyprojection_backorminimal_distortion_principle. Default:True.record_loss (bool) – Record the loss at each iteration of the update algorithm if
record_loss=True. Default:True.reference_id (int) – Reference channel for projection back and minimal distortion principle. Default:
0.rng (numpy.random.Generator, optioinal) – Random number generator. This is mainly used to randomly initialize NMF. If
Noneis given,np.random.default_rng()is used. Default:None.
Examples
Update demixing filters by IP:
>>> n_channels, n_bins, n_frames = 2, 2049, 128 >>> spectrogram_mix = np.random.randn(n_channels, n_bins, n_frames) \ ... + 1j * np.random.randn(n_channels, n_bins, n_frames) >>> ilrma = TILRMA( ... n_basis=2, ... dof=1000, ... spatial_algorithm="IP", ... rng=np.random.default_rng(42), ... ) >>> spectrogram_est = ilrma(spectrogram_mix, n_iter=100) >>> print(spectrogram_mix.shape, spectrogram_est.shape) (2, 2049, 128), (2, 2049, 128)
Update demixing filters by IP2:
>>> from ssspy.bss._select_pair import sequential_pair_selector >>> n_channels, n_bins, n_frames = 2, 2049, 128 >>> spectrogram_mix = np.random.randn(n_channels, n_bins, n_frames) \ ... + 1j * np.random.randn(n_channels, n_bins, n_frames) >>> ilrma = TILRMA( ... n_basis=2, ... dof=1000, ... spatial_algorithm="IP2", ... pair_selector=sequential_pair_selector, ... rng=np.random.default_rng(42), ... ) >>> spectrogram_est = ilrma(spectrogram_mix, n_iter=100) >>> print(spectrogram_mix.shape, spectrogram_est.shape) (2, 2049, 128), (2, 2049, 128)
Update demixing filters by ISS:
>>> n_channels, n_bins, n_frames = 2, 2049, 128 >>> spectrogram_mix = np.random.randn(n_channels, n_bins, n_frames) \ ... + 1j * np.random.randn(n_channels, n_bins, n_frames) >>> ilrma = TILRMA( ... n_basis=2, ... dof=1000, ... spatial_algorithm="ISS", ... rng=np.random.default_rng(42), ... ) >>> spectrogram_est = ilrma(spectrogram_mix, n_iter=100) >>> print(spectrogram_mix.shape, spectrogram_est.shape) (2, 2049, 128), (2, 2049, 128)
Update demixing filters by ISS2:
>>> import functools >>> from ssspy.bss._select_pair import sequential_pair_selector >>> n_channels, n_bins, n_frames = 2, 2049, 128 >>> spectrogram_mix = np.random.randn(n_channels, n_bins, n_frames) \ ... + 1j * np.random.randn(n_channels, n_bins, n_frames) >>> ilrma = TILRMA( ... n_basis=2, ... dof=1000, ... spatial_algorithm="ISS2", ... pair_selector=functools.partial(sequential_pair_selector, step=2), ... rng=np.random.default_rng(42), ... ) >>> spectrogram_est = ilrma(spectrogram_mix, n_iter=100) >>> print(spectrogram_mix.shape, spectrogram_est.shape) (2, 2049, 128), (2, 2049, 128)
- __call__(input, n_iter=100, initial_call=True, **kwargs)#
Separate a frequency-domain multichannel signal.
- Parameters:
input (numpy.ndarray) – The mixture signal in frequency-domain. The shape is (n_channels, n_bins, n_frames).
n_iter (int) – The number of iterations of demixing filter updates. Default:
100.initial_call (bool) – If
True, perform callbacks (and computation of loss if necessary) before iterations.
- Return type:
ndarray- Returns:
numpy.ndarray of the separated signal in frequency-domain. The shape is (n_channels, n_bins, n_frames).
- apply_projection_back()#
Apply projection back technique to estimated spectrograms.
- Return type:
None
- compute_loss()#
Compute loss \(\mathcal{L}\).
\(\mathcal{L}\) is given as follows:
\[\mathcal{L} = \frac{1}{J}\sum_{i,j} \left\{1+\frac{\nu}{2}\log\left(1+\frac{2}{\nu} \frac{|y_{ijn}|^{2}}{r_{ijn}}\right) + \log r_{ijn}\right\} -2\sum_{i}\log\left|\det\boldsymbol{W}_{i}\right|,\]where
\[r_{ijn} = \left(\sum_{k}z_{nk}t_{ik}v_{kj}\right)^{\frac{2}{p}},\]if
partitioning=True, otherwise\[r_{ijn} = \left(\sum_{k}t_{ikn}v_{kjn}\right)^{\frac{2}{p}}.\]- Return type:
float- Returns:
Computed loss.
- update_activation()#
Update NMF activations.
Update \(t_{ikn}\) as follows:
\[\begin{split}v_{kj} &\leftarrow\left[\frac{\displaystyle\sum_{i,n}\frac{z_{nk}t_{ik}} {\tilde{r}_{ijn}\sum_{k'}z_{nk'}t_{ik'}v_{k'j}} |y_{ijn}|^{2}}{\displaystyle\sum_{i,n}\dfrac{z_{nk}t_{ik}}{\sum_{k'}z_{nk'}t_{ik'}v_{k'j}}} \right]^{\frac{p}{p+2}}v_{kj}, \\ \tilde{r}_{ijn} &= \frac{\nu}{\nu+2}\left(\sum_{k}z_{nk}t_{ik}v_{kj}\right)^{\frac{2}{p}} + \frac{2}{\nu+2}|y_{ijn}|^{2},\end{split}\]if
partitioning=True. Otherwise\[\begin{split}v_{kjn} &\leftarrow \left[\frac{\displaystyle\sum_{j} \dfrac{t_{ikn}}{\tilde{r}_{ijn}\sum_{k'}t_{ik'n}v_{k'jn}}|y_{ijn}|^{2}} {\displaystyle\sum_{i}\frac{t_{ikn}}{\sum_{k'}t_{ik'n}v_{k'jn}}} \right]^{\frac{p}{p+2}}v_{kjn}, \\ \tilde{r}_{ijn} &= \frac{\nu}{\nu+2}\left(\sum_{k}t_{ikn}v_{kjn}\right)^{\frac{2}{p}} + \frac{2}{\nu+2}|y_{ijn}|^{2}.\end{split}\]- Return type:
None
- update_basis()#
Update NMF bases.
Update \(t_{ikn}\) as follows:
\[\begin{split}t_{ik} &\leftarrow\left[ \frac{\displaystyle\sum_{j,n}\frac{z_{nk}v_{kj}} {\tilde{r}_{ijn}\sum_{k'}z_{nk'}t_{ik'}v_{k'j}} |y_{ijn}|^{2}}{\displaystyle\sum_{j,n} \dfrac{z_{nk}v_{kj}}{\sum_{k'}z_{nk'}t_{ik'}v_{k'j}}} \right]^{\frac{p}{p+2}}t_{ik}, \\ \tilde{r}_{ijn} &= \frac{\nu}{\nu+2}\left(\sum_{k}z_{nk}t_{ik}v_{kj}\right)^{\frac{2}{p}} + \frac{2}{\nu+2}|y_{ijn}|^{2},\end{split}\]if
partitioning=True. Otherwise\[\begin{split}t_{ikn} &\leftarrow \left[\frac{\displaystyle\sum_{j} \dfrac{v_{kjn}}{\tilde{r}_{ijn}\sum_{k'}t_{ik'n}v_{k'jn}}|y_{ijn}|^{2}} {\displaystyle\sum_{j}\frac{v_{kjn}}{\sum_{k'}t_{ik'n}v_{k'jn}}}\right] ^{\frac{p}{p+2}}t_{ikn}, \\ \tilde{r}_{ijn} &= \frac{\nu}{\nu+2}\left(\sum_{k}t_{ikn}v_{kjn}\right)^{\frac{2}{p}} + \frac{2}{\nu+2}|y_{ijn}|^{2}.\end{split}\]- Return type:
None
- update_latent()#
Update latent variables in NMF.
Update \(t_{ikn}\) as follows:
\[\begin{split}z_{nk} &\leftarrow\left[\frac{\displaystyle\sum_{i,j}\frac{t_{ik}v_{kj}} {\tilde{r}_{ijn}\sum_{k'}z_{nk'}t_{ik'}v_{k'j}} |y_{ijn}|^{2}}{\displaystyle\sum_{i,j}\dfrac{t_{ik}v_{kj}}{\sum_{k'}z_{nk'}t_{ik'}v_{k'j}}} \right]^{\frac{p}{p+2}}z_{nk} \\ z_{nk} &\leftarrow\frac{z_{nk}}{\sum_{n'}z_{n'k}}, \\ \tilde{r}_{ijn} &= \frac{\nu}{\nu+2}\left(\sum_{k}z_{nk}t_{ik}v_{kj}\right)^{\frac{2}{p}} + \frac{2}{\nu+2}|y_{ijn}|^{2}.\end{split}\]- Return type:
None
- update_once()#
Update NMF parameters and demixing filters once.
- Return type:
None
- update_source_model()#
Update NMF bases, activations, and latent variables.
- Return type:
None
- update_spatial_model()#
Update demixing filters once.
If
spatial_algorithmisIPorIP1,update_spatial_model_ip1is called.If
spatial_algorithmisISSorISS1,update_spatial_model_iss1is called.If
spatial_algorithmisIP2,update_spatial_model_ip2is called.If
spatial_algorithmisISS2,update_spatial_model_iss2is called.
- Return type:
None
- update_spatial_model_ip1()#
Update demixing filters once using iterative projection.
Demixing filters are updated sequentially for \(n=1,\ldots,N\) as follows:
\[\begin{split}\boldsymbol{w}_{in} &\leftarrow\left(\boldsymbol{W}_{in}^{\mathsf{H}}\boldsymbol{U}_{in}\right)^{-1} \ \boldsymbol{e}_{n}, \\ \boldsymbol{w}_{in} &\leftarrow\frac{\boldsymbol{w}_{in}} {\sqrt{\boldsymbol{w}_{in}^{\mathsf{H}}\boldsymbol{U}_{in}\boldsymbol{w}_{in}}},\end{split}\]where
\[\boldsymbol{U}_{in} = \frac{1}{J}\sum_{j} \frac{1}{\tilde{r}_{ijn}}\boldsymbol{x}_{ij}\boldsymbol{x}_{ij}^{\mathsf{H}}.\]\(\tilde{r}_{ijn}\) is defined as
\[\tilde{r}_{ijn} = \frac{\nu}{\nu+2}\left(\sum_{k}z_{nk}t_{ik}v_{kj}\right)^{\frac{2}{p}} + \frac{2}{\nu+2}|y_{ijn}|^{2},\]if
partitioning=True. Otherwise\[\tilde{r}_{ijn} = \frac{\nu}{\nu+2}\left(\sum_{k}t_{ikn}v_{kjn}\right)^{\frac{2}{p}} + \frac{2}{\nu+2}|y_{ijn}|^{2}.\]- Return type:
None
- update_spatial_model_ip2()#
Update demixing filters once using pairwise iterative projection.
For \(n_{1}\) and \(n_{2}\) (\(n_{1}\neq n_{2}\)), compute weighted covariance matrix as follows:
\[\boldsymbol{U}_{in} = \frac{1}{J}\sum_{j} \frac{1}{\tilde{r}_{ijn}}\boldsymbol{x}_{ij}\boldsymbol{x}_{ij}^{\mathsf{H}},\]\(\tilde{r}_{ijn}\) is computed by
\[\tilde{r}_{ijn} = \frac{\nu}{\nu+2}\left(\sum_{k}z_{nk}t_{ik}v_{kj}\right)^{\frac{2}{p}} + \frac{2}{\nu+2}|y_{ijn}|^{2},\]if
partitioning=True. Otherwise,\[\tilde{r}_{ijn} = \frac{\nu}{\nu+2}\left(\sum_{k}t_{ikn}v_{kjn}\right)^{\frac{2}{p}} + \frac{2}{\nu+2}|y_{ijn}|^{2}.\]Using \(\boldsymbol{U}_{in_{1}}\) and \(\boldsymbol{U}_{in_{2}}\), we compute generalized eigenvectors.
\[\left({\boldsymbol{P}_{in_{1}}^{(n_{1},n_{2})}}^{\mathsf{H}}\boldsymbol{U}_{in_{1}} \boldsymbol{P}_{in_{1}}^{(n_{1},n_{2})}\right)\boldsymbol{h}_{i} = \lambda_{i} \left({\boldsymbol{P}_{in_{2}}^{(n_{1},n_{2})}}^{\mathsf{H}}\boldsymbol{U}_{in_{2}} \boldsymbol{P}_{in_{2}}^{(n_{1},n_{2})}\right)\boldsymbol{h}_{i},\]where
\[\begin{split}\boldsymbol{P}_{in_{1}}^{(n_{1},n_{2})} &= (\boldsymbol{W}_{i}\boldsymbol{U}_{in_{1}})^{-1} ( \begin{array}{cc} \boldsymbol{e}_{n_{1}} & \boldsymbol{e}_{n_{2}} \end{array} ), \\ \boldsymbol{P}_{in_{2}}^{(n_{1},n_{2})} &= (\boldsymbol{W}_{i}\boldsymbol{U}_{in_{2}})^{-1} ( \begin{array}{cc} \boldsymbol{e}_{n_{1}} & \boldsymbol{e}_{n_{2}} \end{array} ).\end{split}\]After that, we standardize two eigenvectors \(\boldsymbol{h}_{in_{1}}\) and \(\boldsymbol{h}_{in_{2}}\).
\[\begin{split}\boldsymbol{h}_{in_{1}} &\leftarrow\frac{\boldsymbol{h}_{in_{1}}} {\sqrt{\boldsymbol{h}_{in_{1}}^{\mathsf{H}} \left({\boldsymbol{P}_{in_{1}}^{(n_{1},n_{2})}}^{\mathsf{H}}\boldsymbol{U}_{in_{1}} \boldsymbol{P}_{in_{1}}^{(n_{1},n_{2})}\right) \boldsymbol{h}_{in_{1}}}}, \\ \boldsymbol{h}_{in_{2}} &\leftarrow\frac{\boldsymbol{h}_{in_{2}}} {\sqrt{\boldsymbol{h}_{in_{2}}^{\mathsf{H}} \left({\boldsymbol{P}_{in_{2}}^{(n_{1},n_{2})}}^{\mathsf{H}}\boldsymbol{U}_{in_{2}} \boldsymbol{P}_{in_{2}}^{(n_{1},n_{2})}\right) \boldsymbol{h}_{in_{2}}}}.\end{split}\]Then, update \(\boldsymbol{w}_{in_{1}}\) and \(\boldsymbol{w}_{in_{2}}\) simultaneously.
\[\begin{split}\boldsymbol{w}_{in_{1}} &\leftarrow \boldsymbol{P}_{in_{1}}^{(n_{1},n_{2})}\boldsymbol{h}_{in_{1}} \\ \boldsymbol{w}_{in_{2}} &\leftarrow \boldsymbol{P}_{in_{2}}^{(n_{1},n_{2})}\boldsymbol{h}_{in_{2}}\end{split}\]At each iteration, we update pairs of \(n_{1}\) and \(n_{1}\) for \(n_{1}\neq n_{2}\).
- Return type:
None
- update_spatial_model_iss1()#
Update estimated spectrograms once using iterative source steering.
Update \(y_{ijn}\) as follows:
\[\begin{split}\boldsymbol{y}_{ij} & \leftarrow\boldsymbol{y}_{ij} - \boldsymbol{d}_{in}y_{ijn} \\ d_{inn'} &= \begin{cases} \dfrac{\displaystyle\sum_{j}\dfrac{1}{\tilde{r}_{ijn}} y_{ijn'}y_{ijn}^{*}}{\displaystyle\sum_{j}\dfrac{1} {\tilde{r}_{ijn}}|y_{ijn}|^{2}} & (n'\neq n) \\ 1 - \dfrac{1}{\sqrt{\displaystyle\dfrac{1}{J}\sum_{j}\dfrac{1} {\tilde{r}_{ijn}}|y_{ijn}|^{2}}} & (n'=n) \end{cases}.\end{split}\]\(\tilde{r}_{ijn}\) is defined as
\[\tilde{r}_{ijn} = \frac{\nu}{\nu+2}\left(\sum_{k}z_{nk}t_{ik}v_{kj}\right)^{\frac{2}{p}} + \frac{2}{\nu+2}|y_{ijn}|^{2},\]if
partitioning=True. Otherwise\[\tilde{r}_{ijn} = \frac{\nu}{\nu+2}\left(\sum_{k}t_{ikn}v_{kjn}\right)^{\frac{2}{p}} + \frac{2}{\nu+2}|y_{ijn}|^{2}.\]- Return type:
None
- update_spatial_model_iss2()#
Update estimated spectrograms once using pairwise iterative source steering.
Compute \(\boldsymbol{G}_{in}^{(n_{1},n_{2})}\) and \(\boldsymbol{f}_{in}^{(n_{1},n_{2})}\) for \(n_{1}\neq n_{2}\):
\[\begin{split}\begin{array}{rclc} \boldsymbol{G}_{in}^{(n_{1},n_{2})} &=& {\displaystyle\frac{1}{J}\sum_{j}}\dfrac{1}{\tilde{r}_{ijn}} \boldsymbol{y}_{ij}^{(n_{1},n_{2})}{\boldsymbol{y}_{ij}^{(n_{1},n_{2})}}^{\mathsf{H}} &(n=1,\ldots,N), \\ \boldsymbol{f}_{in}^{(n_{1},n_{2})} &=& {\displaystyle\frac{1}{J}\sum_{j}} \dfrac{1}{\tilde{r}_{ijn}}y_{ijn}^{*}\boldsymbol{y}_{ij}^{(n_{1},n_{2})} &(n\neq n_{1},n_{2}), \end{array}\end{split}\]where
\[\tilde{r}_{ijn} = \frac{\nu}{\nu+2}\left(\sum_{k}z_{nk}t_{ik}v_{kj}\right)^{\frac{2}{p}} + \frac{2}{\nu+2}|y_{ijn}|^{2}\]if
partitioning=True. Otherwise,\[\tilde{r}_{ijn} = \frac{\nu}{\nu+2}\left(\sum_{k}t_{ikn}v_{kjn}\right)^{\frac{2}{p}} + \frac{2}{\nu+2}|y_{ijn}|^{2}.\]Using \(\boldsymbol{G}_{in}^{(n_{1},n_{2})}\) and \(\boldsymbol{f}_{in}^{(n_{1},n_{2})}\), we compute
\[\begin{split}\begin{array}{rclc} \boldsymbol{p}_{in} &=& \dfrac{\boldsymbol{h}_{in}} {\sqrt{\boldsymbol{h}_{in}^{\mathsf{H}}\boldsymbol{G}_{in}^{(n_{1},n_{2})} \boldsymbol{h}_{in}}} & (n=n_{1},n_{2}), \\ \boldsymbol{q}_{in} &=& -{\boldsymbol{G}_{in}^{(n_{1},n_{2})}}^{-1}\boldsymbol{f}_{in}^{(n_{1},n_{2})} & (n\neq n_{1},n_{2}), \end{array}\end{split}\]where \(\boldsymbol{h}_{in}\) (\(n=n_{1},n_{2}\)) is a generalized eigenvector obtained from
\[\boldsymbol{G}_{in_{1}}^{(n_{1},n_{2})}\boldsymbol{h}_{i} = \lambda_{i}\boldsymbol{G}_{in_{2}}^{(n_{1},n_{2})}\boldsymbol{h}_{i}.\]Separated signal \(y_{ijn}\) is updated as follows:
\[\begin{split}y_{ijn} &\leftarrow\begin{cases} &\boldsymbol{p}_{in}^{\mathsf{H}}\boldsymbol{y}_{ij}^{(n_{1},n_{2})} & (n=n_{1},n_{2}) \\ &\boldsymbol{q}_{in}^{\mathsf{H}}\boldsymbol{y}_{ij}^{(n_{1},n_{2})} + y_{ijn} & (n\neq n_{1},n_{2}) \end{cases}.\end{split}\]- Return type:
None
- class ssspy.bss.ilrma.GGDILRMA(n_basis, beta, spatial_algorithm='IP', domain=2, partitioning=False, flooring_fn=functools.partial(<function max_flooring>, eps=1e-10), pair_selector=None, callbacks=None, normalization=True, scale_restoration=True, record_loss=True, reference_id=0, rng=None)#
Independent low-rank matrix analysis (ILRMA) on a generalized Gaussian distribution.
We assume \(y_{ijn}\) follows a generalized Gaussian distribution.
\[p(y_{ijn}) = \frac{\beta}{2\pi r_{ijn}\Gamma\left(\frac{2}{\beta}\right)} \exp\left\{-\left(\frac{|y_{ijn}|^{2}}{r_{ijn}}\right)^{\frac{\beta}{2}}\right\},\]where
\[r_{ijn} = \left(\sum_{k}z_{nk}t_{ik}v_{kj}\right)^{\frac{2}{p}},\]if
partitioning=True. Otherwise,\[r_{ijn} = \left(\sum_{k}t_{ikn}v_{kjn}\right)^{\frac{2}{p}}.\]\(\beta\) is a shape parameter of a generalized Gaussian distribution.
- Parameters:
n_basis (int) – Number of NMF bases.
beta (float) – Shape parameter in generalized Gaussian distribution.
spatial_algorithm (str) – Algorithm for demixing filter updates. Choose
IP,IP1,IP2,ISS,ISS1, orISS2. Default:IP.domain (float) – Domain parameter. Default:
2.partitioning (bool) – Whether to use partioning function. Default:
False.flooring_fn (callable, optional) – A flooring function for numerical stability. This function is expected to return the same shape tensor as the input. If you explicitly set
flooring_fn=None, the identity function (lambda x: x) is used. Default:functools.partial(max_flooring, eps=1e-10).pair_selector (callable, optional) – Selector to choose updaing pair in
IP2andISS2. IfNoneis given,sequential_pair_selectoris used. Default:None.callbacks (callable or list[callable], optional) – Callback functions. Each function is called before separation and at each iteration. Default:
None.normalization (bool or str, optional) – Normalization of demixing filters and NMF parameters. Choose
powerorprojection_back. Default:power.scale_restoration (bool or str) – Technique to restore scale ambiguity. If
scale_restoration=True, the projection back technique is applied to estimated spectrograms. You can also specifyprojection_backorminimal_distortion_principle. Default:True.record_loss (bool) – Record the loss at each iteration of the update algorithm if
record_loss=True. Default:True.reference_id (int) – Reference channel for projection back and minimal distortion principle. Default:
0.rng (numpy.random.Generator, optioinal) – Random number generator. This is mainly used to randomly initialize NMF. If
Noneis given,np.random.default_rng()is used. Default:None.
Examples
Update demixing filters by IP:
>>> n_channels, n_bins, n_frames = 2, 2049, 128 >>> spectrogram_mix = np.random.randn(n_channels, n_bins, n_frames) \ ... + 1j * np.random.randn(n_channels, n_bins, n_frames) >>> ilrma = GGDILRMA( ... n_basis=2, ... beta=1.99, ... spatial_algorithm="IP", ... rng=np.random.default_rng(42), ... ) >>> spectrogram_est = ilrma(spectrogram_mix, n_iter=100) >>> print(spectrogram_mix.shape, spectrogram_est.shape) (2, 2049, 128), (2, 2049, 128)
Update demixing filters by IP2:
>>> from ssspy.bss._select_pair import sequential_pair_selector >>> n_channels, n_bins, n_frames = 2, 2049, 128 >>> spectrogram_mix = np.random.randn(n_channels, n_bins, n_frames) \ ... + 1j * np.random.randn(n_channels, n_bins, n_frames) >>> ilrma = GGDILRMA( ... n_basis=2, ... beta=1.99, ... spatial_algorithm="IP2", ... pair_selector=sequential_pair_selector, ... rng=np.random.default_rng(42), ... ) >>> spectrogram_est = ilrma(spectrogram_mix, n_iter=100) >>> print(spectrogram_mix.shape, spectrogram_est.shape) (2, 2049, 128), (2, 2049, 128)
Update demixing filters by ISS:
>>> n_channels, n_bins, n_frames = 2, 2049, 128 >>> spectrogram_mix = np.random.randn(n_channels, n_bins, n_frames) \ ... + 1j * np.random.randn(n_channels, n_bins, n_frames) >>> ilrma = GGDILRMA( ... n_basis=2, ... beta=1.99, ... spatial_algorithm="ISS", ... rng=np.random.default_rng(42), ... ) >>> spectrogram_est = ilrma(spectrogram_mix, n_iter=100) >>> print(spectrogram_mix.shape, spectrogram_est.shape) (2, 2049, 128), (2, 2049, 128)
Update demixing filters by ISS2:
>>> import functools >>> from ssspy.bss._select_pair import sequential_pair_selector >>> n_channels, n_bins, n_frames = 2, 2049, 128 >>> spectrogram_mix = np.random.randn(n_channels, n_bins, n_frames) \ ... + 1j * np.random.randn(n_channels, n_bins, n_frames) >>> ilrma = GGDILRMA( ... n_basis=2, ... beta=1.99, ... spatial_algorithm="ISS2", ... pair_selector=functools.partial(sequential_pair_selector, step=2), ... rng=np.random.default_rng(42), ... ) >>> spectrogram_est = ilrma(spectrogram_mix, n_iter=100) >>> print(spectrogram_mix.shape, spectrogram_est.shape) (2, 2049, 128), (2, 2049, 128)
- __call__(input, n_iter=100, initial_call=True, **kwargs)#
Separate a frequency-domain multichannel signal.
- Parameters:
input (numpy.ndarray) – The mixture signal in frequency-domain. The shape is (n_channels, n_bins, n_frames).
n_iter (int) – The number of iterations of demixing filter updates. Default:
100.initial_call (bool) – If
True, perform callbacks (and computation of loss if necessary) before iterations.
- Return type:
ndarray- Returns:
numpy.ndarray of the separated signal in frequency-domain. The shape is (n_channels, n_bins, n_frames).
- apply_projection_back()#
Apply projection back technique to estimated spectrograms.
- Return type:
None
- compute_loss()#
Compute loss \(\mathcal{L}\).
\(\mathcal{L}\) is given as follows:
\[\mathcal{L} = \frac{1}{J}\sum_{i,j,n} \left\{\left(\frac{|y_{ijn}|^{2}}{r_{ijn}}\right)^{\frac{\beta}{2}} + \log r_{ijn}\right\} - 2\sum_{i}\log|\det\boldsymbol{W}_{i}|,\]where
\[r_{ijn} = \left(\sum_{k}z_{nk}t_{ik}v_{kj}\right)^{\frac{2}{p}},\]if
partitioning=True. Otherwise\[r_{ijn} = \left(\sum_{k}t_{ikn}v_{kjn}\right)^{\frac{2}{p}}.\]- Return type:
float- Returns:
Computed loss.
- update_activation()#
Update NMF activations.
Update \(v_{kjn}\) as follows:
\[v_{kj} \leftarrow\left[ \frac{\beta}{2} \frac{\displaystyle\sum_{i,n}\frac{z_{nk}t_{ik}} {(\sum_{k'}z_{nk'}t_{ik'}v_{k'j})^{\frac{\beta+p}{p}}}|y_{ijn}|^{\beta}} {\displaystyle\sum_{i,n}\frac{z_{nk}t_{ik}}{\sum_{k'}z_{nk'}t_{ik'}v_{k'j}}} \right]^{\frac{p}{\beta+p}}v_{kj},\]if
partitioning=True. Otherwise\[v_{kj} \leftarrow\left[ \frac{\beta}{2} \frac{\displaystyle\sum_{i}\frac{t_{ikn}} {(\sum_{k'}t_{ik'n}v_{k'jn})^{\frac{\beta+p}{p}}}|y_{ijn}|^{\beta}} {\displaystyle\sum_{i}\frac{t_{ik}}{\sum_{k'}t_{ik'n}v_{k'jn}}} \right]^{\frac{p}{\beta+p}}v_{kjn}.\]- Return type:
None
- update_basis()#
Update NMF bases.
Update \(t_{ikn}\) as follows:
\[t_{ik} \leftarrow\left[ \frac{\beta}{2} \frac{\displaystyle\sum_{j,n}\frac{z_{nk}v_{kj}} {(\sum_{k'}z_{nk'}t_{ik'}v_{k'j})^{\frac{\beta+p}{p}}}|y_{ijn}|^{\beta}} {\displaystyle\sum_{j,n}\frac{z_{nk}v_{kj}}{\sum_{k'}z_{nk'}t_{ik'}v_{k'j}}} \right]^{\frac{p}{\beta+p}}t_{ik},\]if
partitioning=True. Otherwise\[t_{ikn} \leftarrow\left[ \frac{\beta}{2} \frac{\displaystyle\sum_{j}\frac{v_{kjn}} {(\sum_{k'}t_{ik'n}v_{k'jn})^{\frac{\beta+p}{p}}}|y_{ijn}|^{\beta}} {\displaystyle\sum_{j}\frac{v_{kjn}}{\sum_{k'}t_{ik'n}v_{k'jn}}} \right]^{\frac{p}{\beta+p}}t_{ikn}.\]- Return type:
None
- update_latent()#
Update latent variables in NMF.
Update \(t_{ikn}\) as follows:
\[\begin{split}z_{nk} &\leftarrow\left[ \frac{\beta}{2} \frac{\displaystyle\sum_{i,j}\frac{t_{ik}v_{kj}} {(\sum_{k'}z_{nk'}t_{ik'}v_{k'j})^{\frac{\beta+p}{2}}}|y_{ijn}|^{\beta}} {\displaystyle\sum_{i,j}\frac{t_{ik}v_{kj}}{\sum_{k'}z_{nk'}t_{ik'}v_{k'j}}} \right]^{\frac{p}{\beta+p}}z_{nk}, \\ z_{nk} &\leftarrow\frac{z_{nk}}{\displaystyle\sum_{n'}z_{n'k}}.\end{split}\]- Return type:
None
- update_once()#
Update NMF parameters and demixing filters once.
- Return type:
None
- update_source_model()#
Update NMF bases, activations, and latent variables.
- Return type:
None
- update_spatial_model()#
Update demixing filters once.
If
spatial_algorithmisIPorIP1,update_spatial_model_ip1is called.If
spatial_algorithmisISSorISS1,update_spatial_model_iss1is called.If
spatial_algorithmisIP2,update_spatial_model_ip2is called.If
spatial_algorithmisISS2,update_spatial_model_iss2is called.
- Return type:
None
- update_spatial_model_ip1()#
Update demixing filters once using iterative projection.
Demixing filters are updated sequentially for \(n=1,\ldots,N\) as follows:
\[\begin{split}\boldsymbol{w}_{in} &\leftarrow\left(\boldsymbol{W}_{in}^{\mathsf{H}}\boldsymbol{U}_{in}\right)^{-1} \ \boldsymbol{e}_{n}, \\ \boldsymbol{w}_{in} &\leftarrow\frac{\boldsymbol{w}_{in}} {\sqrt{\boldsymbol{w}_{in}^{\mathsf{H}}\boldsymbol{U}_{in}\boldsymbol{w}_{in}}},\end{split}\]where
\[\boldsymbol{U}_{in} \leftarrow\frac{1}{J}\sum_{i,j,n} \frac{\boldsymbol{x}_{ij}\boldsymbol{x}_{ij}^{\mathsf{H}}}{\tilde{r}_{ijn}}.\]\(\tilde{r}_{ijn}\) is computed as
\[\tilde{r}_{ijn} = \frac{2|y_{ijn}|^{2-\beta}}{\beta} \left(\sum_{k}z_{nk}t_{ik}v_{kj}\right)^{\frac{\beta}{p}},\]if
partitioning=True. Otherwise,\[\tilde{r}_{ijn} = \frac{2|y_{ijn}|^{2-\beta}}{\beta} \left(\sum_{k}t_{ikn}v_{kjn}\right)^{\frac{\beta}{p}}.\]- Return type:
None
- update_spatial_model_ip2()#
Update demixing filters once using pairwise iterative projection.
For \(n_{1}\) and \(n_{2}\) (\(n_{1}\neq n_{2}\)), compute weighted covariance matrix as follows:
\[\boldsymbol{U}_{in} = \frac{1}{J}\sum_{j} \frac{1}{\tilde{r}_{ijn}}\boldsymbol{x}_{ij}\boldsymbol{x}_{ij}^{\mathsf{H}},\]\(\tilde{r}_{ijn}\) is computed by
\[\tilde{r}_{ijn} = \frac{2|y_{ijn}|^{2-\beta}}{\beta} \left(\sum_{k}z_{nk}t_{ik}v_{kj}\right)^{\frac{\beta}{p}},\]if
partitioning=True. Otherwise,\[\tilde{r}_{ijn} = \frac{2|y_{ijn}|^{2-\beta}}{\beta} \left(\sum_{k}t_{ikn}v_{kjn}\right)^{\frac{\beta}{p}}.\]Using \(\boldsymbol{U}_{in_{1}}\) and \(\boldsymbol{U}_{in_{2}}\), we compute generalized eigenvectors.
\[\left({\boldsymbol{P}_{in_{1}}^{(n_{1},n_{2})}}^{\mathsf{H}}\boldsymbol{U}_{in_{1}} \boldsymbol{P}_{in_{1}}^{(n_{1},n_{2})}\right)\boldsymbol{h}_{i} = \lambda_{i} \left({\boldsymbol{P}_{in_{2}}^{(n_{1},n_{2})}}^{\mathsf{H}}\boldsymbol{U}_{in_{2}} \boldsymbol{P}_{in_{2}}^{(n_{1},n_{2})}\right)\boldsymbol{h}_{i},\]where
\[\begin{split}\boldsymbol{P}_{in_{1}}^{(n_{1},n_{2})} &= (\boldsymbol{W}_{i}\boldsymbol{U}_{in_{1}})^{-1} ( \begin{array}{cc} \boldsymbol{e}_{n_{1}} & \boldsymbol{e}_{n_{2}} \end{array} ), \\ \boldsymbol{P}_{in_{2}}^{(n_{1},n_{2})} &= (\boldsymbol{W}_{i}\boldsymbol{U}_{in_{2}})^{-1} ( \begin{array}{cc} \boldsymbol{e}_{n_{1}} & \boldsymbol{e}_{n_{2}} \end{array} ).\end{split}\]After that, we standardize two eigenvectors \(\boldsymbol{h}_{in_{1}}\) and \(\boldsymbol{h}_{in_{2}}\).
\[\begin{split}\boldsymbol{h}_{in_{1}} &\leftarrow\frac{\boldsymbol{h}_{in_{1}}} {\sqrt{\boldsymbol{h}_{in_{1}}^{\mathsf{H}} \left({\boldsymbol{P}_{in_{1}}^{(n_{1},n_{2})}}^{\mathsf{H}}\boldsymbol{U}_{in_{1}} \boldsymbol{P}_{in_{1}}^{(n_{1},n_{2})}\right) \boldsymbol{h}_{in_{1}}}}, \\ \boldsymbol{h}_{in_{2}} &\leftarrow\frac{\boldsymbol{h}_{in_{2}}} {\sqrt{\boldsymbol{h}_{in_{2}}^{\mathsf{H}} \left({\boldsymbol{P}_{in_{2}}^{(n_{1},n_{2})}}^{\mathsf{H}}\boldsymbol{U}_{in_{2}} \boldsymbol{P}_{in_{2}}^{(n_{1},n_{2})}\right) \boldsymbol{h}_{in_{2}}}}.\end{split}\]Then, update \(\boldsymbol{w}_{in_{1}}\) and \(\boldsymbol{w}_{in_{2}}\) simultaneously.
\[\begin{split}\boldsymbol{w}_{in_{1}} &\leftarrow \boldsymbol{P}_{in_{1}}^{(n_{1},n_{2})}\boldsymbol{h}_{in_{1}} \\ \boldsymbol{w}_{in_{2}} &\leftarrow \boldsymbol{P}_{in_{2}}^{(n_{1},n_{2})}\boldsymbol{h}_{in_{2}}\end{split}\]At each iteration, we update pairs of \(n_{1}\) and \(n_{1}\) for \(n_{1}\neq n_{2}\).
- Return type:
None
- update_spatial_model_iss1()#
Update estimated spectrograms once using iterative source steering.
Update \(y_{ijn}\) as follows:
\[\begin{split}\boldsymbol{y}_{ij} & \leftarrow\boldsymbol{y}_{ij} - \boldsymbol{d}_{in}y_{ijn} \\ d_{inn'} &= \begin{cases} \dfrac{\displaystyle\sum_{j}\dfrac{1}{\tilde{r}_{ijn}} y_{ijn'}y_{ijn}^{*}}{\displaystyle\sum_{j}\dfrac{1} {\tilde{r}_{ijn}}|y_{ijn}|^{2}} & (n'\neq n) \\ 1 - \dfrac{1}{\sqrt{\displaystyle\dfrac{1}{J}\sum_{j}\dfrac{1} {\tilde{r}_{ijn}}|y_{ijn}|^{2}}} & (n'=n) \end{cases},\end{split}\]where \(\tilde{r}_{ijn}\) is computed as
\[\tilde{r}_{ijn} = \frac{2|y_{ijn}|^{2-\beta}}{\beta} \left(\sum_{k}z_{nk}t_{ik}v_{kj}\right)^{\frac{\beta}{p}},\]if
partitioning=True. Otherwise,\[\tilde{r}_{ijn} = \frac{2|y_{ijn}|^{2-\beta}}{\beta} \left(\sum_{k}t_{ikn}v_{kjn}\right)^{\frac{\beta}{p}}.\]- Return type:
None
- update_spatial_model_iss2()#
Update estimated spectrograms once using pairwise iterative source steering.
Compute \(\boldsymbol{G}_{in}^{(n_{1},n_{2})}\) and \(\boldsymbol{f}_{in}^{(n_{1},n_{2})}\) for \(n_{1}\neq n_{2}\):
\[\begin{split}\begin{array}{rclc} \boldsymbol{G}_{in}^{(n_{1},n_{2})} &=& {\displaystyle\frac{1}{J}\sum_{j}}\dfrac{1}{\tilde{r}_{ijn}} \boldsymbol{y}_{ij}^{(n_{1},n_{2})}{\boldsymbol{y}_{ij}^{(n_{1},n_{2})}}^{\mathsf{H}} &(n=1,\ldots,N), \\ \boldsymbol{f}_{in}^{(n_{1},n_{2})} &=& {\displaystyle\frac{1}{J}\sum_{j}} \dfrac{1}{\tilde{r}_{ijn}}y_{ijn}^{*}\boldsymbol{y}_{ij}^{(n_{1},n_{2})} &(n\neq n_{1},n_{2}), \end{array}\end{split}\]where
\[\tilde{r}_{ijn} = \frac{2}{\beta}|y_{ijn}|^{2-\beta} \left(\sum_{k}z_{nk}t_{ik}v_{kj}\right)^{\frac{\beta}{p}},\]if
partitioning=True. Otherwise,\[\tilde{r}_{ijn} = \frac{2}{\beta}|y_{ijn}|^{2-\beta} \left(\sum_{k}t_{ikn}v_{kjn}\right)^{\frac{\beta}{p}}.\]Using \(\boldsymbol{G}_{in}^{(n_{1},n_{2})}\) and \(\boldsymbol{f}_{in}^{(n_{1},n_{2})}\), we compute
\[\begin{split}\begin{array}{rclc} \boldsymbol{p}_{in} &=& \dfrac{\boldsymbol{h}_{in}} {\sqrt{\boldsymbol{h}_{in}^{\mathsf{H}}\boldsymbol{G}_{in}^{(n_{1},n_{2})} \boldsymbol{h}_{in}}} & (n=n_{1},n_{2}), \\ \boldsymbol{q}_{in} &=& -{\boldsymbol{G}_{in}^{(n_{1},n_{2})}}^{-1}\boldsymbol{f}_{in}^{(n_{1},n_{2})} & (n\neq n_{1},n_{2}), \end{array}\end{split}\]where \(\boldsymbol{h}_{in}\) (\(n=n_{1},n_{2}\)) is a generalized eigenvector obtained from
\[\boldsymbol{G}_{in_{1}}^{(n_{1},n_{2})}\boldsymbol{h}_{i} = \lambda_{i}\boldsymbol{G}_{in_{2}}^{(n_{1},n_{2})}\boldsymbol{h}_{i}.\]Separated signal \(y_{ijn}\) is updated as follows:
\[\begin{split}y_{ijn} &\leftarrow\begin{cases} &\boldsymbol{p}_{in}^{\mathsf{H}}\boldsymbol{y}_{ij}^{(n_{1},n_{2})} & (n=n_{1},n_{2}) \\ &\boldsymbol{q}_{in}^{\mathsf{H}}\boldsymbol{y}_{ij}^{(n_{1},n_{2})} + y_{ijn} & (n\neq n_{1},n_{2}) \end{cases}.\end{split}\]- Return type:
None