ssspy.bss.pdsbss
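In this module, we separate multichannel signals using blind source separation via the primal-dual splitting algorithm. We denote the number of sources and microphones as \(N\) and \(M\), respectively. We also denote the short-time Fourier transforms of the source, observed, and separated signals as \(\boldsymbol{s}_{ij}\), \(\boldsymbol{x}_{ij}\), and \(\boldsymbol{y}_{ij}\), respectively:
\[
\begin{aligned}
\boldsymbol{s}_{ij} &= (s_{ij1},\ldots,s_{ijn},\ldots,s_{ijN})^{\mathsf{T}}\in\mathbb{C}^{N}, \\
\boldsymbol{x}_{ij} &= (x_{ij1},\ldots,x_{ijm},\ldots,x_{ijM})^{\mathsf{T}}\in\mathbb{C}^{M}, \\
\boldsymbol{y}_{ij} &= (y_{ij1},\ldots,y_{ijn},\ldots,y_{ijN})^{\mathsf{T}}\in\mathbb{C}^{N},
\end{aligned}
\]
where \(i=1,\ldots,I\) and \(j=1,\ldots,J\) are indices of frequency bins and time frames, respectively. When the mixing system is time-invariant, \(\boldsymbol{x}_{ij}\) is represented as follows:
\[
\boldsymbol{x}_{ij} = \boldsymbol{A}_{i}\boldsymbol{s}_{ij},
\]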
where \(\boldsymbol{A}_{i}=(\boldsymbol{a}_{i1},\ldots,\boldsymbol{a}_{in},\ldots,\boldsymbol{a}_{iN})\in\mathbb{C}^{M\times N}\) is a mixing matrix. If \(M=N\) and \(\boldsymbol{A}_{i}\) is non-singular, a demixing system is represented as
\[
\boldsymbol{y}_{ij} = \boldsymbol{W}_{i}\boldsymbol{x}_{ij},
\]
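where \(\boldsymbol{W}_{i}=(\boldsymbol{w}_{i1},\ldots,\boldsymbol{w}_{in},\ldots,\boldsymbol{w}_{iN})^{\mathsf{H}}\in\mathbb{C}^{N\times M}\) is a demixing matrix. The negative log-likelihood of the observed signals (divided by \(2J\)) is computed as follows:
\[
\begin{aligned}
\mathcal{L} &= \mathcal{P}(\{\boldsymbol{y}_{ij}\}) + \sum_{i}\mathcal{I}(\boldsymbol{W}_{i}), \\
\mathcal{I}(\boldsymbol{W}_{i}) &= -\log|\det\boldsymbol{W}_{i}|,
\end{aligned}
\]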
where \(\mathcal{P}\) is a penalty function that is determined by the source model.
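Let us consider independent vector analysis. In this case, \(\mathcal{P}\) can be written as
\[
\mathcal{P}(\{\boldsymbol{y}_{ij}\}) = C\sum_{j,n}\|\vec{\boldsymbol{y}}_{jn}\|_{2},\quad
\vec{\boldsymbol{y}}_{jn} = (y_{1jn},\ldots,y_{ijn},\ldots,y_{Ijn})^{\mathsf{T}}\in\mathbb{C}^{I},
\]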
where \(C\) is a positive constant.
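As a rough illustration, the penalty for independent vector analysis can be evaluated with NumPy. This sketch assumes the penalty is \(C\) times the sum, over sources and time frames, of the \(\ell_{2}\) norm taken across frequency bins. The function name `iva_penalty` and the `(n_sources, n_bins, n_frames)` array layout are choices made for this example, not part of ssspy.

```python
import numpy as np


def iva_penalty(y, C=1.0):
    """Return C * sum over (source, frame) of the l2 norm across frequency bins.

    y: complex array of shape (n_sources, n_bins, n_frames) (assumed layout).
    """
    # Norm over the frequency axis couples all bins of one source and frame.
    norm = np.linalg.norm(y, axis=1)  # shape: (n_sources, n_frames)
    return C * float(norm.sum())


rng = np.random.default_rng(0)
y = rng.standard_normal((2, 4, 3)) + 1j * rng.standard_normal((2, 4, 3))
penalty = iva_penalty(y, C=1.0)
```

Because the norm is taken jointly over all frequency bins of a source, the penalty favors solutions in which the bins of one source are active in the same frames.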
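To the above formulation, we can apply the primal-dual splitting algorithm. On the basis of this algorithm, the demixing filter is updated as follows:
\[
\begin{aligned}
\tilde{\boldsymbol{W}}_{i}
&\leftarrow\mathrm{prox}_{\mu_{1}\mathcal{I}}\left[\boldsymbol{W}_{i}-\mu_{1}\mu_{2}\sum_{j}\boldsymbol{u}_{ij}\boldsymbol{x}_{ij}^{\mathsf{H}}\right], \\
\boldsymbol{z}_{ij}
&\leftarrow\boldsymbol{u}_{ij}+(2\tilde{\boldsymbol{W}}_{i}-\boldsymbol{W}_{i})\boldsymbol{x}_{ij}, \\
\tilde{\boldsymbol{u}}_{ij}
&\leftarrow\boldsymbol{z}_{ij}-\left(\mathrm{prox}_{\frac{1}{\mu_{2}}\mathcal{P}}\left[\{\boldsymbol{z}_{ij}\}\right]\right)_{ij}, \\
\boldsymbol{u}_{ij}
&\leftarrow\alpha\tilde{\boldsymbol{u}}_{ij}+(1-\alpha)\boldsymbol{u}_{ij}, \\
\boldsymbol{W}_{i}
&\leftarrow\alpha\tilde{\boldsymbol{W}}_{i}+(1-\alpha)\boldsymbol{W}_{i},
\end{aligned}
\]
where \(\mu_{1}\) and \(\mu_{2}\) are step sizes and \(\alpha\) is a relaxation parameter.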
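Here, \(\boldsymbol{u}_{ij}\) is a dual variable, which should be initialized to a suitable value before the first update. \(\mathrm{prox}_{g}\) is the proximal operator of a function \(g\), defined as
\[
\mathrm{prox}_{g}[\boldsymbol{z}] = \mathop{\mathrm{argmin}}_{\boldsymbol{y}}\left(g(\boldsymbol{y})+\frac{1}{2}\|\boldsymbol{z}-\boldsymbol{y}\|_{2}^{2}\right).
\]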
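For \(\mathcal{I}(\boldsymbol{W}_{i})=-\log|\det\boldsymbol{W}_{i}|\), we can obtain the following proximal operator for a step size \(\mu>0\):
\[
\mathrm{prox}_{\mu\mathcal{I}}[\boldsymbol{W}_{i}]
= \boldsymbol{U}_{i}\tilde{\boldsymbol{\Sigma}}_{i}\boldsymbol{V}_{i}^{\mathsf{H}},\quad
\tilde{\boldsymbol{\Sigma}}_{i} = \mathrm{diag}(\tilde{\sigma}_{i1},\ldots,\tilde{\sigma}_{iN}),\quad
\tilde{\sigma}_{in} = \frac{\sigma_{in}+\sqrt{\sigma_{in}^{2}+4\mu}}{2},
\]
where \(\boldsymbol{W}_{i}=\boldsymbol{U}_{i}\boldsymbol{\Sigma}_{i}\boldsymbol{V}_{i}^{\mathsf{H}}\) is the singular value decomposition of \(\boldsymbol{W}_{i}\) with \(\boldsymbol{\Sigma}_{i}=\mathrm{diag}(\sigma_{i1},\ldots,\sigma_{iN})\).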
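The singular-value form of this proximal operator is short to write down in NumPy. The sketch below assumes the penalty is \(-\log|\det\boldsymbol{W}_{i}|\) scaled by a step size, so that each singular value \(\sigma\) is mapped to the positive root of \(\tilde{\sigma}^{2}-\sigma\tilde{\sigma}-\mu=0\). The function name `prox_neg_logdet` and the `(n_bins, n_sources, n_sources)` layout are hypothetical and only serve this example.

```python
import numpy as np


def prox_neg_logdet(W, step_size=1.0):
    """Proximal operator of step_size * (-log|det W|), batched over frequency bins.

    W: complex array of shape (n_bins, n_sources, n_sources) (assumed layout).
    """
    # Batched SVD: W = U @ diag(sigma) @ Vh for every frequency bin.
    U, sigma, Vh = np.linalg.svd(W)
    # Each singular value moves to the positive root of s^2 - sigma * s - step_size = 0.
    sigma_new = (sigma + np.sqrt(sigma**2 + 4.0 * step_size)) / 2.0
    # Scale the columns of U by the new singular values and recompose.
    return (U * sigma_new[..., np.newaxis, :]) @ Vh


rng = np.random.default_rng(0)
W = rng.standard_normal((3, 2, 2)) + 1j * rng.standard_normal((3, 2, 2))
W_new = prox_neg_logdet(W, step_size=0.1)
```

Every singular value increases under this map, so the output stays non-singular even when the input is close to singular.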
When \(\mathcal{P}\) is defined as
the updates by the proximal operator can be written as
Algorithms
- class ssspy.bss.pdsbss.PDSBSSBase(penalty_fn=None, prox_penalty=None, callbacks=None, scale_restoration=True, record_loss=True, reference_id=0)
Base class of blind source separation via proximal splitting algorithm [1].
- Parameters:
penalty_fn (callable) – Penalty function that determines source model.
prox_penalty (callable) – Proximal operator of penalty function. Default: None.
callbacks (callable or list[callable], optional) – Callback functions. Each function is called before separation and at each iteration. Default: None.
scale_restoration (bool or str) – Technique to restore scale ambiguity. If scale_restoration=True, the projection back technique is applied to estimated spectrograms. You can also specify projection_back explicitly. Default: True.
record_loss (bool) – Record the loss at each iteration of the update algorithm if record_loss=True. Default: True.
reference_id (int) – Reference channel for projection back. Default: 0.
- class ssspy.bss.pdsbss.PDSBSS(mu1=1, mu2=1, alpha=None, relaxation=1, penalty_fn=None, prox_penalty=None, callbacks=None, scale_restoration=True, record_loss=True, reference_id=0)
Blind source separation via proximal splitting algorithm [1].
- Parameters:
mu1 (float) – Step size. Default: 1.
mu2 (float) – Step size. Default: 1.
alpha (float) – Relaxation parameter (deprecated). Set relaxation instead.
relaxation (float) – Relaxation parameter. Default: 1.
penalty_fn (callable) – Penalty function that determines source model.
prox_penalty (callable) – Proximal operator of penalty function. Default: None.
callbacks (callable or list[callable], optional) – Callback functions. Each function is called before separation and at each iteration. Default: None.
scale_restoration (bool or str) – Technique to restore scale ambiguity. If scale_restoration=True, the projection back technique is applied to estimated spectrograms. You can also specify projection_back explicitly. Default: True.
record_loss (bool) – Record the loss at each iteration of the update algorithm if record_loss=True. Default: True.
reference_id (int) – Reference channel for projection back. Default: 0.
- __call__(input, n_iter=100, initial_call=True, **kwargs)
Separate a frequency-domain multichannel signal.
- Parameters:
input (numpy.ndarray) – Mixture signal in frequency-domain. The shape is (n_channels, n_bins, n_frames).
n_iter (int) – Number of iterations of demixing filter updates. Default: 100.
initial_call (bool) – If True, perform callbacks (and computation of loss if necessary) before iterations.
- Return type:
ndarray
- Returns:
numpy.ndarray of the separated signal in frequency-domain. The shape is (n_channels, n_bins, n_frames).
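A minimal usage sketch follows, built only from the constructor and `__call__` signatures listed above. The calling conventions of the two callables are assumptions: `penalty_fn(y)` is taken to return a scalar, `prox_penalty(y, step_size)` is taken to return an array of the same shape, and `y` is assumed to have shape `(n_sources, n_bins, n_frames)`. The helper `separate` is a hypothetical wrapper that imports ssspy lazily, so the two NumPy functions can be tried on their own.

```python
import numpy as np


def penalty_fn(y):
    """Sum over (source, frame) of the l2 norm across frequency bins.

    y: assumed shape (n_sources, n_bins, n_frames).
    """
    return float(np.linalg.norm(y, axis=1).sum())


def prox_penalty(y, step_size=1.0):
    """Group soft-thresholding, the proximal operator of step_size * penalty_fn."""
    norm = np.linalg.norm(y, axis=1, keepdims=True)
    # Guard against division by zero for all-zero groups.
    scale = np.maximum(1.0 - step_size / np.maximum(norm, 1e-12), 0.0)
    return scale * y


def separate(spectrogram_mix, n_iter=100):
    """Run PDSBSS on a mixture of shape (n_channels, n_bins, n_frames)."""
    from ssspy.bss.pdsbss import PDSBSS  # imported lazily; requires ssspy

    pdsbss = PDSBSS(
        mu1=1,
        mu2=1,
        relaxation=1,
        penalty_fn=penalty_fn,
        prox_penalty=prox_penalty,
    )
    return pdsbss(spectrogram_mix, n_iter=n_iter)
```

The mixture passed to `separate` would typically be a complex STFT. Check the argument names expected by `penalty_fn` and `prox_penalty` against the installed ssspy version before relying on this sketch.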
- update_once()
Update demixing filters and dual parameters once.
- Return type:
None
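To make the idea of a single update concrete, here is an illustrative NumPy sketch of one primal-dual splitting iteration. It is not the ssspy implementation. It assumes the common form of the algorithm: a primal step through the proximal operator of \(-\log|\det\boldsymbol{W}_{i}|\), a dual step through the proximal operator of the penalty, then relaxation. The names `pds_update_once` and `prox_neg_logdet`, the argument order of `prox_penalty`, and the bin-first array layouts are all hypothetical.

```python
import numpy as np


def prox_neg_logdet(W, step_size):
    """Proximal operator of step_size * (-log|det W|) via batched SVD."""
    U, sigma, Vh = np.linalg.svd(W)
    sigma = (sigma + np.sqrt(sigma**2 + 4.0 * step_size)) / 2.0
    return (U * sigma[..., np.newaxis, :]) @ Vh


def pds_update_once(X, W, dual, prox_penalty, mu1=1.0, mu2=1.0, relaxation=1.0):
    """One primal-dual splitting update (illustrative sketch).

    X: mixture, shape (n_bins, n_channels, n_frames)
    W: demixing filters, shape (n_bins, n_sources, n_channels)
    dual: dual variable, shape (n_bins, n_sources, n_frames)
    prox_penalty: callable (Z, step_size) -> array with the shape of Z
    """
    # Primal step: move along the adjoint of the demixing map, then apply the prox.
    XH = X.conj().swapaxes(-2, -1)  # (n_bins, n_frames, n_channels)
    W_tilde = prox_neg_logdet(W - mu1 * mu2 * (dual @ XH), mu1)
    # Dual step: extrapolated separation minus the prox of the penalty scaled by 1 / mu2.
    Z = dual + (2.0 * W_tilde - W) @ X
    dual_tilde = Z - prox_penalty(Z, 1.0 / mu2)
    # Relaxation: blend the new iterates with the previous ones.
    W_new = relaxation * W_tilde + (1.0 - relaxation) * W
    dual_new = relaxation * dual_tilde + (1.0 - relaxation) * dual
    return W_new, dual_new
```

With `relaxation=1` the new iterates replace the old ones outright. Other values blend the new iterates with the previous ones.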