Unsupervised Audio Source Separation
Using Differentiable Parametric Source Models


Welcome to the demo website of the paper:

Schulze-Forster, K., Doire, C., Richard, G., & Badeau, R. "Unsupervised Musical Source Separation Using Differentiable Parametric Source Models" Currently under review at IEEE/ACM Transactions on Audio, Speech and Language Processing.

All example mixtures are taken from the test set.


Audio examples for mixtures of 2 sources (J=2)


Example mixture 1:
Method | Comment | Soprano | Alto | Re-synthesized mix*
True sources
US-F (proposed) | Estimates ŝj(t) obtained by Wiener filtering
US-F | Signals s̃j(t) generated with source models
US-S (proposed) | Estimates ŝj(t) obtained by Wiener filtering
US-S | Signals s̃j(t) generated with source models
SV-F | Estimates ŝj(t) obtained by Wiener filtering
SV-F | Signals s̃j(t) generated with source models
SV-S | Estimates ŝj(t) obtained by Wiener filtering
SV-S | Signals s̃j(t) generated with source models
NMF1 | Wiener filtering
NMF2 | Wiener filtering
Unet-F | Wiener filtering
Unet-S | Wiener filtering
*This is the sum of the signals s̃j(t) generated with the parametric source models. It is denoted m̃(t) in the paper. This signal can be manipulated by changing the source model parameters.
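For readers unfamiliar with the "Wiener filtering" entries in the comment column, the following is a minimal sketch of the general idea, not the paper's implementation. It assumes that soft masks are built from the power spectrograms of the generated signals s̃j(t) and applied to the mixture STFT; the function name `wiener_filter`, the array shapes, and the random test data are hypothetical and chosen only for illustration.

```python
import numpy as np

def wiener_filter(mix_stft, source_stfts, eps=1e-10):
    """Apply soft masks built from source power spectrograms to a mixture STFT.

    mix_stft:     complex array of shape (F, T), STFT of the mixture
    source_stfts: complex array of shape (J, F, T), STFTs of the generated signals
    returns:      complex array of shape (J, F, T), STFTs of the source estimates
    """
    # Power spectrogram of each generated source signal (assumed mask basis).
    power = np.abs(source_stfts) ** 2
    # Each mask is a source's share of the total power in every time-frequency bin.
    masks = power / (power.sum(axis=0, keepdims=True) + eps)
    # The masks are applied to the mixture, so the estimates keep the mixture phase.
    return masks * mix_stft[None, :, :]

# Toy data standing in for real STFTs (hypothetical sizes: J=2 sources).
rng = np.random.default_rng(0)
J, F, T = 2, 5, 4
source_stfts = rng.standard_normal((J, F, T)) + 1j * rng.standard_normal((J, F, T))
mix_stft = source_stfts.sum(axis=0)

estimates = wiener_filter(mix_stft, source_stfts)
```

Because the masks sum to (almost exactly) one in every bin, the estimates add up to the mixture, whereas the generated signals s̃j(t) add up to the re-synthesized mix m̃(t), which need not equal the mixture.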


Example mixture 2:
Method | Comment | Tenor | Bass | Re-synthesized mix*
True sources
US-F (proposed) | Estimates ŝj(t) obtained by Wiener filtering
US-F | Signals s̃j(t) generated with source models
US-S (proposed) | Estimates ŝj(t) obtained by Wiener filtering
US-S | Signals s̃j(t) generated with source models
SV-F | Estimates ŝj(t) obtained by Wiener filtering
SV-F | Signals s̃j(t) generated with source models
SV-S | Estimates ŝj(t) obtained by Wiener filtering
SV-S | Signals s̃j(t) generated with source models
NMF1 | Wiener filtering
NMF2 | Wiener filtering
Unet-F | Wiener filtering
Unet-S | Wiener filtering
*This is the sum of the signals s̃j(t) generated with the parametric source models. It is denoted m̃(t) in the paper. This signal can be manipulated by changing the source model parameters.


Example mixture 3:
Method | Comment | Soprano | Alto | Re-synthesized mix*
True sources
US-F (proposed) | Estimates ŝj(t) obtained by Wiener filtering
US-F | Signals s̃j(t) generated with source models
US-S (proposed) | Estimates ŝj(t) obtained by Wiener filtering
US-S | Signals s̃j(t) generated with source models
SV-F | Estimates ŝj(t) obtained by Wiener filtering
SV-F | Signals s̃j(t) generated with source models
SV-S | Estimates ŝj(t) obtained by Wiener filtering
SV-S | Signals s̃j(t) generated with source models
NMF1 | Wiener filtering
NMF2 | Wiener filtering
Unet-F | Wiener filtering
Unet-S | Wiener filtering
*This is the sum of the signals s̃j(t) generated with the parametric source models. It is denoted m̃(t) in the paper. This signal can be manipulated by changing the source model parameters.


Example mixture 4:
Method | Comment | Tenor | Bass | Re-synthesized mix*
True sources
US-F (proposed) | Estimates ŝj(t) obtained by Wiener filtering
US-F | Signals s̃j(t) generated with source models
US-S (proposed) | Estimates ŝj(t) obtained by Wiener filtering
US-S | Signals s̃j(t) generated with source models
SV-F | Estimates ŝj(t) obtained by Wiener filtering
SV-F | Signals s̃j(t) generated with source models
SV-S | Estimates ŝj(t) obtained by Wiener filtering
SV-S | Signals s̃j(t) generated with source models
NMF1 | Wiener filtering
NMF2 | Wiener filtering
Unet-F | Wiener filtering
Unet-S | Wiener filtering
*This is the sum of the signals s̃j(t) generated with the parametric source models. It is denoted m̃(t) in the paper. This signal can be manipulated by changing the source model parameters.



Audio examples for mixtures of 4 sources (J=4)


Example mixture 1:
Method | Comment | Soprano | Alto | Tenor | Bass | Re-synthesized mix*
True sources
US-F (proposed) | Estimates ŝj(t) obtained by Wiener filtering
US-F | Signals s̃j(t) generated with source models
US-S (proposed) | Estimates ŝj(t) obtained by Wiener filtering
US-S | Signals s̃j(t) generated with source models
SV-F | Estimates ŝj(t) obtained by Wiener filtering
SV-F | Signals s̃j(t) generated with source models
SV-S | Estimates ŝj(t) obtained by Wiener filtering
SV-S | Signals s̃j(t) generated with source models
NMF1 | Wiener filtering
NMF2 | Wiener filtering
Unet-F | Wiener filtering
Unet-S | Wiener filtering
*This is the sum of the signals s̃j(t) generated with the parametric source models. It is denoted m̃(t) in the paper. This signal can be manipulated by changing the source model parameters.


Example mixture 2:
Method | Comment | Soprano | Alto | Tenor | Bass | Re-synthesized mix*
True sources
US-F (proposed) | Estimates ŝj(t) obtained by Wiener filtering
US-F | Signals s̃j(t) generated with source models
US-S (proposed) | Estimates ŝj(t) obtained by Wiener filtering
US-S | Signals s̃j(t) generated with source models
SV-F | Estimates ŝj(t) obtained by Wiener filtering
SV-F | Signals s̃j(t) generated with source models
SV-S | Estimates ŝj(t) obtained by Wiener filtering
SV-S | Signals s̃j(t) generated with source models
NMF1 | Wiener filtering
NMF2 | Wiener filtering
Unet-F | Wiener filtering
Unet-S | Wiener filtering
*This is the sum of the signals s̃j(t) generated with the parametric source models. It is denoted m̃(t) in the paper. This signal can be manipulated by changing the source model parameters.



Audio examples for melody editing


The mixture is parameterized using the proposed method US-F. One use case of this parameterization is melody editing: the fundamental frequencies of individual sources are modified and the sources are then mixed together, yielding a modified mixture. The parameterization can also be exploited for other tasks such as timbre transfer, style transfer, transposition, or changing the sung text.
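The editing step described above can be illustrated with a toy sketch. It is not the paper's source model: `synth_harmonic` is a hypothetical stand-in that generates a few harmonics from a per-sample F0 track, `shift_f0` is a hypothetical helper, and the constant F0 values (440 Hz and 330 Hz) and the sampling rate are made up for illustration. The point is only to show that changing one source's fundamental frequency and summing the sources again gives a modified mixture.

```python
import numpy as np

def shift_f0(f0_hz, semitones):
    """Transpose an F0 trajectory in Hz by a number of semitones.

    Unvoiced samples marked with 0 Hz stay at 0 Hz.
    """
    return f0_hz * 2.0 ** (semitones / 12.0)

def synth_harmonic(f0_hz, sr=16000, n_harmonics=5):
    """Toy harmonic synthesizer driven by a per-sample F0 track (assumed stand-in)."""
    # Integrate the instantaneous frequency to obtain the fundamental's phase.
    phase = 2.0 * np.pi * np.cumsum(f0_hz) / sr
    voiced = (f0_hz > 0).astype(float)
    signal = np.zeros(len(f0_hz), dtype=float)
    for h in range(1, n_harmonics + 1):
        # Silence unvoiced samples and harmonics above the Nyquist frequency.
        amplitude = voiced * (h * f0_hz < sr / 2) / h
        signal += amplitude * np.sin(h * phase)
    return signal

sr = 16000
n_samples = sr  # one second of audio
f0_soprano = np.full(n_samples, 440.0)  # hypothetical F0 tracks
f0_alto = np.full(n_samples, 330.0)

# Re-synthesized mixture: the sum of the generated source signals.
original_mix = synth_harmonic(f0_soprano, sr) + synth_harmonic(f0_alto, sr)

# Melody editing: raise the soprano by two semitones, leave the alto unchanged.
f0_soprano_edited = shift_f0(f0_soprano, semitones=2)
modified_mix = synth_harmonic(f0_soprano_edited, sr) + synth_harmonic(f0_alto, sr)
```

In this sketch only the F0 parameter is edited; the other tasks listed above (for example timbre transfer) would correspond to changing other source model parameters, which the toy synthesizer does not expose.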


Original mixture
Re-synthesized mixture
Modified mixture