Clean speech
Noisy speech
UltraSE
Ultraspeech
PHASEN
USpeech w/ syn. (ours)
USpeech w/ phy. (ours)
Section 2: Large-scale Dataset Synthesis Enhancement
In this section, we demonstrate the recovery results on large-scale synthesized datasets (LJSpeech, TIMIT, and VCTK) without any training. (See Section 7.2 in the paper.)
(1) LJSpeech
Clean speech
Noisy speech
USpeech w/ syn. (ours)
(2) TIMIT
Clean speech
Noisy speech
USpeech w/ syn. (ours)
(3) VCTK
Clean speech
Noisy speech
USpeech w/ syn. (ours)
Section 3: Long-duration Enhancement
In this section, we demonstrate the recovery results of the long-duration enhancement experiments. (See Section 7.3 in the paper.)
(1) Collected Dataset
Clean speech
Noisy speech
USpeech w/ syn. (ours)
USpeech w/ phy. (ours)
(2) LJSpeech
Clean speech
Noisy speech
USpeech w/ syn. (ours)
(3) TIMIT
Clean speech
Noisy speech
USpeech w/ syn. (ours)
(4) VCTK
Clean speech
Noisy speech
USpeech w/ syn. (ours)
Section 4: Different Noise Interference
In this section, we demonstrate the recovery results under different types of noise interference: environmental interference, competing speakers interference, and human voice interference. (See Section 7.5 in the paper.)
(1) Different Environmental Interference
Clean speech
Noisy speech
USpeech w/ syn. (ours)
USpeech w/ phy. (ours)
(2) Competing Speakers Interference
Clean speech
Noisy speech
USpeech w/ syn. (ours)
USpeech w/ phy. (ours)
(3) Human Voice Interference
Clean speech
Noisy speech
USpeech w/ syn. (ours)
USpeech w/ phy. (ours)