Abstract: This page provides audio examples of the speech distortions we focus on in the Interspeech 2025 URGENT Challenge. Each example includes clean speech, distorted speech, and enhanced speech. The enhanced speech is obtained using the baseline model trained for the "URGENT Challenge 2024", which did not see the distortions considered in the Interspeech 2025 URGENT Challenge. Unfortunately, the model does not generalize well to unseen distortions. This motivates us to include more diverse types of distortions and to aim at building a more universal, robust, and generalizable speech enhancement model.
Contents:
Audio examples of packet loss
Audio examples of codec lossy compression
Audio examples of wind noise
Packet loss
Packet loss can happen in online meeting systems due to busy network traffic. When packet loss happens, the audio becomes choppy. We want to restore the lost speech using a speech enhancement model.
Clean speech ![]() |
Speech with packet losses ![]() |
Enhanced speech ![]() |
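To make the "choppy" effect of packet loss concrete, the following is a minimal sketch and not the challenge's actual simulation pipeline. It assumes fixed 20 ms packets, independent random losses, and lost packets replaced by silence. The function name `simulate_packet_loss` and all parameter values are hypothetical choices made for illustration.

```python
import math
import random


def simulate_packet_loss(samples, sample_rate=16000, packet_ms=20,
                         loss_rate=0.1, seed=0):
    """Zero out randomly chosen fixed-length packets.

    Assumption: each packet is lost independently with probability
    `loss_rate`, and a lost packet is replaced by silence.
    """
    rng = random.Random(seed)
    packet_len = max(1, sample_rate * packet_ms // 1000)
    out = list(samples)
    lost = []  # indices of the packets that were dropped
    for start in range(0, len(out), packet_len):
        if rng.random() < loss_rate:
            end = min(start + packet_len, len(out))
            out[start:end] = [0.0] * (end - start)
            lost.append(start // packet_len)
    return out, lost


# Toy input: one second of a 440 Hz tone standing in for clean speech.
sr = 16000
tone = [math.sin(2 * math.pi * 440 * n / sr) for n in range(sr)]
degraded, lost_packets = simulate_packet_loss(tone, sr, loss_rate=0.2)
print(f"dropped {len(lost_packets)} of {sr // 320} packets")
```

Real systems may lose packets in bursts rather than independently, so a burst-loss model would likely be closer to practice than this independent-loss assumption.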
Codec lossy compression (MP3)
Some audio codecs use lossy compression, so some components of the original audio are lost, as you can see in the example below. We want to restore the lost parts using a speech enhancement model.
Clean speech ![]() |
Speech distorted by codec lossy compression ![]() |
Enhanced speech ![]() |
Wind noise
Wind noise is non-stationary and behaves turbulently near microphones. It distorts sound in a non-linear way because the airflow moves the microphone membrane and causes saturation at high noise levels.
In the following example, the enhancement model generalizes to unseen wind noise to some extent, but there is still room for improvement.
Clean speech ![]() |
Speech with wind noise ![]() |
Enhanced speech ![]() |
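The non-linear saturation described above can be illustrated with a toy sketch. This is not how the challenge generates wind noise: it assumes a hard clipper as a crude stand-in for membrane saturation and a strong 20 Hz sinusoid as a stand-in for a wind gust. The `saturate` helper and all amplitudes are hypothetical.

```python
import math


def saturate(x, limit=1.0):
    """Hard-clip a sample to [-limit, limit].

    Assumption: a hard clipper is a crude stand-in for the saturation
    of a microphone membrane at high noise levels.
    """
    return max(-limit, min(limit, x))


sr = 16000
# Toy signals (assumptions): a 440 Hz tone for speech and a strong
# low-frequency sinusoid for a wind gust.
speech = [0.5 * math.sin(2 * math.pi * 440 * n / sr) for n in range(sr)]
wind = [0.9 * math.sin(2 * math.pi * 20 * n / sr) for n in range(sr)]

linear_mix = [s + w for s, w in zip(speech, wind)]
recorded = [saturate(m) for m in linear_mix]

# Wherever clipping occurs, the recording is no longer speech + wind,
# so subtracting the noise cannot recover the clean speech.
clipped_count = sum(1 for r, m in zip(recorded, linear_mix) if r != m)
print(f"{clipped_count} of {sr} samples were clipped")
```

Because the clipped samples no longer equal the sum of speech and noise, an enhancement model has to restore the missing speech rather than only suppress an additive component.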