Immersitech ClearVoice Delivers Impressive DNSMOS Results vs Competitive Solutions!
Immersitech has made significant improvements to its AI-based audio SDKs, which are designed to deliver superior performance across a variety of noisy environments. We are pleased to present competitive results from our internal testing, which used the blind test set (~7 hours of noisy speech audio) from the 5th Deep Noise Suppression Challenge at IEEE ICASSP 2023.
Overall Noise Removal and Speech Enhancement Test Results
- About the test: DNSMOS is a model that estimates the scores listeners would give in an ITU-T Recommendation P.835 subjective test of speech enhancement with noise cancellation. It is a non-intrusive speech quality assessment (NI-SQA) model, meaning it does not need a reference (or “clean”) audio clip to compare against. As a result, it can be used to assess non-synthesized noisy audio, where noise and speech are captured in the same recording and no clean reference exists. Learn more about DNSMOS at https://arxiv.org/pdf/2110.01763.pdf.
- The DNSMOS BAK test focuses on noise cancellation performance. On average, Immersitech ClearVoice scored 50.88% higher than the original audio and 2.92% higher than Krisp.ai, the second-highest scorer tested in this area.
- The DNSMOS SIG test focuses on speech quality in the audio samples. Our testing showed that most competing platforms maintain speech quality well overall. On average, Immersitech ClearVoice scored 4.42% higher than the original audio and only 0.22% lower than Picovoice Koala, the highest scorer tested in this area.
- Note that at the time of testing, Picovoice Koala supported only a 16 kHz sampling rate, while Immersitech ClearVoice can maintain the higher quality/frequency characteristics of 48 kHz speech.
- In DNSMOS OVRL (overall) testing, Immersitech ClearVoice on average scored 25.37% higher than the original audio and 3.48% higher than Picovoice Koala, the second-highest scorer tested.
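For readers who want to reproduce this kind of comparison on their own audio, the percentages above can be read as relative changes in average DNSMOS score. The following is a minimal sketch under that assumption: the function name `relative_improvement` and the per-clip scores are hypothetical and illustrative only, not Immersitech test data, and this release does not state whether averaging was done before or after computing the percentage.

```python
from statistics import mean


def relative_improvement(baseline_scores, processed_scores):
    """Percent change in the mean score, from baseline to processed.

    Assumption: scores are averaged over all clips first, and the
    percentage is then taken relative to the baseline mean.
    """
    base = mean(baseline_scores)
    proc = mean(processed_scores)
    return (proc - base) / base * 100.0


# Hypothetical per-clip BAK scores on the 1-5 MOS scale (not real results).
noisy_bak = [2.0, 3.0, 2.5]        # original, unprocessed clips
enhanced_bak = [3.0, 4.5, 3.75]    # same clips after noise suppression

print(f"{relative_improvement(noisy_bak, enhanced_bak):.2f}%")  # prints 50.00%
```

The same calculation applies to a comparison between two products: pass one product's scores as the baseline and the other's as the processed set.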
One area of strong performance for Immersitech ClearVoice was impulse noises (e.g. typing, clicking, smartphone/computer alerts), which so often create distractions during online communications across gaming, business calls, and online learning sessions.
© Immersitech 2023