Challenge Description
Following the success of two previous Short-duration Speaker Verification Challenges, the TdSV Challenge 2024 focuses on the relevance of recent training strategies, such as self-supervised learning. The challenge evaluates the TdSV task in two practical scenarios: conventional TdSV using predefined passphrases (Task 1) and TdSV using user-defined passphrases (Task 2). In Task 1, participants train a speaker encoder model on a large training dataset drawn from a predefined pool of 10 phrases; speaker models are then enrolled using three repetitions of a specific passphrase from that pool. In Task 2, participants train a speaker encoder model on a large text-independent training dataset; in addition, each in-domain training speaker provides utterances from a predefined pool of 6 phrases. Three cash prizes will be awarded for each task based on results on the evaluation dataset and other qualitative factors.
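For illustration only, a text-dependent verification trial can be scored by averaging the embeddings of the three enrollment repetitions and comparing the result to the test-utterance embedding with cosine similarity. The sketch below assumes embeddings already produced by some trained speaker encoder and uses random stand-in vectors; it is not the official baseline or protocol.

```python
# Minimal sketch of scoring a text-dependent verification trial.
# Assumption: a trained speaker encoder has already mapped each utterance
# to a fixed-size embedding; random vectors stand in for real embeddings.
import numpy as np

def enroll(embeddings):
    """Average the three enrollment repetitions into one speaker model."""
    model = np.mean(embeddings, axis=0)
    return model / np.linalg.norm(model)

def score(speaker_model, test_embedding):
    """Cosine similarity between the speaker model and a test embedding."""
    test = test_embedding / np.linalg.norm(test_embedding)
    return float(np.dot(speaker_model, test))

rng = np.random.default_rng(0)
enrollment = rng.normal(size=(3, 256))   # three repetitions of one passphrase
trial = rng.normal(size=256)             # test utterance embedding
print(score(enroll(enrollment), trial))  # accept if above a tuned threshold
```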
Challenge Website
Organizers
Challenge Description
The SLT-24 GenASR challenge explores the emerging field of post-ASR text modeling, focusing on three distinct tasks: (1) ASR-quality-aware: N-best ASR-LM correction, (2) speaker-aware: speaker-tagging assignment, and (3) non-semantic-aware: post-ASR text-to-emotion recognition. We provide open benchmarks and systems to encourage researchers worldwide to advance open speech technology within economical computing budgets and to push the limits of performance on ASR-related tasks. The challenge includes two tracks, one limited to LMs of <= 7B parameters and one with no size limit, to welcome all speech communities.
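As a toy illustration of the first task, an N-best list can be rescored with a causal language model and the most fluent hypothesis selected. The sketch below uses GPT-2 from the Hugging Face transformers library and a hand-made N-best list; the challenge itself defines its own data, baselines, and evaluation protocol.

```python
# Minimal sketch of N-best rescoring with a causal LM.
# The hypotheses below are invented stand-ins for an ASR system's N-best list.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2").eval()

def lm_score(text):
    """Average negative log-likelihood of the text under the LM (lower is more fluent)."""
    ids = tokenizer(text, return_tensors="pt").input_ids
    with torch.no_grad():
        loss = model(ids, labels=ids).loss  # mean per-token NLL
    return loss.item()

n_best = [
    "the whether is nice today",
    "the weather is nice today",
    "the weather is niece to day",
]
best = min(n_best, key=lm_score)  # pick the hypothesis the LM finds most fluent
print(best)
```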
Challenge Website
https://sites.google.com/view/gensec-challenge/home
Honorary Challenge Chair
Andreas Stolcke
Uniphore and ICSI
Organizers
Challenge Description
The Singing Voice Deepfake Detection (SVDD) Challenge addresses the surge in AI-generated singing voices, which now closely resemble natural human singing and align seamlessly with musical scores, raising serious concerns for artists and the music industry. Unlike spoken voice, singing voice poses unique detection challenges due to its intricate musical nature and prominent background music. This pioneering challenge is the first to focus on distinguishing authentic from synthetic singing voices in both lab-controlled and real-world scenarios. It features two distinct tracks: CtrSVDD, for controlled settings, built from top-performing singing voice synthesis and singing voice conversion systems, and WildSVDD, which expands our previous SingFake dataset to offer a wide array of real-world examples. This effort aims not only to advance the field of synthetic singing voice detection but also to address the music industry’s concerns about singing voice authenticity, paving the way for more secure digital musical experiences.
Challenge Website
https://challenge.singfake.org/
Organizers
Challenge Description
AI-generated content (AIGC) has become a very hot topic and is widely used in daily life. In the speech field, several challenges specifically target audio anti-spoofing countermeasures, e.g., ASVspoof and ADD. However, limited effort has been devoted to the source speaker tracing problem. Source speaker tracing aims to identify information about the source speaker, or ultimately reconstruct the source speaker's speech, from speech signals manipulated in different scenarios, e.g., voice conversion, speaker anonymization, and speech editing. This year's challenge focuses on source speaker verification against voice conversion: participants are asked to decide whether two converted utterances originate from the same source speaker. Besides the challenge track, a research track welcomes submissions on source speaker tracing and other topics related to speech anti-spoofing countermeasures.
Challenge Website
https://sstc-challenge.github.io/
Organizers
Challenge Description
The VoiceMOS Challenge 2024 (VMC 2024) is the third edition of the VMC series. The purpose of the challenge is to compare different systems and approaches for predicting human ratings of speech, usually in terms of mean opinion score (MOS). This year, there are three tracks. The first track targets MOS prediction for a “zoomed-in” subset comprising the top systems from the VMC 2022 dataset, with ratings collected through a separate listening test. The second track is based on a new dataset containing samples and ratings from singing voice synthesis and conversion systems. The third track is semi-supervised MOS prediction for noisy, clean, and enhanced speech, in which participants may use only a very small amount of MOS-labeled data provided by the organizers.
Challenge Website
https://sites.google.com/view/voicemos-challenge
Organizers
Erica Cooper
National Institute of Information and Communications Technology, Japan
Challenge Description
Stuttering affects 1% of the global population, influencing social and mental wellbeing and posing challenges to communication and self-esteem. Although its causes remain unknown, early intervention is crucial. However, regions such as Mainland China have a shortage of speech-language professionals, which hinders timely support. The StutteringSpeech Challenge addresses this gap through innovation in stuttering event detection (SED) and automatic speech recognition (ASR) technology. The initiative focuses on encouraging user interface designs that are inclusive of people who stutter. It comprises Task I, detecting stuttering events in audio; Task II, building efficient ASR systems that transcribe stuttered speech; and Task III, a research paper track enriching our understanding of stuttering speech technology. The challenge underscores our commitment to fostering awareness and creating inclusive technology, thereby improving communication accessibility in today’s smart device and chatbot ecosystem.
Challenge Website
Organizers
Challenge Description
Challenge Website
https://codecsuperb.github.io/
Organizers
Dongchao Yang
The Chinese University of Hong Kong, Hong Kong
Challenge Description
With the increasing prevalence of speech-enabled applications, smart home technology has become commonplace in numerous households. For most people, waking up and controlling smart devices is no longer a difficult task. However, individuals with dysarthria face significant challenges in using these technologies due to the inherent variability of their speech. Dysarthria is a motor speech disorder commonly associated with conditions such as cerebral palsy, Parkinson’s disease, amyotrophic lateral sclerosis, and stroke. Dysarthric speakers often experience pronounced difficulties in articulation, fluency, speech rate, volume, and clarity. Consequently, their speech is difficult for commercially available smart devices to comprehend. Moreover, dysarthric individuals frequently have additional motor impairments, further complicating their ability to operate household devices. As a result, voice control has emerged as an ideal solution for enabling dysarthric speakers to execute various commands, helping them lead simple and independent lives.
This challenge addresses speaker-dependent wake-up word spotting using only a small amount of wake-up word audio from the target speaker. This research has the potential not only to enhance the quality of life for individuals with dysarthria, but also to help smart devices better accommodate diverse user needs, making them a truly universal technology. We hope that the challenge will raise awareness about dysarthria and encourage greater participation in related research. In doing so, we are committed to promoting awareness and understanding of dysarthria in society and to eliminating discrimination and prejudice against people with dysarthria.
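One simple way to frame the task is few-shot template matching: embed the handful of enrollment recordings of the wake-up word, then flag a test clip whose embedding is close enough to any template. The sketch below assumes embeddings from some pretrained encoder and uses random stand-in vectors; it is an illustration only, not the challenge baseline.

```python
# Minimal sketch of speaker-dependent wake-word spotting by template matching.
# Assumption: an existing encoder maps each audio clip to a fixed-size vector;
# random vectors stand in for those embeddings here.
import numpy as np

def build_templates(enrollment_embeddings):
    """L2-normalize the few enrollment embeddings and keep them as templates."""
    e = np.asarray(enrollment_embeddings, dtype=np.float32)
    return e / np.linalg.norm(e, axis=1, keepdims=True)

def detect(templates, test_embedding, threshold=0.6):
    """Fire a wake-up if the test clip is close enough to any template."""
    t = test_embedding / np.linalg.norm(test_embedding)
    similarity = float(np.max(templates @ t))
    return similarity >= threshold, similarity

rng = np.random.default_rng(1)
templates = build_templates(rng.normal(size=(5, 192)))  # few enrollment clips
fired, sim = detect(templates, rng.normal(size=192))    # one test clip
print(fired, round(sim, 3))
```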
Challenge Website
Organizers
Challenge Description
Following the success of the 1st FutureDial challenge, the 2nd FutureDial challenge aims to benchmark and stimulate research on building dialog systems with retrieval-augmented generation (RAG), using the newly released dialog dataset MobileCS2. We aim to create a forum to discuss key challenges in the field and share findings from real-world applications.
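As a rough illustration of the retrieval step in such a system, the sketch below pulls the most relevant snippet from a toy knowledge base with TF-IDF similarity before it would be handed to a response generator. The knowledge base here is invented; MobileCS2 and the challenge baseline define their own retrieval and generation pipeline.

```python
# Minimal sketch of the retrieval step in a retrieval-augmented dialog system.
# The knowledge base below is a hypothetical toy example.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

knowledge_base = [
    "Plan A offers 10 GB of data for 30 yuan per month.",
    "Plan B offers unlimited calls and 5 GB of data.",
    "To cancel a plan, contact customer service before the billing date.",
]

vectorizer = TfidfVectorizer()
kb_matrix = vectorizer.fit_transform(knowledge_base)

def retrieve(query, top_k=1):
    """Return the top-k knowledge snippets most similar to the user query."""
    sims = cosine_similarity(vectorizer.transform([query]), kb_matrix)[0]
    return [knowledge_base[i] for i in sims.argsort()[::-1][:top_k]]

query = "How much data does the 30 yuan plan include?"
context = retrieve(query)
# The retrieved snippets would be prepended to the dialog history
# before being passed to the response generator.
print(context)
```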
Challenge Website
Organizers