Invited Speakers
Challenges and Progress in Automatic Speech-to-Speech Translation: Bridging the Gap to Real-Time Interpretation
Satoshi Nakamura – Monday (Dec 2)
Professor, The Chinese University of Hong Kong, Shenzhen
Abstract: Automatic speech-to-speech translation has long been a dream technology for humanity. Through numerous breakthroughs from years of research, we have now reached the stage where this service can be used on smartphones. However, there are still many challenges remaining before we can achieve translation quality comparable to that of a trained simultaneous interpreter. Key issues include how to translate across languages with different word orders, such as between English and Japanese, without waiting for the end of a sentence or utterance; how to balance latency and content fidelity in translation; and how to extract the speaker’s intent from their intonation. Having conducted research on speech translation over an extended period, I would like to reflect on some of the past progress and introduce ongoing research addressing these challenges.
Biography: Dr. Satoshi Nakamura is a full professor at The Chinese University of Hong Kong, Shenzhen. He is also a professor emeritus at Nara Institute of Science and Technology (NAIST) and Honorarprofessor of Karlsruhe Institute of Technology, Germany. He received his B.S. from Kyoto Institute of Technology in 1981 and his Ph.D. from Kyoto University in 1992. He was an Associate Professor in the Graduate School of Information Science at NAIST from 1994 to 2000. He was Department Head (2000-2004) and Director (2005-2008) of ATR Spoken Language Communication Research Laboratories, and Vice President of ATR from 2007 to 2008. He was Director General of Keihanna Research Laboratories, National Institute of Information and Communications Technology, Japan, from 2009 to 2010. He was a full professor at NAIST from 2011 to 2024. His research interests include modeling and systems of spoken language processing, speech processing, spoken language translation, spoken dialog systems, natural language processing, and data science. He is one of the world leaders in speech-to-speech translation research. He has served in various speech-to-speech translation research projects, including C-Star, A-Star, and the International Workshop on Spoken Language Translation (IWSLT). He is currently the chairperson of ISCA SIG SLT (Spoken Language Translation). He also contributed to the standardization of network-based speech translation at ITU-T. He was a committee member of IEEE SLTC from 2016 to 2018. He was an Elected Board Member of the International Speech Communication Association (ISCA) from 2012 to 2019. He received the Antonio Zampolli Prize in 2012 and holds the titles of IEEE Fellow, ISCA Fellow, IPSJ Fellow, and ATR Fellow.
Holistic Artificial Intelligence (HAI): From Big Models to Big Applications
Junlan Feng – Tuesday (Dec 3)
Chief Scientist, China Mobile Research Institute
Abstract: Today, we have access to hundreds of large generative models and millions of smaller, task-specific models. Despite this, we currently lack a mechanism that allows for flexible dispatching and composition of multiple models to tackle the myriad of intelligent tasks we encounter in real-world scenarios.
In this talk, Dr. Feng will unveil a novel framework, “Holistic Artificial Intelligence” (HAI), designed to ease AI service development by orchestrating available AI models. With HAI, a user or client can articulate their intelligent service requirement in a multitude of natural ways, such as through natural language, speech, illustrative images, action sequences, or a combination of these methods. The central control unit of HAI, powered by foundation models, maps this requirement to an execution plan consisting of a specific model or a cascaded process of multiple models. Additionally, it aligns these models with the most suitable computing and network facilities for efficient deployment. This talk will discuss the four major technical challenges that underpin the HAI framework, related work in the community, and her team’s explorations into this exciting field.
Biography: Dr. Junlan Feng is an IEEE Fellow, Chief Scientist of China Mobile, Vice Chairman of the China Artificial Intelligence Industry Alliance, Vice Chair of Big Model Forum of CCF, and was Board Chair of Linux Foundation Network (2020-2023). Dr. Feng received her Ph.D. in Speech Recognition from the Chinese Academy of Sciences and was a principal researcher at AT&T Labs Research from 2001 to 2013. Dr. Feng joined China Mobile Research in 2013. Since then, she has been leading the AI R&D of China Mobile – Jiutian. She has published more than 170 technical research papers and holds 100+ patents. Jiutian products and AI models under her leadership have been deployed to 4000+ production services and have contributed a yearly commercial value of 4.1 billion Chinese Yuan. Dr. Feng is a frequent reviewer and program committee member for major AI conferences and journals.
Towards Safe, Truly Open, and Factual Large Language Models
Preslav Nakov – Wednesday (Dec 4)
Professor, Mohamed bin Zayed University of Artificial Intelligence, Abu Dhabi
Abstract: We will discuss several initiatives towards safe, truly open, and factual large language models (LLMs). First, we will present Do-Not-Answer, a dataset for evaluating the guardrails of LLMs, which is at the core of the safety mechanisms incorporated in Jais, the world’s leading open Arabic-centric foundation and instruction-tuned large language model, and Nanda, our recently released open Hindi LLM. Next, we will discuss the LLM360 initiative of MBZUAI’s Institute of Foundation Models, which aims to develop fully transparent open-source LLMs. We will then examine the factuality challenges associated with large language models, and we will present some recent relevant tools for addressing these challenges developed at MBZUAI: (i) OpenFactCheck, a framework for fact-checking LLM output, for building customized fact-checking systems, and for benchmarking LLMs for factuality, (ii) LM-Polygraph, a tool for predicting an LLM’s uncertainty in its output using cheap and fast uncertainty quantification techniques, and (iii) LLM-DetectAIve, a tool for machine-generated text detection. Finally, we will conclude with a perspective on multimodal LLMs and some policy recommendations.
Biography: Preslav Nakov is Professor and Department Chair for NLP at the Mohamed bin Zayed University of Artificial Intelligence. He is part of the core team at MBZUAI’s Institute of Foundation Models that developed Jais, the world’s best open-source Arabic-centric LLM, Nanda, the world’s best Hindi model, and LLM360, the first truly open LLM. Previously, he was Principal Scientist at the Qatar Computing Research Institute, HBKU, where he led the Tanbih mega-project, developed in collaboration with MIT, which aims to limit the impact of “fake news”, propaganda and media bias by making users aware of what they are reading, thus promoting media literacy and critical thinking. He received his PhD degree in Computer Science from the University of California at Berkeley, supported by a Fulbright grant. He is Chair-Elect of the European Chapter of the Association for Computational Linguistics (EACL), Secretary of ACL SIGSLAV, and Secretary of the Truth and Trust Online board of trustees. Formerly, he was PC chair of ACL 2022 and President of ACL SIGLEX. He is also a member of the editorial boards of several journals, including Computational Linguistics, TACL, ACM TOIS, IEEE TASL, IEEE TAC, CS&L, NLE, AI Communications, and Frontiers in AI. He authored a Morgan & Claypool book on Semantic Relations between Nominals, two books on computer algorithms, and 250+ research papers. He received a Best Paper Award at ACM WebSci’2022, a Best Long Paper Award at CIKM’2020, a Best Resource Paper Award at EACL’2024, a Best Demo Paper Award (Honorable Mention) at ACL’2020, a Best Task Paper Award (Honorable Mention) at SemEval’2020, a Best Poster Award at SocInfo’2019, and the Young Researcher Award at RANLP’2011. He was also the first to receive the Bulgarian President’s John Atanasoff award, named after the inventor of the first automatic electronic digital computer.
His research has been featured by over 100 news outlets, including Reuters, Forbes, Financial Times, CNN, Boston Globe, Aljazeera, DefenseOne, Business Insider, MIT Technology Review, Science Daily, Popular Science, Fast Company, The Register, WIRED, and Engadget.