CORPORATE TALK SERIES
Kristen M. Altenburger
(Research Scientist - Core Data Science, Facebook Meta)
Bio: Kristen M. Altenburger (she/her/hers) is a Research Scientist on the Networks & Behavior group within Meta’s Core Data Science team and is a Non-Resident Fellow with the RegLab at Stanford Law School.
Ross Cutler
(Partner Applied Scientist Manager, Microsoft, USA)
Bio: Ross Cutler is a Partner Applied Scientist Manager at Microsoft in the IC3 group, where he manages the IC3-AI team of applied scientists and software engineers. The team focuses on improving Teams/Skype audio/video quality and reliability and on enabling new functionality with AI. He has been with Microsoft since 2000, when he joined as a researcher in Microsoft Research. He has published 60+ academic papers and has 100+ granted patents in the areas of computer vision, speech enhancement, machine learning, optics, and acoustics. Ross received his Ph.D. in Computer Science (2000) in the area of computer vision from the University of Maryland, College Park.
Title for Talk: Developing machine-learning-based speech enhancement models for Microsoft Teams and Skype
Abstract: Microsoft Teams and Skype are used daily by hundreds of millions of users and have become critical tools for working remotely and communicating with friends and family. In this talk we will describe how we are replacing traditional digital signal processing (DSP) speech enhancement components in those products with machine learning (ML) based models. We will describe how we have replaced the echo canceller, noise suppressor, and packet loss concealment, and how we have added dereverberation. The new ML-based models significantly outperform the DSP components they replace, but how we developed them is even more interesting. We used a Software 2.0 development methodology and created the first large-scale datasets for training and testing these models, the first scalable systems to accurately label this type of data, and the first objective functions that are highly correlated with human perception to help train and evaluate these models. We also created 11 academic challenges at ICASSP and INTERSPEECH to engage with academic and industry researchers, which significantly accelerated the process and raised the state of the art in these areas. We are applying this process to other areas such as ML-based video codecs and ML-based bandwidth control.
|Full Paper Submission:|12th September 2022|
|Acceptance Notification:|26th September 2022|
|Final Paper Submission:|5th October 2022|
|Early Bird Registration:|3rd October 2022|
|Presentation Submission:|6th October 2022|
|Conference:|12th - 15th October 2022|
• Conference Proceedings will be submitted for publication in the IEEE Xplore® digital library.
• A Best Paper Award will be given for each track.
• Conference Record No. 53756