Zhaoheng Ni

I’m a senior research scientist at Meta Reality Labs working on generative models for audio and text. Previously, I was a maintainer of TorchAudio, the official audio library of PyTorch. Before Meta, I was a PhD student advised by Michael I Mandel and an undergraduate student advised by Yan Xu.

My research interests are single-channel and multi-channel speech enhancement, generative models, and natural language processing.

News

Apr 11, 2024 Our MMS paper has been accepted by the Journal of Machine Learning Research!
Feb 26, 2024 Check out the demo videos and paper of our FoleyGen model!
Dec 13, 2023 Five papers have been accepted by ICASSP 2024!
Sep 22, 2023 Our TorchAudio 2.1 paper has been accepted by ASRU 2023!
May 17, 2023 One paper has been accepted by Interspeech 2023!

Selected Publications

  1. TorchAudio: Building Blocks for Audio and Speech Processing
    Yao-Yuan Yang, Moto Hira, Zhaoheng Ni, Artyom Astafurov, Caroline Chen, Christian Puhrsch, David Pollack, Dmitriy Genzel, Donny Greenberg, Edward Z Yang, and others
    In ICASSP 2022-2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2022
  2. WPD++: An Improved Neural Beamformer for Simultaneous Speech Separation and Dereverberation
    Zhaoheng Ni, Yong Xu, Meng Yu, Bo Wu, Shixiong Zhang, Dong Yu, and Michael I Mandel
    In 2021 IEEE Spoken Language Technology Workshop (SLT), 2021
  3. TorchAudio-Squim: Reference-less Speech Quality and Intelligibility Measures in TorchAudio
    Anurag Kumar, Ke Tan, Zhaoheng Ni, Pranay Manocha, Xiaohui Zhang, Ethan Henderson, and Buye Xu
    In ICASSP 2023-2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2023
  4. Scaling Speech Technology to 1,000+ Languages
    Vineel Pratap, Andros Tjandra, Bowen Shi, Paden Tomasello, Arun Babu, Sayani Kundu, Ali Elkahky, Zhaoheng Ni, Apoorv Vyas, Maryam Fazel-Zarandi, and others
    arXiv preprint arXiv:2305.13516, 2023