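Statistical Signal Processing and the Acoustic Front End: Deep Learning Algorithms and Traditional Signal Processing Methods Each Have Their Strengths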
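Ten years ago, the acoustic front end (audio preprocessing) still relied mainly on traditional signal processing. Over a long period, researchers built a complete set of single-channel and multi-channel techniques covering speech enhancement, speech separation, echo cancellation, sound source localization, and beamforming, many of them grounded in optimal linear adaptive filtering theory. In the last few years, deep learning methods have been introduced into audio preprocessing and have outperformed traditional signal processing on several tasks (such as speech separation and enhancement), showing great potential. So far, however, we see that each approach has its own strengths. Their main differences are as follows: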
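Both approaches have room to improve. For example, Tencent AI Lab recently analyzed the assumptions and simplifications made by traditional signal processing methods and proposed a series of improvements that deliver better noise reduction and dereverberation than the original methods. We have also proposed optimization schemes that combine traditional signal processing with deep learning, drawing on the strengths of both while overcoming their respective weaknesses, and these have yielded meaningful progress. In addition, we have proposed a series of new methods for multimodal speech separation and noise reduction.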
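At the upcoming CHiME 2020 workshop (https://chimechallenge.github.io/chime2020-workshop/), we will present some of this progress. The related papers are listed below.
On speech separation/enhancement and training criteria: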
- Yong Xu, Meng Yu, Shi-Xiong Zhang, Lianwu Chen, Chao Weng, Dong Yu, "Neural Spatio-Temporal Filtering for Target Speech Separation", submitted to Interspeech 2020.
- Rongzhi Gu, Shi-Xiong Zhang, Lianwu Chen, Yong Xu, Meng Yu, Dan Su, Yuexian Zou, Dong Yu, "Enhancing End-To-End Multi-Channel Speech Separation via Spatial Feature Learning", ICASSP 2020.
- Yong Xu, Chao Weng, Like Hui, Jianming Liu, Meng Yu, Dan Su, Dong Yu, "Joint Training of Complex Ratio Mask Based Beamformer And Acoustic Model for Noise Robust ASR", ICASSP 2019.
- Rongzhi Gu, Jian Wu, Shi-Xiong Zhang, Lianwu Chen, Yong Xu, Meng Yu, Dan Su, Yuexian Zou, Dong Yu, "End-to-end multi-channel speech separation", arXiv preprint arXiv:1905.06286, 2019.
On multimodal diarization and speech separation/extraction/recognition:
- Rongzhi Gu, Shixiong Zhang, Yong Xu, Lianwu Chen, Yuexian Zou, Dong Yu, "Multi-modal Multi-channel Target Speech Separation", IEEE Journal of Selected Topics in Signal Processing, 2020.
- Ke Tan, Yong Xu, Shixiong Zhang, Meng Yu, Dong Yu, "Audio-Visual Speech Separation and Dereverberation with a Two-Stage Multimodal Network", IEEE Journal of Selected Topics in Signal Processing, 2020.
- Jianwei Yu, Shixiong Zhang, Jian Wu, Shahram Ghorbani, Bo Wu, Shiyin Kang, Shansong Liu, Xunying Liu, Helen Meng, Dong Yu, "Audio-Visual Recognition of Overlapped Speech for the LRS2 Dataset", ICASSP 2020.
- Yifan Ding, Yong Xu, Shi-Xiong Zhang, Yahuan Cong, Liqiang Wang, "Self-supervised learning for audio-visual speaker diarization", ICASSP 2020.
- Jian Wu, Yong Xu, Shi-Xiong Zhang, Lianwu Chen, Meng Yu, Lei Xie, Dong Yu, "Time Domain Audio Visual Speech Separation", ASRU 2019.
On blind separation and audio-only target speech extraction:
- Meng Yu, Xuan Ji, Bo Wu, Dan Su, Dong Yu, "End-to-End Multi-Look Keyword Spotting", submitted to Interspeech 2020.
- Xuan Ji, Meng Yu, Jie Chen, Jimeng Zheng, Dan Su, Dong Yu, "Integration of Multi-Look Beamformers for Multi-Channel Keyword Spotting", ICASSP 2020.
- Max W. Y. Lam, Jun Wang, Dan Su, Dong Yu, "Mixup-Breakdown: A Consistency Training Method for Improving Generalization of Speech Separation Models", ICASSP 2020.
- Xuan Ji, Meng Yu, Chunlei Zhang, Dan Su, Tao Yu, Xiaoyu Liu, Dong Yu, "Speaker-Aware Target Speaker Enhancement by Jointly Learning with Speaker Embedding Extraction", ICASSP 2020.
- Aswin Shanmugam Subramanian, Chao Weng, Meng Yu, Shi-Xiong Zhang, Yong Xu, Shinji Watanabe, Dong Yu, "Far-Field Location Guided Target Speech Extraction Using End-To-End Speech Recognition Objectives", ICASSP 2020.
- Fahimeh Bahmaninezhad, Shi-Xiong Zhang, Yong Xu, Meng Yu, John HL Hansen, Dong Yu, "A Unified Framework for Speech Separation", in submission to Speech Communications, 2019.
- Rongzhi Gu, Lianwu Chen, Shixiong Zhang, Jimeng Zheng, Meng Yu, Yong Xu, Dan Su, Yuexian Zou, Dong Yu, "Neural Spatial Filter: Target Speaker Speech Separation Assisted with Directional Information", Interspeech 2019.
- Fahimeh Bahmaninezhad, Jian Wu, Rongzhi Gu, Shi-Xiong Zhang, Yong Xu, Meng Yu, Dong Yu, "A comprehensive study of speech separation: spectrogram vs waveform separation", Interspeech 2019.
- Meng Yu, Xuan Ji, Yi Gao, Lianwu Chen, Jie Chen, Jimeng Zheng, Dan Su, Dong Yu, "Text-Dependent Speech Enhancement for Small-Footprint Robust Keyword Detection", Interspeech 2018.
- Jun Wang, Jie Chen, Dan Su, Lianwu Chen, Meng Yu, Yanmin Qian, Dong Yu, "Deep Extractor Network for Target Speaker Recovery From Single Channel Speech Mixtures", Interspeech 2018.