ByteSing: A Chinese singing voice synthesis system using duration allocated encoder-decoder acoustic models and WaveRNN vocoders Y Gu, X Yin, Y Rao, Y Wan, B Tang, Y Zhang, J Chen, Y Wang, Z Ma 2021 12th International Symposium on Chinese Spoken Language Processing …, 2021 | 86 | 2021 |
PPG-based singing voice conversion with adversarial representation learning Z Li, B Tang, X Yin, Y Wan, L Xu, C Shen, Z Ma ICASSP 2021-2021 IEEE International Conference on Acoustics, Speech and …, 2021 | 39 | 2021 |
Towards realistic visual dubbing with heterogeneous sources T Xie, L Liao, C Bi, B Tang, X Yin, J Yang, M Wang, J Yao, Y Zhang, Z Ma Proceedings of the 29th ACM International Conference on Multimedia, 1739-1747, 2021 | 33 | 2021 |
Towards high-fidelity singing voice conversion with acoustic reference and contrastive predictive coding C Wang, Z Li, B Tang, X Yin, Y Wan, Y Yu, Z Ma arXiv preprint arXiv:2110.04754, 2021 | 17 | 2021 |
Improving accent conversion with reference encoder and end-to-end text-to-speech W Li, B Tang, X Yin, Y Zhao, W Li, K Wang, H Huang, Y Wang, Z Ma arXiv preprint arXiv:2005.09271, 2020 | 13 | 2020 |
Pedestrian intrusion detection based on improved GMM and SVM M Zhang, JS Jin, M Wang, B Tang, Y Zheng 2016 13th International Computer Conference on Wavelet Active Media …, 2016 | 6 | 2016 |
Application of pronunciation knowledge on phoneme recognition by LSTM neural network B Zhang, Y Gan, Y Song, B Tang 2016 23rd International Conference on Pattern Recognition (ICPR), 2906-2911, 2016 | 5 | 2016 |
TranssionADD: A multi-frame reinforcement based sequence tagging model for audio deepfake detection J Liu, Z Su, H Huang, C Wan, Q Wang, J Hong, B Tang, F Zhu arXiv preprint arXiv:2306.15212, 2023 | 3 | 2023 |
Prior-agnostic Multi-scale Contrastive Text-Audio Pre-training for Parallelized TTS Frontend Modeling Q Wang, H Huang, M Wang, Y Dai, J Zhong, B Tang arXiv preprint arXiv:2404.09192, 2024 | | 2024 |
Multi-Modal Automatic Prosody Annotation with Contrastive Pretraining of SSWP J Zhong, Y Li, H Huang, K Richmond, J Liu, Z Su, J Guo, B Tang, F Zhu arXiv preprint arXiv:2309.05423, 2023 | | 2023 |
CPNet: Exploiting CLIP-based Attention Condenser and Probability Map Guidance for High-fidelity Talking Face Generation J Xu, B Tang, M Wang, M Li, M Ma 2023 IEEE International Conference on Multimedia and Expo (ICME), Brisbane …, 2023 | | 2023 |
Towards Using Clothes Style Transfer for Scenario-Aware Person Video Generation J Xu, B Tang, M Wang, S Bian, W Guo, X Yin, Z Ma ICASSP 2022-2022 IEEE International Conference on Acoustics, Speech and …, 2022 | | 2022 |