Yu Gu
Title
Cited by
Year
ByteSing: A Chinese singing voice synthesis system using duration allocated encoder-decoder acoustic models and WaveRNN vocoders
Y Gu, X Yin, Y Rao, Y Wan, B Tang, Y Zhang, J Chen, Y Wang, Z Ma
2021 12th International Symposium on Chinese Spoken Language Processing …, 2021
Cited by 86, 2021
Waveform modeling and generation using hierarchical recurrent neural networks for speech bandwidth extension
ZH Ling, Y Ai, Y Gu, LR Dai
IEEE/ACM Transactions on Audio, Speech, and Language Processing 26 (5), 883-894, 2018
Cited by 81, 2018
Speech bandwidth extension using bottleneck features and deep recurrent neural networks
Y Gu, ZH Ling, LR Dai
Interspeech, 297-301, 2016
Cited by 57, 2016
A Kinect based gesture recognition algorithm using GMM and HMM
Y Song, Y Gu, P Wang, Y Liu, A Li
2013 6th International Conference on Biomedical Engineering and Informatics …, 2013
Cited by 34, 2013
Waveform Modeling Using Stacked Dilated Convolutional Neural Networks for Speech Bandwidth Extension
Y Gu, ZH Ling
INTERSPEECH, 1123-1127, 2017
Cited by 29, 2017
Human action recognition based on depth images from Microsoft Kinect
T Liu, Y Song, Y Gu, A Li
2013 Fourth Global Congress on Intelligent Systems, 200-204, 2013
Cited by 29, 2013
Multi-task WaveNet: A multi-task generative model for statistical parametric speech synthesis without fundamental frequency conditions
Y Gu, Y Kang
arXiv preprint arXiv:1806.08619, 2018
Cited by 23, 2018
Restoring high frequency spectral envelopes using neural networks for speech bandwidth extension
Y Gu, ZH Ling
2015 International Joint Conference on Neural Networks (IJCNN), 1-8, 2015
Cited by 11, 2015
Video-to-audio generation with hidden alignment
M Xu, C Li, Y Ren, R Chen, Y Gu, W Liang, D Yu
arXiv preprint arXiv:2407.07464, 2024
Cited by 5, 2024
Multichannel AV-wav2vec2: A Framework for Learning Multichannel Multi-Modal Speech Representation
Q Zhu, J Zhang, Y Gu, Y Hu, L Dai
Proceedings of the AAAI Conference on Artificial Intelligence 38 (17), 19768 …, 2024
Cited by 5, 2024
Rep2wav: Noise robust text-to-speech using self-supervised representations
Q Zhu, Y Gu, R Chen, C Weng, Y Hu, L Dai, J Zhang
arXiv preprint arXiv:2308.14553, 2023
Cited by 4, 2023
LDM-SVC: Latent Diffusion Model Based Zero-Shot Any-to-Any Singing Voice Conversion with Singer Guidance
S Chen, Y Gu, J Zhang, N Li, R Chen, L Chen, L Dai
arXiv preprint arXiv:2406.05325, 2024
Cited by 3, 2024
Sifisinger: A High-Fidelity End-to-End Singing Voice Synthesizer Based on Source-Filter Model
J Cui, Y Gu, C Weng, J Zhang, L Chen, L Dai
ICASSP 2024-2024 IEEE International Conference on Acoustics, Speech and …, 2024
Cited by 2, 2024
DurIAN-E: Duration informed attention network for expressive text-to-speech synthesis
Y Gu, Y Bian, G Lei, C Weng, D Su
arXiv preprint arXiv:2309.12792, 2023
Cited by 2, 2023
Speech vocoder based on deep convolutional neural networks
HC Wu, Y Gu, ZH Ling
Proc. of the 14th National Conference on Man-Machine Speech Communication …, 2017
Cited by 2*, 2017
LCM-SVC: Latent Diffusion Model Based Singing Voice Conversion with Inference Acceleration via Latent Consistency Distillation
S Chen, Y Gu, J Cui, J Zhang, R Chen, L Dai
arXiv preprint arXiv:2408.12354, 2024
Cited by 1, 2024
Eeg2vec: Self-Supervised Electroencephalographic Representation Learning
Q Zhu, X Zhao, J Zhang, Y Gu, C Weng, Y Hu
arXiv preprint arXiv:2305.13957, 2023
Cited by 1, 2023
Video-to-Audio Generation with Fine-grained Temporal Semantics
Y Hu, Y Gu, C Li, R Chen, D Yu
arXiv preprint arXiv:2409.14709, 2024
2024
STA-V2A: Video-to-Audio Generation with Semantic and Temporal Alignment
Y Ren, C Li, M Xu, W Liang, Y Gu, R Chen, D Yu
arXiv preprint arXiv:2409.08601, 2024
2024
Opine: Leveraging a Optimization-Inspired Deep Unfolding Method for Multi-Channel Speech Enhancement
A Li, R Chen, Y Gu, C Weng, D Su
ICASSP 2024-2024 IEEE International Conference on Acoustics, Speech and …, 2024
2024