Haohan Guo
Title
Cited by
Year
A new GAN-based end-to-end TTS training algorithm
H Guo, FK Soong, L He, L Xie
INTERSPEECH, 2019
Cited by: 58 | Year: 2019
Conversational end-to-end TTS for voice agents
H Guo, S Zhang, FK Soong, L He, L Xie
2021 IEEE Spoken Language Technology Workshop (SLT), 403-409, 2021
Cited by: 57 | Year: 2021
Exploiting syntactic features in a parsed tree to improve end-to-end TTS
H Guo, FK Soong, L He, L Xie
INTERSPEECH, 2019
Cited by: 37 | Year: 2019
Improving adversarial waveform generation based singing voice conversion with harmonic signals
H Guo, Z Zhou, F Meng, K Liu
ICASSP 2022-2022 IEEE International Conference on Acoustics, Speech and …, 2022
Cited by: 14 | Year: 2022
Feature reinforcement with word embedding and parsing information in neural TTS
H Ming, L He, H Guo, FK Soong
arXiv preprint arXiv:1901.00707, 2019
Cited by: 14 | Year: 2019
Phonetic posteriorgrams based many-to-many singing voice conversion via adversarial training
H Guo, H Lu, N Hu, C Zhang, S Yang, L Xie, D Su, D Yu
arXiv preprint arXiv:2012.01837, 2020
Cited by: 10 | Year: 2020
A multi-stage multi-codebook VQ-VAE approach to high-performance neural TTS
H Guo, F Xie, FK Soong, X Wu, H Meng
arXiv preprint arXiv:2209.10887, 2022
Cited by: 7 | Year: 2022
MSMC-TTS: Multi-stage multi-codebook VQ-VAE based neural TTS
H Guo, F Xie, X Wu, FK Soong, H Meng
IEEE/ACM Transactions on Audio, Speech, and Language Processing 31, 1811-1824, 2023
Cited by: 6 | Year: 2023
A multi-scale time-frequency spectrogram discriminator for GAN-based non-autoregressive TTS
H Guo, H Lu, X Wu, H Meng
arXiv preprint arXiv:2203.01080, 2022
Cited by: 5 | Year: 2022
BASE TTS: Lessons from building a billion-parameter text-to-speech model on 100K hours of data
M Łajszczak, G Cámbara, Y Li, F Beyhan, A van Korlaar, F Yang, A Joly, ...
arXiv preprint arXiv:2402.08093, 2024
Cited by: 2 | Year: 2024
QS-TTS: towards semi-supervised text-to-speech synthesis via vector-quantized self-supervised speech representation learning
H Guo, F Xie, J Kang, Y Xiao, X Wu, H Meng
arXiv preprint arXiv:2309.00126, 2023
Cited by: 1 | Year: 2023
Towards High-Quality Neural TTS for Low-Resource Languages by Learning Compact Speech Representations
H Guo, F Xie, X Wu, H Lu, H Meng
arXiv preprint arXiv:2210.15131, 2022
Cited by: 1 | Year: 2022
Unifying One-Shot Voice Conversion and Cloning with Disentangled Speech Representations
H Lu, X Wu, H Guo, S Liu, Z Wu, H Meng
ICASSP 2024-2024 IEEE International Conference on Acoustics, Speech and …, 2024
Year: 2024
Cross-Speaker Encoding Network for Multi-Talker Speech Recognition
J Kang, L Meng, M Cui, H Guo, X Wu, X Liu, H Meng
arXiv preprint arXiv:2401.04152, 2024
Year: 2024