Xiang Yin
Bytedance AI Lab
Verified email at bytedance.com
Title
Cited by
Year
Make-an-audio: Text-to-audio generation with prompt-enhanced diffusion models
R Huang, J Huang, D Yang, Y Ren, L Liu, M Li, Z Ye, J Liu, X Yin, Z Zhao
International Conference on Machine Learning, 13916-13932, 2023
Cited by: 217; Year: 2023
ByteSing: A Chinese singing voice synthesis system using duration allocated encoder-decoder acoustic models and WaveRNN vocoders
Y Gu, X Yin, Y Rao, Y Wan, B Tang, Y Zhang, J Chen, Y Wang, Z Ma
2021 12th International Symposium on Chinese Spoken Language Processing …, 2021
Cited by: 86; Year: 2021
Mega-tts: Zero-shot text-to-speech at scale with intrinsic inductive bias
Z Jiang, Y Ren, Z Ye, J Liu, C Zhang, Q Yang, S Ji, R Huang, C Wang, ...
arXiv preprint arXiv:2306.03509, 2023
Cited by: 50; Year: 2023
PPG-based singing voice conversion with adversarial representation learning
Z Li, B Tang, X Yin, Y Wan, L Xu, C Shen, Z Ma
ICASSP 2021-2021 IEEE International Conference on Acoustics, Speech and …, 2021
Cited by: 39; Year: 2021
A unified sequence-to-sequence front-end model for Mandarin text-to-speech synthesis
J Pan, X Yin, Z Zhang, S Liu, Y Zhang, Z Ma, Y Wang
ICASSP 2020-2020 IEEE International Conference on Acoustics, Speech and …, 2020
Cited by: 34; Year: 2020
Make-an-audio 2: Temporal-enhanced text-to-audio generation
J Huang, Y Ren, R Huang, D Yang, Z Ye, C Zhang, J Liu, X Yin, Z Ma, ...
arXiv preprint arXiv:2305.18474, 2023
Cited by: 33; Year: 2023
Towards realistic visual dubbing with heterogeneous sources
T Xie, L Liao, C Bi, B Tang, X Yin, J Yang, M Wang, J Yao, Y Zhang, Z Ma
Proceedings of the 29th ACM International Conference on Multimedia, 1739-1747, 2021
Cited by: 33; Year: 2021
Modeling F0 trajectories in hierarchically structured deep neural networks
X Yin, M Lei, Y Qian, FK Soong, L He, ZH Ling, LR Dai
Speech Communication 76, 82-92, 2016
Cited by: 32; Year: 2016
A hybrid text normalization system using multi-head self-attention for Mandarin
J Zhang, J Pan, X Yin, C Li, S Liu, Y Zhang, Y Wang, Z Ma
ICASSP 2020-2020 IEEE International Conference on Acoustics, Speech and …, 2020
Cited by: 25; Year: 2020
Clinical efficacy of bone cement-injectable cannulated pedicle screw short segment fixation for lumbar spondylolisthesis with osteoporosise
Y Liu, J Xiao, X Yin, M Liu, J Zhao, P Liu, F Dai
Scientific reports 10 (1), 3929, 2020
Cited by: 25; Year: 2020
Mega-tts 2: Zero-shot text-to-speech with arbitrary length speech prompts
Z Jiang, J Liu, Y Ren, J He, C Zhang, Z Ye, P Wei, C Wang, X Yin, Z Ma, ...
arXiv preprint arXiv:2307.07218, 2023
Cited by: 21; Year: 2023
Cross-speaker emotion transfer based on speaker condition layer normalization and semi-supervised training in text-to-speech
P Wu, J Pan, C Xu, J Zhang, L Wu, X Yin, Z Ma
arXiv preprint arXiv:2110.04153, 2021
Cited by: 19; Year: 2021
Geneface++: Generalized and stable real-time audio-driven 3d talking face generation
Z Ye, J He, Z Jiang, R Huang, J Huang, J Liu, Y Ren, X Yin, Z Ma, Z Zhao
arXiv preprint arXiv:2305.00787, 2023
Cited by: 18; Year: 2023
Biomechanical influence of the surgical approaches, implant length and density in stabilizing ankylosing spondylitis cervical spine fracture
Y Liu, Z Wang, M Liu, X Yin, J Liu, J Zhao, P Liu
Scientific reports 11 (1), 6023, 2021
Cited by: 18; Year: 2021
Towards high-fidelity singing voice conversion with acoustic reference and contrastive predictive coding
C Wang, Z Li, B Tang, X Yin, Y Wan, Y Yu, Z Ma
arXiv preprint arXiv:2110.04754, 2021
Cited by: 17; Year: 2021
Virtual try-on with pose-garment keypoints guided inpainting
Z Li, P Wei, X Yin, Z Ma, AC Kot
Proceedings of the IEEE/CVF International Conference on Computer Vision …, 2023
Cited by: 16; Year: 2023
Fine-Grained Prosody Modeling in Neural Speech Synthesis Using ToBI Representation
Y Zou, S Liu, X Yin, H Lin, C Wang, H Zhang, Z Ma
Interspeech, 3146-3150, 2021
Cited by: 16; Year: 2021
Clapspeech: Learning prosody from text context with contrastive language-audio pre-training
Z Ye, R Huang, Y Ren, Z Jiang, J Liu, J He, X Yin, Z Zhao
arXiv preprint arXiv:2305.10763, 2023
Cited by: 15; Year: 2023
A chapter-wise understanding system for text-to-speech in Chinese novels
J Pan, L Wu, X Yin, P Wu, C Xu, Z Ma
ICASSP 2021-2021 IEEE International Conference on Acoustics, Speech and …, 2021
Cited by: 14; Year: 2021
Real3d-portrait: One-shot realistic 3d talking portrait synthesis
Z Ye, T Zhong, Y Ren, J Yang, W Li, J Huang, Z Jiang, J He, R Huang, ...
arXiv preprint arXiv:2401.08503, 2024
Cited by: 13; Year: 2024