Yang Zhao
Title
Cited by
Year
Where does it exist: Spatio-temporal video grounding for multi-form sentences
Z Zhang, Z Zhao, Y Zhao, Q Wang, H Liu, L Gao
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2020
Cited by: 93, Year: 2020
Discriminative and Correlative Partial Multi-Label Learning
H Wang, W Liu, Y Zhao, C Zhang, T Hu, G Chen
IJCAI, 3691-3697, 2019
Cited by: 83, Year: 2019
Cascaded prediction network via segment tree for temporal video grounding
Y Zhao, Z Zhao, Z Zhang, Z Lin
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2021
Cited by: 72, Year: 2021
BuboGPT: Enabling visual grounding in multi-modal LLMs
Y Zhao, Z Lin, D Zhou, Z Huang, J Feng, B Kang
arXiv preprint arXiv:2307.08581, 2023
Cited by: 48, Year: 2023
Chat-3D: Data-efficiently tuning large language model for universal dialogue of 3D scenes
Z Wang, H Huang, Y Zhao, Z Zhang, Z Zhao
arXiv preprint arXiv:2308.08769, 2023
Cited by: 12, Year: 2023
Learning from multi-dimensional partial labels
H Wang, W Liu, Y Zhao, T Hu, K Chen, G Chen
Proceedings of the Twenty-Ninth International Conference on International …, 2021
Cited by: 11, Year: 2021
Connecting Multi-modal Contrastive Representations
Z Wang, Y Zhao, X Cheng, H Huang, J Liu, L Tang, L Li, Y Wang, A Yin, ...
arXiv preprint arXiv:2305.14381, 2023
Cited by: 7, Year: 2023
Video-Guided Curriculum Learning for Spoken Video Grounding
Y Xia, Z Zhao, S Ye, Y Zhao, H Li, Y Ren
Proceedings of the 30th ACM International Conference on Multimedia, 5191-5200, 2022
Cited by: 6, Year: 2022
Scene-robust natural language video localization via learning domain-invariant representations
Z Wang, Y Zhao, H Huang, Y Xia, Z Zhao
Findings of the Association for Computational Linguistics: ACL 2023, 144-160, 2023
Cited by: 5, Year: 2023
Distilling coarse-to-fine semantic matching knowledge for weakly supervised 3D visual grounding
Z Wang, H Huang, Y Zhao, L Li, X Cheng, Y Zhu, A Yin, Z Zhao
Proceedings of the IEEE/CVF International Conference on Computer Vision …, 2023
Cited by: 5, Year: 2023
Towards effective multi-modal interchanges in zero-resource sounding object localization
Y Zhao, C Zhang, H Huang, H Li, Z Zhao
Advances in Neural Information Processing Systems 35, 38089-38102, 2022
Cited by: 5, Year: 2022
3DRP-Net: 3D relative position-aware network for 3D visual grounding
Z Wang, H Huang, Y Zhao, L Li, X Cheng, Y Zhu, A Yin, Z Zhao
arXiv preprint arXiv:2307.13363, 2023
Cited by: 4, Year: 2023
DATE: Domain Adaptive Product Seeker for E-commerce
H Li, H Jiang, T Jin, M Li, Y Chen, Z Lin, Y Zhao, Z Zhao
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2023
Cited by: 4, Year: 2023
Extending Multi-modal Contrastive Representations
Z Wang, Z Zhang, L Liu, Y Zhao, H Huang, T Jin, Z Zhao
arXiv preprint arXiv:2310.08884, 2023
Cited by: 1, Year: 2023
AntPivot: Livestream Highlight Detection via Hierarchical Attention Mechanism
Y Zhao, X Lin, W Xu, M Zheng, Z Liu, Z Zhao
arXiv preprint arXiv:2206.04888, 2022
Cited by: 1, Year: 2022
Multi-Modal Domain Adaptation Across Video Scenes for Temporal Video Grounding
H Huang, Y Zhao, Z Wang, Y Xia, Z Zhao
arXiv preprint arXiv:2312.13633, 2023
2023
Chat-3D v2: Bridging 3D Scene and Large Language Models with Object Identifiers
H Huang, Z Wang, R Huang, L Liu, X Cheng, Y Zhao, T Jin, Z Zhao
arXiv preprint arXiv:2312.08168, 2023
2023
AntCritic: Argument Mining for Free-Form and Visually-Rich Financial Comments
Y Zhao, W Xu, X Lin, J Huo, H Chen, Z Zhao
arXiv preprint arXiv:2208.09612, 2022
2022