Hao Tan
Adobe Research
Verified email at adobe.com
Title
Cited by
Year
LXMERT: Learning cross-modality encoder representations from transformers
H Tan, M Bansal
Proceedings of the 2019 Conference on Empirical Methods in Natural Language …, 2019
Cited by: 2267; Year: 2019
Unifying vision-and-language tasks via text generation
J Cho, J Lei, H Tan, M Bansal
International Conference on Machine Learning, 1931-1942, 2021
Cited by: 430; Year: 2021
How much can CLIP benefit vision-and-language tasks?
S Shen, LH Li, H Tan, M Bansal, A Rohrbach, KW Chang, Z Yao, ...
arXiv preprint arXiv:2107.06383, 2021
Cited by: 339; Year: 2021
Learning to navigate unseen environments: Back translation with environmental dropout
H Tan, L Yu, M Bansal
arXiv preprint arXiv:1904.04195, 2019
Cited by: 289; Year: 2019
A joint speaker-listener-reinforcer model for referring expressions
L Yu, H Tan, M Bansal, TL Berg
Proceedings of the IEEE conference on computer vision and pattern …, 2017
Cited by: 284; Year: 2017
Vokenization: Improving language understanding with contextualized, visual-grounded supervision
H Tan, M Bansal
arXiv preprint arXiv:2010.06775, 2020
Cited by: 116; Year: 2020
VIMPAC: Video pre-training via masked token prediction and contrastive learning
H Tan, J Lei, T Wolf, M Bansal
arXiv preprint arXiv:2106.11250, 2021
Cited by: 58; Year: 2021
EnvEdit: Environment editing for vision-and-language navigation
J Li, H Tan, M Bansal
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2022
Cited by: 56; Year: 2022
Enabling robots to understand incomplete natural language instructions using commonsense reasoning
H Chen, H Tan, A Kuntz, M Bansal, R Alterovitz
2020 IEEE International Conference on Robotics and Automation (ICRA), 1963-1969, 2020
Cited by: 50; Year: 2020
Diagnosing the environment bias in vision-and-language navigation
Y Zhang, H Tan, M Bansal
arXiv preprint arXiv:2005.03086, 2020
Cited by: 48; Year: 2020
LRM: Large reconstruction model for single image to 3D
Y Hong, K Zhang, J Gu, S Bi, Y Zhou, D Liu, F Liu, K Sunkavalli, T Bui, ...
arXiv preprint arXiv:2311.04400, 2023
Cited by: 43; Year: 2023
The curse of performance instability in analysis datasets: Consequences, source, and suggestions
X Zhou, Y Nie, H Tan, M Bansal
arXiv preprint arXiv:2004.13606, 2020
Cited by: 37; Year: 2020
Expressing visual relationships via language
H Tan, F Dernoncourt, Z Lin, T Bui, M Bansal
arXiv preprint arXiv:1906.07689, 2019
Cited by: 37; Year: 2019
An Effective Framework for Weakly-Supervised Phrase Grounding
Q Wang, H Tan, S Shen, M Mahoney, Z Yao
Proceedings of the 2020 Conference on Empirical Methods in Natural Language …, 2020
Cited by: 33*; Year: 2020
Instant3D: Fast text-to-3D with sparse-view generation and large reconstruction model
J Li, H Tan, K Zhang, Z Xu, F Luan, Y Xu, Y Hong, K Sunkavalli, ...
arXiv preprint arXiv:2311.06214, 2023
Cited by: 32; Year: 2023
Improving cross-modal alignment in vision language navigation via syntactic information
J Li, H Tan, M Bansal
arXiv preprint arXiv:2104.09580, 2021
Cited by: 30; Year: 2021
VidLanKD: Improving language understanding via video-distilled knowledge transfer
Z Tang, J Cho, H Tan, M Bansal
Advances in Neural Information Processing Systems 34, 24468-24481, 2021
Cited by: 25; Year: 2021
DMV3D: Denoising multi-view diffusion using 3D large reconstruction model
Y Xu, H Tan, F Luan, S Bi, P Wang, J Li, Z Shi, K Sunkavalli, G Wetzstein, ...
arXiv preprint arXiv:2311.09217, 2023
Cited by: 22; Year: 2023
Modality-balanced models for visual dialogue
H Kim, H Tan, M Bansal
Proceedings of the AAAI Conference on Artificial Intelligence 34 (05), 8091-8098, 2020
Cited by: 21; Year: 2020
DocumentCLIP: Linking figures and main body text in reflowed documents
F Liu, H Tan, C Tensmeyer
arXiv preprint arXiv:2306.06306, 2023
Cited by: 17; Year: 2023