RobustLR: A diagnostic benchmark for evaluating logical robustness of deductive reasoners S Sanyal, Z Liao, X Ren Proceedings of the 2022 Conference on Empirical Methods in Natural Language …, 2022 | 16* | 2022 |
ChatCounselor: A large language models for mental health support JM Liu, D Li, H Cao, T Ren, Z Liao, J Wu arXiv preprint arXiv:2309.15461, 2023 | 13 | 2023 |
In Search of the Long-Tail: Systematic Generation of Long-Tail Knowledge via Logical Rule Guided Search H Li, Y Ning, Z Liao, S Wang, XL Li, X Lu, F Brahman, W Zhao, Y Choi, ... arXiv preprint arXiv:2311.07237, 2023 | 1 | 2023 |
Introducing v0.5 of the AI safety benchmark from MLCommons B Vidgen, A Agrawal, AM Ahmed, V Akinwande, N Al-Nuaimi, N Alfaraj, ... arXiv preprint arXiv:2404.12241, 2024 | | 2024 |
AmpleGCG: Learning a Universal and Transferable Generative Model of Adversarial Suffixes for Jailbreaking Both Open and Closed LLMs Z Liao, H Sun arXiv preprint arXiv:2404.07921, 2024 | | 2024 |
AttributionBench: How Hard is Automatic Attribution Evaluation? Y Li, X Yue, Z Liao, H Sun arXiv preprint arXiv:2402.15089, 2024 | | 2024 |
A Trembling House of Cards? Mapping Adversarial Attacks against Language Agents L Mo, Z Liao, B Zheng, Y Su, C Xiao, H Sun arXiv preprint arXiv:2402.10196, 2024 | | 2024 |
In Search of the Long-Tail: Systematic Generation of Long-Tail Knowledge via Logical Rule Induced Search H Li, Z Liao, Y Ning, S Wang, XL Li, X Lu, F Brahman, W Zhao, Y Choi, ... | | |