‪Zhexin Zhang‬ - ‪Google Scholar‬

Cited by

	All	Since 2019
Citations	230	230
h-index	7	7
i10-index	7	7

Citations per year
Year	2021	2022	2023	2024
Citations	3	12	104	111

Public access (based on funding mandates)

1 article available
0 articles not available

Zhexin Zhang

Tsinghua University

Verified email at mails.tsinghua.edu.cn


Title	Cited by	Year
Safety assessment of Chinese large language models H Sun, Z Zhang, J Deng, J Cheng, M Huang arXiv preprint arXiv:2304.10436, 2023	51	2023
OpenMEVA: A Benchmark for Evaluating Open-ended Story Generation Metrics J Guan, Z Zhang, Z Feng, Z Liu, W Ding, X Mao, C Fan, M Huang ACL 2021, 2021	39	2021
SafetyBench: Evaluating the safety of large language models with multiple choice questions Z Zhang, L Lei, L Wu, R Sun, Y Huang, C Long, X Liu, X Lei, J Tang, ... arXiv preprint arXiv:2309.07045, 2023	35	2023
Recent advances towards safe, responsible, and moral dialogue systems: A survey J Deng, H Sun, Z Zhang, J Cheng, M Huang arXiv preprint arXiv:2302.09270 1, 2023	26	2023
Defending large language models against jailbreaking attacks through goal prioritization Z Zhang, J Yang, P Ke, M Huang arXiv preprint arXiv:2311.09096, 2023	22	2023
Unveiling the implicit toxicity in large language models J Wen, P Ke, H Sun, Z Zhang, C Li, J Bai, M Huang arXiv preprint arXiv:2311.17391, 2023	13	2023
Persona-Guided Planning for Controlling the Protagonist's Persona in Story Generation Z Zhang, J Wen, J Guan, M Huang NAACL 2022, 2022	13	2022
Ethicist: Targeted training data extraction through loss smoothed soft prompting and calibrated confidence estimation Z Zhang, J Wen, M Huang arXiv preprint arXiv:2307.04401, 2023	7	2023
MoralDial: A framework to train and evaluate moral dialogue systems via moral discussions H Sun, Z Zhang, F Mi, Y Wang, W Liu, J Cui, B Wang, Q Liu, M Huang arXiv preprint arXiv:2212.10720, 2022	7	2022
Constructing Highly Inductive Contexts for Dialogue Safety through Controllable Reverse Generation Z Zhang, J Cheng, H Sun, J Deng, F Mi, Y Wang, L Shang, M Huang EMNLP 2022 Findings, 2022	6	2022
Automatic comment generation for Chinese student narrative essays Z Zhang, J Guan, G Xu, Y Tian, M Huang Proceedings of the 2022 Conference on Empirical Methods in Natural Language …, 2022	4	2022
ShieldLM: Empowering LLMs as Aligned, Customizable and Explainable Safety Detectors Z Zhang, Y Lu, J Ma, D Zhang, R Li, P Ke, H Sun, L Sha, Z Sui, H Wang, ... arXiv preprint arXiv:2402.16444, 2024	2	2024
InstructSafety: A Unified Framework for Building Multidimensional and Explainable Safety Detector through Instruction Tuning Z Zhang, J Cheng, H Sun, J Deng, M Huang Findings of the Association for Computational Linguistics: EMNLP 2023, 10421 …, 2023	2	2023
Enhancing Offensive Language Detection with Data Augmentation and Knowledge Distillation J Deng, Z Chen, H Sun, Z Zhang, J Wu, S Nakagawa, F Ren, M Huang Research 6, 0189, 2023	1	2023
Selecting Stickers in Open-Domain Dialogue through Multitask Learning Z Zhang, Y Zhu, Z Fei, J Zhang, J Zhou ACL 2022 Findings, 2022	1	2022
MoralDial: A framework to train and evaluate moral dialogue systems via constructing moral discussions H Sun, Z Zhang, F Mi, Y Wang, W Liu, J Cui, B Wang, Q Liu, M Huang arXiv preprint arXiv:2212.10720, 2022	1	2022
Self-Supervised Sentence Polishing by Adding Engaging Modifiers Z Zhang, J Guan, X Cui, Y Ran, B Liu, M Huang Proceedings of the 61st Annual Meeting of the Association for Computational …, 2023		2023

Articles 1–17