Ziang Song
Verified email at stanford.edu
Title
Cited by
Year
When can we learn general-sum Markov games with a large number of players sample-efficiently?
Z Song, S Mei, Y Bai
arXiv preprint arXiv:2110.04184, 2021
Cited by: 88; Year: 2021
Efficient Phi-Regret Minimization in Extensive-Form Games via Online Mirror Descent
Y Bai, C Jin, S Mei, Z Song, T Yu
Advances in Neural Information Processing Systems 35, 22313-22325, 2022
Cited by: 13; Year: 2022
Reward collapse in aligning large language models
Z Song, T Cai, JD Lee, WJ Su
arXiv preprint arXiv:2305.17608, 2023
Cited by: 12; Year: 2023
Sample-efficient learning of correlated equilibria in extensive-form games
Z Song, S Mei, Y Bai
Advances in Neural Information Processing Systems 35, 4099-4110, 2022
Cited by: 11; Year: 2022
Reward Collapse in Aligning Large Language Models: A Prompt-Aware Approach to Preference Rankings
Z Song, T Cai, JD Lee, WJ Su
ICML 2023 Workshop The Many Facets of Preference-Based Learning, 2023
Cited by: 1; Year: 2023
Articles 1–5