Runlong Zhou
Paul G. Allen School of Computer Science & Engineering, University of Washington
Verified email at cs.washington.edu - Homepage
Title
Cited by
Year
Stochastic shortest path: Minimax, parameter-free and towards horizon-free regret
J Tarbouriech, R Zhou, SS Du, M Pirotta, M Valko, A Lazaric
Advances in Neural Information Processing Systems 34, 6843-6855, 2021
Cited by: 31, Year: 2021
Horizon-Free and Variance-Dependent Reinforcement Learning for Latent Markov Decision Processes
R Zhou, R Wang, SS Du
International Conference on Machine Learning, 42698-42723, 2023
Cited by: 7*, Year: 2023
Sharp variance-dependent bounds in reinforcement learning: Best of both worlds in stochastic and deterministic environments
R Zhou, Z Zhang, SS Du
International Conference on Machine Learning, 42878-42914, 2023
Cited by: 7, Year: 2023
Understanding curriculum learning in policy optimization for solving combinatorial optimization problems
R Zhou, Y Tian, Y Wu, SS Du
arXiv preprint arXiv:2202.05423, 2022
Cited by: 4*, Year: 2022
Free from Bellman completeness: Trajectory stitching via model-based return-conditioned supervised learning
Z Zhou, C Zhu, R Zhou, Q Cui, A Gupta, SS Du
arXiv preprint arXiv:2310.19308, 2023
Cited by: 1, Year: 2023
Reflect-RL: Two-Player Online RL Fine-Tuning for LMs
R Zhou, SS Du, B Li
arXiv preprint arXiv:2402.12621, 2024
Year: 2024