Runlong Zhou

Citations per year: 2021: 3, 2022: 12, 2023: 27, 2024: 8

Public access: 2 articles available, 0 articles not available. Based on funding mandates.

Simon Shaolei Du, Assistant Professor, School of Computer Science and Engineering, University of Washington. Verified email at cs.washington.edu
Michal Valko, Llama @ Meta Paris & Inria & MVA - Ex: Gemini and BYOL @ Google DeepMind. Verified email at meta.com
Matteo Pirotta, Research Scientist, Meta (FAIR). Verified email at fb.com
Jean Tarbouriech, Google DeepMind. Verified email at google.com
Alessandro Lazaric, Research Scientist, Facebook Artificial Intelligence Research. Verified email at inria.fr
Ruosong Wang, PhD Student, Carnegie Mellon University. Verified email at andrew.cmu.edu
Yuandong Tian, Research Scientist, Meta AI (FAIR). Verified email at fb.com
Yi Wu, Institute for Interdisciplinary Information Sciences, Tsinghua University. Verified email at mail.tsinghua.edu.cn
Zhang Zihan, Tsinghua University. Verified email at mails.tsinghua.edu.cn

Runlong Zhou

Paul G. Allen School of Computer Science & Engineering, University of Washington

Verified email at cs.washington.edu


Title	Cited by	Year
Stochastic shortest path: Minimax, parameter-free and towards horizon-free regret. J Tarbouriech, R Zhou, SS Du, M Pirotta, M Valko, A Lazaric. Advances in Neural Information Processing Systems 34, 6843-6855, 2021	31	2021
Horizon-Free and Variance-Dependent Reinforcement Learning for Latent Markov Decision Processes. R Zhou, R Wang, SS Du. International Conference on Machine Learning, 42698-42723, 2023	7*	2023
Sharp variance-dependent bounds in reinforcement learning: Best of both worlds in stochastic and deterministic environments. R Zhou, Z Zhang, SS Du. International Conference on Machine Learning, 42878-42914, 2023	7	2023
Understanding curriculum learning in policy optimization for solving combinatorial optimization problems. R Zhou, Y Tian, Y Wu, SS Du. arXiv preprint arXiv:2202.05423, 2022	4*	2022
Free from Bellman completeness: Trajectory stitching via model-based return-conditioned supervised learning. Z Zhou, C Zhu, R Zhou, Q Cui, A Gupta, SS Du. arXiv preprint arXiv:2310.19308, 2023	1	2023
Reflect-RL: Two-Player Online RL Fine-Tuning for LMs. R Zhou, SS Du, B Li. arXiv preprint arXiv:2402.12621, 2024		2024


Articles 1–6
