2025

The Landscape of Agentic Reinforcement Learning for LLMs: A Survey

Guibin Zhang*, Hejia Geng*, Xiaohang Yu*, Zhenfei Yin#, Zaibin Zhang, Zelin Tan, Heng Zhou, Zhongzhi Li, Xiangyuan Xue, Yijiang Li, Yifan Zhou, Yang Chen, Chen Zhang, Yutao Fan, Zihu Wang, Songtao Huang, Yue Liao, Hongru Wang, Mengyue Yang, Heng Ji, Michael Littman, Jun Wang, Shuicheng Yan, Philip Torr, Lei Bai# (* equal contribution, # corresponding author)

Preprint

SSRL: Self-Search Reinforcement Learning

Yuchen Fan*, Kaiyan Zhang*, Heng Zhou*, Yuxin Zuo, Yanxu Chen, Yu Fu, Xinwei Long, Xuekai Zhu, Che Jiang, Yuchen Zhang, Li Kang, Gang Chen, Cheng Huang, Zhizhou He, Bingning Wang, Lei Bai#, Ning Ding#, Bowen Zhou# (* equal contribution, # corresponding author)

Under review.

VIKI-R: Coordinating Embodied Multi-Agent Cooperation via Reinforcement Learning

Li Kang*, Xiufeng Song*, Heng Zhou*, Yiran Qin#, Jie Yang, Xiaohong Liu, Philip Torr, Lei Bai#, Zhenfei Yin# (* equal contribution, # corresponding author)

Annual Conference on Neural Information Processing Systems (NeurIPS) 2025

ReSo: A Reward-driven Self-organizing LLM-based Multi-Agent System for Reasoning Tasks

Heng Zhou*, Hejia Geng*, Xiangyuan Xue, Li Kang, Yiran Qin, Zhiyong Wang, Zhenfei Yin#, Lei Bai# (* equal contribution, # corresponding author)

EMNLP 2025 (Main Conference)

2024

SS3DM: Benchmarking Street-View Surface Reconstruction with a Synthetic 3D Mesh Dataset

Yubin Hu*, Kairui Wen*, Heng Zhou, Xiaoyang Guo, Yong-Jin Liu# (* equal contribution, # corresponding author)

Annual Conference on Neural Information Processing Systems (NeurIPS) 2024
