Heng Zhou

I am a Ph.D. student at the University of Science and Technology of China (USTC) in a joint training program with the Shanghai Artificial Intelligence Laboratory, advised by Prof. Wanli Ouyang and Prof. Lei Bai. I am currently a research intern at the laboratory, where I focus on multi-agent systems, reinforcement learning, and LLM alignment.

Research interests: machine learning, multimodal learning, AI agents, robotics.


Education
  • USTC & PJLAB
    Ph.D. Student, advised by Prof. Wanli Ouyang and Prof. Lei Bai
    Sep. 2025 - present
Experience
  • Shanghai AI Laboratory
    Research Intern, advised by Prof. Lei Bai
    Oct. 2024 - present
Honors & Awards
  • Beijing Merit Student
    2025
  • National Scholarship
    2023
Selected Publications
The Landscape of Agentic Reinforcement Learning for LLMs: A Survey

Guibin Zhang*, Hejia Geng*, Xiaohang Yu*, Zhenfei Yin#, Zaibin Zhang, Zelin Tan, Heng Zhou, Zhongzhi Li, Xiangyuan Xue, Yijiang Li, Yifan Zhou, Yang Chen, Chen Zhang, Yutao Fan, Zihu Wang, Songtao Huang, Yue Liao, Hongru Wang, Mengyue Yang, Heng Ji, Michael Littman, Jun Wang, Shuicheng Yan, Philip Torr, Lei Bai# (* equal contribution, # corresponding author)

Preprint.

SSRL: Self-Search Reinforcement Learning

Yuchen Fan*, Kaiyan Zhang*, Heng Zhou*, Yuxin Zuo, Yanxu Chen, Yu Fu, Xinwei Long, Xuekai Zhu, Che Jiang, Yuchen Zhang, Li Kang, Gang Chen, Cheng Huang, Zhizhou He, Bingning Wang, Lei Bai#, Ning Ding#, Bowen Zhou# (* equal contribution, # corresponding author)

Under review.

VIKI-R: Coordinating Embodied Multi-Agent Cooperation via Reinforcement Learning

Li Kang*, Xiufeng Song*, Heng Zhou*, Yiran Qin#, Jie Yang, Xiaohong Liu, Philip Torr, Lei Bai#, Zhenfei Yin# (* equal contribution, # corresponding author)

Annual Conference on Neural Information Processing Systems (NeurIPS) 2025

ReSo: A Reward-driven Self-organizing LLM-based Multi-Agent System for Reasoning Tasks

Heng Zhou*, Hejia Geng*, Xiangyuan Xue, Li Kang, Yiran Qin, Zhiyong Wang, Zhenfei Yin#, Lei Bai# (* equal contribution, # corresponding author)

Conference on Empirical Methods in Natural Language Processing (EMNLP) 2025, Main Conference
