Heng Zhou

I am currently an intern at the Shanghai Artificial Intelligence Laboratory, focusing on multi-agent systems, reinforcement learning, and LLM alignment. I am a Ph.D. student at the University of Science and Technology of China (USTC), advised by Prof. Wanli Ouyang and Prof. Lei Bai.

Research interests: machine learning, multimodal learning, AI agents, robotics.


Education
  • USTC
    Ph.D. Student, advised by Prof. Wanli Ouyang and Prof. Lei Bai
    Sep. 2025 - present
Experience
  • Shanghai AI Laboratory
    Research Intern, advised by Prof. Lei Bai
    Oct. 2024 - present
Honors & Awards
  • Beijing Merit Student
    2025
  • National Scholarship
    2023
Selected Publications
Ego to World: Collaborative Spatial Reasoning in Embodied Systems via Reinforcement Learning

Heng Zhou, Li Kang, Yiran Qin, Xiufeng Song, Ao Yu, Zilu Zhang, Haoming Song, Kaixin Xu, Yuchen Fan, Dongzhan Zhou, Xiaohong Liu, Ruimao Zhang, Philip Torr, Lei Bai#, Zhenfei Yin# (# corresponding author)

preprint

Reading ≠ Seeing: Diagnosing and Closing the Typography Gap in Vision-Language Models

Heng Zhou, Ao Yu, Li Kang, Yuchen Fan, Yutao Fan, Xiufeng Song, Hejia Geng, Yiran Qin

preprint

Toward Efficient Agents: Memory, Tool Learning, and Planning

Xiaofang Yang*, Lijun Li*, Heng Zhou*, Tong Zhu*, Xiaoye Qu, Yuchen Fan, Qianshan Wei, Rui Ye, Li Kang, Yiran Qin, Zhiqiang Kou, Daizong Liu, Qi Li, Ning Ding, Siheng Chen, Jing Shao (* equal contribution)

preprint

LiveSearchBench: An Automatically Constructed Benchmark for Retrieval and Reasoning over Dynamic Knowledge

Heng Zhou, Ao Yu, Yuchen Fan, Jianing Shi, Li Kang, Hejia Geng, Yongting Zhang, Yutao Fan, Yuhao Wu, Tiancheng He, Yiran Qin, Lei Bai#, Zhenfei Yin# (# corresponding author)

preprint

The Landscape of Agentic Reinforcement Learning for LLMs: A Survey

Guibin Zhang*, Hejia Geng*, Xiaohang Yu*, Zhenfei Yin#, Zaibin Zhang, Zelin Tan, Heng Zhou, Zhongzhi Li, Xiangyuan Xue, Yijiang Li, Yifan Zhou, Yang Chen, Chen Zhang, Yutao Fan, Zihu Wang, Songtao Huang, Yue Liao, Hongru Wang, Mengyue Yang, Heng Ji, Michael Littman, Jun Wang, Shuicheng Yan, Philip Torr, Lei Bai# (* equal contribution, # corresponding author)

Transactions on Machine Learning Research (TMLR)

SSRL: Self-Search Reinforcement Learning

Yuchen Fan*, Kaiyan Zhang*, Heng Zhou*, Yuxin Zuo, Yanxu Chen, Yu Fu, Xinwei Long, Xuekai Zhu, Che Jiang, Yuchen Zhang, Li Kang, Gang Chen, Cheng Huang, Zhizhou He, Bingning Wang, Lei Bai#, Ning Ding#, Bowen Zhou# (* equal contribution, # corresponding author)

Under review.

VIKI-R: Coordinating Embodied Multi-Agent Cooperation via Reinforcement Learning

Li Kang*, Xiufeng Song*, Heng Zhou*, Yiran Qin#, Jie Yang, Xiaohong Liu, Philip Torr, Lei Bai#, Zhenfei Yin# (* equal contribution, # corresponding author)

Annual Conference on Neural Information Processing Systems (NeurIPS) 2025

ReSo: A Reward-driven Self-organizing LLM-based Multi-Agent System for Reasoning Tasks

Heng Zhou*, Hejia Geng*, Xiangyuan Xue, Li Kang, Yiran Qin, Zhiyong Wang, Zhenfei Yin#, Lei Bai# (* equal contribution, # corresponding author)

EMNLP 2025 Main Conference (Oral), SAC Highlight Award (Top 1%)
