2026

LFQA-E: Carefully Benchmarking Long-form QA Evaluation

Yuchen Fan, Chen Lin, Xin Zhong, Shuo Zhang, Heng Zhou, Yuchen Zhang, Mingyu Liang, Chengxing Xie, Ermo Hua, Gang Chen, Zhizhou He, Cheng Huang, Ning Ding, Bowen Zhou

International Conference on Learning Representations (ICLR) 2026

Ego to World: Collaborative Spatial Reasoning in Embodied Systems via Reinforcement Learning

Heng Zhou, Li Kang, Yiran Qin, Xiufeng Song, Ao Yu, Zilu Zhang, Haoming Song, Kaixin Xu, Yuchen Fan, Dongzhan Zhou, Xiaohong Liu, Ruimao Zhang, Philip Torr, Lei Bai#, Zhenfei Yin# (# corresponding author)

preprint

Reading ≠ Seeing: Diagnosing and Closing the Typography Gap in Vision-Language Models

Heng Zhou, Ao Yu, Li Kang, Yuchen Fan, Yutao Fan, Xiufeng Song, Hejia Geng, Yiran Qin

preprint

DIVA: Discrete Diffusion Vision-Language-Action Models for Parallelized Action Generation

Xiufeng Song, Yiran Qin, Yan Tai, Li Kang, Heng Zhou, Siqi Luo, Jiwen Yu, Ling Yang, Philip Torr, Lei Bai

preprint

From Perception to Action: An Interactive Benchmark for Vision Reasoning

Yuhao Wu, Maojia Song, Yihuai Lan, Lei Wang, Zhiqiang Hu, Yao Xiao, Heng Zhou, Weihua Zheng, Dylan Raharja, Soujanya Poria, Roy Ka-Wei Lee

preprint

RoboMonster: Compositional Generalization of Heterogeneous Multi-End Effector Embodied Agents

Yiran Qin, Zhemeng Zhang, Heng Zhou, Li Kang, Bruno NY Chen, Ximeng Meng, Xiufeng Song, Jiahua Ma, Zhenfei Yin, Xiaohong Liu

preprint

Building Scalable Real-World Robot Data Generation via Compositional Simulation

Yiran Qin, Jiahua Ma, Li Kang, Wenzhan Li, Xiufeng Song, Heng Zhou, Jiwen Yu, Zhenfei Yin, Xihui Liu, Philip Torr, Yilun Du, Ruimao Zhang

preprint

State Rank Dynamics in Linear Attention LLMs

Ao Sun, Hongtao Zhang, Heng Zhou, Yixuan Ma, Yiran Qin, Tongrui Su, Yan Liu, Zhanyu Ma, Jun Xu, Jiuchong Gao, Jinghua Hao, Renqing He

preprint

Advances and Innovations in the Multi-Agent Robotic System (MARS) Challenge

Li Kang*, Heng Zhou*, Xiufeng Song*, Rui Li*, Bruno NY Chen, Ziye Wang, Ximeng Meng, Stone Tao, Yiran Qin, Xiaohong Liu, Ruimao Zhang, Lei Bai, Yilun Du, Hao Su, Philip Torr, Zhenfei Yin, Ruihao Gong, Yejun Zeng, Fengjun Zhong, Shenghao Jin, Jinyang Guo, Xianglong Liu, Xiaojun Jia, Tianqi Shan, Wenqi Ren, Simeng Qin, Jialing Yang, Xiaoyu Ma, Tianxing Chen, Zixuan Li, Zijian Cai, Yan Qin, Yusen Qin, Qiangyu Chen, Kaixuan Wang, Zhaoming Han, Yao Mu, Ping Luo, Yuanqi Yao, Haoming Song, Jan-Nico Zaech, Fabien Despinoy, Danda Pani Paudel, Luc Van Gool (* equal contribution)

preprint

Toward Efficient Agents: Memory, Tool Learning, and Planning

Xiaofang Yang*, Lijun Li*, Heng Zhou*, Tong Zhu*, Xiaoye Qu, Yuchen Fan, Qianshan Wei, Rui Ye, Li Kang, Yiran Qin, Zhiqiang Kou, Daizong Liu, Qi Li, Ning Ding, Siheng Chen, Jing Shao (* equal contribution)

preprint

2025

LiveSearchBench: An Automatically Constructed Benchmark for Retrieval and Reasoning over Dynamic Knowledge

Heng Zhou, Ao Yu, Yuchen Fan, Jianing Shi, Li Kang, Hejia Geng, Yongting Zhang, Yutao Fan, Yuhao Wu, Tiancheng He, Yiran Qin, Lei Bai#, Zhenfei Yin# (# corresponding author)

preprint

Scaling Behaviors of LLM Reinforcement Learning Post-Training: An Empirical Study in Mathematical Reasoning

Zelin Tan, Hejia Geng, Xiaohang Yu, Mulei Zhang, Guancheng Wan, Yifan Zhou, Qiang He, Xiangyuan Xue, Heng Zhou, Yutao Fan, Zhongzhi Li, Zaibin Zhang, Guibin Zhang, Chen Zhang, Zhenfei Yin, Philip Torr, Lei Bai

preprint

The Landscape of Agentic Reinforcement Learning for LLMs: A Survey

Guibin Zhang*, Hejia Geng*, Xiaohang Yu*, Zhenfei Yin#, Zaibin Zhang, Zelin Tan, Heng Zhou, Zhongzhi Li, Xiangyuan Xue, Yijiang Li, Yifan Zhou, Yang Chen, Chen Zhang, Yutao Fan, Zihu Wang, Songtao Huang, Yue Liao, Hongru Wang, Mengyue Yang, Heng Ji, Michael Littman, Jun Wang, Shuicheng Yan, Philip Torr, Lei Bai# (* equal contribution, # corresponding author)

Transactions on Machine Learning Research (TMLR)

SSRL: Self-Search Reinforcement Learning

Yuchen Fan*, Kaiyan Zhang*, Heng Zhou*, Yuxin Zuo, Yanxu Chen, Yu Fu, Xinwei Long, Xuekai Zhu, Che Jiang, Yuchen Zhang, Li Kang, Gang Chen, Cheng Huang, Zhizhou He, Bingning Wang, Lei Bai#, Ning Ding#, Bowen Zhou# (* equal contribution, # corresponding author)

Under review.

VeriGUI: Verifiable Long-Chain GUI Dataset

Shunyu Liu, Minghao Liu, Huichi Zhou, Zhenyu Cui, Yang Zhou, Yuhao Zhou, Jialiang Gao, Heng Zhou, Yunhao Yang, Wendong Fan, Puzhen Zhang, Ge Zhang, Jiajun Shi, Weihao Xuan, Jiaxing Huang, Shuang Luo, Fang Wu, Heli Qi, Qingcheng Zeng, Junjie Wang, Aosong Feng, Jindi Lv, Sicong Jiang, Ziqi Ren, Wangchunshu Zhou, Zhenfei Yin, Wenlong Zhang, Guohao Li, Wenhao Yu, Lei Ma, Lei Bai, Qunshu Lin, Mingli Song, Dacheng Tao

preprint

Swarm Intelligence Enhanced Reasoning: A Density-Driven Framework for LLM-Based Multi-Agent Optimization

Ying Zhu, Heng Zhou, Rui Su, Peiqin Zhuang, Lei Bai# (# corresponding author)

preprint

VIKI-R: Coordinating Embodied Multi-Agent Cooperation via Reinforcement Learning

Li Kang*, Xiufeng Song*, Heng Zhou*, Yiran Qin#, Jie Yang, Xiaohong Liu, Philip Torr, Lei Bai#, Zhenfei Yin# (* equal contribution, # corresponding author)

Annual Conference on Neural Information Processing Systems (NeurIPS) 2025

ReSo: A Reward-driven Self-organizing LLM-based Multi-Agent System for Reasoning Tasks

Heng Zhou*, Hejia Geng*, Xiangyuan Xue, Li Kang, Yiran Qin, Zhiyong Wang, Zhenfei Yin#, Lei Bai# (* equal contribution, # corresponding author)

EMNLP 2025 main oral paper, SAC Highlight Award (Top 1%)

2024

SS3DM: Benchmarking Street-View Surface Reconstruction with a Synthetic 3D Mesh Dataset

Yubin Hu*, Kairui Wen*, Heng Zhou, Xiaoyang Guo, Yong-Jin Liu# (* equal contribution, # corresponding author)

Annual Conference on Neural Information Processing Systems (NeurIPS)
