Xiao Ma (马骁)
I am a research scientist at TikTok. I obtained my PhD from the National University of Singapore, advised by Prof. David Hsu, and also worked closely with Prof. Wee Sun Lee. I received my B.Sc. in Computer Science from Shanghai Jiao Tong University in 2017. Previously, I spent a wonderful time at the Dyson Robot Learning Lab, led by Dr. Stephen James; at Sea AI Lab, hosted by Prof. Shuicheng Yan and Dr. Min Lin; and at SenseTime Research, hosted by Dr. Shuai Yi.
I'm broadly interested in robot learning and reinforcement learning.
Collaborations and discussions are welcome! Please feel free to email me if you're interested.
Email  / 
CV  / 
Google Scholar  / 
Twitter  / 
Github
|
|
News
- [Sept. 2024] 1 paper accepted to CoRL 2024.
- [Jul. 2024] I joined TikTok as a research scientist.
- [Jun. 2024] 1 paper accepted to RA-L 2024.
- [Feb. 2024] 1 paper accepted to CVPR 2024.
- [Sept. 2023] 3 papers accepted to NeurIPS 2023.
- [Feb. 2023] 1 paper accepted to CVPR 2023.
- [Jan. 2023] 3 papers accepted to ICLR 2023 (1 oral, 2 posters)!
- [Feb. 2022] G-DOOM for deformable object manipulation has been accepted to ICRA 2022!
- [May 2021] PROMPT for ab-initio object manipulation has been accepted by RSS 2021.
- [Oct. 2020] CVRL for model-based RL under complex observations has been accepted by CoRL 2020.
- [Sept. 2020] BALMS for long-tailed visual recognition has been accepted by NeurIPS 2020.
- [Jul. 2020] STAR for pedestrian trajectory prediction has been accepted by ECCV 2020.
- [Dec. 2019] DPFRL for reinforcement learning under complex and partial observations has been accepted by ICLR 2020.
- [Nov. 2019] PF-RNNs for sequence modeling under uncertainty has been accepted to AAAI 2020.
- [Jun. 2019] DAN was nominated for the best system paper and best student paper of RSS 2019!
|
Selected Publications (Full publication list)
|
|
BiGym: A Demo-Driven Mobile Bi-Manual Manipulation Benchmark
Nikita Chernyadev*,
Nicholas Backshall*,
Xiao Ma*,
Yunfan Lu,
Younggyo Seo,
Stephen James (*equal contributions)
Conference on Robot Learning (CoRL), 2024  
project page
/
pdf
/
code
We present BiGym, a new benchmark and learning environment for mobile bi-manual demo-driven robotic manipulation. BiGym consists of 40 diverse tasks in home environments and provides human-collected demonstrations.
|
|
Green Screen Augmentation Enables Scene Generalisation in Robotic Manipulation
Eugene Teoh*,
Sumit Patidar*,
Xiao Ma,
Stephen James (*equal contributions)
ArXiv Preprint, 2024  
project page
/
pdf
/
code
GreenAug provides a simple visual augmentation for robot policies: data is first collected in front of a green screen, then augmented with different background textures. The resulting policy can be transferred to unseen, visually distinct scenes.
|
|
Redundancy-aware Action Spaces for Robot Learning
Pietro Mazzaglia*,
Nicholas Backshall*,
Xiao Ma,
Stephen James (*equal contributions)
IEEE Robotics and Automation Letters (RA-L), 2024  
project page
/
pdf
/
code
We present a new family of action spaces that combines the efficiency of the task space with the flexibility of the joint space for robotic manipulation.
|
|
Hierarchical Diffusion Policy for Multi-Task Robotic Manipulation
Xiao Ma,
Sumit Patidar,
Iain Haughton,
Stephen James
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2024  
project page
/
pdf
/
code
Hierarchical Diffusion Policy (HDP) factorises the policy space into 1) a high-level task-planning agent and 2) a low-level goal-conditioned diffusion policy, achieving both task-level generalisation and flexible low-level control.
|
|
InsActor: Instruction-driven Physics-based Characters
Jiawei Ren*,
Mingyuan Zhang,
Cunjun Yu*,
Xiao Ma,
Liang Pan,
Ziwei Liu (*equal contributions)
Conference on Neural Information Processing Systems (NeurIPS), 2023  
project page
/
pdf
/
code
InsActor is a principled generative framework that leverages recent advancements in diffusion-based human motion models for physics-based human animation generation.
|
|
Efficient Diffusion Policies for Offline Reinforcement Learning
Bingyi Kang*,
Xiao Ma*,
Chao Du,
Tianyu Pang,
Shuicheng Yan (*equal contributions)
Conference on Neural Information Processing Systems (NeurIPS), 2023  
pdf
/
code
We introduce Efficient Diffusion Policies (EDPs), a more general, faster, and better diffusion policy class for offline RL. EDPs reduce the training time of DQL from 5 days to 5 hours!
|
|
Mutual Information Regularized Offline Reinforcement Learning
Xiao Ma*,
Bingyi Kang*,
Zhongwen Xu,
Min Lin,
Shuicheng Yan (*equal contributions)
Conference on Neural Information Processing Systems (NeurIPS), 2023  
pdf
/
code
MISA is a general framework for offline RL motivated by mutual information estimation. We show that both Conservative Q-Learning (CQL) and TD3+BC can be viewed as variants of it.
|
|
Imitation Learning via Differentiable Physics
Siwei Chen,
Xiao Ma,
Zhongwen Xu
Computer Vision and Pattern Recognition (CVPR), 2023  
pdf
/
code
/
bibtex
We present Imitation Learning via Differentiable Physics (ILD), which casts imitation learning as a state-matching task through a differentiable-physics-based Chamfer distance loss. ILD significantly improves the sample efficiency and generalization of imitation learning algorithms with only one expert demonstration.
|
|
DaXBench: Benchmarking Deformable Object Manipulation with Differentiable Physics
Siwei Chen*,
Cunjun Yu*,
Yiqing Xu*,
Linfeng Li,
Xiao Ma,
Zhongwen Xu,
David Hsu
(*equal contributions)
International Conference on Learning Representations (ICLR), 2023   (Oral)
pdf
/
bibtex
We present DaXBench, a comprehensive benchmark for deformable object manipulation, including planning, imitation learning, and reinforcement learning, based on a scalable and differentiable physics simulator coded in JAX.
|
|
DiffMimic: Efficient Motion Mimicking with Differentiable Physics
Jiawei Ren*,
Cunjun Yu*,
Siwei Chen,
Xiao Ma,
Liang Pan,
Ziwei Liu (*equal contributions)
International Conference on Learning Representations (ICLR), 2023  
project page
/
pdf
/
code
/
live demo
/
bibtex
DiffMimic scales motion imitation for simulated characters with differentiable physics. Training controllers on large-scale motion databases is more accessible with DiffMimic.
|
|
Learning Latent Graph Dynamics for Deformable Object Manipulation
Xiao Ma,
David Hsu,
Wee Sun Lee
International Conference on Robotics and Automation (ICRA), 2022  
project page
/
pdf
/
bibtex
We present G-DOOM for deformable object manipulation. G-DOOM abstracts a deformable object as a keypoint-based graph and models the spatio-temporal keypoint interactions with Recurrent Graph Dynamics. G-DOOM achieves SOTA performance on a set of deformable object manipulation tasks.
|
|
Ab Initio Particle-based Object Manipulation
Siwei Chen,
Xiao Ma,
Yunfan Lu,
David Hsu
Robotics: Science and Systems (RSS), 2021  
project page
/
pdf
/
code
/
bibtex
This paper introduces PROMPT, a framework for particle-based object manipulation. PROMPT performs high-quality online point cloud reconstruction from multi-view images captured by an eye-in-hand camera. It achieves high performance in object grasping, pushing, and placing.
|
|