Welcome to my academic homepage. I am a Research Scientist at BIGAI. Previously, I received my Ph.D. in Computer Science from the University of California, Los Angeles (UCLA). I completed my undergraduate studies in Computer Science at Tsinghua University.

I work on multimodal learning for understanding, reasoning and skill learning. In particular, I'm interested in building models/agents that can learn from 2D/3D vision and text data, and perform a wide range of reasoning and embodied control tasks. Some of my research keywords can be found below:

  • Multimodal learning: Vision and language, Visual reasoning, 3D vision, Generalist models
  • Representation learning: Zero-shot and few-shot learning, Generative model
  • Embodied agents: Reinforcement learning and imitation, Robotics, Sensor fusion
Email: jeasinema [at] gmail [dot] com / Google Scholar / LinkedIn

News


Selected Publications


Preprint

Zihao Wang, Shaofei Cai, Anji Liu, Xiaojian Ma, Yitao Liang
Describe, Explain, Plan and Select: Interactive Planning with Large Language Models Enables Open-World Multi-Task Agents
arXiv preprint / arXiv / Project / Code 
Best paper award, ICML-23 TEACH Workshop

Sirui Xie, Xiaojian Ma, Peiyu Yu, Yixin Zhu, Ying Nian Wu and Song-Chun Zhu
HALMA: Humanlike Abstraction Learning Meets Affordance in Rapid Problem Solving
arXiv preprint / Paper / Click Here to Play HALMA! / arXiv 


Conference

Ziyu Zhu, Xiaojian Ma, Yixin Chen, Zhidong Deng, Siyuan Huang, Qing Li
3D-VisTA: Pre-trained Transformer for 3D Vision and Text Alignment
ICCV 2023 / arXiv / Project / Code 
A new 3D-Language foundation model

Shaofei Cai, Zihao Wang, Xiaojian Ma, Anji Liu, Yitao Liang
Open-World Multi-Task Control Through Goal-Aware Representation Learning and Adaptive Horizon Prediction
CVPR 2023 / arXiv / Project / Code 

Xiaojian Ma, Silong Yong, Zilong Zheng, Qing Li, Yitao Liang, Song-Chun Zhu, Siyuan Huang
SQA3D: Situated Question Answering in 3D Scenes
ICLR 2023 / Paper / arXiv / Slides / Project / Code / Benchmark 
A new quest: embodied scene understanding

Peiyu Yu, Sirui Xie, Xiaojian Ma, Baoxiong Jia, Bo Pang, Ruiqi Gao, Yixin Zhu, Song-Chun Zhu and Ying Nian Wu
Latent Diffusion Energy-Based Model for Interpretable Text Modeling
ICML 2022 / Paper / arXiv / Code 

Huaizu Jiang*, Xiaojian Ma*, Weili Nie, Zhiding Yu, Yuke Zhu, Song-Chun Zhu, Anima Anandkumar
Bongard-HOI: Benchmarking Few-Shot Visual Reasoning for Human-Object Interactions
CVPR 2022 / Paper / Poster / Slides / Project / arXiv / Code / Bibtex 
Oral presentation

Xiaojian Ma, Weili Nie, Zhiding Yu, Huaizu Jiang, Chaowei Xiao, Yuke Zhu, Song-Chun Zhu, Anima Anandkumar
RelViT: Concept-guided Vision Transformer for Visual Relational Reasoning
ICLR 2022 / Paper / Poster / Slides / Project / OpenReview / arXiv / Code / Bibtex 

Peiyu Yu, Sirui Xie, Xiaojian Ma, Yixin Zhu, Ying Nian Wu and Song-Chun Zhu
Unsupervised Foreground Extraction via Deep Region Competition
NeurIPS 2021 / Paper / arXiv / Code 

Mingxuan Jing, Wenbing Huang, Fuchun Sun, Xiaojian Ma, Tao Kong, Chuang Gan and Lei Li
Adversarial Option-Aware Hierarchical Imitation Learning
ICML 2021 / Paper / arXiv / Code 
Spotlight presentation

Hongzhuo Liang, Chuangchuang Zhou, Shuang Li, Xiaojian Ma, Norman Hendrich, Timo Gerkmann, Fuchun Sun and Jianwei Zhang
Robust Robotic Pouring using Audition and Haptics
IROS 2020 / Paper / Project Page / arXiv / Code / Video 
Oral presentation

Shuang Li, Jiaxi Jiang, Philipp Ruppel, Hongzhuo Liang, Xiaojian Ma, Norman Hendrich, Fuchun Sun and Jianwei Zhang
A Mobile Robot Hand-Arm Teleoperation System by Vision and IMU
IROS 2020 / Paper / Project Page / arXiv / Code / Video 
Oral presentation

Mark Edmonds, Xiaojian Ma, Siyuan Qi, Yixin Zhu, Hongjing Lu and Song-Chun Zhu
Theory-based Causal Transfer: Integrating Instance-level Induction and Abstract-level Structure Learning
AAAI 2020 / Paper / Project Page / arXiv / Code 
Oral presentation

Xiaojian Ma*, Mingxuan Jing*, Wenbing Huang, Fuchun Sun, Bin Fang and Huaping Liu
Reinforcement Learning from Imperfect Demonstrations under Soft Expert Guidance
AAAI 2020 / Paper / Project Page / arXiv / Code 
Also in Structure & Priors in Reinforcement Learning Workshop @ ICLR 2019

Xiaojian Ma*, Chao Yang*, Wenbing Huang*, Fuchun Sun, Huaping Liu, Junzhou Huang and Chuang Gan
Imitation Learning from Observations by Minimizing Inverse Dynamics Disagreement
NeurIPS 2019 / Paper / Project Page / arXiv / Code 
Spotlight presentation

Hongzhuo Liang, Shuang Li, Xiaojian Ma, Norman Hendrich, Timo Gerkmann, Fuchun Sun and Jianwei Zhang
Making Sense of Audio Vibration for Liquid Height Estimation in Robotic Pouring
IROS 2019 / Paper / Project Page / arXiv / Code / Video 

Xiaojian Ma*, Hongzhuo Liang*, Shuang Li, Michael Görner, Song Tang, Bin Fang, Fuchun Sun and Jianwei Zhang
PointNetGPD: Detecting Grasp Configurations from Point Sets
ICRA 2019 / Paper / Project Page / arXiv / Code / Video 

Xiaojian Ma*, Shuang Li*, Hongzhuo Liang, Michael Görner, Philipp Ruppel, Bin Fang, Fuchun Sun and Jianwei Zhang
Vision-based Teleoperation of Shadow Dexterous Hand using End-to-End Deep Neural Network
ICRA 2019 / Paper / Project Page / arXiv / Code / Video 

Xiaojian Ma*, Mingxuan Jing*, Wenbing Huang, Fuchun Sun and Huaping Liu
Task Transfer by Preference-Based Cost Learning
AAAI 2019 / Paper / Project Page / arXiv / Code 
Spotlight presentation

Experience


Professional Service



Contact


Beijing Institute for General Artificial Intelligence (BIGAI)
Wei Lai Ke Ji Da Sha
No.2 Yi He Yuan Rd.
Beijing, China 100080
jeasinema [at] gmail [dot] com
[Google Scholar]  |  [GitHub]  


© Xiaojian Ma 2023