I am Xiangtai Li. I work on computer vision, multi-modal learning, and related problems.
I am working as a Staff Research Scientist at TikTok (ByteDance), Singapore.
Our team works on applications and research for TikTok Live. Topics cover multi-modal large language models, diffusion models, and LLM reasoning.
Previously, I worked as a Research Fellow at MMLab@NTU, S-Lab, advised by Prof. Chen Change Loy.
I obtained my PhD degree from Peking University (PKU) under the supervision of Prof. Yunhai Tong, and my bachelor's degree from Beijing University of Posts and Telecommunications (BUPT).
My research topics focus on three main aspects:
- Multi-modal learning with LLMs (MLLM): unified modeling, benchmarking, dataset pipeline building, RL-based post-training.
- Image/video generation and editing, controllable image/video generation.
Previously, I worked on image/video segmentation and detection, and open vocabulary learning.
Moreover, the code and models for about 98% of my work, including the projects I have deeply contributed to, are open-sourced on GitHub.
I serve as a regular reviewer for many conferences and journals, including CVPR, ICCV, ECCV, ICLR, AAAI, NeurIPS, ICML, IJCAI, IEEE-TIP, IEEE-TPAMI, IJCV, IEEE-TCSVT, IEEE-TMM, and IEEE-TGRS.
I also serve as an Area Chair for ICLR-2025/2026, CVPR-2026, ICML-2025, ICCV-2025, NeurIPS-2025, AAAI-2025/2026, WACV-2026, and ECCV-2026.
In addition, I serve as an Associate Editor for T-PAMI.
I am looking for strong interns with a background in LLMs, diffusion models, or infrastructure. Positions are available in Beijing and Singapore.
My email addresses are xiangtai94@gmail.com and xiangtai.li@bytedance.com. Feel free to reach out.
Publications
* means equal contribution.
Recent Works
Several Other Previous Works
Code can be found in this.
Honors and Awards
- National Scholarship, Ministry of Education of China, at PKU (2019-2020 and 2020-2021).
- President Scholarship of PKU (2020-2021).
- Beijing Excellent Graduate (2017, 2022).
- BUPT Excellent Graduate and PKU Excellent Graduate (2017, 2022).
Education
- 2017.09 - 2022.07, PhD, Peking University (PKU).
- 2013.09 - 2017.07, Bachelor's degree, Beijing University of Posts and Telecommunications (BUPT).
Invited Talks
- 2024.03 Invited talk on Open-Vocabulary Segmentation and Segment Anything at VALSE, online. Slides, Video.
- 2023.08 Invited talk on Video Segmentation at VALSE, online. Slides, Video.
- 2022.05 Invited talk on Panoptic Segmentation and Beyond in Baidu PaddleSeg Group.
- 2021.12 Invited talk on Video Segmentation in DiDi Auto-Driving Group.
- 2021.10 Invited talk on Aligned Segmentation in Huawei Noah Auto-Driving Group.
Internships and Work Experience
- SenseTime Research, mentored by Dr. Guangliang Cheng and Dr. Jianping Shi.
- JD AI Lab (remote cooperation), mentored by Dr. Yibo Yang and Prof. Dacheng Tao.
- DeepMotion (now Xiaomi Car), mentored by Dr. Kuiyuan Yang.
- I was mentored by Dr. Kuiyuan Yang, Prof. Li Zhang, Dr. Guangliang Cheng, Dr. Yibo Yang, Prof. Dacheng Tao, Prof. Zhouchen Lin, and Dr. Jiangmiao Pang during my PhD study.
- I previously held a research consultant position at Shanghai AI Lab, working with Dr. Yining Li, Dr. Kai Chen, Dr. Jingbo Wang, and Dr. Yanhong Zeng.