I am Xiangtai Li. I work on computer vision, multi-modal learning, and related problems.
I am a Research Scientist at TikTok (ByteDance), Singapore.
Our team works on research and applications for TikTok Live. Topics cover multi-modal large language models, diffusion models, and LLM reasoning.
Previously, I worked as a Research Fellow at MMLab@NTU, S-Lab, advised by Prof. Chen Change Loy.
I obtained my PhD degree from Peking University (PKU) under the supervision of Prof. Yunhai Tong, and my bachelor’s degree from Beijing University of Posts and Telecommunications (BUPT).
My research topics are:
1. Multi-modal learning with LLMs (MLLM): unified modeling, benchmarking.
2. Image/video generation and editing, controllable image/video generation.
3. Multi-modal agent system design.
I serve as a regular reviewer for many conferences and journals, including CVPR, ICCV, ECCV, ICLR, AAAI, NeurIPS, ICML, IJCAI, IEEE-TIP, IEEE-TPAMI, IJCV, IEEE-TCSVT, IEEE-TMM, IEEE-TGRS, and Remote Sensing.
I also serve as an area chair for ICLR-2025, ICML-2025, ICCV-2025, and NeurIPS-2025.
I am looking for several self-motivated research interns with backgrounds in MLLMs and diffusion models.
(My e-mail addresses are xiangtai94@gmail.com and lxtpku@pku.edu.cn.)