I am Xiangtai Li. I work on computer vision, multi-modal learning, and related problems.

I am a Research Scientist at TikTok (ByteDance), Singapore.

Our team works on applications and research for TikTok Live. Topics cover multi-modal large language models, diffusion models, and LLM reasoning.

Previously, I worked as a Research Fellow at MMLab@NTU, S-Lab, advised by Prof. Chen Change Loy.

I obtained my PhD degree from Peking University (PKU) under the supervision of Prof. Yunhai Tong, and my bachelor’s degree from Beijing University of Posts and Telecommunications (BUPT).

My research topics are:

1. Multi-modal learning with LLMs (MLLM): unified modeling, benchmarking.

2. Image/video generation and editing, controllable image/video generation.

3. Multi-modal agent system design.

I serve as a regular reviewer for many conferences and journals, including CVPR, ICCV, ECCV, ICLR, AAAI, NeurIPS, ICML, IJCAI, IEEE-TIP, IEEE-TPAMI, IJCV, IEEE-TCSVT, IEEE-TMM, IEEE-TGRS, and Remote Sensing.

I also serve as an area chair for ICLR-2025, ICML-2025, ICCV-2025, and NeurIPS-2025.

I am looking for several self-motivated research interns with backgrounds in MLLMs and diffusion models.

(My email addresses are xiangtai94@gmail.com and lxtpku@pku.edu.cn.)