* denotes equal contribution; students I mentored are underlined.
2024
  1. OMG-Seg: Is One Model Good Enough For All Segmentation?
    Xiangtai Li, Haobo Yuan, Wei Li, Henghui Ding, Size Wu, Wenwei Zhang, Yining Li, Kai Chen, Chen Change Loy
    In: CVPR, 2024.
    Arxiv | Code | Project Page
  2. Open-Vocabulary SAM: Segment and Recognize Twenty-thousand Classes Interactively.
    Haobo Yuan, Xiangtai Li, Chong Zhou, Yining Li, Kai Chen, Chen Change Loy
    In: Arxiv, 2024.
    Arxiv | Code | Project Page
  3. Point Cloud Mamba: Point Cloud Learning via State Space Model.
    Tao Zhang, Xiangtai Li, Haobo Yuan, Shunping Ji, Shuicheng Yan
    In: Arxiv, 2024.
    Arxiv | Code
  4. Towards Language-Driven Video Inpainting via Multimodal Large Language Models.
    Jianzong Wu, Xiangtai Li, Chenyang Si, Shangchen Zhou, Jingkang Yang, Jiangning Zhang, Yining Li, Kai Chen, Yunhai Tong, Ziwei Liu, Chen Change Loy
    In: CVPR, 2024.
    Arxiv | Code | Project Page
  5. RAP-SAM: Towards Real-Time All-Purpose Segment Anything.
    Shilin Xu, Haobo Yuan, Qingyu Shi, Lu Qi, Jingbo Wang, Yibo Yang, Yining Li, Kai Chen, Yunhai Tong, Bernard Ghanem, Xiangtai Li, Ming-Hsuan Yang
    In: Arxiv, 2024.
    Arxiv | Code | Project Page

  6. Generalizable Entity Grounding via Assistance of Large Language Model.
    Lu Qi, Yi-Wen Chen, Lehan Yang, Tiancheng Shen, Xiangtai Li, Weidong Guo, Yu Xu, Ming-Hsuan Yang
    In: Arxiv, 2024.
    Arxiv
  7. BA-SAM: Scalable Bias-Mode Attention Mask for Segment Anything Model.
    Yiran Song, Qianyu Zhou, Xiangtai Li, Deng-Ping Fan, Xuequan Lu, Lizhuang Ma
    In: CVPR, 2024.
    Arxiv | Code
  8. RTMO: Towards High-Performance One-Stage Real-Time Multi-Person Pose Estimation.
    Peng Lu, Tao Jiang, Yining Li, Xiangtai Li, Kai Chen, Wenming Yang
    In: CVPR, 2024.
    Arxiv | Code
  9. A Generalist FaceX via Learning Unified Facial Representation.
    Yue Han, Jiangning Zhang, Junwei Zhu, Xiangtai Li, Yanhao Ge, Wei Li, Chengjie Wang, Yong Liu, Xiaoming Liu, Ying Tai
    In: Arxiv (In Submission), 2024.
    Arxiv | Code
  10. Skeleton-in-Context: Unified Skeleton Sequence Modeling with In-Context Learning.
    Xinshun Wang, Zhongbin Fang, Xia Li, Xiangtai Li, Chen Chen, Mengyuan Liu
    In: CVPR, 2024.
    Arxiv | Code
  11. CLIPSelf: Vision Transformer Distills Itself for Open-Vocabulary Dense Prediction.
    Size Wu, Wenwei Zhang, Lumin Xu, Sheng Jin, Xiangtai Li, Wentao Liu, Chen Change Loy
    In: ICLR (Spotlight), 2024.
    Arxiv | Code
  12. Towards Open-Vocabulary Learning: A Survey.
    Jianzong Wu*, Xiangtai Li*, Shilin Xu*, Haobo Yuan*, Henghui Ding, Yibo Yang, Xia Li, Jiangning Zhang, Yunhai Tong, Xudong Jiang, Bernard Ghanem, Dacheng Tao
    In: IEEE T-PAMI, 2024.
    Arxiv | Project
  13. EATFormer: Improving Vision Transformer Inspired by Evolutionary Algorithm.
    Jiangning Zhang*, Xiangtai Li*, Yabiao Wang, Chengjie Wang, Yibo Yang, Yong Liu, Dacheng Tao
    In: IJCV, 2024.
    Arxiv | Code
  14. An Open and Comprehensive Pipeline for Unified Object Grounding and Detection.
    Xiangyu Zhao, Yicheng Chen, Shilin Xu, Xiangtai Li, Xinjiang Wang, Yining Li, Haian Huang
    Technical Report for MM-Grounding DINO, 2024.
    Arxiv | Code
  15. GenView: Enhancing View Quality with Pretrained Generative Model for Self-Supervised Learning.
    Xiaojie Li, Yibo Yang, Xiangtai Li, Jianlong Wu, Yue Yu, Bernard Ghanem, Min Zhang
    Technical Report, 2024.
    Arxiv | Code
  16. Explore In-Context Segmentation via Latent Diffusion Models.
    Chaoyang Wang, Xiangtai Li, Henghui Ding, Lu Qi, Jiangning Zhang, Yunhai Tong, Chen Change Loy, Shuicheng Yan
    Technical Report, 2024.
    Arxiv | Project
  17. VG4D: Vision-Language Model Goes 4D Video Recognition.
    Zhichao Deng, Xiangtai Li, Xia Li, Yunhai Tong, Shen Zhao, Mengyuan Liu
    In: ICRA, 2024.
    Arxiv | Code
  18. Towards Robust Referring Image Segmentation.
    Jianzong Wu*, Xiangtai Li*, Xia Li, Henghui Ding, Yunhai Tong, Dacheng Tao
    In: TIP, 2024.
    Arxiv | Code
2023
  1. Transformer-Based Visual Segmentation: A Survey.
    Xiangtai Li, Henghui Ding, Wenwei Zhang, Haobo Yuan, Jiangmiao Pang, Guangliang Cheng, Kai Chen, Ziwei Liu, Chen Change Loy
    In: IEEE T-PAMI (major), 2023.
    Arxiv | Code | Project Page in MMlab@NTU
  2. PanopticPartFormer++: A Unified and Decoupled View for Panoptic Part Segmentation.
    Xiangtai Li, Shilin Xu, Yibo Yang, Haobo Yuan, Guangliang Cheng, Yunhai Tong, Zhouchen Lin, Ming-Hsuan Yang, Dacheng Tao
    In: IEEE T-PAMI (major), Extension of PanopticPartFormer, 2023.
    Paper | Code
  3. Tube-Link: A Flexible Cross Tube Baseline for Universal Video Segmentation.
    Xiangtai Li, Haobo Yuan, Wenwei Zhang, Guangliang Cheng, Jiangmiao Pang, Chen Change Loy
    In: ICCV, 2023.
    Arxiv | Code
  4. SFNet: Faster and Accurate Semantic Segmentation via Semantic Flow.
    Xiangtai Li, Jiangning Zhang, Yibo Yang, Guangliang Cheng, Kuiyuan Yang, Yunhai Tong, Dacheng Tao
    In: IJCV, 2023.
    Arxiv | Code
  5. Betrayed by Captions: Joint Caption Grounding and Generation for Open Vocabulary Instance Segmentation.
    Jianzong Wu*, Xiangtai Li*, Henghui Ding, Xia Li, Guangliang Cheng, Yunhai Tong, Chen Change Loy
    In: ICCV, 2023.
    Arxiv | Code
  6. Rethinking Mobile Block for Efficient Neural Models.
    Jiangning Zhang, Xiangtai Li, Jian Li, Liang Liu, Zhucun Xue, Boshen Zhang, Zhengkai Jiang, Tianxin Huang, Yabiao Wang, Chengjie Wang
    In: ICCV, 2023.
    Arxiv | Code
  7. EdgeSAM: Prompt-In-the-Loop Distillation for On-Device Deployment of SAM.
    Chong Zhou, Xiangtai Li, Chen Change Loy, Bo Dai
    In: Arxiv, 2023.
    Arxiv | Code
  8. Explore In-Context Learning for 3D Point Cloud Understanding.
    Zhongbin Fang, Xiangtai Li, Xia Li, Joachim M. Buhmann, Chen Change Loy, Mengyuan Liu
    In: NeurIPS (Spotlight), 2023.
    Arxiv | Code
  9. Neural Collapse Inspired Feature-Classifier Alignment for Few-Shot Class-Incremental Learning.
    Yibo Yang, Haobo Yuan, Xiangtai Li, Zhouchen Lin, Philip Torr, Dacheng Tao
    In: ICLR (Spotlight), 2023.
    Arxiv | Code
  10. 4D Panoptic Scene Graph Generation.
    Jingkang Yang, Jun Chen, Wenxuan Peng, Shuai Liu, Fangzhou Hong, Xiangtai Li, Kaiyang Zhou, Qifeng Chen, Ziwei Liu
    In: NeurIPS (Spotlight), 2023.
    Arxiv | Code
  11. Iterative Robust Visual Grounding with Masked Reference based Centerpoint Supervision.
    Menghao Li, Chunlei Wang, Wenquan Feng, Shuchang Lyu, Guangliang Cheng, Xiangtai Li, Binghao Liu, Qi Zhao
    In: ICCV workshop, 2023 (Oral Presentation, Best Paper Nomination).
    Arxiv | Code
  12. DST-Det: Simple Dynamic Self-Training for Open-Vocabulary Object Detection.
    Shilin Xu*, Xiangtai Li*, Size Wu, Wenwei Zhang, Yining Li, Guangliang Cheng, Yunhai Tong, Kai Chen, Chen Change Loy
    In: Arxiv, 2023.
    Arxiv | Code
  13. Neural Collapse Terminus: A Unified Solution for Class Incremental Learning and Its Variants.
    Yibo Yang, Haobo Yuan, Xiangtai Li, Jianlong Wu, Lefei Zhang, Zhouchen Lin, Philip Torr, Dacheng Tao, Bernard Ghanem
    In: Arxiv (In Submission), 2023.
    Arxiv | Code
  14. Multi-Task Learning with Multi-Query Transformer for Dense Prediction.
    Yangyang Xu*, Xiangtai Li*, Haobo Yuan, Yibo Yang, Jing Zhang, Yunhai Tong, Lefei Zhang, Dacheng Tao
    In: IEEE-TCSVT, 2023.
    Arxiv | Code
  15. Panoptic Video Scene Graph Generation.
    Jingkang Yang, Wenxuan Peng, Xiangtai Li, Zujin Guo, Liangyu Chen, Bo Li, Zheng Ma, Wayne Zhang, Kaiyang Zhou, Chen Change Loy, Ziwei Liu
    In: CVPR, 2023.
    Paper | Code
  16. Pair then Relation: Pair-Net for Panoptic Scene Graph Generation.
    Jinghao Wang, Zhengyu Wen, Xiangtai Li, Zujin Guo, Jingkang Yang, Ziwei Liu
    In: IEEE T-PAMI, 2023.
    Paper | Code
  17. Rethinking Evaluation Metrics of Open-Vocabulary Segmentation.
    Hao Zhou, Tiancheng Shen, Xu Yang, Hai Huang, Xiangtai Li, Lu Qi, Ming-Hsuan Yang
    In: Arxiv (In Submission), 2023.
    Arxiv | Project
  18. Effective Adapter for Face Recognition in the Wild.
    Yunhao Liu, Lu Qi, Yu-Ju Tsai, Xiangtai Li, Kelvin C. K. Chan, Ming-Hsuan Yang
    In: Arxiv (In Submission), 2023.
    Arxiv | Project
  19. MosaicFusion: Diffusion Models as Data Augmenters for Large Vocabulary Instance Segmentation.
    Jiahao Xie, Wei Li, Xiangtai Li, Ziwei Liu, Yew Soon Ong, Chen Change Loy
    In: Arxiv (In Submission), 2023.
    Arxiv | Project
  20. Exploring Plain ViT Reconstruction for Multi-class Unsupervised Anomaly Detection.
    Jiangning Zhang, Xuhai Chen, Yabiao Wang, Chengjie Wang, Yong Liu, Xiangtai Li, Ming-Hsuan Yang, Dacheng Tao
    In: Arxiv (In Submission), 2023.
    Arxiv | Project
  21. Reference Twice: A Simple and Unified Baseline for Few-Shot Instance Segmentation.
    Yue Han, Jiangning Zhang, Zhucun Xue, Chao Xu, Xintian Shen, Yabiao Wang, Chengjie Wang, Yong Liu, Xiangtai Li
    In: T-PAMI (major), 2023.
    Arxiv | Code
2022
  1. TransVOD: End-to-End Video Object Detection with Spatial-Temporal Transformers.
    Qianyu Zhou*, Xiangtai Li*, Lu He, Yibo Yang, Guangliang Cheng, Yunhai Tong, Lizhuang Ma, Dacheng Tao
    In: IEEE-T-PAMI, 2022.
    Arxiv | Code
  2. Improving Video Instance Segmentation via Temporal Pyramid Routing.
    Xiangtai Li, Hao He, Yibo Yang, Henghui Ding, Kuiyuan Yang, Guangliang Cheng, Yunhai Tong, Dacheng Tao
    In: IEEE-T-PAMI, 2022.
    Arxiv | Code
  3. Video K-Net: A Simple, Strong, and Unified Baseline for Video Segmentation.
    Xiangtai Li*, Wenwei Zhang*, Jiangmiao Pang*, Kai Chen, Guangliang Cheng, Yunhai Tong, Chen Change Loy
    In: CVPR, 2022. (Oral Presentation)
    Arxiv | Code
  4. Panoptic-PartFormer: Learning a Unified Model for Panoptic Part Segmentation.
    Xiangtai Li, Shilin Xu, Yibo Yang, Guangliang Cheng, Yunhai Tong, Dacheng Tao
    In: ECCV, 2022.
    Arxiv | Code
  5. Fashionformer: A Simple, Effective and Unified Baseline for Human Fashion Segmentation and Recognition.
    Shilin Xu*, Xiangtai Li*, Jingbo Wang, Guangliang Cheng, Yunhai Tong, Dacheng Tao
    In: ECCV, 2022.
    Arxiv | Code
  6. PolyphonicFormer: Unified Query Learning for Depth-aware Video Panoptic Segmentation.
    Haobo Yuan*, Xiangtai Li*, Yibo Yang, Guangliang Cheng, Jing Zhang, Yunhai Tong, Lefei Zhang, Dacheng Tao
    In: ECCV, 2022.
    Arxiv | Code
  7. Inducing Neural Collapse in Imbalanced Learning: Do We Really Need a Learnable Classifier at the End of Deep Neural Network?
    Yibo Yang, Shixiang Chen, Xiangtai Li, Liang Xie, Zhouchen Lin, Dacheng Tao
    In: NeurIPS, 2022.
    Arxiv | Code
  8. Convolution-enhanced Evolving Attention Networks.
    Yujing Wang, Yaming Yang, Zhuo Li, Jiangang Bai, Mingliang Zhang, Xiangtai Li, Jing Yu, Ce Zhang, Gao Huang, Yunhai Tong
    In: IEEE-T-PAMI, 2022.
    Paper | Code
2021
  1. End-to-End Video Object Detection with Spatial-Temporal Transformers.
    Lu He*, Qianyu Zhou*, Xiangtai Li*, Li Niu, Guangliang Cheng, Xiao Li, Wenxuan Liu, Yunhai Tong, Lizhuang Ma, Liqing Zhang
    In: ACM-MM, 2021.
    Arxiv | Code
  2. Fast and Accurate Scene Parsing via Bi-direction Alignment Networks.
    Yanran Wu, Xiangtai Li, Chen Shi, Yunhai Tong, Yang Hua, Tao Song, Ruhui Ma, Haibing Guan
    In: ICIP, 2021.
    Arxiv | Code
  3. Dynamic Dual Sampling Module for Fine-Grained Semantic Segmentation.
    Chen Shi, Xiangtai Li, Yanran Wu, Yunhai Tong, Yi Xu
    In: ICIP, 2021.
    Arxiv | Code
  4. BoundarySqueeze: Image Segmentation as Boundary Squeezing.
    Hao He*, Xiangtai Li*, Yibo Yang, Guangliang Cheng, Yunhai Tong, Lubin Weng, Shiming Xiang, Dacheng Tao
    In: Arxiv (in submission), 2021.
    Arxiv | Code
  5. Enhanced Boundary Learning for Glass-like Object Segmentation.
    Hao He*, Xiangtai Li*, Guangliang Cheng, Jianping Shi, Yunhai Tong, Gaofeng Meng, Veronique Prinet, Lubin Weng
    In: ICCV, 2021.
    Arxiv | Code
  6. PointFlow: Flowing Semantics Through Points for Aerial Image Segmentation.
    Xiangtai Li*, Hao He*, Xia Li, Duo Li, Guangliang Cheng, Jianping Shi, Lubin Weng, Yunhai Tong, Zhouchen Lin
    In: CVPR, 2021.
    Arxiv | Code | Jitter Code
  7. Involution: Inverting the Inherence of Convolution for Visual Recognition.
    Duo Li, Jie Hu, Changhu Wang, Xiangtai Li, Qi She, Lei Zhu, Tong Zhang, Qifeng Chen
    In: CVPR, 2021.
    Arxiv | Code
  8. Towards Efficient Scene Understanding via Squeeze Reasoning.
    Xiangtai Li, Xia Li, Ansheng You, Li Zhang, Guangliang Cheng, Kuiyuan Yang, Yunhai Tong, Zhouchen Lin
    In: IEEE-TIP, 2021.
    Arxiv | Code
  9. Global Aggregation then Local Distribution for Scene Parsing.
    Xiangtai Li, Li Zhang, Guangliang Cheng, Kuiyuan Yang, Yunhai Tong, Xiatian Zhu, Tao Xiang
    In: IEEE-TIP, 2021.
    Arxiv | Code
2020
  1. Semantic Flow for Fast and Accurate Scene Parsing.
    Xiangtai Li, Ansheng You, Zhen Zhu, Houlong Zhao, Maoke Yang, Kuiyuan Yang, Yunhai Tong
    In: ECCV, 2020. (Oral Presentation)
    Arxiv | Code | Code (torchcv)
  2. Improving Semantic Segmentation via Decoupled Body and Edge Supervision.
    Xiangtai Li, Xia Li, Li Zhang, Guangliang Cheng, Jianping Shi, Zhouchen Lin, Shaohua Tan, Yunhai Tong
    In: ECCV, 2020.
    Arxiv | Code
  3. GFF: Gated Fully Fusion for Semantic Segmentation.
    Xiangtai Li, Houlong Zhao, Lei Han, Yunhai Tong, Kuiyuan Yang
    In: AAAI, 2020. (Oral Presentation)
    Arxiv | Code
2019
  1. Global Aggregation then Local Distribution in Fully Convolutional Networks.
    Xiangtai Li, Li Zhang, Ansheng You, Maoke Yang, Kuiyuan Yang, Yunhai Tong
    In: BMVC, 2019.
    Paper | Code

  2. Dual Graph Convolutional Network for Semantic Segmentation.
    Li Zhang*, Xiangtai Li*, Anurag Arnab, Kuiyuan Yang, Yunhai Tong, Philip H.S. Torr
    In: BMVC, 2019.
    Paper | Code

  3. Flo2seg: Motion-Aided Semantic Segmentation.
    Xiangtai Li, Jiangang Bai, Yunhai Tong, Kuiyuan Yang
    In: IJCNN, 2019. (Long Paper)
    Paper


Based on a template by Jon Barron