Multimedia Semantic Analytics Lab

We are the Multimedia Semantic Analytics Lab (MSALab) at Peking University, led by Prof. Yunhai Tong.

Our mission is to build intelligent systems that understand and generate meaning across language, vision, and video. We combine strong theoretical foundations with practical, open-world applications in large-scale AI.

Highlights

We conduct cutting-edge research in multimodal learning, visual perception, video understanding, and language semantics, pushing AI systems toward deeper and more reliable semantic intelligence.

Current Core Directions
  • Multimodal large language models & agentic reasoning.
  • Image & video generation and editing.
  • Unified models.
  • Efficient and trustworthy large language models.

Open Positions

We welcome self-motivated research interns and PhD applicants with strong mathematical and engineering backgrounds, especially those interested in multimodal AI and generative modeling.

Explore our recent work ->