The Multimedia Semantic Analytics Lab, led by Prof. Yunhai Tong, has been a center of excellence for Artificial Intelligence research, teaching, and practice.
Our group explores state-of-the-art methods in multi-modal learning, visual perception, video content analysis, and language semantic analysis to advance the ability of AI models to understand the semantics behind various modalities.
Currently, we are focusing on three directions:
Multi-modal Large Language Model.
Text2Image and Text2Video Generation and Editing.
Large Language Model.
Our group at Peking University is recruiting self-motivated research interns and Ph.D.s who have strong coding and mathematical abilities and a great interest in multi-modal learning, large language models, and text-to-video generation and editing.
Recent research works can be found here.