DiffSensei: Bridging Multi-Modal LLMs and Diffusion Models for Customized Manga Generation

Story visualization, the task of creating visual narratives from textual descriptions, has seen progress with text-to-image generation models. However, these models often lack effective control over character appearances and interactions, …

DreamRelation: Bridging Customization and Relation Generation

Customized image generation is crucial for delivering personalized content based on user-provided image prompts, aligning large-scale text-to-image diffusion models with individual needs. However, existing models often overlook the relationships …

You Can't Ignore Either: Unifying Structure and Feature Denoising for Robust Graph Learning

Recent research on the robustness of Graph Neural Networks (GNNs) under noise or attacks has attracted great attention due to its importance in real-world applications. Most previous methods explore a single noise source, recovering corrupt node …

SEFraud: Graph-based Self-Explainable Fraud Detection via Interpretative Mask Learning

Graph-based fraud detection has widespread application in modern industry scenarios, such as spam review and malicious account detection. While considerable efforts have been devoted to designing adequate fraud detectors, the interpretability of …

Characteristic-Aware Time-Series Representation Learning for Unsupervised Anomaly Detection

Time-series anomaly detection is an important research topic in data mining, popular in both academia and industry. Recently, unsupervised anomaly detection has drawn considerable attention, since it can detect anomalies without parameter tuning on …

Collaborative Multi-Task Representation for Natural Language Understanding

Multi-task learning has shown large benefits in Natural Language Understanding (NLU). However, current state-of-the-art (SOTA) methods like MT-DNN and MMoE do not model task relationships explicitly and fail to obtain effective task alignment. In this …

MotionBooth: Motion-Aware Customized Text-to-Video Generation

In this work, we present MotionBooth, an innovative framework designed for animating customized subjects with precise control over both object and camera movements. By leveraging a few images of a specific object, we efficiently fine-tune a …

SemFlow: Binding Semantic Segmentation and Image Synthesis via Rectified Flow

Semantic segmentation and semantic image synthesis are two representative tasks in visual perception and generation. While existing methods consider them as two distinct tasks, we propose a unified diffusion-based framework (SemFlow) and model them …

HGAMLP: Heterogeneous Graph Attention MLP with De-Redundancy Mechanism

Heterogeneous graphs contain rich semantic information that can be exploited by heterogeneous graph neural networks (HGNNs). However, scaling HGNNs to large graphs is challenging due to the high computational cost. Existing non-parametric HGNNs use …

SFNet: Faster and Accurate Semantic Segmentation via Semantic Flow

In this paper, we focus on exploring effective methods for faster and accurate semantic segmentation. A common practice to improve the performance is to attain high-resolution feature maps with strong semantic representation. Two strategies are …