RLRF4Rec: Reinforcement Learning from Recsys Feedback for Enhanced Recommendation Reranking

Large Language Models (LLMs) have demonstrated remarkable performance across diverse domains, prompting researchers to explore their potential for use in recommendation systems. Initial attempts have leveraged the exceptional capabilities of LLMs, …

LLAVADI: What Matters For Multimodal Large Language Models Distillation

The recent surge in Multimodal Large Language Models (MLLMs) has showcased their remarkable potential for achieving generalized intelligence by integrating visual understanding into Large Language Models. Nevertheless, the sheer model size of MLLMs …

VG4D: Vision-Language Model Goes 4D Video Recognition

Understanding the real world through point cloud video is a crucial aspect of robotics and autonomous driving systems. However, prevailing methods for 4D point cloud recognition have limitations due to sensor resolution, which leads to a lack of …

DST-Det: Simple dynamic self-training for open-vocabulary object detection

Open-vocabulary object detection (OVOD) aims to detect objects beyond the set of categories observed during training. This work presents a simple yet effective strategy that leverages the zero-shot classification ability of pre-trained …

Label-efficient interactive time-series anomaly detection

Time-series anomaly detection is an important task and has been widely applied in the industry. Since manual data annotation is expensive and inefficient, most applications adopt unsupervised anomaly detection methods, but the results are usually …

BoundarySqueeze: Image Segmentation as Boundary Squeezing

This paper proposes a novel method for high-quality image segmentation of both objects and scenes. Inspired by the dilation and erosion operations in morphological image processing techniques, the pixel-level image segmentation problems are treated …

Customizing Graph Neural Networks using Path Reweighting

Graph Neural Networks (GNNs) have been extensively used for mining graph-structured data with impressive performance. We argue that the paths in a graph imply different semantics for different downstream tasks. However, traditional GNNs do not …

LadaBERT: Lightweight adaptation of BERT through hybrid model compression

BERT is a cutting-edge language representation model pre-trained on a large corpus, which achieves superior performance on various natural language understanding tasks. However, a major blocking issue of applying BERT to online services is that it …

Customized graph embedding: tailoring embedding vectors to different applications

A graph is a natural representation of data for a variety of real-world applications, such as knowledge graph mining, social network analysis and biological network comparison. For these applications, graph embedding is crucial as it provides vector …

Global aggregation then local distribution in fully convolutional networks

It has been widely proven that modelling long-range dependencies in fully convolutional networks (FCNs) via global aggregation modules is critical for complex scene understanding tasks such as semantic segmentation and object detection. However, …