Abstract: Fraud detection is a challenging task characterized by ever-evolving fraud patterns and scarce labeled data. Existing methods rely predominantly on graph-based or sequence-based approaches. Graph-based approaches connect users through shared entities to capture structural information, but they remain vulnerable to fraudsters who can disrupt or manipulate these connections. Sequence-based approaches, in contrast, analyze users' behavioral patterns, which makes them robust against tampering, but they overlook the interactions between similar users. Inspired by cohort analysis in retention and healthcare, this paper introduces VecAug, a novel cohort-augmented learning framework that addresses these challenges by enhancing the representation learning of target users with personalized cohort information. To this end, we first propose a vector burn-in technique for automatic cohort identification, which retrieves a task-specific cohort for each target user. Then, to fully exploit the cohort information, we introduce an attentive cohort aggregation technique for augmenting target user representations. To improve the robustness of this cohort augmentation, we also propose a novel label-aware cohort neighbor separation mechanism that distances negative cohort neighbors and calibrates the aggregated cohort information. By integrating this cohort information with target user representations, VecAug enhances the modeling capacity and generalization of the base model being augmented. Our framework is flexible and can be seamlessly integrated with existing fraud detection models. We deploy our framework on e-commerce platforms and evaluate it on three fraud detection datasets. The results show that VecAug improves the detection performance of base models by up to 2.48\% in AUC and 22.5\% in R@P$_{0.9}$, significantly outperforming state-of-the-art methods.
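To make the idea of cohort augmentation concrete, the following is a minimal sketch, not the paper's implementation. It assumes user representations are already available as fixed-length vectors, retrieves a cohort by plain cosine nearest-neighbor search (a stand-in for the vector burn-in retrieval, whose details the abstract does not give), and aggregates the cohort with scaled dot-product attention. The function names `retrieve_cohort`, `attentive_aggregate`, and `augment` are hypothetical, and the label-aware neighbor separation step is not covered.

```python
import numpy as np

def retrieve_cohort(target, bank, k):
    """Return indices of the k vectors in `bank` most similar to `target`.

    Assumption: cosine similarity over a plain in-memory array stands in
    for the task-specific cohort retrieval described in the abstract.
    """
    bank_unit = bank / np.linalg.norm(bank, axis=1, keepdims=True)
    target_unit = target / np.linalg.norm(target)
    sims = bank_unit @ target_unit
    return np.argsort(-sims)[:k]

def attentive_aggregate(target, cohort):
    """Attention-weighted average of cohort vectors.

    Assumption: scaled dot-product scores with a softmax; the actual
    attention form used by VecAug may differ.
    """
    scores = cohort @ target / np.sqrt(target.shape[0])
    scores = scores - scores.max()  # numerical stability for the softmax
    weights = np.exp(scores)
    weights = weights / weights.sum()
    return weights @ cohort, weights

def augment(target, bank, k=2):
    """Concatenate the target vector with its aggregated cohort vector."""
    idx = retrieve_cohort(target, bank, k)
    cohort_vec, weights = attentive_aggregate(target, bank[idx])
    return np.concatenate([target, cohort_vec]), idx, weights

# Toy usage with hand-written 4-dimensional user vectors.
bank = np.array([
    [1.0, 0.0, 0.0, 0.0],
    [0.9, 0.1, 0.0, 0.0],
    [0.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.0],
    [-1.0, 0.0, 0.0, 0.0],
])
target = np.array([1.0, 0.0, 0.0, 0.0])
augmented, cohort_idx, attn = augment(target, bank, k=2)
```

The augmented vector would then be passed to the downstream classifier in place of the original target representation. Concatenation is only one possible way to combine the two vectors and is chosen here for simplicity.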


VecAug: Unveiling Camouflaged Frauds with Cohort Augmentation for Enhanced Detection
