Abstract: Large-scale recommender models find the most relevant items from huge catalogs, and they play a critical role in modern search and recommendation systems. To model an input space with large-vocabulary categorical features, a typical recommender model uses neural networks to learn a joint embedding space for both queries and items from user feedback data. However, with millions to billions of items in the corpus, users tend to provide feedback for only a very small set of them, which results in a power-law distribution. This makes the feedback data for long-tail items extremely sparse. Inspired by the recent success of self-supervised representation learning research in both computer vision and natural language understanding, we propose a multi-task self-supervised learning (SSL) framework for large-scale item recommendations. The framework is designed to tackle the label sparsity problem by learning better latent relationships among item features. Specifically, SSL improves item representation learning and also serves as additional regularization to improve generalization. Furthermore, we propose a novel data augmentation method that utilizes feature correlations within the proposed framework. We evaluate our framework using two real-world datasets with 500M and 1B training examples, respectively. Our results demonstrate the effectiveness of SSL regularization and show its superior performance over state-of-the-art regularization techniques. We have also launched the proposed techniques in a web-scale commercial app-to-app recommendation system, with significant improvements in top-tier business metrics demonstrated in A/B experiments on live traffic. Our online results also verify our hypothesis that the framework improves model performance even more on slices that lack supervision.

Self-Supervised Logit Adjustment

Learning Where to Learn in Cross-View Self-Supervised Learning

Self-supervised Learning is More Robust to Dataset Imbalance

Combating Representation Learning Disparity with Geometric Harmonization

On the Discriminability of Self-Supervised Representation Learning

Towards Realistic Long-tailed Semi-supervised Learning in an Open World

LaSSL: Label-Guided Self-Training for Semi-supervised Learning

Generalized Semi-Supervised Learning via Self-Supervised Feature Adaptation

Flexible Distribution Alignment: Towards Long-tailed Semi-supervised Learning with Proper Calibration

Self-supervised Learning for Large-scale Item Recommendations

Dataset Awareness is not Enough: Implementing Sample-level Tail Encouragement in Long-tailed Self-supervised Learning

Learning What and Where to Learn: A New Perspective on Self-supervised Learning

Feature Space Renormalization for Semi-supervised Learning

DeLaLA: Semisupervised Learning via Determinately Labeling and Kernelized Large Margin Projection

Improving Barely Supervised Learning by Discriminating Unlabeled Samples with Super-Class

Making Self-supervised Learning Robust to Spurious Correlation via Learning-speed Aware Sampling

The Common Stability Mechanism behind most Self-Supervised Learning Approaches

Label-free Monitoring of Self-Supervised Learning Progress

An Empirical Study of Self-supervised Learning with Wasserstein Distance

DC-SSL: Addressing Mismatched Class Distribution in Semi-supervised Learning

Semi-Supervised Learning Through Label Propagation on Geodesics