Abstract: High-dimensional and incomplete (HDI) data are commonly encountered in big data applications that involve complex interactions among numerous nodes, such as the user-item interactions in a recommender system. A stochastic gradient descent (SGD)-based latent factor analysis (LFA) model can perform efficient representation learning on such HDI data, thereby extracting useful knowledge from them. However, a standard SGD algorithm updates a latent factor based on the current stochastic gradient only, without considering past update information, so the resultant model suffers from slow convergence. To address this issue, this paper proposes an Adaptive Non-linear PID-incorporated SGD (ANPS) algorithm built on two ideas: 1) rebuilding the instant learning error used to compute the stochastic gradient, following the principle of a nonlinear proportional-integral-derivative (PID) controller, so that past update information is incorporated into the learning scheme efficiently; and 2) adapting the gain parameters following the principle of particle swarm optimization (PSO). Experiments on six widely adopted HDI datasets demonstrate that, compared with state-of-the-art LFA models, an ANPS-based LFA model achieves a significant advantage in both efficiency and accuracy. Moreover, its flexible gain parameter adaptation mechanism greatly improves its practicality for real-world problems.

Note to Practitioners: In many industrial applications, such as recommender systems, social network systems, and cloud service systems, one encounters numerous nodes and highly incomplete relationships among them. An HDI matrix is commonly adopted to describe such relationships. A major challenge is to acquire useful knowledge from an HDI matrix efficiently and accurately for data analysis tasks such as accurate recommendation, community detection, and web service selection. An SGD-based LFA model has been widely adopted to tackle this issue. However, it suffers from slow convergence, which leads to considerable time cost on large-scale datasets. This study proposes the ANPS algorithm, which follows the principle of a nonlinear PID controller. With it, an ANPS-based LFA model is obtained that converges quickly on an industrial HDI matrix. The proposed ANPS algorithm can also be applied to various types of machine learning models, thereby improving their utility and scalability in practice.
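
To make the idea of rebuilding the instant learning error more concrete, the following is a minimal sketch, not the paper's ANPS algorithm. It assumes a plain linear PID refinement with fixed gains: the error on each observed entry is combined with the running sum of its past errors (integral term) and its change since the previous visit (derivative term) before it enters the SGD update. The nonlinear gain form and the PSO-based gain adaptation described in the abstract are not specified there and are omitted. The function name `train_pid_sgd_lfa`, the gains `kp`, `ki`, `kd`, all default values, and the toy data are hypothetical choices for illustration.

```python
import random


def train_pid_sgd_lfa(entries, n_rows, n_cols, rank=2, lr=0.01, reg=0.01,
                      kp=1.0, ki=0.0, kd=0.0, epochs=200, seed=0):
    """SGD-based latent factor analysis with a PID-refined instant error.

    entries: list of (row, col, value) triples for the observed cells only.
    kp, ki, kd: fixed gains (assumed values); kp=1, ki=kd=0 is plain SGD.
    Returns the factor matrices P, Q and the per-epoch RMSE history.
    """
    rng = random.Random(seed)
    P = [[rng.uniform(0.1, 0.5) for _ in range(rank)] for _ in range(n_rows)]
    Q = [[rng.uniform(0.1, 0.5) for _ in range(rank)] for _ in range(n_cols)]
    err_sum = {}    # running sum of past errors per observed entry
    err_prev = {}   # error seen at the previous visit of each entry
    rmse_history = []
    for _ in range(epochs):
        squared = 0.0
        for u, i, r in entries:
            pred = sum(P[u][k] * Q[i][k] for k in range(rank))
            err = r - pred                       # instant learning error
            squared += err * err
            acc = err_sum.get((u, i), 0.0) + err
            prev = err_prev.get((u, i), err)     # first visit: zero derivative
            # Rebuilt error: proportional + integral + derivative parts.
            refined = kp * err + ki * acc + kd * (err - prev)
            err_sum[(u, i)] = acc
            err_prev[(u, i)] = err
            for k in range(rank):
                p_uk, q_ik = P[u][k], Q[i][k]
                # Regularized SGD step driven by the rebuilt error.
                P[u][k] += lr * (refined * q_ik - reg * p_uk)
                Q[i][k] += lr * (refined * p_uk - reg * q_ik)
        # RMSE of the pre-update errors observed during this epoch.
        rmse_history.append((squared / len(entries)) ** 0.5)
    return P, Q, rmse_history


if __name__ == "__main__":
    # Toy 3x3 matrix with six observed cells (made-up values).
    toy = [(0, 0, 1.0), (0, 1, 2.0), (1, 1, 4.0),
           (1, 2, 3.0), (2, 0, 3.0), (2, 2, 4.5)]
    _, _, plain = train_pid_sgd_lfa(toy, 3, 3)
    _, _, pid = train_pid_sgd_lfa(toy, 3, 3, ki=0.01, kd=0.1)
    print("plain SGD final RMSE:", plain[-1])
    print("PID-refined final RMSE:", pid[-1])
```

Keeping the integral and derivative state per observed entry is one possible design; it is what lets past update information reach the current step, at the cost of memory proportional to the number of observed entries. Whether fixed gains speed up convergence depends on their values, which is the motivation the abstract gives for adapting them automatically.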
