Abstract: It is always demanding to learn robust visual representations for various learning problems; however, this learning and maintenance process usually suffers from noise, incompleteness, or knowledge domain mismatch. Thus, robust representation learning by removing noisy features or samples, complementing incomplete data, and mitigating the distribution difference becomes the key. Along this line of research, low-rank modeling has been widely applied to solving representation learning challenges. This survey covers the topic from a knowledge flow perspective in terms of: (1) robust knowledge recovery, (2) robust knowledge transfer, and (3) robust knowledge fusion, centered around several major applications. First of all, we deliver a unified formulation for robust knowledge discovery given a single dataset. Second, we discuss robust knowledge transfer and fusion given multiple datasets with different knowledge flows, followed by practical challenges, model variations, and remarks. Finally, we highlight future research on robust knowledge discovery for incomplete, unbalanced, large-scale data analysis. This would benefit the AI community, from literature review to future directions.
What problem does this paper attempt to address?
The paper addresses how to learn robust visual representations when data suffers from noise, incompleteness, or mismatched knowledge domains. Specifically, from a knowledge flow perspective, the authors explore how low-rank modeling achieves the following three goals:
1. **Robust Knowledge Recovery**: For single-domain data, the aim is to recover the underlying low-rank structure from data affected by noise and outliers. This supports applications such as data clustering, anomaly detection, image segmentation, and classification.
2. **Robust Knowledge Transfer**: When data come from different distributions, low-rank modeling aligns the distribution differences between domains so that knowledge from the source domain transfers smoothly to the target domain. This covers data from two different domains, even when the data contain noise, outliers, or irrelevant knowledge.
3. **Robust Knowledge Fusion**: For multi-view or multi-domain data, low-rank modeling extracts the knowledge shared across views to support new learning tasks. This involves strategies such as early fusion, late fusion, and decision fusion, all aiming to capture the knowledge that is consistent across views.
### Summary of Mathematical Formulas
- **Unified Framework for Robust Knowledge Recovery**:
\[
\min_{\phi_{1/2}(\cdot), Z, E} \text{rank}(Z)+\lambda \|E\|_p+\gamma R(\phi_{1/2}(X), Z, Y)
\]
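where $\phi_{1/2}(\cdot)$ is shorthand for the two general mapping functions $\phi_1(\cdot)$ and $\phi_2(\cdot)$, $Z$ is the low-rank representation, $E$ is the error term, $R(\phi_{1/2}(X), Z, Y)$ is a regularization term, and $\lambda$ and $\gamma$ are balancing parameters. The problem is solved subject to the constraint: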
\[
\phi_1(X)=\phi_2(X)Z + E
\]
- **Optimization Problem for Robust Knowledge Completion**:
\[
\min_Z \|P_\Omega(X - Z)\|_F^2+\lambda \text{rank}(Z)
\]
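where $P_\Omega(\cdot)$ is a projection operator that retains the observed entries (those indexed by $\Omega$) and sets the missing entries to zero, so the data-fitting term is evaluated only on observed entries.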
- **Optimization Problem for Robust Transferable Embedding**:
\[
\min_{Z, E, P} \text{rank}(Z)+\lambda \|E\|_{2,1}+\gamma f(P, X_s, X_t)
\]
with the constraint:
\[
P^\top X_s = P^\top X_t Z + E
\]
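The completion objective above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the survey's own solver: the rank term is replaced by the nuclear norm (a common convex surrogate), the data term carries a factor of 1/2, and the update is a Soft-Impute-style iteration that repeatedly fills the missing entries with the current estimate and applies singular value thresholding. The names `svt`, `soft_impute`, `lam`, `n_iter`, and `tol` are hypothetical and chosen only for this example.

```python
import numpy as np


def svt(M, tau):
    """Singular value thresholding: shrink each singular value of M by tau."""
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    s = np.maximum(s - tau, 0.0)
    return (U * s) @ Vt


def soft_impute(X, mask, lam=0.1, n_iter=500, tol=1e-6):
    """Approximately minimize 0.5 * ||P_Omega(X - Z)||_F^2 + lam * ||Z||_*.

    X    : data matrix; values at unobserved positions are ignored.
    mask : boolean matrix, True where an entry of X is observed (the set Omega).
    """
    X_obs = np.where(mask, X, 0.0)       # P_Omega(X): zero out missing entries
    Z = np.zeros_like(X_obs, dtype=float)
    for _ in range(n_iter):
        # Keep observed entries, fill the missing ones with the current estimate.
        filled = X_obs + np.where(mask, 0.0, Z)
        Z_new = svt(filled, lam)
        converged = np.linalg.norm(Z_new - Z) <= tol * max(1.0, np.linalg.norm(Z))
        Z = Z_new
        if converged:
            break
    return Z


# Small synthetic check: a rank-2 matrix with roughly 40% of entries hidden.
rng = np.random.default_rng(0)
A = rng.standard_normal((30, 2)) @ rng.standard_normal((2, 20))
observed = rng.random(A.shape) < 0.6
Z_hat = soft_impute(A, observed, lam=0.1)
rel_err = np.linalg.norm((Z_hat - A)[~observed]) / np.linalg.norm(A[~observed])
print(f"relative error on missing entries: {rel_err:.3f}")
```

A larger `lam` pushes the estimate toward lower rank at the cost of a looser fit on the observed entries, which mirrors the trade-off controlled by $\lambda$ in the objective.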
Through these methods, the paper provides a comprehensive framework for learning robust representations in the face of noise, incomplete data, and domain differences, thereby improving the performance of various learning tasks.
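The $\|E\|_{2,1}$ penalty in the transferable-embedding objective can be made concrete with a short sketch. It assumes the column-wise convention that is common in the low-rank representation literature (the sum of the Euclidean norms of the columns of $E$, with each column treated as one sample), which this summary does not spell out. The function names `l21_norm` and `prox_l21` are hypothetical.

```python
import numpy as np


def l21_norm(E):
    """Sum of the Euclidean norms of the columns of E."""
    return float(np.sum(np.linalg.norm(E, axis=0)))


def prox_l21(Q, tau):
    """Proximal operator of tau * ||.||_{2,1}: shrink each column toward zero.

    Columns whose norm is at most tau become exactly zero; the others are
    rescaled so that their norm decreases by tau.
    """
    norms = np.linalg.norm(Q, axis=0)
    scale = np.maximum(1.0 - tau / np.maximum(norms, 1e-12), 0.0)
    return Q * scale


E = np.array([[3.0, 0.1],
              [4.0, 0.0]])
print(l21_norm(E))        # column norms are 5.0 and 0.1
print(prox_l21(E, 1.0))   # first column shrunk, second column zeroed
```

Because the shrinkage acts on whole columns, this penalty tends to zero out the error for most samples and leave a few columns nonzero, which matches the sample-specific noise and outliers that the error term is meant to absorb.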