Abstract: In this paper we study the setting where features are added or change interpretation over time, which has applications in multiple domains such as retail, manufacturing, and finance. In particular, we propose an approach to provably determine the time instant from which the new/changed features start becoming relevant with respect to an output variable in an agnostic (supervised) learning setting. We also suggest an efficient version of our approach which has the same asymptotic performance. Moreover, our theory also applies when we have more than one such change point. Independent post analysis of a change point identified by our method for a large retailer revealed that it corresponded in time with certain unflattering news stories about a brand that resulted in a change in customer behavior. We also applied our method to data from an advanced manufacturing plant, identifying the time instant from which downstream features became relevant. To the best of our knowledge, this is the first work that formally studies change point detection in a distribution-independent agnostic setting, where the change point is based on the changing relationship between input and output.

What problem does this paper attempt to address?

The paper addresses the following problem: when features are added or change interpretation over time, how can one determine the time point at which the new or changed features start to have a significant impact on the output variable? Specifically, the authors propose a method that determines, without distributional assumptions, the time point at which new or changed features become relevant in a supervised learning setting. They also propose a more efficient version of the method with the same asymptotic performance, and their theory extends to cases with multiple change points.

### Background and Motivation

In many practical application areas, such as retail, manufacturing, and finance, features in a data set may change or be added over time. For example, in a manufacturing process, new measurement tools, or old tools that are re-introduced into the production line after maintenance, produce new or changed measurement values. These changes may affect product quality, so it is important to determine when they start to have a significant impact on it. Doing so helps manufacturers take preventive or corrective measures and can improve overall production efficiency and profitability.

### Solution

The authors propose an algorithm called "Search-and-Split" (SaS), which determines the time point of feature changes by minimizing empirical risk. Specifically, they define two function classes \(H_1\) and \(H_2\) to describe the data relationships before and after the change respectively, and determine the optimal time point \(t^*\) by minimizing the following objective function:

\[R^*(h_1, h_2, t_0)=\frac{1}{m}\left(\sum_{t=1}^{t_0-1}(h_1(x_t)-\eta_t)^2+\sum_{t=t_0}^{m}(h_2(x_t)-\eta_t)^2\right)\]

where \(\eta_t=\mathbb{E}[Y_t]\) is the expected value of the output variable.
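As a rough illustration of the search described above, the following Python sketch tries every candidate split point \(t_0\), fits one model on each side, and keeps the split with the smallest average squared error. Several choices here are assumptions made for the example and are not taken from the paper: \(H_1\) is taken to be linear functions of the first `d_old` features, \(H_2\) linear functions of all features, the observed outputs `y` stand in for \(\eta_t\) (an empirical counterpart of \(R^*\)), and a hypothetical `min_seg` parameter keeps both segments non-degenerate. The function names are likewise illustrative.

```python
import numpy as np


def fit_sse(X, y):
    """Least-squares linear fit; returns (weights, sum of squared residuals)."""
    w, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = X @ w - y
    return w, float(resid @ resid)


def search_and_split(X, y, d_old, min_seg=5):
    """Exhaustive search over split points t0 (1-indexed, as in the text).

    Segment 1 is t = 1..t0-1 and is fit with the first d_old features only
    (assumed H1); segment 2 is t = t0..m and is fit with all features
    (assumed H2). Returns (risk, t0, w1, w2) for the best split.
    """
    m = len(y)
    best = None
    # min_seg is an assumption of this sketch: each segment keeps >= min_seg rows.
    for t0 in range(min_seg + 1, m - min_seg + 2):
        k = t0 - 1  # number of samples in the first segment
        w1, sse1 = fit_sse(X[:k, :d_old], y[:k])
        w2, sse2 = fit_sse(X[k:], y[k:])
        risk = (sse1 + sse2) / m  # empirical analogue of R*(h1, h2, t0)
        if best is None or risk < best[0]:
            best = (risk, t0, w1, w2)
    return best


# Synthetic demo: a third feature becomes relevant from t = 121 onward.
rng = np.random.default_rng(0)
m, t_star = 200, 121
X = rng.normal(size=(m, 3))
y = X[:, :2] @ np.array([1.0, -2.0]) + 0.01 * rng.normal(size=m)
y[t_star - 1:] += 3.0 * X[t_star - 1:, 2]

risk_hat, t_hat, w1_hat, w2_hat = search_and_split(X, y, d_old=2)
print("estimated change point:", t_hat, "empirical risk:", risk_hat)
```

This brute-force version refits both models for every candidate, so its cost grows with the number of candidates times the cost of a fit; it is meant only to make the objective concrete, not to reproduce the authors' SaS or its efficient variant.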
By minimizing \(R^*\), the best time point \(t^*\) can be found such that new or changed features start to have a significant impact on the output variable.

### Theoretical Analysis

The authors provide distribution-independent excess-risk guarantees and prove that their method is still effective in the case of feature changes. Specifically, they prove the following theorem:

**Theorem 1**: With probability at least \(1-\delta\), we have

\[R^*(\hat{h}_1,\hat{h}_2,\hat{t})\leq R^*(h_1^*,h_2^*,t^*)+\frac{22B\sqrt{2\ln\left(\frac{2(m+1)}{\delta}\right)+\sum_{j=1}^{2}3p_j\ln\left(\frac{emB}{p_j}\right)}}{m}\]

where \(\hat{h}_1\) and \(\hat{h}_2\) are the estimated functions obtained by minimizing the empirical risk \(\hat{R}\), \(\hat{t}\) is the estimated time point, \(h_1^*\) and \(h_2^*\) are the optimal functions, \(t^*\) is the optimal time point, \(p_1\) and \(p_2\) are the pseudo-dimensions of \(H_1\) and \(H_2\) respectively, and \(B\) is the range of the output variable.

### Experimental Verification

The authors conducted experiments on synthetic data and two real-world industrial data sets to verify the effectiveness of their method. The experimental results show that the SaS method performs best in adapting to feature changes, while SaSF (the efficient version of SaS), although slightly inferior, is still able to adapt to changes quickly and is significantly more computationally efficient than the original SaS method.

### Application Scenarios

This method is applicable not only to the manufacturing industry, but also to other fields such as retail, finance, document classification, and sensor networks, where features may change or be added over time. For example, in the retail industry, sales strategies can be adjusted by identifying the time point at which brand reputation changes; in the financial field, it can be used...
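To get a feel for the excess-risk term in Theorem 1, it can be evaluated numerically. The sketch below transcribes the expression exactly as it is printed in this summary (it has not been checked against the paper itself), and the function name and the example values of `m`, `B`, `p1`, `p2`, and `delta` are chosen purely for illustration.

```python
import math


def excess_risk_term(m, B, p1, p2, delta):
    """Additive term on the right-hand side of Theorem 1, as printed above.

    m: number of samples, B: range of the output variable,
    p1, p2: pseudo-dimensions of H1 and H2, delta: failure probability.
    """
    inside = 2.0 * math.log(2.0 * (m + 1) / delta)
    for p in (p1, p2):
        inside += 3.0 * p * math.log(math.e * m * B / p)
    return 22.0 * B * math.sqrt(inside) / m


# Illustrative values only: the term shrinks as the sample size m grows.
for m in (1_000, 10_000, 100_000):
    print(m, excess_risk_term(m, B=1.0, p1=2, p2=3, delta=0.05))
```

With the other quantities held fixed, the printed values decrease as `m` increases, which matches the reading that the estimated split and models approach the best achievable risk as more data is observed.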

Learning with Changing Features
