Research on the Construction Algorithm of Principal Curves

Lianwei Zhao, Yanchang Zhao, Siwei Luo, Chao Shao
2005-01-01
Abstract: Principal curves have been defined as self-consistent, smooth, one-dimensional curves that pass through the middle of a multidimensional data set. They are a nonlinear generalization of the first principal component. In this paper, we take a new approach and define principal curves as continuous curves based on the local tangent space, in the sense of a limit. We prove that these new principal curves not only satisfy the self-consistency property but also exist uniquely for any given open covering. Based on the new definition, we give a practical algorithm for constructing principal curves and analyze its convergence properties. The algorithm is illustrated on several simulated data sets.

Areas: data mining, machine learning

Affiliations:
1 School of Computer and Information Technology, Beijing Jiaotong University, Beijing 100044, China, lw_zhao@126.com
2 Faculty of Information Technology, University of Technology, Sydney, Australia, yczhao@it.uts.edu.au
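The abstract does not spell out how the local tangent space is computed, so the following is only a minimal sketch of one common way to estimate a tangent direction from data: local PCA on the points near a query location. The function name `local_tangent`, the neighbourhood `radius`, and the noisy-circle test data are assumptions made for illustration; they are not taken from the paper and should not be read as the authors' construction algorithm.

```python
import math
import random


def local_tangent(points, center, radius):
    """Estimate a unit tangent direction at `center` from nearby 2-D points.

    Hypothetical helper: uses the leading eigenvector of the local 2x2
    covariance matrix (local PCA) of all points within `radius` of `center`.
    """
    nbrs = [p for p in points if math.dist(p, center) <= radius]
    if len(nbrs) < 2:
        raise ValueError("need at least two neighbours inside the radius")
    n = len(nbrs)
    mx = sum(p[0] for p in nbrs) / n
    my = sum(p[1] for p in nbrs) / n
    sxx = sum((p[0] - mx) ** 2 for p in nbrs) / n
    syy = sum((p[1] - my) ** 2 for p in nbrs) / n
    sxy = sum((p[0] - mx) * (p[1] - my) for p in nbrs) / n
    # Orientation of the principal axis of a symmetric 2x2 matrix.
    theta = 0.5 * math.atan2(2.0 * sxy, sxx - syy)
    return (math.cos(theta), math.sin(theta))


# Simulated data (an assumption): points scattered around the unit circle.
random.seed(0)
data = []
for _ in range(2000):
    angle = random.uniform(0.0, 2.0 * math.pi)
    r = 1.0 + random.gauss(0.0, 0.02)
    data.append((r * math.cos(angle), r * math.sin(angle)))

# At (1, 0) the circle's true tangent is vertical, so the estimate
# should be close to (0, 1) or (0, -1); the sign is arbitrary.
tangent = local_tangent(data, center=(1.0, 0.0), radius=0.3)
print(tangent)
```

The estimate depends on the neighbourhood size: a radius that is too large mixes in curvature, while one that is too small leaves too few points and lets noise dominate the direction.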