Abstract: Three methods of least squares are examined for fitting a line to points in the plane. Two well-known methods minimize the sum of squares of the vertical or the horizontal distances to the line. Less well known is the method that minimizes the sum of squares of the distances to the line. Concise proofs are given for each method using a combination of the first derivative test for functions of two variables and completing the square. The three methods are compared, and the distance to the line method appears to be favourable in most circumstances. The three methods generally draw different regression lines. The method of vertical displacements typically gives a slope of too small magnitude, while the method of horizontal displacements typically gives a slope of too large magnitude. An inequality involving the three slopes is proved. With these two methods, rotating all the data points in the same way does not result in the regression line being rotated the same way. However, the distance to the line method is invariant under rotations.

What problem does this paper attempt to address?

This paper compares three least squares methods for fitting a straight line to a set of points in the plane, and examines the advantages, disadvantages, and behaviour of each in different situations. Specifically:

1. **Vertical distance minimization (Y-method)**:
   - Fits a line by minimizing the sum of the squares of the vertical distances from the points to the line.
   - The vertical distance is \( Y_i = \vert m x_i + b - y_i \vert \), and the objective is \( Y(m, b) = \frac{1}{n}\sum_{i=1}^{n} Y_i^{2} \).
2. **Horizontal distance minimization (X-method)**:
   - Fits a line by minimizing the sum of the squares of the horizontal distances from the points to the line.
   - The horizontal distance is \( X_i = \vert \mu y_i + \beta - x_i \vert \), and the objective is \( X(\mu, \beta) = \frac{1}{n}\sum_{i=1}^{n} X_i^{2} \).
3. **Point-to-line distance minimization (D-method)**:
   - Fits a line by minimizing the sum of the squares of the shortest (perpendicular) distances from the points to the line.
   - The distance is \( D_i = \vert x_i\sin\theta - y_i\cos\theta - c \vert \), and the objective is \( D(\theta, c) = \frac{1}{n}\sum_{i=1}^{n} D_i^{2} \).

The main contributions of the paper are:

- **Minimization conditions for each method**, with concise proofs that combine the first derivative test with completing the square.
- **A comparison of the three methods**, which finds that the point-to-line distance method (D-method) appears favourable in most circumstances.
- **An analysis of invariance under rotation and translation**. In particular, the D-method is invariant under rotation, while the other two methods are not.
- **Behaviour in special cases**, for example how the methods differ when the data points lie on parallel lines or on a circle.
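The three objectives above can be minimized in closed form from the sample means and second central moments. The following is a minimal sketch of that calculation, not the paper's own derivation: the function names (`moments`, `fit_y`, `fit_x`, `fit_d`) and the sample points are hypothetical, and the formulas are the standard ones for ordinary and orthogonal regression. It assumes the points are not all on a vertical or horizontal line, so the divisions are defined.

```python
import math

def moments(pts):
    """Return means and second central moments (divided by n) of (x, y) points."""
    n = len(pts)
    xbar = sum(x for x, _ in pts) / n
    ybar = sum(y for _, y in pts) / n
    sxx = sum((x - xbar) ** 2 for x, _ in pts) / n
    syy = sum((y - ybar) ** 2 for _, y in pts) / n
    sxy = sum((x - xbar) * (y - ybar) for x, y in pts) / n
    return xbar, ybar, sxx, syy, sxy

def fit_y(pts):
    """Y-method: minimise Y(m, b); returns (m, b) for the line y = m x + b."""
    xbar, ybar, sxx, _, sxy = moments(pts)
    m = sxy / sxx                      # assumes the x values are not all equal
    return m, ybar - m * xbar

def fit_x(pts):
    """X-method: minimise X(mu, beta); returns (mu, beta) for x = mu y + beta."""
    xbar, ybar, _, syy, sxy = moments(pts)
    mu = sxy / syy                     # assumes the y values are not all equal
    return mu, xbar - mu * ybar

def fit_d(pts):
    """D-method: minimise D(theta, c); returns (theta, c) for x sin(theta) - y cos(theta) = c."""
    xbar, ybar, sxx, syy, sxy = moments(pts)
    # Minimising over theta gives tan(2 theta) = 2 sxy / (sxx - syy);
    # atan2 selects the branch that is a minimum rather than a maximum.
    theta = 0.5 * math.atan2(2.0 * sxy, sxx - syy)
    c = xbar * math.sin(theta) - ybar * math.cos(theta)
    return theta, c

# Hypothetical sample data, roughly along y = x.
sample = [(0, 0.1), (1, 0.9), (2, 2.3), (3, 2.8), (4, 4.2)]

m_y, b_y = fit_y(sample)
mu, beta = fit_x(sample)
theta, c = fit_d(sample)

slope_y = m_y
slope_x = 1.0 / mu                     # slope of x = mu y + beta in the (x, y) plane
slope_d = math.tan(theta)              # the line's direction is (cos theta, sin theta)

print("Y-method slope:", slope_y)
print("D-method slope:", slope_d)
print("X-method slope:", slope_x)
```

On this sample the slopes satisfy `abs(slope_y) <= abs(slope_d) <= abs(slope_x)`, which matches the abstract's remark that the vertical method tends to give too small a slope and the horizontal method too large a one. The exact form of the inequality proved in the paper is not reproduced here.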
In summary, the paper aims to reveal the characteristics of different least squares methods for fitting straight lines through mathematical derivations and case analyses, and to recommend more appropriate methods for practical applications.
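The rotation claim can be checked numerically. The sketch below is an illustration under stated assumptions rather than the paper's argument: the helper names (`rotate`, `angle_y`, `angle_d`, `angle_diff`), the sample points, and the rotation angle `alpha` are all hypothetical. It rotates every point about the origin by `alpha` and compares the angle of the fitted line before and after.

```python
import math

def rotate(pts, alpha):
    """Rotate every point counterclockwise about the origin by alpha radians."""
    ca, sa = math.cos(alpha), math.sin(alpha)
    return [(ca * x - sa * y, sa * x + ca * y) for x, y in pts]

def central_moments(pts):
    """Return the second central moments (sxx, syy, sxy), each divided by n."""
    n = len(pts)
    xbar = sum(x for x, _ in pts) / n
    ybar = sum(y for _, y in pts) / n
    sxx = sum((x - xbar) ** 2 for x, _ in pts) / n
    syy = sum((y - ybar) ** 2 for _, y in pts) / n
    sxy = sum((x - xbar) * (y - ybar) for x, y in pts) / n
    return sxx, syy, sxy

def angle_y(pts):
    """Angle of the Y-method line (vertical distances) with the x-axis."""
    sxx, _, sxy = central_moments(pts)
    return math.atan(sxy / sxx)        # assumes the x values are not all equal

def angle_d(pts):
    """Angle of the D-method line (perpendicular distances) with the x-axis."""
    sxx, syy, sxy = central_moments(pts)
    return 0.5 * math.atan2(2.0 * sxy, sxx - syy)

def angle_diff(a, b):
    """Smallest difference between two line directions, taken modulo pi."""
    d = (a - b) % math.pi
    return min(d, math.pi - d)

# Hypothetical scattered sample and rotation angle.
pts = [(0, 0), (1, 2), (2, 1), (3, 4), (4, 2)]
alpha = 0.5
rotated = rotate(pts, alpha)

# If a method were rotation invariant, its fitted line would turn by exactly alpha.
err_y = angle_diff(angle_y(rotated), angle_y(pts) + alpha)
err_d = angle_diff(angle_d(rotated), angle_d(pts) + alpha)

print("Y-method deviation from a pure rotation:", err_y)
print("D-method deviation from a pure rotation:", err_d)
```

For this sample the D-method deviation is zero up to rounding, while the Y-method deviation is clearly nonzero. The X-method can be checked the same way by swapping the roles of x and y.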

Variations on least squares
