Non-linear Least Squares¶
Introduction and definitions¶
Note
Definition Least Squares Problem
where \(f_i:R^{n} \in R, i = 1, \cdots\) , \(m\) are given functions, and \(m \leq n\);
fitting model:
The model depends on the parameters \(x = [x_1; x_2; x_3; x_4]^{T}\) . We assume that there exists an \(x^{\star}\) so that
where \(\epsilon_{i}\) are (measurement) errors on the data ordinates, assumed to behave like “white noise”.
For any choice of x we can compute the residuals
We assume that the cost function F is differentiable and so smooth that the following Taylor expansion is valid
where \(g\) is the gradient
and \(H\) is the Hessian
Descent Methods¶
The Steepest Descent method¶
Newton’s Method¶
Line Search¶
Trust Region and Damped Methods¶
Non-linear Least Squares Problems¶
The Gauss–Newton Method¶
梯度gradient, 由多元函数的各个偏导数组成的向量。以二元函数为例,其梯度为:
黑森矩阵Hessian matrix,由多元函数的二阶偏导数组成的方阵,描述函数的局部曲率,以二元函数为例
雅可比矩阵 Jacobian matrix,是多元函数一阶偏导数以一定方式排列成的矩阵,体现了一个可微方程与给出点的最优线性逼近。以二元函数为例
如果扩展多维的话F: \(\mathbb{R}^n \mapsto \mathbb{R}^m\) ,则雅可比矩阵是一个m行n列的矩阵
雅可比矩阵作用,如果P是 \(\mathbb{R}^n\) 中的一点,F在P点可微分,那么在这一点的导数由 \(J_F(P)\) 给出,在此情况下, 由 \(J_F(P)\) 描述的线性算子即接近点P的F的最优线性逼近:
牛顿法的基本思想是采用多项式函数来逼近给定的函数值,然后求出极小点的估计值,重复操作,直到达到一定精度为止。
Note
考虑如下一维无约束的极小化问题
需要注意的是,牛顿法在求极值的时候,如果初始点选取不好,则可能不收敛于极小点
试用牛顿法求 \(f(x) = \frac{3}{2}x_1^{2} + \frac{1}{2}x_2^{2} - x_1 x_2 - 2x_1\) 的极小值
取 \(x^{(0)} = (0, 0)^{T}\), 则
那么: