Previous Up Next

17.2.1  Least-squares polynomial approximation

The fitpoly command is used for replacing tabular data or a function by a polynomial.

Smoothing data with polynomial regression.

fitpoly can be used for finding a polynomial that fits a given tabular data.

Example

We find a polynomial which best approximates data infected by noise. Data is obtained by evaluating the polynomial f(x)=1/100x4−1/5x3+6x at 100 points between x=1 and x=20.8. The noise is generated by a random, normally-distributed variable.

f(x):=x^4/100-x^3/5+6x:; X:=linspace(1.0,20.8):; Y:=apply(f,x)+randvector(100,randvar(normal,mean=0,stddev=10)):; p:=fitpoly(X,Y)
     
[0.011665,−0.266955,0.844543,2.44967,3.94846]           

We obtain the list of polynomial coefficients, starting with the leading coefficient. Note that the polynomial of degree 4 is returned (it has five coefficients), which is, in this case, obviously optimal for data smoothing. To visualize the fit, enter:

scatterplot(X,Y); plot(r2e(p,x),x=0..20,color=blue+line_width_2)

When approximating only the data for 1≤ x≤ 10, you obtain a polynomial of 9th degree. This is called overfitting. Namely, data curvature at the restriction is small, which makes the noise more prominent.

p1:=fitpoly(X,Y,x=1..10):; degree(p1)
     
9           

To make the approximating polynomial less sensitive to noise, you can increase the threshold value tol.

p2:=fitpoly(X,Y,x=1..10,threshold=0.05)
     
−0.256298728063 x3+2.89166276754 x2−9.93723278591 x+19.4201078899           

(Alternatively, you could set the parameter length to a smaller value, e.g. 3.) The command lines below show the curves p1 and p2 against data points (the overfitted one is drawn in grey).

scatterplot(X,Y); plot(p1,x=1..10,color=grey); plot(p2,x=1..10,color=blue+line_width_2)
Approximating functions by polynomials.

fitpoly can be used for approximating a continuous function f:[a,b]→ℝ by a polynomial pn of certain degree n which minimizes the error in the sense of ℓ2 norm.

Examples

fitpoly(cos(x),0..pi/2,1e-2)
     
−0.244320054366 x3−0.336364730985 x2+1.13004492821 x−0.0107731059645           
f:=exp(-7x)*5x:; g:=fitpoly(f,0..1,degree=8)
     
 −21.717636069 x8+107.930784832 x7−232.831655404 x6         
 +286.778708741 x5−222.236631985 x4+111.004732684 x3         
 −33.8769709361 x2+4.95239715728 x+0.000500886757642          

The mean absolute error of the above approximation can be estimated as follows:

sqrt(romberg((f-g)^2,x=0..1))
     
9.3456615562×10−5           

Previous Up Next