Revision 10 as of 2008-12-08 10:45:29 (editor: anonymous; comment: converted to 1.6 markup). Previous revision 9 as of 2007-12-13 14:16:42 (editor: OlegKobchenko).

Linear regression is a statistical method of modeling the relationship between a dependent variable $Y$ and independent variables $X$ by estimating the coefficients $a_i$ of the linear form:

$$Y = a_0 + a_1 x_1 + a_2 x_2 + \dots + a_n x_n$$

where each term $x_i$ is a certain expression in the original independent variables ($X$). For example, it could be that $x_2 = X_1^2$, as in the surface fit further down.
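As a small illustration of how terms can be built from an original variable, the sketch below tabulates the terms 1, X and X^2 for a single variable. The data values and the name T are hypothetical and chosen only for this example; they do not come from the essay.

```j
NB. Hypothetical data: one independent variable sampled at five points
X =: 0 1 2 3 4

NB. Power table: one row per observation, one column per term (1, X, X^2)
T =: X ^/ 0 1 2

$ T        NB. shape of the table of terms
```

Each column of T plays the role of one term of the linear form, so a table like this can serve as the right argument of a least squares fit.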

## Least Squares Method

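In the least squares method, the coefficients of the linear regression are selected to minimize the sum of squared deviations between the observations and their estimates:

$$\sum_k \left(Y_k - \hat{Y}_k\right)^2 \to \min$$

where $Y_k$ is the $k$-th observation and $\hat{Y}_k$ is its estimate from the linear form.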
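A minimal sketch of this minimization on a tiny straight-line problem is given here. The data and the names x, y, A, c1 and c2 are hypothetical. Dyadic `%.` (matrix divide) is the J primitive this essay uses for the fit; the normal-equations line is a standard textbook equivalent added for comparison, not something the essay itself uses.

```j
mp =: +/ . *            NB. matrix product, the same definition the essay uses
x  =: 0 1 2 3           NB. hypothetical sample points
y  =: 1 3 5 7           NB. hypothetical observations, exactly 1 + 2*x
A  =: 1 ,. x            NB. table of terms: a column of ones, then x

c1 =: y %. A            NB. least squares coefficients via matrix divide

NB. The same coefficients via the normal equations: inverse of (A'A) times A'y
c2 =: (%. (|: A) mp A) mp (|: A) mp y
```

Because the hypothetical observations lie exactly on a line, both c1 and c2 should recover an intercept of 1 and a slope of 2, up to rounding.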

## Surface Fit Example

As an example we will take a certain bi-quadratic form

$$Y = 1 + 0.2 X_1^2 + 0.3 X_2 - 0.4 X_2^2$$

then add a small amount of noise to simulate observed data, and try to reconstruct the coefficients using the least squares method.

[Figures: lsq_form.png, 'surface'plot X1;X2;FORM · lsq_data.png, 'surface'plot X1;X2;DATA · lsq_estm.png, 'surface'plot X1;X2;COEF mp XMAT]

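   load 'plot'
   mp =: +/ . *                        NB. matrix product

   'X1 X2' =: |: ,"0/~ i:8             NB. 17 by 17 grids of both variables over _8..8
   $XMAT =: 1 , X1 , (X1^2) , X2 , (X1*X2) ,: (X2^2)
6 17 17
   FORM =: 1 0 0.2 0.3 0 _0.4 mp XMAT  NB. exact surface from the known coefficients
   FORM -: 1 + (0.2*X1^2) + (0.3*X2) + (_0.4*X2^2)
1
   NOISE =: 4 * _0.5 + ($X1) ?.@$ 0    NB. uniform noise between _2 and 2, fixed seed
   $DATA   =: FORM + NOISE
17 17
   COEF  =: (,DATA) %. |:,"2 XMAT      NB. least squares fit over the 289 points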

Now we can compare the obtained coefficients with the original formula.

   0j4": COEF  ,: (,FORM) %. |:,"2 XMAT
1.0011 _0.0144 0.2005 0.3104 0.0024 _0.4013
1.0000  0.0000 0.2000 0.3000 0.0000 _0.4000

Additional regression analysis is provided in the 'stats' package.

   load 'stats'
   (|:}.,"2 XMAT) regression ,DATA

Var.       Coeff.         S.E.           t
0        1.00105        0.12654        7.91
1       _0.01444        0.01375       _1.05
2        0.20052        0.00316       63.55
3        0.31036        0.01375       22.56
4        0.00241        0.00281        0.86
5       _0.40131        0.00316     _127.17

Source     D.F.        S.S.          M.S.           F
Regression    5    27192.76720     5438.55344     4144.49
Error       283      371.36300        1.31224
Total       288    27564.13020

S.E. of estimate         1.14553
Corr. coeff. squared     0.98653                         

The squared correlation coefficient of 0.98653 shows a high degree of match between the observations and their estimates.
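The figures in the report are tied together by a few identities. The check below uses only the printed values and assumes the usual analysis-of-variance definitions, which the essay does not spell out. The 17 by 17 grid gives 289 observations, hence 288 total degrees of freedom.

$$
\begin{aligned}
\text{S.S.}_{\text{total}} &= 27192.76720 + 371.36300 = 27564.13020 \\
\text{M.S.}_{\text{regression}} &= \frac{27192.76720}{5} = 5438.55344 \\
\text{M.S.}_{\text{error}} &= \frac{371.36300}{283} \approx 1.31224 \\
F &= \frac{27192.76720 / 5}{371.36300 / 283} \approx 4144.49 \\
\text{S.E. of estimate} &= \sqrt{\frac{371.36300}{283}} \approx 1.14553 \\
\text{Corr. coeff. squared} &= \frac{27192.76720}{27564.13020} \approx 0.98653 \\
t_0 &= \frac{1.00105}{0.12654} \approx 7.91
\end{aligned}
$$

The last line shows how each t value is obtained: it is the coefficient divided by its standard error.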