# colour.algebra.regression Module

## Regression Analysis

Defines objects to perform statistical regression analysis, including `linear_regression`.

`linear_regression` computes the statistics of the ideal trend line fitted to the given data using the least-squares method.

The equation of the line is $$y=b+mx$$ or $$y=b+m_1x_1+m_2x_2+...+m_nx_n$$, where the dependent variable $$y$$ is a function of the independent variable(s) $$x$$.
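As a rough illustration of what a least-squares fit of this equation involves, the sketch below builds a design matrix with a trailing column of ones for the intercept $$b$$ and solves it with `numpy.linalg.lstsq`. The helper name `fit_line` and the coefficient ordering are assumptions made for this example; this is not necessarily how `linear_regression` is implemented.

```python
import numpy as np


def fit_line(y, x):
    """Hypothetical helper: least-squares fit of y = b + m1*x1 + ... + mn*xn."""
    y = np.asarray(y, dtype=float)
    x = np.asarray(x, dtype=float)
    if x.ndim == 1:
        # A single independent variable becomes a one-column matrix.
        x = x[:, np.newaxis]
    # Append a column of ones so the last solved coefficient is the intercept b.
    design = np.hstack([x, np.ones((x.shape[0], 1))])
    coefficients, *_ = np.linalg.lstsq(design, y, rcond=None)
    # Ordering chosen for this sketch: [m1, ..., mn, b].
    return coefficients


x = np.array([0.0, 1.0, 2.0, 3.0])
y = 1.0 + 2.0 * x  # points lying exactly on y = 1 + 2x
m, b = fit_line(y, x)
```

Because the sample points lie exactly on a line, the fit recovers the slope and intercept up to floating-point error.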

Parameters:

- y (array_like) – Dependent and already known $$y$$ variable values used to fit the ideal trend line.
- x (array_like, optional) – Independent $$x$$ variable(s) values corresponding to the $$y$$ variable.
- additional_statistics (bool, optional) – Whether to output additional regression statistics; by default only the $$b$$ variable and $$m$$ coefficients are returned.

Returns: Regression statistics, ({{mn, mn-1, ..., b}, {sum_of_squares_residual}}).

Return type: ndarray

Raises: ValueError – If the $$y$$ and $$x$$ variables have incompatible dimensions.

References

 [2] http://en.wikipedia.org/wiki/Simple_linear_regression (Last accessed 24 May 2014)

Examples

Linear regression with the dependent and already known $$y$$ variable:

>>> import numpy as np
>>> from colour.algebra.regression import linear_regression
>>> y = np.array([1, 2, 1, 3, 2, 3, 3, 4, 4, 3])
>>> linear_regression(y)
array([ 0.2909090...,  1.        ])
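The result above is consistent with the omitted $$x$$ variable being treated as the sequence 1, 2, ..., n. This is inferred from the output rather than stated in the documentation, so treat it as an assumption. A quick cross-check with `numpy.polyfit` under that assumption:

```python
import numpy as np

y = np.array([1, 2, 1, 3, 2, 3, 3, 4, 4, 3])
# Assumed default for the omitted x variable: 1, 2, ..., n.
x = np.arange(1, len(y) + 1)

# A degree-1 polynomial fit returns the slope first, then the intercept.
slope, intercept = np.polyfit(x, y, 1)
print(slope, intercept)  # approximately 0.2909 and 1.0
```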


Linear regression with the dependent $$y$$ variable and independent $$x$$ variable:

>>> x1 = np.array([40, 45, 38, 50, 48, 55, 53, 55, 58, 40])
>>> linear_regression(y, x1)
array([ 0.1225194..., -3.3054357...])


Multiple linear regression with the dependent $$y$$ variable and multiple independent $$x_i$$ variables:

>>> x2 = np.array([25, 20, 30, 30, 28, 30, 34, 36, 32, 34])
>>> linear_regression(y, tuple(zip(x1, x2)))
array([ 0.0998002...,  0.0876257..., -4.8303807...])


Multiple linear regression with additional statistics:

>>> linear_regression(y, tuple(zip(x1, x2)), True)
(array([ 0.0998002...,  0.0876257..., -4.8303807...]), array([ 2.1376249...]))
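The second array in the output above corresponds to the `sum_of_squares_residual` entry of the return value. As a sketch of how such a statistic can be obtained, assuming a plain least-squares solve with `numpy.linalg.lstsq` (not necessarily what `linear_regression` does internally), the sum of squared residuals can be read from the solver or recomputed by hand from the fitted values:

```python
import numpy as np

y = np.array([1, 2, 1, 3, 2, 3, 3, 4, 4, 3], dtype=float)
x1 = np.array([40, 45, 38, 50, 48, 55, 53, 55, 58, 40], dtype=float)
x2 = np.array([25, 20, 30, 30, 28, 30, 34, 36, 32, 34], dtype=float)

# Design matrix: one column per independent variable, plus ones for the intercept.
design = np.column_stack([x1, x2, np.ones_like(y)])
coefficients, residual_ss, rank, _ = np.linalg.lstsq(design, y, rcond=None)

# Recompute the sum of squared residuals directly from the fitted values.
predicted = design @ coefficients
manual_ss = np.sum((y - predicted) ** 2)
```

Here `residual_ss` is a one-element array, and `manual_ss` should agree with it up to floating-point error.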