# Supervised learning with the Scikit-Learn library

## Supervised learning

### Generalized Linear Models

The following are a set of methods intended for regression in which the target value is expected to be a linear combination of the input variables. In mathematical notation, if *ŷ* is the predicted value, then

ŷ(w, x) = w₀ + w₁x₁ + ... + wₚxₚ

Across the module, we designate the vector *w* = (w₁, ..., wₚ) as *coef_* and w₀ as *intercept_*.

**Ordinary Least Squares**

**LinearRegression** fits a linear model with coefficients *w* to minimize the residual sum of squares between the observed responses in the dataset and the responses predicted by the linear approximation. Mathematically, it solves a problem of the form:

min_w ||Xw − y||₂²

LinearRegression will take in its *fit* method arrays *X*, *y* and will store the coefficients *w* of the linear model in its *coef_* member.

```python
from sklearn import linear_model
reg = linear_model.LinearRegression()
reg.fit([[0, 0], [1, 1], [2, 2]], [0, 1, 2])
reg.coef_
# array([0.5, 0.5])
```

However, coefficient estimates for Ordinary Least Squares rely on the independence of the model terms. When terms are correlated and the columns of the design matrix *X* have an approximately linear dependence, the design matrix becomes close to singular and, as a result, the least-squares estimate becomes highly sensitive to random errors in the observed response, producing a large variance. This situation of *multicollinearity* can arise, for example, when data are collected without an experimental design.
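As a quick illustration of this sensitivity, the sketch below (on synthetic data, not an example from the library's documentation) fits **LinearRegression** on a design matrix whose second column nearly duplicates the first; tiny, independent perturbations of the response produce wildly different coefficient estimates:

```python
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.RandomState(0)
n = 50
x1 = rng.rand(n)
x2 = x1 + 1e-6 * rng.randn(n)  # second column nearly duplicates the first
X = np.column_stack([x1, x2])  # design matrix is close to singular

coefs = []
for seed in range(3):
    # tiny, independent noise in the observed response
    y = x1 + 0.01 * np.random.RandomState(seed).randn(n)
    coefs.append(LinearRegression().fit(X, y).coef_)
coefs = np.array(coefs)
# the individual coefficients swing wildly between fits, even though
# the fitted function (roughly y = x1) barely changes
```

The fitted *predictions* remain reasonable; it is the individual coefficients that become unstable, which is exactly the variance problem that Ridge regression addresses next.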

**Ridge Regression**

*Ridge* regression addresses some of the problems of *Ordinary Least Squares* by imposing a penalty on the size of the coefficients. The ridge coefficients minimize a penalized residual sum of squares:

min_w ||Xw − y||₂² + α ||w||₂²

Here, α > 0 is a complexity parameter that controls the amount of shrinkage: the larger the value of α, the greater the amount of shrinkage, and thus the coefficients become more robust to collinearity.

As with other linear models, *Ridge* will take in its *fit* method arrays *X*, *y* and will store the coefficients *w* of the linear model in its *coef_* member:

```python
from sklearn import linear_model
reg = linear_model.Ridge(alpha=0.5)
reg.fit([[0, 0], [0, 0], [1, 1]], [0, .1, 1])
reg.coef_
# array([0.34545455, 0.34545455])
reg.intercept_
# 0.13636...
```
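To see the shrinkage effect directly, the following sketch refits Ridge on the same toy data for increasing values of *alpha*; the norm of the coefficient vector decreases monotonically as the penalty grows:

```python
import numpy as np
from sklearn.linear_model import Ridge

X = [[0, 0], [0, 0], [1, 1]]
y = [0, .1, 1]

norms = []
for alpha in [0.01, 0.5, 10.0]:
    reg = Ridge(alpha=alpha).fit(X, y)
    norms.append(np.linalg.norm(reg.coef_))
    print(alpha, reg.coef_)
# larger alpha -> smaller coefficients (more shrinkage)
```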

**Lasso**

The *Lasso* is a linear model that estimates sparse coefficients. It is useful in some contexts due to its tendency to prefer solutions with fewer parameter values, effectively reducing the number of variables upon which the given solution is dependent. For this reason, the Lasso and its variants are fundamental to the field of compressed sensing. Under certain conditions, it can recover the exact set of non-zero weights.

Mathematically, it consists of a linear model trained with an ℓ1 prior as regularizer. The objective function to minimize is:

min_w (1 / (2·n_samples)) ||Xw − y||₂² + α ||w||₁

The implementation in the class *Lasso* uses coordinate descent as the algorithm to fit the coefficients.

```python
from sklearn import linear_model
reg = linear_model.Lasso(alpha=0.1)
reg.fit([[0, 0], [1, 1]], [0, 1])
reg.predict([[1, 1]])
# array([0.8])
```
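The sparsity of the Lasso solution is easier to see on slightly larger data. The sketch below (synthetic data, not from the original text) builds a target that depends on only two of ten features; the Lasso drives the coefficients of the irrelevant features exactly to zero:

```python
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.RandomState(0)
X = rng.randn(100, 10)
# the target depends on only the first two of the ten features
y = 3 * X[:, 0] - 2 * X[:, 1] + 0.1 * rng.randn(100)

reg = Lasso(alpha=0.1).fit(X, y)
print(reg.coef_)
# coefficients of the irrelevant features are exactly zero,
# so the model effectively selects the two informative variables
```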

Also useful for lower-level tasks is the function *lasso_path*, which computes the coefficients along the full path of possible values.
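A minimal sketch of *lasso_path* on synthetic data: the function returns the grid of alpha values (in decreasing order) together with the coefficients fitted at each alpha, as an array of shape (n_features, n_alphas):

```python
import numpy as np
from sklearn.linear_model import lasso_path

rng = np.random.RandomState(0)
X = rng.randn(50, 3)
y = X[:, 0] + 0.1 * rng.randn(50)

# compute Lasso coefficients along a decreasing grid of 5 alpha values
alphas, coefs, _ = lasso_path(X, y, n_alphas=5)
print(alphas.shape)  # (5,)
print(coefs.shape)   # (n_features, n_alphas) == (3, 5)
```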