
Moving Beyond Linearity

AU STAT627

Emil Hvitfeldt

2021-06-14

1 / 31

Moving beyond Linearity

We have so far worked (mostly) with linear models

Linear models are great because they are simple to describe and easy to work with in terms of interpretation and inference

However, the linearity assumption is often not satisfied

This week we will see what happens once we slowly relax the linearity assumption

2 / 31

Moving beyond Linearity

3 / 31

Moving beyond Linearity

4 / 31

Moving beyond Linearity

5 / 31

Polynomial regression

Simple linear regression

y_i = \beta_0 + \beta_1 x_i + \epsilon_i

2nd degree polynomial regression

y_i = \beta_0 + \beta_1 x_i + \beta_2 x_i^2 + \epsilon_i

6 / 31

Polynomial regression

Polynomial regression function with d degrees

y_i = \beta_0 + \beta_1 x_i + \beta_2 x_i^2 + \beta_3 x_i^3 + ... + \beta_d x_i^d + \epsilon_i

Notice how we can treat the polynomial regression as a standard linear regression with the predictors x_i, x_i^2, ..., x_i^d

7 / 31

Polynomial regression

We are not limited to using only one variable when doing polynomial regression

Instead of thinking of it as fitting a "polynomial regression" model

Think of it as fitting a linear regression using polynomially expanded variables
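
A minimal sketch of this way of thinking in R, assuming a data frame dat with a response y and a single predictor x (names are hypothetical):

# Degree-4 polynomial regression: an ordinary linear model fit on
# polynomially expanded versions of x
fit <- lm(y ~ poly(x, degree = 4), data = dat)

summary(fit)  # the usual linear-regression inference carries over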

8 / 31

Polynomial regression

Degree 2

9 / 31

Polynomial regression

Degree 3

10 / 31

Polynomial regression

Degree 4

11 / 31

Polynomial regression

Degree 10

12 / 31

Step Functions

We can also try to turn continuous variables into categorical variables

If we have data regarding the ages of people, then we can arrange them into groups such as

  • under 21
  • 21-34
  • 35-49
  • 50-65
  • over 65

13 / 31

Step Functions

We divide a variable into multiple bins, constructing an ordered categorical variable
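
A minimal sketch in R of the age example from the previous slide, assuming a data frame dat with a numeric column age and a response y (names and cut points as illustrated):

# Bin a continuous variable into an ordered categorical variable
dat$age_group <- cut(
  dat$age,
  breaks = c(-Inf, 21, 35, 50, 65, Inf),
  labels = c("under 21", "21-34", "35-49", "50-65", "over 65"),
  right  = FALSE,
  ordered_result = TRUE
)

# The step-function model is then just a linear regression on the bins
fit <- lm(y ~ age_group, data = dat)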

14 / 31

Step Functions

15 / 31

Step Functions

16 / 31

Step Functions

17 / 31

Step Functions

18 / 31

Step Functions

Depending on the number of cuts, you might miss the action of the variable in question

Be wary of using this method if you are going in blind: you end up creating many more columns in your data set, and the flexibility of the model increases drastically

19 / 31

Basis Functions

Both polynomial and piecewise-constant regression models are special cases of the basis function modeling approach

The idea is to have a selection of functions b_1(X), b_2(X), ..., b_K(X) that we apply to our predictors

y_i = \beta_0 + \beta_1 b_1(x_i) + \beta_2 b_2(x_i) + \beta_3 b_3(x_i) + ... + \beta_K b_K(x_i) + \epsilon_i

where b_1(X), b_2(X), ..., b_K(X) are fixed and known
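
For example, polynomial regression corresponds to choosing power functions as the basis, and piecewise-constant regression corresponds to choosing indicator functions of the bins:

b_j(x_i) = x_i^j

b_j(x_i) = I(c_j \leq x_i < c_{j+1})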

20 / 31

Basis Functions

The upside to this approach is that we can take advantage of the linear regression model for calculations along with all the inference tools and tests

This does not mean that we are limited to using linear regression models when using basis functions

21 / 31

Regression Splines

We can combine polynomial expansion and step functions to create piecewise polynomials

Instead of fitting one polynomial over the whole range of the data, we can fit multiple polynomials in a piecewise manner
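
A minimal sketch in R using the bs() function from the splines package, assuming a data frame dat with columns y and x (names and knot locations are hypothetical):

library(splines)

# Cubic regression spline: a linear regression on spline basis columns,
# with knots marking where the piecewise polynomials join
fit <- lm(y ~ bs(x, knots = c(25, 40, 60), degree = 3), data = dat)

summary(fit)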

22 / 31

Regression Splines

23 / 31

Regression Splines

24 / 31

Regression Splines

25 / 31

Regression Splines

26 / 31

Local Regression

Local regression is a method where the modeling happens locally

Namely, the fitted line at a given point only uses information from nearby observations
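
A minimal sketch in R with the built-in loess() function, assuming a data frame dat with columns y and x (names are hypothetical); span controls how large a neighborhood counts as nearby:

# Local regression: each fitted value uses only a fraction (span) of the
# data, weighted by distance to the point being predicted
fit <- loess(y ~ x, data = dat, span = 0.3, degree = 2)

preds <- predict(fit, newdata = data.frame(x = seq(min(dat$x), max(dat$x), length.out = 100)))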

27 / 31

Local Regression

28 / 31

Local Regression

29 / 31

Generalized Additive Models

Generalized Additive Models provide a general framework to extend the linear regression model by allowing non-linear functions of each predictor while maintaining additivity

The standard multiple linear regression model

y_i = \beta_0 + \beta_1 x_{i1} + \beta_2 x_{i2} + ... + \beta_p x_{ip} + \epsilon_i

is extended by replacing each linear component \beta_j x_{ij} with a smooth non-linear function f_j(x_{ij})

30 / 31

Generalized Additive Models

Giving us

y_i = \beta_0 + f_1(x_{i1}) + f_2(x_{i2}) + f_3(x_{i3}) + ... + f_p(x_{ip}) + \epsilon_i

Since we keep the model additive, we are left with a more interpretable model: we can examine the effect of each predictor on the response while holding the other predictors constant
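
A minimal sketch in R using the mgcv package, assuming a data frame dat with a response y and predictors x1, x2, x3 (all names are hypothetical):

library(mgcv)

# Generalized additive model: one smooth function per predictor,
# combined additively
fit <- gam(y ~ s(x1) + s(x2) + s(x3), data = dat)

summary(fit)
plot(fit, pages = 1)  # one panel per fitted smooth f_j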

31 / 31