
Department of Zoology PowerPoint Presentation


Uploaded by onlinesearch in the Forest & Animals presentation category.


Department of Zoology Presentation Transcript

Slide 1 - Multiple regression
Slides 2-4 - Regression. Problem: to draw a straight line through the points that best explains the variance.
Slide 5 - Regression. Test with F, just like ANOVA: F = (variance explained by x-variable / df) / (variance still unexplained / df). Variance explained = change in line lengths²; variance unexplained = residual line lengths².
Slide 6 - Regression. Test with F, just like ANOVA: F = (variance explained by x-variable / df) / (variance still unexplained / df). In regression, each x-variable will normally have 1 df.
Slide 7 - Regression. Test with F, just like ANOVA: F = (variance explained by x-variable / df) / (variance still unexplained / df). Essentially a cost-benefit analysis: is the benefit in variance explained worth the cost in using up degrees of freedom?
Slide 8 - Regression. Also have R²: the proportion of total variance explained by the variable. R² = variance explained by x-variable / (variance explained by x-variable + unexplained variance).
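The F ratio and R² described on slides 5-8 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `regression_anova` and the data are hypothetical (not from the slides), and the slides' "variance" is treated here as a sum of squares.

```python
import numpy as np

def regression_anova(x, y):
    """Fit a straight line and partition the variation, ANOVA-style."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = len(y)
    slope, intercept = np.polyfit(x, y, 1)   # least-squares straight line
    fitted = slope * x + intercept
    ss_total = np.sum((y - y.mean()) ** 2)   # squared distances from the mean
    ss_resid = np.sum((y - fitted) ** 2)     # squared residual line lengths
    ss_explained = ss_total - ss_resid       # reduction achieved by the line
    df_x = 1                                 # one x-variable uses 1 df
    df_error = n - 1 - df_x                  # what is left for the error term
    f_ratio = (ss_explained / df_x) / (ss_resid / df_error)
    r_squared = ss_explained / ss_total
    return {
        "ss_total": ss_total,
        "ss_explained": ss_explained,
        "ss_resid": ss_resid,
        "df_error": df_error,
        "f_ratio": f_ratio,
        "r_squared": r_squared,
    }

# Hypothetical data, invented for this sketch.
x = [1, 2, 3, 4, 5, 6]
y = [1.2, 1.9, 3.2, 3.8, 5.1, 5.8]
result = regression_anova(x, y)
print(result)
```

The explained part is computed as the drop in squared line lengths when moving from the mean to the fitted line, which is how slide 5 describes it.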
Slide 9 - Regression example. Total variance for 32 data points is 300 units. An x-variable is then regressed against the data, accounting for 150 units of variance. What is the R²? What is the F ratio?
Slide 10 - Regression example. Total variance for 32 data points is 300 units. An x-variable is then regressed against the data, accounting for 150 units of variance. R² = 150/300 = 0.5. F(1,30) = (150/1) / (150/30) = 30. Why is df error = 30?
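The arithmetic of the regression example can be checked with a short Python sketch. The variable names are assumptions made for illustration, and the "units of variance" are treated as sums of squares. The error df comes out as 30 because 32 data points give 31 total df, and the single x-variable uses 1 of them.

```python
n_points = 32
ss_total = 300.0       # total variation in the data
ss_explained = 150.0   # variation accounted for by the x-variable

ss_unexplained = ss_total - ss_explained   # 150 units remain
df_x = 1                                   # one x-variable, 1 df
df_error = n_points - 1 - df_x             # 32 - 1 - 1 = 30

r_squared = ss_explained / ss_total
f_ratio = (ss_explained / df_x) / (ss_unexplained / df_error)

print(r_squared, f_ratio)  # 0.5 30.0
```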
Slide 11 - Multiple regression. [Figure: herbivore damage against tree age, with higher-nutrient and lower-nutrient trees marked.] Damage = m1*age + b
Slide 12 - [Figure: herbivore damage against tree age; residuals of herbivore damage against tree nutrient concentration.]
Slide 13 - [Figure: herbivore damage against tree age; residuals of herbivore damage against tree nutrient concentration.] Damage = m1*age + m2*nutrient + b
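A minimal sketch of fitting Damage = m1*age + m2*nutrient + b by least squares, assuming NumPy is available. The numbers are hypothetical, not the data behind the slides: damage is built from known coefficients so that the recovered values can be checked by eye.

```python
import numpy as np

# Hypothetical measurements, invented for this sketch.
age = np.array([2.0, 4.0, 6.0, 8.0, 10.0, 12.0])
nutrient = np.array([1.5, 0.5, 2.0, 1.0, 2.5, 0.8])

# Build damage from known coefficients (m1 = 0.5, m2 = 2.0, b = 1.0).
damage = 0.5 * age + 2.0 * nutrient + 1.0

# Design matrix: one column per x-variable plus a column of ones for b.
X = np.column_stack([age, nutrient, np.ones_like(age)])
(m1, m2, b), *_ = np.linalg.lstsq(X, damage, rcond=None)

print(m1, m2, b)
```

Each x-variable adds one column to the design matrix, which matches the slides' point that each x-variable normally costs 1 df.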
Slide 14 - No interaction (additive): Damage = m1*age + m2*nutrient + b. Interaction (non-additive): Damage = m1*age + m2*nutrient + m3*age*nutrient + b. [Two plots, each with y on the vertical axis.]
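The interaction model can be sketched the same way: the product age*nutrient is simply one more column in the design matrix. The data and the helper `age_slope` are hypothetical, chosen only to show that with an interaction the effect of age depends on the nutrient level.

```python
import numpy as np

# Hypothetical values, invented for this sketch.
age = np.array([2.0, 4.0, 6.0, 8.0, 10.0, 12.0])
nutrient = np.array([1.5, 0.5, 2.0, 1.0, 2.5, 0.8])

# Known coefficients: m1 = 0.5, m2 = 2.0, m3 = 0.3, b = 1.0.
damage = 0.5 * age + 2.0 * nutrient + 0.3 * age * nutrient + 1.0

# The interaction term is an extra column holding the product age * nutrient.
X = np.column_stack([age, nutrient, age * nutrient, np.ones_like(age)])
(m1, m2, m3, b), *_ = np.linalg.lstsq(X, damage, rcond=None)

def age_slope(nutrient_level):
    """Change in damage per unit of age at a given nutrient level."""
    return m1 + m3 * nutrient_level

print(m1, m2, m3, b)
print(age_slope(0.0), age_slope(2.0))
```

In the additive model the age slope is m1 at every nutrient level; here it is m1 + m3*nutrient, so lines for different nutrient levels are no longer parallel.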
Slide 15 - Non-linear regression? Just a special case of multiple regression! Y = m1*x + m2*x² + b

X    X²    Y
1    1     1.1
2    4     2.0
3    9     3.6
4    16    3.1
5    25    5.2
6    36    6.7
7    49    11.3

Treating X as X1 and X² as X2: Y = m1*x1 + m2*x2 + b
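As a sketch of this idea, the table from slide 15 can be fitted as an ordinary two-variable regression, with x1 = X and x2 = X². The use of NumPy and the cross-check against `np.polyfit` are additions for illustration, not part of the slides.

```python
import numpy as np

# Values from the slide's table.
x = np.array([1, 2, 3, 4, 5, 6, 7], dtype=float)
y = np.array([1.1, 2.0, 3.6, 3.1, 5.2, 6.7, 11.3])

# Treat X and X squared as two separate x-variables.
x1 = x
x2 = x ** 2
X = np.column_stack([x1, x2, np.ones_like(x)])
(m1, m2, b), *_ = np.linalg.lstsq(X, y, rcond=None)

# Cross-check with a quadratic polynomial fit.
# np.polyfit returns coefficients from the highest power down.
m2_check, m1_check, b_check = np.polyfit(x, y, 2)

print(m1, m2, b)
print(m1_check, m2_check, b_check)
```

Both routes solve the same least-squares problem, so the two sets of coefficients should agree to within rounding.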
Slide 16 - STEPWISE REGRESSION
Slide 17 - [Figure: jump height (how high ball can be raised off the ground), in feet off ground; axis ticks 8 to 11.] Total SS = 11.11
Slide 18 - [No new text in transcript.]
Slide 19 - F(1,13)
Slide 20 - Why do you think weight is positively correlated with jump height?
Slide 21 - An idea: perhaps if we took two people of identical height, the lighter one might actually jump higher? Excess weight may reduce ability to jump high…
Slide 22 - How could we test this idea?
Slide 23 - [Figure labels: lighter, heavier]
Slide 24 - Why did the parameter estimates change? Why did the F tests change? F1,13
Slide 25 - Heavy people are often tall (tall people are often heavy). Tall people can jump higher. People light for their height can jump a bit more. [Diagram: Weight and Height are positively related (+); Height has a positive effect on Jump (+); Weight has a negative effect on Jump (-)]
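The weight, height and jump relationships described here can be illustrated with a small sketch. The numbers below are synthetic and noise-free, built only to match the signs in the diagram (they are not the lecture's data), and the helper `ols` is a hypothetical name. Weight on its own shows a positive slope with jump height, but once height is in the model the weight coefficient turns negative.

```python
import numpy as np

# Synthetic data (an assumption for illustration, not the lecture's data set).
height = np.array([150, 155, 160, 165, 170, 175, 180, 185, 190, 195], dtype=float)
# Each person is alternately a little heavy or a little light for their height.
deviation = np.array([5, -5, 5, -5, 5, -5, 5, -5, 5, -5], dtype=float)
weight = 0.9 * height - 90 + deviation
# Assumed truth: height helps jumping (+), weight hinders it (-).
jump = 0.05 * height - 0.02 * weight + 1.0


def ols(columns, y):
    """Least-squares coefficients of y on the given columns; the intercept comes last."""
    X = np.column_stack(list(columns) + [np.ones_like(y)])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef


# Weight alone: it stands in for height, so its slope is positive.
slope_weight_alone = ols([weight], jump)[0]

# Weight with height already in the model: the slope changes sign.
slope_height, slope_weight_given_height, _ = ols([height, weight], jump)

print(f"weight alone:        {slope_weight_alone:+.3f}")
print(f"weight given height: {slope_weight_given_height:+.3f}")
print(f"height given weight: {slope_height:+.3f}")
```

Because heavy people are often tall in this data, the single-variable fit credits weight with the benefit that really belongs to height.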
Slide 26 - The problem: the parameter estimate and significance of an x-variable are affected by the x-variables already in the model! How do we know which variables are significant, and in which order to enter them in the model?
Slide 27 - Solutions: 1) Use a logical order. For example, it makes sense to test the interaction first. 2) Stepwise regression: “tries out” various orders of entering and removing variables.
Slide 28 - Stepwise regression: enters or removes variables in order of significance, and checks after each step whether the significance of other variables has changed. Enters one by one: forward stepwise. Enters all, removes one by one: backwards stepwise.
Slide 29 - Forward stepwise regression: Enter the variable with the highest correlation with the y-variable first (if p < p-enter). Next, enter the variable that explains the most residual variation (if p < p-enter). Remove variables that become insignificant (p > p-leave) because other variables have been added. And so on…
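The forward stepwise procedure can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the routine used by any particular statistics package: it uses F-to-enter and F-to-remove thresholds (the hypothetical `f_enter` and `f_leave` parameters) in place of p-values so that only NumPy is needed, and the function names and simulated data are invented for the example.

```python
import numpy as np


def rss(columns, y):
    """Residual sum of squares of y regressed on `columns` plus an intercept."""
    X = np.column_stack(list(columns) + [np.ones(len(y))])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ coef
    return float(resid @ resid)


def partial_f(X, y, others, j):
    """F ratio (1 df) for adding column j to a model that already holds `others`."""
    rss_without = rss([X[:, i] for i in others], y)
    rss_with = rss([X[:, i] for i in others + [j]], y)
    df_error = len(y) - (len(others) + 1) - 1  # minus predictors, minus intercept
    if df_error <= 0:
        return 0.0
    if rss_with == 0.0:
        return np.inf
    return (rss_without - rss_with) / (rss_with / df_error)


def forward_stepwise(X, y, f_enter=4.0, f_leave=3.9, max_steps=100):
    """Return the indices of the selected columns of X, in order of entry."""
    selected = []
    for _ in range(max_steps):
        changed = False
        # Entry: add the unused variable with the largest F, if it exceeds f_enter.
        unused = [j for j in range(X.shape[1]) if j not in selected]
        scores = {j: partial_f(X, y, selected, j) for j in unused}
        if scores:
            best = max(scores, key=scores.get)
            if scores[best] > f_enter:
                selected.append(best)
                changed = True
        # Removal: re-test every variable given the others now in the model.
        if selected:
            scores = {j: partial_f(X, y, [i for i in selected if i != j], j)
                      for j in selected}
            worst = min(scores, key=scores.get)
            if scores[worst] < f_leave:
                selected.remove(worst)
                changed = True
        if not changed:
            break
    return selected


# Simulated data: only columns 0 and 2 truly affect y.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 4))
y = 3.0 * X[:, 0] - 2.0 * X[:, 2] + rng.normal(scale=0.5, size=200)
selected = forward_stepwise(X, y)
print(selected)
```

With a single candidate variable, the largest F corresponds to the highest squared correlation with y, which is why the first entry step matches the rule of entering the best-correlated variable first.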
Slide 30 - General words of caution! Correlation does not equal causation!
Slide 31 - General words of caution! Can interpolate between points, but don’t extrapolate (Mark Twain effect): "In the space of 176 years the Lower Mississippi has shortened itself 242 miles. That is an average of a trifle over 1 1/3 miles per year. Therefore, any calm person, who is not blind or idiotic, can see that in the old Oölitic Silurian Period, just a million years ago next November, the Lower Mississippi River was upwards of 1,300,000 miles long, and stuck out over the Gulf of Mexico like a fishing rod."
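The arithmetic behind the Twain passage is a straight-line fit pushed far outside the range of the data. A short sketch using only the figures quoted in the passage (the variable names are illustrative):

```python
# Figures quoted in the passage.
miles_shortened = 242
years_observed = 176

# Observed average rate of shortening: "a trifle over 1 1/3 miles per year".
rate = miles_shortened / years_observed

# Extending the same straight line a million years back, far beyond the
# 176 years that were actually observed.
years_back = 1_000_000
extra_length = rate * years_back

print(f"rate = {rate} miles per year")
print(f"extrapolated extra length = {extra_length:,.0f} miles")
```

The rate is a fair summary within the observed 176 years; the absurd result comes only from applying it to a period with no data.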