Linear prediction and least squares fit

Andrew Pua

February 2022

Summary

Where I come from

What the course is about

The goals of the course

Strategies that could work

Why the course is taught in English and why it is a good thing

How do I make things less difficult?

Grading

Coursework grade (平时成绩)

Example of a short quiz

Write down the expression needed to calculate \(\mathbb{E}\left(X^{2}\right)\).
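For reference, a sketch of an acceptable answer; the form depends on whether \(X\) is discrete or continuous:

\[
\mathbb{E}\left(X^{2}\right)=\sum_{x}x^{2}\,\mathbb{P}\left(X=x\right)
\qquad\text{or}\qquad
\mathbb{E}\left(X^{2}\right)=\int_{-\infty}^{\infty}x^{2}\,f_{X}\left(x\right)\,dx,
\]

where \(f_{X}\) is the density of \(X\) in the continuous case.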

Inspiration for these notes

Review of expected values

Optimal prediction under squared error loss
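As a reference for this slide, the standard result when the prediction must be a single constant \(c\) (notation may differ from the notes):

\[
\mathbb{E}\left[\left(Y-c\right)^{2}\right]=\mathrm{Var}\left(Y\right)+\left(\mathbb{E}\left(Y\right)-c\right)^{2},
\]

which is minimized by choosing \(c=\mathbb{E}\left(Y\right)\): the mean is the best constant predictor under squared error loss.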

Implications?

Example

Predictions with additional information
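When the predictor may depend on the additional information \(X\), the standard result (assuming finite second moments) is that the conditional expectation is optimal under squared error loss:

\[
\mathbb{E}\left[\left(Y-g\left(X\right)\right)^{2}\right]\geq\mathbb{E}\left[\left(Y-\mathbb{E}\left(Y\mid X\right)\right)^{2}\right]\quad\text{for any function }g,
\]

so the best predictor of \(Y\) given \(X\) is \(\mathbb{E}\left(Y\mid X\right)\).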

Explicit solutions
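For the best linear predictor \(\beta_{0}+\beta_{1}X\) of \(Y\), the usual explicit solutions (assuming \(\mathrm{Var}\left(X\right)>0\)) are

\[
\beta_{1}=\frac{\mathrm{Cov}\left(X,Y\right)}{\mathrm{Var}\left(X\right)},\qquad
\beta_{0}=\mathbb{E}\left(Y\right)-\beta_{1}\,\mathbb{E}\left(X\right).
\]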

Linear algebra interpretation

Orthogonality conditions

Systems of linear equations
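To tie the last three slides together: writing \(U=Y-\beta_{0}-\beta_{1}X\), the orthogonality conditions \(\mathbb{E}\left(U\right)=0\) and \(\mathbb{E}\left(XU\right)=0\) can be rearranged into a system of two linear equations in the two unknowns \(\left(\beta_{0},\beta_{1}\right)\):

\[
\begin{aligned}
\mathbb{E}\left(Y\right)&=\beta_{0}+\beta_{1}\,\mathbb{E}\left(X\right),\\
\mathbb{E}\left(XY\right)&=\beta_{0}\,\mathbb{E}\left(X\right)+\beta_{1}\,\mathbb{E}\left(X^{2}\right).
\end{aligned}
\]

Solving this system produces the explicit solutions given earlier.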

The “modern” version of the linear regression model

Exercises

The many versions of the linear regression model

Least squares algebra

Least squares geometry

Least squares algebra and geometry
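As a reference point for the algebra and the geometry, the usual matrix statement of least squares (notation may differ from the notes): with outcome vector \(\mathbf{y}\) and design matrix \(\mathbf{X}\), the fitted coefficients solve the normal equations

\[
\mathbf{X}^{\top}\mathbf{X}\,\widehat{\beta}=\mathbf{X}^{\top}\mathbf{y}
\quad\Longrightarrow\quad
\widehat{\beta}=\left(\mathbf{X}^{\top}\mathbf{X}\right)^{-1}\mathbf{X}^{\top}\mathbf{y}\ \text{ when }\mathbf{X}^{\top}\mathbf{X}\text{ is invertible,}
\]

and the residual vector \(e=\mathbf{y}-\mathbf{X}\widehat{\beta}\) satisfies \(\mathbf{X}^{\top}e=0\): the fitted values are the orthogonal projection of \(\mathbf{y}\) onto the column space of \(\mathbf{X}\).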

Intuition

Exercises

Connecting the linear regression model and the least squares fit

I SEE THE MOUSE

set.seed(20220221) # Change this to generate different results
source <- matrix(c(1,3,3,5,0,2,1,1), ncol = 2) # joint distribution: each row is an equally likely (Y, X) point
coefs <- matrix(NA, nrow=10^4, ncol=2) # Storage
for(i in 1:10^4)
{
  data <- source[sample(nrow(source), size=40, replace = TRUE),]  # IID sampling
  temp <- lm(data[, 1] ~ data[, 2]) # least squares of Y on X
  coefs[i, ] <- summary(temp)[[4]][, 1] # store coefficients
}
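One way to inspect the draws stored in coefs is to summarize them and draw a histogram per coefficient. A minimal sketch, only meant to approximate the sampling-distribution figures on the next slides:

colMeans(coefs) # Monte Carlo means of intercept and slope estimates
apply(coefs, 2, sd) # Monte Carlo standard deviations
par(mfrow = c(1, 2)) # two panels side by side
hist(coefs[, 1], main = expression(hat(beta)[0]), xlab = "Intercept estimates")
hist(coefs[, 2], main = expression(hat(beta)[1]), xlab = "Slope estimates")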

Sampling distribution of \(\left(\widehat{\beta}_{0},\widehat{\beta}_{1}\right)\), \(n=40\)

Sampling distribution of \(\left(\widehat{\beta}_{0},\widehat{\beta}_{1}\right)\), \(n=160\)

Sampling distribution of \(\left(\widehat{\beta}_{0},\widehat{\beta}_{1}\right)\), \(n=640\)

Consistency
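A sketch of the usual consistency argument (details may differ from the notes): the least squares coefficients are continuous functions of sample moments, each sample moment converges in probability to its population counterpart by the law of large numbers, and the continuous mapping theorem then delivers

\[
\widehat{\beta}_{1}\overset{p}{\to}\frac{\mathrm{Cov}\left(X,Y\right)}{\mathrm{Var}\left(X\right)}=\beta_{1},
\qquad
\widehat{\beta}_{0}=\overline{Y}-\widehat{\beta}_{1}\overline{X}\overset{p}{\to}\mathbb{E}\left(Y\right)-\beta_{1}\,\mathbb{E}\left(X\right)=\beta_{0}.
\]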

Consistency, continued

Asymptotic normality
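For reference, the standard statement for the slope under IID sampling and finite fourth moments; the variance term below may be what the notes denote by \(\phi^{2}\), but that is an assumption here:

\[
\sqrt{n}\left(\widehat{\beta}_{1}-\beta_{1}\right)\overset{d}{\to}N\left(0,\;\frac{\mathbb{E}\left[\left(X-\mathbb{E}\left(X\right)\right)^{2}U^{2}\right]}{\left[\mathrm{Var}\left(X\right)\right]^{2}}\right),
\qquad U=Y-\beta_{0}-\beta_{1}X.
\]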

The meaning of \(\phi^2\)

The effect of plugging in estimates instead of the real thing

Exercises

Consistently estimating the asymptotic variance
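A plug-in approach replaces the population moments in the asymptotic variance with their sample analogues. One common implementation in R is the heteroskedasticity-robust estimator from the sandwich package; a minimal sketch, which is not necessarily the estimator used in these notes (assumes the sandwich package is installed and reuses the same joint distribution as the simulations):

library(sandwich) # robust variance estimators
source <- matrix(c(1,3,3,5,0,2,1,1), ncol = 2) # joint distribution as before
data <- source[sample(nrow(source), size=40, replace = TRUE),] # IID sampling
temp <- lm(data[, 1] ~ data[, 2]) # least squares
vcovHC(temp, type = "HC0") # heteroskedasticity-robust estimate of the variance matrix of the coefficients
sqrt(diag(vcovHC(temp, type = "HC0"))) # robust standard errors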

Asymptotic variance, \(n=40\)

set.seed(20220221) # Change this to generate different results
source <- matrix(c(1,3,3,5,0,2,1,1), ncol = 2) # joint distribution: each row is an equally likely (Y, X) point
coefs <- matrix(NA, nrow=10^4, ncol=2) # Storage for coefficients
ses <- matrix(NA, nrow=10^4, ncol=2) # Storage for standard errors
for(i in 1:10^4)
{
  data <- source[sample(nrow(source), size=40, replace = TRUE),]  # IID sampling
  temp <- lm(data[, 1] ~ data[, 2]) # least squares of Y on X
  coefs[i, ] <- summary(temp)[[4]][, 1] # store coefficients
  ses[i, ] <- summary(temp)[[4]][, 2] # store standard errors
}
c(mean(ses[, 1])/sd(coefs[, 1]), mean(ses[, 2])/sd(coefs[, 2])) # SE/SD ratio
## [1] 1.048279 1.093472

Asymptotic variance, \(n=40\)

Asymptotic variance, \(n=160\)

## [1] 1.114528 1.205512

Asymptotic variance, \(n=640\)

## [1] 1.141887 1.225075