Sunday, November 16, 2014

Weighted least squares

... or Why can't we just be ordinary squares?

In fitting your linear model, you may be interested in generating a prediction line that describes the relationship between your predictor(s) and your outcome. If you have constant variance in the errors (homoskedasticity), an ordinary least squares (OLS) approach is typically used to fit the model to the data and generate a best fit line. A best fit line minimizes the sum of the squared vertical distances (the residuals) between the observed data and the predictions made by the model. If the errors have constant variance AND are independent and normally distributed, then the OLS estimate of the coefficients is also the maximum likelihood estimate.
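To make the "best fit" idea concrete, here is a minimal sketch of OLS for a single predictor, using the closed-form slope and intercept. The function names (`ols_line`, `sum_sq_resid`) and the toy data are made up for illustration; in practice you would likely reach for a statistics library instead.

```python
def ols_line(x, y):
    """Fit y ~ b0 + b1 * x by ordinary least squares (one predictor)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Slope = (sum of cross-deviations) / (sum of squared x-deviations)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    return b0, b1


def sum_sq_resid(x, y, b0, b1):
    """Sum of squared residuals: the quantity OLS minimizes."""
    return sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))


# Toy data (hypothetical), roughly following y = 2x
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

b0, b1 = ols_line(x, y)
print(f"intercept={b0:.3f}, slope={b1:.3f}")
print(f"sum of squared residuals={sum_sq_resid(x, y, b0, b1):.4f}")
```

Nudging either coefficient away from the fitted values and recomputing `sum_sq_resid` should only ever make the sum larger, which is what "best fit" means here.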

However, in spatial statistics (the analysis of data with a spatial component, taking spatial dependency into account) we often use data that violate the constant-variance assumption, i.e. the errors are heteroskedastic. In this case, we use weighted least squares (WLS) to fit the model to the data and generate a best fit line. In WLS, the errors are assumed to be normally distributed with mean vector 0 and nonconstant variance-covariance matrix σ²W, where W is a diagonal matrix. Because W is diagonal, each observation can have its own error variance while the errors remain uncorrelated; WLS then weights each observation by the inverse of its error variance, so noisier observations count for less. See this post from Penn State for a short intro to the nonconstant variance-covariance matrix.
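As a rough sketch of how the weighting plays out, the snippet below fits a single-predictor line where the i-th error variance is proportional to the i-th diagonal entry of W. The weights are the reciprocals of those entries, and the fit minimizes the weighted sum of squared residuals. The function name `wls_line`, the argument `w_diag`, and the toy data are assumptions for illustration only, not taken from any particular library.

```python
def wls_line(x, y, w_diag):
    """Fit y ~ b0 + b1 * x when Var(error_i) is proportional to w_diag[i].

    w_diag holds the diagonal of W; each observation is weighted by 1 / w_diag[i].
    """
    if not (len(x) == len(y) == len(w_diag)):
        raise ValueError("x, y and w_diag must have the same length")
    if any(w <= 0 for w in w_diag):
        raise ValueError("diagonal entries of W must be positive")

    weights = [1.0 / w for w in w_diag]
    total = sum(weights)
    # Weighted means replace the ordinary means used in OLS
    x_bar = sum(wt * xi for wt, xi in zip(weights, x)) / total
    y_bar = sum(wt * yi for wt, yi in zip(weights, y)) / total
    sxy = sum(wt * (xi - x_bar) * (yi - y_bar)
              for wt, xi, yi in zip(weights, x, y))
    sxx = sum(wt * (xi - x_bar) ** 2 for wt, xi in zip(weights, x))
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    return b0, b1


# Toy data (hypothetical): the first four points lie on y = 2x,
# the last point is far off and is assumed to have 100x the error variance.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 6.0, 8.0, 20.0]
w_diag = [1.0, 1.0, 1.0, 1.0, 100.0]

b0_ols, b1_ols = wls_line(x, y, [1.0] * len(x))  # equal weights reduce to OLS
b0_wls, b1_wls = wls_line(x, y, w_diag)
print(f"OLS slope={b1_ols:.3f}, WLS slope={b1_wls:.3f}")
```

With equal diagonal entries the weights cancel and the result matches OLS; with the noisy point down-weighted, the fitted slope moves back toward the trend of the more reliable observations.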

