# Category: Statistics

## Statistical foundations of machine learning

## Forecasting: principles and practice

## Generating Model Data with Various G.M. Violations and Testing them in R

## NIST Handbook Case Studies in Process Modeling

Step-by-step statistical modeling analysis projects using data from physical science and engineering applications. Walks through data collection, data exploration, modeling and results interpretation.

## NIST Handbook Load Cell Calibration Case Study

Walks through the analysis and modeling of load cell output data, with the goal of understanding the performance characteristics of the cell and predicting future load output levels. Covers exploratory data analysis, model fitting, heteroskedasticity tests and corrections, and interpretation of the analysis results.

NIST Handbook Load Cell Calibration Case Study

**Other Links:**

Load Cell Terminology

## Zero Conditional Mean of Errors – Gauss-Markov Assumption

The zero conditional mean of errors Gauss-Markov assumption states that there is no relationship or linking mechanism between the stochastic error and any of the independent variables in the model: knowing the values of the independent variables tells us nothing about the expected value of the error.
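In symbols, for a simple regression written as $y_i = \beta_0 + \beta_1 x_i + u_i$ (this notation is an assumption made here for illustration, not taken from the linked video), the assumption and two standard consequences that follow from it are:

$$
E[u_i \mid x_i] = 0
\quad\Longrightarrow\quad
E[u_i] = 0
\quad\text{and}\quad
\operatorname{Cov}(x_i, u_i) = 0
$$

The implication only runs one way: zero covariance rules out a linear link between $x_i$ and $u_i$, while the zero conditional mean rules out any function of $x_i$ predicting the average error.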

Zero conditional mean of errors – Gauss-Markov assumption (Ben Lambert)

## Serial Correlation Gauss-Markov Assumption

Having no serial correlation of errors means that the errors of different sample observations are uncorrelated with each other: the error for one observation does not affect or depend on the error for any other observation.
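One common way to check for first-order serial correlation is the Durbin-Watson statistic. The sketch below is illustrative only: it simulates two error series (the AR(1) coefficient of 0.8, the sample size and the helper names are all assumptions) and applies the statistic to the simulated errors directly, whereas in practice it would be computed on the residuals of a fitted regression.

```python
import random


def durbin_watson(residuals):
    """Durbin-Watson statistic: sum of squared first differences over sum of squares.

    Values near 2 suggest no lag-1 correlation; values toward 0 suggest
    positive serial correlation, values toward 4 negative serial correlation.
    """
    num = sum((residuals[t] - residuals[t - 1]) ** 2 for t in range(1, len(residuals)))
    den = sum(e ** 2 for e in residuals)
    return num / den


def simulate_errors(n, rho, rng):
    """Simulate AR(1) errors u_t = rho * u_{t-1} + e_t (rho = 0 gives uncorrelated errors)."""
    errors, prev = [], 0.0
    for _ in range(n):
        prev = rho * prev + rng.gauss(0.0, 1.0)
        errors.append(prev)
    return errors


rng = random.Random(0)  # fixed seed so the run is repeatable
independent = simulate_errors(2000, 0.0, rng)  # satisfies the assumption
correlated = simulate_errors(2000, 0.8, rng)   # violates the assumption

dw_independent = durbin_watson(independent)
dw_correlated = durbin_watson(correlated)
print(f"Durbin-Watson, uncorrelated errors: {dw_independent:.3f}")
print(f"Durbin-Watson, AR(1) errors:        {dw_correlated:.3f}")
```

The statistic is roughly 2(1 - r), where r is the lag-1 correlation of the series, which is why uncorrelated errors land near 2 and strongly positively correlated errors land well below it.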

**Links:**

Gauss-Markov – explanation of random sampling and serial correlation (Ben Lambert)

Understanding the Theory Behind Serial Correlation (Udemy Blog)

Covariance and Correlation (Random)

Serial correlation testing – introduction (Ben Lambert)

Taking expectations of a random variable (Ben Lambert)

Expectations and Variance properties (Ben Lambert)

## Weighted Least Squares – General Intuition and Usage

Weighted Least Squares adjusts the line of best fit to account for an error variance that changes across the observations. Observations in a region of lower variance carry more information about the regression line, so they are weighted more heavily than observations in a region of higher variance.

The weights are derived by taking the residuals of the original regression model and fitting a separate regression to them (typically to the squared residuals), which describes how the error variance changes across the observations.

Each observation is then weighted by the inverse of its estimated error variance. Equivalently, the dependent variable and the regressors for each observation are divided by the estimated error standard deviation before the regression is re-estimated.
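The procedure described above can be sketched for a single regressor as follows. This is a minimal illustration under stated assumptions, not a definitive implementation: the simulated data, the helper `weighted_fit`, and the choice to model the variance by regressing the log of the squared residuals on log(x) are all hypothetical choices made for the example. The weights enter the fit per observation, in the weighted sum of squared residuals.

```python
import math
import random


def weighted_fit(x, y, w):
    """Return (intercept, slope) minimising sum(w_i * (y_i - b0 - b1 * x_i) ** 2)."""
    sw = sum(w)
    xbar = sum(wi * xi for wi, xi in zip(w, x)) / sw
    ybar = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxy = sum(wi * (xi - xbar) * (yi - ybar) for wi, xi, yi in zip(w, x, y))
    sxx = sum(wi * (xi - xbar) ** 2 for wi, xi in zip(w, x))
    slope = sxy / sxx
    return ybar - slope * xbar, slope


# Simulated data (assumption): y = 2 + 3x + u, with sd(u) proportional to x.
rng = random.Random(42)
n = 500
x = [rng.uniform(1.0, 10.0) for _ in range(n)]
y = [2.0 + 3.0 * xi + rng.gauss(0.0, 0.5 * xi) for xi in x]

# Step 1: ordinary least squares is the equal-weights special case.
ones = [1.0] * n
b0_ols, b1_ols = weighted_fit(x, y, ones)
resid = [yi - b0_ols - b1_ols * xi for xi, yi in zip(x, y)]

# Step 2: model the error variance from the residuals.
# Regressing log(e^2) on log(x) keeps the fitted variances positive.
log_e2 = [math.log(e ** 2) for e in resid]
log_x = [math.log(xi) for xi in x]
a, g = weighted_fit(log_x, log_e2, ones)

# Step 3: weight each observation by the inverse of its estimated variance.
weights = [1.0 / math.exp(a + g * lx) for lx in log_x]
b0_wls, b1_wls = weighted_fit(x, y, weights)

print(f"OLS: intercept={b0_ols:.3f}, slope={b1_ols:.3f}")
print(f"WLS: intercept={b0_wls:.3f}, slope={b1_wls:.3f}")
print(f"estimated variance exponent g={g:.3f}")
```

Because the simulated variance grows with x, the estimated exponent `g` should come out positive, so observations with small x (low variance) receive the largest weights.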

**Links:**

Weighted Least Squares: an introduction (Ben Lambert)

Weighted Least Squares: mathematical introduction (Ben Lambert)

Weighted Least Squares: an example (Ben Lambert)

Weighted Least Squares in practice – feasible GLS – part 1 (Ben Lambert)

Weighted Least Squares in practice – feasible GLS – part 2 (Ben Lambert)

Weighted Least Squares Regression Process Modeling Method (NIST/SEMATECH e-Handbook of Statistical Methods)

Weighted Least Squares Regression Estimating Parameters (NIST/SEMATECH e-Handbook of Statistical Methods)

Weighted Least Squares (PennState STAT 501)

Weighted Least Squares Examples (PennState STAT 501)

## Heteroskedasticity – General Intuition and Usage

**Description:**

The variance of the model error is not constant: it changes systematically with one or more of the independent variables in the regression model.

**2 Types:**

Population Heteroskedasticity

– There is heteroskedasticity in the actual population data.

Omitted Variable Bias Heteroskedasticity (Model Heteroskedasticity)

– An omitted variable ends up in the error term, so the error variance appears linked to one of the independent variables in the regression model.

**Investigation Approach**

When analyzing a model, we should first assume that we have omitted variable bias heteroskedasticity (if a patterned error variance is visible in a scatter plot) and try to prove its existence with tests. If the tests for model heteroskedasticity fail, we should then explore tests for population heteroskedasticity.
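As one example of such a test, the sketch below computes a Breusch-Pagan style statistic in its n·R² (studentized) form: the squared OLS residuals are regressed on the regressor, and n times the R² of that auxiliary regression is compared with a chi-square critical value. The simulated data and helper names are assumptions made for illustration, and the test itself is one common choice rather than the one the linked material necessarily uses.

```python
import random


def ols(x, y):
    """Return (intercept, slope) of a simple least squares fit."""
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    slope = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / sxx
    return ybar - slope * xbar, slope


def r_squared(x, y):
    """R^2 of the simple regression of y on x."""
    b0, b1 = ols(x, y)
    ybar = sum(y) / len(y)
    ss_res = sum((yi - b0 - b1 * xi) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - ybar) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot


def breusch_pagan_lm(x, y):
    """LM statistic: n * R^2 from regressing squared OLS residuals on x."""
    b0, b1 = ols(x, y)
    squared_resid = [(yi - b0 - b1 * xi) ** 2 for xi, yi in zip(x, y)]
    return len(x) * r_squared(x, squared_resid)


# 5% critical value of the chi-square distribution with 1 degree of freedom
# (one regressor in the auxiliary regression).
CHI2_1_CRIT_5PCT = 3.841

rng = random.Random(1)
n = 500
x = [rng.uniform(1.0, 10.0) for _ in range(n)]
y_hetero = [2.0 + 3.0 * xi + rng.gauss(0.0, 0.5 * xi) for xi in x]  # sd grows with x
y_homo = [2.0 + 3.0 * xi + rng.gauss(0.0, 2.0) for xi in x]         # constant sd

lm_hetero = breusch_pagan_lm(x, y_hetero)
lm_homo = breusch_pagan_lm(x, y_homo)
print(f"LM (heteroskedastic data): {lm_hetero:.2f}, reject: {lm_hetero > CHI2_1_CRIT_5PCT}")
print(f"LM (homoskedastic data):   {lm_homo:.2f}, reject: {lm_homo > CHI2_1_CRIT_5PCT}")
```

A statistic above the critical value is evidence against constant error variance. On homoskedastic data the test will still reject about 5% of the time by construction, so a single run should not be over-interpreted.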

**Problems Caused:**

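The usual OLS standard errors will be wrong, so any inference based on them (t-tests, confidence intervals) will also be wrong.

OLS is no longer BLUE. Under heteroskedasticity alone it remains unbiased, but it is no longer the minimum-variance linear unbiased estimator; another type of regression model (such as GLS) would estimate the population process more efficiently.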

**Solutions:**

Use White and/or Newey-West methods to correct the incorrect standard errors produced in a heteroskedastic model.

– We can then use the corrected standard errors for inference.

– The problem is that the heteroskedasticity in the model still exists, so OLS is still not efficient.

Use a feasible GLS (FGLS) regression model to produce a regression model with a homoskedastic error variance.

– Addresses the root problem by transforming the model so that its error variance is constant.
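For the first of these solutions, the sketch below compares the conventional OLS standard error of the slope with White's heteroskedasticity-robust (HC0) standard error in a simple regression. The simulated data (error standard deviation proportional to x squared) and variable names are assumptions chosen for illustration; Newey-West, which also handles serial correlation, is not shown.

```python
import math
import random

# Simulated data (assumption): y = 2 + 3x + u, with sd(u) = 0.1 * x^2.
rng = random.Random(7)
n = 5000
x = [rng.uniform(1.0, 10.0) for _ in range(n)]
y = [2.0 + 3.0 * xi + rng.gauss(0.0, 0.1 * xi ** 2) for xi in x]

# Ordinary least squares fit.
xbar = sum(x) / n
ybar = sum(y) / n
sxx = sum((xi - xbar) ** 2 for xi in x)
b1 = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / sxx
b0 = ybar - b1 * xbar
resid = [yi - b0 - b1 * xi for xi, yi in zip(x, y)]

# Conventional standard error of the slope: assumes one common error variance.
s2 = sum(e ** 2 for e in resid) / (n - 2)
se_classic = math.sqrt(s2 / sxx)

# White (HC0) standard error of the slope: each squared residual stands in
# for that observation's own error variance.
meat = sum(((xi - xbar) ** 2) * e ** 2 for xi, e in zip(x, resid))
se_robust = math.sqrt(meat) / sxx

print(f"slope estimate:          {b1:.4f}")
print(f"conventional std. error: {se_classic:.4f}")
print(f"White (HC0) std. error:  {se_robust:.4f}")
```

The slope estimate is identical under both approaches; only the standard error changes. In this design the error variance is largest for observations that also have high leverage, so the conventional formula tends to understate the uncertainty and the robust standard error typically comes out larger.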

**Links:**

Heteroskedasticity summary (Ben Lambert)

Heteroscedasticity: as a symptom of omitted variable bias – part 1 (Ben Lambert)

Heteroscedasticity: as symptom of omitted variable bias – part 2 (Ben Lambert)

Heteroscedasticity: dealing with the problems caused (Ben Lambert)

Chapter 19: Heteroskedasticity (Introductory Econometrics: Using Monte Carlo Simulation with Microsoft Excel)

Scatter Plot: Variation of Y Does Depend on X (heteroscedastic) (NIST/SEMATECH e-Handbook of Statistical Methods)