Orthogonality is a mathematical property that is beneficial for statistical models. It’s particularly helpful when performing factorial analysis of designed experiments.

Orthogonality has various mathematical and geometric definitions. In this post, I’ll define it mathematically and then explain its practical benefits for statistical models.

## Terminology

First, here’s a bit of background terminology that you’ll encounter when discussing orthogonality.

In math, a matrix is a two-dimensional rectangular array of numbers with columns and rows. A vector is simply a matrix that has either one row or one column.

For a regression model, the columns in your dataset are the independent and dependent variables. These columns are vectors.

When I refer to a vector in this context, you can think of a datasheet column representing a variable. Orthogonality applies specifically to the independent variables.

Related post: Independent and Dependent Variables

## Orthogonal Definition

Vectors are orthogonal when the products of their matching elements sum to zero. That’s a mouthful, but the calculation is simple once you see it step by step.

Follow these steps to calculate the sum of the vectors’ products.

1. Multiply the first values of each vector.
2. Multiply the second values, and repeat for all values in the vectors.
3. Sum those products.

If the sum equals zero, the vectors are orthogonal.
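The steps above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function names `sum_of_products` and `is_orthogonal`, the tolerance `tol`, and the vectors `a`, `b`, and `c` are all assumptions made up for this example.

```python
def sum_of_products(v1, v2):
    """Multiply matching elements of two equal-length vectors and sum the products."""
    if len(v1) != len(v2):
        raise ValueError("vectors must have the same length")
    return sum(x * y for x, y in zip(v1, v2))


def is_orthogonal(v1, v2, tol=1e-12):
    """Treat the vectors as orthogonal when the sum of products is (numerically) zero."""
    return abs(sum_of_products(v1, v2)) <= tol


# Hypothetical five-value vectors, chosen only for illustration.
a = [1, -1, 0, 2, -2]
b = [1, 1, 5, 1, 1]
c = [1, 2, 3, 4, 5]

print(sum_of_products(a, b), is_orthogonal(a, b))  # products 1, -1, 0, 2, -2 sum to 0
print(sum_of_products(a, c), is_orthogonal(a, c))  # products 1, -2, 0, 8, -10 sum to -3
```

The small tolerance is there because sums of floating-point products rarely land on exactly zero; with integer data like this, an exact comparison would work just as well.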

Let’s work through an example. Below are two vectors, V1 and V2, each with five values. The table multiplies the matching values in each vector and sums the products. Because the sum equals zero, these two vectors are orthogonal.

For the discussion about orthogonality in linear models below, consider each vector to be an independent variable.

## Orthogonality in Regression and ANOVA Models

Orthogonality provides essential benefits to linear models, even though that might not be obvious from the mathematical definition!

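When independent variables are orthogonal and centered on zero, as coded factors in a balanced designed experiment are, they are uncorrelated, which is beneficial. Statisticians refer to correlation among independent variables as multicollinearity. A little is okay, but higher levels inflate the standard errors of the coefficient estimates and make those estimates unstable.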

The best case is no multicollinearity at all, which is what an orthogonal model provides. Orthogonality indicates that the independent variables are genuinely independent. They are not associated at all; they are totally uncorrelated.
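To make the link between orthogonality and correlation concrete, here is a small Python sketch. It assumes a hypothetical two-factor design coded as -1 and +1 (`x1`, `x2`), and the helper names `dot` and `pearson_r` are made up for this example. It also includes a pair of vectors (`u`, `v`) that are orthogonal but not centered on zero, to show that the zero-correlation result relies on the columns having a mean of zero.

```python
from math import sqrt


def dot(u, v):
    """Sum of the products of matching elements."""
    return sum(a * b for a, b in zip(u, v))


def pearson_r(u, v):
    """Pearson correlation: the sum of products of the mean-centered vectors, rescaled."""
    mean_u = sum(u) / len(u)
    mean_v = sum(v) / len(v)
    cu = [a - mean_u for a in u]
    cv = [b - mean_v for b in v]
    return dot(cu, cv) / sqrt(dot(cu, cu) * dot(cv, cv))


# Hypothetical coded factors from a balanced 2x2 factorial design.
x1 = [-1, -1, 1, 1]
x2 = [-1, 1, -1, 1]
print(dot(x1, x2), pearson_r(x1, x2))  # orthogonal and uncorrelated

# Hypothetical vectors that are orthogonal but not centered on zero.
u = [1, 1, 0, 0]
v = [0, 0, 1, 1]
print(dot(u, v), pearson_r(u, v))  # orthogonal, yet strongly negatively correlated
```

In the first pair, each column already has a mean of zero, so centering changes nothing and the zero sum of products carries straight over to a zero correlation. In the second pair, centering shifts the values, and the centered columns are no longer orthogonal.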

For orthogonal models, the coefficient estimates for the reduced model will be the same as those in the full…