## Intro To Regressions

Before I explain how to conduct a regression, I’ll answer the question that is probably on your mind: what is a regression?

A *regression* shows the relationship between two things (let’s call them *variables*). These variables can be almost anything. For example, we could look at the relationship between:

- a person’s weight vs. how much they can bench press
- the number of planes in the sky vs. the amount of pollution in the air
- daily caloric intake vs. life expectancy
- the square footage of a house vs. its selling price


When we look at a regression, *there is always one variable that affects the other.* For example, it may be reasonable to say that the square footage of a house affects its price, and *not* the other way around.

The variable that **does the affecting** is the **INDEPENDENT** variable, while the variable that **is affected by the other** is the **DEPENDENT** variable.

**In our case, the market data is the independent variable, and the stock data is the dependent variable.** This means we want to know how our stock returns change relative to changes in the returns of the market, and *not* the other way around.

When we do a regression, we are essentially plotting points on a graph, where the independent variable (market data) is on the x-axis, and the dependent variable (stock data) is on the y-axis.

Let’s *pretend* that on January 1st, 2010, **ABC increased 5%** while the **S&P500 increased 10%**. Our graph would look like this:

Now, this isn’t really enough to do anything, but as we look at other time periods, we can add some **more** points. We may start to see a trend.

*(Keep in mind that in reality, our graph will show positive returns AND negative returns (losses), but this image should illustrate the point. Also, the trend we see may not be upward sloping; it could very well slope downwards, but this is rare.)*

Now all these points finally come down to what we are really looking for: *a regression.*

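When we do a regression, one of the things we will find is a *line of best fit*. This line is situated through the individual points so that it *minimizes* the distance between all the points and the line **as much as mathematically possible.** (Strictly speaking, the standard method minimizes the sum of the squared vertical distances between the points and the line.)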
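For readers who like to see the mechanics, here is a minimal Python sketch of how a line of best fit can be found with the usual least-squares formulas. The function name `line_of_best_fit` and the return numbers are made up for illustration; they are not data from this post.

```python
def line_of_best_fit(x, y):
    """Return (slope, intercept) of the least-squares line through the points."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two equal-length series with at least 2 points")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-deviations and sum of squared deviations of x
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical weekly returns (invented numbers, for illustration only).
market_returns = [0.01, -0.02, 0.03, 0.00, 0.02]   # x: independent variable
stock_returns = [0.02, -0.04, 0.06, 0.00, 0.04]    # y: exactly 2x the market

slope, intercept = line_of_best_fit(market_returns, stock_returns)
print(slope, intercept)  # slope comes out very close to 2, intercept close to 0
```

Because the made-up stock returns are exactly twice the market returns, every point sits on the line and the fitted slope is 2 (up to floating-point rounding).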

It may not look like it, but *this graph now contains a very important metric that is vital to investing*. **It is the slope of the line.** You may remember it from high school math:

#### y = mx + b

where **m is the slope** and **b is the y-intercept**. In this case, **x is our market data (the independent variable)**, and **y is our stock data (the dependent variable)**.

Let’s say that the slope of the line is 2. That means that if our market data (x) is 10% and the y-intercept (b) is 0, then we can expect our stock data (y) to be 20%.

*Note, the y-intercept (b) is practically never 0, but this will be explained in future posts when we get into more complex sections of how I do an analysis.*

That means that when the market increases by 10%, our model says that our stock should increase by 20%. Therefore, it would have a beta of 2.
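The arithmetic above can be written as a tiny Python sketch. The function name `predicted_stock_return` is hypothetical, and the zero intercept is the same simplifying assumption used in the example.

```python
def predicted_stock_return(market_return, beta, intercept=0.0):
    """Apply y = mx + b, where m is beta and x is the market return."""
    return beta * market_return + intercept


# The example from the text: slope (beta) of 2, market up 10%, intercept 0.
up_week = predicted_stock_return(0.10, beta=2.0)
# The same line also applies to losses: market down 5%.
down_week = predicted_stock_return(-0.05, beta=2.0)

print(f"{up_week:.0%}")    # market +10% -> model says stock +20%
print(f"{down_week:.0%}")  # market -5%  -> model says stock -10%
```

The second call is a reminder that a beta of 2 cuts both ways: the model doubles the losses as well as the gains.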

**Now that we understand this visually, let’s calculate it on our spreadsheet.**

## Doing Our Regression

Now, there are a few ways to go about doing this, and statisticians, professional analysts, mathematicians, and economists would likely conduct a more complex regression that includes *variance* and *covariance*. But this method works just the same, and **is more intuitive** for most students of econ, stats, and finance.
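If you are curious why the two approaches agree, here is a rough Python sketch. It computes beta as the covariance of the two return series divided by the variance of the market returns, then compares that with the least-squares slope. All helper names and numbers are invented for illustration.

```python
def mean(values):
    return sum(values) / len(values)


def covariance(x, y):
    """Sample covariance (divides by n - 1)."""
    mx, my = mean(x), mean(y)
    return sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / (len(x) - 1)


def variance(x):
    """Sample variance (divides by n - 1)."""
    mx = mean(x)
    return sum((xi - mx) ** 2 for xi in x) / (len(x) - 1)


def least_squares_slope(y, x):
    """Slope of the line of best fit, y data first like the spreadsheet function."""
    mx, my = mean(x), mean(y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    return sxy / sxx


# Hypothetical weekly returns, not real market data.
market = [0.010, -0.020, 0.015, 0.005, -0.010]
stock = [0.012, -0.030, 0.020, 0.001, -0.008]

beta_cov_var = covariance(market, stock) / variance(market)
beta_slope = least_squares_slope(stock, market)
print(beta_cov_var, beta_slope)  # the two values match up to rounding
```

The `n - 1` in the covariance and the variance cancels out when you divide one by the other, which is why both routes land on the same number.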

Let’s go back to our spreadsheet data and make a new column named **“Slope”**.

For this sort of dataset, I want to find the slope using about 30 points, which means it includes 30 weeks of returns. So, I’ll use the function:

#### =slope(y data, x data)

Remember that the market data affects the stock data, so:

- x data = dSP500, and
- y data = dABC

It’s easy to get mixed up here, so make sure you are using the *returns* of ABC and S&P500, *not* the prices. Otherwise, you’ll get an answer that is totally BONKERS and will make no sense.

**I’ll say it again:** *find the slope using the returns of the stock and index, NOT the prices.*
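To make the difference between prices and returns concrete, here is a small Python sketch that turns a price series into period-over-period percentage returns. It assumes simple returns (the change divided by the previous price), and the prices are made up; your own return columns may have been built differently.

```python
def simple_returns(prices):
    """Percentage change from each price to the next one."""
    return [(curr - prev) / prev for prev, curr in zip(prices, prices[1:])]


# Hypothetical weekly closing prices for ABC (invented for illustration).
abc_prices = [100.0, 105.0, 102.9]

d_abc = simple_returns(abc_prices)
print([f"{r:.1%}" for r in d_abc])  # a 5.0% gain followed by a 2.0% loss
```

Note that three prices give only two returns, since the first price has nothing before it to compare against. These return series, not the raw prices, are what go into the slope function.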

Press “Enter” and voilà, **you have just calculated the beta of ABC for the 30-week period between January 26 and August 24, 2015!**

**But don’t celebrate too early; we aren’t done yet.**

The beta we have calculated only takes into account a 30-week period between January 26 and August 24 using the weekly returns from ABC and S&P500. **But returns are always changing,** so the next week will give us a *different beta.*

By repeating the slope function using the same-sized sample (30 weeks) but moving it forward one time period (one row), we see that the beta was 0.79 between February 2 and August 31.

And if we repeat a THIRD time, we see the beta change again to 0.89 when looking at data between Feb 9 and Sept 7.

#### So, Which Is The Right Answer?

The beta of a stock is ALWAYS changing. In fact, we cannot state that a beta is 0.74 because by the time next week arrives, it will be something else!

**Calculating a single beta for a stock is remarkably useless.**

##### I’ll say it again: a single-point beta is *useless* since the returns for the market and for the stock are always changing.


So, instead of finding just one beta, we are going to find **LOTS**.

Click and hold the lower right-hand corner of the slope cell we just completed. Drag it down until there are only 30 spots left in the data. For example, if the total number of rows in the downloaded data is 263, drag until you hit 233.

When we do this, we are still calculating the slope using 30 data points, but we are moving forwards in time. Not only have we calculated the beta between January 26 and August 24, but we have also calculated the beta between:

- February 2 – August 31
- February 9 – September 7
- February 16 – September 14…

And so on, right up until the end date of our dataset.
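The drag-down step can be sketched in Python as a rolling window: compute the slope over 30 rows, slide forward one row, and repeat until a full window no longer fits. The function names and the random data below are assumptions for illustration only, not real ABC or S&P500 returns.

```python
import random


def slope(y, x):
    """Least-squares slope of y on x (y data first, like the spreadsheet)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    return sxy / sxx


def rolling_beta(stock_returns, market_returns, window=30):
    """One beta per full window; stops before a window would run past the data."""
    if len(stock_returns) != len(market_returns):
        raise ValueError("both return series must have the same length")
    betas = []
    for start in range(len(stock_returns) - window + 1):
        end = start + window
        betas.append(slope(stock_returns[start:end], market_returns[start:end]))
    return betas


# Made-up data: 60 weeks of market returns, and a stock that moves
# roughly 0.8x the market plus some noise.
random.seed(0)
d_sp500 = [random.gauss(0.0, 0.02) for _ in range(60)]
d_abc = [0.8 * m + random.gauss(0.0, 0.01) for m in d_sp500]

betas = rolling_beta(d_abc, d_sp500, window=30)
print(len(betas))  # 60 - 30 + 1 = 31 windows
```

Stopping at the last full window is the code equivalent of not dragging the cell all the way to the bottom of the sheet.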

The reason we **don’t drag the box all the way to the bottom** is that it would calculate the slope using blank cells, which skews our results. This is what the bottom of our data set would look like:

This is the selection we are looking for:

**We no longer only have one beta. We have a couple hundred!**

Now, if your eyes are starting to hurt and you are getting sick of reading, take a break. This is where people start to think they are pros at quantitative analysis because they can impress their friends with fancy functions, **but we aren’t done yet.**

What I didn’t tell you at the beginning is *we aren’t just calculating beta*. We have created a…

**Beta Frequency Distribution!**

I know, it’s a sneaky thing to do, **but you’ve already completed it!** All you have to do now is turn it into a graph, and you’re done 🙂
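One way to picture a frequency distribution is to count how many betas fall into each range and draw a simple text histogram. The Python sketch below does that, assuming bins that are 0.1 wide. The first three betas are the ones mentioned in this post; the other three are invented to fill out the example.

```python
import math
from collections import Counter


def frequency_distribution(betas, bin_width=0.1):
    """Count how many betas fall into each bin of width bin_width.

    Keys are the lower edge of each bin, rounded to keep them readable.
    """
    counts = Counter()
    for beta in betas:
        bin_index = math.floor(beta / bin_width)
        counts[round(bin_index * bin_width, 2)] += 1
    return dict(sorted(counts.items()))


# 0.74, 0.79 and 0.89 come from the text; 0.77, 0.83 and 0.95 are hypothetical.
betas = [0.74, 0.79, 0.89, 0.77, 0.83, 0.95]

dist = frequency_distribution(betas)
for lower_edge, count in dist.items():
    print(f"{lower_edge:.1f} to {lower_edge + 0.1:.1f}: {'#' * count}")
```

With a couple hundred betas instead of six, the same counts become the bars of the graph.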
