Friday, November 4, 2016

My Best Frenemy, Price Elasticity (Part 2 - Linear Regression)

This is the second post in a series discussing elasticity. The first post discussed the Econ 101 way to measure price elasticity, a brief example of how to apply it to a theoretical price change to see what will happen to quantity, as well as some of the pros and cons of using this calc. In this post, I will discuss using simple linear regression to find an improved (although not perfect) measure of price elasticity.

We're shifting our mindset from economics to statistics, so we need to clarify a few assumptions.

Here's a basic supply and demand chart that I borrowed from Wikipedia.


This is definitely the Econ way to look at the world. How can I tell? The x-axis is Quantity, and the y-axis is Price. It's backwards! In math class, you learned that the input (or predictor variable) is on the x-axis and the output (or response variable) is on the y-axis. However, in the chart above, the two are flipped. This is kinda funky and for whatever reason is the convention used by economists.

Switching to a Regression Mindset
In the real world, quantity is a function of price for consumers. This means that Price is on the x-axis and Quantity is on the y-axis. From a customer point of view, how high or low the price is set will definitely have an influence on the quantity sold.

We're going to turn things around a bit and go for a statistics point of view. I'm assuming that you have at least some familiarity with linear regression. If you do not, I strongly recommend taking a look at Brandon Foltz's YouTube series to get a basic grasp.

I'm about to run through an example or two of using regression to find pseudo-elasticity.  My goal is to show how to apply linear regression to pricing data and not get bogged down in the nitty-gritty of statistics. To wit, this means I have created data that inherently passes the four main assumptions that justify the usage of linear regression. Those assumptions are:
  1. Linear relationship between input variable(s) and output variable.
  2. Errors are statistically independent.
  3. Homoscedasticity of errors (constant variance).
  4. Normality of error distribution.
There are better places on the internet than this blog that discuss these assumptions. One of my favorites is from Prof. Robert Nau at Duke University, so make sure to check that out if you want a truly robust understanding of regression. He also created an Excel plugin called RegressIt that is really great. I will probably make a post about it in the future. Anyway, on to the example.


When the Boss wants Elasticity
Ok! Here's the situation. You're a pricing analyst at Ashland Milwaukee Corp. You make only one product, which is a small novelty widget that tells knock-knock jokes. The boss ran into an old friend from business school that went on and on about the AMAZING things his company is doing with price elasticity. Of course, now your boss wants something similar.

So now it's up to you to measure elasticity and do something with it. As with most things, the best place to start is at the beginning, so you grab your invoice data. It shows how much you sold and at what price. You negotiate your sales price with all of your customers, so you've got a range of net prices to investigate.

Next, you fire up Excel and create a scatter plot so you can see what your data looks like. Here's the result:


Nothing too fancy. Without doing anything more than plotting the data you can already see that there's some sort of relationship between price and quantity. As price increases, quantity sold decreases, just like we've talked about before.

Charts are fantastic, but ultimately the boss wants a specific number to work with. No problem. You can add a trend line to the chart, and while you're at it you  might as well add in the equation of the trend line and the R2 ("R squared" in case you want to Google it). Hey, we're getting somewhere!


Take a look at that equation for the trend line:

y = -305.42x +14356

I am going to rewrite the equation with some substitutions to the variables to clarify what's going on.

Quantity = -305.42 * Price + 14356

Boom! Now I can substitute in any Price and get an understanding of approximately how many units I should sell. For example, if I wanted to see how many units I would sell if I price at exactly $20, it would look like:

Quantity = -305.42 * ($20) + 14356
Quantity = 8248 (Rounded up because it doesn't make sense to sell a fraction of a product.)

What about $21?

Quantity = -305.42 * ($21) +14356
Quantity = 7943

Something interesting is going on here. For each additional dollar I price my item, I sell about 305 fewer units. Or, if I lower my price by a dollar, I should expect to sell an additional 305 units. Works both ways.

Where have I seen 305 before? Ah yes! It is the coefficient of slope in our linear equation. The implication is that the coefficient of slope is almost the same thing as price elasticity. Why is it only almost the same? Because price elasticity is a measurement of the percent change in quantity versus a percent change in price. The slope we have above is a change in actual quantity versus a change in actual price. It represents a change in the total number of units we sell versus the change of a dollar or two in price. You may have previously heard this called a demand function.

The boss asked us for the elasticity, but we did the linear regression and technically we don't have elasticity. Great, we're screwed, right? Nah, probably not. Most times people don't really want the elasticity, per se. They want what elasticity can do for them, which is to inform them of the impact of a decision to change price. Thankfully, we have that. We know that each change of $1 in price will result in a change of 305 units in quantity. That's pretty darn impactful information!

But wait, there's more: R2
If you recall in my previous post about price elasticity, one of my big gripes about the Econ 101 way of measuring elasticity is that it doesn't give us a good idea if the result of the calculation is any good or not. A huge pro of regression analysis is that it does help provide that additional context, which we find in the R2.

Without getting too far into the weeds, R2 represents the explanatory power of our model. In the example above, it tells us that 64.23% of the variance in Quantity is explained by the variance in Price. That's an incredibly high number, which means that we can feel confident when we tell our boss the results of the analysis.

Your boss will probably ask about the remaining ~35% of variance in Quantity that isn't explained by Price. The fact of the matter is that you don't know. The only two factors in the simple regression we ran are quantity and price, so those are the only two things we can have insight into. The other 35% could consist of anything. The weather could have been bad, maybe you had quality issues with the supplier, or maybe it's random error that is inherent in any statistical model. Regardless, it's better to focus on what you do have instead of what you're missing.

A few words of caution before you go
Before you go wild and start regressing all over the place, you have to be careful. What if the boss wants to see how many units you would sell if you raised the price to $40? Yes, on paper you can plug $40 into the model and get a result. The result of that calculation won't be worth much, though.

When we ran our regression, it was over the price range of about $15 to $25, so we have a pretty good idea of what happens between price and quantity in that range. When you start to go way beyond the range of data you regress on, that's when the model falls apart. We don't have a really great idea of what will happen at $40 because we haven't seen any data at that price point.

It would be reckless to apply our model with a parameter so far outside of current data. Using the model to look at $26 is fine, maybe $28 is ok too, but pretty quickly you get into the land of make believe. Unfortunately I have seen with my very own eyes highly compensated white shoe consultants make exactly this error and do the equivalent of recommending the $40 price. You will be ahead of the game if you keep in mind as you begin to apply regression analysis.

In the next post we'll take a look at using a natural log transformation to help us to find the actual elasticity. See you next time!




No comments:

Post a Comment