Mortgage rates are near a historic low, as the chart below shows. Everyone and their brother thinks this is the best time to refinance, and it would be hard to argue with them. The decision seems to be a no-brainer: if rates go down any further, one can always refinance again at the lower rate.

For many people, though, the choice may not be so simple. Suppose you are having a house built and it will not be ready for another 3-6 months. You may not be able to get a mortgage today, although you may be able to lock in today's rates for a 90-day period. Locking in would carry some additional cost, which may not be justified if rates are going to drop anyway! I am not a mortgage broker or financial advisor, so I will not get into the pros and cons of paying points or closing costs. But I am a "data scientist", and I would certainly like to use the tools of my trade to help with my decision making.

There are several questions which analytics and historical data might be able to answer:

1. Can we actually predict (or forecast) the rates over a 6-month horizon?
2. If we cannot make high accuracy point forecasts, can we at least quantify the chances that the rates will increase over the next 3 to 6 months?
3. What is the maximum increase (if at all) that is likely to happen over the next 3-6 months?

1. Using R to make point forecasts

Forecasting can potentially provide an answer to the first question. We started with data for 30-year fixed-rate mortgages (weekly rates from 2008 to 2012) and ran a basic HoltWinters forecast model. This article has more on the actual process of implementing a time series forecast in R. The red line is the fitted data and the black line is the raw data.
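To give a feel for what such a model does under the hood, here is a minimal sketch in Python of Holt's linear-trend exponential smoothing, which is the non-seasonal form of the Holt-Winters method. This is not the R code used for the charts. The function name `holt_forecast`, the hand-picked `alpha` and `beta` values, and the sample rates are all assumptions made for illustration. R's HoltWinters estimates its smoothing parameters by minimizing squared one-step error, and I am not claiming the model above used exactly this form.

```python
def holt_forecast(series, alpha, beta, horizon):
    """Holt's linear-trend exponential smoothing (no seasonal term).

    Returns (fitted, forecasts): one-step-ahead fitted values for
    series[2:], and `horizon` point forecasts past the end of the series.
    """
    if len(series) < 2:
        raise ValueError("need at least two observations")
    # Start values: level = second point, trend = first difference.
    level = series[1]
    trend = series[1] - series[0]
    fitted = []
    for y in series[2:]:
        # One-step-ahead prediction made before seeing y.
        fitted.append(level + trend)
        previous_level = level
        # Blend the new observation with the previous prediction.
        level = alpha * y + (1 - alpha) * (level + trend)
        # Blend the latest change in level with the previous trend.
        trend = beta * (level - previous_level) + (1 - beta) * trend
    # Point forecasts extend the last level along the last trend.
    forecasts = [level + h * trend for h in range(1, horizon + 1)]
    return fitted, forecasts


# Made-up weekly rates in percent, for illustration only (not the post's data).
example_rates = [3.95, 3.92, 3.90, 3.91, 3.88, 3.84, 3.83, 3.79]
fitted, forecasts = holt_forecast(example_rates, alpha=0.5, beta=0.3, horizon=8)
print([round(f, 3) for f in forecasts])
```

The point forecasts are simply a straight line from the last smoothed level, which is one reason a model like this cannot anticipate turning points in rates.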

The chart below shows the time series forecast, with confidence intervals (yellow 95%, orange 80%) around the forecasted points, which cover the last 8 weeks.

How good is this model? The above analysis was run with weekly data from 7/4/2008 to 5/25/2012. We have a few hold-out samples (from 6/1/2012 to 7/20/2012) to compare the forecasted values with the actual ones. The chart below shows how the forecasted data compares with the real data from the hold-out weeks.

Mortgage rates depend upon several external factors: GDP growth rate, consumer price index, producer price index, payroll employment, unemployment, housing starts, etc. It would be a good strategy to first model how these measures impact rates before attempting a forecast. For example, higher GDP growth (or higher CPI, PPI, payroll employment, or housing starts) is considered inflationary, which tends to push interest rates up, while rising unemployment typically moves interest rates down. But there are many feedback effects as well. Given all this, one cannot expect a simple one-factor forecast, which looks only at the past values of the rate itself, to be highly accurate for such a complex quantity. As the chart above shows, the best forecast was about 93% accurate (for the first week in the hold-out set) and the worst was about 84%.
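The post does not spell out how the accuracy percentages were computed. One common reading, and it is only an assumption here, is 100% minus the absolute percentage error of each forecast against the actual rate. A small Python sketch of that reading (the function name `percent_accuracy` is made up for this example):

```python
def percent_accuracy(forecast, actual):
    """Accuracy as 100% minus the absolute percentage error.

    Assumed definition: the post does not state the formula it used.
    """
    if actual == 0:
        raise ValueError("actual value must be non-zero")
    return 100.0 * (1.0 - abs(forecast - actual) / abs(actual))


# Hypothetical numbers: a forecast of 90 against an actual of 100
# is off by 10% of the actual value.
print(percent_accuracy(90.0, 100.0))
```

Under this definition, a forecast that is 7% away from the actual rate in either direction would score 93%.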

2. Using Excel to run basic analytics

Data can, however, at least explain general trends, which is what questions 2 and 3 above ask about. We could, for example, simply look at the historical data to understand short-term trends: by how many basis points (1% = 100 basis points) can rates swing over a period of 3 months or 6 months? This analysis is fairly easy to do in Excel.
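For readers who prefer code to spreadsheets, the same swing calculation can be sketched in Python. This is an illustration and not the Excel workbook itself. It assumes the rates are weekly values in percent, and that 13 weeks and 26 weeks are acceptable stand-ins for 3 months and 6 months. The function name `swings_bp` and the sample rates are hypothetical.

```python
def swings_bp(rates, lag_weeks):
    """Change in rate over `lag_weeks`, in basis points, for every start week.

    `rates` is a list of weekly rates in percent (for example 3.75 for 3.75%).
    One percentage point equals 100 basis points.
    """
    if lag_weeks <= 0:
        raise ValueError("lag_weeks must be positive")
    return [
        (rates[i + lag_weeks] - rates[i]) * 100.0
        for i in range(len(rates) - lag_weeks)
    ]


# Made-up weekly rates, for illustration only.
example_rates = [4.00, 4.50, 3.75, 5.00]
print(swings_bp(example_rates, lag_weeks=2))

# With a real weekly series, the 3-month and 6-month swings would be roughly:
# three_month = swings_bp(weekly_rates, 13)
# six_month = swings_bp(weekly_rates, 26)
```

A positive value means rates rose over the window, and a negative value means they fell.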

We can then plot and analyze the histograms of the 3-month and 6-month swing data from columns C and D shown above.

Let us answer question 2 (for the 3-month analysis):

There is an almost equal likelihood that rates will go up or go down! However, there is a very slight edge: a 54% chance that rates will either not change or will decrease. That is good news. Furthermore, the chance of rates going up by more than 150 basis points is less than 4%, and there is less than a 0.5% chance that the maximum swing (increase) will be around 350 basis points (the answer to question 3).
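Figures like these are just counts from the histogram divided by the number of observations. A minimal Python sketch of that arithmetic follows, assuming a list of swings in basis points is already available. The function name `swing_summary`, the dictionary keys, and the sample swings are all made up for illustration, and the output will not reproduce the percentages quoted above.

```python
def swing_summary(swings, threshold_bp=150):
    """Empirical shares computed from a list of rate swings in basis points."""
    n = len(swings)
    if n == 0:
        raise ValueError("need at least one swing")
    return {
        # Share of windows where rates stayed flat or fell.
        "p_flat_or_down": sum(1 for s in swings if s <= 0) / n,
        # Share of windows where rates rose by more than the threshold.
        "p_up_more_than_threshold": sum(1 for s in swings if s > threshold_bp) / n,
        # Largest observed increase (0 if rates never rose).
        "max_increase_bp": max(max(swings), 0),
    }


# Hypothetical swings in basis points, for illustration only.
example_swings = [-50, 0, 25, 200]
print(swing_summary(example_swings))
```

These are historical frequencies, so treating them as chances for the next 3 months assumes the future will look like the sampled past.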

How will you use this information in your decision making? That depends on your individual risk appetite. But with analytics you can bring a solid rationale to your decision-making process.
