Blog - The Analytics Compass


Using RapidMiner for time series forecasting in cost modeling: 1 of 2


The process for product or transportation cost forecasting involves the following steps:

  1. identifying the factors most relevant to the target variable, which is usually an overall cost such as weekly spend or per-unit production cost,
  2. building a regression-based cost model that functionally relates the input factors to the target variable,
  3. developing time-series-based forecasts for each of the factors identified in step 1,
  4. using the regression model from step 2 and the time series forecasts from step 3 to make the final cost forecasts.
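
To make steps 2 to 4 concrete, here is a minimal Python sketch. It is an illustration under simplifying assumptions, not the method used in the articles referenced here: it assumes a single input factor has already been chosen (step 1), uses an ordinary least-squares line as the cost model, and uses a naive "drift" forecast for the factor. The function names and the data are made up for the example.

```python
from statistics import mean


def fit_simple_regression(x, y):
    """Step 2: least-squares fit of y = intercept + slope * x."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


def drift_forecast(series, steps):
    """Step 3: naive drift forecast that extends the average historical change."""
    drift = (series[-1] - series[0]) / (len(series) - 1)
    return [series[-1] + drift * h for h in range(1, steps + 1)]


def forecast_cost(factor_history, cost_history, steps):
    """Step 4: feed the factor forecasts through the regression model."""
    intercept, slope = fit_simple_regression(factor_history, cost_history)
    factor_forecast = drift_forecast(factor_history, steps)
    return [intercept + slope * f for f in factor_forecast]


# Made-up illustration data: one factor (say, a fuel price index) and weekly spend.
fuel_index = [100.0, 102.0, 104.0, 106.0, 108.0]
weekly_spend = [1200.0, 1220.0, 1240.0, 1260.0, 1280.0]

cost_forecast = forecast_cost(fuel_index, weekly_spend, steps=2)
print(cost_forecast)
```

In a real application there would be several factors, and each would likely get a more careful time series model than a straight-line drift. The structure stays the same: forecast the inputs, then pass them through the cost model.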

The process has been described in detail in an earlier article. In another article we focused on step 3 and showed how to use the open source package R to build time series forecasts for cost modeling applications. In this first part of two, we discuss RapidMiner's approach to building time series forecasts, which is fundamentally different from standard techniques. In the second part we will apply the process to cost modeling data.

RapidMiner's approach to time series is based on two main data transformation processes:

  1. Windowing, which transforms the time series data into a generic data set: this step converts the last row of a window within the time series into a label or target variable
  2. Applying any of the "learners" or algorithms to predict the target variable, and thus the next time step in the series

A "typical" time series data set and its transformed structure (after windowing) are shown conceptually below.

[Figure: RapidMiner time series windowing concept]

The parameters of the "Windowing" operator allow changing the size of the windows (shown as colored vertical boxes on the left), the step size (the offset between consecutive windows, which determines how much they overlap) and the prediction horizon used for forecasting. The series data is thus converted into a generic data set which can be processed by any of the available RapidMiner operators.
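
The windowing idea can be sketched in a few lines of Python. This is not RapidMiner's implementation; the function and its parameter names (`window_size`, `step_size`, `horizon`) are our own and only mirror the concepts described above. We assume the label is the value `horizon` steps after the end of each window.

```python
def window_series(series, window_size, step_size=1, horizon=1):
    """Turn a 1-D series into (features, label) rows.

    Each row holds `window_size` consecutive values as features. The label is
    the value `horizon` steps after the end of the window. A new window
    starts every `step_size` positions.
    """
    rows = []
    last_start = len(series) - window_size - horizon
    for start in range(0, last_start + 1, step_size):
        features = series[start:start + window_size]
        label = series[start + window_size + horizon - 1]
        rows.append((features, label))
    return rows


# Made-up series, for illustration only.
series = [10, 11, 13, 12, 15, 16, 18]
for features, label in window_series(series, window_size=3):
    print(features, "->", label)
```

Each printed row is one example in the "horizontal" data set: the window values become ordinary attributes and the value that follows becomes the label. A larger `step_size` produces fewer, less overlapping rows, and a larger `horizon` moves the label further into the future.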

The next main process required for running time series analyses in RapidMiner involves applying any of the available "learners" to "predict" the label variable shown in the green vertical box (see graphic above). The example set (or raw data) for this learner is the "horizontal" data set shown above, with the target or label variable in the green box. Most of the Performance operators can also be used to assess how well the learning scheme fits the data.
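
As a rough illustration of this second step, the sketch below trains a learner on windowed rows and scores it on held-out rows. The choice of learner is ours: a tiny k-nearest-neighbour regressor written in plain Python stands in for whichever RapidMiner learner one would pick, and mean absolute error stands in for a Performance operator. The helper names and the data are hypothetical.

```python
def window_series(series, window_size, horizon=1):
    """Rows of (window values, value `horizon` steps after the window)."""
    rows = []
    for start in range(len(series) - window_size - horizon + 1):
        end = start + window_size
        rows.append((series[start:end], series[end + horizon - 1]))
    return rows


def knn_predict(train_rows, features, k=2):
    """Average the labels of the k training windows closest to `features`."""
    def distance(row):
        return sum((a - b) ** 2 for a, b in zip(row[0], features))
    nearest = sorted(train_rows, key=distance)[:k]
    return sum(label for _, label in nearest) / len(nearest)


def mean_absolute_error(actual, predicted):
    """Average absolute difference between actual and predicted labels."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


# Made-up series with a repeating pattern, for illustration only.
series = [1, 2, 3, 1, 2, 3, 1, 2, 3, 1, 2, 3]
rows = window_series(series, window_size=3)

# Keep the time order: train on the earlier windows, test on the later ones.
train_rows, test_rows = rows[:6], rows[6:]

predictions = [knn_predict(train_rows, feats, k=2) for feats, _ in test_rows]
actuals = [label for _, label in test_rows]
mae = mean_absolute_error(actuals, predictions)
print(predictions, mae)
```

Because the windowed rows are an ordinary data set, the learner does not need to know it is working on a time series. The one thing to preserve is the time order of the split, so that the model is always evaluated on windows that come after the ones it was trained on.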

In part two of this series we will show a process that uses this logic to make time series forecasts and compare them with the results obtained from more traditional time series tools.

Subscribe to our blog to make sure you don't miss the next part.
