Ranking KPIs: a critical first step for small or big data analytics

Posted by Bala Deshpande on Wed, Sep 04, 2013 @ 06:45 AM

When the data.gov website was launched in 2009, it had a measly 47 datasets. Four years later it has exploded to nearly 100,000 datasets in more than 50 formats. This is merely the public-facing data that the government makes available to the taxpaying citizenry. The "other" government data (still funded by taxes), which is not openly available to all for security and other reasons, is clearly significantly larger. EMC Corporation recently released a report indicating that only about a quarter of this data is currently tagged and analyzed by the government. Officials have been quoted as saying that in the next 5 years, the feds will spend about $13 billion (16% of the total IT budget) to improve big data infrastructure and develop data mining best practices for this data. The report also summarized the top three areas where large government agencies can best leverage big data and analytics: improving process and efficiency, enhancing security, and predicting trends.

Tags: keyconnect, correlations, KPI

How to ensure that KPIs are mutually exclusive and non-redundant

Posted by Bala Deshpande on Tue, May 07, 2013 @ 06:15 AM

In any business, strategic decisions can be made only after we know which factors have the greatest impact on the bottom line. Key performance indicator, or KPI, is a term generally reserved for those factors that signal the health of a business. The business in question could be the overall functioning of an entity or a specific division within a larger activity. For example, for the CEO of a large corporation the "business" interest is the overall profitability of the company. But for a production manager in a smaller organization, the business interest would be the number of defective units, the number of days of accident-free production, average throughput volume, average time for repairs and so on. So the quantities of interest would be metrics, or KPIs, that succinctly summarize the division's performance in those terms.

Tags: key performance indicator, keyconnect, analytics to measure KPI

Key Performance Indicators: effect of wrong data or wrong questions

Posted by Bala Deshpande on Fri, Oct 19, 2012 @ 09:10 AM

The purpose of using analytics is to help make sound decisions using data, as opposed to shoot-from-the-hip decisions based on instinct or gut feel. This is all well and good, but there is one pitfall to watch out for before starting on the analytics journey: wrong data or wrong questions will derail the best efforts. What do we mean by this?

Tags: key performance indicator, keyconnect, analytics to measure KPI

Using chi squared calculator in KeyConnect for market basket analysis

Posted by Bala Deshpande on Thu, Oct 11, 2012 @ 09:00 AM

Imagine you own a drugstore and want to optimize your shelf space to improve cross-selling. Suppose you have customer transaction data from the sale of cosmetics: typically such data contains the items purchased together by different customers. Specifically, let us suppose we have a dataset that records what each cosmetics customer purchases during one transaction. (Such a dataset is described in this data mining book.) There are seven different cosmetics-related items sold in the drugstore, and the manager wants to use this transaction data to make the best use of the shelf space.

Tags: key performance indicator, keyconnect, chi square test

Feature selection: mutual information vs. other commonly used methods

Posted by Bala Deshpande on Wed, Sep 26, 2012 @ 09:05 AM

Feature selection, or dimension reduction, is a data preparation activity for most predictive analytics and data mining work. One can argue that feature selection is one side of the coin and extracting key performance indicators (KPIs) is the other side. Both of these activities require us to parse through available data and identify the big hitters or key players within the dataset. Where they differ is the final objective: in data mining, the objective is simply to reduce the dimension of the data to optimize model building. In KPI analysis, the objective is to finalize which metrics to track.
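
As a rough illustration of identifying the "big hitters", the sketch below ranks candidate variables by their mutual information with a target. It is only a minimal example under stated assumptions, not the method KeyConnect itself uses: the function `mutual_information`, the variable names (`shift`, `machine`, `defects`) and the toy values are all hypothetical and invented for illustration.

```python
import math
from collections import Counter


def mutual_information(xs, ys):
    """Mutual information (in bits) between two categorical sequences."""
    n = len(xs)
    joint = Counter(zip(xs, ys))  # joint counts of (x, y) pairs
    px = Counter(xs)              # marginal counts of x
    py = Counter(ys)              # marginal counts of y
    mi = 0.0
    for (x, y), count in joint.items():
        # p(x, y) * log2( p(x, y) / (p(x) * p(y)) ), written with raw counts
        mi += (count / n) * math.log2(count * n / (px[x] * py[y]))
    return mi


# Hypothetical toy data: which variable tells us more about the defect level?
defects = ["hi", "hi", "lo", "lo"]
features = {
    "machine": ["A", "B", "A", "B"],
    "shift": ["day", "day", "night", "night"],
}

# Rank candidate variables from most to least informative about the target.
ranking = sorted(
    ((name, mutual_information(values, defects)) for name, values in features.items()),
    key=lambda item: item[1],
    reverse=True,
)
for name, score in ranking:
    print(f"{name}: {score:.3f} bits")
```

In this toy data `shift` fully determines the defect level while `machine` carries no information about it, so `shift` ranks first. The same ranking could serve either purpose described above: dropping low-scoring columns before model building, or shortlisting high-scoring metrics as candidate KPIs.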

Tags: keyconnect, data mining, feature selection

Extract key performance indicators using mutual information in 5 steps

Posted by Bala Deshpande on Thu, Sep 20, 2012 @ 08:16 AM

In this article, we briefly describe a 5-step process that will allow anyone to extract a key performance indicator from diverse datasets. The process will employ open source and software-as-a-service tools that are affordable and easy to deploy. 

Tags: data mining with rapidminer, key performance indicator, keyconnect

Using chi squared calculator to understand key performance indicators

Posted by Bala Deshpande on Mon, Aug 27, 2012 @ 07:33 AM

While Key Performance Indicators (KPIs) offer a rational basis for judging performance, there are two main challenges. The first challenge is selecting the right key performance indicators for your business. How do you know which are the best KPIs for your needs? The standard solution is to monitor several KPIs simultaneously. This can lead to data overload and become a barrier to effective communication of business performance to executives.

Tags: keyconnect, chi square test, KPI

Gender bias in decision making? Verifying with chi squared calculator

Posted by Bala Deshpande on Wed, Aug 15, 2012 @ 06:47 AM

One scenario that market researchers try to understand well is the influence of gender on big-ticket decision making. For example, in our family my wife has a strong say in all the high-value decisions, from buying or remodeling our house to buying a new car! A convertible is ruled out in favor of a minivan, a man-cave in the basement is strongly contested by an extra guest bedroom, and so on.

Tags: business analytics, keyconnect, chi square test

Data preparation for predictive analytics: using mutual information

Posted by Bala Deshpande on Tue, Aug 14, 2012 @ 08:40 AM

A well-known rule of thumb in data mining and predictive analytics is that 80% of an analyst's time is spent on cleaning and preparing the data for analysis. A well-prepared dataset is indeed more than half the work done. Recently, one of our regular blog readers wrote in to say that some tips and tools to help in this context would be very helpful. Of course, we agree wholeheartedly.

Tags: keyconnect, data mining tools, mutual information

How to use a chi squared calculator to analyze multiple variables

Posted by Bala Deshpande on Wed, Aug 08, 2012 @ 08:29 AM

The chi squared test of independence is usually used to check whether two categorical (or nominal) variables are independent. However, when you have a dataset that consists of several nominal variables, you can still apply the test to answer the more general question: "are any of these variables related to one another?"
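
One simple way to approach the multi-variable question is to run the two-variable test on every pair of columns. The sketch below does this with the Pearson chi squared statistic in plain Python. It is an illustrative example only, not the KeyConnect implementation: the helper names (`chi_squared_statistic`, `pairwise_chi_squared`) and the toy survey data are hypothetical.

```python
from collections import Counter
from itertools import combinations


def chi_squared_statistic(xs, ys):
    """Pearson chi squared statistic and degrees of freedom for two nominal variables."""
    n = len(xs)
    observed = Counter(zip(xs, ys))  # contingency table as {(row, col): count}
    row_totals = Counter(xs)
    col_totals = Counter(ys)
    stat = 0.0
    for r, r_total in row_totals.items():
        for c, c_total in col_totals.items():
            # Expected count if the two variables were independent
            expected = r_total * c_total / n
            stat += (observed[(r, c)] - expected) ** 2 / expected
    dof = (len(row_totals) - 1) * (len(col_totals) - 1)
    return stat, dof


def pairwise_chi_squared(columns):
    """Apply the two-variable test to every pair of columns in a dataset."""
    return {
        (a, b): chi_squared_statistic(columns[a], columns[b])
        for a, b in combinations(columns, 2)
    }


# Hypothetical toy survey data, invented for illustration.
data = {
    "gender": ["M", "M", "F", "F", "M", "F", "M", "F"],
    "vehicle": ["van", "van", "car", "car", "van", "car", "van", "car"],
    "region": ["N", "S", "N", "S", "S", "N", "N", "S"],
}

results = pairwise_chi_squared(data)
for pair, (stat, dof) in results.items():
    # For 1 degree of freedom, the 5% critical value is about 3.841.
    print(pair, f"chi2={stat:.2f}", f"dof={dof}")
```

A pair whose statistic exceeds the critical value for its degrees of freedom is a candidate for being related. Keep in mind that testing many pairs raises the chance of a false positive, and that with samples as small as this toy data the expected cell counts are too low for the test to be reliable in practice.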

Tags: keyconnect, chi square test