The Analytics Compass Blog

Twice-weekly articles to help SMBs optimize business performance with data analytics and build their analytics expertise.

Ranking KPIs: a critical first step for small or big data analytics

When the U.S. government's open data website, Data.gov, launched in 2009, it had a measly 47 datasets. Four years later it has exploded to nearly 100,000 datasets in more than 50 formats. This is merely the public-facing data that the government makes available to the taxpaying citizenry. The "other" government data (still funded by taxes), which is not openly available for security and other reasons, is clearly much larger. EMC Corporation recently released a report indicating that only about a quarter of this data is currently tagged and analyzed by the government. Officials have been quoted as saying that over the next 5 years, the federal government will spend about $13 billion (16% of the total IT budget) to improve big data infrastructure and develop data mining best practices for this data. The report also summarized the top three areas where large government agencies can best leverage big data and analytics: improving process and efficiency, enhancing security, and predicting trends.

How to ensure that KPIs are mutually exclusive and non-redundant

In any business, strategic decisions can be made only after we know which factors have the greatest impact on the bottom line. Key performance indicators (KPIs) is the term generally reserved for the factors that signal the health of a business. The business in question could be an entire entity or a specific division within a larger organization. For example, for the CEO of a large corporation, the "business" interest is the overall profitability of the company. For a production manager in a smaller organization, the business interest would be the number of defective units, the number of days of accident-free production, average throughput volume, average time for repairs and so on. The quantities of interest are therefore the metrics, or KPIs, that succinctly summarize the division's performance in those terms.

Key Performance Indicators: effect of wrong data or wrong questions

The purpose of using analytics is to help make sound decisions using data, as opposed to making shoot-from-the-hip decisions using instinct or gut feel. This is all well and good. However, there is one pitfall to watch out for before starting on the analytics journey: wrong data or wrong questions will derail the best efforts. What do we mean by this?

Using chi squared calculator in KeyConnect for market basket analysis

Imagine you own a drugstore and want to optimize your shelf space to improve cross-selling. Suppose you have customer transaction data from the sale of cosmetics; typically such data contains the items purchased together by different customers. Specifically, let us suppose we have a dataset that records what each cosmetics customer purchases during one transaction. (Such a dataset is described in this data mining book.) The drugstore sells seven different cosmetics-related items, and the manager wants to use the transaction data described above to make the best use of the shelf space.
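
As a rough illustration of the idea (not the KeyConnect calculator itself), the sketch below tests whether two items are bought together more often than chance would suggest. The item names and the six transactions are made up for the example, and `chi_squared_2x2` is a hypothetical helper name. It builds a 2x2 table of "bought A / did not buy A" against "bought B / did not buy B" and computes the chi-squared statistic with one degree of freedom.

```python
import math
from itertools import combinations


def chi_squared_2x2(transactions, item_a, item_b):
    """Chi-squared test of independence for two items across transactions.

    Returns (statistic, p_value) for the 2x2 table of presence/absence.
    """
    n = len(transactions)
    both = sum(1 for t in transactions if item_a in t and item_b in t)
    only_a = sum(1 for t in transactions if item_a in t and item_b not in t)
    only_b = sum(1 for t in transactions if item_a not in t and item_b in t)
    neither = n - both - only_a - only_b

    observed = [[both, only_a], [only_b, neither]]
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]

    stat = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / n
            if expected == 0:
                # An item that is always (or never) bought carries no signal.
                return 0.0, 1.0
            stat += (observed[i][j] - expected) ** 2 / expected

    # For one degree of freedom, P(chi2 > stat) = erfc(sqrt(stat / 2)).
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value


# Hypothetical transactions: each set is one customer's basket.
transactions = [
    {"lipstick", "mascara"},
    {"lipstick", "mascara", "nail polish"},
    {"nail polish"},
    {"lipstick", "mascara"},
    {"eyeliner"},
    {"nail polish", "eyeliner"},
]

items = sorted(set().union(*transactions))
for a, b in combinations(items, 2):
    stat, p = chi_squared_2x2(transactions, a, b)
    print(f"{a} & {b}: chi2={stat:.2f}, p={p:.3f}")
```

Pairs with a large statistic and a small p-value are candidates for placing side by side on the shelf. With so few transactions the chi-squared approximation is unreliable, so a real analysis would need far more data than this toy example.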

Feature selection: mutual information vs. other commonly used methods


Feature selection, or dimension reduction, is a data preparation activity for most predictive analytics and data mining work. One can argue that feature selection is one side of the coin and extracting key performance indicators (KPIs) is the other. Both activities require us to parse through the available data and identify the big hitters or key players within the dataset. Where they differ is the final objective: in data mining, the objective is simply to reduce the dimension of the data to optimize model building. In KPI analysis, the objective is to finalize which metrics to track.
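
To make the notion of "big hitters" concrete, here is a minimal sketch of ranking categorical features by their mutual information with a target column. The column names and values are invented for illustration, and `mutual_information` and `rank_features` are hypothetical helper names, not functions from any particular tool. The score is zero when a feature tells us nothing about the target and grows as the feature becomes more informative.

```python
import math
from collections import Counter


def mutual_information(xs, ys):
    """Mutual information (in bits) between two equal-length discrete sequences."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    px = Counter(xs)
    py = Counter(ys)
    mi = 0.0
    for (x, y), count in joint.items():
        # p(x, y) * log2( p(x, y) / (p(x) * p(y)) ), written with raw counts.
        mi += (count / n) * math.log2(count * n / (px[x] * py[y]))
    return mi


def rank_features(columns, target):
    """Return (feature name, score) pairs sorted from most to least informative."""
    scores = {name: mutual_information(values, target)
              for name, values in columns.items()}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Hypothetical dataset: two candidate features and a yes/no outcome.
columns = {
    "plan": ["basic", "basic", "premium", "premium", "basic", "premium"],
    "region": ["east", "west", "east", "west", "east", "west"],
}
churned = ["yes", "yes", "no", "no", "yes", "no"]

for name, score in rank_features(columns, churned):
    print(f"{name}: {score:.3f} bits")
```

The same ranking serves either objective described above: keep the top-scoring columns as model inputs, or treat them as candidate metrics to track.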

Extract key performance indicators using mutual information in 5 steps

In this article, we briefly describe a 5-step process that will allow anyone to extract a key performance indicator from diverse datasets. The process will employ open source and software-as-a-service tools that are affordable and easy to deploy. 

Using chi squared calculator to understand key performance indicators

While Key Performance Indicators (KPIs) offer a rational basis for judging performance, there are two main challenges. The first challenge is selecting the right key performance indicators for your business. How do you know which are the best KPIs for your needs? The standard solution is to monitor several KPIs simultaneously. This can lead to data overload and become a barrier to effective communication of business performance to executives.

Gender bias in decision making? Verifying with chi squared calculator

One scenario that market researchers try to understand well is the influence of gender on big-ticket decision making. For example, in our family, my wife has a strong say in all the high-value decisions, from buying or remodeling our house to buying a new car! A convertible is ruled out in favor of a minivan, a man-cave in the basement is strongly contested by an extra guest bedroom, and so on.

Data preparation for predictive analytics: using mutual information

A well-known rule of thumb in data mining and predictive analytics is that 80% of an analyst's time is spent on cleaning and preparing the data for analysis. A well-prepared dataset is indeed more than half the work done. Recently, one of our regular blog readers wrote back saying that some tips and tools to help in this context would be very helpful. Of course, we agree wholeheartedly.

How to use a chi squared calculator to analyze multiple variables

The chi squared test of independence is usually used to check whether two categorical (or nominal) variables are independent. However, when you have a dataset that consists of several nominal variables, you can still apply the test to answer the more general question "are any of these variables related to one another?"
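
One straightforward way to approach that general question, sketched below under our own assumptions rather than as the calculator's actual method, is to run the test on every pair of variables and compare the results. The column names and values are made up, and `chi_squared_statistic` and `pairwise_chi_squared` are hypothetical helper names. The code returns the statistic and degrees of freedom for each pair; converting these to a p-value needs a chi-squared table or a statistics library.

```python
from collections import Counter
from itertools import combinations


def chi_squared_statistic(xs, ys):
    """Chi-squared statistic and degrees of freedom for two nominal sequences."""
    n = len(xs)
    observed = Counter(zip(xs, ys))
    row_totals = Counter(xs)
    col_totals = Counter(ys)
    stat = 0.0
    for x in row_totals:
        for y in col_totals:
            # Expected count if the two variables were independent.
            expected = row_totals[x] * col_totals[y] / n
            stat += (observed[(x, y)] - expected) ** 2 / expected
    dof = (len(row_totals) - 1) * (len(col_totals) - 1)
    return stat, dof


def pairwise_chi_squared(columns):
    """Apply the test to every pair of columns in a {name: values} mapping."""
    return {
        (a, b): chi_squared_statistic(columns[a], columns[b])
        for a, b in combinations(columns, 2)
    }


# Hypothetical survey data with three nominal variables.
columns = {
    "gender": ["F", "F", "M", "M", "F", "M", "F", "M"],
    "vehicle": ["minivan", "minivan", "sedan", "sedan",
                "minivan", "sedan", "minivan", "sedan"],
    "region": ["east", "west", "east", "west",
               "west", "east", "east", "west"],
}

for (a, b), (stat, dof) in pairwise_chi_squared(columns).items():
    print(f"{a} vs {b}: chi2={stat:.2f}, dof={dof}")
```

Because the number of pairs grows quickly with the number of variables, some pairs will look related purely by chance, so it is worth tightening the significance threshold when many tests are run at once.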