Simafore provides tools and expertise to:
The Analytics Compass Blog is aimed at two types of readers:
individuals who want to build analytics expertise and
small businesses that want to understand how analytics can help them improve their business performance.
According to Gartner's 2013 Hype Cycle, predictive analytics has more or less reached the so-called "Plateau of Productivity", which means that businesses are actively using the technology and benefiting from it. We first tracked this on our blog exactly two years ago, when predictive analytics was slowly but surely moving up the "Slope of Enlightenment". (See the hype cycle since 2000.) Big Data, on the other hand, is at its "Peak of Inflated Expectations" and will likely remain there for the next few years. The Internet of Things is close behind.
While most businesses today can easily tell how many products they sold to their customers, they cannot easily tell which customer bought which product, or which customers were more profitable and which were less profitable. Does this pose a problem? In today's world of highly knowledgeable consumers and social networks that have begun to influence purchase behavior (think reviews, Facebook likes, etc.), the future success of many companies will depend on how well they retain customers through individual propositions rather than relying on volume or mass-market appeal alone.
Two of the many ways descriptive and predictive analytics add value to businesses are customer segmentation and customer profiling, both of which help a business understand its customers better. Segmentation and profiling have been widely employed by big businesses in industries such as finance, insurance and database marketing. It is only now, with the availability of cheap computing coupled with low-cost or no-cost (as in open source) software, that businesses of all shapes and sizes can afford to take advantage of these processes. Here we describe one simple process for conducting a fundamental segmentation analysis called Recency-Frequency-Monetary Value (RFM) analysis.
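To make the idea concrete, here is a minimal sketch of an RFM calculation in plain Python. It is an illustration under stated assumptions, not the process described in the full article: the transaction records, the `rfm_table` and `rank_scores` helpers, the reference date and the three-bin scoring are all hypothetical choices made for this example.

```python
from datetime import date

# Hypothetical transaction log: (customer_id, purchase_date, amount).
transactions = [
    ("A", date(2013, 1, 5), 120.0),
    ("A", date(2013, 6, 20), 80.0),
    ("B", date(2012, 11, 2), 40.0),
    ("C", date(2013, 7, 1), 300.0),
    ("C", date(2013, 7, 15), 150.0),
    ("C", date(2013, 8, 1), 50.0),
]


def rfm_table(transactions, as_of):
    """Return {customer_id: (recency_days, frequency, monetary_total)}."""
    table = {}
    for cust, day, amount in transactions:
        last, freq, total = table.get(cust, (day, 0, 0.0))
        table[cust] = (max(last, day), freq + 1, total + amount)
    return {
        cust: ((as_of - last).days, freq, total)
        for cust, (last, freq, total) in table.items()
    }


def rank_scores(values, reverse=False, bins=3):
    """Score each customer from 1 (worst) to `bins` (best) by rank.

    Customers are sorted so the worst comes first. Ties are broken by
    input order, which is a simplification made for this sketch.
    """
    ordered = sorted(values, key=values.get, reverse=reverse)
    n = len(ordered)
    return {cust: 1 + (i * bins) // n for i, cust in enumerate(ordered)}


rfm = rfm_table(transactions, as_of=date(2013, 9, 1))

# Fewer days since the last purchase is better, so sort recency descending.
r = rank_scores({c: v[0] for c, v in rfm.items()}, reverse=True)
f = rank_scores({c: v[1] for c, v in rfm.items()})
m = rank_scores({c: v[2] for c, v in rfm.items()})

# Combined (R, F, M) score per customer.
segments = {c: (r[c], f[c], m[c]) for c in rfm}
```

Customers with high scores on all three dimensions bought recently, often and in large amounts; those with low scores across the board are candidates for a win-back campaign or for being dropped from costly mailings.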
Data balancing is an important preparatory step for predictive modeling. Unbalanced data refers to a situation where one class of responses is disproportionately larger than the other. In direct marketing, for instance, there are usually far more samples of non-responders than of responders. In credit risk or fraud analytics, there may be far more examples of non-fraudulent transactions than fraudulent ones. Similarly, in predictive analytics for machine failure, there may be far more instances of non-failures than failures. Models trained on such imbalanced data tend to be sub-optimal, because they can score well simply by favoring the majority class. In this article we will see how to address such situations.
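One common way to balance a training set is random undersampling of the majority class. The sketch below illustrates that technique only; it is not necessarily the method used in the full article, and the `undersample` helper, the `responded` field and the 95/5 split are hypothetical values chosen for the example.

```python
import random
from collections import Counter


def undersample(rows, label_of, seed=0):
    """Randomly drop rows from the larger classes until every class
    has as many rows as the smallest class."""
    rng = random.Random(seed)

    # Group rows by class label.
    by_class = {}
    for row in rows:
        by_class.setdefault(label_of(row), []).append(row)

    # Keep the same number of rows from each class.
    target = min(len(group) for group in by_class.values())
    balanced = []
    for group in by_class.values():
        balanced.extend(rng.sample(group, target))

    rng.shuffle(balanced)  # avoid leaving the rows grouped by class
    return balanced


# Hypothetical direct-marketing data: 5 responders, 95 non-responders.
rows = [{"id": i, "responded": i < 5} for i in range(100)]

balanced = undersample(rows, label_of=lambda row: row["responded"])
counts = Counter(row["responded"] for row in balanced)
```

Undersampling discards data, so it suits cases where the majority class is plentiful. When the minority class is very small, oversampling it or weighting the classes during training may be a better fit.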
Numerous small businesses collect significant amounts of customer data but rarely do anything value-adding with it beyond using customer names and addresses to send catalogs or mailings. What they do not realize is that this data is a treasure trove of information if properly utilized. One of our customers is a niche products manufacturer and distributor. They estimate a market size of about 2 to 3 million unique consumers for their products in the United States. They have about 30,000 customer records collected over the years, covering which products were sold, when they were sold and some customer demographic information. They also maintain a product catalog of about 4,000 distinct items (SKUs).
In a previous article we provided an overview of the different types of analytics one can run for problems related to customer acquisition, customer retention and customer churn. We mentioned that most of these questions fall into either strategic or tactical categories and can be addressed by either descriptive or predictive analytics. In this article we will explore in a little more detail some of the tactical problems that can be addressed by predictive analytics.
A typical problem facing many manufacturers is developing accurate price quotes for their customers. For example, many automotive subsystems which go into a car are complex assemblies of hundreds of parts. Every time a new vehicle is designed, many of these subsystems will need to be changed in some way - from minor adjustments to sometimes a new design altogether. However, the underlying parts or "ingredients" very rarely vary, with only their configurations getting changed. For example, an electrical subsystem will still need copper, plastic and some metal, but the amounts of these materials and the way in which they are put together can vary significantly.
Many non-profits, such as academic institutions, public radio and television, and charitable organizations, rely on well-established donor models to operate. As non-profits, one would expect their focus on cost management to be sharp, and when they use commercial vendors to drive their campaigns, this attention carries over to the business model of the vendor.
For someone who has been mucking around with data for a couple of decades, the term "big" applied to data actually trivializes data and analytics. It is an unfortunate term that reeks of imprecision. True, there are definitions that qualify what "big" implies, most notably the three V's of data: volume, velocity and variety. But as Eric Siegel puts it in his latest book "Predictive Analytics", big is only relative.
If you are the owner of a small business, you may know a lot about your products or services, and you may also know much about your customers. While this knowledge or business wisdom is definitely valuable, your chances of succeeding will be greater if you can supplement your intelligence with machine learning. This is the premise behind predictive analytics as a service, the kind offered by companies such as BigML. Their affordable cloud-based app lets you build decision-tree-based predictive models, which can then be used for a variety of applications. These models can be used to segment customers and products, to help with sales predictions and market pricing, and to market your products and services more effectively. These are of course the classic potential benefits of deploying predictive analytics.