text mining

Understanding the needs of your customers is a critical aspect of business, and it requires proper customer segmentation. There are many different approaches to segmenting customers: by their behavior, by their business structure (e.g. small, medium, large) or even by the amount of revenue they generate for you, as in an 80-20 customer analysis. Text mining survey data can do this job with relatively little effort, as we shall see.

Customer surveys can be great tools for this purpose. Survey results can do more than show how your customers respond to specific scenarios: a well-administered survey can help you build models that predict how customers may behave in new situations. Customer segmentation and customer profiling through survey analytics is a very systematic process.

Text mining survey results

Everyone is familiar with surveys that rely on a set of questions for which responses typically range from 1 through 5, where 1 represents one extreme type of behavior (very high or very low) and 5 represents the opposite end of the spectrum. This makes it easy to collect and aggregate the results. However, it also forces respondents to shoehorn their answers into one of the five classes. Open-ended questions, which require a text input, are the opposite: the results are much harder to analyze and draw conclusions from. But open-ended questions can be extremely valuable because respondents can raise concerns that seem unique, yet turn out to be not so unique when viewed across all respondents. In other words, free-form text can be very revealing through the text patterns that emerge from the responses.

For example, in a survey conducted by one of our customers, text mining an open-ended response revealed a recurring pattern that would have been hard to infer from rigid numerical ranges. A simple word cloud shows that the respondents have struggled with some issues: the word “struggle” stands out. Similarly, the words “software” and “design” are common themes of concern for the respondents.

[Image: word cloud from text mining survey data]
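
A word cloud is essentially a picture of term frequencies, so the counting step behind it can be sketched in a few lines. This is a minimal illustration, not the analysis used for the survey above: the responses are made up, and the `STOP_WORDS` list and `term_frequencies` helper are hypothetical names introduced here.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stop-word list; a real analysis needs a fuller one.
STOP_WORDS = {"the", "a", "an", "to", "with", "and", "of", "we", "is", "in", "i", "on", "for"}

def term_frequencies(responses):
    """Count how often each non-stop-word appears across all responses."""
    counts = Counter()
    for text in responses:
        tokens = re.findall(r"[a-z']+", text.lower())
        counts.update(t for t in tokens if t not in STOP_WORDS)
    return counts

# Made-up open-ended responses, for illustration only.
responses = [
    "We struggle to meet timing on every design",
    "The software is hard to understand",
    "I struggle with the software and the design flow",
]

freqs = term_frequencies(responses)
for term, count in freqs.most_common(5):
    print(term, count)
```

The most frequent terms are the ones a word cloud would draw largest.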

The next logical step is to identify which of these words are associated with one another. For example, a word association analysis showed that many of these respondents “struggle” to meet “timing” or “time”, which may point to a strong need for productivity improvement tools. Similarly, strong correlations existed between “software” and “understanding”, hinting at possible training needs among the respondents.
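
One common way to measure word association is the correlation between two terms' occurrence patterns across responses. The sketch below assumes that definition (Pearson correlation on presence/absence vectors), which may differ from the tool used for the survey above; the documents are made up and the function names are hypothetical.

```python
import math
import re

def presence_vector(term, documents):
    """Return 1 for each document that contains the term, else 0."""
    return [1 if term in re.findall(r"[a-z']+", doc.lower()) else 0 for doc in documents]

def pearson(x, y):
    """Pearson correlation of two equal-length numeric lists (0.0 if either is constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)

def association(term_a, term_b, documents):
    """Correlation between the occurrence patterns of two terms."""
    return pearson(presence_vector(term_a, documents), presence_vector(term_b, documents))

# Made-up responses, for illustration only.
documents = [
    "we struggle to meet timing",
    "timing closure is a struggle",
    "the software needs better understanding",
    "understanding the software takes weeks",
]

print(association("struggle", "timing", documents))
print(association("software", "understanding", documents))
```

Values near 1 mean the two words tend to appear in the same responses; values near 0 mean no clear relationship.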

While numerical scores may be somewhat restrictive, they can be helpful in building training data sets for predictive models on the survey data.

How to transform survey data into a predictive model?

The key is to combine numerical scores and text responses in a single model that can be trained to sort new survey respondents into the proper bins based on their text responses alone. The process is conceptually simple. Use the numerical responses from standard survey questions to assign scores to customers. Build the free-form text responses into a term document matrix (TDM), as is typically done in text mining analyses. This yields a complete training data set in which the numerical score is the predicted outcome and the terms in the TDM are the predictors, or independent variables. This is, of course, very similar to sentiment analysis.
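
The steps above can be sketched end to end in plain Python. The post does not name a specific model, so this example assumes a multinomial naive Bayes classifier with add-one smoothing as a stand-in. The training texts, the "low"/"high" labels and all function names are hypothetical, and the matrix is stored with one row per response (the transpose of a TDM).

```python
import math
import re
from collections import Counter, defaultdict

def tokenize(text):
    return re.findall(r"[a-z']+", text.lower())

def build_matrix(texts):
    """Return (vocab, rows) where rows[i][j] counts term j in response i."""
    vocab = sorted({t for text in texts for t in tokenize(text)})
    index = {t: j for j, t in enumerate(vocab)}
    rows = []
    for text in texts:
        row = [0] * len(vocab)
        for t in tokenize(text):
            row[index[t]] += 1
        rows.append(row)
    return vocab, rows

def train_naive_bayes(vocab, rows, labels):
    """Fit multinomial naive Bayes with add-one (Laplace) smoothing."""
    class_docs = Counter(labels)
    term_totals = defaultdict(lambda: [0] * len(vocab))
    for row, label in zip(rows, labels):
        totals = term_totals[label]
        for j, count in enumerate(row):
            totals[j] += count
    model = {}
    for label, totals in term_totals.items():
        denom = sum(totals) + len(vocab)
        model[label] = {
            "log_prior": math.log(class_docs[label] / len(labels)),
            "log_likelihood": {t: math.log((totals[j] + 1) / denom)
                               for j, t in enumerate(vocab)},
        }
    return model

def predict(model, text):
    """Return the label with the highest log-probability for the text."""
    scores = {}
    for label, params in model.items():
        score = params["log_prior"]
        for t in tokenize(text):
            # Terms never seen in training are skipped.
            score += params["log_likelihood"].get(t, 0.0)
        scores[label] = score
    return max(scores, key=scores.get)

# Made-up training data: text responses paired with score-derived labels.
texts = [
    "we struggle to meet timing",
    "the software is hard to understand",
    "the design flow works well",
    "great tools and helpful support",
]
labels = ["low", "low", "high", "high"]

vocab, rows = build_matrix(texts)
model = train_naive_bayes(vocab, rows, labels)
print(predict(model, "I struggle with the software"))
```

Four responses are far too few for a real model; the point is only to show how the score-derived label and the term counts fit together as outcome and predictors.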

The difficulty in either case, survey or sentiment analysis, is that you must spend some time upfront aggregating the numerical responses to arrive at a composite “score”. However, this needs to be done only once, and the pay-off is worth the effort. Text mining survey data can yield unexpected value.
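
As a rough illustration of the aggregation step, the sketch below averages each respondent's 1-to-5 answers into a composite score and maps it to a bin. The averaging rule, the cutoffs of 2.5 and 3.5, the bin names and the survey data are all assumptions made for this example; a real survey may need weighting or reverse-scored questions.

```python
from statistics import mean

def composite_score(answers):
    """Average one respondent's 1-to-5 answers into a single score."""
    if not answers:
        raise ValueError("respondent has no numerical answers")
    if any(a < 1 or a > 5 for a in answers):
        raise ValueError("answers must be between 1 and 5")
    return mean(answers)

def to_bin(score, low_cutoff=2.5, high_cutoff=3.5):
    """Map a composite score to a bin; the cutoffs are hypothetical."""
    if score < low_cutoff:
        return "low"
    if score > high_cutoff:
        return "high"
    return "medium"

# Made-up numerical answers keyed by respondent id.
survey = {
    "resp_1": [1, 2, 2, 1],
    "resp_2": [3, 3, 4, 3],
    "resp_3": [5, 4, 5, 5],
}

bins = {rid: to_bin(composite_score(answers)) for rid, answers in survey.items()}
print(bins)
```

Each respondent's bin can then serve as the outcome label paired with that respondent's text answers.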

Originally posted on Fri, Feb 07, 2014 @ 09:02 AM
