Predictive Analytics with cricket statistics: IND-AUS QF2 analysis

Posted by Bala Deshpande on Tue, Mar 22, 2011 @ 10:30 AM

This is one game where, if you are an Indian fan, you really hope that statistics loses. As with our past several predictions, the data for this predictive analytics exercise comes from encounters between the two sides over the last 5 years.

A look at the simplest statistic, the percentage of games won by IND, is telling:

vs. BANGLA 91%

vs. ENG 66%

vs. SA 46%

vs. WI 56%

and (drum roll) vs. AUS 40%!

Based on this alone, IND supporters have plenty to be pessimistic about. So what were the circumstances under which IND won those 40% of the games?

Fortunately, our RapidMiner decision tree analysis comes in handy to answer precisely such a question. Once again, we see that partnerships play the most significant role.

[Figure: RapidMiner decision tree for IND vs. AUS]

  • In particular, India need to have 2 significant partnerships worth at least 77 runs
  • If not, the bowlers, specifically pace bowlers, have to step into the breach and take more than 7 wickets

It is as simple as that!
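For readers who like to see a rule written down, the two branches above can be sketched as a small Python function. This is only one reading of the tree, not the RapidMiner model itself: it assumes that "2 significant partnerships worth at least 77 runs" means two separate partnerships of 77 or more runs each, and that meeting either condition predicts an IND win. The function name and its arguments are made up for illustration.

```python
from typing import Sequence

# Thresholds quoted in the post; how they combine is an assumption.
PARTNERSHIP_RUNS = 77      # a partnership counts as "significant" at 77+ runs
MIN_BIG_PARTNERSHIPS = 2   # number of significant partnerships needed
PACE_WICKETS = 7           # pace bowlers must take MORE than this many wickets


def predict_ind_win(partnerships: Sequence[int], pace_wickets: int) -> bool:
    """Hypothetical hand-coded version of the two rules read off the tree."""
    big = sum(1 for runs in partnerships if runs >= PARTNERSHIP_RUNS)
    if big >= MIN_BIG_PARTNERSHIPS:
        return True
    # Otherwise the pace bowlers have to step into the breach.
    return pace_wickets > PACE_WICKETS
```

Called as `predict_ind_win([80, 95, 12], pace_wickets=3)`, the first branch fires; with only one big partnership, the result depends entirely on the pace wickets.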

There has been a question about whether this AUS side is on the decline. To test this theory, we quickly looked at the more recent results, after the AUS team lost key players such as Gilchrist, Hayden and Warne. The picture here is more even-keeled: of the last 10 matches, AUS and IND have won 5 each. More hearteningly for true cricket lovers, 3 of these games were very close contests, with the chasing side falling short by just 9 runs, 5 runs and 3 runs. AUS fell short once, by 9 runs, and IND fell short twice! So be ready for a real close finish in this quarterfinal!
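The win percentages quoted in this post are simple ratios, and the recent-form figure can be reproduced with a minimal sketch like the one below. The only numbers taken from the post are the 5 wins each in the last 10 matches; the order of results in the list is arbitrary, and the function name is ours, not part of any dataset or tool.

```python
def win_percentage(winners, team):
    """Percentage of matches in `winners` that were won by `team`."""
    if not winners:
        raise ValueError("need at least one match result")
    wins = sum(1 for w in winners if w == team)
    return 100.0 * wins / len(winners)


# Last 10 IND-AUS matches per the post: 5 wins each (order is illustrative).
recent = ["IND"] * 5 + ["AUS"] * 5
recent_ind_pct = win_percentage(recent, "IND")
print(recent_ind_pct)  # 50.0, against 40% over the full 5-year window
```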

Download our updated summary report and all datasets below.

[Download: cricket datasets and decision tree summary report]

Tags: predictive analytics in sport, data mining with rapidminer, cricket statistics, decision trees