Twenty20 (T20) cricket calls for a more evolved and nuanced approach to analytics, and the Indian Premier League (IPL) is a prime example of this.

Why? Back in the day, when Test matches were the be-all and end-all of cricket, data analysis simply meant keeping track of runs scored and wickets taken. Then came One-Day Internationals (ODIs), and the world woke up to strike rates, economy rates and chase precision, among other measures. Soon, we began to crunch video data to analyse player movements. Enter T20s, and we are now comparing ball-by-ball data streams with existing legacy data to generate real-time insights. T20 numbers therefore need to be placed in more situational and contextual settings, without compromising the objectivity of the analysis. For this, a whole new stats language is indispensable.

How is analytics changing the cricket experience on televisions, laptops and mobiles, and even for the live audience in the stadium? How can teams, coaches, players, management, cricket analysts and even popular digital platforms like Hotstar benefit from it? How can we go beyond questions like ‘How many runs did you score?’ or ‘How many wickets did you take?’ and ask questions such as ‘How valuable are you to your team?’ This article draws on recent developments in the IPL to make the case that big data and Artificial Intelligence (AI) are transforming cricket in India.

Choice of players and team formation

Let us start with the following quote from Satish Menon, CEO of Kings XI Punjab, published in a recent Times Now interview.

“I think we spent a lot of time working at analytics. We were completely ready for the auctions as a result of the groundwork. If we missed out on one or two players, we made up by matching it accordingly. While initially, we were hoping to get at least 50 percent of players who were on our list, we ended up with 60-70 percent of the players we wanted. Whether it was Andrew Tye, or R Ashwin, Yuvraj Singh or even Chris Gayle, we are happy with the players we have in our squad. Gayle was not on the list of the core players we wanted, but we bought him towards the end,” Menon had said.

In fact, it is not just Kings XI Punjab: many IPL teams have drilled down into data at a granular level to sharpen their team-selection strategy. So what kind of data are we talking about?

Comprehensive player analysis is one of the biggest challenges faced by any team administration. One way to wrap one’s head around this is to calculate a player index called MVPI or the Most Valuable Player Index, which is a weighted composite score of the following attributes of a player:

I. Batting Metrics

II. Bowling Metrics

Dataset from Cricsheet

The dataset used for our analysis is the publicly available Cricsheet data, a repository of datasets covering domestic and international cricket matches. The analysis focusses on T20 matches in particular, and accordingly, we have calculated and weighted the metrics for batsmen and bowlers in the context of the shortest format of the game.

I. Batting Metrics: What Makes a Batsman Indispensable to a Team?

To calculate and visualise 360° batting ability in T20s, we consider five different metrics for a batsman, as discussed in the subsequent sections. In this article, we take the example of Chris Gayle to illustrate the batting metrics, and perform a basic MVPI analysis using the Cricsheet data.

1. Hard Hitting Ability = (Runs from Fours + Runs from Sixes) / Balls Played by Batsman

Hard Hitting Ability is a crucial factor in chasing a target in T20. Our analysis indicates that Chris Gayle has excellent Hard Hitting Ability, given his phenomenal record of hitting fours and sixes off relatively few balls. He scores a commendable 1.102 on this parameter.

2. Finisher = Not Out innings / Total Innings played

Whether or not a batsman can stay at the crease until the end of the innings has a pivotal impact on the team's chances of winning the match. The Finishing Ability of a batsman is particularly important in the second innings. Our analysis shows that Chris Gayle does not have good Finishing Ability, as he gets out well before the end of the innings most of the time. He scores a modest 0.146 on this parameter.

3. Fast Scoring Ability = Total Runs / Balls Played by Batsman

A fast-scoring batsman makes the most runs off the fewest balls. Why is this trait important? Simply because in the T20 format, wasting a ball can severely reduce the chances of winning.

Chris Gayle is outstanding on this metric, as he rakes in runs while facing few balls. He scores a whopping 149.25 on this parameter, a figure expressed as runs per 100 balls, which is the conventional strike rate.

4. Consistency = Total Runs / Number of Times Out

A consistent player with a good average is always an invaluable asset to his team. Virat Kohli, for example, has been the bedrock of India’s T20 wins in recent years. In our analysis, Chris Gayle is above average on this parameter, with a score of 43.51.

5. Running Between Wickets = (Total Runs - Runs from Fours and Sixes) / (Total Balls Played - Boundary Balls)

While boundaries bring the big returns, runs scored by running between the wickets give a side the edge in a T20 chase. It is a must-have trait for batsmen, as it keeps the scoreboard ticking when the boundaries are not coming. Our analysis shows that Chris Gayle scores poorly on this parameter, as he deals mostly in boundaries; singles and doubles are not his forte. His score is 0.502 on this parameter.
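As a rough illustration, the five batting metrics above could be computed from a batsman's aggregate totals along the following lines. This is only a sketch, not the code behind our analysis. The `BattingRecord` class and its field names are hypothetical, and the numbers in the example are invented rather than taken from any player's record. Two assumptions are made so that the outputs sit on the same scale as the scores quoted above: boundary contributions are counted in runs (4 per four, 6 per six), and Fast Scoring Ability is multiplied by 100 to give runs per 100 balls.

```python
from dataclasses import dataclass


@dataclass
class BattingRecord:
    """Aggregate totals for one batsman (hypothetical field names)."""
    runs: int
    balls_faced: int
    fours: int
    sixes: int
    innings: int
    not_outs: int


def batting_metrics(rec: BattingRecord) -> dict:
    """Return the five batting metrics described in the article.

    Assumes every denominator is non-zero (the batsman has faced balls,
    has been dismissed at least once, and so on).
    """
    boundary_runs = 4 * rec.fours + 6 * rec.sixes   # assumption: counted in runs
    boundary_balls = rec.fours + rec.sixes
    dismissals = rec.innings - rec.not_outs
    return {
        "hard_hitting": boundary_runs / rec.balls_faced,
        "finisher": rec.not_outs / rec.innings,
        # assumption: scaled to runs per 100 balls
        "fast_scoring": 100 * rec.runs / rec.balls_faced,
        "consistency": rec.runs / dismissals,
        "running_between_wickets": (rec.runs - boundary_runs)
        / (rec.balls_faced - boundary_balls),
    }


# Invented totals, purely for illustration.
example = BattingRecord(runs=150, balls_faced=100, fours=10, sixes=10,
                        innings=5, not_outs=1)
for name, value in batting_metrics(example).items():
    print(f"{name}: {value:.3f}")
```

In practice the totals would be accumulated from ball-by-ball records before being passed to a function like this one.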

II. Bowling Metrics

As with the batting metrics, we consider five different metrics for bowlers. We take the example of Amit Mishra to illustrate them.

1. Economy = Runs Conceded / (Number of balls bowled by bowler/6)

Economy is perhaps one of the most important traits of a bowler in a T20 match. For a team to ensure that the opposition doesn’t score much, and to give the other bowlers a chance to put pressure on the batting team, bowlers with good economy are the ones to bet on.

Our analysis shows that Amit Mishra has an economy of 7.02, which is highly commendable in T20 cricket.

2. Wicket Taking Ability = Number of balls bowled / Wickets Taken

Wicket Taking Ability is equally vital for a bowler, for two key reasons: firstly, it puts pressure on the incoming batsman, and secondly, it slows down the run rate. Amit Mishra scores 18.2 on Wicket Taking Ability, that is, he takes a wicket roughly every 18 balls, which means he takes relatively more wickets than an average bowler in a given number of overs.

3. Consistency = Runs Conceded / Wickets Taken

This is a no-brainer. Just because a bowler is taking wickets, it doesn’t mean that he can be generous with runs, especially in T20. Consistent bowlers help a team take the maximum number of wickets while keeping runs at bay. Amit Mishra demonstrates high Consistency, with an outstanding score of 21.12 on this parameter.

4. Crucial Wicket Taking Ability = Number of times Four or Five Wickets Taken / Number of Innings Played

A Crucial Wicket Taking bowler is the anchor of the team, whose performance can change the course of a match. Such bowlers help a team win through their splendid individual performances. Amit Mishra scores 0.026 on this attribute, having taken more four- and five-wicket hauls than the average bowler.

5. Short Performance Index = (Wickets Taken – Number of Times Four Wickets Taken – Number of Times Five Wickets Taken) / (Innings Played - Number of Times Four Wickets or Five Wickets Taken)

This parameter factors in good bowling periods demonstrated by a bowler throughout the tournament. Amit Mishra, who has turned around games many times for his team, scores 1.082 in this parameter.
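The five bowling metrics can be sketched in the same way. Again, this is an illustration rather than the code behind our analysis: the `BowlingRecord` class and its field names are hypothetical, the example totals are invented, and the Short Performance Index is implemented exactly as the formula is written above.

```python
from dataclasses import dataclass


@dataclass
class BowlingRecord:
    """Aggregate totals for one bowler (hypothetical field names)."""
    runs_conceded: int
    balls_bowled: int
    wickets: int
    innings: int
    four_wicket_hauls: int
    five_wicket_hauls: int


def bowling_metrics(rec: BowlingRecord) -> dict:
    """Return the five bowling metrics described in the article.

    Assumes every denominator is non-zero (the bowler has bowled,
    has taken at least one wicket, and so on).
    """
    hauls = rec.four_wicket_hauls + rec.five_wicket_hauls
    return {
        # runs conceded per six-ball over; lower is better
        "economy": rec.runs_conceded / (rec.balls_bowled / 6),
        # balls bowled per wicket; lower is better
        "wicket_taking": rec.balls_bowled / rec.wickets,
        # runs conceded per wicket; lower is better
        "consistency": rec.runs_conceded / rec.wickets,
        # share of innings with a four- or five-wicket haul
        "crucial_wicket_taking": hauls / rec.innings,
        # follows the formula as stated in the text
        "short_performance": (rec.wickets - rec.four_wicket_hauls
                              - rec.five_wicket_hauls)
        / (rec.innings - hauls),
    }


# Invented totals, purely for illustration.
example = BowlingRecord(runs_conceded=720, balls_bowled=600, wickets=25,
                        innings=25, four_wicket_hauls=1, five_wicket_hauls=0)
for name, value in bowling_metrics(example).items():
    print(f"{name}: {value:.3f}")
```

Note that for the first three metrics a lower value indicates a better bowler, which matters when the metrics are later ranked and combined.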

Approach: Methodology and Analysis

Now that we understand what MVPI is and how it can identify a player’s potential value to a team, let us see how it works in a step-by-step manner:

1. First, we used the 10 metrics (5 batting and 5 bowling) explained above to calculate the effectiveness of a player in the T20 format. We have benchmarked these scores against pre-defined ranges for different metrics and then normalised the scores to remove any metric bias, using the formula:

Score for a Feature = (Players Count – Rank in that Feature) / Players Count
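A minimal sketch of this rank-based normalisation is shown below. It assumes that rank 1 goes to the best player on a feature, that "best" means the highest value for some features and the lowest for others (economy, for example), and that ties are broken by input order. The function name `rank_scores` and the player labels are hypothetical, and the benchmarking against pre-defined ranges mentioned above is not reproduced here.

```python
def rank_scores(values, higher_is_better=True):
    """Map raw feature values to (players_count - rank) / players_count.

    values: dict of player name -> raw value for one feature.
    Rank 1 is the best player, so scores fall in the range [0, 1).
    """
    ordered = sorted(values, key=values.get, reverse=higher_is_better)
    count = len(ordered)
    return {player: (count - rank) / count
            for rank, player in enumerate(ordered, start=1)}


# Invented values, purely for illustration.
strike_rates = {"Player A": 149.25, "Player B": 120.0,
                "Player C": 135.5, "Player D": 110.0}
print(rank_scores(strike_rates))
# {'Player A': 0.75, 'Player C': 0.5, 'Player B': 0.25, 'Player D': 0.0}

economies = {"Player X": 7.0, "Player Y": 8.5}
print(rank_scores(economies, higher_is_better=False))
# {'Player X': 0.5, 'Player Y': 0.0}
```

Because every feature ends up on the same 0-to-1 scale, a metric measured in the hundreds (such as strike rate) cannot dominate one measured in fractions.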

2. In the next step, we have carried out feature selection and have calculated the respective weights for each of the selected metrics. This is done using the Recursive Feature Elimination technique, which retains features by recursively narrowing down to smaller and smaller sets of attributes.

The model is first trained by feeding an initial set of attributes followed by computation of relative importance of each attribute. Then, the least important attributes are eliminated from the present set of attributes. This approach is then repeated on the reduced set each time, until the desired number of attributes to be selected is eventually reached.
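The loop described above can be sketched in plain Python as follows. This is an illustration under stated assumptions, not the model used in our analysis: the article does not name the underlying model or the target variable, so the sketch uses a small linear model fitted by gradient descent and treats the absolute size of each weight as that attribute's importance. All function names, feature names and data are hypothetical and synthetic. In practice one would more likely rely on a library implementation, such as the `RFE` class in scikit-learn.

```python
def fit_linear(rows, target, lr=0.1, epochs=500):
    """Fit target ~ bias + sum(w_j * x_j) by batch gradient descent."""
    n, k = len(rows), len(rows[0])
    weights, bias = [0.0] * k, 0.0
    for _ in range(epochs):
        errors = [bias + sum(w * x for w, x in zip(weights, row)) - y
                  for row, y in zip(rows, target)]
        bias -= lr * sum(errors) / n
        for j in range(k):
            grad = sum(e * row[j] for e, row in zip(errors, rows)) / n
            weights[j] -= lr * grad
    return weights


def recursive_feature_elimination(rows, target, names, n_select):
    """Repeatedly refit and drop the least important attribute."""
    kept = list(range(len(names)))          # column indices still in play
    while len(kept) > n_select:
        reduced = [[row[j] for j in kept] for row in rows]
        weights = fit_linear(reduced, target)
        importances = [abs(w) for w in weights]   # assumption: |weight|
        weakest = importances.index(min(importances))
        del kept[weakest]
    return [names[j] for j in kept]


# Synthetic, already-scaled data: the target leans on metric_a and metric_c.
a = [1, 1, 1, 1, -1, -1, -1, -1]
b = [1, -1, 1, -1, 1, -1, 1, -1]
c = [1, 1, -1, -1, 1, 1, -1, -1]
rows = [list(r) for r in zip(a, b, c)]
target = [3 * x + 0.5 * y + 2 * z for x, y, z in rows]
names = ["metric_a", "metric_b", "metric_c"]

print(recursive_feature_elimination(rows, target, names, n_select=2))
# ['metric_a', 'metric_c']
```

The importances are recomputed on the reduced set at every round, which is what distinguishes this from simply ranking all attributes once and keeping the top few.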

3. The last step involves multiplying the importance (weight) of each feature with its value and then aggregating the weighted value set. This way, we get the final points corresponding to each player which are then sorted in decreasing order to get the Most Valuable Players in a particular.
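This final aggregation step can be sketched as a weighted sum followed by a sort. The weights and normalised scores below are invented for illustration and are not the values from our analysis; the function name `mvpi_ranking` and the player labels are likewise hypothetical.

```python
def mvpi_ranking(scores, weights):
    """Return (player, points) pairs sorted from most to least valuable.

    scores:  dict of player -> dict of feature -> normalised score
    weights: dict of feature -> importance weight
    """
    totals = {
        player: sum(weights[f] * feats[f] for f in weights)
        for player, feats in scores.items()
    }
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


# Invented weights and scores, purely for illustration.
weights = {"fast_scoring": 0.5, "consistency": 0.3, "finisher": 0.2}
scores = {
    "P1": {"fast_scoring": 0.75, "consistency": 0.50, "finisher": 0.00},
    "P2": {"fast_scoring": 0.50, "consistency": 0.75, "finisher": 0.50},
    "P3": {"fast_scoring": 0.00, "consistency": 0.25, "finisher": 0.75},
}

for player, points in mvpi_ranking(scores, weights):
    print(f"{player}: {points:.3f}")
```

Only the features that carry a weight contribute to the total, so any attribute dropped during feature selection is simply left out of `weights`.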

Improved Viewer Engagement

Did you know? A new app by Chennai Super Kings (CSK) has features like news, views, match analysis, fan zone and much more, primarily aimed at leveraging high-end analytics to garner viewer-engagement right through the mobile interface.

CSK CEO KS Viswanathan, who inaugurated the app, told The Hindu, "To give back to the fans, CSK has been exploring ways and means of bringing fans closer to the team, be it via the official merchandise or tickets or being able to follow their Lions."

Viewer engagement becomes critical in a high-stakes league like the IPL, where teams and players vie for sponsor money. As teams push the match experience (match analysis, views, news, merchandise, live tickets and so on) through mobile applications, monetising sponsorships is bound to become a major driver of big data insights in cricket.

After all, it is only in recent years that Indian cricket lovers have been given an inside view of what happens behind the scenes in team selection, training and match preparation. Big data, artificial intelligence and analytics have become the deal-makers and deal-breakers in league cricket across the world, and the IPL is a brilliant example of how data consumption in Indian cricket has come of age.

Disclaimer: Please note that the opinions expressed in this article are purely based on analysis of data available with us at the time of such analysis and are not intended to promote/defame any player or any other entity.

The author is the CEO of FORMCEPT, a six-year-old, Series A funded start-up, in the Cognitive/AI/IoT space, selling enterprise analytics solutions in India and abroad.