Leicester's Tale: Another Perspective on the EPL 2015/16 Through Expected Goals (xG) Modelling
Probabilistic modeling is an effective tool for evaluating team performance and predicting outcomes in sports. However, an important question that hasn't been fully explored is whether these models can reliably reflect actual performance while assign…
Authors: Sheikh Badar Ud Din Tahir, Leonardo Egidi, Nicola Torelli
Sheikh Badar Ud Din Tahir*, Leonardo Egidi, Nicola Torelli
Department of Economics, Business, Mathematics and Statistics "Bruno de Finetti", University of Trieste, Trieste, 34127, Italy.

Abstract

Probabilistic modeling is an effective tool for evaluating team performance and predicting outcomes in sports. However, an important question that has not been fully explored is whether these models can reliably reflect actual performance while assigning meaningful probabilities to rare results that differ greatly from expectations. In this study, we create an inference-based probabilistic framework built on expected goals (xG). This framework converts shot-level event data into season-level simulations of points, rankings, and outcome probabilities. Using the English Premier League 2015/16 season as a case study, we demonstrate that the framework captures the overall structure of the league table. It correctly identifies the top-four contenders and relegation candidates while explaining a significant portion of the variance in final points and ranks. In a full-season evaluation, the model assigns a low probability to extreme outcomes, particularly Leicester City's historic title win, which stands out as a statistical anomaly. We then examine the ex ante inferential and early-diagnostic role of xG using only mid-season information. With first-half data, we simulate the rest of the season and show that teams with stronger mid-season xG profiles tend to earn more points in the second half, even after accounting for their current league position. In this mid-season assessment, Leicester City ranks among the top teams by xG and is given a small but noteworthy chance of winning the league. This suggests that their ultimate success was unlikely but not entirely detached from their actual performance.
Our analysis indicates that expected goals models work best as probabilistic baselines for analysis and early-warning diagnostics, rather than as certain predictors of rare season outcomes.

Keywords: Expected Goals; Football; Poisson Models; Sports Analytics; Win Probability

THIS IS A PREPRINT THAT HAS NOT YET UNDERGONE PEER REVIEW.
*Corresponding author e-mail: sheikhbadaruddin.tahir@phd.units.it

Leicester's Tale: xG Modelling for EPL 2015/16 Preprint

1 Introduction

Uncertainty is a key part of all sports and plays a big role in attracting fans and keeping them engaged. To better understand and reduce this uncertainty, football has turned to data-driven methods. Football analytics marks a major change in how we interpret, evaluate, and improve the game. This drives the need for new measurements, such as expected goals (xG) (Vilela 2024; Nipoti and Schiavon 2025; Bandara et al. 2024).

Beyond performance metrics, football analytics covers a wide range of tasks that can be tackled using statistical and predictive modeling. For example, it can predict match outcomes, assess player performance, examine team strategies, and guide decisions in player recruitment and injury prevention (see, e.g., Souza et al. 2021, Skripnikov et al. 2025, Elsharkawi et al. 2025, title & Suguna 2023). Together, these applications show the broad reach and growing importance of analytics in today's game.

Expected goals (xG) models evaluate shot quality by calculating the chance of scoring based on past attempts (Spearman, 2018). In simple terms, expected goals assign a probability between 0 and 1 to every shot a team takes during a match. A value of 0 means that the shot has no chance of resulting in a goal, while 1 means that a goal is certain.
Formally, the expected goals (xG) of a team in a given match can be represented as the sum of the scoring probabilities of all its shots:

$$\mathrm{xG}_{\text{team, match}} = \sum_{i=1}^{N} p_i, \tag{1}$$

where $N$ is the number of shots taken by the team in that match, and $p_i$ is the probability that shot $i$ results in a goal. This method is more effective than a conventional goal-based metric in addressing randomness in football, as a shot is a much more frequent occurrence than a goal (Anzer & Bauer, 2021).

Historically, researchers have modeled the number of goals a team scores in a football match using statistical distributions to forecast match outcomes (Wheatcroft, 2021). For instance, goal-based approaches (Egidi & Torelli 2021, Mead, O'Hare, and McMenemy) model the number of goals scored directly using statistical distributions such as Poisson-based models. The focus is on how many goals a team or a player is likely to score. Result-based approaches, by contrast, directly model match outcomes (for example, win, draw, or loss) rather than the sequence of events that leads to them (see, e.g., Macrì Demartino et al. 2024).

Since its development, the xG metric has become ubiquitous in the world of football. The majority of top-tier football teams and betting corporations employ these statistics, including the related concepts of expected assists and post-shot expected goals. Additionally, these metrics play a critical role in player development and acquisition for organizations and in enhancing predictive models used in sports betting (Mead et al., 2023). The primary purpose of these metrics is to provide a more comprehensive assessment of player and team performance beyond just the total number of goals scored.
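As a minimal sketch of Eq. (1), the snippet below sums per-shot scoring probabilities into a match-level xG total. The function name `match_xg` and the four shot probabilities are hypothetical illustrations, not values taken from the dataset.

```python
def match_xg(shot_probabilities):
    """Sum per-shot scoring probabilities into a team's match xG (Eq. 1)."""
    for p in shot_probabilities:
        # Each p_i is a probability, so it must lie in [0, 1].
        if not 0.0 <= p <= 1.0:
            raise ValueError(f"shot probability out of range: {p}")
    return sum(shot_probabilities)


# Hypothetical example: three low-quality chances and one penalty-like chance.
shots = [0.05, 0.12, 0.30, 0.76]
print(round(match_xg(shots), 2))
```

A team with many low-probability shots and a team with one high-probability shot can end up with the same xG, which is why the total is usually read alongside the shot count.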
By quantifying shot probabilities, a team can gain an improved understanding of whether it is generating high-quality opportunities, experiencing poor finishing luck, or benefiting from favorable variance. This analytical tool has recently gained significant popularity, as the final outcome of a match does not always accurately reflect the opportunities that a team had.

Expected goals (xG) provide a more detailed picture of a team's performance in individual matches, but their broader relevance becomes clear when examined over an entire season. Discrepancies between cumulative xG and final league rankings highlight the concept of ranking uncertainty, whereby a team's position in the table may not accurately reflect its underlying performance. Even teams that consistently generate high-quality chances may underperform due to defensive errors, adverse variance, or unfavorable match dynamics. Conversely, other teams may outperform their xG by converting low-probability chances at unusually high rates. Such discrepancies expose structural limitations in point-based league standings as representations of team quality.

However, xG has limitations despite its widespread use and analytical utility. Methodological differences in data collection can yield substantially different results for identical shots when different xG models are employed. This variability requires a thorough examination of data sources and a clear understanding of each model's underlying assumptions.

Another important limitation is that the predictive power of xG values in individual matches is limited and subject to high variance. Match outcomes can deviate substantially from xG expectations due to randomness and the small sample sizes inherent in single games.
Consequently, distinguishing real performance trends from statistical noise typically requires aggregating data across multiple matches to obtain meaningful xG-based insights. These limitations highlight the importance of interpreting xG metrics within a broader framework of uncertainty modeling and probabilistic reasoning in football analysis.

Nonetheless, much of the existing literature focuses primarily on point estimates of team or player performance, with little attention to the uncertainty and variability embedded in season-level outcomes derived from xG data. In particular, the disparity between cumulative xG performance and actual league standings remains a largely unexplored area, despite its relevance for assessing performance consistency, model reliability, and result fairness. This study addresses this gap by developing a probabilistic framework to quantify ranking uncertainty and identify rare match outcomes using metrics derived from xG.

In other words, this study does not interpret xG as a deterministic predictor of final league outcomes. Instead, we distinguish between retrospective model fit and the ex ante inferential and diagnostic value of expected goals as an early-warning signal. Our central question is not whether xG can precisely anticipate rare outcomes, but whether it provides timely indications that a season may be unfolding in an unexpected direction.

Our primary objective is to conduct a systematic evaluation of how often teams deviate from their xG expectations in the final league table. We refer to this phenomenon as ranking uncertainty, where a team's final position is inconsistent with its underlying xG-derived performance indicators. To assess this, we simulate alternative league standings using xG-derived scoring intensities and compare them with the actual rankings, thereby identifying which teams overperform or underperform relative to expectations.
Furthermore, we propose a framework for identifying and characterizing rare outcomes, such as underdog victories despite low expected goals (xG) or draws featuring highly imbalanced xG distributions. These anomalies challenge conventional modeling assumptions and call for careful analysis from both probabilistic and statistical perspectives.

Moreover, while several core predictors in football analytics (e.g., shot location and distance) are well established, this study introduces a set of hand-crafted, domain-informed features designed to enhance both model interpretability and tactical relevance. These features include spatial zones, body part used, shot placement, and match context, which are incorporated into a logistic regression framework to estimate the probability that a shot results in a goal. The resulting xG values are subsequently employed in Poisson-based scoring models to simulate match outcomes, which are then aggregated into full-season simulations to examine the variability and fairness of league rankings.

This study contributes a transparent and statistically grounded analytical pipeline that:

• Quantifies ranking uncertainty across entire seasons through simulation-driven league tables based on expected goals (xG).
• Provides empirical insight into the statistical signatures of rare or inconsistent match outcomes.
• Examines the trade-off between predictive performance and data practicality when modeling imbalanced events.
• Utilizes detailed shot-level and event-level data from the English Premier League (EPL) 2015/16, presenting a practical, real-world validation of the framework.

To the best of our knowledge, this is among the first studies to formally quantify league ranking uncertainty using xG simulations, while simultaneously addressing the modeling of rare football events within a competitive league context.
In light of these considerations, the remainder of this study is organized as follows. Section 2 reviews related literature on expected goals (xG) modeling and ranking evaluation. Section 3 introduces the proposed framework for constructing a logistic regression-based expected goals model using hand-crafted shot-level features derived from match event data. It also describes the estimation of ranking distributions and title probabilities across the 2015/16 EPL season, based on an xG-driven Poisson process. Section 4 presents a comparative analysis of predicted and actual league standings, including relevant performance metrics. Finally, Section 5 concludes with a discussion of the key findings, implications, and limitations of the study.

2 Related Work

2.1 Evolution of Expected Goals (xG)

The origins of expected goals (xG) models remain somewhat ambiguous. Most sources (e.g., Rathke 2017; Herbinet 2018; Umami et al. 2021) attribute the concept's foundation to Macdonald's (2012) study of shot outcomes in ice hockey, while others (e.g., Spearman 2018) trace the term back to Green's (2012) study. At its core, the expected goals framework can be formulated as a classification problem, as it involves estimating the probability that a shot results in a goal. Consequently, a range of statistical and machine learning methods have been employed to model these probabilities, including multinomial logistic regression, gradient boosting, neural networks, support vector machines (SVM), and tree-based classification techniques (Anzer & Bauer, 2021).

2.2 Features Used in xG Modeling

Most features in these models are drawn from in-game data, which can generally be grouped into two categories: positional data and event-based data.
Event-based data includes actions such as passes, duels, penalties, and shots, along with other on-field events recorded during a match. Each event is typically described by a set of variables, for example the pitch location (x and y coordinates), the area where the action was completed (such as the target of a pass or shot), the player involved, the match context, the outcome (success or failure), and a range of other situational details.

Previous research has examined major football championships and national leagues to understand how goals are scored (Kubayi, 2020; Liu et al., 2015). In these studies, researchers analysed variables such as league ranking (top, mid, or lower-tier clubs), match outcome (win, draw, or loss), venue (home or away), and tactical system. In another study, Rathke (2017) suggested that presenting players with their xG values could improve attacking performance by highlighting the most effective shot locations and techniques. The study found that both shooting distance and angle significantly influence the probability of scoring a goal. Similarly, Spearman (2018) examined the effects of distance and angle on shooting outcomes and emphasised that shot location is a critical feature to incorporate into predictive models. Given its significance in explaining the variability of goal probability, shot location remains a central variable in nearly all studies related to expected goals (e.g., Kharrat et al. 2020, Brechot & Flepp 2020).

Another commonly discussed feature in the literature is shot type, which provides contextual information about the attempt and can be divided into two subcategories. The first subcategory relates to the body part used for the shot (left or right foot, head, or another part). The second concerns the game situation in which the shot occurs. Depending on the model, this may include open play, counterattack, free kick, or penalty kick.
Brechot and Flepp (2020) incorporated these features into their xG model and found that both factors significantly affect shot outcomes. Specifically, they reported that shots taken from a free kick are more likely to result in goals, and penalties even more so, whereas headers are considerably less likely to do so compared with shots taken in open play with either foot.

Beyond these technical and spatial features, recent studies have emphasized the importance of incorporating dynamic, match-level context into xG modeling. One such factor, match status, has been identified as a critical variable in explaining possession patterns throughout the match, as teams' strategies evolve with the scoreline. For instance, a losing-match status is often associated with increased ball possession as teams seek to regain control and build sustained attacks through indirect play. Conversely, a winning-match status typically corresponds to reduced possession, reflecting a strategic preference for direct play or counterattacking (Wang et al., 2022). Furthermore, match status also affects possession by field zone. Lago (2009) found that possession time in the offensive third was longer when a team was losing than when it was winning or drawing. Nevertheless, due to factors such as limited sample size, inconsistent analytical frameworks, and the inherent complexity of football as a dynamic and unpredictable sport, much of the existing literature reports inconclusive findings regarding these relationships.

2.3 League-Level xG Modeling and Simulation

The English Premier League (EPL) is widely regarded as one of the most competitive and financially valuable football leagues in the world (Cox and Philippou 2022).
The 2015/16 EPL season, however, was highly atypical: Leicester City, a club that had narrowly avoided relegation the previous year and began the campaign with bookmakers' odds of 5000-1, remarkably secured the league title. Several factors contributed to this outcome, including an exceptionally efficient conversion rate, a robust defensive organization, and consistent performances against higher-resourced opponents. This unprecedented achievement not only captured global attention but also challenged conventional assumptions about the determinants of success in football. In particular, Leicester's triumph underscores the limitations of conventional point-based standings in accurately reflecting team performance.

Recent studies have introduced Poisson-based modeling frameworks for simulating match outcomes using xG inputs or goal expectations (e.g., Eggels et al., 2016; Nguyen, 2021). However, these works have mainly focused on predicting individual match results or final point totals, without examining their relationship to full league standings or identifying systematic over- and underperformance across an entire season. This study addresses this gap by posing the following research question: How accurately can expected goals (xG)-based models simulate full-season league standings, and what insights do they provide into ranking uncertainty and rare football outcomes in the English Premier League?

3 Material and Methods

This study addresses these research objectives by developing a statistical framework that leverages xG-based inference to simulate season-long performance, analyze ranking distributions, and compare predicted versus actual league tables.
Unlike previous approaches that emphasize single-match forecasting, our study quantifies ranking uncertainty, identifies performance discrepancies, and evaluates the reliability and fairness of xG-driven assessments in a full-season competitive context. Figure 1 illustrates the architectural pipeline of the proposed xG-based framework for season-level ranking analysis.

Fig. 1: End-to-end xG-based framework for season-level ranking simulation and inference

3.1 Data Source

The research utilizes the English Premier League (EPL) 2015/16 football dataset (Secărean, 2024). The data comprises two main components:

• Match-level data: Each observation represents a single match and includes attributes such as home and away team names, final scores, match date, and betting odds.
• Event-level data: This dataset contains detailed, timestamped in-game events such as shots, fouls, and passes. Each event is linked to a match via a unique match ID (id_odsp).

In this study, we used a subset of variables relevant to modeling and simulation, summarized in Table 1. Variables not included in the analysis (e.g., secondary event types, redundant timestamps) were excluded to reduce data complexity and improve model efficiency.

Table 1: Description and type of the variables considered in this study.

| Variable | Description | Type |
|---|---|---|
| id_odsp | Unique match ID used to join match and event data | Categorical (ID) |
| event_team | Team taking the shot | Categorical |
| side | Home (1) or Away (2) indicator | Binary |
| is_goal | Indicator if the shot resulted in a goal | Binary |
| location | Coded pitch zone where the shot occurred | Categorical |
| shot_place | Shot placement (e.g., bottom left, top right) | Categorical |
| bodypart | Body part used for the shot (e.g., left foot, head) | Categorical |
| situation | Match context: open play, corner, free kick | Categorical |
| assist_method | Type of assist (pass, cross, etc.) | Categorical |
| fast_break | Whether the shot came from a counterattack | Binary |
| ht, at | Home and away team names | Categorical |
| fthg, ftag | Full-time home and away goals | Numeric |

Furthermore, this study analyzes all 380 matches from the EPL 2015/16 season, comprising 20 teams competing over 38 rounds. Event-level data were collected for both teams in each match, resulting in 760 team-level observations. A total of 975 goals were recorded across the season, and the matches produced 172 home wins, 93 draws, and 115 away wins. We use this data as the basis for full-season simulations, in which xG-driven Poisson scoring models are employed to evaluate ranking uncertainty and performance deviations.

3.2 Feature Engineering

To estimate the probability that a shot results in a goal (i.e., expected goals or xG), several shot-level features were extracted and transformed from the event dataset. The data were first filtered to include only shot events (event_type == 1) and complete cases for all relevant predictors. The following categorical predictors were engineered and incorporated into the logistic regression model:

• location: Pitch areas were grouped into distance-based zones, such as close range, medium range, outside box, and long range. A more granular version was also tested, distinguishing centre of the box from side of the box.
• shot_place: The intended direction of the shot (e.g., top right corner, bottom left) was encoded as a categorical variable to capture accuracy and targeting behavior.
• bodypart: The body part used to take the shot (right foot, left foot, or head) was included due to its known effect on scoring likelihood.
• situation: The match context in which the shot occurred, such as open play, set piece, corner, or free kick.
• assist_method: The type of assist preceding the shot, including pass, cross, through ball, or header.
• fast_break: A binary indicator denoting whether the shot followed a counterattack.

Additionally, a new categorical feature, distance_zone, was derived from the location variable to represent spatial shooting zones. This feature groups shot locations into distance-based categories to enhance interpretability and capture spatial effects on goal probability. To further explore potential dependencies between spatial and biomechanical variables, we computed interaction terms (e.g., distance_zone * bodypart). All categorical variables were converted to factors and included directly in the model, since the glm() function handles categorical encoding internally. These features were selected based on prior empirical findings and their domain relevance, and were used as inputs in a logistic regression model for estimating xG at the shot level. Table 2 outlines the feature engineering steps used to define the distance_zone predictor.

Table 2: Distance Zone Feature Engineering Steps

| Step | Task | Details |
|---|---|---|
| 1 | Input Data | Start with filtered shot events (event_type == 1). |
| 2 | Identify Location | Use the location variable from event data (coded 1–19). |
| 3 | Define Zones | Group location codes into categories based on proximity to goal. |
| 4 | Zone Mapping | Close Range: 10, 12, 13, 14; Medium Range: 3, 9, 11; Outside Box: 15, 16; Long Range: 17, 18; Other: all remaining codes. |
| 5 | Create New Variable | Assign new variable distance_zone based on the above mapping. |

3.3 Model Development and xG Estimation

The central objective of this modeling step was to estimate the probability that a shot results in a goal, commonly referred to as expected goals (xG). We formulated this as a binary classification problem, where each shot is labeled either as a goal ($y_i = 1$) or not ($y_i = 0$). To model this probability, we applied logistic regression, a generalized linear model with a binomial distribution and logit link function.

The logistic regression estimates the probability $\hat{p}_i$ that a shot $i$ results in a goal, based on a linear combination of categorical predictors. The model can be expressed as shown in Eq. (2):

$$\operatorname{logit}(p) = \log\left(\frac{p}{1-p}\right) = \beta_0 + \sum_{j=1}^{k} \beta_j X_j, \tag{2}$$

where:
• $p$: probability that a shot is converted into a goal (expected goal, xG)
• $X_j$: $j$-th predictor variable
• $\beta_j$: coefficient corresponding to the $j$-th predictor
• $\beta_0$: intercept term

Table 3 presents the definitions of the outcome, covariates, and model parameters used in the analysis.

Table 3: Definitions of outcome, covariates, and model parameters.

| Symbol | Definition |
|---|---|
| $y_i$ | Observed outcome of shot $i$; 1 = goal, 0 = no goal |
| $p_i$ | Probability of a goal (expected goals, xG) for shot $i$ |
| $\operatorname{logit}(p_i)$ | Log-odds transformation of the probability |
| $\eta_i$ | Linear predictor (sum of weighted predictors) for shot $i$ |
| $\beta_0$ | Intercept term in the regression model |
| $\beta_j$ | Coefficient for the $j$-th feature (e.g., body part, location) |
| $X_{ij}$ | Value of the $j$-th feature for shot $i$ |

3.3.1 Model 1: Base Model with Distance Zones

The base specification models the log-odds of a shot resulting in a goal as a linear function of categorical covariates that describe the shot context.
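The distance-zone grouping used as a covariate in this specification follows the mapping in Table 2 and can be sketched as below. The location codes and zone labels are those listed in Table 2. The names `ZONE_CODES` and `distance_zone` are hypothetical, and the snippet is an illustrative Python translation rather than the authors' own implementation.

```python
# Location codes per zone, copied from the Zone Mapping row of Table 2.
ZONE_CODES = {
    "Close Range": {10, 12, 13, 14},
    "Medium Range": {3, 9, 11},
    "Outside Box": {15, 16},
    "Long Range": {17, 18},
}


def distance_zone(location_code):
    """Map a raw location code (1-19) to its distance-based zone label."""
    for zone, codes in ZONE_CODES.items():
        if location_code in codes:
            return zone
    # Table 2 assigns every remaining code to a catch-all category.
    return "Other"


# Hypothetical sequence of shot location codes.
print([distance_zone(code) for code in (13, 3, 15, 18, 1)])
```

Keeping the mapping in a single dictionary makes it straightforward to swap in the finer-grained zones without touching the rest of the pipeline.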
The covariates include distance_zone (close, medium, outside box, long range), an engineered feature that groups pitch locations by approximate shooting distance (see Eq. (3)). Additional predictors include:

• shot_place (e.g., bottom left, top right), capturing the intended target area.
• bodypart (right foot, left foot, head), representing the biomechanics of finishing.
• situation (open play, corner, free kick, penalty), reflecting match context.
• assist_method (pass, cross, rebound, etc.), describing chance creation.
• fast_break (yes/no), indicating counterattack situations.

$$\operatorname{logit}(p) = \beta_0 + \beta_1 \cdot \text{distance\_zone} + \beta_2 \cdot \text{shot\_place} + \beta_3 \cdot \text{bodypart} + \beta_4 \cdot \text{situation} + \beta_5 \cdot \text{assist\_method} + \beta_6 \cdot \text{fast\_break}. \tag{3}$$

3.3.2 Model 2: Interaction Model (Distance Zone × Body Part)

To capture biomechanical interactions between spatial context and finishing technique, Model 2 extends the base specification by including an interaction between distance zone and body part. This interaction allows the model to reflect, for example, that headers are more effective from close range but rarely successful from long distances. The model is expressed as:

$$\operatorname{logit}(p) = \beta_0 + \beta_1 \cdot \text{distance\_zone} + \beta_2 \cdot \text{shot\_place} + \beta_3 \cdot \text{bodypart} + \beta_4 \cdot \text{situation} + \beta_5 \cdot \text{assist\_method} + \beta_6 \cdot \text{fast\_break} + \beta_7 \cdot (\text{distance\_zone} \times \text{bodypart}). \tag{4}$$

3.3.3 Model 3: Granular Spatial Zones

The third model increases spatial resolution by replacing the coarse distance zones with a more detailed categorical variable, distance_zone_granular (see Eq. (5)). This feature distinguishes finer pitch locations such as the centre of the box, side of the box, penalty spot, and very close range. The objective is to assess whether greater spatial granularity improves predictive performance relative to simpler distance groupings.
$$\operatorname{logit}(p) = \beta_0 + \beta_1 \cdot \text{distance\_zone\_granular} + \beta_2 \cdot \text{shot\_place} + \beta_3 \cdot \text{bodypart} + \beta_4 \cdot \text{situation} + \beta_5 \cdot \text{assist\_method} + \beta_6 \cdot \text{fast\_break}. \tag{5}$$

Model performance was compared across all three specifications using the Akaike Information Criterion (AIC), residual deviance, and the number of predictors used (see Table 4).

Table 4: Model comparison by AIC, residual deviance, and number of predictors.

| Model | AIC | Residual Deviance | Num. Predictors |
|---|---|---|---|
| Base (distance_zone) | 5354.47 | 5334.47 | 10 |
| With Interaction | 5357.31 | 5329.31 | 14 |
| Granular Zones | 5261.82 | 5239.82 | 11 |

Among the three xG models, the Granular Spatial Zones model achieved the lowest AIC and residual deviance (ΔAIC ≥ 90 relative to both alternatives), indicating the best parsimony-adjusted fit.

3.3.4 Model Prediction and xG Assignment

The logistic regression model was trained using the selected optimal covariates, and the parameters were estimated by maximum likelihood estimation (MLE). The fitted model was subsequently applied to compute the xG value for each shot, defined as the model-predicted probability that a given attempt results in a goal, conditional on its characteristics.

The fitted logistic regression model produces a predicted probability $\hat{p}_i$ for each shot $i$, given its input features (see Eqs. (6)–(7)), computed using the inverse logit (sigmoid) function applied to the linear predictor $\eta_i$:

$$\hat{p}_i = \frac{1}{1 + \exp(-\eta_i)}, \tag{6}$$

$$\eta_i = \hat{\beta}_0 + \sum_{j=1}^{k} \hat{\beta}_j X_{ij}, \tag{7}$$

where:
• $\hat{p}_i$: estimated probability of a goal (i.e., xG value).
• $\eta_i$: linear predictor combining model coefficients and input features.
• $\hat{\beta}_0$: model intercept.
• $\hat{\beta}_j$: estimated coefficient for predictor $X_j$.
• $X_{ij}$: value of the $j$-th feature for shot $i$.

The simulation framework is summarized in Table 5.
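The prediction step in Eqs. (6)–(7) can be sketched in a few lines. The function name `predict_xg`, the dummy-variable names, and all coefficient values below are hypothetical placeholders chosen for illustration. They are not the fitted estimates from the model described above.

```python
import math


def predict_xg(intercept, coefficients, features):
    """Inverse-logit prediction for a single shot (Eqs. 6-7)."""
    # Eq. (7): linear predictor = intercept + sum of coefficient * feature value.
    eta = intercept + sum(coefficients[name] * value
                          for name, value in features.items())
    # Eq. (6): the sigmoid maps log-odds to a probability in (0, 1).
    return 1.0 / (1.0 + math.exp(-eta))


# Hypothetical coefficients for two dummy-coded categorical levels.
coefficients = {"zone_close_range": 1.2, "bodypart_head": -0.6}
# A close-range header: both dummy variables are switched on.
shot = {"zone_close_range": 1, "bodypart_head": 1}
print(round(predict_xg(-2.0, coefficients, shot), 3))
```

Because categorical predictors enter as 0/1 dummy variables, each coefficient shifts the log-odds relative to a reference level, and the intercept alone gives the xG of a shot with every predictor at its reference category.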
Table 5: Ex ante simulation framework for season outcome inference

| Stage | Component | Specification |
|---|---|---|
| Preparation | Temporal split | Split into first-half and second-half fixtures. |
| Preparation | xG estimation | Shot-level logistic model → match-level xG for first-half matches only. |
| Preparation | Team strength | Aggregate first-half xG to team-level attack (xG for) and defense (xG against). |
| Preparation | Normalization | Convert xG totals to per-match rates; normalize by league means. |
| Simulation | Poisson intensities | Opponent-adjusted scoring rates $\lambda_{\text{home}}$, $\lambda_{\text{away}}$ based on attack and defense. |
| Simulation | Goal generation | $G^{h}_{m} \sim \mathrm{Poisson}(\lambda^{h}_{m})$, $G^{a}_{m} \sim \mathrm{Poisson}(\lambda^{a}_{m})$. |
| Simulation | Standings update | Assign points and goals; combine simulated second half with realised first half. |
| Inference | Ranking | Rank teams per simulation using league tiebreakers. |
| Inference | Outcome probabilities | Compute E[Points], E[Rank], P(Title), P(Top4), P(Releg). |

4 Results

This section evaluates the accuracy of the xG-based model in simulating team performance and capturing ranking uncertainty during the 2015/16 EPL season. We present statistical comparisons, visualizations of predicted versus actual values, and uncertainty quantification across 1,000 simulations.

4.1 Mid-season xG as an Early-warning Signal

To assess whether expected goals provide early diagnostic information, we analyze team performance at the mid-season point of the campaign. Teams are grouped according to cumulative first-half xG, and their subsequent second-half performance is examined. Figure 2 shows a clear monotonic relationship between first-half xG quartiles and second-half point accumulation, indicating that xG contains systematic information about subsequent performance.

Figure 2: Teams grouped into quartiles by mid-season xG.
Importantly, this relationship persists even after controlling for mid-season points (see Figure 3), suggesting that xG captures dimensions of underlying performance not fully reflected in the league table at that stage.

Figure 3: Residualised second-half points vs first-half xG.

At mid-season, Leicester City sat 2nd in the league table, with Arsenal leading. By xG, however, Leicester ranked only fourth at mid-season, despite their second place on points (see Figure 4).

Figure 4: Mid-season rank table: realised outcomes vs underlying performance

The mid-season comparison shows a moderate discrepancy between points-based and xG-based rankings and reveals that Leicester's subsequent success was not entirely detached from their underlying performance, even though their eventual championship remained a highly improbable outcome under the model (see Appendix C).

4.2 Mid-Season xG and Ex Ante Outcome Probabilities

To further assess the ex ante inferential value of expected goals, we simulate the remainder of the season using only information available at the mid-season point. Specifically, team attacking strength is derived from cumulative first-half xG, which is converted into a per-match scoring rate and used to simulate second-half match outcomes. First-half realised points are carried forward, while all second-half results are generated probabilistically.

4.2.1 Mid-Season Diagnostic Signal: xG versus Realised Performance

We begin by examining the relationship between expected goals and realised league outcomes at mid-season. Figure 5 compares team rankings based on cumulative first-half xG with rankings based on realised points after 19 matches.
Teams close to the diagonal exhibit consistency between underlying performance and results, whereas deviations indicate over- or underperformance relative to xG. Several teams occupying top positions in the points table, such as Arsenal and Manchester City, also rank highly in xG, suggesting that their league positions were supported by strong underlying performance. Leicester City, who were second in the league at mid-season, likewise ranked highly in cumulative xG, indicating that their early success was not purely anomalous. By contrast, some teams accumulated points disproportionate to their xG, signalling potential regression risk in the second half of the season.

Figure 5: Mid-season rank-gap diagnostic

4.2.2 Predicted Second-Half Points with Uncertainty

Using first-half xG only, we simulate the remaining second-half fixtures of the season 1,000 times under a parametric Poisson process calibrated on team-level attacking and defensive strength. Figure 6 presents the predicted second-half points for each team, alongside 95% confidence intervals derived from the simulation distribution.

The results indicate substantial heterogeneity in uncertainty across the league. Teams with strong and stable xG profiles, such as Arsenal, Manchester City, and Tottenham, exhibit relatively narrow confidence intervals, reflecting consistent underlying performance. In contrast, mid-table teams display wider intervals, indicating greater sensitivity to stochastic match outcomes. Overall, most teams' realised outcomes fall within the central mass of their predicted distributions, indicating that mid-season xG captures underlying performance strength.
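The simulation step described above can be sketched as follows. This is a minimal, standard-library illustration rather than the authors' implementation: the multiplicative form of the scoring rates (league mean × attack × opponent defence), the team names, and the strength values are all assumptions made for the example, and the Poisson sampler uses Knuth's method, which is adequate for the small rates typical of football.

```python
import math
import random
import statistics


def sample_poisson(lam, rng):
    """Draw one Poisson(lam) variate with Knuth's method (fine for small lam)."""
    threshold = math.exp(-lam)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def simulate_second_half(attack, defence, fixtures, league_mean, n_sims=1000, seed=0):
    """Return {team: [simulated points per run]} for the given fixtures.

    attack/defence are strengths relative to the league mean; a defence value
    above 1 means the team concedes more xG than average (assumed convention).
    """
    rng = random.Random(seed)
    sims = {team: [] for team in attack}
    for _ in range(n_sims):
        points = dict.fromkeys(attack, 0)
        for home, away in fixtures:
            goals_home = sample_poisson(league_mean * attack[home] * defence[away], rng)
            goals_away = sample_poisson(league_mean * attack[away] * defence[home], rng)
            if goals_home > goals_away:
                points[home] += 3
            elif goals_home < goals_away:
                points[away] += 3
            else:
                points[home] += 1
                points[away] += 1
        for team, pts in points.items():
            sims[team].append(pts)
    return sims


def summarise(samples):
    """Mean and approximate empirical 2.5% / 97.5% quantiles."""
    ordered = sorted(samples)
    last = len(ordered) - 1
    return statistics.mean(ordered), ordered[int(0.025 * last)], ordered[int(0.975 * last)]


# Hypothetical four-team league with made-up relative strengths.
attack = {"A": 1.3, "B": 1.1, "C": 0.9, "D": 0.7}
defence = {"A": 0.8, "B": 0.9, "C": 1.1, "D": 1.2}
fixtures = [(h, a) for h in attack for a in attack if h != a]

sims = simulate_second_half(attack, defence, fixtures, league_mean=1.35)
summary = {team: summarise(samples) for team, samples in sims.items()}
for team, (mean_pts, lower, upper) in summary.items():
    print(f"{team}: mean {mean_pts:.1f}, 95% interval [{lower}, {upper}]")
```

Adding each team's realised first-half points to every simulated total would give the full-season distributions used for ranking.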
Figure 6: Ex ante simulations conditional on mid-season xG (black is the mean simulated points, grey is the 95% interval, and orange is the realised outcome)

Leicester City's predicted second-half points lie above the league median, but well below their eventual realised performance. Importantly, their realised second-half points fall within the upper tail of the simulated distribution rather than entirely outside it, indicating that their title-winning run represents a low-probability but feasible outcome under mid-season xG information.

4.2.3 Prediction Error and Agreement at Mid-Season

To assess agreement between predicted and realised second-half outcomes, we examine the difference between simulated mean points and realised second-half points. Most teams cluster around zero, indicating that first-half xG provides a reasonable baseline for second-half performance on average. However, several notable deviations remain. Leicester City emerges as the most pronounced positive outlier, substantially exceeding their predicted mean. Conversely, teams such as Chelsea and Manchester United underperform relative to their xG-based expectations, suggesting inefficiencies or contextual factors not fully captured by chance quality alone.

These deviations underscore the probabilistic nature of the framework: xG constrains the distribution of plausible outcomes but does not eliminate variance arising from finishing efficiency, tactical adaptation, injuries, or random match events.

4.2.4 Mid-Season Rank Inference and Uncertainty

Beyond point totals, the simulation framework enables inference over league rankings. Figure 7 compares predicted average finishing positions from the mid-season simulations with the realised final standings. The model successfully distinguishes broad performance tiers, separating title contenders, mid-table teams, and relegation candidates.
Leicester City's predicted average rank at mid-season lies within the top four, indicating that their underlying performance justified contender status even if the eventual championship outcome lay in the extreme tail of the distribution.

Figure 7: Mid-Season Predicted Average Rank vs Realized Final Rank

In addition, the figure reveals that the mid-season xG simulations correctly captured league tiers, even if they did not predict the exact final ordering. Rank variability across simulations is lowest for teams at the top and bottom of the table and highest for mid-table teams, reflecting greater competitive balance and uncertainty in that region of the league.

4.2.5 Ex Ante Outcome Probabilities

A central advantage of the simulation-based approach is the ability to estimate probabilistic outcomes rather than point forecasts. Figure 8 reports the mid-season probabilities of winning the title, finishing in the top four, and being relegated for all teams. In addition, the figure illustrates how first-half expected-goals performance shapes simulated second-half outcomes. Several teams traditionally viewed as title contenders, such as Arsenal and Manchester City, retain the highest expected points and title probabilities. Leicester City also emerges as a statistically plausible contender at mid-season, with an average simulated rank of 3.1, a title probability of 16.7%, and a top-four probability exceeding 80%.

Figure 8: Second-half simulations conditional on first-half expected goals

At the bottom of the table, teams such as Aston Villa, Sunderland, and Newcastle exhibit high relegation probabilities, aligning closely with their eventual outcomes.
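Given a set of simulated final tables, the outcome probabilities discussed above reduce to simple frequencies across simulations. The sketch below is illustrative only: the synthetic season totals, the team labels, and the random tie-breaking are assumptions (the framework itself ranks teams using league tiebreakers), while the top-four and bottom-three cut-offs follow the Premier League format.

```python
import random


def outcome_probabilities(sim_points, n_top=4, n_releg=3, seed=0):
    """Estimate P(title), P(top n_top), P(relegation) and E[rank] per team.

    sim_points is a list of dicts {team: season points}, one per simulation.
    Ties on points are broken at random here, which is a simplification.
    """
    rng = random.Random(seed)
    teams = list(sim_points[0])
    n_teams, n_sims = len(teams), len(sim_points)
    counts = {t: {"title": 0, "top": 0, "releg": 0, "rank_sum": 0} for t in teams}
    for season in sim_points:
        ordered = sorted(teams, key=lambda t: (season[t], rng.random()), reverse=True)
        for rank, team in enumerate(ordered, start=1):
            tally = counts[team]
            tally["rank_sum"] += rank
            tally["title"] += rank == 1
            tally["top"] += rank <= n_top
            tally["releg"] += rank > n_teams - n_releg
    return {
        team: {
            "p_title": tally["title"] / n_sims,
            "p_top": tally["top"] / n_sims,
            "p_releg": tally["releg"] / n_sims,
            "exp_rank": tally["rank_sum"] / n_sims,
        }
        for team, tally in counts.items()
    }


# Synthetic 20-team league: hypothetical mean points with Gaussian noise.
data_rng = random.Random(42)
hypothetical_means = {f"Team{i:02d}": 80 - 2.5 * i for i in range(20)}
sim_points = [
    {team: round(data_rng.gauss(mu, 6.0)) for team, mu in hypothetical_means.items()}
    for _ in range(1000)
]

probs = outcome_probabilities(sim_points)
for team in list(probs)[:3]:
    print(team, probs[team])
```

A useful sanity check on any such output is that title probabilities sum to one across teams, top-four probabilities to four, and relegation probabilities to three.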
4.2.6 Leicester City: From Mid-Season Signal to Tail Outcome

Leicester City provides a compelling illustration of the distinction between ex ante inference and realised outcomes. At mid-season, their strong xG performance placed them among the league's top teams in terms of underlying quality. Simulations based on this information assigned them a meaningful probability of a top-four finish and a small but non-negligible chance of winning the title. Their eventual championship, however, represents an extreme realisation within the upper tail of the predicted distribution.

Taken together, the mid-season results demonstrate that expected goals are most valuable as an early-warning and uncertainty-quantification tool. They provide a disciplined baseline for inference under partial information, while explicitly allowing for the rare but consequential events that define football seasons.

4.3 Ex Post Benchmark: Full-Season xG Simulations

For completeness, we evaluate the simulation framework retrospectively using full-season xG inputs. Because this benchmark conditions on information unavailable at mid-season, it represents a best-case upper bound on achievable model fit rather than a feasible real-time forecast.

Overall, the framework reproduces the broad structure of the league table, with strong rank correlations and moderate point-level errors (see Table 6 in Appendix A). Simulated point totals typically fall within two to three match outcomes of realised season totals, while average ranking errors are approximately two to three league positions (MAE 2.2, RMSE 2.9).

However, extreme outcomes, most notably Leicester City's championship, remain difficult to anticipate even under full-season information.
Accordingly, this ex post benchmark serves primarily as a reference point against which the mid-season ex ante results should be interpreted, reinforcing the role of xG as a probabilistic baseline and early-warning diagnostic rather than a deterministic predictor.

4.4 Ranking Uncertainty and Simulation Spread

To quantify the variation in league rankings that may arise from the inherent randomness of match outcomes, even when underlying xG performance remains consistent, we analyzed the distribution and variance of team ranks across 1,000 full-season simulations. This ranking variability captures the uncertainty in seasonal outcomes and highlights which teams exhibited stable versus variable predicted performance. We also check that team rank distributions converge as the number of simulations increases, rather than fluctuating erratically due to sample size limitations. The results demonstrate that:

• For most teams, the simulated rank values converge early, with only minimal shifts observed after 500 simulations (see Appendix B).
• Top- and bottom-ranked teams, such as Arsenal, Manchester City, Aston Villa, and Norwich, exhibit particularly stable rank curves, indicating low variance in their simulated season outcomes.
• Mid-table teams show slightly greater variation in the early iterations but still stabilize by around 750–1,000 iterations, confirming that the sample size is sufficient for reliable predictions.

4.5 Predicted Title, Top-4, and Relegation Probabilities (Mid-Season Ex Ante)

A major strength of probabilistic simulation is its ability to quantify uncertainty in team outcomes under partial information. Beyond average rank or points, these simulations allow us to estimate the likelihood that a team achieves critical league milestones such as winning the title, qualifying for the top four, or being relegated.
All results reported in this section are generated ex ante at mid-season, using expected-goals information from the first half of the season only.

Figure 9 presents the mid-season outcome probabilities of winning the title, finishing in the top four, and being relegated for all 20 EPL teams in the 2015/16 season. Arsenal entered the mid-season point with the highest title probability (49.0%), followed by Manchester City (26.6%), while Tottenham stood at 5.4%, confirming their status as strong xG-based contenders at that stage of the campaign.

In contrast to narratives based solely on league position, Leicester City also emerges as a statistically plausible contender at mid-season. The model assigns Leicester an average simulated rank of 3.12, a title probability of 16.7%, and a top-four probability exceeding 80%. While these values are lower than those of Arsenal and Manchester City, they indicate that Leicester's underlying performance supported contender status, rather than representing a purely anomalous overperformance.

At the lower end of the table, teams such as Aston Villa, Sunderland, and Newcastle United exhibit the highest predicted probabilities of relegation, reflecting persistently weak xG profiles at mid-season. In particular, Aston Villa's relegation probability exceeds 90%, while both Sunderland and Newcastle face probabilities above 50%, aligning closely with their subsequent season trajectories.

Overall, the aggregate mid-season simulation results demonstrate that expected goals provide a meaningful early-warning signal for both positive and negative season outcomes. These findings underscore the role of randomness in football and highlight the value of expected goals as a probabilistic baseline for inference and uncertainty quantification, rather than as a deterministic predictor of rare season outcomes.
Figure 9: Mid-season outcome probabilities based on xG simulations: title hopes, top-4 chances, and relegation risks

4.5.1 Leicester City: A Tail Outcome Under Mid-Season Inference

Leicester City's 2015/16 championship represents the most prominent extreme realisation in our mid-season simulation framework. Conditioning exclusively on first-half expected goals, Leicester achieved an average simulated finishing position within the top four and was assigned a non-negligible probability of winning the title. While this probability remained substantially lower than that of traditional contenders such as Arsenal and Manchester City, it was clearly non-zero, reflecting Leicester's strong underlying performance by mid-season. Their eventual title win therefore constitutes a low-probability but feasible outcome under the ex ante predictive distribution implied by expected goals.

4.5.2 Relative Over- and Under-Performance at Mid-Season

Beyond Leicester City, the mid-season simulations highlight several teams whose realised outcomes diverged meaningfully from their xG-based expectations. Manchester United and Chelsea, in particular, underperformed relative to their predicted second-half points and ranks, despite generating competitive first-half xG profiles. These deviations suggest inefficiencies in chance conversion, defensive execution, or contextual factors such as injuries and tactical instability that are not fully captured by xG alone.

By contrast, teams such as Arsenal and Manchester City exhibited realised outcomes broadly consistent with their mid-season expectations, reinforcing the model's ability to distinguish genuine contenders from mid-table uncertainty.
Overall, these patterns underscore the probabilistic nature of the framework: expected goals constrain the range of likely outcomes and identify structural strengths, but do not preclude meaningful deviations driven by stochastic match events and executional variance.

5 Discussion and Further Research

5.1 Interpretation of Simulation Results

The mid-season xG-based inference framework demonstrates a strong ability to recover the broad structure of the league under partial information. Conditioning exclusively on first-half expected goals, the simulations correctly identify the principal contenders for top-four positions, as well as teams facing elevated relegation risk. This indicates that expected goals provide a meaningful early diagnostic signal of underlying team quality before final outcomes are realised.

At the same time, the results highlight the inherent limitations of xG when applied to rare or extreme outcomes. Leicester City, despite exhibiting a strong first-half xG profile and ranking among the leading teams at mid-season, was assigned a relatively modest title probability. Their eventual championship therefore represents a low-probability realisation from the upper tail of the simulated outcome distribution rather than a contradiction of the model's expectations. This distinction is critical: under mid-season information, Leicester's success was unlikely, but not implausible.

More generally, deviations between simulated expectations and realised outcomes underscore the role of factors not explicitly captured by shot-based models. Persistent finishing efficiency, defensive organisation, tactical adaptation, squad rotation, and injury dynamics may all amplify or suppress realised performance relative to xG-based baselines. As such, expected goals should be interpreted as constraining the range of plausible outcomes rather than deterministically predicting final league positions.
Conversely, several teams traditionally viewed as title contenders, most notably Arsenal and Manchester City, retained the highest mid-season title probabilities, consistent with their strong underlying xG profiles. While their realised outcomes fell short of simulated expectations, these discrepancies further illustrate the influence of stochastic variation and contextual factors over the second half of the season.

Taken together, the mid-season results reinforce the central contribution of this study: expected goals are most informative as an ex ante inference tool and early-warning signal, capable of identifying structural strength and quantifying uncertainty, while explicitly allowing for rare but consequential deviations that define football seasons.

5.2 Model Limitations

Despite the utility of xG modeling, several limitations must be acknowledged:

• Logistic regression treats all shots as independent events, ignoring temporal or tactical sequences. While this assumption is standard in shot-based xG modeling, it may underrepresent clustering effects such as sustained pressure or momentum within matches.
• Due to data availability constraints, the model conditions on shot characteristics alone and does not explicitly incorporate defensive pressure, goalkeeper positioning, or team shape, all of which may influence conversion probability.
• The same xG value is assigned regardless of individual player finishing ability, potentially obscuring systematic differences, for example between elite attackers and more defensively oriented players.
• Team-specific styles (e.g., compact defense, high pressing) are not incorporated, yet they significantly influence match outcomes.
• Poisson-based simulation captures the inherent randomness of goal scoring, but in small samples, it can exaggerate extreme outcomes.
For example, a team with xG ≈ 1 may occasionally be simulated to score 4–5 goals, inflating variance and assigning non-negligible probabilities to improbable events. While these effects average out over thousands of runs, they partly explain why the model may understate favorites and overstate long shots. This behavior reinforces the interpretation of the simulations as probabilistic baselines rather than precise generators of individual season outcomes.

5.3 Further Work

To enhance predictive power and model practicality, several improvements are needed, such as:

• Incorporating player-specific scoring ability and finishing efficiency to personalize xG assignments.
• Adding contextual features such as defensive pressure, number of defenders, and goalkeeper positioning to better reflect shot difficulty.
• Employing sequence-aware models (e.g., RNNs or attention mechanisms) to capture momentum and build-up play.
• Assigning weights for team style, pace, and pressing intensity to more accurately simulate league dynamics.
• Leveraging advanced approaches such as Bayesian methods or ensemble learners to improve performance.
• Extending the framework to other leagues (e.g., Serie A, La Liga) to assess model transferability.

Appendix A: Ex Post Benchmark Using Full-Season xG

For completeness, we report a retrospective benchmark in which the simulation framework is conditioned on full-season expected goals. Because this analysis uses information unavailable at mid-season, it does not constitute a feasible ex ante forecast. Instead, it provides an upper bound on achievable model fit, against which the mid-season inference results in Section 4 can be contrasted. Table 6 reports rank- and point-level agreement between simulated and realized outcomes under this full-information setting.
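The agreement metrics used in this benchmark are standard and can be computed as in the sketch below. The five-team input vectors are made-up numbers for illustration, not values from the study, and the helper names are hypothetical. The reported R² values are close to the squared Pearson correlations (0.871² ≈ 0.759 against a reported 0.758), so the sketch takes R² to be r², which is an inference from the table rather than a stated definition.

```python
import math
from statistics import mean


def pearson(x, y):
    """Pearson linear correlation coefficient."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return cov / math.sqrt(sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y))


def average_ranks(values):
    """1-based ascending ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=values.__getitem__)
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman's rho: Pearson correlation of the rank-transformed data."""
    return pearson(average_ranks(x), average_ranks(y))


def rmse(pred, actual):
    return math.sqrt(mean((p - a) ** 2 for p, a in zip(pred, actual)))


def mae(pred, actual):
    return mean(abs(p - a) for p, a in zip(pred, actual))


# Made-up simulated vs realised season points for five hypothetical teams.
simulated = [70, 65, 50, 45, 38]
realised = [81, 66, 47, 50, 37]

r = pearson(simulated, realised)
rho = spearman(simulated, realised)
print(f"Spearman rho = {rho:.3f}, Pearson r = {r:.3f}, R^2 (as r^2) = {r * r:.3f}")
print(f"RMSE = {rmse(simulated, realised):.3f}, MAE = {mae(simulated, realised):.3f}")
```

Applying the same functions to predicted and realised ranks, rather than points, gives the rank-level counterparts of each metric.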
Table 6: Ex post benchmark: full-season xG fit (upper bound)

| Metric | Definition | Points | Ranks |
|---|---|---|---|
| Spearman's ρ | Rank correlation (monotonic association) | 0.837 | 0.824 |
| Pearson's r | Linear correlation coefficient | 0.871 | 0.862 |
| R² | Proportion of variance explained | 0.758 | 0.744 |
| RMSE | Root Mean Squared Error | 7.624 | 2.922 |
| MAE | Mean Absolute Error | 6.102 | 2.220 |

Appendix B: Convergence of average team ranks with increasing simulations

Figure 10 shows that the predicted average ranks converge quickly as the number of simulations increases, indicating that the expected outcomes are stable across teams.

Figure 10: Convergence of average predicted ranks with increasing simulations

Appendix C: Leicester City: A Descriptive Illustration

Figure 11 shows that Leicester accumulated points at a faster rate than their underlying xG would suggest, indicating early overperformance relative to chance creation. This descriptive trajectory complements the league-level analysis by providing an intuitive illustration of Leicester's divergence between realised outcomes and underlying performance.

Figure 11: Cumulative xG vs points (first half)

Acknowledgements. Language refinement for this manuscript was supported by Overleaf's editing tools and Grammarly.

Data Availability

The datasets supporting the conclusions of this article are available in the following public repositories:

• Kaggle: https://www.kaggle.com/datasets/secareanualin/football-events
• GitHub (StatsBomb Open Data): https://github.com/statsbomb/open-data
• StatsBomb Resource Centre: https://statsbomb.com/resource-centre/
• Hudl StatsBomb Platform: https://www.hudl.com/en_gb/products/statsbomb

Funding

This work has been supported by the project “SMART sports”: “Statistical Models and AlgoRiThms in sports.
Applications in professional and amateur contexts, with able-bodied and disabled athletes”, funded by the MIUR Progetti di Ricerca di Rilevante Interesse Nazionale (PRIN) Bando 2022 - grant n. 2022R74PLE (CUP J53D23003860006).

Competing Interests

The authors declare no competing interests.

References

Anzer, G., and P. Bauer. 2021. “A Goal Scoring Probability Model for Shots Based on Synchronized Positional and Event Data in Football (Soccer).” Frontiers in Sports and Active Living 3: 624475.

Bandara, I., S. Shelyag, S. Rajasegarar, D. Dwyer, E. J. Kim, and M. Angelova. 2024. “Predicting Goal Probabilities with Improved xG Models Using Event Sequences in Association Football.” PLOS ONE 19 (10): e0312278.

Brechot, M., and R. Flepp. 2020. “Dealing with Randomness in Match Outcomes: How to Rethink Performance Evaluation in European Club Football Using Expected Goals.” Journal of Sports Economics 21 (4): 335–62.

Cox, A., and C. Philippou. 2022. “Measuring the Resilience of English Premier League Clubs to Economic Recessions.” Soccer & Society 23 (4–5): 482–99.

Eggels, H., R. Van Elk, and M. Pechenizkiy. 2016. “Explaining Soccer Match Outcomes with Goal Scoring Opportunities Predictive Analytics.” In Proceedings of the 3rd Workshop on Machine Learning and Data Mining for Sports Analytics (MLSA 2016). Aachen: CEUR-WS.org.

Egidi, L., and N. Torelli. 2021. “Comparing Goal-Based and Result-Based Approaches in Modelling Football Outcomes.” Social Indicators Research 156 (2): 801–13.

Elsharkawi, M., R. H. Ali, and T. A. Khan. 2025. “Crafting a Player Impact Metric through Analysis of Football Match Event Data.” Journal of Computational Mathematics and Data Science 15: 100115.

Herbinet, C. 2018. “Predicting Football Results Using Machine Learning Techniques.” MEng thesis, Imperial College London.

Kharrat, T., I. G.
McHale, and J. L. Peña. 2020. “Plus–Minus Player Ratings for Soccer.” European Journal of Operational Research 283 (2): 726–36.

Kubayi, A. 2020. “Analysis of Goal Scoring Patterns in the 2018 FIFA World Cup.” Journal of Sports Sciences.

Lago, C. 2009. “The Influence of Match Location, Quality of Opposition, and Match Status on Possession Strategies in Professional Association Football.” Journal of Sports Sciences 27 (13): 1463–69.

Liu, H., M. Á. Gomez, C. Lago-Peñas, and J. Sampaio. 2015. “Match Statistics Related to Winning in the Group Stage of 2014 Brazil FIFA World Cup.” Journal of Sports Sciences 33 (12): 1205–13.

Macdonald, B. 2012. “An Expected Goals Model for Evaluating NHL Teams and Players.” In Proceedings of the 2012 MIT Sloan Sports Analytics Conference, March.

Macrì Demartino, R., L. Egidi, and N. Torelli. 2024. “Alternative Ranking Measures to Predict International Football Results.” Computational Statistics, 1–19. https://doi.org/10.1007/s00180-024-XXXX.

Mead, J., A. O'Hare, and P. McMenemy. 2023. “Expected Goals in Football: Improving Model Performance and Demonstrating Value.” PLOS ONE 18 (4): e0282295.

Nguyen, Q. 2021. “Poisson Modeling and Predicting English Premier League Goal Scoring.” arXiv Preprint

Nipoti, B., and L. Schiavon. 2025. “Expected Goals under a Bayesian Viewpoint: Uncertainty Quantification and Online Learning.” Journal of Quantitative Analysis in Sports 21 (1): 37–50.

Rathke, A. 2017. “An Examination of Expected Goals and Shot Efficiency in Soccer.” Journal of Human Sport and Exercise 12 (2): 514–29.

Secărean, A. 2025. Football Events [Dataset]. Kaggle. Available at: https://www.kaggle.com/datasets/secareanualin/football-events (accessed February 9, 2025).

title, A. P., and M. Suguna. 2023. “Data-Driven Player Recruitment in Football.
” In 2023 2nd International Conference on Automation, Computing and Renewable Systems (ICACRS), 844–50. IEEE.

Souza, N. J. D., H. N. Samrudh, S. Gautham, B. U. Shaman Bhat, and N. Nagarathna. 2021. “Football Game Analysis and Prediction.” In Soft Computing and Signal Processing: Proceedings of 3rd ICSCSP 2020, Volume 1, 167–80. Singapore: Springer Singapore.

Skripnikov, A. V., A. Cemek, and D. Gillman. 2025. “Leveraging Minute-by-Minute Soccer Match Event Data to Adjust Team's Offensive Production for Game Context.” Journal of Quantitative Analysis in Sports. https://doi.org/10.1515/jqas-2024-0162.

Spearman, W. 2018. “Beyond Expected Goals.” In Proceedings of the 12th MIT Sloan Sports Analytics Conference, February, 1–17.

Umami, I., D. H. Gautama, and H. R. Hatta. 2021. “Implementing the Expected Goal (xG) Model to Predict Scores in Soccer Matches.” International Journal of Informatics and Information Systems 4 (1): 38–54.

Vilela, J. A. T. F. 2024. “Exploring Stochastic Efficiency Analysis for Expected Goals in Football: Assessing Offensive Efficiency Across Europe's Major Football Leagues.” Master's thesis, Universidade NOVA de Lisboa (Portugal).

Wang, S. H., Y. Qin, Y. Jia, and I. K. E. Igor. 2022. “A Systematic Review about the Performance Indicators Related to Ball Possession.” PLOS ONE 17 (3): e0265540.

Wheatcroft, E. 2021. “Forecasting Football Matches by Predicting Match Statistics.” Journal of Sports Analytics 7 (2): 77–97.