The Impact of Formations on Football Matches Using Double Machine Learning. Is it worth parking the bus?

Genís Ruiz-Menárguez¹ and Llorenç Badiella¹

¹ Departament de Matemàtiques, Universitat Autònoma de Barcelona, Spain

Abstract

This study addresses a central tactical dilemma for football coaches: whether to employ a defensive strategy, colloquially known as "parking the bus," or a more offensive one. Using an advanced Double Machine Learning (DML) framework, this project provides a robust and interpretable tool to estimate the causal impact of different formations on key match outcomes such as goal difference, possession, corners, and disciplinary actions. Leveraging a dataset of over 22,000 matches from top European leagues, formations were categorized into six representative types based on tactical structure and expert consultation. A major methodological contribution lies in the adaptation of DML to handle categorical treatments, specifically formation combinations, through a novel matrix-based residualization process, allowing for a detailed estimation of formation-versus-formation effects that can inform a coach's tactical decision-making. Results show that while offensive formations like 4-3-3 and 4-2-3-1 offer modest statistical advantages in possession and corners, their impact on goals is limited. Furthermore, no evidence supports the idea that defensive formations (commonly associated with parking the bus) increase a team's winning potential. Additionally, red cards appear unaffected by formation choice, suggesting other behavioral factors dominate. Although this approach does not fully capture all aspects of playing style or team strength, it provides a valuable framework for coaches to analyze tactical efficiency and sets a precedent for future research in sports analytics.
Keywords: Football, Formation, Double Machine Learning, Causal Inference, Confounding, XGBoost

1 Introduction

Football coaches face a critical decision before each match: which formation to implement to maximize their team's performance. The organization of the eleven starting players into specific structural patterns is a foundational element of tactical strategy. This project aims to provide data-driven insights for this decision by analyzing whether certain defensive formations, commonly referred to as "parking the bus" (1), are truly effective in specific scenarios. José Mourinho, former Chelsea FC manager, popularized this term to describe a rival's excessively defensive approach.

These defensive strategies are often employed against stronger opponents (2, 3); yet their actual effectiveness remains a subject of debate. This study addresses this debate directly for the benefit of coaches and tactical analysts. We examine the direct relationship between formations and several key match outcomes, moving beyond anecdotal evidence to provide a robust statistical analysis.

After generalizing and grouping formations based on tactical similarity and expert criteria, a Double Machine Learning framework is applied to estimate the isolated impact of each formation combination on various variables, such as goal difference. While formations may not be the sole predictor of match statistics, understanding whether specific systems consistently enhance or hinder key performance outcomes is invaluable for strategic planning. Ultimately, the objective is to assess the isolated impact of formations themselves, providing coaches with a clear, unbiased look at how tactical systems, independent of individual team quality, influence match results.

2 Data Processing

Match data for this study were obtained from the Sportmonks API (4), a commercial provider of comprehensive football data.
The dataset comprises over 22,000 professional league fixtures and includes a wide range of variables critical for our analysis, such as match statistics, team formations, and contextual information.

To ensure the dataset was robust and representative of modern football tactics, specific criteria were applied for data inclusion. The selection of seasons and leagues was guided by two primary considerations: regulatory consistency and tactical diversity.

1. Season Selection: The analysis includes seasons from 2018–2019 to 2024–2025. This broad temporal scope was chosen to ensure a sufficient data volume for robust statistical modeling. While the majority of this period reflects the modern tactical environment shaped by the "five substitutions" rule, seasons prior to this regulatory change (2018–2019, 2019–2020, and 2020–2021) were also included to significantly expand the dataset's size and statistical power. The effects of this regulatory change are implicitly captured by the inclusion of both league and season within the confounder set, which accounts for temporal and league-specific variations that may be correlated with the rule change.

2. League Selection: To create a geographically and tactically diverse dataset, matches from the seven highest-ranked European men's football leagues according to UEFA coefficients were included: the first divisions of England, Italy, Spain, Germany, France, the Netherlands, and Portugal. To further enhance the statistical power of the models, additional leagues were selected based on two criteria: a minimum of 32 regular-season matches and sufficient data availability from the API. This led to the inclusion of the top divisions of Turkey, Belgium, and Poland, as well as the second divisions of Spain, Italy, and England.
2.1 Confounding Variables

The primary objective of this analysis is to isolate the intrinsic relationship between a team's formation and a rival's formation with respect to various match outcomes. To achieve this, a set of confounding variables has been selected. These variables are chosen to control for factors that influence both team strength and match outcomes but are not a direct result of the chosen formation. Consequently, mediator variables such as accumulated goals or possession, which are themselves outcomes of a team's playing style, are intentionally excluded from the model.

The following variables are used as confounders:

• Match Context: The season, league, and day of the week of the fixture.
• Team Side: A binary indicator for whether the main team played at home or away.
• Team Strength Metrics:
  – The rate of accumulated points for both teams prior to the match.
  – A binary flag indicating whether each team was simultaneously competing in the UEFA Champions League.
  – The league ranking of each team at the time of the fixture.
  – The winning streak over the most recent matches.
• Environmental Factors: Weather data for the match venue.

The final dataset included additional key variables for each fixture, which are used as mediators, treatment variables, or target variables at some point in the analysis:

• Performance Metrics: Goals, corners, possession, and disciplinary actions (red and yellow cards) for both teams.
• Tactical Information: The formation system used by both teams (e.g., 4-3-3).
• Contextual Variables: In-match events (e.g., penalties, substitutions), lineups, and weather data (e.g., temperature, humidity).

Variables that could act as mediators between a formation choice and the outcome, such as in-game substitutions or team fouls, were intentionally excluded from the set of confounders.
This distinction is crucial to ensure that the statistical model isolates the pure direct effect of the formation, rather than capturing indirect effects. While these variables are not used as confounders, some, such as possession and disciplinary actions, serve as secondary target variables for our analysis.

The raw dataset underwent several cleaning and filtering steps to ensure data quality and methodological consistency. Specific seasons from the Belgian Pro League and Turkish Süper Lig were excluded from the analysis due to incomplete match data, which would have compromised the integrity of feature engineering.

To focus on a consistent competitive environment, non-regular league fixtures, such as play-offs and play-outs, were removed from the dataset. Furthermore, to mitigate potential biases from early-season instability and late-season goal achievement effects, matches from the first two and final four rounds of each season were also discarded. This ensures that the analysis is based on matches where teams are at a comparable level of statistical performance and motivation.

To properly control for team quality and contextual factors, several variables were engineered to quantify team strength at the time of each fixture. These variables served as potential confounders in the relationship between treatment and outcome variables. Betting odds data were intentionally excluded from this set, as such information may implicitly incorporate factors related to formation choice, potentially introducing a source of endogeneity. The engineered team strength variables include:

• Accumulated Points: A numerical variable representing the ratio of points earned out of the total possible in prior matches, computed separately for home and away performances.
• Champions League Flag: A binary variable indicating whether the home or away team was competing in the UEFA Champions League at the time of the fixture.
• Team Ranking: An integer variable reflecting the league position of each team, with the same rank assigned to teams with equal points.

Consistent with the study's objective to isolate the pure causal effect of formations, variables that are structurally related to formations and are considered mediators (such as accumulated goals or fouls) were not included in the confounder set. This careful variable selection is crucial to prevent the model from capturing indirect effects and ensures that the estimated causal impact is a true reflection of formation choice itself.

3 Methodology

To estimate the impact of formations on football match outcomes, we propose a Double Machine Learning approach (5), which removes the effect of confounders. As the starting point for DML, consider the following linear model:

$$Y = D\beta + X\gamma + \varepsilon$$

• $Y$: outcome variable (e.g., goal difference),
• $D$: treatment variable (e.g., formation combination of both teams),
• $X$: matrix of control variables (confounders),
• $\beta$: causal parameter of interest,
• $\gamma$: nuisance parameters,
• $\varepsilon$: error term.

3.1 Double Machine Learning

Double Machine Learning states that the effect of a treatment variable $D$ on an outcome $Y$, after accounting for control variables $X$, can be obtained by first removing from both the treatment and the outcome the variation explained by the confounding variables, and then regressing the residuals of the outcome on the residuals of the treatment. The resulting coefficient corresponds to the partial effect $\beta$ of the treatment that is orthogonal to the confounding variables.

The function $\hat{f}(X)$ is defined as the machine learning estimator of $Y$ on $X$, and $\hat{g}(X)$ as the estimator of $D$ on $X$.
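The residualize-then-regress idea described above can be illustrated with a minimal, self-contained sketch. It is not the authors' pipeline: a one-variable least-squares fit stands in for the machine learning estimators $\hat{f}$ and $\hat{g}$ (the paper uses XGBoost), the data are simulated with a single confounder, and every name (`fit_line`, `cross_fit_residuals`, `dml_effect`, `simulate`) is hypothetical. The sketch computes out-of-fold residuals $r_Y = Y - \hat{f}(X)$ and $r_D = D - \hat{g}(X)$ and then takes the slope of $r_Y$ on $r_D$ as $\hat{\beta}$.

```python
import random


def fit_line(x, y):
    """Least-squares fit of y = a + b * x; returns (a, b)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    s_xx = sum((xi - mean_x) ** 2 for xi in x)
    s_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = s_xy / s_xx
    return mean_y - slope * mean_x, slope


def cross_fit_residuals(x, target, n_folds=2):
    """Out-of-fold residuals: each fold is predicted by a fit on the other folds."""
    residuals = [0.0] * len(x)
    for fold in range(n_folds):
        train = [i for i in range(len(x)) if i % n_folds != fold]
        a, b = fit_line([x[i] for i in train], [target[i] for i in train])
        for i in range(fold, len(x), n_folds):
            residuals[i] = target[i] - (a + b * x[i])
    return residuals


def dml_effect(x, d, y, n_folds=2):
    """Partialling-out estimate of the effect of d on y given confounder x."""
    r_y = cross_fit_residuals(x, y, n_folds)  # outcome residuals
    r_d = cross_fit_residuals(x, d, n_folds)  # treatment residuals
    # Slope of r_y on r_d (no intercept, residuals are centred near zero).
    return sum(rd * ry for rd, ry in zip(r_d, r_y)) / sum(rd * rd for rd in r_d)


def simulate(n=4000, beta=0.5, seed=0):
    """Simulated fixtures: stronger teams (x) pick the treatment (d) more often."""
    rng = random.Random(seed)
    x = [rng.gauss(0, 1) for _ in range(n)]
    d = [1.0 if xi + rng.gauss(0, 1) > 0 else 0.0 for xi in x]
    y = [beta * di + 2.0 * xi + rng.gauss(0, 0.5) for di, xi in zip(d, x)]
    return x, d, y


x, d, y = simulate()
naive = fit_line(d, y)[1]       # slope of y on d, confounded by x
adjusted = dml_effect(x, d, y)  # residual-on-residual slope
print(f"naive: {naive:.2f}, adjusted: {adjusted:.2f}")
```

In this simulation the naive slope absorbs the effect of team strength, whereas the residual-on-residual estimate should land close to the simulated effect of 0.5. A flexible learner would replace `fit_line` when the relationship between confounders and outcomes is nonlinear.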
Both $\hat{f}(X)$ and $\hat{g}(X)$ can adapt to high-dimensional spaces and capture nonlinearities, interactions, and complex dependencies automatically. The final stage is obtained by isolating the $\hat{\beta}$ estimator from the third-step regression of the outcome residuals on the treatment residuals.

Through sample splitting and cross-fitting, this modification maintains orthogonality and robustness while allowing the estimation of treatment effects in high-dimensional or nonlinear environments. Examples of this extension include using boosting or decision tree models in place of traditional linear models.

3.2 New Approach for DML with Categorical Treatment

Applying a categorical treatment in the second DML step enables us to fully capture the partial effect of each treatment variable on the target variable. This approach is highly effective when dealing with multicategorical and independent treatment variables. To consider each formation combination in the variable $D$, dummy variables for $k^2 - 1$ unique formation combinations are generated, where $k$ is the number of different formations, which is the same for the main and the rival team.

The goal is to identify which formations tend to contribute to a higher value of the target variable (e.g., goal difference), meaning winning by a larger margin in the case of goals. To do so, each treatment variable vector $D_{i,j}$ corresponds to the combination of the main team formation $i$ and rival team formation $j$. Therefore, the resulting models can be expressed in a two-dimensional matrix.

$$D_{i,j} = \alpha_{i,j} X + \varepsilon_{D_{i,j}}$$

where

• $D_{i,j}$ is a dummy vector that contains 1 in the rows where the main team uses formation $i$ and the rival team uses formation $j$, $-1$ in the rows of the unused (omitted) formation combination ($D_{k^2}$), and 0 otherwise.
• $\alpha_{i,j}$ is a vector of coefficients corresponding to each variable in the $X$ matrix.

Note that the $D_{k,k}$ formation combination is omitted to avoid multicollinearity.
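The effect-coding scheme described above can be sketched in a few lines. This is an illustration under stated assumptions rather than the authors' code: the six formation labels are the categories the paper reports, but the function name `effect_code` and the choice of the last combination (3-4-3 vs. 3-4-3) as the omitted $D_{k,k}$ are hypothetical.

```python
from itertools import product

# The six formation categories, ordered from most defensive to most offensive.
FORMATIONS = ["5-4-1", "4-4-2", "3-5-2", "4-2-3-1", "4-3-3", "3-4-3"]


def effect_code(main, rival, formations=FORMATIONS):
    """Return the k^2 - 1 effect-coded treatment columns for one fixture."""
    if main not in formations or rival not in formations:
        raise ValueError(f"unknown formation pair: {main!r} vs {rival!r}")
    combos = list(product(formations, repeat=2))
    reference = combos[-1]  # assumed omitted combination D_{k,k}
    columns = combos[:-1]   # the k^2 - 1 combinations that get a column
    if (main, rival) == reference:
        return [-1] * len(columns)  # omitted combination: -1 in every column
    return [1 if combo == (main, rival) else 0 for combo in columns]


row = effect_code("4-3-3", "5-4-1")
print(len(row), sum(row))
```

Each fixture therefore has a single 1 in the column of its own combination, except fixtures of the omitted combination, which carry $-1$ in all columns.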
Nevertheless, the effect of this last formation combination can be deduced from the others. This encoding scheme enables us to gather the true effect of each formation combination $D_{i,j}$ on the target variable $Y$ (see effect coding (6)).

$$r_{D_{i,j}} = D_{i,j} - \hat{D}_{i,j}$$

where $r_{D_{i,j}}$ is the residual of the prediction of the treatment variable $D_{i,j}$. Consequently, the third model is computed:

$$r_Y = \beta_{1,1}\, r_{D_{1,1}} + \cdots + \beta_{1,k}\, r_{D_{1,k}} + \cdots + \beta_{k,k-1}\, r_{D_{k,k-1}} + \epsilon$$

The larger $\beta_{i,j}$ is, the greater its positive impact on the target variable $Y$. The estimator $\hat{\beta}_{k,k}$ can be obtained as:

$$\hat{\beta}_{k,k} = -\left(\hat{\beta}_{1,1} + \cdots + \hat{\beta}_{1,k} + \cdots + \hat{\beta}_{k,k-1}\right)$$

Since $\beta_{k,k}$ belongs to the diagonal estimators, $\beta_{k,k} = 0$.

3.2.1 Categorical Treatment Properties

By applying this new approach, the treatment $D$ is orthogonalized with respect to the confounders $X$. After regressing each dummy $D_{i,j}$ on $X$ and taking residuals $r_{D_{i,j}}$, each residual $r_{D_{i,j}}$ is orthogonal to the controls $X$. As any correlation between $D_{i,j}$ and $Y$ that was due to $X$ has been purged, the remaining correlation between $r_{D_{i,j}}$ and $r_Y$ captures the true causal effect. For every dummy variable $D_{i,j}$, a separate orthogonal moment condition is defined after residualization. Therefore, the orthogonality condition required by DML holds for every formation combination separately, and the properties of the original framework remain valid.

However, this new approach leads to a different interpretation of the parameter $\beta$. Each coefficient $\beta_{i,j}$ estimates the causal effect on goal difference of choosing the main team formation $i$ given the rival team formation $j$, in comparison to the omitted combination $D_{k,k}$.

3.3 Project Methodology

To address the formation impact analysis, the standard DML framework was adapted. The core of our approach involves two residualization stages followed by a final regression, which can be summarized as follows:

1. First-Stage Residualization: A machine learning model, specifically an XGBoost Regressor (7), is used to predict the target variable (e.g., goal difference) based on the set of confounding variables. The resulting residual, denoted as $r_Y$, represents the component of the target variable not explained by these confounders.

2. Treatment Model Residualization: Similarly, an XGBoost Regressor is trained to predict the treatment variable (formation combination) using the same set of confounders. To apply this to our categorical treatment, a separate regression model is computed for each formation combination. The resulting residuals, $r_D$, represent the variation in the treatment not explained by the confounders.

3. Causal Parameter Estimation: Finally, the causal effect of formations is estimated by an ordinary linear regression of the target residuals ($r_Y$) on the treatment residuals ($r_D$). The coefficients from this final regression, $\hat{\beta}_{i,j}$, provide the unbiased causal estimates of the effect of each formation combination on the target variable.

4 Formation Analysis Results

This approach consists of analyzing the impact of different tactical strategies on match outcomes. To do so, formations from the dataset were grouped into six distinct categories, sorted from most defensive to most offensive.

4.1 Formation Treatment

Formations are represented by a sequence of numbers specifying the distribution of outfield players across the defensive, midfield, and offensive lines. Since the goalkeeper is not counted, the numbers sum to ten.

Initially, the dataset contained 28 distinct formations. To enable robust statistical analysis and mitigate the issues of data sparsity and lack of statistical power that would result from an excessively large treatment matrix, these formations were grouped into six representative categories.
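The formation notation and grouping described above can be sketched as follows. This is a hypothetical helper, not the authors' preprocessing code: `parse_formation` checks the sum-to-ten rule stated in the text, and `GROUPS` contains only the single mapping the paper gives as an example (4-3-1-2 grouped with 4-3-3). The full expert-defined mapping of the 28 raw formations is not published here, so any other raw label is returned unchanged.

```python
def parse_formation(label):
    """Split a label such as '4-2-3-1' into its lines and validate it."""
    lines = [int(part) for part in label.split("-")]
    if sum(lines) != 10:
        # Ten outfield players: the goalkeeper is not part of the notation.
        raise ValueError(f"{label!r} does not sum to ten outfield players")
    return lines


# Illustrative mapping only: the one grouping example mentioned in the paper.
GROUPS = {"4-3-1-2": "4-3-3"}


def group_formation(label, groups=GROUPS):
    """Map a raw formation label to its representative category."""
    parse_formation(label)  # reject malformed labels before grouping
    return groups.get(label, label)


print(parse_formation("4-2-3-1"), group_formation("4-3-1-2"))
```

Validating labels before grouping keeps malformed entries from silently becoming their own treatment category.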
This process was conducted based on expert consultation and a tactical similarity criterion, ensuring that formations with a similar structural foundation were aggregated. For example, formations such as 4-3-3 and its variations (e.g., 4-3-1-2) were categorized together, as they share a similar core structure. This grouping approach addresses the issue of different representations of the same underlying tactical system being treated as distinct entities.

The six resulting formations, sorted from the most defensive to the most offensive, and their proportions in the dataset are shown in Table 1.

Table 1. Formation percentages across different leagues

| League         | 5-4-1 | 4-4-2  | 3-5-2  | 4-2-3-1 | 4-3-3  | 3-4-3  |
|----------------|-------|--------|--------|---------|--------|--------|
| La Liga        | 8.5%  | 32.57% | 10.79% | 22.72%  | 20.64% | 4.78%  |
| Premier League | 7.75% | 15.32% | 9.26%  | 31.23%  | 24.43% | 12.01% |
| Serie A        | 1.38% | 6.16%  | 33.24% | 18.54%  | 25.45% | 15.23% |
| Bundesliga     | 4.48% | 13.9%  | 21.73% | 26.68%  | 13.41% | 19.81% |
| Ligue 1        | 8.72% | 14.32% | 15.65% | 26.96%  | 20.17% | 14.18% |

The most frequently used formations across all leagues are the 4-2-3-1 and 4-3-3. Furthermore, analysis confirmed that the proportion of formations used by home and away teams is nearly identical, which justifies a simplified modeling approach for the formation treatment.

Table 2 reports average fixture statistics grouped by formation, again ordered from most defensive to most offensive.

Table 2. Average match statistics by formation across all leagues

| Formation    | 5-4-1  | 4-4-2  | 3-5-2  | 4-2-3-1 | 4-3-3  | 3-4-3  |
|--------------|--------|--------|--------|---------|--------|--------|
| Goals        | 1.236  | 1.249  | 1.268  | 1.392   | 1.437  | 1.341  |
| Red Cards    | 0.110  | 0.114  | 0.104  | 0.106   | 0.103  | 0.098  |
| Yellow Cards | 2.169  | 2.246  | 2.215  | 2.110   | 2.075  | 2.188  |
| Possession   | 48.98% | 48.62% | 48.24% | 50.72%  | 51.63% | 49.80% |
| Corners      | 4.751  | 4.719  | 4.754  | 5.004   | 5.107  | 4.854  |

Although some formations, such as the 4-2-3-1 and 4-3-3, are associated with higher average goals and possession, these preliminary correlations are not sufficient for a valid causal inference.
The observed statistical relationships are likely a result of confounding factors and complex interactions. The nominal formation used is a simplification of a team's dynamic tactical structure, which can change multiple times during a match due to in-game adaptations, substitutions, and opponent responses. Furthermore, a single formation can accommodate a wide variety of playing styles, which are not captured by a simple numerical representation. Therefore, it is crucial to employ a robust causal inference framework to isolate the true effect of a formation from these confounding variables.

4.2 Model Performance

Five separate XGBoost Regressor models were trained, each corresponding to a different target variable. The hyperparameter tuning for each model was performed aiming to maximize the negative mean squared error (-MSE). The MSE is the primary metric for evaluating the predictive performance of the first-stage regressions, as a value of zero corresponds to a perfect fit.

A critical hyperparameter choice was setting the upper limit for max_depth to 5. This constraint limits the complexity of each individual tree, which reduces the model's susceptibility to overfitting and helps ensure a lower variance in the predictions. Consequently, this approach also helps maintain symmetry in the final $\hat{\beta}$ estimations by preventing the model from making overly specific approximations.

The predictive performance of each model, evaluated on a held-out test set, is summarized in Table 3. The $R^2$ metric is provided to quantify the proportion of variance in each target variable explained by the set of confounders.

Table 3. Predictive performance metrics for first-stage XGBoost models

| Statistic    | MSE    | $R^2$ |
|--------------|--------|-------|
| Goals        | 2.70   | 0.118 |
| Red Cards    | 0.19   | 0.003 |
| Yellow Cards | 3.18   | 0.030 |
| Possession   | 343.35 | 0.298 |
| Corners      | 7.98   | 0.781 |

As the table shows, the mean squared error values are not directly comparable across models due to the differing units and scales of the target variables (e.g., possession is a percentage, while red cards are discrete counts). The model predicting corners exhibits the highest $R^2$ value, indicating it captures the largest proportion of variance among the five targets.

It is important to note that maximizing predictive accuracy was not the primary goal of this study. The first-stage models serve as a means to an end, specifically to perform the residualization necessary for obtaining the DML-estimated $\hat{\beta}_{i,j}$ coefficients. Given the high degree of unpredictability inherent in football, $R^2$ values for many match outcomes are expected to be low.

4.3 Formation Matrix

This DML approach enabled the computation of several causal analyses regarding the impact of formation combinations on various match outcomes. To interpret the resulting matrices of estimators, it is important to recall some key theoretical properties of the framework:

• The diagonal of the matrix represents a formation's effect against itself, and thus its value is defined as zero.
• The matrix is theoretically symmetrical up to sign, with $\hat{\beta}_{i,j} = -\hat{\beta}_{j,i}$. Any minor deviations from perfect symmetry are attributed to the approximations made by the machine learning models.

The statistical significance of the estimators is determined by p-values, labeled as follows: *** $p < 0.001$, ** $p < 0.01$, * $p < 0.05$, ns $p \geq 0.05$. Cells with a p-value below 0.05 are considered statistically significant, indicating a rejection of the null hypothesis that the $\hat{\beta}_{i,j}$ estimator is equal to zero.
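The reading rules above (significance labels, zero diagonal, and $\hat{\beta}_{i,j} = -\hat{\beta}_{j,i}$) can be expressed as two small helpers. This is a hedged sketch: the function names are hypothetical and the 3×3 matrix holds made-up illustrative values, not results from the paper.

```python
def significance_label(p_value):
    """Map a p-value to the star notation used in the coefficient matrices."""
    if not 0.0 <= p_value <= 1.0:
        raise ValueError(f"p-value out of range: {p_value}")
    if p_value < 0.001:
        return "***"
    if p_value < 0.01:
        return "**"
    if p_value < 0.05:
        return "*"
    return "ns"


def max_antisymmetry_gap(matrix):
    """Largest |b[i][j] + b[j][i]|; zero means a zero diagonal and b_ij = -b_ji."""
    k = len(matrix)
    return max(abs(matrix[i][j] + matrix[j][i]) for i in range(k) for j in range(k))


# Illustrative values only (not taken from the paper's figures).
example = [
    [0.0, 0.16, -0.05],
    [-0.16, 0.0, 0.11],
    [0.05, -0.11, 0.0],
]
print(significance_label(0.03), max_antisymmetry_gap(example))
```

A nonzero gap on an estimated matrix would quantify the "minor deviations from perfect symmetry" that the text attributes to the machine learning approximations.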
The following matrices present the pure causal effect between formation combinations for a specific target variable. To provide a more complete tactical context, these estimations can be adjusted for the effects of playing home or away. The average effect on the target variable for a home team is estimated and added to the causal coefficient, providing a close approximation of the total side effect. This calculation is represented as:

$$\hat{\beta}^{\text{side}}_{i,j} = \hat{\beta}_{i,j} + E(\hat{Y}_{\text{home}})$$

where $E(\hat{Y}_{\text{home}})$ is the estimated average effect on the target variable for the home team (e.g., goal difference).

4.3.1 Target Variable 1: Goal Difference

The resulting $\hat{\beta}_{i,j}$ estimators for goal difference as the target variable are:

Figure 1. $\hat{\beta}_{i,j}$ coefficients and p-values for goal difference

The analysis of the estimators reveals three statistically significant $\hat{\beta}_{i,j}$ coefficients, each with a p-value below the 0.05 threshold. These significant estimators are:

• 4-2-3-1 vs. 3-5-2: This combination provides an estimated advantage of 0.16 goals for the 4-2-3-1 formation ($p < 0.001$), suggesting strong statistical significance.
• 4-3-3 vs. 5-4-1: This combination shows an advantage of 0.17 goals for the 4-3-3 formation ($p < 0.05$).
• 4-3-3 vs. 4-4-2: This combination results in a 0.11 goal advantage for the 4-3-3 formation ($p < 0.05$).

All other estimated coefficients for goal difference are not statistically significant and therefore cannot be distinguished from zero.

To provide a more comprehensive tactical context, the pure formation effect can be combined with the well-established home-field advantage. The average home-team effect, denoted as $E(\hat{Y}_{\text{home}})$, is estimated to be 0.285 goals when the target variable is goal difference. Consequently, the total estimated advantage of a 4-2-3-1 formation playing at home against a 3-5-2 would be approximately +0.445 goals ($0.16 + 0.285$).
The finding that only a small number of estimators are significant suggests that formation choice alone does not have a major isolated impact on goal difference.

4.3.2 Target Variable 2: Red Cards Difference

The DML model was also applied to the difference in red cards between the main and rival teams, considering both direct red cards and those resulting from a second yellow card. In this analysis, the data were not stratified by home or away team, as the goal was to assess the pure, unadjusted causal effect of each formation combination on the number of red cards.

Figure 2. $\hat{\beta}_{i,j}$ coefficients and p-values for red cards difference

The results indicate that there are no statistically significant estimators for the red cards difference, as all p-values are greater than the 0.05 threshold. Consequently, we cannot reject the null hypothesis, and the estimated coefficients can be considered zero ($\hat{\beta}_{i,j} = 0 \;\forall\, i, j$).

This lack of significance suggests that red cards are not causally influenced by the specific formation combinations, particularly after controlling for confounding factors such as team strength and general match conditions. This finding aligns with the understanding that red cards are often a result of random, unpredictable match events and are more closely linked to a player's individual intensity and behavior, which are difficult to quantify.

4.3.3 Target Variable 3: Yellow Cards Difference

The DML model was also applied to the difference in yellow cards between the main and rival teams, with the goal of assessing the pure causal effect of each formation combination on this outcome. This analysis did not stratify the data by home or away team.

Figure 3. $\hat{\beta}_{i,j}$ coefficients and p-values for yellow cards difference

The results reveal three statistically significant pairs of coefficients. As the matrix is symmetrical up to sign ($\hat{\beta}_{i,j} = -\hat{\beta}_{j,i}$), these represent three unique formation combinations:

• 5-4-1 vs. 4-2-3-1: The 5-4-1 formation is associated with an estimated increase of 0.16 yellow cards on average ($p < 0.01$). This suggests that a team in a 5-4-1 formation would receive approximately one additional yellow card every six matches when facing a team in a 4-2-3-1 formation.
• 3-5-2 vs. 4-2-3-1: The 3-5-2 formation is associated with a larger estimated increase of 0.17 yellow cards ($p < 0.001$), indicating strong statistical significance.
• 3-5-2 vs. 4-3-3: This combination results in a 0.16 yellow card increase for the 3-5-2 formation ($p < 0.01$).

These findings suggest that defensive formations incur a subtle increase in yellow card penalties when compared to more offensive ones. However, as noted by Badiella et al. (8), disciplinary actions such as yellow cards are also heavily influenced by individual player intensity, which is not captured by these models.

4.3.4 Target Variable 4: Possession

The DML model was also applied to analyze the causal impact of formation combinations on possession. The hypothesis is that a team's tactical structure significantly influences its ability to control the ball during a match.

Figure 4. $\hat{\beta}_{i,j}$ coefficients and p-values for possession difference

The results provide valuable insights into the intrinsic behavior of formations with respect to possession. The matrix reveals that 13 of the 18 unique formation combinations show statistically significant coefficients, with most having a p-value below 0.001.

The matrix can be divided into two key regions: a top-right blue region and a bottom-left red region, which represent offensive-defensive matchups.

• Offensive vs. Defensive Matchups (Top-right): This region clearly shows that defensive formations tend to have less control over possession. The estimated possession difference between defensive formations (5-4-1, 4-4-2, 3-5-2) and offensive ones (4-2-3-1, 4-3-3, 3-4-3) is consistently significant, with all p-values less than 0.001. This indicates that formations with more midfielders and fewer defenders are more effective at retaining possession.
• Defensive vs. Offensive Matchups (Bottom-left): This region is a symmetrical reflection of the top-right, confirming that offensive formations have greater ball control when facing defensive ones.

Overall, the 4-3-3 formation demonstrates the strongest positive effect on ball control when compared to other offensive formations. Interestingly, the 3-4-3, despite being highly offensive, shows less ball dominance than the 4-2-3-1, which could be due to a more direct, vertical playing style with fewer players in midfield to calmly circulate the ball.

4.3.5 Target Variable 5: Corners

Finally, the influence of formations on corner kicks was also analyzed. The hypothesis is that formations with higher possession and a more offensive structure will generate more scoring opportunities, consequently leading to a higher number of corners.

Figure 5. $\hat{\beta}_{i,j}$ coefficients and p-values for corners difference

The matrix of coefficients reveals that offensive formations have a statistically significant positive effect on corner kicks when facing defensive ones. This aligns with the findings from the possession analysis, as teams with more ball control and a more aggressive tactical setup create more attacking opportunities.

For example, the 4-2-3-1, the most common formation in our dataset, shows a positive estimated effect on corners when paired against defensive systems like the 5-4-1 (+0.19), 4-4-2 (+0.18), and 3-5-2 (+0.23). While these effects are statistically significant, their practical magnitude is small.
Given that the average number of corners per team is around five per match, the estimated changes are not substantial enough to be considered a major tactical advantage.

5 Discussion

This study reveals that while football formations provide a baseline tactical framework, they are not sufficient to define or predict a team's playing style or match outcome. Each nominal formation, such as 4-3-3 or 4-2-3-1, can embody a variety of tactical behaviors, and the assumption that the starting formation remains static throughout the match overlooks the dynamic nature of in-game adaptations. Furthermore, some teams employ different formations when attacking compared to when they are defending.

Through the application of Double Machine Learning (redefined in this project to handle the categorical treatment through a novel matrix-based residualization of formation combinations), it becomes evident that defensive formations, popularly known as "parking the bus", generally serve to neutralize offensive threats rather than produce statistically advantageous outcomes in terms of goals, possession, corners, or disciplinary measures. However, "parking the bus" often implies other strategies not considered in this analysis, such as time-wasting, tactical fouls, and slowing down the pace of the game.

Conversely, offensive formations like 4-2-3-1, 4-3-3, and 3-4-3 show a modest but statistically significant positive impact on possession and corner differences when facing more defensive systems. However, formation choice alone has minimal impact on red cards, which appear to be driven by more random or behavioral factors. Importantly, the causal effect of formations on goal difference is limited, with only three formation combinations yielding significant estimators, underscoring that formations do not fully account for scoring dynamics.
Therefore, there is no statistical evidence that "parking the bus" offers any scoring or winning advantage.

While these results provide insightful patterns, they come with certain limitations, including the criteria used for defining team strength and the simplification involved in formation grouping, which may not fully explain all aspects of playing behavior. Nonetheless, this serves as a valuable starting point for deeper formation-based performance analysis and provides a clear, data-driven framework that can help coaches make more informed tactical decisions.

6 Acknowledgements

We would like to express our sincere gratitude to Juan Jesús Rodríguez and Juan Camilo Vázquez for their invaluable support throughout this project. Their deep football knowledge, especially in helping group formations meaningfully and providing tactical insights, was essential to interpreting the data accurately and grounding the analysis in real-world football logic.

7 Author contributions

GR and LB conceived the study and outlined the article content. GR constructed the dataset, performed the statistical analyses, and wrote the first draft of the article. GR and LB contributed to the methodological approach and reviewed the manuscript. All authors have read and approved the final version of the article.

8 Funding

This work was partially funded by the grant RTI2018-096072-B-I00 from the Spanish Ministry of Science, Innovation and Universities.

References

[1] Tianyu Guan, Jiguo Cao, and Tim B Swartz. Parking the bus. Journal of Quantitative Analysis in Sports, 19(4):263–272, 2023.
[2] Mohamad Nizam Nazarudin, Ardo Okilanda, Yovhandra Ockta, Reshandi Nugraha, and Regi Dwi Septian. Evaluating defensive strategies in football: analysing the impact of defensive metrics on match outcomes. Journal of Physical Education and Sport, 25(5):1051–1059, 2025.
[3] Rui Freitas, Anna Volossovitch, Carlos H Almeida, and Veronica Vleck. Elite-level defensive performance in football: A systematic review. German Journal of Exercise and Sport Research, 53(4):458–470, 2023.
[4] Sportmonks. Sportmonks: sports data solutions for developers. https://www.sportmonks.com/, 2025. Accessed on 1 July 2025.
[5] Victor Chernozhukov, Denis Chetverikov, Mert Demirer, Esther Duflo, Christian Hansen, Whitney Newey, and James Robins. Double/debiased machine learning for treatment and structural parameters, 2018.
[6] Manfred Te Grotenhuis, Ben Pelzer, Rob Eisinga, Rense Nieuwenhuis, Alexander Schmidt-Catran, and Ruben Konig. When size matters: advantages of weighted effect coding in observational studies. International Journal of Public Health, 62(1):163–167, 2017.
[7] Tianqi Chen and Carlos Guestrin. XGBoost: A scalable tree boosting system. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pages 785–794, 2016.
[8] Llorenç Badiella, Pedro Puig, Carlos Lago-Peñas, and Martí Casals. Influence of red and yellow cards on team performance in elite soccer. Annals of Operations Research, 325(1):149–165, 2023.
