Relieving and Readjusting Pythagoras
Bill James invented the Pythagorean expectation in the late 70's to predict a baseball team's winning percentage knowing just their runs scored and allowed. His original formula estimates a winning percentage of ${\rm RS}^2/({\rm RS}^2+{\rm RA}^2)$, …
Authors: Victor Luo, Steven J. Miller
Abstract. Bill James invented the Pythagorean expectation in the late 70's to predict a baseball team's winning percentage knowing just their runs scored and allowed. His original formula estimates a winning percentage of ${\rm RS}^2/({\rm RS}^2+{\rm RA}^2)$, where RS stands for runs scored and RA for runs allowed; later versions found better agreement with data by replacing the exponent 2 with numbers near 1.83. Miller and his colleagues provided a theoretical justification by modeling runs scored and allowed by independent Weibull distributions. They showed that a single Weibull distribution did a very good job of describing runs scored and allowed, and led to a predicted won-loss percentage of $({\rm RS}_{\rm obs}-1/2)^\gamma / (({\rm RS}_{\rm obs}-1/2)^\gamma + ({\rm RA}_{\rm obs}-1/2)^\gamma)$, where ${\rm RS}_{\rm obs}$ and ${\rm RA}_{\rm obs}$ are the observed runs scored and allowed and $\gamma$ is the shape parameter of the Weibull (typically close to 1.8). We show a linear combination of Weibulls more accurately determines a team's run production and increases the prediction accuracy of a team's winning percentage by an average of about 25% (thus while the currently used variants of the original predictor are accurate to about four games a season, the new combination is accurate to about three). The new formula is more involved computationally; however, it can be easily computed on a laptop in a matter of minutes from publicly available season data. It performs as well as (or slightly better than) the related Pythagorean formulas in use, and has the additional advantage of having a theoretical justification for its parameter values (and not just an optimization of parameters to minimize prediction error).

Contents
1. Introduction
2. Theoretical Calculations
2.1. Preliminaries
2.2. Linear Combination of Weibulls
3. Curve Fitting
3.1. Theory
3.2. Results
4. Future Work and Conclusions
Appendix A. Moments of Weibulls
References

Date: November 7, 2021.
2000 Mathematics Subject Classification. 46N30 (primary), 62F03, 62P99 (secondary).
Key words and phrases. Pythagorean Won-Loss Formula, Weibull Distribution, Hypothesis Testing.
The second named author was partially supported by NSF grant DMS0970067. We thank Kevin Dayaratna, Bernhard Klingenberg and Jeffrey Miller for helpful comments and code over the years.

1. Introduction

The Pythagorean Win/Loss Formula, also known as the Pythagorean formula or Pythagorean expectation, was invented by Bill James in the late 1970s to use a team's observed runs scored and allowed to predict their winning percentage. Originally given by

$$\text{Won-Loss Percentage} \ = \ \frac{{\rm RS}^2}{{\rm RS}^2 + {\rm RA}^2}, \tag{1.1}$$

with RS the runs scored and RA the runs allowed, it earned its name from the similarity of the denominator to the sums of squares in the Pythagorean formula from geometry.¹ Later versions found better agreement by replacing the exponent 2 with numbers near 1.83, leading to an average error of about three to four games per season.

The formula is remarkably simple, requiring only the runs scored and allowed by a team in a season, and the calculation (even with the improved exponent) is easily done on any calculator or phone. It is one of the most commonly listed expanded statistics on websites. One reason for its prominence is its accuracy in predicting a team's future performance through a simple calculation and not through computationally intense simulations. Additionally, it allows sabermetricians and fans to assess a manager's impact on a team, and to estimate the value of new signings by seeing how their presence would change the predictions. Because of its widespread use and utility, it is very desirable to have improvements.
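To make formula (1.1) and its exponent variant concrete, here is a minimal Python sketch. The function name and the season totals are hypothetical and chosen only for illustration; the default exponent 1.83 is the value quoted above.

```python
def pythagorean_expectation(runs_scored, runs_allowed, exponent=1.83):
    """Predicted winning percentage RS^e / (RS^e + RA^e).

    exponent=2 gives James' original formula (1.1); 1.83 is the
    commonly used refinement mentioned in the text.
    """
    rs = runs_scored ** exponent
    ra = runs_allowed ** exponent
    return rs / (rs + ra)


# Hypothetical season totals, for illustration only (not real data).
pct = pythagorean_expectation(800, 700)
print(f"predicted winning percentage: {pct:.3f}")
print(f"predicted wins over 162 games: {162 * pct:.1f}")
```

Multiplying the predicted percentage by the number of games played turns it into predicted wins, which is the scale on which the "games off" errors discussed later are measured.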
In his senior thesis, the first named author, supervised by the second named author, explored various attempted improvements to the Pythagorean formula. These included replacing the observed runs scored and allowed each game with adjusted numbers, with the adjustments coming from a variety of sources (such as ballpark effects, game state², WHIP, ERA+, and WAR of the pitcher, ...). As these led to only minor improvements³ (see [Luo] for a detailed analysis of these and other adjustments), we turned our attention to the successful theoretical model used by Miller and his colleagues [DaMil1, DaMil2, Mil, MCGLP], where it was assumed runs scored and allowed were independently drawn from Weibull distributions with the same shape parameter. Recall the three parameter Weibull density is given by

$$f(x;\alpha,\beta,\gamma) \ = \ \begin{cases} \dfrac{\gamma}{\alpha}\left(\dfrac{x-\beta}{\alpha}\right)^{\gamma-1} e^{-((x-\beta)/\alpha)^\gamma} & \text{if } x \ge \beta \\ 0 & \text{otherwise.} \end{cases} \tag{1.2}$$

The effect of $\alpha$ is to control the spread of the output, while $\beta$ translates the distribution. The most important parameter is $\gamma$, which controls the shape. See Figure 1 for some plots.

¹ Though of course the more natural shape in baseball is the diamond, save for some interesting stadium features, such as the triangle in Fenway Park.
² For example, if a team is up by a large amount late in a game, they frequently use weaker relief pitchers and rest some starters, while the trailing team makes similar moves; thus the offensive productions from this point onward may not be indicative of the team's true abilities and a case can be made to ignore such data.
³ The only adjusted formula that was at least on par with, or very near, the accuracy of the original Pythagorean W/L Formula was that of ballpark factor.

Figure 1. The varying distributions of the Weibull family with $\alpha = 1$ and $\beta = 0$.

Their success is due to the fact that the three parameter Weibull is a very flexible family of distributions, capable of fitting many one hump distributions, including to a statistically significant degree the observed runs scored and allowed data. Miller chose to use Weibulls for two reasons. First, they lead to double integrals for the probabilities that can be evaluated in closed form. This is extremely important if we desire a simple expression such as the one posited by James (see [HJM] for alternative simple formulas). Second, in addition to being flexible, special values of the Weibulls correspond to well-known distributions ($\gamma = 1$ is an exponential, while $\gamma = 2$ is the Rayleigh distribution).

The goal of this paper is to show that one can significantly improve the predictive power if, instead of modeling runs scored and allowed as being drawn from independent Weibulls, we model them as being drawn from linear combinations of independent Weibulls. The advantage of this approach is that we are still able to obtain tractable double integrals which can be done in closed form. There is a cost, however, as now more analysis is needed to find the parameters and the correct linear combinations. While this results in a more complicated formula than the standard variant of James' formula, it is well worth the cost as on average it is better by one game per season (thus a typical error is 3 games per team per year, as opposed to 4 which is the typical result in the current formula). Comparing it to baseball-reference.com's calculated Expected WL from 1979 to 2013, we find that the linear combination of Weibulls is approximately .06 of a game better, a difference that is not statistically significant. However, there are noticeable trends that appear in certain eras.

2. Theoretical Calculations

2.1. Preliminaries. It is important to note that we assume that runs scored and allowed are taken from continuous, not discrete, distributions.
This allows us to deal with continuous integrals rather than discrete sums, which most of the time leads to easier calculations. While a discrete distribution would probably more effectively map runs in baseball, the assumption of drawing runs from a continuous distribution allows for more manageable calculations, and is a very sensible estimate of those runs observed. It also will lead to closed form expressions, which are much easier to work with and allow us to avoid having to resort to simulations.

The Weibulls lead to significantly easier calculations because if we have a random variable $X$ chosen from a Weibull distribution with parameters $\alpha$, $\beta$, and $\gamma$, then $(X-\beta)^\gamma$ is exponentially distributed with mean $\alpha^\gamma$; thus, a change of variables yields a simpler integral of exponentials, which can be done in closed form (see Appendix 9.1 in [MCGLP] for details).

In all arguments below we always take $\beta = -1/2$, though we often write $\beta$ to keep the discussion more general for applications to other sports. The reason we do this is that we use procedures such as the Method of Least Squares to find the best fit parameters, and this requires binning the observed runs scored and allowed data. As baseball scores are discrete, there are issues if these values occur at the boundary of bins; it is much better if they are at the center. By taking $\beta = -1/2$ we break the data into bins

$$\left[-\tfrac{1}{2}, \tfrac{1}{2}\right),\ \left[\tfrac{1}{2}, \tfrac{3}{2}\right),\ \left[\tfrac{3}{2}, \tfrac{5}{2}\right),\ \cdots. \tag{2.1}$$

Our final assumption is that runs scored and runs allowed are independent. This obviously cannot be exactly true, as a baseball game never ends in a tie. For example, if the Orioles and the Red Sox are playing and the O's score 5 runs, then the Sox cannot score 5. Nevertheless, statistical analyses support this hypothesis; see the independence tests with structural zeros in [Mil] or Appendix 9.2 in [MCGLP] for details.

We end this subsection with the mean and the variance of the Weibull.
The calculation follows from standard integration (see the expanded version of [Mil] or Appendix 9.1 in [MCGLP] for a proof of the formula for the mean; the derivation of the variance follows in a similar fashion).

Lemma 2.1. Consider a Weibull with parameters $\alpha, \beta, \gamma$. The mean, $\mu_{\alpha,\beta,\gamma}$, equals

$$\mu_{\alpha,\beta,\gamma} \ = \ \alpha\Gamma(1+\gamma^{-1}) + \beta \tag{2.2}$$

while the variance, $\sigma^2_{\alpha,\beta,\gamma}$, is

$$\sigma^2_{\alpha,\beta,\gamma} \ = \ \alpha^2\Gamma(1+2\gamma^{-1}) - \alpha^2\Gamma(1+\gamma^{-1})^2, \tag{2.3}$$

where $\Gamma(x)$ is the Gamma function, defined by $\Gamma(x) = \int_0^\infty e^{-u}u^{x-1}\,du$.

2.2. Linear Combination of Weibulls. We now state and prove our main result for a linear combination of two Weibulls, and leave the straightforward generalization to combinations of more Weibulls to the reader. The reason such an expansion is advantageous and natural is that, following [Mil], we can integrate pairs of Weibulls in the regions needed and obtain simple closed form expressions. The theorem below also holds if $\gamma < 0$; however, in that situation the more your runs scored exceeds your runs allowed, the worse your predicted record due to the different shape of the Weibull (in all applications of Weibulls in survival analysis, the shape parameter $\gamma$ must be positive).

Theorem 2.2. Let the runs scored and allowed per game be two independent random variables drawn from linear combinations of independent Weibull distributions with the same $\beta$'s and $\gamma$'s. Specifically, if $W(t;\alpha,\beta,\gamma)$ represents a Weibull distribution with parameters $(\alpha,\beta,\gamma)$, and we choose non-negative weights⁴ $0 \le c_i, c'_j \le 1$ (so $c_1 + c_2 = 1$ and $c'_1 + c'_2 = 1$), then the density of runs scored, $X$, is

$$f(x;\alpha_{{\rm RS}_1},\alpha_{{\rm RS}_2},\beta,\gamma,c_1,c_2) \ = \ c_1 W(x;\alpha_{{\rm RS}_1},\beta,\gamma) + c_2 W(x;\alpha_{{\rm RS}_2},\beta,\gamma) \tag{2.4}$$

and the density of runs allowed, $Y$, is

$$f(y;\alpha_{{\rm RA}_1},\alpha_{{\rm RA}_2},\beta,\gamma,c'_1,c'_2) \ = \ c'_1 W(y;\alpha_{{\rm RA}_1},\beta,\gamma) + c'_2 W(y;\alpha_{{\rm RA}_2},\beta,\gamma). \tag{2.5}$$

In addition, we choose $\alpha_{{\rm RS}_1}$ and $\alpha_{{\rm RS}_2}$ so that the mean of $X$ is ${\rm RS}_{\rm obs}$ and choose $\alpha_{{\rm RA}_1}$ and $\alpha_{{\rm RA}_2}$ such that the mean of $Y$ is ${\rm RA}_{\rm obs}$. For $\gamma > 0$, we have

$$\begin{aligned} &\text{Won-Loss Percentage}(\alpha_{{\rm RS}_1},\alpha_{{\rm RS}_2},\alpha_{{\rm RA}_1},\alpha_{{\rm RA}_2},\beta,\gamma,c_1,c_2,c'_1,c'_2) \\ &\quad = \ c_1c'_1\frac{\alpha^\gamma_{{\rm RS}_1}}{\alpha^\gamma_{{\rm RS}_1}+\alpha^\gamma_{{\rm RA}_1}} + c_1c'_2\frac{\alpha^\gamma_{{\rm RS}_1}}{\alpha^\gamma_{{\rm RS}_1}+\alpha^\gamma_{{\rm RA}_2}} + c_2c'_1\frac{\alpha^\gamma_{{\rm RS}_2}}{\alpha^\gamma_{{\rm RS}_2}+\alpha^\gamma_{{\rm RA}_1}} + c_2c'_2\frac{\alpha^\gamma_{{\rm RS}_2}}{\alpha^\gamma_{{\rm RS}_2}+\alpha^\gamma_{{\rm RA}_2}} \\ &\quad = \ \sum_{i=1}^{2}\sum_{j=1}^{2} c_ic'_j\frac{\alpha^\gamma_{{\rm RS}_i}}{\alpha^\gamma_{{\rm RS}_i}+\alpha^\gamma_{{\rm RA}_j}}. \end{aligned} \tag{2.6}$$

⁴ If we had more terms in the linear combination, we would simply choose non-negative weights summing to 1.

Proof. As the means of $X$ (runs scored) and $Y$ (runs allowed) are ${\rm RS}_{\rm obs}$ and ${\rm RA}_{\rm obs}$, respectively, and the random variables are drawn from linear combinations of independent Weibulls, by Lemma 2.1

$$\begin{aligned} {\rm RS}_{\rm obs} \ &= \ c_1(\alpha_{{\rm RS}_1}\Gamma(1+\gamma^{-1})+\beta) + (1-c_1)(\alpha_{{\rm RS}_2}\Gamma(1+\gamma^{-1})+\beta) \\ {\rm RA}_{\rm obs} \ &= \ c'_1(\alpha_{{\rm RA}_1}\Gamma(1+\gamma^{-1})+\beta) + (1-c'_1)(\alpha_{{\rm RA}_2}\Gamma(1+\gamma^{-1})+\beta). \end{aligned} \tag{2.7}$$

We now calculate the probability that $X$ exceeds $Y$. We constantly use the fact that the integral of a probability density is 1. We need the two $\beta$'s and the two $\gamma$'s to be equal in order to obtain closed form expressions.⁵ After translating both variables by $\beta$, we find

$$\begin{aligned} {\rm Prob}(X > Y) \ &= \ \int_{x=\beta}^\infty \int_{y=\beta}^x f(x;\alpha_{{\rm RS}_1},\alpha_{{\rm RS}_2},\beta,\gamma,c_1,c_2)\, f(y;\alpha_{{\rm RA}_1},\alpha_{{\rm RA}_2},\beta,\gamma,c'_1,c'_2)\,dy\,dx \\ &= \ \sum_{i=1}^{2}\sum_{j=1}^{2} \int_{x=0}^\infty \int_{y=0}^x c_ic'_j\, \frac{\gamma}{\alpha_{{\rm RS}_i}}\left(\frac{x}{\alpha_{{\rm RS}_i}}\right)^{\gamma-1} e^{-(x/\alpha_{{\rm RS}_i})^\gamma}\, \frac{\gamma}{\alpha_{{\rm RA}_j}}\left(\frac{y}{\alpha_{{\rm RA}_j}}\right)^{\gamma-1} e^{-(y/\alpha_{{\rm RA}_j})^\gamma}\,dy\,dx \\ &= \ \sum_{i=1}^{2}\sum_{j=1}^{2} c_ic'_j \int_{x=0}^\infty \frac{\gamma}{\alpha_{{\rm RS}_i}}\left(\frac{x}{\alpha_{{\rm RS}_i}}\right)^{\gamma-1} e^{-(x/\alpha_{{\rm RS}_i})^\gamma} \left[\int_{y=0}^x \frac{\gamma}{\alpha_{{\rm RA}_j}}\left(\frac{y}{\alpha_{{\rm RA}_j}}\right)^{\gamma-1} e^{-(y/\alpha_{{\rm RA}_j})^\gamma}\,dy\right] dx \\ &= \ \sum_{i=1}^{2}\sum_{j=1}^{2} c_ic'_j \int_{x=0}^\infty \frac{\gamma}{\alpha_{{\rm RS}_i}}\left(\frac{x}{\alpha_{{\rm RS}_i}}\right)^{\gamma-1} e^{-(x/\alpha_{{\rm RS}_i})^\gamma} \left[1 - e^{-(x/\alpha_{{\rm RA}_j})^\gamma}\right] dx \\ &= \ \sum_{i=1}^{2}\sum_{j=1}^{2} c_ic'_j \left[1 - \int_{x=0}^\infty \frac{\gamma}{\alpha_{{\rm RS}_i}}\left(\frac{x}{\alpha_{{\rm RS}_i}}\right)^{\gamma-1} e^{-(x/\alpha_{{\rm RS}_i})^\gamma - (x/\alpha_{{\rm RA}_j})^\gamma}\,dx\right]. \end{aligned} \tag{2.8}$$

⁵ If the $\beta$'s are different then in the integration above we might have issues with the bounds of integration, while if the $\gamma$'s are unequal we get incomplete Gamma functions, though for certain rational ratios of the $\gamma$'s these can be done in closed form.

We set

$$\frac{1}{\alpha^\gamma_{ij}} \ = \ \frac{1}{\alpha^\gamma_{{\rm RS}_i}} + \frac{1}{\alpha^\gamma_{{\rm RA}_j}} \ = \ \frac{\alpha^\gamma_{{\rm RS}_i}+\alpha^\gamma_{{\rm RA}_j}}{\alpha^\gamma_{{\rm RS}_i}\alpha^\gamma_{{\rm RA}_j}}$$

for $1 \le i, j \le 2$, so that (2.8) becomes

$$\begin{aligned} &\sum_{i=1}^{2}\sum_{j=1}^{2} c_ic'_j \left[1 - \int_{x=0}^\infty \frac{\gamma}{\alpha_{{\rm RS}_i}}\left(\frac{x}{\alpha_{{\rm RS}_i}}\right)^{\gamma-1} e^{-(x/\alpha_{ij})^\gamma}\,dx\right] \\ &\quad = \ \sum_{i=1}^{2}\sum_{j=1}^{2} c_ic'_j \left[1 - \frac{\alpha^\gamma_{ij}}{\alpha^\gamma_{{\rm RS}_i}} \int_{x=0}^\infty \frac{\gamma}{\alpha_{ij}}\left(\frac{x}{\alpha_{ij}}\right)^{\gamma-1} e^{-(x/\alpha_{ij})^\gamma}\,dx\right] \\ &\quad = \ \sum_{i=1}^{2}\sum_{j=1}^{2} c_ic'_j \left[1 - \frac{\alpha^\gamma_{ij}}{\alpha^\gamma_{{\rm RS}_i}}\right] \\ &\quad = \ \sum_{i=1}^{2}\sum_{j=1}^{2} c_ic'_j \left[1 - \frac{1}{\alpha^\gamma_{{\rm RS}_i}} \cdot \frac{\alpha^\gamma_{{\rm RS}_i}\alpha^\gamma_{{\rm RA}_j}}{\alpha^\gamma_{{\rm RS}_i}+\alpha^\gamma_{{\rm RA}_j}}\right] \\ &\quad = \ \sum_{i=1}^{2}\sum_{j=1}^{2} c_ic'_j \left[\frac{\alpha^\gamma_{{\rm RS}_i}}{\alpha^\gamma_{{\rm RS}_i}+\alpha^\gamma_{{\rm RA}_j}}\right] \\ &\quad = \ c_1c'_1\frac{\alpha^\gamma_{{\rm RS}_1}}{\alpha^\gamma_{{\rm RS}_1}+\alpha^\gamma_{{\rm RA}_1}} + c_1c'_2\frac{\alpha^\gamma_{{\rm RS}_1}}{\alpha^\gamma_{{\rm RS}_1}+\alpha^\gamma_{{\rm RA}_2}} + c_2c'_1\frac{\alpha^\gamma_{{\rm RS}_2}}{\alpha^\gamma_{{\rm RS}_2}+\alpha^\gamma_{{\rm RA}_1}} + c_2c'_2\frac{\alpha^\gamma_{{\rm RS}_2}}{\alpha^\gamma_{{\rm RS}_2}+\alpha^\gamma_{{\rm RA}_2}}, \end{aligned} \tag{2.9}$$

completing the proof of Theorem 2.2.

3. Curve Fitting

3.1. Theory. We now turn to finding the values of the parameters leading to the best fit. We require $\beta = -1/2$ (for binning purposes), but otherwise the parameters $(\alpha_{{\rm RS}_1}, \alpha_{{\rm RS}_2}, \alpha_{{\rm RA}_1}, \alpha_{{\rm RA}_2}, \gamma, c_1, c_2, c'_1, c'_2)$ are free.⁶

Our first approach was to use the Method of Moments, where we compute the number of moments equal to the number of parameters. Unfortunately the resulting equations were too involved to permit simple solutions for them in terms of the observed data; for completeness they are given in Appendix A (or see [Luo]).

We thus turned to the Method of Least Squares (though one could also do an analysis through the Method of Maximum Likelihood). We looked at the 30 teams of the entire league from the 2004 to 2012 seasons. We display results from the 2011 season; the results from the other seasons are similar and are readily available (see [Luo]).
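Before turning to the fit itself, it may help to see the closed form (2.6) of Theorem 2.2 and the mean formula (2.2) written out in code. The following is a minimal Python sketch, not the authors' implementation (their code is in [Luo]); the function names and all parameter values in the example are hypothetical.

```python
import math


def weibull_mean(alpha, beta, gamma):
    """Mean of a three-parameter Weibull, equation (2.2)."""
    return alpha * math.gamma(1.0 + 1.0 / gamma) + beta


def mixture_mean(alphas, weights, beta, gamma):
    """Mean of a linear combination of Weibulls, as in equation (2.7)."""
    return sum(c * weibull_mean(a, beta, gamma) for a, c in zip(alphas, weights))


def mixture_win_pct(alphas_rs, alphas_ra, c, c_prime, gamma):
    """Equation (2.6): sum over i, j of c_i c'_j a_RSi^g / (a_RSi^g + a_RAj^g)."""
    total = 0.0
    for ci, a_rs in zip(c, alphas_rs):
        for cj, a_ra in zip(c_prime, alphas_ra):
            total += ci * cj * a_rs**gamma / (a_rs**gamma + a_ra**gamma)
    return total


# Hypothetical parameter values, for illustration only.
gamma = 1.8
alphas_rs, c = (4.5, 6.5), (0.3, 0.7)
alphas_ra, c_prime = (4.0, 6.0), (0.4, 0.6)

print("mean runs scored :", mixture_mean(alphas_rs, c, -0.5, gamma))
print("mean runs allowed:", mixture_mean(alphas_ra, c_prime, -0.5, gamma))
print("won-loss pct     :", mixture_win_pct(alphas_rs, alphas_ra, c, c_prime, gamma))
```

Setting one weight on each side to 1 recovers the single-Weibull formula, which is a convenient sanity check on any implementation.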
We implemented the Method of Least Squares using the bins in (2.1), which involved minimizing the sum of squares of the error of the runs scored data plus the sum of squares of the error of the runs allowed data. There were seven free parameters: $\alpha_{{\rm RS}_1}$, $\alpha_{{\rm RS}_2}$, $\alpha_{{\rm RA}_1}$, $\alpha_{{\rm RA}_2}$, $\gamma$, $c_1$, and $c'_1$. Let ${\rm Bin}(k)$ be the $k$-th bin of (2.1), let ${\rm RS}_{\rm obs}(k)$ and ${\rm RA}_{\rm obs}(k)$ be the observed number of games with number of runs scored and allowed in ${\rm Bin}(k)$, and let $A(\alpha_1,\alpha_2,\beta,\gamma,c_1,k)$ denote the area under the linear combination of two Weibulls with parameters $(\alpha_1,\alpha_2,\beta,\gamma,c_1)$ in ${\rm Bin}(k)$. Then for each team we found the values of $(\alpha_{{\rm RS}_1}, \alpha_{{\rm RS}_2}, \alpha_{{\rm RA}_1}, \alpha_{{\rm RA}_2}, \gamma, c_1, c'_1)$ that minimized

$$\begin{aligned} &\sum_{k=1}^{\text{Num. Bins}} \left({\rm RS}_{\rm obs}(k) - \#{\rm Games} \cdot A(\alpha_{{\rm RS}_1},\alpha_{{\rm RS}_2},-.5,\gamma,c_1,k)\right)^2 \\ &\quad + \sum_{k=1}^{\text{Num. Bins}} \left({\rm RA}_{\rm obs}(k) - \#{\rm Games} \cdot A(\alpha_{{\rm RA}_1},\alpha_{{\rm RA}_2},-.5,\gamma,c'_1,k)\right)^2. \end{aligned} \tag{3.1}$$

⁶ Subject to, of course, $0 \le c_i, c'_j \le 1$ and $c_1 + c_2 = c'_1 + c'_2 = 1$.

3.2. Results. For each team, we found the best fit linear combination of Weibulls. In Figure 2, we compare the predicted wins, losses, and won-loss percentage with the observed ones.

Figure 2. Results for the 2011 season using Method of Least Squares.

The code used is available in [Luo]. Using the Method of Least Squares, the mean $\gamma$ over all 30 teams is 1.83 with a standard deviation of 0.18 (the median is 1.79). We can see that the exponent 1.83, considered as the best exponent, is clearly within one standard deviation of the mean $\gamma$. Considering the absolute value of the difference between observed and predicted wins, we have a mean of 2.89 with a standard deviation of 2.34 (median is 2.68). Without considering the absolute value, the mean is 0.104 with a standard deviation of 3.75 (and a median of 0.39).
We only concern ourselves with the absolute value of the difference, as this really tells how accurate our predicted values are.

These values are significant improvements on those obtained when using a single Weibull distribution to predict runs (which essentially reproduces James' original formula, though with a slightly different exponent), which produces a mean number of games off of 4.43 with standard deviation 3.23 (and median 3.54) in the absolute value case. We display the results over seasons from 2004 to 2012 in Figure 3. It is apparent that the linear combination of Weibulls better estimates teams' win/loss percentage; in fact, it is over one game better at estimating than the single Weibull! The mean number of games off for a single Weibull from 2004 to 2012 was 4.22 (with a standard deviation of 3.03), while that of the linear combination of Weibulls was 3.11 (with a standard deviation of 2.33). In addition, there is less standard deviation in the estimates. Thus, it appears that the linear combination of Weibulls provides a much tighter, better estimate than the single Weibull does.

Figure 3. Mean number of games off (with standard deviation) for single Weibull and linear combination of Weibulls from 2004-2012.

To further demonstrate the quality of the fit, we compare the best fit linear combination of Weibulls of runs scored and allowed with those observed for the 2011 Seattle Mariners in Figure 4; we can see that the fit is visually very good. Of course, the fit cannot be worse than that of a single Weibull, as we can always set $c_1 = 0 = c'_1$; however, we can see the linear combination of Weibulls does a better job tracking the shape of the runs scored.
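The least-squares objective (3.1) behind these fits can be sketched in a few lines. This is a minimal illustration, not the code from [Luo]: it assumes that entry r of each count list is the number of games with exactly r runs (so the bin is [r - 1/2, r + 1/2) as in (2.1)), that there is no open-ended final bin, and the run counts and parameter values below are made up.

```python
import math

BETA = -0.5  # translation parameter, fixed at -1/2 as in the text


def weibull_cdf(x, alpha, gamma, beta=BETA):
    """CDF of the three-parameter Weibull with density (1.2)."""
    if x <= beta:
        return 0.0
    return 1.0 - math.exp(-(((x - beta) / alpha) ** gamma))


def bin_area(runs, alpha1, alpha2, gamma, c1):
    """Area of c1*W(alpha1) + (1 - c1)*W(alpha2) over [runs - 1/2, runs + 1/2)."""
    lo, hi = runs - 0.5, runs + 0.5
    area1 = weibull_cdf(hi, alpha1, gamma) - weibull_cdf(lo, alpha1, gamma)
    area2 = weibull_cdf(hi, alpha2, gamma) - weibull_cdf(lo, alpha2, gamma)
    return c1 * area1 + (1.0 - c1) * area2


def sum_sq_error(counts, alpha1, alpha2, gamma, c1):
    """One of the two sums in (3.1): squared error between observed and expected counts."""
    games = sum(counts)
    return sum(
        (obs - games * bin_area(r, alpha1, alpha2, gamma, c1)) ** 2
        for r, obs in enumerate(counts)
    )


def objective(rs_counts, ra_counts, params):
    """Full objective (3.1) in the seven free parameters."""
    a_rs1, a_rs2, a_ra1, a_ra2, gamma, c1, c1_prime = params
    return sum_sq_error(rs_counts, a_rs1, a_rs2, gamma, c1) + sum_sq_error(
        ra_counts, a_ra1, a_ra2, gamma, c1_prime
    )


# Made-up run distributions for a 162-game season (illustration only).
rs_counts = [8, 18, 25, 27, 24, 19, 14, 10, 7, 4, 3, 2, 1]
ra_counts = [10, 20, 26, 26, 22, 18, 13, 10, 7, 4, 3, 2, 1]
params = (4.0, 6.0, 3.8, 5.8, 1.8, 0.5, 0.5)  # hypothetical starting guess
print("objective at starting guess:", objective(rs_counts, ra_counts, params))
```

Any numerical minimizer that respects the constraints on the weights (between 0 and 1) and keeps the scale parameters and the shape parameter positive can then be applied to `objective`.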
We then performed an independent two-sample t-test with unequal variances in R, using the t.test command, to see whether the difference between the games off determined by the single Weibull and those determined by linear combinations of Weibulls is statistically significant (Figure 5). With a p-value less than 0.01 and a 95% confidence interval that does not contain 0, we can see that the difference is in fact statistically significant.

Figure 4. Comparison of best fit linear combination of Weibulls versus single Weibull for runs scored (top) and allowed (bottom) for the 2011 Seattle Mariners against the observed distribution of scores. (A) Single Weibull mapping runs scored and allowed. (B) Linear Combination of Weibulls mapping runs scored and allowed.

Figure 5. t-test to determine whether the difference between the games off determined by the single Weibull and those by linear combinations of Weibulls is statistically significant.

In addition, we compared the mean number of games off of baseball-reference.com's Pythagorean Win-Loss statistic (pythWL) with that of the linear combination of Weibulls from 1979 to 2013. Originally, we used ESPN's ExWL statistic⁷, which used an exponent of 2; however, ESPN's data only went back to the year 2002, and it has been shown that using the exponent 1.83 is more accurate than using ESPN's exponent of 2. We therefore used baseball-reference.com's pythWL statistic to obtain more data (baseball-reference.com allowed us to go all the way back to 1979, rather than just 2002). The pythWL statistic⁸ is calculated as

$$\frac{(\text{runs scored})^{1.83}}{(\text{runs scored})^{1.83} + (\text{runs allowed})^{1.83}}.$$

We display the results of our comparisons in Figure 6.

⁷ See the bottom of the page http://espn.go.com/mlb/stats/rpi/_/year/2011.
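The two-sample t-tests with unequal variances used in this section are Welch tests, which is what R's t.test computes when equal variances are not assumed. As a rough sketch of what that test calculates, the following Python computes the Welch statistic and the Welch-Satterthwaite degrees of freedom; the p-value is omitted because the standard library has no Student t distribution, and the two samples are invented numbers, not the paper's data.

```python
import math
from statistics import mean, variance


def welch_t(sample_a, sample_b):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom."""
    na, nb = len(sample_a), len(sample_b)
    va = variance(sample_a) / na  # squared standard error of mean(sample_a)
    vb = variance(sample_b) / nb  # squared standard error of mean(sample_b)
    t = (mean(sample_a) - mean(sample_b)) / math.sqrt(va + vb)
    df = (va + vb) ** 2 / (va**2 / (na - 1) + vb**2 / (nb - 1))
    return t, df


# Invented "games off" values for two predictors (illustration only).
games_off_single = [4.1, 6.3, 2.2, 5.0, 3.7, 7.4, 1.9, 4.8]
games_off_combo = [3.0, 4.9, 1.5, 3.6, 2.8, 5.1, 2.0, 3.3]
t_stat, dof = welch_t(games_off_single, games_off_combo)
print(f"t = {t_stat:.3f}, df = {dof:.2f}")
```

The statistic and degrees of freedom would then be compared against a Student t distribution to obtain the p-value and confidence interval that R reports.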
The mean number of games off for the pythWL statistic was 3.09 with a standard deviation of 2.26, numbers only slightly worse than those of the linear combination of Weibulls (mean of 3.03 with standard deviation of 2.21). So, we can see that the linear combination of Weibulls is doing, on average, about .06 of a game better than the pythWL statistic. We performed an independent two-sample t-test with unequal variances in R using the t.test command to see if the difference between the games off determined by the pythWL statistic and the linear combinations of Weibulls is statistically significant (Figure 7). With a very large p-value, we fail to reject the null hypothesis, suggesting that the difference in mean number of games off is not in fact statistically significant. We also display a plot (Figure 8) of the difference in the number of games off between the pythWL statistic and the linear combination of Weibulls; it seems to be a constant positive value for the most part, suggesting that the linear combination of Weibulls is doing slightly better than the pythWL statistic.

⁸ At http://www.sports-reference.com/blog/baseball-reference-faqs/ see the section "What is Pythagorean Winning Percentage?".

Figure 6. Mean number of games off (with standard deviation) for baseball-reference.com's pythWL statistic and linear combination of Weibulls from 1979-2013.

Looking at Figure 8 more closely, we can see that there are parts/eras of the graph in which the pythWL statistic does better, and parts where the linear combination of Weibulls does better. In the era from 1979-1989, the pythWL statistic is more accurate, beating the linear combination of Weibulls in 7 out of the 11 years. However, from 1990 to 2013, the linear combination of Weibulls wins in 15 out of the 24 years, and does so by around 0.3 games in those years.
Furthermore, when the pythWL statistic does beat the linear combination of Weibulls in the years from 1990 to 2013, it does so by around 0.25 games, including the point at 2004, which seems very out of the ordinary; without this point, the pythWL statistic wins by about .2 games in the years between 1990 and 2013 that it does beat the linear combination of Weibulls. Thus, in more recent years, it may make more sense to use the linear combination of Weibulls. In addition, with respect to the standard deviation of number of games off of the pythWL statistic (2.26) and the linear combination of Weibulls (2.21), we can see that the linear combination of Weibulls provides on average a tighter fit, i.e., there is less fluctuation in the mean number of games off for each team in each year (from 1990 to 2013, the pythWL statistic standard deviation in games off is 2.34 while that of the linear combination of Weibulls is 2.22, so we again see that the linear combination does noticeably better in recent years). It is important to note that the pythWL statistic just takes the functional form of the Pythagorean Win/Loss Formula with an exponent ($\gamma$) of 1.83, while we give theoretical justification for our formula.

Figure 7. t-test to determine whether the difference between the games off determined by ESPN ExWL and those by linear combinations of Weibulls is statistically significant.

We also performed $\chi^2$ tests to determine the goodness of fit, to see how well the linear combination of Weibulls maps the observed data, and to test whether runs scored and allowed are independent. We used the bins as in (2.1) and the test statistic

$$\begin{aligned} &\sum_{k=1}^{\#{\rm Bins}} \frac{\left({\rm RS}_{\rm obs}(k) - \#{\rm Games} \cdot A(\alpha_{{\rm RS}_1},\alpha_{{\rm RS}_2},-.5,\gamma,c_1,k)\right)^2}{\#{\rm Games} \cdot A(\alpha_{{\rm RS}_1},\alpha_{{\rm RS}_2},-.5,\gamma,c_1,k)} \\ &\quad + \sum_{k=1}^{\#{\rm Bins}} \frac{\left({\rm RA}_{\rm obs}(k) - \#{\rm Games} \cdot A(\alpha_{{\rm RA}_1},\alpha_{{\rm RA}_2},-.5,\gamma,c'_1,k)\right)^2}{\#{\rm Games} \cdot A(\alpha_{{\rm RA}_1},\alpha_{{\rm RA}_2},-.5,\gamma,c'_1,k)} \end{aligned} \tag{3.2}$$

for the goodness of fit tests, with $2 \cdot (\#{\rm Bins} - 1) - 1 - 7 = 16$ degrees of freedom, the 7 coming from estimating 7 parameters, namely $\alpha_{{\rm RS}_1}$, $\alpha_{{\rm RS}_2}$, $\alpha_{{\rm RA}_1}$, $\alpha_{{\rm RA}_2}$, $\gamma$, $c_1$, and $c'_1$. We did not estimate $\beta$, as we took it to be $-.5$. Having 16 degrees of freedom gives critical threshold values of 26.3 (at the 95% level) and 32.0 (at the 99% level). However, since there are multiple comparisons being done (namely 30 for the different teams), we use a Bonferroni adjustment and obtain critical thresholds of 37.7 (95%) and 42.5 (99%). From the first column of Figure 9, all the teams fall within the unadjusted 99% threshold, with the exception of the Texas Rangers (just barely!), who easily fall into the Bonferroni adjusted 95% threshold. Therefore, the observed data closely follows a linear combination of Weibulls with the proper estimated parameters.

Figure 8. Difference in mean number of games off for baseball-reference.com's pythWL statistic and linear combination of Weibulls from 1979-2013.

Since the test for independence of runs scored and allowed requires that each row and column of the contingency table have at least one non-zero entry, the bins used to bin the runs scored and allowed were

$$[0,1) \cup [1,2) \cup \cdots \cup [10,11) \cup [11,\infty). \tag{3.3}$$

We use integer endpoints because we are using the observed runs from games. We have a 12 by 12 contingency table with zeroes along the diagonal, since runs scored and allowed can never be equal. This leads to an incomplete 12 by 12 contingency table with $(12-1)^2 - 12 = 109$ degrees of freedom; constructing a test requires the use of structural zeroes. The theory behind tests using structural zeroes can be seen in [Mil] or Appendix 9.2 of [MCGLP]. We observe that 109 degrees of freedom give critical threshold values of 134.37 (at the 95% level) and 146.26 (at the 99% level).
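The bookkeeping behind these tests (the statistic (3.2), the two degrees-of-freedom counts, and the Bonferroni-adjusted significance level) is simple enough to sketch. The Python below is illustrative only: the function names are hypothetical, and the value of 13 bins in the goodness-of-fit example is inferred from the stated 16 degrees of freedom rather than given in the text. Critical values are not computed, since that needs a chi-square quantile function that the standard library lacks.

```python
def chi_square_statistic(observed, expected):
    """Sum of (observed - expected)^2 / expected, one of the two sums in (3.2)."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


def goodness_of_fit_dof(num_bins, num_params=7):
    """2 * (#Bins - 1) - 1 - (number of estimated parameters)."""
    return 2 * (num_bins - 1) - 1 - num_params


def independence_dof(num_bins):
    """Incomplete square table with structural zeros on the diagonal."""
    return (num_bins - 1) ** 2 - num_bins


def bonferroni_level(alpha, num_tests):
    """Per-test significance level after a Bonferroni adjustment."""
    return alpha / num_tests


print(goodness_of_fit_dof(13))   # 13 bins is inferred, not stated in the text
print(independence_dof(12))      # the 12 by 12 table from (3.3)
print(bonferroni_level(0.05, 30))
```

With 30 teams, each individual test is run at level 0.05/30 (or 0.01/30), which is why the adjusted thresholds are larger than the unadjusted ones.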
Again, since we are doing multiple comparisons, we use a Bonferroni adjustment, obtaining critical thresholds of 157.68 (95%) and 166.45 (99%). From the second column of Figure 9, all the teams fall within the 99% threshold, with the exception of the Los Angeles Angels (just barely!), who easily fall into the Bonferroni adjusted 95% threshold. Thus, runs scored and allowed are acting as though they are statistically independent. A more in-depth discussion of the justification behind the tests can be found in [Mil].

Figure 9. $\chi^2$ test results of the 2011 season from least squares of goodness of fit and independence of runs scored and allowed.

4. Future Work and Conclusions

While a one game improvement in prediction is very promising, our formula requires us to fit the runs scored and allowed distributions, so we explored simplifications. We tried to simplify the formula, even giving up some accuracy, in order to devise a formula that could be easily implemented using just a team's runs scored and allowed (and the variance of each of these) in order to determine the team's winning percentage. Unfortunately, the weight parameters $c_1$ and $c'_1$ play too large a role; in 2011, the mean of the parameter $c_1$ is 0.21 with a standard deviation of 0.39 (and a median of 0.21). With such large fluctuations in the weight parameters from team to team, the task of finding a simpler formula was almost impossible, as creating a uniform formula that every team could use was not feasible when two of the key parameters were so volatile. Taking this into account, we tried fixing the $\gamma$, $c_1$, and $c'_1$ parameters, allowing us to just solve a quartic involving the first and second moments to find the other parameters ($\alpha_{{\rm RS}_1}$, $\alpha_{{\rm RS}_2}$, $\alpha_{{\rm RA}_1}$, and $\alpha_{{\rm RA}_2}$).
However, while we were able to solve for the other parameters, plugging in these values gave us a significantly worse prediction of teams' win-loss percentage compared to the linear combination of Weibulls and baseball-reference.com's Pythagorean Win-Loss statistic.

One of the great attractions of James' Pythagorean formula is its ease of use; we hope to return to other simplifications and approximations in a later paper. Our hope is to find a linearization or approximation of our main result, similar to how Dayaratna and Miller [DaMil1] showed the linear predictor of Jones and Tappin [JT] follows from a linearization of the Pythagorean formula.

To summarize our results, using a linear combination of Weibulls rather than a single Weibull increases the prediction accuracy of a team's W/L percentage. More specifically, we saw that from 2004-2012 the single Weibull's predictions for a team's wins were on average 4.22 games off (with a standard deviation of 3.03), while the linear combination of Weibulls' predictions were on average 3.11 games off (with a standard deviation of 2.33), producing about a 25% increase in prediction accuracy. We also performed $\chi^2$ goodness of fit tests for the linear combination of Weibulls and tested the statistical independence of runs scored and allowed (a necessary assumption), and see that the linear combination of Weibulls with properly estimated parameters obtained from least squares analysis closely maps the observed runs and that runs scored and allowed behave as though they are statistically independent. In addition, when compared against baseball-reference.com's Pythagorean Win-Loss statistic, the linear combination of Weibulls does .06 of a game better in the years from 1979 to 2013, but this improvement cannot be considered statistically significant.
However, in more recent years, it is worth noting that the linear combination of Weibulls does appear to be doing better than baseball-reference.com's Pythagorean Win-Loss statistic.

Appendix A. Moments of Weibulls

For the runs scored data, we have $c_2 = 1 - c_1$ and the density equals

$$c_1\frac{\gamma}{\alpha_{{\rm RS}_1}}\left(\frac{x-\beta}{\alpha_{{\rm RS}_1}}\right)^{\gamma-1} e^{-((x-\beta)/\alpha_{{\rm RS}_1})^\gamma} + (1-c_1)\frac{\gamma}{\alpha_{{\rm RS}_2}}\left(\frac{x-\beta}{\alpha_{{\rm RS}_2}}\right)^{\gamma-1} e^{-((x-\beta)/\alpha_{{\rm RS}_2})^\gamma}. \tag{A.1}$$

From [Mur], and using the fact that the two Weibulls in the linear combination are independent, we obtained the following moments:

$$\begin{aligned} \text{First Moment} \ &= \ c_1\left(\alpha_{{\rm RS}_1}\Gamma(1+\gamma^{-1})+\beta\right) + (1-c_1)\left(\alpha_{{\rm RS}_2}\Gamma(1+\gamma^{-1})+\beta\right) \\ \text{Second Moment} \ &= \ c_1^2\alpha^2_{{\rm RS}_1}\left[\Gamma\left(\tfrac{2}{\gamma}+1\right) - \left(\Gamma\left(\tfrac{1}{\gamma}+1\right)\right)^2\right] + (1-c_1)^2\alpha^2_{{\rm RS}_2}\left[\Gamma\left(\tfrac{2}{\gamma}+1\right) - \left(\Gamma\left(\tfrac{1}{\gamma}+1\right)\right)^2\right] \\ \text{Third Moment} \ &= \ \left(c_1^3 + (1-c_1)^3\right) \cdot \frac{g_3 - 3g_1g_2 + 2g_1^3}{(g_2 - g_1^2)^{3/2}} \\ \text{Fourth Moment} \ &= \ \left(c_1^4 + (1-c_1)^4\right) \cdot \frac{g_4 - 4g_1g_3 + 6g_2g_1^2 - 3g_1^4}{(g_2 - g_1^2)^2} + 6 \cdot c_1^2(1-c_1)^2 \cdot \left[\alpha^2_{{\rm RS}_1}\alpha^2_{{\rm RS}_2}\left(g_2 - g_1^2\right)^2\right], \end{aligned} \tag{A.2}$$

where $g_i = \Gamma(1 + \tfrac{i}{\gamma})$, and $\Gamma(x)$ is the Gamma function defined by $\Gamma(x) = \int_0^\infty e^{-u}u^{x-1}\,du$. We next used R to find the observed moments for teams' runs scored from 2007. The code is available in [Luo].

References

[DaMil1] K. Dayaratna and S. J. Miller, First Order Approximations of the Pythagorean Won-Loss Formula for Predicting MLB Teams Winning Percentages, By The Numbers – The Newsletter of the SABR Statistical Analysis Committee 22 (2012), no. 1, 15–19.
[DaMil2] K. Dayaratna and S. J. Miller, The Pythagorean Won-Loss Formula and Hockey: A Statistical Justification for Using the Classic Baseball Formula as an Evaluative Tool in Hockey, The Hockey Research Journal: A Publication of the Society for International Hockey Research (2012/2013), pages 193–209.
[HJM] C. N. B. Hammond, W. P. Johnson and S. J. Miller, The James Function, preprint 2014. http://xxx.tau.ac.il/pdf/1312.7627v2.
[Ja] B. James, Baseball Abstract 1983, Ballantine, 238 pages.
[JT] M. A. Jones and L. A. Tappin, The Pythagorean Theorem of Baseball and Alternative Models, The UMAP Journal 26 (2005), no. 2, 12 pages.
[Luo] V. Luo, Relieving and Readjusting Pythagoras, Senior Thesis (supervised by S. J. Miller), Williams College 2014. http://web.williams.edu/Mathematics/sjmiller/public_html/math/papers/st/VictorLuo.pdf.
[Mil] S. J. Miller, A Derivation of the Pythagorean Won-Loss Formula in Baseball, Chance Magazine (2007), no. 1, 40–48 (an abridged version appeared in The Newsletter of the SABR Statistical Analysis Committee 16 (February 2006), no. 1, 17–22, and an expanded version is available online at 0509698.pdf).
[MCGLP] S. J. Miller, T. Corcoran, J. Gossels, V. Luo and J. Porfilio, Pythagoras at the Bat, in: Social Networks and the Economics of Sports (edited by Victor Zamaraev), Springer-Verlag, 2014 (to appear).
[Mur] G. Muraleedharan, Characteristic and Moment Generating Functions of Three Parameter Weibull Distribution – an Independent Approach, Research Journal of Mathematical and Statistical Sciences 1 (2013), no. 8, 25–27.

E-mail address: victor.d.luo@williams.edu
Department of Mathematics and Statistics, Williams College, Williamstown, MA 01267

E-mail address: sjm1@williams.edu, Steven.Miller.MC.96@aya.yale.edu
Department of Mathematics and Statistics, Williams College, Williamstown, MA 01267