A Statistical View of Learning in the Centipede Game

In this article we evaluate the statistical evidence that a population of students learn about the sub-game perfect Nash equilibrium of the centipede game via repeated play of the game. This is done by formulating a model in which a player's error in…

Authors: Anton H. Westveld, Peter D. Hoff

Decision making under uncertainty has long been of interest to a wide variety of academic disciplines: biology, computer science, economics, mathematics, philosophy, political science, and statistics, to name a few. The main mathematical method for examining multi-agent decision theory has been game theory. However, the game theoretic solutions of some simple games have been called into question, with a classic example being the Sub-Game Perfect Nash Equilibrium (SPNE) of the centipede game. In experimental settings, individuals rarely choose the SPNE solution (McKelvey and Palfrey, 1993). To explain this, McKelvey and Palfrey (1995, 1996, 1998) suggest that players' strategies can be represented by a Quantal Response Equilibrium (QRE), in which players' choices deviate from the SPNE because of "mistakes" in decision making. The mistakes or errors may be due to lack of information, information overload, the fact that human beings are not perfect optimizers, or, as is often the case, the fact that they are not optimizing according to the specific criterion set out by researchers in a given study. Our examination of the data collected by McKelvey and Palfrey (1993) on the centipede game shows that on average players move toward the SPNE with repeated play. This idea of moving toward a game theoretic equilibrium through repeated play has been called learning by Fudenberg and Tirole (1991). An extensive amount of theoretical work has been written on the subject, as well as a fair amount of empirical work based on the centipede game (El-Gamal, McKelvey and Palfrey, 1993). We expand the QRE framework, which allows for a statistical interpretation of game theoretic models, by allowing the error distribution to change as players gain experience.
We also build upon the notion of heterogeneity of players, discussed by McKelvey and Palfrey (1996), first by introducing different parameters for each type of player in the game and then by expanding that notion to a statistical random effects model that allows for heterogeneity over all the subjects in the data set. The models we employ represent the data better than previous models when the Bayesian Information Criterion (BIC) is used as a measure of adequacy (Kass and Raftery, 1995).

The outline of the paper is as follows. In Section 2, the game and the experimental design are discussed. In Section 3, an exploratory data analysis is presented. In Section 4, several models are examined that allow the distribution of a player's error to change through experience. In Section 5, the model with the best BIC is developed further to allow for heterogeneity of players in the data through a random effects model, which accounts for the correlation of outcomes involving a common player. The paper ends with a discussion of model limitations and the potential for future investigation.

The data were gathered by McKelvey and Palfrey (1993) based upon the four-stage centipede game shown in Figure 1. A single run of the centipede game involves two player types, Player A and Player B. Player A initiates the game and in the first stage has an opportunity to either Take or Pass. If Player A chooses Take, the game ends and Players A and B receive 40 and 10 cents, respectively. If Player A passes, then Player B has an opportunity to choose either Take or Pass. If Player B chooses Take, the game ends and Players A and B receive 20 and 80 cents, respectively. At each subsequent stage the dollar amounts are doubled and switched between Player A and Player B. The fourth stage is the last regardless of whether Player B chooses Take or Pass. Based upon this pattern, there are five possible outcomes of the game ($y \in \{1, \ldots, 5\}$), where an outcome is the total number of stages played in the game.

The traditional game theoretic solution, the SPNE, can be determined via backwards induction. At the fourth stage, based upon utility maximization under the assumption that a player's utility is determined solely by the monetary outcomes, it seems natural for Player B to choose Take. If the game were to reach stage 3, then Player A should realize this and, following a similar argument, would choose Take in the third stage. This continues backwards through the game tree, yielding the unique solution that Player A should choose Take at the first stage. However, this solution is Pareto inferior (Mas-Colell, Whinston and Green, 1995), since both players would strictly benefit by moving further out in the game (stage 3 and beyond).

The data were collected in three different sessions, two of which consisted of 20 and 18 students from Pasadena Community College, and a third session of 20 students from the California Institute of Technology. Each subject took part in only one of the three sessions. In each session, subjects were randomly assigned to be either a Player A or a Player B, and this assignment was kept throughout their allotted session. In the first and third sessions, subjects played 10 games, while in the second session they played only 9 games. After each game, the subjects who were Players B were rotated so that two subjects never played each other more than once. The layout for the two 10-game experiments can be seen in Table 1. The subjects who are Players A and Players B are on the rows and columns, respectively. The $(i, j)$th entry of the table indicates the game number played between the $i$th Player A and the $j$th Player B. Since each individual plays only 10 games, the experimental design is a Latin square (i.e., in each row and column each game number appears only once).
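The Latin square property described above can be checked mechanically. The following is a minimal sketch, not the schedule in Table 1: it assumes a simple cyclic rotation of the Players B (the function names and the rotation rule are illustrative assumptions) and verifies that every game number appears exactly once in each row and column.

```python
def cyclic_latin_square(n):
    """Return an n x n schedule: entry [i][j] is the game number (1..n)
    in which Player A i meets Player B j, assuming a cyclic rotation
    of the Players B (an assumption, not the layout of Table 1)."""
    return [[(j - i) % n + 1 for j in range(n)] for i in range(n)]


def is_latin_square(square):
    """True if every row and every column contains 1..n exactly once."""
    n = len(square)
    symbols = set(range(1, n + 1))
    rows_ok = all(len(row) == n and set(row) == symbols for row in square)
    cols_ok = all({square[i][j] for i in range(n)} == symbols for j in range(n))
    return rows_ok and cols_ok


# Sessions 1 and 3 have 10 subjects per player type; session 2 has 9.
session_sizes = {1: 10, 2: 9, 3: 10}
schedules = {s: cyclic_latin_square(n) for s, n in session_sizes.items()}

# Each Player A meets each Player B of the same session exactly once.
total_games = sum(n * n for n in session_sizes.values())
```

Under this layout, each session with N subjects per player type contributes N × N games, so `total_games` is the number of observed outcomes across the three sessions.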
Notice that we could place the game number on the columns and fill in the table with the Players B, and we would still have a Latin square. In fact, any permutation of Players A, Players B, and game number among rows, columns, and table entries is a Latin square. Let $y_{[i,j](s)}$ be the outcome in $\{1, \ldots, 5\}$ for the game played between the $i$th Player A and the $j$th Player B in the $s$th session, where $i \in \{1, \ldots, N(s)\}$ and $j \in \{1, \ldots, N(s)\}$. Since session 2 has only 18 subjects, $N(s = 2) = 9$, compared to $N(s = 1, 3) = 10$. Thus the total number of cases in the data is $\sum_{s=1}^{3} N(s)^2 = 10^2 + 9^2 + 10^2 = 281$.

Finally, in an attempt to conform to the notions of rationality required by game theoretic solutions, the structure of the game, the number of times the game would be played, and the payment structure were made common knowledge to all the subjects. This was done through the reading of a set of instructions, a practice session, and the administration and correction of a quiz. It is important to note that the subjects were not "taught" what an optimal strategy was in any sense. Additionally, the games were conducted on computers so that the subjects did not know whom they were playing against, and at the end of each session the subjects were privately paid the amount of money they had earned from the 9 or 10 games. Further discussion of the experimental design and data collection can be found in McKelvey and Palfrey (1993).

The left panel of Figure 2 presents the frequencies of the outcomes for all the games played in the combined sessions. The traditional game theoretic solution is for Player A to choose Take at the first stage. If the subjects actually played in this manner, all the mass in the histogram would be on outcome 1. However, most of the mass occurs on outcomes 2 and 3. Surprisingly, we see some mass on outcome 5, even though we would expect a subject reaching this stage to examine their payoffs of $3.20 versus $1.60 and choose Take.
Since it is clear that most subjects do not play the SPNE, a primary scientific question of interest is whether subjects, through repeated gaming, move toward the SPNE. In the right panel of Figure 2, a scatter plot of the number of games played against the five possible outcomes of the game is presented with a locally smooth regression [1]. Due to the discrete nature of the data, the points were jittered. The decreasing trend of the smoother suggests that on average the subjects move toward the first outcome, which is the SPNE, with repeated play. The smoother also suggests that the relationship between the number of games played and the outcomes of the games is approximately linear. The slope of the linear component of the trend was estimated as $\gamma_{\text{obs}} = -0.067$ via least squares. Because of the potential for dependent outcomes due to repeated play by each subject, typical regression standard errors are inappropriate. To overcome this, we conducted a randomization test to examine whether the observed slope was statistically different from zero (Fisher, 1935; Box and Anderson, 1955; Besag and Diggle, 1977). As Besag and Diggle (1977) state, "a primary advantage of [randomization] testing is that the investigator is free to use a variety of informative statistics of his own choosing, rather than be dictated to by known distributional theory.
Indeed, even when the relevant asymptotic distribution theory is available, [randomization] testing provides an exact alternative for small samples." Thus in the randomization testing framework, which does not consider tests based on population parameters, our null hypothesis is $H_0$: the game number does not affect the outcome. In order to investigate this hypothesis, we use the slope as our test statistic. The test proceeds by randomly sampling appropriate permutations of the data, $y_{\text{perm}}$, under $H_0$. Then, for each permutation, $\gamma(y_{\text{perm}})$ is computed, and finally the resulting null distribution is compared to the statistic calculated from the observed data, $\gamma(y_{\text{obs}})$. We now provide further details on this procedure.

In conducting the test we need to be faithful to the design, so the permutations were done according to a Latin square design (Cox, 1958). For each of the three sessions, the data can be represented by a Latin square with the Players A on the rows and the Players B on the columns, as in Table 1. For each pair of players $A_i$ and $B_j$, information exists on the outcome of the game that the pair played, as well as the game number. The rows and columns of this matrix were permuted while keeping fixed the row and column labels and the outcome of the game for each pair. Each permutation thus shuffled the "times" at which the games were played but maintained who played each game and the outcome. This was done for each session. The data from the three sessions were then pooled and the slope of the linear trend, $\gamma(y_{\text{perm}})$, was estimated. One thousand values of $\gamma(y_{\text{perm}})$ were sampled in this manner and compared to $\gamma(y_{\text{obs}})$. The results of the randomization test are displayed in Figure 3. The approximate one-sided p-value, $P[\gamma(y_{\text{perm}}) \le \gamma(y_{\text{obs}}) \mid H_0]$, was 0 (i.e., none of the statistics from the randomization was as small as the observed value), suggesting that we reject the null hypothesis.
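The randomization procedure just described can be sketched in a few lines. This is an illustrative sketch only, not the authors' code: it assumes each session is stored as a pair of square matrices (game numbers and outcomes, both indexed by Player A on the rows and Player B on the columns), and all function names are hypothetical.

```python
import random


def ols_slope(x, y):
    """Least-squares slope of y regressed on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


def permute_schedule(games, rng):
    """Shuffle the rows and columns of a Latin-square matrix of game numbers.
    The result is still a Latin square; pairs and outcomes are untouched."""
    n = len(games)
    rows = rng.sample(range(n), n)
    cols = rng.sample(range(n), n)
    return [[games[r][c] for c in cols] for r in rows]


def pooled_slope(sessions):
    """Slope of outcome on game number, pooling all sessions."""
    x, y = [], []
    for games, outcomes in sessions:
        for g_row, y_row in zip(games, outcomes):
            x.extend(g_row)
            y.extend(y_row)
    return ols_slope(x, y)


def randomization_test(sessions, n_perm=1000, seed=0):
    """Return the observed slope and the one-sided p-value
    P[slope(perm) <= slope(obs)] estimated from n_perm permutations."""
    rng = random.Random(seed)
    observed = pooled_slope(sessions)
    hits = 0
    for _ in range(n_perm):
        permuted = [(permute_schedule(g, rng), y) for g, y in sessions]
        if pooled_slope(permuted) <= observed:
            hits += 1
    return observed, hits / n_perm
```

Permuting only the game-number matrix, while leaving the outcome matrix in place, is what keeps "who played whom" and "what happened" fixed and reshuffles only "when".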
Since the outcomes of the centipede game take on five values, it is natural to model the data using a multinomial distribution; the key question pertains to the parameterization of the probabilities of each outcome occurring. The SPNE is perhaps the simplest model and states that the probability of the first outcome is always one, $P[y_{[i,j](s)} = 1] = 1 \;\forall\, i, j, s$, which is clearly not a good model for these data. Considering this, McKelvey and Palfrey, in a series of papers (McKelvey and Palfrey, 1995, 1996, 1998), relaxed this criterion through the development of the Quantal Response Equilibrium (QRE) model, based on the work in statistical choice modeling by McFadden (1973). Their model still uses the decision making process to inform the specification of the probabilities of the five outcomes, but differs from the SPNE by allowing players to make mistakes through a stochastic component added to players' decisions. We expand upon this model to capture the observed mean trend over time, which could be interpreted as learning. We also expand the model to allow for heterogeneity in the type of players (i.e., whether the subject is a Player A or a Player B). The section ends with a Bayesian analysis of the model with the best BIC and a confirmatory data analysis that will be used to motivate the random effects model of the following section.

We will present the one-parameter QRE model of McKelvey and Palfrey (1995, 1996, 1998), since it will serve as the basis for the models presented in this paper. The QRE model parameterizes the probabilities of players' decisions as functions of their payoffs and the precision (or variance) of their errors.
Based upon the extensive form of the game depicted in Figure 1, the decision probabilities that need to be specified for each pair $A_i$, $B_j$ within each session $s$ are:

$q2_{[i,j](s)}$ = P[Player $B_j$ chooses Take at stage 4 | Player $B_j$ reaches stage 4 when playing against $A_i$],
$p2_{[i,j](s)}$ = P[Player $A_i$ chooses Take at stage 3 | Player $A_i$ reaches stage 3 when playing against $B_j$],
$q1_{[i,j](s)}$ = P[Player $B_j$ chooses Take at stage 2 | Player $B_j$ reaches stage 2 when playing against $A_i$],
$p1_{[i,j](s)}$ = P[Player $A_i$ chooses Take at stage 1 when playing against $B_j$].

For example, for a Player B to choose Take at the fourth stage, the perceived utility gained from that choice should be greater than the perceived utility gained by choosing Pass. The perceived utilities that drive Player B's decision are modeled as the utilities of the two choices plus random deviations, where $U_B(y_{[i,j](s)} = 4)$ and $U_B(y_{[i,j](s)} = 5)$ are the monetary payoffs for outcomes four and five. For the rest of this paper we will assume that utility equals monetary payoff; however, we note that with larger monetary values other utility functions may be more appropriate. Finally, the $\alpha$'s and $\gamma$'s are random deviations that can vary across players, games, and stages within a single game. Therefore the probability that a Player B chooses Take is the probability that the perceived utility of Take exceeds the perceived utility of Pass. McKelvey and Palfrey assume that the errors have a largest extreme value (lev) distribution and that $\alpha(q2)_{[i,j](s)}$ and $\gamma(q2)_{[i,j](s)}$ are independent, so that their difference $\epsilon(q2)_{[i,j](s)}$ has a logistic distribution; this is the QRE Multinomial Logit model (McFadden, 1973; McKelvey and Palfrey, 1995, 1996, 1998). We will follow this convention throughout the rest of the paper for computational convenience.
Based upon this distributional choice for the deviations, $q2_{[i,j](s)}$ can be determined explicitly. Assuming the $\epsilon$'s are distributed from the logistic distribution with precision $\lambda$, we have:

$$q2_{[i,j](s)} = \frac{1}{1 + \exp\{-\lambda\,[U_B(y_{[i,j](s)} = 4) - U_B(y_{[i,j](s)} = 5)]\}}.$$

McKelvey and Palfrey (1998) analyze the centipede game depicted in Figure 1 as a game of perfect information, so the decision probabilities for the QRE are determined via backwards induction. Using an expected utility argument, $p2_{[i,j](s)}$ can be determined from $q2_{[i,j](s)}$ and the errors in Player A's decision. By continuing to work backwards, the other two probabilities $(q1, p1)_{[i,j](s)}$ can be determined. From the four decision probabilities, the probabilities of the five outcomes of the game follow directly from the game tree (subscripts $[i,j](s)$ suppressed):

$$\theta^{(1)} = p1, \quad \theta^{(2)} = (1 - p1)\,q1, \quad \theta^{(3)} = (1 - p1)(1 - q1)\,p2,$$
$$\theta^{(4)} = (1 - p1)(1 - q1)(1 - p2)\,q2, \quad \theta^{(5)} = (1 - p1)(1 - q1)(1 - p2)(1 - q2).$$

Figure 4 shows the four decision probabilities $(q2, p2, q1, p1)_{[i,j](s)}$ plotted as a function of $\lambda$. In each of the four cases, as $\lambda$ increases toward $\infty$, the probability of Take goes to 1, which is the SPNE. When $\lambda = 0$, the probabilities of Take and Pass are 50/50, since a player is completely uncertain about which of the two choices is best.

In order to examine the possibility of learning within the QRE framework, we account for the information about the $t$th game played by a pair $A_i$ and $B_j$ through the addition of a covariate to McKelvey and Palfrey's base QRE model presented in Section 4.1. This is in line with the work of Signorino (1999), where additional information about a game or the subjects involved allows researchers to gain an understanding of how variation in covariates leads to variation in the outcomes of the game. Figures 2 and 3 suggested a decrease in the outcome of the game as the number of games played by an individual increased. We model this by allowing the magnitude of the precision parameter to change over repeated play, leading to the following QRE parameterization: $(\epsilon(p1), \epsilon(q1), \epsilon(p2), \epsilon(q2))_{[i,j](s)} \sim \text{logistic}(\text{shape} = 0, \text{precision} = \lambda e^{\beta t})$.
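To make the backwards induction concrete, the following sketch computes the four decision probabilities and the five outcome probabilities for a given precision. It is a hedged illustration rather than the authors' implementation: it assumes payoffs measured in dollars (generated by the doubling-and-switching rule of Section 2), that each player's continuation value averages over all later Take/Pass probabilities, and that the game index `t` enters through `lam * exp(beta * t)`. The function and variable names are hypothetical.

```python
import math


def logistic_cdf(x, precision):
    """P[eps < x] for a zero-centred logistic error with the given precision,
    written so that math.exp never receives a large positive argument."""
    z = precision * x
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


# (Player A, Player B) payoffs in dollars for outcomes 1..5.
PAYOFFS = {1: (0.40, 0.10), 2: (0.20, 0.80), 3: (1.60, 0.40),
           4: (0.80, 3.20), 5: (6.40, 1.60)}


def qre_probabilities(lam, beta=0.0, t=0):
    """Return ({'q2','p2','q1','p1'}, [theta1..theta5]) for precision
    lam * exp(beta * t); beta = 0 gives the one-parameter model."""
    prec = lam * math.exp(beta * t)
    u_a = {k: v[0] for k, v in PAYOFFS.items()}
    u_b = {k: v[1] for k, v in PAYOFFS.items()}

    # Stage 4: Player B compares Take (outcome 4) with Pass (outcome 5).
    q2 = logistic_cdf(u_b[4] - u_b[5], prec)
    v_a4 = q2 * u_a[4] + (1 - q2) * u_a[5]   # expected payoffs if stage 4 is reached
    v_b4 = q2 * u_b[4] + (1 - q2) * u_b[5]

    # Stage 3: Player A compares Take with the expected value of passing.
    p2 = logistic_cdf(u_a[3] - v_a4, prec)
    v_a3 = p2 * u_a[3] + (1 - p2) * v_a4
    v_b3 = p2 * u_b[3] + (1 - p2) * v_b4

    # Stage 2: Player B.
    q1 = logistic_cdf(u_b[2] - v_b3, prec)
    v_a2 = q1 * u_a[2] + (1 - q1) * v_a3

    # Stage 1: Player A.
    p1 = logistic_cdf(u_a[1] - v_a2, prec)

    theta = [p1,
             (1 - p1) * q1,
             (1 - p1) * (1 - q1) * p2,
             (1 - p1) * (1 - q1) * (1 - p2) * q2,
             (1 - p1) * (1 - q1) * (1 - p2) * (1 - q2)]
    return {"q2": q2, "p2": p2, "q1": q1, "p1": p1}, theta
```

With `lam = 0` every decision is a coin flip, and as `lam` grows the first outcome absorbs essentially all of the probability, matching the limiting behavior described for Figure 4. Because the payoff scale (dollars versus cents) rescales the precision, numerical values of `lam` in this sketch are not comparable to fitted values unless the same units are used.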
In McKelvey and Palfrey (1998), the authors had hoped that modeling heterogeneity, in terms of player type (A or B), would lead to a significant improvement in the fit of their model. This leads to the complete statistical specification of the model as follows: $y_{[i,j](s)} \sim \text{multinomial}\big((\theta^{(1)}, \ldots, \theta^{(5)})_{[i,j](s)}\big)$, where $(\theta^{(1)}, \ldots, \theta^{(5)})_{[i,j](s)}$ are determined by the game tree in Figure 1 and the following QRE specification: $(\epsilon(p1), \epsilon(q1), \epsilon(p2), \epsilon(q2))_{[i,j](s)} \sim \text{logistic}(\text{shape} = 0, \text{precision} = \lambda_p e^{\beta t})$, where $p \in \{A, B\}$ indexes the player type. Based upon the QRE model, the likelihood of the observed data is the product over all games of the multinomial probability of the observed outcome:

$$L = \prod_{s=1}^{3} \prod_{i=1}^{N(s)} \prod_{j=1}^{N(s)} \prod_{k=1}^{5} \big(\theta^{(k)}_{[i,j](s)}\big)^{1(y_{[i,j](s)} = k)}.$$

The first two models in Table 2 are the results from fitting the two learning models, one with a common $\lambda$ and the other with heterogeneity among player types. The next two models (3 and 4) are for comparison and are discussed in McKelvey and Palfrey (1993, 1998). They consist of the one-parameter QRE model and a two-parameter model which assumes "that there is some small probability that players are 'altruistic' (and hence choose [Pass] at every opportunity)". Finally, we also fit a standard ordered multinomial probit model, which does not consider the underlying decision making process. All the models were fit by maximum likelihood, which allows for expeditious estimation and for model comparisons via BIC.

Using the BIC as a measure of fit, Model 2 appears to represent the data better than the other models. For this reason it was investigated further using a Bayesian approach in order to obtain credible intervals and to examine goodness of fit through posterior predictive tests. Additionally, since this model will be further expanded by utilizing random effects, the move to Bayesian inference at this point is natural. The Bayesian analysis for Model 2, based on Equations 4.2, was conducted with the following diffuse priors: $\log(\lambda_A), \log(\lambda_B) \sim \text{normal}(\text{mean} = 0, \text{variance} = 100)$ and $\beta \sim \text{normal}(\text{mean} = 0, \text{variance} = 100)$.
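As a small illustration of how the likelihood and the BIC comparison fit together, the sketch below evaluates a multinomial log-likelihood from per-game outcome probabilities and converts it to a BIC value. The function names are hypothetical, and the BIC is written in the common "-2 log L + k log n" form (smaller is better), which is an assumption about the sign and scaling convention rather than something stated in the text.

```python
import math


def multinomial_log_likelihood(outcomes, thetas):
    """Sum over games of log theta^(y): outcomes[g] is in 1..5 and
    thetas[g] holds the five outcome probabilities for game g."""
    return sum(math.log(theta[y - 1]) for y, theta in zip(outcomes, thetas))


def bic(log_likelihood, n_params, n_obs):
    """BIC in the -2 log L + k log n convention (assumed; smaller is better)."""
    return -2.0 * log_likelihood + n_params * math.log(n_obs)


# Toy example: two games sharing the same outcome probabilities.
theta = [0.5, 0.25, 0.125, 0.0625, 0.0625]
ll = multinomial_log_likelihood([1, 2], [theta, theta])
score = bic(ll, n_params=1, n_obs=2)
```

In a model comparison, `thetas` would come from each fitted model's decision probabilities at the maximum likelihood estimates, and `n_params` would count that model's free parameters.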
The resulting posterior distribution is proportional to the product of the likelihood and these priors. The Bayesian estimation for the model was conducted using the Metropolis algorithm. Each of the three parameters was updated separately. A total of 500,000 iterations of the Metropolis algorithm were conducted and the first 20,000 were removed for burn-in. The remaining iterations were thinned by sampling every 25th iteration, resulting in 20,000 sampled values of $\lambda_A$, $\lambda_B$, and $\beta$ from the posterior distribution; diagnostics suggested convergence to the posterior distribution.

Figure 5 presents the densities of the posterior distributions for the three model parameters. The main scientific question of interest depends upon the marginal posterior distribution for $\beta$. Since the empirical $P(\beta > 0) = 1$, as the number of games played increases so does the precision. We interpret this increase in the precision (or decrease in variance) as statistical learning, which can also be considered learning in the game theoretic sense, since increasing the precision leads to the SPNE for the QRE model specified by Equations 4.2. It can also be seen in the figure that $\lambda_B < \lambda_A$; in fact the empirical $P(\lambda_B < \lambda_A) = 1$, suggesting that the Players A have higher base precision. It is important to note that $\lambda_A$ and $\lambda_B$ not only represent each player's base precision about their utilities, but also represent each player's estimates of the precision of the other type of player. Thus, $\lambda_B < \lambda_A$ suggests that both populations of A's and B's "estimate" that Players A are more "certain" (i.e., have a higher base level of precision) in their choices throughout the game. This matter will be discussed further in Section 5. The notion of learning can be seen more explicitly in Figure 6, which plots the probability of the 5 different outcomes in relation to the game number (based upon the means of the posterior distributions of the parameters).
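The sampling scheme described above (one-parameter-at-a-time Metropolis updates, a burn-in period, and thinning) can be sketched generically. This is a minimal illustration under stated assumptions, not the authors' sampler: it uses Gaussian random-walk proposals with user-chosen step sizes, and the target here is a toy log posterior rather than the QRE posterior. All names are hypothetical.

```python
import math
import random


def metropolis(log_post, init, step_sizes, n_iter, burn_in, thin, seed=0):
    """Component-wise random-walk Metropolis.
    Each iteration proposes a Gaussian move for one parameter at a time;
    draws after burn-in are kept every `thin` iterations."""
    rng = random.Random(seed)
    current = list(init)
    current_lp = log_post(current)
    kept = []
    for it in range(1, n_iter + 1):
        for k in range(len(current)):
            proposal = list(current)
            proposal[k] += rng.gauss(0.0, step_sizes[k])
            proposal_lp = log_post(proposal)
            log_ratio = proposal_lp - current_lp
            # Accept with probability min(1, posterior ratio).
            if log_ratio >= 0 or rng.random() < math.exp(log_ratio):
                current, current_lp = proposal, proposal_lp
        if it > burn_in and (it - burn_in) % thin == 0:
            kept.append(tuple(current))
    return kept


def toy_log_post(params):
    """Stand-in target: independent standard normals (not the QRE posterior)."""
    return -0.5 * sum(x * x for x in params)


draws = metropolis(toy_log_post, init=[0.0, 0.0, 0.0],
                   step_sizes=[1.0, 1.0, 1.0],
                   n_iter=5000, burn_in=1000, thin=25)
```

For the model in the text, `log_post` would be the log-likelihood plus the log prior densities, evaluated at $(\log \lambda_A, \log \lambda_B, \beta)$; working on the log scale for the precisions is one natural choice given the priors, but it is an assumption of this sketch.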
As the number of games is extrapolated to 100, the probability of the first outcome, $P(y = 1)$, goes to 1.

In order to check the model fit, we conducted four posterior predictive tests (Gelman et al., 1997). Each test consists of generating the posterior distribution of a test statistic of interest, $T(y_{\text{rep}} \mid \Delta)$, based on replicate data sets $y_{\text{rep}} \mid \Delta$ generated from the model, and then comparing that distribution to the test statistic computed from the observed data, $T(y_{\text{obs}})$. Here, $\Delta$ represents the posteriors of the model parameters. In particular, 20,000 replicate data sets $y_{\text{rep}}$ were generated from the joint posterior distribution of the parameters based on the 20,000 MCMC scans. These tests are a way to see whether the model is capturing features of interest in the observed data. A way to quantify the notion of "capturing" is through Bayesian p-values based on a particular test statistic: $P[\,|T(y_{\text{rep}} \mid \Delta)| \ge T(y_{\text{obs}})]$. A small p-value suggests that the model is not capturing the statistic of interest. For these data, we are concerned with examining whether the model captures:

1. the trend of the outcomes against the game numbers;
2. potential differences among the Players A;
3. potential differences among the Players B;
4. potential differences among the sessions.

For (1), in order to make a comparison to the randomization test conducted in Section 3.1, the test statistic used was the slope of the linear trend of the outcomes versus the number of games played. For (2)-(4), the variance between the Players A, Players B, or sessions was compared to the variance within each of these groups via an F-statistic. Here we use the term "potential differences" since our model did not account for differences in each of those groups, which a priori may be an adequate assumption. Before we state the four test statistics explicitly, some additional notation is necessary.
In particular, the four test statistics we considered are the slope of the linear trend and the three F-statistics for the Players A, the Players B, and the sessions. The second row of Figure 7 presents the results for the posterior predictive tests (PP). The first panel in that row depicts the posterior distribution of the slope $\gamma(y_{\text{rep}} \mid \Delta)$, while the vertical line represents the test statistic from the observed data, $\gamma(y_{\text{obs}})$. The one-sided Bayesian p-value for this posterior predictive test is $P[\gamma(y_{\text{rep}} \mid \Delta) \le \gamma(y_{\text{obs}})] = 0.421$, which suggests that the model is capturing the slope of the linear trend. In comparison, the Bayesian p-values for the tests examining the differences among the Players A and among the Players B are both zero, suggesting that the model could be expanded to allow for differences among the subjects within each player type. Finally, the last panel suggests that incorporating a session effect into the model may not be necessary, since $P[F_{\text{sessions}}(y_{\text{rep}} \mid \Delta) \ge F_{\text{sessions}}(y_{\text{obs}})] = 0.056$, taking 0.05 as a cut-off value.

In comparison, the top row of Figure 7 presents the results for a set of randomization tests. The first panel replicates the results from Section 3.1 and allows for a comparison between the randomization test and the posterior predictive test when examining the trend. Recall that the hypothesis for the randomization test was $H_0$: the game number does not affect the outcome. From this randomization test, we concluded that we could reject the null hypothesis. Now, having modeled the trend, the posterior predictive test suggests that our model is capturing that trend. This approach allows one to initially investigate a set of hypotheses of interest via randomization tests and then, using the same test statistics, to compare the results based upon a particular model using posterior predictive tests.
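The two ingredients shared by the randomization and posterior predictive tests, a between-versus-within F-statistic and a tail proportion, can be sketched as follows. The exact form of the F-statistics used in the paper is not recoverable from the text here, so the one-way analysis-of-variance form below is an assumption, and the function names are hypothetical.

```python
def f_statistic(groups):
    """One-way ANOVA F: between-group mean square over within-group mean
    square. `groups` is a list of lists of outcomes, e.g. one per Player A."""
    k = len(groups)
    n = sum(len(g) for g in groups)
    grand = sum(sum(g) for g in groups) / n
    means = [sum(g) / len(g) for g in groups]
    ss_between = sum(len(g) * (m - grand) ** 2 for g, m in zip(groups, means))
    ss_within = sum((v - m) ** 2 for g, m in zip(groups, means) for v in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))


def tail_p_value(replicated, observed, upper=True):
    """Share of replicate (or permuted) statistics at least as extreme as the
    observed one: upper tail for F-statistics, lower tail for the slope."""
    if upper:
        hits = sum(1 for r in replicated if r >= observed)
    else:
        hits = sum(1 for r in replicated if r <= observed)
    return hits / len(replicated)


# Toy example: three subjects with three outcomes each.
f_obs = f_statistic([[1, 2, 3], [2, 3, 4], [5, 6, 7]])
```

The same `tail_p_value` function serves both kinds of test; only the source of `replicated` changes (permuted data under the null hypothesis for a randomization test, data simulated from posterior draws for a posterior predictive test).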
In regard to randomization tests and hypothesis testing in general, Besag and Diggle (1977) state: "we contend that significance testing is rarely to be treated as an end in itself, its purpose being more usually as an aid in suggesting further hypotheses relevant data collection". In this case, it serves as a preliminary tool for exploratory data analysis which can lead to further modeling. The next two panels in the top row examine the following two null hypotheses: (1) $H_0$: there are no differences among the Players A; (2) $H_0$: there are no differences among the Players B. The following procedure was used for the randomization tests: (1) within each session, a random permutation of the Latin square, with either the Players A or the Players B placed in the table entries, was conducted under the null hypothesis of no difference; (2) the three sessions were combined and the appropriate F-statistic was computed; (3) this procedure was repeated 1,000 times, leading to the null distributions displayed in Figure 7. From the histograms labeled "Players A (R)" and "Players B (R)" it is clear that the p-values are zero, and we should reject the null hypotheses. These results coincide with the posterior predictive tests, which were based on a model that did not account for differences among the subjects. Finally, we are unable to conduct a randomization test for differences among the three sessions which is faithful to the design. Since different subjects are nested within each session and each session has a Latin square design, we cannot simply permute a subject between sessions without destroying the Latin square design.
We could conduct a randomization test which ignored the design. Such tests are typically called "unrestricted" randomization tests, since each subject can be allocated to any treatment combination; however, as Garthwaite, Jolliffe and Jones (1995) note, "[t]his has disadvantages for testing whether one factor affected the responses, since the influence of other factors may bias the results". Since we have a model at this point, we will rely on a posterior predictive test to examine the question of differences among sessions and forgo the unrestricted randomization test.

All of the previous models have assumed no subject-specific effect for the Players A or the Players B; however, repeated observations from the same subject could result in statistical dependence among the outcomes, which are assumed to be independent in the previous QRE models. As was noted from both the randomization tests and the posterior predictive tests, there do appear to be substantial differences among the subjects (either Players A or Players B). In order to account for this correlation, a random effects model was employed. A QRE random effects model is easily developed through a Bayesian hierarchical approach. Now each subject has their own set of parameters, $\lambda_{A_i}$, $\beta_{A_i}$, $\lambda_{B_j}$, and $\beta_{B_j}$, which come from a population of parameters for each player type. Another important consideration is that the experimental design is such that players do not know whom they are playing against. The QRE, however, assumes that every player knows every other player's error distribution. In the simple case, where we modeled two player types (Players A and Players B) and did not consider individual subject effects, this probabilistic model induces relationships between $\lambda_A$, $\lambda_B$ and the probabilities of choosing Take at various stages of the game, $(q2, p2, q1, p1)_{[i,j](s)}$, that are not as simple as in the one-parameter model depicted in Figure 4.
To clarify the point, Figure 8 demonstrates the relationship between $\lambda_A$, $\lambda_B$, and $p1$ (the probability that Player A will choose Take at the first stage of the game), based on the model defined by Equations 4.2, where, without loss of generality, we take $\beta = 0$. The figure shows that simple statements cannot be made about $p1$ as $\lambda_A$ increases. The probability that Player A chooses Take at stage 1 depends upon the specification of $\lambda_B$, and could go to 1 or to 0 (away from the SPNE) as $\lambda_A$ increases. The reason for this lies in the assumptions of the QRE model, in that the players are assumed to know everyone else's error distribution. Thus, if $\lambda_B$ is small, suggesting that Player B is nearly indifferent between choosing Take and Pass, then Player A will maximize her expected utility by choosing Pass as $\lambda_A$ increases. This is the point that McKelvey and Palfrey (1996) make in their example about chess players: "[Consider] a chess game between an expert and a beginner. If this were common knowledge, then the expert might adopt a different strategy than she would against another expert." When we model only player types, the assumption that subjects know the parameters may not seem unreasonable, but when we move to a random effects model where each subject has their own set of parameters, this assumption of knowledge about the other players is far too strong. This is especially important considering that the subjects did not know whom they were playing against. To return to a slightly less restrictive assumption, we assume that subjects may not know the parameters of the particular subjects they are playing against, but do know the empirical means of the parameters associated with the opposing player type.
Thus each choice is a probabilistic function of subject-specific $\lambda$ and $\beta$ parameters, the empirical means of the $\lambda$ and $\beta$ parameters of the opposing player type, and the current game number:

Players A's decisions: $P(\text{Take})_{A_i} = F_{A_i}(\lambda_{A_i}, \beta_{A_i}, \bar{\lambda}_B, \bar{\beta}_B, t); \; i \in \{1, \ldots, 29\}$. (1)

Players B's decisions: $P(\text{Take})_{B_j} = F_{B_j}(\lambda_{B_j}, \beta_{B_j}, \bar{\lambda}_A, \bar{\beta}_A, t); \; j \in \{1, \ldots, 29\}$. (2)

The above can be further clarified by working backwards through the game. First comes the probability that Player $B_j$ will choose Take at stage 4 of the game (q2) when playing against Player $A_i$. Next, the probability that Player $A_i$ will choose Take at stage 3 of the game (p2) when playing against Player $B_j$ depends on the empirical means of the Players B. Then the probability that Player $B_j$ will choose Take at stage 2 of the game (q1) when playing against Player $A_i$ depends on the empirical means related to the Players A.

The main parameters of interest are the population means and variances of $\delta_{A_i}$, $\delta_{B_j}$, $\beta_{A_i}$, and $\beta_{B_j}$, whose posterior distributions can be seen in Figure 9. The $\mu_\delta$'s represent a base level of precision for the populations of Players A and Players B. Again we see that, on average, the Players A have a higher base precision, but their base precision is more variable than that of the Players B. The figure also presents the posterior distributions of the means and variances of the $\beta_{A_i}$'s and $\beta_{B_j}$'s. The medians of the posterior distributions of $\mu_{\beta_A}$ and $\mu_{\beta_B}$ are greater than zero. While the 95% credible intervals contain zero in both cases, 95% and 51% of population A and population B, respectively, will have a mean for $\beta$ that is greater than zero. Thus for that proportion of the population, as the game number increases so will the precision, which we interpret as learning in both the statistical and game-theoretic senses. Again there does appear to be greater variation among the Players A than among the Players B with regard to the $\beta$'s, but the difference is not nearly as great as was seen for the $\delta$'s.
Additionally, we can compare the distributions of the $\beta$'s for each subject by examining the 95% credible intervals in Figure 10. The dots and triangles in the figure are the medians. The dots represent subjects for whom the probability that $\beta$ is greater than 0 is between 50% and 75%. The triangles represent subjects for whom the probability that $\beta$ is greater than 0 is between 75% and 100%. For the Players A, 22 out of 29 have medians greater than zero, compared to only 13 out of 29 for the Players B. Finally, Figure 11 depicts a scatter plot of the medians of each subject's $\delta$ and $\beta$, grouped again by player type. The lack of linearity, or more precisely the sphericity, justifies our model assumption of no correlation between the slope and intercept for each subject.

As before, it is important to check the fit of the model to the data using posterior predictive tests. The same set of test statistics was employed and the results can be seen in Figure 12. The first and second rows are the randomization tests and posterior predictive tests from Figure 7. The last row presents the posterior predictive tests for the random effects model (PP-RE). The p-values associated with these tests are 0.272, 0.057, 0, and 0.118, respectively. From these results, it appears that we are capturing the features related to the trend and the differences among the Players A. While there is an extremely slight shift in the histograms examining the differences among the Players B between the models with and without random effects, this is a feature which the model still does not appropriately capture. Since some Players B do pass at the last stage of the game, incorporating some version of the "altruistic" model suggested by McKelvey and Palfrey (1998) may lead to a better fit. A two-component mixture model for the random effects associated with the Players B may pick up on this notion of altruism.
However, with only 29 subjects the estimation may rely heavily on the priors, since potentially only a few individuals would end up in the "altruistic" group. Finally, since we allow for differences for each subject through the random effects, we would expect to be able to also capture potential differences between the sessions.

[Figure 10: 95% credible intervals for each subject's β, by player type; markers indicate whether P(β > 0) falls in (50-75] or (75-100].]

An issue that needs to be addressed is that of the distribution of player errors. QRE models, as well as statistical choice models in general, are constrained by the type of player error distributions that are employed, being either largest extreme value or Gaussian. Quinn and Westveld (2009) have utilized a semi-parametric approach to solve this problem within the QRE framework. This method could also be applied with the QRE random effects model to allow for greater flexibility in the modeling and testing of learning; again, a larger data set would be needed so that the priors are not completely dominating the results. More generally, the modeling approach in this paper can be seen as an attempt to inform a statistical analysis of a complicated data set with an underlying behavioral model or, conversely, to expand upon a behavioral model to allow for a more accurate description of observed data by incorporating certain characteristics of natural variability. Approaches such as this may help bridge the gap between the purely statistical analyses of social relations data (Wasserman and Faust, 1994; Gill and Swartz, 2001; Hoff, Raftery and Handcock, 2002; Hoff, 2003, 2005) and game-theoretic models based on rational choice theory.

[Figure 11: scatter plot of the medians of each subject's δ and β, by player type.]

[1] The default setting of the lowess() function in the R statistical package was used.
