Alignment Makes Language Models Normative, Not Descriptive

Eilam Shapira, Moshe Tennenholtz, and Roi Reichart
Technion – Israel Institute of Technology

Abstract

Post-training alignment optimizes language models to match human preference signals, but this objective is not equivalent to modeling observed human behavior. We compare 120 base–aligned model pairs on more than 10,000 real human decisions in multi-round strategic games—bargaining, persuasion, negotiation, and repeated matrix games. In these settings, base models outperform their aligned counterparts in predicting human choices by nearly 10:1, robustly across model families, prompt formulations, and game configurations. This pattern reverses, however, in settings where human behavior is more likely to follow normative predictions: aligned models dominate on one-shot textbook games across all 12 types tested and on non-strategic lottery choices—and even within the multi-round games themselves, at round one, before interaction history develops. This boundary-condition pattern suggests that alignment induces a normative bias: it improves prediction when human behavior is relatively well captured by normative solutions, but hurts prediction in multi-round strategic settings, where behavior is shaped by descriptive dynamics such as reciprocity, retaliation, and history-dependent adaptation. These results reveal a fundamental trade-off between optimizing models for human use and using them as proxies for human behavior.

1 Introduction

Large language models (LLMs) are increasingly used as proxies for human behavior (Filippas et al., 2024; Aher et al., 2023; Binz and Schulz, 2023; Argyle et al., 2023; Santurkar et al., 2023; Hewitt et al., 2024; Suh et al., 2025).
They replicate classic experimental findings from psychology and economics, approximate subgroup opinion distributions when conditioned on demographic backstories, and predict survey experiment outcomes. The approach extends to strategic settings: LLMs can predict human decisions in language-based persuasion games, outperforming models trained on human data alone (Shapira et al., 2024a), and capture cooperation patterns in repeated social dilemmas (Akata et al., 2025; Mei et al., 2024).

Yet nearly all of this work uses aligned models, treating alignment as either neutral or beneficial for behavioral prediction. This assumption deserves scrutiny. Alignment via RLHF (Ouyang et al., 2022) or DPO (Rafailov et al., 2023) optimizes models for responses that human evaluators approve of—cooperative, fair, and socially appropriate. But human behavior in strategic settings is often none of these: people bluff, retaliate, and deviate from approved patterns (Capraro et al., 2025; Bauer et al., 2025). If alignment narrows the model's behavioral distribution toward such responses (Kirk et al., 2024; Cao et al., 2025; GX-Chen et al., 2026), it creates a normative bias—the model learns to predict behavior that people endorse rather than behavior they exhibit. The distinction between normative theories (how people should act) and descriptive accounts (how people actually act) is foundational in the social and behavioral sciences (Camerer et al., 2004).

This predicts that aligned models should predict human behavior well in settings where that behavior is relatively simple and well-described by normative theory, but poorly where behavior is complex and shaped by interaction history.
Multi-round strategic games—where decisions depend on accumulated experience with a specific opponent—provide a natural test case for the descriptive end: behavior there is driven by reciprocity, retaliation, and reputation dynamics. One-shot decisions over well-studied game structures or simple lotteries provide a contrasting case where normative predictions may be more accurate.

Figure 1: Pearson correlations of base models and human decisions (x-axis) vs. aligned models and human decisions (y-axis) across four game families. Each point is a same-provider pair evaluated in its native format (standard prompt for base, chat template for aligned). Points below the diagonal indicate base advantage. The shaded region marks pairs where both models correlate below 0.3 with human behavior. Base models win 75:4 in bargaining, 32:4 in persuasion, 25:1 in negotiation, and 81:13 in matrix games, for an overall ratio of 9.7:1 (213 vs. 22, p < 10^-40).

We test this hypothesis by comparing 120 same-provider base–aligned¹ model pairs from 23 families (see Appendix A) on predicting 10,050 real human decisions across four families of multi-round strategic games: bargaining, persuasion, negotiation, and repeated matrix games (Prisoner's Dilemma and Battle of the Sexes). By restricting to same-provider pairs, each comparison directly isolates the effect of alignment. Each model is evaluated in its native format: standard text completion for base models, chat-templated input for aligned models.

The results are consistent with the hypothesis. In multi-round games, base models outperform their aligned counterparts by a ratio of 9.7:1 (213 vs. 22 wins, p < 10^-40), with each game family individually significant (p < 10^-6). The effect holds across all 23 model families, 10 prompt formulations, and all game configuration parameters, and grows with model scale.
The hypothesis also predicts where the base advantage should not hold: in simpler settings without multi-round history, normative predictions may suffice, and alignment should help rather than hurt. We test two such boundary conditions—one-shot 2×2 matrix games and non-strategic binary lotteries—and find that the advantage reverses in both. Aligned models win 4.1:1 on one-shot games (p < 10^-6), consistently across all 12 game types, and 2.2:1 on lotteries (p < 10^-3). In the one-shot games, aligned models' predictions are closer to Nash equilibrium—which itself correlates with human behavior in these settings (r = 0.62)—consistent with alignment shifting predictions toward normative patterns. The same reversal appears within multi-round games at round one, before interaction history develops, but disappears as history accumulates.

¹ Throughout, aligned denotes models that have undergone post-training optimization beyond next-token prediction—typically supervised fine-tuning combined with preference optimization via RLHF or DPO; base denotes the pre-alignment checkpoint.

2 Related Work

2.1 LLMs as Human Behavioral Proxies

A growing literature treats LLMs as behavioral models of humans—homo silicus (Filippas et al., 2024)—capable of replicating experimental findings (Aher et al., 2023), approximating subgroup opinions (Argyle et al., 2023), and predicting treatment effects (Hewitt et al., 2024). Nearly all of this work uses aligned models, implicitly assuming that alignment is neutral for behavioral fidelity. Yet several findings challenge this assumption: RLHF collapses opinion diversity toward specific groups (Santurkar et al., 2023), instruction tuning introduces cognitive biases absent in base models (Itzhak et al., 2024), LLMs over-predict normatively rational behavior (Liu et al., 2025), and RLHF-tuned models fail to mirror human response biases (Tjuatja et al., 2024). Most directly, Suh et al. (2025) found that aligned models are dramatically worse than base models at zero-shot opinion prediction. These results suggest that alignment distorts behavioral representations—but the evidence comes from opinions and individual judgments. Whether the pattern extends to multi-round strategic interactions, where behavior is shaped by history and reciprocity, remains untested.

2.2 The Alignment Tax

Alignment can degrade capabilities beyond helpfulness, a phenomenon termed the "alignment tax." Base models outperform aligned variants on reasoning benchmarks (Munjal et al., 2026), and calibration deteriorates across the tuning pipeline (Kadavath et al., 2022; Zhu et al., 2023). More fundamentally, alignment narrows the model's output distribution: RLHF significantly reduces output diversity (Kirk et al., 2024), and the standard KL-regularized RL framework can only specify unimodal targets, making diversity collapse a built-in feature rather than an implementation failure (GX-Chen et al., 2026; Korbak et al., 2022; Xiao et al., 2025). These results establish that alignment narrows distributions and why, but measure the cost in generation quality and benchmark scores—not in behavioral prediction fidelity. Whether distributional narrowing degrades a model's ability to predict the full range of human strategic behavior has not been tested directly.

2.3 LLMs in Strategic Games

Prior work studies how LLMs play games (Capraro et al., 2025; Akata et al., 2025; Mei et al., 2024) or serve as available strategies (Shapira et al., 2026), but play and prediction are fundamentally different: a model at Nash equilibrium would poorly predict actual human behavior, which systematically deviates from equilibrium.
We study prediction—whether a model's token probabilities match human choice distributions—using logprob extraction rather than generation, enabling direct base-vs-aligned comparison on identical inputs.

Predicting human strategic behavior has traditionally relied on parametric models from behavioral game theory (McKelvey and Palfrey, 1995, 1998; Nagel, 1995; Stahl and Wilson, 1995; Camerer et al., 2004; Camerer and Ho, 1999). Zhu et al. (2025) showed that ML models trained on large human datasets capture structure beyond these baselines, and Shapira et al. (2024a,b, 2025) demonstrated that LLMs can predict human decisions in language-based games—but used only aligned models, leaving open whether the pre-alignment checkpoint might predict better. We address this gap with the first systematic base-vs-aligned comparison across 120 same-provider pairs and four game families.

3 Experimental Setup

3.1 Game Families and Human Data

We evaluate on four families of strategic games that vary in information structure, decision complexity, and interaction length.

Bargaining. An alternating-offers bargaining game based on the model of Rubinstein (1982). Alice and Bob take turns proposing how to divide a sum of money; the other player accepts or rejects. Each player has a per-round discount factor (δ1, δ2) representing value loss over time, framed to participants as "inflation." Proposals are accompanied by optional free-text messages. If no agreement is reached within the allotted rounds, both players receive nothing. The human participant plays one role and makes binary accept/reject decisions at each of their turns. This family contains 1,788 human decisions.

Persuasion. A repeated cheap talk game (Crawford and Sobel, 1982) played over 20 rounds.
Each round, a seller observes whether a product is high- or low-quality (drawn independently) and sends a message to a buyer, who then decides whether to purchase at a fixed price. The seller profits from every sale regardless of quality, creating a credibility problem: the unique stage-game equilibrium is babbling (uninformative messages). Over repeated rounds, however, reputation dynamics emerge as buyers observe the seller's track record. The buyer role comes in two variants: a long-lived buyer who observes the full history, and myopic buyers who see only aggregate statistics. Human participants play the buyer role and make binary yes/no decisions. This family contains 3,180 human decisions.

Negotiation. A bilateral price negotiation in which a seller and buyer alternate price proposals for an indivisible good. Each player has a private valuation: the seller values the good at V_A and the buyer at V_B (parameterized as multiples of a base price). At each decision point, the responding player can accept the current price, reject it (passing the initiative to the other side), or exercise an outside option—transacting with an alternative partner "John" at their own valuation, guaranteeing zero surplus but ending the negotiation.² Human decisions are ternary: AcceptOffer, RejectOffer, or DealWithJohn. This family contains 1,182 human decisions.

These three families are drawn from the GLEE benchmark (Shapira et al., 2024b). In GLEE, human participants play interactively against LLM opponents through a web interface: each human takes one role in a game while an LLM plays the other, producing natural language dialogues with varied offers, arguments, and counteroffers. Participants were not informed that their opponent was an LLM; the interface presented the other player by name (e.g., "Alice"), so human decisions were uncontaminated by knowledge of the opponent's nature.
The resulting game transcripts contain decision points where humans chose among discrete actions within rich, multi-turn conversational contexts.

Repeated 2×2 Matrix Games. We additionally evaluate on two repeated 2×2 games from Akata et al. (2025): the Prisoner's Dilemma (PD) and the Battle of the Sexes (BoS). In each, 195 human participants play 10 rounds against pre-computed opponent strategies derived from GPT-4, yielding 1,950 decisions per game (3,900 total). Participants were told they might face a human or an artificial agent; in fact, all played against LLMs, with debriefing provided afterward. In PD, participants choose to cooperate or defect; in BoS, they coordinate on one of two options with asymmetric preferences. Unlike the GLEE games, these are complete-information games with a known payoff matrix. We format these games using a multi-turn prompt structure, presenting the payoff matrix and round history as a structured dialogue.

Across all four families, our evaluation covers 10,050 human decisions per model, yielding over 2.4 million total predictions across all models and pairs.

² The outside option was introduced in GLEE to provide a credible disagreement point; without it, rejection merely delays the game, incentivizing acceptance even at unfavorable prices. For evaluation we code both reject and DealWithJohn as 0 (non-accept), since both represent refusal of the current offer.

3.2 Prediction Method

We frame human decision prediction as a token probability extraction task. For each human decision point in a game, we construct a prompt consisting of a system message describing the game rules and the participant's role, followed by the dialogue history up to the decision point. We then perform a single forward pass through the model and extract the log-probabilities assigned to each decision token (e.g., "accept" vs. "reject" for bargaining) from the model's next-token distribution at the final position.

We normalize the extracted probabilities to obtain a predicted decision distribution:

    p_accept = p(yes) / Σ_d p(d)    (1)

where d ranges over all decision tokens for a given family (two tokens for bargaining, persuasion, and matrix games; three for negotiation, which adds the outside-option token). The resulting p_accept ∈ [0, 1] captures the model's relative preference for the affirmative action, normalized away from non-decision tokens.

This method requires no text generation and no sampling—it is a deterministic extraction of the model's internal probability distribution over decision tokens, applicable to both base and aligned models without requiring different decoding strategies. The normalization is robust when decision tokens receive substantial probability mass; when they do not (i.e., the model distributes mass primarily to non-decision tokens), the normalized probabilities become unreliable. We therefore apply two pair-level filters per game family: a mass filter excluding pairs where either model assigns less than 80% average probability mass to decision tokens, and a minimum correlation filter excluding pairs where both models correlate below 0.3 with human decisions. Filters are applied independently per family; the base advantage is robust across threshold choices (see Appendix C).

3.3 Prompt Variants

We evaluate four prompt variants per model pair to disentangle the effects of model type (base vs. aligned) and prompt format. All variants append a partial JSON object (e.g., { "decision": " ) after the dialogue history, prompting the model to complete it with a decision token.
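The normalization in Equation (1) of Section 3.2, together with the mass filter it motivates, can be sketched in a few lines. This is a minimal illustration assuming the next-token log-probabilities have already been extracted from a single forward pass; the function and variable names are ours, not the paper's code:

```python
import math

def normalize_decision_probs(token_logprobs, decision_tokens):
    """Turn next-token log-probabilities into a decision distribution.

    token_logprobs: dict mapping tokens to log-probabilities from the
    model's next-token distribution at the final position (assumed
    already extracted).
    decision_tokens: valid decision tokens for the game family, with
    the affirmative token listed first (e.g., ["accept", "reject"]).

    Returns (p_affirmative, decision_mass). decision_mass is the total
    probability on decision tokens; when it is low, the normalized
    probability is unreliable (hence the paper's 80% mass filter).
    """
    probs = {t: math.exp(token_logprobs[t]) for t in decision_tokens}
    mass = sum(probs.values())
    p_affirmative = probs[decision_tokens[0]] / mass  # Equation (1)
    return p_affirmative, mass

# Example: the model puts 60% on "accept", 30% on "reject", 10% elsewhere.
logprobs = {"accept": math.log(0.6), "reject": math.log(0.3)}
p, mass = normalize_decision_probs(logprobs, ["accept", "reject"])
# p = 0.6 / 0.9 ≈ 0.667; mass = 0.9, above the 80% filter threshold.
```

The same helper covers the ternary negotiation case by passing three decision tokens; only the denominator changes.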
The standard format presents this directly as a text completion; the chat template format additionally wraps the prompt in the formatting tokens expected by aligned models (e.g., <|im_start|>, [INST]), structuring the input into system, user, and assistant roles.

The four variants cross model type with format: Base (native) uses standard format; Aligned (native) uses the model's chat template; Base (chat) applies the aligned partner's chat template to the base model; and Aligned (plain) uses standard format without chat template. Our main comparison pairs each model in its native format—base with standard, aligned with chat template—reflecting the most natural deployment condition. The two additional variants serve as controls: Base (chat) tests whether applying the aligned model's chat template to its base counterpart can recover any aligned-model advantage, while Aligned (plain) tests aligned models in a format they were not optimized for.

To test whether the base advantage depends on prompt wording, we evaluate 14 additional formulations spanning framing, persona, format, and structure modifications (see Appendix B). Results are reported in Section 4.

3.4 Boundary Condition Datasets

We additionally evaluate on two datasets chosen to test the limits of the base advantage.

One-shot 2×2 matrix games. We use a dataset of 2,416 procedurally generated one-shot 2×2 matrix games from Zhu et al. (2025), spanning 12 game topologies with approximately 93,000 aggregated human decisions. Unlike our repeated matrix games, these are single-round decisions over well-studied game structures that are abundantly represented in LLM training data. We present games in counterbalanced format (swapping row labels to control for position bias). After filtering, 71 valid pairs remain.

Binary lottery choices.
We use the dataset of Marantz and Plonsky (2025), comprising 1,001 binary lottery choice problems in which each of 28–31 participants chooses between two gambles specified by their outcomes and probabilities (e.g., "$10 with 60% or $2 otherwise" vs. "$7 with 80% or $1 otherwise"). We present these using verbal descriptions of each lottery. After filtering, 90 valid same-provider pairs remain. These are non-strategic decisions—there is no opponent or interaction—allowing us to test whether the base advantage is specific to strategic reasoning or extends to individual decision-making under risk.

3.5 Evaluation

Primary metric. We use Pearson correlation between the model's predicted probability (p_accept) and the ground-truth human behavior as our primary evaluation metric. In the four main game families (bargaining, persuasion, negotiation, repeated matrix games), each decision point has a unique dialogue history, so the correlation is computed at the level of individual decisions (coded as 1 for accept/yes/cooperate, 0 for reject/no/defect; in negotiation, both reject and DealWithJohn are coded as 0). In the boundary condition datasets (one-shot 2×2 games and lottery choices), the same problem is presented to multiple participants, yielding an empirical choice probability per problem; here, we correlate the model's predicted probability with this aggregate human choice rate. This reflects the data structure: multi-round games produce unique trajectories, while one-shot problems are repeated across participants.

Pairwise comparison. For each base–aligned pair in a given game family, we compare the base model's Pearson correlation against the aligned model's Pearson correlation and record a "base win" or "aligned win." We then aggregate win counts across all valid pairs.

Statistical tests. We employ two complementary tests.
A one-sided binomial test evaluates whether the observed majority (base or aligned) wins significantly more than 50% of comparisons under the null hypothesis of equal performance; the test is always applied in the direction of the observed winner. As a complementary test that accounts for effect magnitudes, we also report the one-sided Wilcoxon signed-rank test on the Pearson correlation differences. All p-values reported in the text are binomial unless otherwise noted.

4 Results

Figure 1 visualizes the head-to-head comparison under our main pairing: each model in its native format (standard prompt for base, chat template for aligned). Base models win 213 of 235 valid comparisons across the four game families (9.7:1), with the advantage individually significant in every family (p < 10^-6).

The advantage is consistent across all 23 model families. Among the seven largest, base wins the majority in every family: Qwen 82:15, Gemma 28:2, Falcon 21:6, Llama 17:0, OLMo 16:3, DeepSeek 8:4, and SmolLM 5:3. Even the families closest to parity never show a consistent aligned-model advantage across game types. Full per-pair results for all six datasets are reported in Appendix D.

Ruling out prompt-format confounds. A natural objection is that base models benefit from plain-text format while aligned models are hampered by their chat template. Two controls rule this out: when both models receive identical plain-text prompts, base models still win 5.0:1 (p < 10^-34); when both receive the aligned model's chat template—a format the base model was never trained on—base models still win 5.3:1. The advantage resides in the model weights, not in the prompt format.

Prompt formulation robustness. We evaluate 14 prompt formulations organized into four clusters—framing (3 variants modifying task description), persona (5 variants assigning behavioral roles), format (3 variants stripping structured formatting), and structure (2 variants altering prompt organization)—plus the baseline. Of these, 10 produce sufficient data for evaluation; the natural language and simplified format variants yield catastrophically low decision token mass for base models, indicating reliance on structured formatting. Across the 10 testable variants and two GLEE game families (bargaining and negotiation), base models win 959 of 1,003 comparisons (95.6%, p < 10^-200), with every variant individually reaching p < 0.01 (Table 1).

Table 1: Base (B) vs. aligned (A) win counts by prompt variant. Each variant pairs base predictions against the matching aligned chat variant. p: one-sided binomial test. Persuasion yields no valid pairs for non-standard variants.

    Cluster     Variant              B    A    p
    Baseline    Standard           101    5    1.3 × 10^-24
    Framing     Predict human      105    3    6.5 × 10^-28
                Observer            92    5    4.3 × 10^-22
                Reversed roles      99    2    2.1 × 10^-27
    Persona     Naive              104    5    1.9 × 10^-25
                Expert             101    5    1.3 × 10^-24
                Fairness            98    4    8.7 × 10^-25
                Selfish            100    6    2.2 × 10^-23
                Emotional          102    4    6.4 × 10^-26
    Structure   Preamble reversed   57    5    1.5 × 10^-12
    Overall                        959   44    < 10^-200

Framing, persona, and baseline variants all yield 92–97% base win rates; even "selfish" (94.3%) or "observer" (94.8%) variants do not close the gap. Base models require structured formatting to produce valid decision tokens, but given such structure, the advantage is robust.

Game configuration robustness. Each GLEE family is parameterized along multiple dimensions (6 for bargaining, 6 for persuasion, 6 for negotiation; see Appendix E for the full parameter space and per-value win counts). The base advantage holds across every parameter value in every family.
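The pairwise protocol behind these win counts is simple: for each pair, compare the two models' Pearson correlations with human behavior, record a win, and test the resulting count with a one-sided exact binomial test. A minimal sketch under our own helper names (the exact-tail computation matches the test described in Section 3.5):

```python
from math import comb

def pearson(xs, ys):
    """Pearson correlation between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5

def one_sided_binomial_p(wins, n):
    """P(X >= wins) for X ~ Binomial(n, 1/2): the chance of a split at
    least this lopsided if base and aligned predicted equally well."""
    return sum(comb(n, k) for k in range(wins, n + 1)) / 2 ** n

# Toy pair: one human decision vector, two models' predicted p_accept.
human = [1, 0, 1, 1, 0]
base_pred = [0.9, 0.2, 0.8, 0.7, 0.1]
aligned_pred = [0.6, 0.7, 0.5, 0.6, 0.4]
base_win = pearson(base_pred, human) > pearson(aligned_pred, human)

# The paper's headline multi-round count: base wins 213 of 235.
p_value = one_sided_binomial_p(213, 235)
```

Python's exact big-integer arithmetic makes the tail probability reliable even at extremes like 213 of 235, where a normal approximation would be the usual shortcut.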
In persuasion, the advantage is notably stronger when the seller knows product quality (14.5:1) than when uninformed (2.3:1), suggesting base models better capture strategic information use. The sole exception is bargaining with discount factor δ1 = 0.8, where the advantage narrows to near parity (10:7, p = 0.31).

Figure 2: Median Pearson correlation difference (base minus aligned) by model size, with 95% bootstrap confidence intervals (5,000 resamples, percentile method). The base advantage is positive across all size bins and grows with scale.

Round-by-round dynamics. In round 1—before any multi-round dynamics develop—aligned models actually win in bargaining (61:32), negotiation (39:33), and persuasion (30:23). The advantage reverses from round 2 onward (bargaining: 82:4, negotiation: 56:1, persuasion: 31:8).³ This within-game transition mirrors the between-dataset contrast with one-shot games (Section 5), suggesting that the accumulation of history-dependent dynamics—not the game structure itself—drives the base advantage.

Size scaling. If the base advantage reflects richer pre-training representations that alignment shifts, it should grow with model scale. Figure 2 confirms this: bargaining shows the clearest trend, from +0.22 at <3B to +0.36 at ≥14B; negotiation rises from +0.35 to +0.43; matrix games grow from +0.04 to +0.11.

³ Per-round analysis for repeated matrix games is not meaningful because round 1 contains only two unique decision contexts (one PD, one BoS), yielding insufficient variation for correlation.

5 Boundary Conditions

The round-by-round analysis (Section 4) offers a clue about the limits of the base advantage: aligned models win at round 1, before interaction history develops, then lose as history accumulates. If the absence of multi-round history is what enables the aligned-model advantage, then it should reappear in settings that are inherently one-shot. We test this with two boundary conditions: one-shot matrix games (same strategic structure, no repeated interaction) and non-strategic lotteries (no opponent, no interaction). In both cases, the advantage reverses.

We evaluate 71 same-provider pairs on the one-shot 2×2 matrix game benchmark of Zhu et al. (2025), comprising 2,416 procedurally generated games with approximately 93,000 aggregated human decisions spanning 12 game types. The results reverse: aligned models win 57 comparisons to base models' 14 (4.1:1 aligned-model advantage, p < 10^-6). The aligned-model advantage is universal across all 12 types (see Table 2).

Table 2: Base vs. aligned wins on one-shot 2×2 games by game type (N = 71 pairs). p: one-sided binomial test.

    Game Type    Base  Al.  Ratio      p (binomial)
    harmony        12   59  4.9:1 Al.  6.7 × 10^-9
    concord         9   62  6.9:1 Al.  3.7 × 10^-11
    peace          14   57  4.1:1 Al.  1.3 × 10^-7
    safecoord      15   56  3.7:1 Al.  5.2 × 10^-7
    assurance      15   56  3.7:1 Al.  5.2 × 10^-7
    dilemma        14   57  4.1:1 Al.  1.3 × 10^-7
    deadlock       16   55  3.4:1 Al.  1.9 × 10^-6
    chicken        23   48  2.1:1 Al.  2.0 × 10^-3
    staghunt       11   60  5.5:1 Al.  1.3 × 10^-9
    hero           23   48  2.1:1 Al.  2.0 × 10^-3
    leader         20   51  2.5:1 Al.  1.5 × 10^-4
    compromise     19   52  2.7:1 Al.  5.6 × 10^-5
    Per-pair       14   57  4.1:1 Al.  1.3 × 10^-7

The contrast with repeated games is notable: when the same strategic structures are played over 10 rounds, base models win 6.2:1 (81:13). Two non-exclusive factors likely contribute: one-shot games are canonical objects abundantly represented in training corpora, where alignment may reinforce textbook-correct response patterns; and humans in isolated one-shot decisions may themselves behave closer to normative predictions.

We can quantify the normative alignment directly. For each one-shot game, we compute the mixed-strategy Nash equilibrium probability.
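For a generic 2×2 game, the fully mixed equilibrium follows from each player's indifference condition: each player mixes so that the opponent is indifferent between their two actions. A sketch with our own helper name, assuming the game has a fully mixed equilibrium (i.e., the denominators below are nonzero):

```python
def mixed_ne_2x2(A, B):
    """Fully mixed Nash equilibrium of a 2x2 game.

    A[i][j]: row player's payoff when row plays i and column plays j.
    B[i][j]: column player's payoff for the same action profile.

    Returns (p, q): p is the probability the row player assigns to
    action 0, chosen to make the column player indifferent between
    their columns; q is the probability the column player assigns to
    action 0, making the row player indifferent between their rows.
    Assumes both denominators are nonzero (fully mixed equilibrium).
    """
    p = (B[1][1] - B[1][0]) / (B[0][0] - B[1][0] - B[0][1] + B[1][1])
    q = (A[1][1] - A[0][1]) / (A[0][0] - A[0][1] - A[1][0] + A[1][1])
    return p, q

# Battle of the Sexes: each player prefers coordinating on their own
# favorite. In the mixed equilibrium the row player picks their
# favorite (action 0) with probability 2/3.
A = [[2, 0], [0, 1]]  # row player's payoffs
B = [[1, 0], [0, 2]]  # column player's payoffs
p, q = mixed_ne_2x2(A, B)
# p = 2/3, q = 1/3
```

Games with a dominant strategy have no fully mixed equilibrium, so a production version would handle the degenerate denominators; the sketch covers only the indifference case.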
⁴Human aggregate choices correlate with NE predictions (r = 0.616), suggesting that behavior in these simple games is reasonably well-described by equilibrium theory. Aligned models are systematically more NE-aligned than base models (mean r = 0.41 vs. 0.28; aligned closer in 59 of 76 filtered pairs, p < 10^-6). This is consistent with alignment shifting predictions toward normative patterns—a shift that helps in settings where human behavior happens to follow such patterns.

⁴ For games with multiple pure equilibria (33% of the dataset), we use the unique mixed-strategy NE, which provides a single prediction per game without requiring an equilibrium selection assumption. At the population level, if different participants coordinate on different pure equilibria, aggregate choice frequencies converge toward the mixed NE prediction.

We also evaluate on the lottery dataset of Marantz and Plonsky (2025), comprising 1,001 binary choice problems with no strategic interaction. Among 90 same-provider pairs, aligned models win 62:28 (2.2:1, p = 2.19 × 10^-4). Alignment helps with individual, non-interactive decisions, where understanding the decision structure and following instructions aligns with the prediction task.

6 Discussion and Conclusion

The selective nature of the aligned-model advantage rules out the most natural alternative explanation: if alignment merely degraded general capabilities (catastrophic forgetting), aligned models would underperform uniformly rather than winning selectively on one-shot games and lotteries. The relevant knowledge is preserved; alignment shifts which behavioral patterns the model expresses, not whether it can express them at all.

The distributional narrowing documented in Section 2.2 offers a precise account. KL-regularized reward maximization yields an optimal policy π*(x) ∝ π₀(x) exp(r(x)/β)—an exponential tilt of the base distribution that concentrates mass on high-reward (annotator-approved) behavioral modes at the expense of the tails (Korbak et al., 2022). Xiao et al. (2025) showed that this concentration is not a side effect but a structural property: standard RLHF exhibits an inherent bias toward dominant preferences ("preference collapse"), and preserving the full preference distribution would require an entropy-based regularizer that current methods lack. Our results provide the first behavioral evidence for this theoretical prediction—the collapse is not merely measurable in generation diversity (Kirk et al., 2024) but in predictive fidelity for human decisions. The tails that reward tilting suppresses are precisely where multi-round strategic behavior lives: reciprocity, retaliation, and reputation dynamics that annotators would not endorse but that humans routinely exhibit.

These findings carry practical implications in both directions. For multi-round interactive settings, base models should be preferred; for one-shot games or non-strategic tasks, aligned models remain appropriate. More broadly, alignment systematically narrows the behavioral distribution that pre-trained models encode, and any application that relies on LLMs to represent how people actually behave faces the same risk. Researchers simulating voter behavior, consumer choices, or social media dynamics with aligned models may obtain results that reflect idealized rather than actual human behavior. The growing use of LLMs as simulated participants in social science (Filippas et al., 2024; Aher et al.
, 2023) makes this an active methodological risk: studies reporting that "LLMs replicate human behavior" may in fact be reporting that LLMs replicate normative behavior, with the gap invisible where norms and behavior coincide.

Several open questions follow naturally. Which aspects of multi-round play drive the base advantage: opponent modeling, history integration, or trajectory novelty? Extending to continuous negotiations, auctions, or coalition formation would test generality. From an alignment perspective, developing methods that preserve empirical behavioral distributions while adding helpfulness is a natural direction. Finally, testing whether the effect persists at extreme scale would clarify whether the normative shift is inherent to alignment or diminishes as models grow more capable.

The normative–descriptive trade-off documented here may be inherent to current alignment methods: optimizing for a single reward model that encodes annotator preferences cannot simultaneously preserve the full distribution of human behavior. Until alignment methods are developed that can add helpfulness without collapsing behavioral diversity, the choice of base versus aligned model is not merely a formatting decision but a substantive modeling assumption, one that determines whether an LLM serves as a model of human behavior or a model for human use.

Limitations

First, the GLEE multi-round game data comes from human participants playing against LLM opponents (Shapira et al., 2024b), not other humans. However, participants were not informed that their opponent was an LLM (GLEE presented the other player by name), and matrix game participants (Akata et al.
, 2025) were told they might face either a human or an artificial agent, so human decisions were made without certain knowledge of the opponent's nature, mitigating concerns about altered behavior. Second, our analysis is restricted to binary or ternary decisions; whether the findings extend to continuous action spaces remains open. Third, all 120 pairs are open-weight; we cannot evaluate closed-source models for which base versions are unavailable, though consistent trends from 1B to 70B+ suggest the effect may generalize. Fourth, the one-shot boundary condition uses a different dataset (Zhu et al., 2025) than the repeated games; the round-1 aligned-model advantage within multi-round games provides convergent evidence from the same data. Finally, we cannot rule out all alternative mechanisms, though the aligned-model advantage on one-shot games (Section 5) argues against catastrophic forgetting as the primary explanation.

Acknowledgments

Eilam Shapira is supported by a Google PhD Fellowship. Roi Reichart has been partially supported by a VATAT grant on data science. We thank Maya Zadok, Alan Arazi, and Nitay Calderon for valuable feedback on earlier drafts.

References

Gati Aher, Rosa I. Arriaga, and Adam Tauman Kalai. 2023. Using large language models to simulate multiple humans and replicate human subject studies. In Proceedings of the 40th International Conference on Machine Learning, ICML'23. JMLR.org.

Elif Akata, Lion Schulz, Julian Coda-Forno, Seong Joon Oh, Matthias Bethge, and Eric Schulz. 2025. Playing repeated games with large language models. Nature Human Behaviour, 9(7):1380–1390.

Lisa P. Argyle, Ethan C. Busby, Nancy Fulda, Joshua R. Gubler, Christopher Rytting, and David Wingate. 2023. Out of one, many: Using language models to simulate human samples. Political Analysis, 31(3):337–351.

Kevin Bauer, Lena Liebich, and Michael Kosfeld. 2025.
Can GPT mimic human preferences? An empirical and structural investigation. In Proceedings of the 33rd European Conference on Information Systems (ECIS 2025).

Marcel Binz and Eric Schulz. 2023. Using cognitive psychology to understand GPT-3. Proceedings of the National Academy of Sciences, 120(6):e2218523120.

Colin F. Camerer and Teck-Hua Ho. 1999. Experience-weighted attraction learning in normal form games. Econometrica, 67(4):827–874.

Colin F. Camerer, Teck-Hua Ho, and Juin-Kuan Chong. 2004. A cognitive hierarchy model of games. The Quarterly Journal of Economics, 119(3):861–898.

Steven Cao, Gregory Valiant, and Percy Liang. 2025. On the entropy calibration of language models. In Advances in Neural Information Processing Systems 38.

Valerio Capraro, Roberto Di Paolo, and Veronica Pizziol. 2025. A publicly available benchmark for assessing large language models' ability to predict how humans balance self-interest and the interest of others. Scientific Reports, 15:21428.

Vincent P. Crawford and Joel Sobel. 1982. Strategic information transmission. Econometrica, 50(6):1431–1451.

Apostolos Filippas, John J. Horton, and Benjamin S. Manning. 2024. Large language models as simulated economic agents: What can we learn from Homo Silicus? In Proceedings of the 25th ACM Conference on Economics and Computation, EC 2024, pages 614–615. ACM.

Anthony GX-Chen, Jatin Prakash, Jeff Guo, Rob Fergus, and Rajesh Ranganath. 2026. KL-regularized reinforcement learning is designed to mode collapse. In The Fourteenth International Conference on Learning Representations.

Luke Hewitt, Ashwini Ashokkumar, Isaias Ghezae, and Robb Willer. 2024. Predicting results of social science experiments using large language models. Working paper.

Itay Itzhak, Gabriel Stanovsky, Nir Rosenfeld, and Yonatan Belinkov. 2024.
Instructed to bias: Instruction-tuned language models exhibit emergent cognitive bias. Transactions of the Association for Computational Linguistics, 12:771–785.

Saurav Kadavath, Tom Conerly, Amanda Askell, Tom Henighan, Dawn Drain, Ethan Perez, Nicholas Schiefer, Zac Hatfield-Dodds, Nova DasSarma, Eli Tran-Johnson, Scott Johnston, Sheer El-Showk, Andy Jones, Nelson Elhage, Tristan Hume, Anna Chen, Yuntao Bai, Sam Bowman, Stanislav Fort, and 17 others. 2022. Language models (mostly) know what they know. arXiv preprint arXiv:2207.05221.

Robert Kirk, Ishita Mediratta, Christoforos Nalmpantis, Jelena Luketina, Eric Hambro, Edward Grefenstette, and Roberta Raileanu. 2024. Understanding the effects of RLHF on LLM generalisation and diversity. In The Twelfth International Conference on Learning Representations.

Tomasz Korbak, Ethan Perez, and Christopher Buckley. 2022. RL with KL penalties is better viewed as Bayesian inference. In Findings of the Association for Computational Linguistics: EMNLP 2022, pages 1083–1091. Association for Computational Linguistics.

Ryan Liu, Jiayi Geng, Joshua C. Peterson, Ilia Sucholutsky, and Thomas L. Griffiths. 2025. Large language models assume people are more rational than we really are. In Proceedings of the Thirteenth International Conference on Learning Representations.

Eyal Marantz and Ori Plonsky. 2025. Predicting human choice between textually described lotteries. In Proceedings of the 47th Annual Conference of the Cognitive Science Society.

Richard D. McKelvey and Thomas R. Palfrey. 1995. Quantal response equilibria for normal form games. Games and Economic Behavior, 10(1):6–38.

Richard D. McKelvey and Thomas R. Palfrey. 1998. Quantal response equilibria for extensive form games. Experimental Economics, 1(1):9–41.

Qiaozhu Mei, Yutong Xie, Walter Yuan, and Matthew O. Jackson. 2024.
A Turing test of whether AI chatbots are behaviorally similar to humans. Proceedings of the National Academy of Sciences, 121(9):e2313925121.

Prateek Munjal, Clement Christophe, Ronnie Rajan, and Praveenkumar Kanithi. 2026. Do instruction-tuned models always perform better than base models? Evidence from math and domain-shifted benchmarks. arXiv preprint arXiv:2601.13244.

Rosemarie Nagel. 1995. Unraveling in guessing games: An experimental study. American Economic Review, 85(5):1313–1326.

Long Ouyang, Jeffrey Wu, Xu Jiang, Diogo Almeida, Carroll L. Wainwright, Pamela Mishkin, Chong Zhang, Sandhini Agarwal, Katarina Slama, Alex Ray, John Schulman, Jacob Hilton, Fraser Kelton, Luke Miller, Maddie Simens, Amanda Askell, Peter Welinder, Paul F. Christiano, Jan Leike, and Ryan Lowe. 2022. Training language models to follow instructions with human feedback. In Advances in Neural Information Processing Systems, volume 35, pages 27730–27744.

Rafael Rafailov, Archit Sharma, Eric Mitchell, Christopher D. Manning, Stefano Ermon, and Chelsea Finn. 2023. Direct preference optimization: Your language model is secretly a reward model. In Advances in Neural Information Processing Systems, volume 36, pages 53728–53741.

Ariel Rubinstein. 1982. Perfect equilibrium in a bargaining model. Econometrica, 50(1):97–109.

Shibani Santurkar, Esin Durmus, Faisal Ladhak, Cinoo Lee, Percy Liang, and Tatsunori Hashimoto. 2023. Whose opinions do language models reflect? In Proceedings of the 40th International Conference on Machine Learning, volume 202 of Proceedings of Machine Learning Research, pages 29971–30004. PMLR.

Eilam Shapira, Omer Madmon, Reut Apel, Moshe Tennenholtz, and Roi Reichart. 2025. Human choice prediction in language-based persuasion games: Simulation-based off-policy evaluation.
Transactions of the Association for Computational Linguistics, 13:980–1006.

Eilam Shapira, Omer Madmon, Roi Reichart, and Moshe Tennenholtz. 2024a. Can LLMs replace economic choice prediction labs? The case of language-based persuasion games. arXiv preprint arXiv:2401.17435.

Eilam Shapira, Omer Madmon, Itamar Reinman, Samuel Joseph Amouyal, Roi Reichart, and Moshe Tennenholtz. 2024b. GLEE: A unified framework and benchmark for language-based economic environments. arXiv preprint arXiv:2410.05254.

Eilam Shapira, Moshe Tennenholtz, and Roi Reichart. 2026. The poisoned apple effect: Strategic manipulation of mediated markets via technology expansion of AI agents. arXiv preprint arXiv:2601.11496.

Dale O. Stahl and Paul W. Wilson. 1995. On players' models of other players: Theory and experimental evidence. Games and Economic Behavior, 10(1):218–254.

Joseph Suh, Erfan Jahanparast, Suhong Moon, Minwoo Kang, and Serina Chang. 2025. Language model fine-tuning on scaled survey data for predicting distributions of public opinions. In Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 21147–21170. Association for Computational Linguistics.

Lindia Tjuatja, Valerie Chen, Tongshuang Wu, Ameet Talwalkar, and Graham Neubig. 2024. Do LLMs exhibit human-like response biases? A case study in survey design. Transactions of the Association for Computational Linguistics, 12:1011–1026.

Jiancong Xiao, Ziniu Li, Xingyu Xie, Emily Getzen, Cong Fang, Qi Long, and Weijie J. Su. 2025. On the algorithmic bias of aligning large language models with RLHF: Preference collapse and matching regularization. Journal of the American Statistical Association, 120(552):2154–2164.

Chiwei Zhu, Benfeng Xu, Quan Wang, Yongdong Zhang, and Zhendong Mao. 2023. On the calibration of large language models and alignment.
In Findings of the Association for Computational Linguistics: EMNLP 2023, pages 9778–9795. Association for Computational Linguistics.

Jian-Qiao Zhu, Joshua C. Peterson, Benjamin Enke, and Thomas L. Griffiths. 2025. Capturing the complexity of human strategic decision-making with machine learning. Nature Human Behaviour, 9:2114–2120.

A Full Model Inventory

This appendix supplements the model description in Section 1. Table 3 lists all 120 same-provider base–aligned model pairs used in our experiments, grouped by family and sorted by parameter count. Same-provider means both models are released by the same organization on HuggingFace. We exclude pairs where the base and aligned checkpoints are identical, and pairs where the aligned model lacks a chat template (required for the native-format comparison).

Table 3: All base–aligned pairs, grouped alphabetically by family and sorted by parameter count within each family.

#    Base Model                      Aligned Model                     Size

CodeGemma (1 pair)
1    codegemma-7b                    codegemma-7b-it                   7B

CodeLlama (6 pairs)
2    CodeLlama-7b-hf                 CodeLlama-7b-Instruct-hf          7B
3    CodeLlama-7b-Python-hf          CodeLlama-7b-Instruct-hf          7B
4    CodeLlama-13b-hf                CodeLlama-13b-Instruct-hf         13B
5    CodeLlama-13b-Python-hf         CodeLlama-13b-Instruct-hf         13B
6    CodeLlama-34b-hf                CodeLlama-34b-Instruct-hf         34B
7    CodeLlama-34b-Python-hf         CodeLlama-34b-Instruct-hf         34B

DeepSeek (6 pairs)
8    deepseek-coder-1.3b-base        deepseek-coder-1.3b-instruct      1.3B
9    deepseek-coder-6.7b-base        deepseek-coder-6.7b-instruct      6.7B
10   deepseek-llm-7b-base            deepseek-llm-7b-chat              7B
11   deepseek-math-7b-base           deepseek-math-7b-instruct         7B
12   deepseek-coder-33b-base         deepseek-coder-33b-instruct       33B
13   deepseek-llm-67b-base           deepseek-llm-67b-chat             67B

Falcon (11 pairs)
14   Falcon-H1-0.5B-Base             Falcon-H1-0.5B-Instruct           0.5B
15   Falcon3-1B-Base                 Falcon3-1B-Instruct               1B
16   Falcon-H1-1.5B-Base             Falcon-H1-1.5B-Instruct           1.5B
17   Falcon3-3B-Base                 Falcon3-3B-Instruct               3B
18   Falcon-H1-3B-Base               Falcon-H1-3B-Instruct             3B
19   falcon-mamba-7b                 falcon-mamba-7b-instruct          7B
20   Falcon3-7B-Base                 Falcon3-7B-Instruct               7B
21   Falcon3-Mamba-7B-Base           Falcon3-Mamba-7B-Instruct         7B
22   Falcon-H1-7B-Base               Falcon-H1-7B-Instruct             7B
23   Falcon3-10B-Base                Falcon3-10B-Instruct              10B
24   Falcon-H1-34B-Base              Falcon-H1-34B-Instruct            34B

Gemma (11 pairs)
25   gemma-3-1b-pt                   gemma-3-1b-it                     1B
26   gemma-2-2b                      gemma-2-2b-it                     2B
27   gemma-2b                        gemma-2b-it                       2B
28   recurrentgemma-2b               recurrentgemma-2b-it              2B
29   gemma-3-4b-pt                   gemma-3-4b-it                     4B
30   gemma-7b                        gemma-7b-it                       7B
31   gemma-2-9b                      gemma-2-9b-it                     9B
32   recurrentgemma-9b               recurrentgemma-9b-it              9B
33   gemma-3-12b-pt                  gemma-3-12b-it                    12B
34   gemma-2-27b                     gemma-2-27b-it                    27B
35   gemma-3-27b-pt                  gemma-3-27b-it                    27B

Granite (5 pairs)
36   granite-3.0-2b-base             granite-3.0-2b-instruct           2B
37   granite-3.1-2b-base             granite-3.1-2b-instruct           2B
38   granite-3.0-8b-base             granite-3.0-8b-instruct           8B
39   granite-3.1-8b-base             granite-3.1-8b-instruct           8B
40   granite-20b-code-base           granite-20b-code-instruct         20B

H2O (3 pairs)
41   h2o-danube3-500m-base           h2o-danube3-500m-chat             0.5B
42   h2o-danube2-1.8b-base           h2o-danube2-1.8b-chat             1.8B
43   h2o-danube3-4b-base             h2o-danube3-4b-chat               4B

Llama (8 pairs)
44   Llama-3.2-1B                    Llama-3.2-1B-Instruct             1B
45   Llama-3.2-3B                    Llama-3.2-3B-Instruct             3B
46   Llama-2-7b-hf                   Llama-2-7b-chat-hf                7B
47   Meta-Llama-3.1-8B               Meta-Llama-3.1-8B-Instruct        8B
48   Meta-Llama-3-8B                 Meta-Llama-3-8B-Instruct          8B
49   Llama-2-13b-hf                  Llama-2-13b-chat-hf               13B
50   Meta-Llama-3-70B                Meta-Llama-3-70B-Instruct         70B
51   Meta-Llama-3.1-70B              Meta-Llama-3.1-70B-Instruct       70B

MAP-Neo (1 pair)
52   neo_7b                          neo_7b_instruct_v0.1              7B

MiMo (1 pair)
53   MiMo-7B-Base                    MiMo-7B-RL                        7B

Mistral (5 pairs)
54   Mistral-7B-v0.3                 Mistral-7B-Instruct-v0.3          7B
55   Mistral-7B-v0.1                 Mistral-7B-Instruct-v0.2          7B
56   Mistral-7B-v0.1                 Mistral-7B-Instruct-v0.1          7B
57   Mistral-Nemo-Base-2407          Mistral-Nemo-Instruct-2407        12B
58   Mistral-Small-24B-Base-2501     Mistral-Small-24B-Instruct-2501   24B
Nemotron (1 pair)
59   Minitron-4B-Base                Nemotron-Mini-4B-Instruct         4B

OLMo (8 pairs)
60   OLMo-2-0425-1B                  OLMo-2-0425-1B-Instruct           1B
61   OLMo-7B-hf                      OLMo-7B-Instruct-hf               7B
62   OLMo-2-1124-7B                  OLMo-2-1124-7B-Instruct           7B
63   Olmo-3-1025-7B                  Olmo-3-7B-Instruct                7B
64   OLMoE-1B-7B-0125                OLMoE-1B-7B-0125-Instruct         7B
65   OLMo-2-1124-13B                 OLMo-2-1124-13B-Instruct          13B
66   OLMo-2-0325-32B                 OLMo-2-0325-32B-Instruct          32B
67   Olmo-3-1125-32B                 Olmo-3.1-32B-Instruct             32B

Qwen (32 pairs)
68   Qwen2.5-0.5B                    Qwen2.5-0.5B-Instruct             0.5B
69   Qwen2-0.5B                      Qwen2-0.5B-Instruct               0.5B
70   Qwen1.5-0.5B                    Qwen1.5-0.5B-Chat                 0.5B
71   Qwen2.5-Coder-0.5B              Qwen2.5-Coder-0.5B-Instruct       0.5B
72   Qwen3-0.6B-Base                 Qwen3-0.6B                        0.6B
73   Qwen2.5-1.5B                    Qwen2.5-1.5B-Instruct             1.5B
74   Qwen2-1.5B                      Qwen2-1.5B-Instruct               1.5B
75   Qwen2.5-Math-1.5B               Qwen2.5-Math-1.5B-Instruct        1.5B
76   Qwen2.5-Coder-1.5B              Qwen2.5-Coder-1.5B-Instruct       1.5B
77   Qwen3-1.7B-Base                 Qwen3-1.7B                        1.7B
78   Qwen1.5-1.8B                    Qwen1.5-1.8B-Chat                 1.8B
79   Qwen2.5-3B                      Qwen2.5-3B-Instruct               3B
80   Qwen2.5-Coder-3B                Qwen2.5-Coder-3B-Instruct         3B
81   Qwen1.5-4B                      Qwen1.5-4B-Chat                   4B
82   Qwen3-4B-Base                   Qwen3-4B                          4B
83   Qwen3-4B-Base                   Qwen3-4B-Instruct-2507            4B
84   Qwen2.5-7B                      Qwen2.5-7B-Instruct               7B
85   Qwen2-7B                        Qwen2-7B-Instruct                 7B
86   Qwen1.5-7B                      Qwen1.5-7B-Chat                   7B
87   Qwen2.5-Math-7B                 Qwen2.5-Math-7B-Instruct          7B
88   Qwen2.5-Coder-7B                Qwen2.5-Coder-7B-Instruct         7B
89   Qwen3-8B-Base                   Qwen3-8B                          8B
90   Qwen2.5-14B                     Qwen2.5-14B-Instruct              14B
91   Qwen1.5-14B                     Qwen1.5-14B-Chat                  14B
92   Qwen1.5-MoE-A2.7B               Qwen1.5-MoE-A2.7B-Chat            14B
93   Qwen3-14B-Base                  Qwen3-14B                         14B
94   Qwen2.5-Coder-14B               Qwen2.5-Coder-14B-Instruct        14B
95   Qwen2.5-32B                     Qwen2.5-32B-Instruct              32B
96   Qwen1.5-32B                     Qwen1.5-32B-Chat                  32B
97   Qwen2.5-Coder-32B               Qwen2.5-Coder-32B-Instruct        32B
98   Qwen2-72B                       Qwen2-72B-Instruct                72B
99   Qwen2.5-72B                     Qwen2.5-72B-Instruct              72B

Sailor (2 pairs)
100  Sailor-4B                       Sailor-4B-Chat                    4B
101  Sailor-7B                       Sailor-7B-Chat                    7B

SeaLLM (1 pair)
102  SeaLLM-7B-v2                    SeaLLM-7B-v2.5                    7B

Seed-Coder (1 pair)
103  Seed-Coder-8B-Base              Seed-Coder-8B-Instruct            8B

SmolLM (6 pairs)
104  SmolLM-135M                     SmolLM-135M-Instruct              0.1B
105  SmolLM2-135M                    SmolLM2-135M-Instruct             0.1B
106  SmolLM-360M                     SmolLM-360M-Instruct              0.4B
107  SmolLM2-360M                    SmolLM2-360M-Instruct             0.4B
108  SmolLM-1.7B                     SmolLM-1.7B-Instruct              1.7B
109  SmolLM2-1.7B                    SmolLM2-1.7B-Instruct             1.7B

Solar (1 pair)
110  SOLAR-10.7B-v1.0                SOLAR-10.7B-Instruct-v1.0         10.7B

StableLM (3 pairs)
111  stablelm-2-1_6b                 stablelm-2-1_6b-chat              1.6B
112  stablelm-3b-4e1t                stablelm-zephyr-3b                3B
113  stablelm-2-12b                  stablelm-2-12b-chat               12B

TinyLlama (1 pair)
114  TinyLlama-1.1B-intermediate-step-1431k-3T  TinyLlama-1.1B-Chat-v1.0  1.1B

Yi (5 pairs)
115  Yi-1.5-6B                       Yi-1.5-6B-Chat                    6B
116  Yi-6B                           Yi-6B-Chat                        6B
117  Yi-1.5-9B                       Yi-1.5-9B-Chat                    9B
118  Yi-34B                          Yi-34B-Chat                       34B
119  Yi-1.5-34B                      Yi-1.5-34B-Chat                   34B

Zamba2 (1 pair)
120  Zamba2-1.2B                     Zamba2-1.2B-instruct              1.2B

Table 4: All 14 prompt variants tested. Each modifies the reference JSON completion format. "OK" = sufficient valid pairs after filtering.

Cluster     Variant            Modification                                 OK
Baseline    Standard           Reference format                             ✓
Framing     Predict human      Sys: "Predict what a participant decided"    ✓
            Observer           Sys: external observer                       ✓
            Reversed roles     Sys: offeror predicting receiver             ✓
Persona     Naive              Sys: "No prior experience"                   ✓
            Expert             Sys: "Behavioral econ researcher"            ✓
            Fairness           Sys: "Values fairness"                       ✓
            Selfish            Sys: "Maximize personal gain"                ✓
            Emotional          Sys: "Gut feeling"                           ✓
Format      Natural language   Suffix: "The decision is: "                  ✗
            Simplified         Suffix: "Answer: "                           ✗
            Minimal            Suffix: "I "                                 ✗
Structure   Numbers only       Drops dialogue history                       ✗
            Preamble rev.      Swaps accept/reject order                    ✓

B Prompt Content Variants

Section 4 examines 14 prompt variants organized into four clusters. Table 4 details each variant. The four failed variants demonstrate that base models critically depend on structured completion suffixes to concentrate probability mass on decision tokens.
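To make the mass criterion concrete, the following toy sketch shows how decision-token mass and a renormalized accept probability can be computed from next-token logits. The token names and logit values are illustrative assumptions, not values from the paper.

```python
import math

def softmax(logits):
    """Convert a dict of token -> logit into a dict of token -> probability."""
    m = max(logits.values())  # subtract max for numerical stability
    exps = {t: math.exp(v - m) for t, v in logits.items()}
    z = sum(exps.values())
    return {t: e / z for t, e in exps.items()}

def decision_token_mass(logits, decision_tokens):
    """Total next-token probability assigned to recognized decision tokens."""
    probs = softmax(logits)
    return sum(probs.get(t, 0.0) for t in decision_tokens)

def normalized_p_accept(logits, accept="accept", reject="reject"):
    """Probability of 'accept' after renormalizing over the two decision tokens."""
    probs = softmax(logits)
    pa, pr = probs.get(accept, 0.0), probs.get(reject, 0.0)
    return pa / (pa + pr)

# With a structured completion suffix, most mass sits on the decision tokens
# (toy logits over a tiny four-token vocabulary):
logits = {"accept": 4.0, "reject": 2.5, "the": -1.0, "I": -2.0}
mass = decision_token_mass(logits, {"accept", "reject"})
```

When an unstructured suffix spreads the logits over ordinary continuation tokens instead, the same computation yields a low mass, which is the failure mode the four excluded variants exhibit.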
The simplified and minimal variants, which replace the JSON pattern with unstructured suffixes, cause decision-token mass to drop from ~90% to below 10%. The natural language variant retains marginally higher mass, but too few pairs survive standard filtering (mass ≥ 0.8). The numbers only variant, which strips dialogue history, similarly yields too few valid pairs (N ≤ 5).

C Filtering Criteria and Sensitivity

This appendix details the two pair-level filters summarized in Section 3.2 and demonstrates that the base advantage is robust to the choice of filtering thresholds.

Mass filter. For each model and game family, we compute the average probability mass on decision tokens across all decision points: the sum of softmax probabilities assigned to recognized decision tokens (e.g., "accept" and "reject"). If either model in a pair falls below an average mass of 0.8 on a given family, both models are excluded from that family. Models below this threshold do not reliably produce decision-relevant tokens, making their normalized probabilities unreliable.

Minimum correlation filter. For each model pair and game family, we compute the Pearson correlation between each model's p_accept predictions and the actual binary human decisions. If both models in the pair fall below a Pearson correlation of 0.3, the pair is excluded from that family. If at least one model exceeds the threshold, both are retained. This removes uninformative pairs where neither model predicts above a minimal threshold, ensuring that base-vs-aligned comparisons reflect genuine differences in predictive quality rather than noise from two equally poor models. Both filters are applied independently per game family, so a pair excluded from bargaining may still contribute to persuasion or negotiation analyses.

Sensitivity analysis.
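The two pair-level filters can be sketched in a few lines. Function names and the example values below are illustrative assumptions, not part of the paper's released code.

```python
def pearson(xs, ys):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5

def keep_pair(mass_base, mass_aligned, corr_base, corr_aligned,
              min_mass=0.8, min_corr=0.3):
    """Apply both filters to one base-aligned pair on one game family."""
    # Mass filter: if either model's average decision-token mass is below
    # the threshold, the whole pair is dropped for this family.
    if mass_base < min_mass or mass_aligned < min_mass:
        return False
    # Correlation filter: keep the pair if at least one model's p_accept
    # correlates with the human decisions above the threshold.
    return corr_base >= min_corr or corr_aligned >= min_corr
```

Because the filters run independently per game family, the same pair may pass for persuasion while being excluded from bargaining.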
Tables 5–9 show that the base advantage is robust across all four game families and a wide range of mass and correlation threshold choices: in every cell of every table, base models win the majority of comparisons, and the advantage is significant at p < 0.05 in all but a handful of sparsely populated cells. The cell corresponding to our chosen thresholds (mass ≥ 0.8, correlation ≥ 0.3) is marked with an asterisk.

Table 5: Sensitivity analysis, Bargaining: base vs. aligned wins across mass (rows) and min-corr (columns) thresholds.

Mass \ Min-Corr   None               0.1                0.2               0.3               0.4              0.5
None              100:20 (<10^-14)   96:10 (<10^-19)    88:8 (<10^-18)    79:6 (<10^-17)    37:2 (<10^-9)    —
0.5               93:16 (<10^-14)    90:9 (<10^-18)     83:8 (<10^-17)    77:6 (<10^-17)    37:2 (<10^-9)    —
0.6               92:13 (<10^-16)    90:8 (<10^-19)     83:7 (<10^-18)    77:5 (<10^-18)    37:1 (<10^-10)   —
0.7               91:9 (<10^-18)     90:5 (<10^-21)     83:5 (<10^-19)    77:4 (<10^-19)    37:1 (<10^-10)   —
0.8               86:6 (<10^-19)     86:5 (<10^-20)     81:5 (<10^-19)    75:4 (<10^-18)*   36:1 (<10^-10)   —
0.9               84:4 (<10^-21)     84:4 (<10^-21)     79:4 (<10^-19)    73:4 (<10^-18)    34:1 (<10^-9)    —

Table 6: Sensitivity analysis, Persuasion: base vs. aligned wins across mass and min-corr thresholds.

Mass \ Min-Corr   None              0.1               0.2               0.3              0.4   0.5
None              87:33 (<10^-7)    80:32 (<10^-6)    52:29 (<10^-3)    46:7 (<10^-8)    —     —
0.5               61:31 (<10^-3)    58:31 (<10^-3)    40:28 (0.09)      34:6 (<10^-6)    —     —
0.6               58:31 (<10^-3)    56:31 (<10^-3)    39:28 (0.11)      33:6 (<10^-6)    —     —
0.7               44:22 (<10^-3)    44:22 (<10^-3)    38:21 (0.02)      32:4 (<10^-7)    —     —
0.8               35:17 (<10^-3)    35:17 (<10^-3)    35:17 (<10^-3)    32:4 (<10^-7)*   —     —
0.9               31:2 (<10^-8)     31:2 (<10^-8)     31:2 (<10^-8)     28:1 (<10^-8)    —     —

Table 7: Sensitivity analysis, Negotiation: base vs. aligned wins across mass and min-corr thresholds.

Mass \ Min-Corr   None              0.1               0.2               0.3              0.4          0.5
None              94:26 (<10^-10)   76:5 (<10^-17)    61:3 (<10^-15)    35:1 (<10^-10)   3:0 (0.13)   —
0.5               83:16 (<10^-12)   69:5 (<10^-16)    55:3 (<10^-13)    32:1 (<10^-9)    3:0 (0.13)   —
0.6               80:16 (<10^-11)   67:5 (<10^-15)    53:3 (<10^-13)    31:1 (<10^-9)    3:0 (0.13)   —
0.7               77:14 (<10^-12)   65:5 (<10^-14)    52:3 (<10^-13)    30:1 (<10^-8)    3:0 (0.13)   —
0.8               64:7 (<10^-13)    56:2 (<10^-15)    46:1 (<10^-13)    25:1 (<10^-7)*   3:0 (0.13)   —
0.9               36:1 (<10^-10)    33:1 (<10^-9)     27:1 (<10^-7)     16:1 (<10^-4)    3:0 (0.13)   —

Table 8: Sensitivity analysis, Matrix: base vs. aligned wins across mass and min-corr thresholds.

Mass \ Min-Corr   None              0.1               0.2               0.3               0.4              0.5
None              99:21 (<10^-13)   98:20 (<10^-14)   95:17 (<10^-14)   81:13 (<10^-13)   49:8 (<10^-8)    4:0 (0.06)
0.5               99:20 (<10^-14)   98:19 (<10^-14)   95:17 (<10^-14)   81:13 (<10^-13)   49:8 (<10^-8)    4:0 (0.06)
0.6               99:20 (<10^-14)   98:19 (<10^-14)   95:17 (<10^-14)   81:13 (<10^-13)   49:8 (<10^-8)    4:0 (0.06)
0.7               99:20 (<10^-14)   98:19 (<10^-14)   95:17 (<10^-14)   81:13 (<10^-13)   49:8 (<10^-8)    4:0 (0.06)
0.8               99:20 (<10^-14)   98:19 (<10^-14)   95:17 (<10^-14)   81:13 (<10^-13)*  49:8 (<10^-8)    4:0 (0.06)
0.9               98:18 (<10^-15)   97:17 (<10^-15)   94:15 (<10^-15)   80:11 (<10^-14)   48:6 (<10^-9)    4:0 (0.06)

Table 9: Sensitivity analysis, Overall (all 4 families): base vs. aligned wins across mass and min-corr thresholds.

Mass \ Min-Corr   None                0.1                 0.2                 0.3                 0.4               0.5
None              380:100 (<10^-40)   350:67 (<10^-47)    296:57 (<10^-40)    241:27 (<10^-44)    89:10 (<10^-17)   4:0 (0.06)
0.5               336:83 (<10^-37)    315:64 (<10^-41)    273:56 (<10^-35)    224:26 (<10^-41)    89:10 (<10^-17)   4:0 (0.06)
0.6               329:80 (<10^-37)    311:63 (<10^-41)    270:55 (<10^-35)    222:25 (<10^-41)    89:9 (<10^-18)    4:0 (0.06)
0.7               311:65 (<10^-40)    297:51 (<10^-43)    268:46 (<10^-39)    220:22 (<10^-42)    89:9 (<10^-18)    4:0 (0.06)
0.8               284:50 (<10^-41)    275:43 (<10^-43)    257:40 (<10^-40)    213:22 (<10^-41)*   88:9 (<10^-17)    4:0 (0.06)
0.9               249:25 (<10^-48)    245:24 (<10^-47)    231:22 (<10^-45)    197:17 (<10^-40)    85:7 (<10^-18)    4:0 (0.06)

D Per-Pair Prediction Results

This appendix supplements Section 4. Tables 10–11 list the average decision-token mass and Pearson correlation with human decisions for every same-provider pair across all six datasets: the four main game families (with PD and BoS combined into a single matrix vector per pair) and the two boundary-condition datasets. Pairs are numbered consistently across tables and correspond to the model inventory in Appendix A.

Table 10: Per-pair prediction results: Bargaining, Persuasion, and Negotiation. Pair numbers correspond to Appendix A.

    Bargaining                      Persuasion                      Negotiation
#   Mass B Mass A Corr B Corr A    Mass B Mass A Corr B Corr A    Mass B Mass A Corr B Corr A
1 0.99 1.00 0.26 -0.04 0.74 0.01 0.11 -0.05 0.90 0.95 0.16 0.04 2 0.81 0.79 -0.02 0.01 0.91 0.02 0.36 -0.07 0.71 0.42 -0.09 -0.01 3 0.79 0.79 -0.29 0.01 0.91 0.02 0.37 -0.07 0.65 0.42 -0.06 -0.01 4 0.75 0.82 -0.20 -0.06 0.91 0.02 0.31 -0.04 0.76 0.32 0.05 0.02 5 0.84 0.82 -0.27 -0.06 0.94 0.02 0.36 -0.04 0.80 0.32 0.00 0.02 6 0.61 0.28 -0.12 0.03 0.94 0.05 0.37 0.09 0.58 0.36 -0.04 0.05 7 0.68 0.28 0.27 0.03 0.94 0.05 0.39 0.09 0.65 0.36 0.27 0.05 8 0.90 0.90 0.11 0.21 0.67 0.94 0.07 0.21 0.73 0.70 0.32 0.07 9 0.85 0.60 0.10 0.01 0.68 0.94 0.14 0.34 0.65 0.43 0.30 0.22 10 0.96 1.00 0.37 -0.06 0.70 0.88 0.08 0.25 0.78 0.80 -0.31 -0.13 11 0.97 0.99 0.19 -0.06 0.69 0.94 0.11 0.26 0.83 0.92 -0.37 -0.28 12 0.76 0.41 -0.29 -0.08 0.93 0.94 0.35 0.29 0.58 0.52 0.31 0.30 13 0.99 0.99 0.43 -0.03 0.69 0.64 0.12 0.11 0.77 0.90 -0.21 -0.26 14 0.68 0.44 0.03 0.10 0.71 0.59 0.12 0.10 0.58 0.43 0.26 0.03 15 0.97 1.00 0.45 0.42 0.95 0.94 0.32 0.27 0.82 0.95 0.27 0.14 16 0.83 0.65 0.08 0.11 0.72 0.64 0.10 0.05 0.41 0.25 -0.06 -0.01 17 0.99
1.00 0.37 0.15 0.94 0.95 0.36 0.27 0.95 1.00 0.34 0.24 18 0.70 0.21 0.10 0.10 0.69 0.62 0.15 0.06 0.36 0.51 -0.07 -0.06 19 0.65 0.52 -0.15 -0.01 0.93 0.92 0.35 0.35 0.43 0.20 -0.14 -0.05 20 0.99 1.00 0.44 0.08 0.66 0.95 0.15 0.23 0.89 0.98 0.17 0.06 21 0.67 0.36 -0.01 -0.06 0.93 0.93 0.33 0.31 0.50 0.33 0.01 0.01 22 0.98 0.99 0.33 -0.22 0.72 0.62 0.18 0.11 0.85 0.94 -0.22 -0.14 23 0.99 1.00 0.39 -0.05 0.93 0.95 0.36 0.25 0.80 1.00 0.06 -0.07 24 0.98 0.99 0.32 -0.32 0.73 0.63 0.19 0.11 0.88 1.00 0.26 -0.05 25 0.96 1.00 0.45 -0.01 0.75 0.25 0.10 -0.04 0.96 1.00 0.27 0.06 26 0.91 1.00 0.43 0.11 0.76 0.07 0.12 0.01 0.91 0.97 0.36 -0.01 27 0.89 1.00 0.44 0.02 0.76 0.00 0.12 -0.04 0.92 1.00 0.19 -0.05 28 0.95 1.00 0.30 0.05 0.72 0.00 0.12 -0.04 0.89 1.00 -0.14 -0.03 29 0.99 1.00 0.41 0.07 0.73 0.01 0.12 -0.07 0.91 1.00 0.32 -0.01 30 0.99 1.00 0.35 -0.00 0.74 0.02 0.14 -0.04 0.93 0.88 0.22 0.03 31 0.99 1.00 0.38 0.11 0.70 0.01 0.15 0.13 0.90 0.99 0.05 0.03 32 0.97 1.00 0.30 -0.01 0.94 0.00 0.32 -0.04 0.80 1.00 0.35 0.01 33 0.98 1.00 0.35 0.00 0.93 0.01 0.35 0.04 0.91 1.00 0.26 -0.02 34 0.99 1.00 0.44 0.13 0.94 0.00 0.32 0.08 0.96 1.00 0.07 -0.04 35 0.99 1.00 0.42 0.08 0.92 0.00 0.36 0.02 0.96 1.00 0.38 0.04 36 0.97 1.00 0.41 0.27 0.95 0.95 0.35 0.25 0.95 0.99 0.38 -0.10 37 0.96 1.00 0.42 0.16 0.95 0.95 0.34 0.30 0.95 1.00 0.37 -0.12 38 0.99 1.00 0.11 0.01 0.95 0.95 0.36 0.23 0.82 0.77 0.32 -0.30 39 0.99 1.00 0.16 -0.10 0.95 0.95 0.36 0.31 0.78 0.86 0.09 -0.32 40 0.99 1.00 0.44 0.21 0.94 0.95 0.32 0.26 0.94 0.99 0.41 -0.15 41 0.74 0.82 0.42 0.05 0.97 0.37 0.11 -0.03 0.76 0.90 -0.24 -0.00 42 0.58 0.99 0.34 0.44 0.64 0.95 0.08 0.17 0.05 0.91 -0.19 -0.16 43 0.97 0.99 0.38 0.03 0.68 0.02 0.11 0.07 0.76 0.89 -0.33 0.04 44 0.97 0.99 0.42 -0.21 0.86 0.90 0.15 0.22 0.87 0.96 0.28 -0.18 45 0.98 1.00 0.43 -0.13 0.84 0.92 0.20 0.26 0.90 0.99 0.26 -0.04 46 0.58 0.41 0.11 -0.00 0.65 0.00 0.11 -0.04 0.54 0.45 -0.19 0.04 47 0.99 1.00 0.45 -0.03 0.85 0.94 0.21 0.22 0.85 1.00 0.35 -0.02 48 
(continued)
 #    Bargaining                    Persuasion                    Negotiation
      Mass B Mass A Corr B Corr A | Mass B Mass A Corr B Corr A | Mass B Mass A Corr B Corr A
 ..   0.99 1.00 0.45 -0.22 | 0.84 0.95 0.20 0.23 | 0.82 1.00 0.34 -0.04   (row continued; pair number on preceding page)
 49   0.72 0.62 0.10 -0.02 | 0.91 0.00 0.37 -0.04 | 0.51 0.03 0.07 0.05
 50   0.99 1.00 0.35 -0.36 | 0.84 0.71 0.20 0.07 | 0.86 1.00 0.10 -0.06
 51   1.00 1.00 0.39 -0.27 | 0.84 0.72 0.20 0.10 | 0.90 1.00 0.11 -0.09
 52   0.97 0.75 0.35 0.20 | 0.94 0.96 0.30 0.20 | 0.92 0.14 0.34 -0.12
 53   0.98 1.00 0.43 0.18 | 0.95 0.94 0.36 0.31 | 0.90 0.99 0.35 0.22
 54   0.98 1.00 0.43 0.09 | 0.67 0.00 0.10 -0.07 | 0.76 0.79 0.11 0.03
 55   0.98 1.00 0.43 0.06 | 0.67 0.00 0.09 0.00 | 0.73 0.68 0.12 0.00
 56   0.98 0.99 0.43 0.03 | 0.67 0.22 0.09 -0.08 | 0.73 0.76 0.12 0.08
 57   0.99 1.00 0.40 0.06 | 0.95 0.01 0.36 -0.03 | 0.80 0.93 0.13 0.00
 58   0.99 1.00 0.32 -0.31 | 0.78 0.71 0.15 0.13 | 0.88 0.92 -0.10 -0.21
 59   0.98 1.00 0.38 -0.23 | 0.94 0.91 0.36 0.25 | 0.94 0.99 0.04 -0.39
 60   0.96 1.00 0.31 0.35 | 0.94 0.93 0.33 0.16 | 0.95 0.99 0.23 0.14
 61   0.00 0.96 0.39 0.08 | 0.00 0.98 -0.05 0.00 | 0.00 0.45 -0.02 -0.11
 62   0.97 1.00 0.29 -0.12 | 0.90 0.84 0.37 0.28 | 0.92 0.71 0.30 0.08
 63   0.97 1.00 0.34 -0.10 | 0.91 0.50 0.35 0.15 | 0.84 0.92 0.17 -0.19
 64   0.95 1.00 0.43 0.43 | 0.95 0.91 0.31 0.20 | 0.83 0.84 -0.19 -0.15
 65   0.99 1.00 0.22 -0.25 | 0.80 0.74 0.24 0.20 | 0.94 1.00 0.30 -0.12
 66   0.99 1.00 0.19 -0.23 | 0.83 0.64 0.19 0.16 | 0.94 1.00 0.17 -0.23
 67   0.99 1.00 0.44 -0.14 | 0.77 0.72 0.18 0.05 | 0.94 0.99 0.30 0.04
 68   0.95 0.99 0.34 0.14 | 0.94 0.92 0.35 0.31 | 0.91 0.99 0.30 -0.09
 69   0.80 0.90 0.42 -0.15 | 0.95 0.86 0.31 0.15 | 0.88 0.96 0.27 -0.13
 70   0.93 0.99 0.38 0.27 | 0.96 0.95 0.23 0.19 | 0.94 0.98 0.25 -0.12
 71   0.98 0.98 0.43 0.18 | 0.82 0.69 0.17 0.09 | 0.93 0.97 0.42 -0.17
 72   0.99 1.00 0.40 0.11 | 0.95 0.92 0.34 0.31 | 0.94 0.99 0.32 0.40
 73   0.99 1.00 0.38 0.23 | 0.81 0.94 0.17 0.32 | 0.94 0.99 0.39 -0.01
 74   0.97 1.00 0.35 0.28 | 0.81 0.91 0.15 0.29 | 0.92 0.98 0.30 0.03
 75   0.94 0.33 0.38 0.24 | 0.93 0.94 0.20 0.23 | 0.84 0.37 0.24 0.11
 76   0.98 0.99 0.44 0.40 | 0.82 0.70 0.13 0.06 | 0.93 0.98 0.37 -0.21
 77   0.97 1.00 0.42 0.05 | 0.79 0.87 0.17 0.20 | 0.92 1.00 0.41 0.26
 78   0.96 1.00 0.41 0.36 | 0.82 0.90 0.15 0.26 | 0.97 1.00 0.31 0.12
 79   0.94 1.00 0.30 -0.08 | 0.81 0.94 0.14 0.31 | 0.86 1.00 0.32 0.15
 80   0.99 0.99 0.36 0.30 | 0.80 0.69 0.17 0.10 | 0.88 0.99 0.16 -0.03
 81   0.98 1.00 0.37 0.08 | 0.82 0.95 0.12 0.31 | 0.88 0.97 0.12 0.02
 82   0.99 1.00 0.42 0.11 | 0.81 0.95 0.20 0.28 | 0.82 1.00 0.28 0.06
 83   0.99 1.00 0.42 -0.07 | 0.81 0.68 0.20 0.13 | 0.82 1.00 0.28 -0.03
 84   0.99 1.00 0.41 -0.10 | 0.79 0.94 0.20 0.22 | 0.78 1.00 0.26 -0.03
 85   0.99 1.00 0.36 0.05 | 0.81 0.94 0.16 0.26 | 0.82 0.98 0.34 0.03
 86   0.98 1.00 0.34 -0.16 | 0.82 0.89 0.21 0.27 | 0.91 1.00 0.30 -0.19
 87   0.99 0.09 0.29 0.01 | 0.90 0.45 0.31 0.35 | 0.88 0.12 0.32 -0.02
 88   0.99 0.99 0.42 0.27 | 0.81 0.70 0.14 0.09 | 0.81 0.98 0.36 0.15
 89   0.99 1.00 0.34 0.06 | 0.80 0.94 0.20 0.28 | 0.84 1.00 0.30 0.08
 90   0.99 1.00 0.13 -0.11 | 0.94 0.90 0.35 0.18 | 0.94 1.00 0.28 -0.10
 91   0.99 0.99 0.36 -0.22 | 0.94 0.67 0.32 0.11 | 0.84 0.97 0.35 -0.03
 92   0.98 1.00 0.35 0.08 | 0.93 0.92 0.37 0.30 | 0.86 0.80 0.32 0.16
 93   0.99 1.00 0.39 0.04 | 0.94 0.94 0.34 0.31 | 0.81 1.00 0.23 -0.01
 94   0.99 0.99 0.25 -0.11 | 0.80 0.70 0.20 0.10 | 0.95 0.99 0.37 -0.17
 95   0.99 1.00 0.33 -0.04 | 0.93 0.94 0.33 0.24 | 0.94 1.00 0.19 -0.11
 96   0.99 1.00 0.40 0.06 | 0.91 0.92 0.34 0.23 | 0.80 1.00 0.35 -0.11
 97   1.00 1.00 0.36 -0.02 | 0.94 0.93 0.34 0.27 | 0.92 1.00 0.39 -0.12
 98   0.99 1.00 0.31 -0.05 | 0.79 0.69 0.18 0.13 | 0.93 1.00 0.30 -0.08
 99   0.99 1.00 0.23 -0.07 | 0.81 0.70 0.18 0.13 | 0.96 1.00 0.24 -0.13
100   0.93 0.99 0.39 0.36 | 0.93 0.83 0.33 0.25 | 0.81 0.98 0.10 0.11
101   0.93 0.99 0.30 0.08 | 0.93 0.89 0.35 0.28 | 0.85 0.97 0.33 -0.16
102   0.99 1.00 0.40 0.10 | 0.95 0.94 0.31 0.29 | 0.72 0.98 -0.13 -0.05
103   0.72 0.83 0.08 0.04 | 0.76 0.70 0.08 0.19 | 0.77 0.50 0.28 0.03
104   0.62 0.66 0.29 0.31 | 0.86 0.77 0.23 0.09 | 0.77 0.79 0.14 0.18
105   0.72 0.64 0.21 0.23 | 0.97 0.95 0.31 0.23 | 0.69 0.70 -0.11 -0.09
106   0.83 0.79 0.10 0.07 | 0.88 0.75 0.29 0.11 | 0.83 0.82 0.33 -0.01
107   0.66 0.38 0.26 0.23 | 0.94 0.95 0.26 0.22 | 0.72 0.46 0.06 -0.02
108   0.59 0.77 0.04 0.07 | 0.76 0.74 0.10 0.13 | 0.80 0.92 -0.06 0.29
109   0.72 0.65 0.04 0.07 | 0.76 0.95 0.11 0.29 | 0.64 0.38 0.16 0.06
110   0.99 0.94 0.32 -0.05 | 0.93 0.95 0.34 0.23 | 0.78 0.72 -0.24 -0.24
111   0.98 1.00 0.41 0.19 | 0.83 0.94 0.10 0.28 | 0.95 1.00 0.15 0.02
112   0.97 1.00 0.32 0.34 | 0.80 0.92 0.17 0.20 | 0.83 0.88 -0.09 -0.03
113   0.99 1.00 0.32 0.08 | 0.93 0.95 0.37 0.23 | 0.91 0.83 0.20 -0.07
114   0.77 0.85 0.11 0.07 | 0.66 0.62 0.05 0.04 | 0.62 0.65 0.06 -0.32
115   0.96 1.00 0.42 0.00 | 0.66 0.95 0.15 0.23 | 0.59 0.90 0.08 -0.33
116   0.95 1.00 0.31 0.37 | 0.68 0.95 0.10 0.31 | 0.70 0.82 -0.32 -0.00
117   0.97 1.00 0.45 0.07 | 0.68 0.95 0.14 0.28 | 0.80 0.89 -0.05 -0.33
118   0.97 1.00 0.42 0.22 | 0.94 0.94 0.37 0.30 | 0.85 0.87 -0.18 -0.19
119   0.97 1.00 0.42 0.02 | 0.94 0.95 0.38 0.31 | 0.87 0.92 -0.29 -0.32
120   0.93 0.96 0.40 0.35 | 0.68 0.58 0.05 0.05 | 0.76 0.70 -0.01 0.23

Table 11: Per-pair prediction results: Repeated Matrix Games, One-Shot 2×2 Games, and Binary Lotteries. Pair numbers correspond to Appendix A.
 #    Matrix                        One-Shot 2×2                  Lotteries
      Mass B Mass A Corr B Corr A | Mass B Mass A Corr B Corr A | Mass B Mass A Corr B Corr A
  1   0.99 1.00 0.41 0.32 | 0.81 1.00 -0.02 -0.05 | 0.93 1.00 0.10 0.45
  2   0.99 1.00 0.35 -0.05 | 0.88 0.99 -0.00 0.05 | 0.98 1.00 0.29 0.55
  3   0.99 1.00 0.39 -0.05 | 0.88 0.99 -0.01 0.05 | 0.97 1.00 0.22 0.55
  4   1.00 1.00 0.43 0.18 | 0.82 0.98 -0.02 -0.00 | 0.96 1.00 0.29 0.67
  5   0.99 1.00 0.36 0.18 | 0.82 0.98 0.06 -0.00 | 0.96 1.00 0.31 0.67
  6   1.00 1.00 0.41 -0.06 | 0.93 0.99 0.03 0.12 | 0.97 1.00 0.42 0.75
  7   1.00 1.00 0.41 -0.06 | 0.95 0.99 0.05 0.12 | 0.95 1.00 0.25 0.75
  8   0.99 1.00 0.35 0.43 | 0.96 0.99 -0.01 0.00 | 0.98 1.00 0.01 0.30
  9   0.99 1.00 0.47 0.34 | 0.74 0.98 -0.00 -0.02 | 0.95 1.00 0.28 0.66
 10   0.99 1.00 0.28 0.05 | 0.92 0.90 0.04 0.08 | 0.98 1.00 0.34 0.67
 11   1.00 1.00 0.39 0.35 | 0.95 0.99 0.01 -0.02 | 0.99 0.99 0.49 0.72
 12   0.99 1.00 0.39 0.37 | 0.73 0.79 0.03 -0.05 | 0.93 0.99 0.37 0.59
 13   1.00 1.00 0.44 0.44 | 0.92 0.99 0.03 -0.05 | 0.98 1.00 0.65 0.77
 14   0.99 1.00 0.19 0.13 | 0.92 1.00 0.01 0.02 | 0.99 1.00 0.39 0.62
 15   0.98 1.00 0.24 0.43 | 0.96 0.99 -0.04 -0.01 | 0.98 0.97 0.40 0.35
 16   0.99 1.00 0.30 -0.01 | 0.97 1.00 -0.03 0.06 | 0.99 1.00 0.68 0.59
 17   0.98 1.00 0.37 0.14 | 0.96 1.00 0.04 -0.06 | 0.97 1.00 0.68 0.56
 18   0.99 1.00 0.40 0.16 | 0.94 1.00 0.00 0.02 | 0.94 1.00 0.68 0.69
 19   0.99 1.00 0.20 0.27 | 0.91 0.84 -0.00 0.02 | 0.98 0.99 0.36 0.71
 20   0.99 1.00 0.33 0.25 | 0.96 1.00 -0.02 -0.13 | 0.98 0.99 0.69 0.75
 21   0.99 1.00 0.40 0.41 | 0.85 0.88 0.04 0.01 | 0.99 1.00 0.58 0.68
 22   0.99 1.00 0.41 0.28 | 0.96 1.00 -0.00 0.04 | 0.96 1.00 0.56 0.73
 23   0.99 1.00 0.41 0.33 | 0.98 1.00 -0.02 0.08 | 0.99 1.00 0.74 0.75
 24   1.00 1.00 0.48 0.21 | 0.97 1.00 -0.02 -0.02 | 0.99 1.00 0.79 0.76
 25   0.99 1.00 0.28 0.17 | 0.93 1.00 -0.06 -0.04 | 0.96 1.00 0.03 0.29
 26   0.98 1.00 0.30 0.08 | 0.90 0.98 0.04 0.04 | 0.98 0.99 0.20 0.48
 27   0.99 1.00 0.39 -0.34 | 0.98 1.00 0.05 0.05 | 0.97 1.00 0.22 0.43
 28   0.99 1.00 0.18 -0.09 | 0.94 1.00 0.00 0.03 | 0.98 1.00 0.04 0.02
 29   0.99 1.00 0.37 0.12 | 0.95 1.00 -0.00 0.04 | 0.98 1.00 0.24 0.59
 30   0.99 1.00 0.44 -0.08 | 0.94 1.00 0.01 0.02 | 0.97 1.00 0.12 0.61
 31   1.00 1.00 0.48 0.24 | 0.93 1.00 0.01 -0.13 | 0.97 0.97 0.25 0.67
 32   0.99 1.00 -0.09 0.23 | 0.86 1.00 0.04 -0.05 | 0.93 1.00 0.09 0.52
 33   1.00 1.00 0.48 0.24 | 0.95 1.00 -0.01 0.01 | 0.95 1.00 0.44 0.67
 34   1.00 1.00 0.47 0.36 | 0.97 1.00 -0.03 -0.01 | 0.97 1.00 0.56 0.70
 35   1.00 1.00 0.47 0.40 | 0.90 1.00 -0.02 0.01 | 0.96 1.00 0.46 0.69
 36   1.00 1.00 0.37 0.26 | 0.98 1.00 -0.05 -0.04 | 0.99 1.00 0.52 0.67
 37   1.00 1.00 0.36 0.29 | 0.98 0.99 -0.07 -0.03 | 1.00 1.00 0.56 0.60
 38   1.00 1.00 0.42 0.34 | 1.00 0.99 0.04 -0.07 | 1.00 1.00 0.75 0.72
 39   1.00 1.00 0.44 0.32 | 1.00 1.00 0.01 0.03 | 1.00 1.00 0.74 0.67
 40   1.00 1.00 0.49 0.42 | 0.94 1.00 0.05 0.14 | 0.94 1.00 0.57 0.62
 41   0.98 0.94 0.25 0.31 | 0.93 0.95 -0.01 -0.02 | 0.93 0.94 0.09 0.18
 42   0.86 1.00 0.25 0.48 | 0.66 1.00 -0.02 -0.02 | 0.89 1.00 0.05 0.57
 43   0.99 1.00 0.05 0.34 | 0.97 0.92 0.03 -0.03 | 0.99 1.00 0.46 0.53
 44   0.99 1.00 0.15 0.18 | 0.93 1.00 -0.04 -0.02 | 0.98 1.00 0.33 0.50
 45   1.00 1.00 0.28 -0.15 | 0.96 0.94 0.04 0.03 | 0.98 1.00 0.35 0.55
 46   1.00 1.00 0.23 -0.29 | 0.95 1.00 0.04 0.06 | 0.98 1.00 0.32 0.63
 47   1.00 1.00 0.41 -0.15 | 0.94 1.00 0.01 0.14 | 0.96 1.00 0.19 0.67
 48   1.00 1.00 0.42 -0.21 | 0.95 1.00 -0.01 0.06 | 0.97 1.00 0.24 0.73
 49   0.99 1.00 0.30 -0.30 | 0.97 1.00 -0.01 0.08 | 0.98 1.00 0.26 0.58
 50   1.00 1.00 0.51 0.02 | 0.95 1.00 -0.03 0.01 | 0.97 1.00 0.35 0.75
 51   1.00 1.00 0.51 0.06 | 0.96 1.00 -0.00 -0.03 | 0.98 1.00 0.57 0.77
 52   0.99 0.84 0.42 0.23 | 0.92 0.00 0.05 0.01 | 0.91 0.01 0.49 0.34
 53   1.00 1.00 0.45 0.42 | 0.95 1.00 0.03 -0.04 | 0.96 1.00 0.70 0.57
 54   0.99 1.00 0.38 0.10 | 0.95 0.97 -0.06 0.15 | 0.95 0.76 0.30 0.57
 55   0.99 1.00 0.41 0.11 | 0.88 0.77 -0.03 0.12 | 0.96 0.82 0.39 0.56
 56   0.99 1.00 0.41 -0.17 | 0.88 0.98 -0.03 -0.05 | 0.96 0.98 0.39 0.71
 57   1.00 1.00 0.45 0.09 | 0.98 1.00 -0.03 -0.12 | 0.98 1.00 0.05 0.67
 58   0.99 1.00 0.48 0.27 | 0.93 1.00 0.01 0.02 | 0.95 1.00 0.64 0.78
 59   0.99 1.00 0.12 -0.26 | 0.86 0.91 0.03 -0.01 | 0.98 1.00 0.46 0.53
 60   0.99 1.00 0.15 0.38 | 0.97 1.00 0.01 0.01 | 0.99 0.98 0.32 0.59
 61   0.00 0.68 0.02 0.10 | 0.04 1.00 -0.07 0.00 | 0.03 0.00 -0.03 0.13
 62   0.99 1.00 0.36 0.26 | 0.94 1.00 0.02 0.13 | 0.99 1.00 0.26 0.58
 63   0.99 1.00 0.39 -0.08 | 0.93 1.00 -0.01 -0.02 | 0.98 1.00 0.46 0.50
 64   1.00 1.00 0.32 0.21 | 0.98 1.00 -0.01 0.03 | 1.00 1.00 0.49 0.60
 65   1.00 1.00 0.47 0.21 | 0.94 1.00 -0.04 0.04 | 0.98 1.00 0.67 0.71
 66   0.99 1.00 0.47 0.47 | 0.93 1.00 -0.03 0.08 | 0.98 1.00 0.68 0.80
 67   1.00 1.00 0.48 0.16 | 0.99 1.00 -0.04 -0.10 | 0.98 1.00 0.65 0.68
 68   0.99 0.99 0.37 0.36 | 0.94 1.00 -0.01 0.00 | 1.00 1.00 0.30 0.41
 69   0.98 0.99 0.44 0.42 | 0.93 0.98 0.03 -0.01 | 0.98 0.99 0.25 0.18
 70   0.97 0.94 0.20 0.22 | 0.96 0.99 -0.01 -0.01 | 0.99 0.99 0.24 0.23
 71   0.99 0.99 0.33 0.46 | 0.97 0.97 0.02 0.01 | 1.00 1.00 0.24 0.28
 72   0.99 1.00 0.31 0.21 | 0.96 1.00 0.03 0.03 | 0.99 1.00 0.47 0.51
 73   0.99 1.00 0.24 0.28 | 0.98 1.00 0.01 -0.03 | 0.99 1.00 0.48 0.26
 74   0.99 1.00 0.27 0.06 | 0.97 1.00 0.01 -0.06 | 0.98 0.99 0.55 0.40
 75   0.99 0.96 0.29 0.12 | 0.95 1.00 -0.05 0.04 | 0.99 1.00 0.47 0.29
 76   1.00 1.00 0.44 0.44 | 0.96 0.96 0.05 0.03 | 1.00 1.00 0.22 0.32
 77   1.00 1.00 0.37 0.14 | 0.96 1.00 0.05 -0.00 | 0.99 1.00 0.60 0.56
 78   0.98 0.99 0.24 0.14 | 0.90 0.99 0.00 -0.01 | 0.99 1.00 0.54 0.63
 79   1.00 1.00 0.31 0.04 | 0.99 1.00 -0.00 0.04 | 0.99 1.00 0.46 0.25
 80   1.00 1.00 0.40 0.44 | 0.91 0.97 -0.00 -0.06 | 0.99 1.00 0.67 0.70
 81   1.00 1.00 0.30 0.26 | 0.99 1.00 0.00 0.02 | 0.98 1.00 0.57 0.57
 82   1.00 1.00 0.49 0.40 | 0.99 0.98 -0.05 0.05 | 1.00 1.00 0.73 0.65
 83   1.00 1.00 0.49 0.35 | 0.99 1.00 -0.05 0.07 | 1.00 1.00 0.73 0.66
 84   1.00 1.00 0.40 0.36 | 0.99 1.00 -0.01 -0.01 | 1.00 1.00 0.75 0.67
 85   0.99 1.00 0.37 0.32 | 0.96 1.00 0.05 0.03 | 0.99 1.00 0.70 0.69
 86   0.99 1.00 0.35 0.21 | 0.99 1.00 0.05 -0.03 | 1.00 1.00 0.63 0.59
 87   0.99 0.97 0.46 0.38 | 0.98 0.95 -0.06 -0.01 | 0.99 0.99 0.74 0.72
 88   1.00 1.00 0.38 0.39 | 0.91 0.99 0.05 0.16 | 0.99 1.00 0.69 0.69
 89   1.00 1.00 0.46 0.33 | 0.99 1.00 -0.03 -0.07 | 1.00 1.00 0.73 0.69
 90   1.00 1.00 0.47 0.39 | 0.99 1.00 -0.03 -0.11 | 0.99 1.00 0.72 0.72
 91   0.99 1.00 0.42 0.26 | 0.97 1.00 -0.06 0.08 | 1.00 1.00 0.64 0.65
 92   1.00 1.00 0.34 0.25 | 0.96 1.00 -0.01 0.03 | 0.99 1.00 0.53 0.62
 93   1.00 1.00 0.45 0.44 | 0.96 1.00 -0.09 -0.09 | 1.00 1.00 0.79 0.75
 94   1.00 1.00 0.49 0.44 | 0.96 1.00 -0.02 -0.05 | 0.98 1.00 0.73 0.72
 95   1.00 1.00 0.50 0.41 | 0.97 1.00 -0.05 -0.08 | 0.99 1.00 0.76 0.74
 96   1.00 1.00 0.48 0.26 | 0.90 1.00 0.04 -0.10 | 0.99 1.00 0.80 0.71
 97   1.00 1.00 0.41 0.40 | 0.98 1.00 -0.06 -0.03 | 0.98 1.00 0.77 0.72
 98   1.00 1.00 0.48 0.28 | 0.98 1.00 -0.07 -0.06 | 1.00 1.00 0.81 0.72
 99   1.00 1.00 0.51 0.34 | 0.99 1.00 -0.11 0.02 | 0.99 1.00 0.77 0.74
100   0.98 0.99 0.26 0.05 | 0.97 0.99 0.06 0.01 | 0.97 0.99 0.23 0.57
101   0.97 0.99 0.34 -0.01 | 0.94 1.00 0.01 0.05 | 0.98 1.00 0.48 0.62
102   0.99 1.00 0.26 0.23 | 0.98 0.98 -0.04 -0.09 | 0.99 1.00 0.63 0.70
103   0.99 1.00 0.22 0.40 | 0.90 1.00 0.02 0.02 | 0.91 1.00 0.33 0.57
104   0.93 0.90 0.34 0.43 | 0.82 0.94 0.02 0.04 | 0.91 0.97 -0.09 -0.01
105   0.97 0.97 0.02 0.09 | 0.89 0.91 -0.00 -0.02 | 0.88 0.97 0.03 0.05
106   0.97 0.94 0.25 0.18 | 0.88 0.93 0.03 -0.00 | 0.98 0.99 -0.01 0.18
107   0.98 0.99 -0.05 0.17 | 0.97 0.98 0.02 -0.01 | 0.99 0.99 0.26 0.15
108   0.98 0.95 0.22 0.18 | 0.95 0.99 -0.01 0.00 | 0.98 0.99 0.18 0.41
109   0.99 0.99 0.36 0.33 | 0.97 0.94 0.02 -0.01 | 0.99 1.00 0.35 0.52
110   0.99 1.00 0.46 0.03 | 0.96 1.00 0.04 -0.06 | 0.99 0.93 0.72 0.63
111   1.00 1.00 0.30 0.29 | 0.93 1.00 0.08 0.03 | 0.99 1.00 0.30 0.30
112   0.99 1.00 0.21 -0.13 | 0.97 1.00 0.01 0.00 | 0.99 1.00 0.18 0.32
113   1.00 1.00 0.45 0.01 | 0.83 1.00 0.04 0.08 | 0.96 1.00 0.18 0.62
114   0.99 0.99 0.30 0.28 | 0.92 0.97 0.01 -0.02 | 0.93 0.81 0.19 -0.00
115   0.99 1.00 0.46 0.34 | 0.91 1.00 -0.00 0.08 | 0.97 1.00 0.30 0.48
116   0.99 1.00 0.36 0.27 | 0.94 1.00 -0.01 0.03 | 0.99 1.00 0.53 0.55
117   0.99 1.00 0.39 0.24 | 0.93 1.00 0.06 0.02 | 0.93 1.00 0.71 0.72
118   1.00 1.00 0.45 0.41 | 0.88 0.87 -0.02 -0.03 | 0.97 1.00 0.48 0.82
119   1.00 1.00 0.45 0.32 | 0.87 0.90 -0.02 0.02 | 0.93 1.00 0.13 0.74
120   0.98 0.97 0.07 0.02 | 0.92 0.99 0.02 0.03 | 0.99 0.98 0.31 0.35

Table 12: Game configuration parameters for each GLEE family.

Family       Parameter         Values                      Description
Bargaining   Stakes            $100, $10K, $1M             Total amount to divide
             Information       Complete, Incomplete        Whether players know each other's discount factors
             Messages          Allowed, Not allowed        Whether free-text messages accompany offers
             Discount δ1       0.8, 0.9, 0.95, 1.0         Alice's per-round discount factor
             Discount δ2       0.8, 0.9, 0.95, 1.0         Bob's per-round discount factor
             Max rounds        12, ∞                       Maximum number of alternating offers
Persuasion   Quality prob. p   0.33, 0.5, 0.8              Prior probability of high quality
             Value v           1.2, 1.25, 2.0, 3.0, 4.0    High-quality value differential
             Seller knowledge  Knows, Uninformed           Whether seller knows product quality
             Buyer myopic      Yes, No                     Single-round vs. multi-round buyer
             Message type      Text, Binary                Seller's communication format
             Price             $100, $10K, $1M             Product price
Negotiation  Price             $100, $10K, $1M             Product base price
             Information       Complete, Incomplete        Whether players know each other's valuations
             Messages          Allowed, Not allowed        Free-text messages with offers
             Max rounds        10, 30                      Maximum negotiation rounds
             Buyer value       0.8, 1.0, 1.2, 1.5          Buyer's private valuation multiplier (V_B = value × price)
             Seller value      0.8, 1.0, 1.2, 1.5          Seller's private valuation multiplier (V_A = value × price)

E Game Configuration Robustness

This appendix supplements the game configuration robustness analysis in Section 4. Each GLEE game family is parameterized along multiple dimensions. Table 12 documents the full parameter space. Tables 13–16 present base-vs-aligned win counts for every parameter value in each game family. N = valid pairs after filtering; Filt. = pairs excluded by mass or correlation filters.
The base advantage is consistent across nearly all parameter values in every family. The sole exception is bargaining with discount factor δ1 = 0.8 (the most impatient proposer), where the advantage narrows to near parity (10:7, p = 0.31).

Table 13: Bargaining: base vs. aligned by configuration parameter. N: valid pairs after filtering; Filt.: excluded pairs; p: one-sided binomial.

Parameter      Value        N    Filt.  Base  Al.  p
Stakes         $100         74   46     73    1    4.0×10^-21
               $10K         73   47     69    4    1.2×10^-16
               $1M          64   56     62    2    1.1×10^-16
Information    Complete     76   44     72    4    1.8×10^-17
               Incomplete   78   42     75    3    2.6×10^-19
Messages       Allowed      66   54     64    2    3.0×10^-17
               Not allowed  83   37     76    7    4.7×10^-16
Discount δ1    0.8          17   103    10    7    0.31
               0.9          73   47     65    8    1.6×10^-12
               0.95         60   60     57    3    3.1×10^-14
               1.0          88   32     85    3    3.7×10^-22
Discount δ2    0.8          80   40     76    4    1.4×10^-18
               0.9          42   78     40    2    2.1×10^-10
               0.95         78   42     74    4    5.0×10^-18
               1.0          82   38     73    9    6.9×10^-14
Round filter   = 1          93   27     32    61   1.7×10^-3
               ≥ 2          86   34     82    4    2.9×10^-20

Table 14: Persuasion: base vs. aligned by configuration parameter. N: valid pairs; Filt.: excluded; p: one-sided binomial.

Parameter         Value       N    Filt.  Base  Al.  p
Quality prob. p   0.33        51   69     33    18   0.024
                  0.50        4    116    4     0    0.063
                  0.80        0    120    —     —    —
Value v           1.2         35   85     31    4    1.7×10^-6
                  1.25        23   97     16    7    0.047
                  2.0         37   83     33    4    5.4×10^-7
                  3.0         6    114    5     1    0.11
                  4.0         15   105    12    3    0.018
Seller knowledge  Knows       31   89     29    2    2.3×10^-7
                  Uninformed  46   74     32    14   5.7×10^-3
Buyer myopic      Yes         0    120    —     —    —
                  No          36   84     32    4    9.7×10^-7
Message type      Text        33   87     32    1    4.0×10^-9
                  Binary      47   73     43    4    1.4×10^-9
Price             $100        32   88     26    6    2.7×10^-4
                  $10K        34   86     34    0    5.8×10^-11
                  $1M         0    120    —     —    —
Round filter      = 1         53   67     23    30   0.21
                  ≥ 2         39   81     31    8    1.5×10^-4

Table 15: Negotiation: base vs. aligned by configuration parameter. N: valid pairs; Filt.: excluded; p: one-sided binomial.

Parameter        Value           N    Filt.  Base  Al.  p
Information      Complete        21   99     20    1    1.1×10^-5
                 Incomplete      40   80     39    1    3.7×10^-11
Messages         Allowed         28   92     27    1    1.1×10^-7
                 Not allowed     32   88     31    1    7.7×10^-9
Max rounds       10              55   65     54    1    1.6×10^-15
                 30              7    113    6     1    0.063
Price            $100            29   91     27    2    8.1×10^-7
                 $10K            20   100    19    1    2.0×10^-5
                 $1M             44   76     42    2    5.6×10^-11
Value asymmetry  Buyer > Seller  28   92     26    2    1.5×10^-6
                 Seller > Buyer  38   82     36    2    2.7×10^-9
                 Equal           27   93     27    0    7.5×10^-9
Round filter     = 1             72   48     33    39   0.28
                 ≥ 2             57   63     56    1    4.0×10^-16

Table 16: Matrix games (PD and BoS): base vs. aligned by round phase. N: valid pairs; Filt.: excluded; p: one-sided binomial.

Game  Round Phase  N    Filt.  Base  Al.  p
PD    Early (1–3)  32   88     12    20   0.11
      Mid (4–7)    116  4      93    23   1.8×10^-11
      Late (8–10)  119  1      92    27   8.7×10^-10
BoS   Early (1–3)  0    120    —     —    —
      Mid (4–7)    11   109    10    1    5.9×10^-3
      Late (8–10)  111  9      93    18   1.1×10^-13
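The p-values in Tables 13–16 come from a one-sided binomial test of the base-vs-aligned win counts against a fair-coin null (each pair equally likely to favor either model). A minimal standard-library sketch of that test, checked against the near-parity bargaining cell (δ1 = 0.8: 10 base wins out of 17 valid pairs):

```python
from math import comb

def binom_p_one_sided(wins: int, n: int) -> float:
    """P(X >= wins) for X ~ Binomial(n, 0.5): a one-sided sign test
    of whether base models win more often than chance."""
    return sum(comb(n, k) for k in range(wins, n + 1)) / 2 ** n

# Discount δ1 = 0.8 cell of Table 13: 10 base wins vs. 7 aligned, N = 17.
p = binom_p_one_sided(10, 17)
print(round(p, 2))  # 0.31, matching the reported near-parity p-value
```

The same function recovers the extreme cells as well, e.g. Stakes = $100 in Table 13 (73 base wins of 74) gives p ≈ 4.0×10^-21.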