Synonymix: Unified Group Personas for Generative Simulations

Generative agent simulations operate at two scales: individual personas for character interaction, and population models for collective behavior analysis and intervention testing. We propose a third scale: meso-level simulation - interaction with gro…

Authors: Huanxing Chen, Aditesh Kumar

Synonymix: Unified Group Personas for Generative Simulations
Synonymix : Unified Group Personas for Generative Simulations Huanxing Chen ∗ huanxing@stanford.edu Stanford University Stanford, CA, USA Aditesh Kumar ∗ aditesh@stanford.edu Stanford University Stanford, CA, USA Figure 1: The central artifact of Synonymix is the "unigraph", a semantically unied knowledge graph constructe d from each participant’s narrative persona. This artifact, which captures the shared personal histor y of the collective, can be explored for greater understanding or sample d for synthetic representation, and oers a discrete structure upon which formal privacy guarantees can be introduced for the participants. It is visualized here with 𝑛 = 4 for readability . Abstract Generative agent simulations operate at two scales: individual per- sonas for character interaction, and population models for collective behavior analysis and intervention testing. W e propose a third scale: meso-level simulation - interaction with group-level representa- tions that retain grounding in rich individual experience. T o enable this, we present Synonymix , a pipeline that constructs a "unigraph" from multiple life story personas via graph-based abstraction and merging, producing a queryable colle ctive representation that can be explored for sensemaking or sampled for synthetic persona generation. Evaluating synthetic agents on General Social Survey items, we demonstrate behavioral signal pr eser vation beyond de- mographic baselines (p<0.001, r=0.59) with demonstrable privacy guarantee (max source contribution <13%). W e invite discussion ∗ Both authors contributed equally to this research. This work is licensed under a Creativ e Commons Attribution 4.0 International License . CHI EA ’26, Barcelona, Spain © 2026 Copyright held by the owner/author(s). ACM ISBN 979-8-4007-2281-3/2026/04 https://doi.org/10.1145/3772363.3799082 on interaction modalities enabled by meso-level simulations, and whether "high-delity" personas can ever capture the texture of lived experience. CCS Concepts • Human-centered computing → Interactive systems and tools ; • Security and privacy → Privacy protections ; • Computing methodologies → Simulation tools ; • Information systems → Ontologies . Ke y words Generative Agents, Agent Banks, High-Fidelity Simulations, Syn- thetic Personas, Knowledge Graph, Sensemaking, Narrative Psy- chology . A CM Reference Format: Huanxing Chen and A ditesh Kumar . 2026. Synonymix : Unied Group Personas for Generativ e Simulations. In Extended Abstracts of the 2026 CHI Confer ence on Human Factors in Computing Systems (CHI EA ’26), A pril 13–17, 2026, Barcelona, Spain . ACM, New Y ork, NY, USA, 9 pages. https://doi.org/10.1145/3772363.3799082 CHI EA ’26, April 13–17, 2026, Barcelona, Spain Chen and Kumar 1 Introduction & Background Plausible models of human behavior are an elusive and yet cov- eted construction throughout the history of the cognitive sciences. They are particularly compelling for their utility in simulation, displaying a unique ability to surface insights into individual and collective behavior . Historical exemplars include Card et al. ’s Model Human Processor for individual-lev el interaction [ 3 ] and Schelling’s agent-based models demonstrating emergent collectiv e segregation patterns [ 25 ]. Both works inspir e d a body of literature fascinated with understanding human behavior through simulation. This history of work is now joined by generative agent-based models (GABMs) [ 11 , 15 , 23 ], which leverage language models to construct believable proxies of human behavior . However , if we conceive simulation’s essence as the exploration of counterfactual outcomes and selv es, then the representational adequacy of GABMs warrants scrutiny . LLM agents struggle with faithful representation of dierent persp ectives from demographic variables alone [ 2 , 5 , 20 ] and do not easily capture the contextual particularity of individ- ual experience. Recognizing that thin persona specications yield agents that po orly approximate the heterogeneity of real social actors, studies hav e begun exploring more high-delity simulations by seeding generative agents with personas derived from life story interviews [ 24 ]. With life stories being a particularly rich charac- terization, a core tension arises: the same biographical depth that enables higher-delity simulation also heightens privacy risks for participants. W e ask: can we nd a high-delity yet privacy-preserving rep- resentation of an agent bank? Synthetic personas are a prominent approach, with recent work exploring ne-grained synthesis at the population [ 4 ] and individual [ 14 ] levels. However , both approaches extrapolate from a seed dataset of sampled features and face addi- tional challenges in representing the rich narratives of their source participants. Rather than synthesizing individuals or populations, we reframe the challenge as producing synthetic agent banks that faithfully and privately represent the ne-graine d narratives of personas in the agent bank as a whole. W e propose abstracting one level higher from the individual to the group as a means of easing the delity-privacy tension - a collective representation grounded in individual-level nuance , but abstracted enough to be safely shareable. This r eframing surfaces a second question: if such a "high-delity" group-level artifact were possible, what interac- tion modalities might it enable beyond individual-level character interaction or population-level aggregate behavior? W e term this space meso-level simulation . W e raise Synonymix as a potential answer to these questions. Synonymix unies high-delity p ersonas by extracting knowledge graphs from each and merging them according to synonymity to produce a unied graph ("unigraph"). The unigraph can be con- structed in a privacy-preserving way and sampled for synthetic persona generation. More importantly , we suggest that this uni- graph constitutes a novel artifact of collective r epresentation that could be explored interactively for sensemaking and other meso- level interactions that retain the essence of simulation - e xploration . 2 Methodology 2.1 Life Story Graph Extraction W e dene a persona graph ontology for decomposing textual life stories into structured representations. The schema species allow- able node types, edge typ es, and edge labels. Node T ypes: The ontology includes three node types: Subject nodes (S) represent recurring proper nouns (people, places, institu- tions); Factual no des (F) capture concrete events, actions, or mile- stones; and Interpretive nodes (I) encode values, motivations, or reective self-narratives derived from factual experiences. Inspired by Park et al. ’s use of expert reection mo dule involving simulated experts from 4 branches of social sciences [ 24 ], I-nodes are gen- erated via LLM annotation simulating expert p erspectives from psychologists, sociologists, and anthropologists. Edge Grammar: Edges are restricted to four types: F → S (spa- tial, temp oral, and relational markers), F → F (temporal or causal relations between events), F → I (derived meanings from events), and I → F (values inuencing actions). For F → S edges, we role-type participants using an inventory adapted from PropBank [ 22 ] and FrameNet [ 1 ] for the autobiographical domain (see Appendix A for full edge inventory). Certain edge types are deliberately excluded: S → F (redundant with role-typed F → S), I → S (values ar e always mediated through factual experiences), and I → I (prevents unbounde d chaining of abstract meanings). This maintains balance between expressive power and structural tractability . 2.2 Privacy-Preserving Graph Aggr egation Individual persona graphs are merged into the unigraph through two operations. First, label genericization : factual nodes are con- verted to their generic variant with non-generic entities extracted and connected via an F->S edge (e.g., F: "Graduated from Harvard University" → F: "Graduated from University" + S: "Harvard Uni- versity") to reduce direct identiability while preserving seman- tic relationships. Second, node merging : nodes with identical or semantically e quivalent lab els across personas are merged, with provenance tags tracking source contributions. This creates shared "hubs" (e .g., a generic "Father" S-node, or generic "Economic Inse- curity Driving A cademic Achievement" I-node connecting multiple personas’ experiences) that enable cross-persona traversal while obscuring individual trajectories. Notably , the discrete graph struc- ture supports dierentially private (DP) [ 8 , 9 ] merging 1 , though we defer systematic DP evaluation to futur e work given the node sparsity challenges at our limited agent bank size of N=30. 2.3 Graph T raversal Sampling After constructing a unigraph of merged persona graphs, we adopt a variant of the Random W alk with teleportation traversal strategy [ 17 ] to sample novel persona graphs. T o reduce the risk of sam- pling incoherent or implausible personas, we introduce Thematic Random W alk which maintains a ’theme anchor’ during traversal represented as a vector embedding of a chosen I-node. W e use 1 W e construct the unigraph as a private histogram of no des using a dierentially private set union [ 12 , 13 ], which prunes nodes that are underrepresented in agent bank to avoid leakage. Synonymix CHI EA ’26, April 13–17, 2026, Barcelona, Spain (a) I-no de: Economic insecurity driving academic achievement. (b) Persona 1: I-no de reected in night shifts during schooling. (c) Persona 2: Same I-node reected in promise to father to never be stuck in a dead-end job. (d) S-no de hub (Father) connecting personas and events. Figure 2: Unigraph Traversal on 4 Personas I-nodes as anchors because they pro vide abstraction over hetero- geneous events: dierent source personas may have factual nodes that instantiate the same underlying theme (I). Thus, anchoring to interpretations allows recombination of events across sources while maintaining motivational coherence, and w e instantiate the strength of thematic anchoring as a tunable parameter in graph traversal. 3 Results and Analysis Our evaluation uses LLM-generated life story personas to establish internal validity: whether the Synonymix pipeline preserves su- cient behavioral signal in source personas. The algorithmic nature of the pipeline operations suggests that they should transfer to real-world data, though empirical validation remains futur e work. 3.1 Data Preparation W e construct three agent banks (N=30) from demographic see ds drawn from the General Social Survey 2022 [ 7 ], a longitudinal U.S. survey tracking attitudes on politics, religion, and social trust that is widely used in social scientic research. Demographic Agent Bank (D) : Agents initialized with a con- catenated string of randomly-sampled respondents’ available de- mographic seeds from a set of 68 questions. Life Stor y Agent Bank (L) : Narrative personas generated from the demographic seeds using McAdams’ life story inter view frame- work, a foundational instrument in narrative psychology for elicit- ing identity-shaping experience [18]. Frankenstein Agent Bank (F) : Synthetic p ersonas produced by Synonymix pipeline: life graphs are extracted from L bank personas, merged into a unied graph structure, "Frankenstein" graphs are generated via graph traversal algorithms, and these are used to reconstruct coherent textual narratives via a McA dams prompt. Refer to Appendix C for example personas, and Appendix D for discussion on methods used to generate life stor y personas. 3.2 Evaluation Design 3.2.1 Evaluating Behavioral Signal Preservation. W e evaluate all three agent banks on a ltered set of 108 non-demographic items from the GSS 2022 Core questionnaire comprising both ordinal and nominal questions (See Appendix B for item examples). W e’re interested in using population-level distributional distance as a CHI EA ’26, April 13–17, 2026, Barcelona, Spain Chen and Kumar proxy for a simulation’s delity as Synonymix ’s F agent bank lacks a ground truth for individual-level metric e valuation. For each of the ordinal items ( 𝑛 = 69 ), we calculate the Earth Mover’s Distance (EMD) , a metric measuring the minimum "work" required to transform one distribution into another while respecting ranked ordering. For nominal items ( 𝑛 = 39 ), we calculate the T otal V ariation Distance (TVD) , which measures how much the two distribution disagree overall when treating category mismatches equally . Using the corresponding measure for each item type, we calculate the following distance metrics per question: • Enrichment distance, dist ( 𝐷 , 𝐿 ) : The distance between the response distribution of all D bank and L bank agents. Represents the behavioral signal added by narrative expan- sion • Transformation distance, dist ( 𝐿, 𝐹 ) : The distance between the response distribution of all L bank and F bank agents. Rep- resents the behavioral signal lost by the Synonymix pipeline. Our primary hypothesis is that the average transformation dis- tance < enrichment distance across all items, which would indicate that Synonymix pipeline loses less behavioral signal than that gained by narrative expansion from demographic personas . This would val- idate graph-base d transformation as a viable approach for creat- ing high-delity agent banks that preserve meaningful behavioral grounding while mitigating re-identication risk. W e test this hypothesis using a one-side d Wilcoxon signed-rank test, a non-parametric paired test appropriate for comparing dis- tances within items without distributional assumptions. W e report rank-biserial correlation ( 𝑟 ) as eect size and interpret 𝑟 > 0 . 5 as large and 𝑟 > 0 . 3 as medium. 3.2.2 Evaluating Privacy . T o evaluate the degree of privacy guaran- tee provided by Synonymix , we examine Maximum Source Contri- bution (MSC) , the empirical measure of the proportion of a synthetic persona’s nodes attributable to its most dominant source individ- ual. Using fractional attribution (merged nodes split credit e qually among contributing sources), MSC quanties whether synthetic personas represent genuine re-combinations or lightly-perturbed copies. W e adopt MSC < 0 . 50 as our threshold: no single source should contribute a majority of any synthetic persona’s content. 3.3 Synonymix Persona Performance 3.3.1 Behavior Signal Preservation. Figure 3 presents the core com- parison across all ordinal and nominal questions respectively . For ordinal questions, transformation distance was signicantly lower than enrichment distance ( dist ( 𝐿, 𝐹 ) = 0 . 061 < dist ( 𝐷 , 𝐿 ) = 0 . 094 ), supporting our hyp othesis (Wilcoxon W , 𝑝 < 0 . 001 , 𝑟 = 0 . 585 , large eect). 65% of ordinal questions show ed transformation distance below enrichment distance. For nominal questions, the pattern held directionally ( dist ( 𝐷 , 𝐿 ) = 0 . 074 < dist ( 𝐿, 𝐹 ) = 0 . 111 ) with medium eect size ( 𝑟 = 0 . 462 ), but did not reach statistical signif- icance ( 𝑝 = 0 . 103 ) likely due to limite d statistical power ( 𝑛 = 39 vs. 𝑛 = 69 ). Only 49% of nominal questions showed transformation distance below enrichment distance. 3.3.2 Privacy Preser vation. Across all F bank personas, MSC re- mained well b elow threshold (mean=0.129, SD=0.031, range: 0.091–0.195), (a) Aggregate enrichment and transformation distance (b) Per-question enrichment and transformation distance Figure 3: TOP: Box-plot comparison of aggregate enrichment (D → L) and transformation (L → F) distances across all GSS items. Diamonds indicate means; horizontal lines indicate medians. T wo enrichment outliers ( 0 . 4 , 0 . 5 ) for nominal ques- tions not shown for scale. BOT TOM: Per-item comparison of transformation vs. enrichment distance. Points b elow the diagonal indicate items where the pipeline loses less signal than narrative expansion adds. with 100% of personas below 0.50. On average, each synthetic per- sona dre w from 29.4 of 30 sour ce individuals. These results conrm that Synonymix produces genuine re-combinations rather than identiable copies of source personas. 4 Discussion 4.1 Generating Synthetic, High-Fidelity Agent Banks from Grounded Data Synonymix enables the creation of high-delity synthetic life story agent banks gr ounded in real-person data while preserving privacy through the aordances of symbolic graph structures (node merg- ing, label genericization, potential dierential privacy ). While we used life stor y p ersonas as proof of concept, graph-based abstraction oers a generalizable pattern for privacy-preserving synthetic data generation in domains with inherent relational structure. Practi- tioners can generate synthetic p ersonas beyond the original dataset Synonymix CHI EA ’26, April 13–17, 2026, Barcelona, Spain size through graph traversal, thereby addressing a bottlene ck in LLM-based high-delity agent generation: the tendency to produce attened demographic caricatures [5]. A potential caution, however , p ertains to the risk of generating incoherent or implausible life stories. While we tried to mitigate this risk through Thematic A nchoring , the nature of probabilistic graph traversal implies the possibility of generating improbable personas who might, for instance, be a "chronically-ill 90 y ear-old" that "en- joys tennis" . Although one could argue that such individual-level implausibilities matter less when studying aggr egate, population- level dynamics, futur e work should explore additional coher ence constraints to broaden the pipeline’s applicability . In particular , ensuring the preservation of the unique lived experiences of in- dividuals with intersectional identities remains a critical area for renement. 4.2 Ethical Considerations & Future V alidation Synonymix interfaces with the competing values of integrity and privacy . Due to ethics-related concerns about using high-delity human data in an unvalidated pipeline, we e xperiment only with synthetic agent banks, with ndings that conrm internal valid- ity alone. Probabilistically-sampled narrativ es do not capture the nuances of real human personas, raising concerns about consis- tency , granularity , and representation of intersectional identities. Nonetheless, our results show initial promise motivating empirical evaluation on real-human data - exploring whether the pipeline can faithfully capture persona richness while defending against structural [21] and LLM-driven [16] de-anonymization attacks. 4.3 Meso-Level Interactions Beyond synthetic persona generation, the unigraph itself consti- tutes an interactive artifact - a queryable, high-delity representa- tion of a colle ctive that preserves individual nuance while remaining shareable without compromising privacy . This enables what we propose as the ’meso-le vel’ interactions of generative agent simula- tions: simulations at a scale between individual-level AI personas [ 19 ] and population-level modeling [ 6 ]. At this scale, we can explore the utility of simulation b eyond predictive accuracy or believability , thereby opening up a design space for simulations as a substrate for the exploration of counterfactual selves and collectives at the level of the group . W e identify two potential applications from our pr ototype imple- mentation. First, cross-temporal and cross-demographic sensemaking : constructing "group personas" (e.g., "the Class of 1985") to e xplore how the collective experienced historical moments through navi- gating shared S-no de hubs, as w ell as e xploring the tension b etween I-node thematic similarities and the associated F-node lived expe- riences. And second, consensus building : using the graph to iden- tify convergent and divergent interpr etations within stakeholder groups to support collaborative deliberation. Figure 2 illustrates an example of the former . More broadly , treating the group as a representational unit surfaces a design space not just for within- group sensemaking, but for b etween-group dynamics - exploring how collectives dier and converge in ways that ar e invisible at the individual scale and collapse into aggregate noise at the population scale. What unies these applications is their orientation rather than their scale - a mo de of engagement better suited to open-ended inquiry than to prediction or output. Zhang’s framing of the non- consequential and the diale ctical in HCI is instructive here [ 27 ]: interactions that are valuable not for the answers they produce but for the thinking the y provoke, and that unfold thr ough tension and exchange rather than resolution. This, we suggest, is the distinctiv e promise of meso-level simulation and the sensibility that should inform its design. 4.4 Ontologies of Generative Agent Simulations The design choices emb edded in Synonymix make visible some- thing that has received little explicit attention in the generative agent literature: that every simulation encodes an ontology of per- sonhood. What is a person? What counts as a unit of memory? Is a person dened by a stable set of attributes, or are they constituted through their relations and so cial roles? These are not technical questions, and dierent answers open up fundamentally dierent design spaces. Current generative agent architectures tend to inherit the as- sumptions of methodological individualism: agents as discrete units with stable attribute bundles, and collectives as aggregates of indi- viduals. But this is one ontological choice among many . Emirbayer’s relational sociology , for instance, proposes foregrounding the rela- tions over the substance - identity as constituted through relation rather than prior to it [ 10 ]. Other traditions resist the individual as a meaningful unit of analysis altogether . Each implies dierent answers to questions simulation designers rarely ask explicitly . The unigraph oers one instantiation of what it looks like to build from a dierent set of answers - specically , one that con- ceives the group as a unied structure through which individuals are constituted rather than a mere aggregate of atomic individu- als. As Winograd and Flores argue, our design shapes what can be thought and done within a system [ 26 ], and this is espe cially consequential in generative agent simulation where the object of analysis is the so cial world in all its messiness. The assumptions we encode about personhoo d, memory , colle ctive life and beyond do not merely shape what our systems can represent but also what we think is w orth representing. W e invite the HCI community to ask what interaction possibilities might be opened up by alterna- tive ontological commitments. The ontology of generative agent simulation is an open design question. 5 Conclusion W e presented Synonymix , a pip eline for constructing privacy- preserving synthetic agent banks from life stor y personas via graph- based abstraction, demonstrating that it preserves b ehavioral signal beyond demographic-only baselines while producing personas that are composite of many but copies of none . Beyond synthetic data generation, we propose the unigraph as an artifact for meso-level simulation - enabling interaction with group-level repr esentations grounded in individual-level nuance. This opens design questions we invite the CHI community to explore: What modalities might meso-level simulations enable? Can life story p ersonas adequately capture the texture of lived realities? What alternativ e ontological frames can simulations adopt, and what does each highlight and CHI EA ’26, April 13–17, 2026, Barcelona, Spain Chen and Kumar obscure? And, where should generative agent simulation head if we move beyond predictive accuracy - beyond treating individuals as "high-delity" data points that are data points nonetheless? Acknowledgments W e are grateful to Jo on Sung Park, Helena V asconcelos, Carolyn Zou, and Jonne Kamphorst for their advice and feedback during var- ious phases of this work. W e thank the re viewers for their feedback and borrow their vocabulary to delineate the capabilities and limi- tations of Synonymix . LLMs w ere utilized for (1) co ding the graphs and tables, and (2) editing and typesetting the written pr oduct (text, citations, etc.) of this work. References [1] Collin F Baker , Charles J Fillmore, and John B Lowe. 1998. The berkeley framenet project. In COLING 1998 V olume 1: The 17th International Conference on Compu- tational Linguistics . [2] Jan Batzner , V olker Sto cker , Bingjun T ang, Anusha Natarajan, Qinhao Chen, Stefan Schmid, and Gjergji Kasneci. 2025. Whose personae? synthetic persona experiments in llm research and pathways to transparency . In Proceedings of the AAAI/ACM Conference on AI, Ethics, and So ciety , V ol. 8. 343–354. [3] Stuartk Card, Thomas P. Moran, and Allen Newell. 1986. The model human processor- An engineering model of human performance. Handbook of perception and human performance. 2, 45–1 (1986), 1–35. [4] Louis Castricato, Nathan Lile, Rafael Rafailov , Jan-Philipp Fränken, and Chelsea Finn. 2025. PERSONA: A Reproducible Testbed for P luralistic Alignment. In Proceedings of the 31st International Conference on Computational Linguistics , Owen Rambow , Leo Wanner , Marianna Apidianaki, Hend Al-Khalifa, Barbara Di Eugenio, and Steven Schockaert (Eds.). Association for Computational Linguistics, Abu Dhabi, U AE, 11348–11368. https://aclanthology .org/2025.coling- main.752/ [5] Myra Cheng, Tiziano Piccardi, and Diyi Y ang. 2023. CoMPosT: Characterizing and evaluating caricature in LLM simulations. arXiv preprint (2023). [6] A yush Chopra. 2025. Large population models. arXiv preprint (2025). [7] Michael Davern, Rene Bautista, Jeremy Freese, Pamela Herd, and Stephen L. Morgan. 2024. General Social Survey , 2022. Machine-readable data le. Data accessed from «!nav»GSS Data Explorer«!/nav» at gssdataexplorer .norc.org. [8] Cynthia Dwork, Frank McSherr y , Kobbi Nissim, and Adam Smith. 2006. Cali- brating Noise to Sensitivity in Private Data Analysis. In Theor y of Cr yptography , Shai Halevi and T al Rabin (Eds.). Springer Berlin Heidelberg, Berlin, Heidelb erg, 265–284. [9] Cynthia Dwork and Aaron Roth. 2014. The Algorithmic Foundations of Dier- ential Privacy . Found. Trends The or . Comput. Sci. 9, 3–4 (Aug. 2014), 211–407. doi:10.1561/0400000042 [10] Mustafa Emirbayer . 1997. Manifesto for a relational sociology . A merican journal of sociology 103, 2 (1997), 281–317. [11] Navid Ghaarzadegan, Aritra Majumdar, Ross Williams, and Niyousha Hos- seinichimeh. 2024. Generative agent-based modeling: an introduction and tutorial. System Dynamics Review 40, 1 (2024), e1761. [12] Sivakanth Gopi, Pankaj Gulhane, Janardhan Kulkarni, Judy Hanwen Shen, Milad Shokouhi, and Sergey Y ekhanin. 2020. Dierentially Private Set Union. In Pro- ceedings of the 37th International Conference on Machine Learning (Proceedings of Machine Learning Research, V ol. 119) , Hal Daumé III and Aarti Singh (Eds.). PMLR, 3627–3636. https://proceedings.mlr .press/v119/gopi20a.html [13] Sivakanth Gopi, Pankaj Gulhane, Janardhan Kulkarni, Judy Hanwen Shen, Mi- lad Shokouhi, and Sergey Y ekhanin. 2020. Dierentially private set union. In Proceedings of the 37th International Conference on Machine Learning (ICML’20) . JMLR.org, Article 340, 10 pages. [14] Pegah Jandaghi, Xianghai Sheng, Xinyi Bai, Jay Pujara, and Hakim Sidahmed. 2024. Faithful Persona-based Conversational Dataset Generation with Large Language Models. In Proceedings of the 6th W orkshop on NLP for Conversational AI (NLP4ConvAI 2024) , Elnaz Nouri, Abhinav Rastogi, Georgios Spithourakis, Bing Liu, Yun-Nung Chen, Y u Li, Alon Albalak, Hiromi W akaki, and Alexandros Papangelis (Eds.). Association for Computational Linguistics, Bangkok, Thailand, 114–139. https://aclanthology .org/2024.nlp4convai- 1.8/ [15] Maik Larooij and Petter Törnberg. 2025. Do large language models solve the prob- lems of agent-based modeling? a critical review of generative social simulations. arXiv preprint arXiv:2504.03274 (2025). [16] Simon Lermen, Daniel Paleka, Joshua Swanson, Michael Aerni, Nicholas Carlini, and Florian Tramèr . 2026. Large-scale online deanonymization with LLMs. arXiv preprint arXiv:2602.16800 (2026). [17] Jure Leskovec and Christos Faloutsos. 2006. Sampling from large graphs. In Pro- ceedings of the 12th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (Philadelphia, P A, USA) (KDD ’06) . Association for Computing Machinery , New Y ork, NY, USA, 631–636. doi:10.1145/1150402.1150479 [18] Dan P McAdams. 2008. The life story inter view . [19] Meredith Ringel Morris and Jed R Brubaker . 2025. Generativ e ghosts: Anticipating benets and risks of AI afterlives. In Proceedings of the 2025 CHI Conference on Human Factors in Computing Systems . 1–14. [20] T arek Naous, Michael J Ryan, Alan Ritter, and W ei Xu. 2023. Having beer after prayer? measuring cultural bias in large language models. arXiv preprint arXiv:2305.14456 (2023). [21] Arvind Narayanan and Vitaly Shmatikov . 2008. Robust de-anonymization of large sparse datasets. In 2008 IEEE Symposium on Security and Privacy (sp 2008) . IEEE, 111–125. [22] Martha Palmer , Daniel Gildea, and Paul Kingsbury. 2005. The proposition bank: An annotated corpus of semantic roles. Computational linguistics 31, 1 (2005), 71–106. [23] Joon Sung Park, Joseph O’Brien, Carrie Jun Cai, Meredith Ringel Morris, Percy Liang, and Michael S Bernstein. 2023. Generative agents: Interactive simulacra of human behavior. In Proceedings of the 36th annual acm symposium on user interface software and te chnology . 1–22. [24] Joon Sung Park, Carolyn Q Zou, Aaron Shaw , Benjamin Mako Hill, Carrie Cai, Meredith Ringel Morris, Robb Willer , Percy Liang, and Michael S Bernstein. 2024. Generative agent simulations of 1,000 p eople. arXiv preprint (2024). [25] Thomas C Schelling. 1971. Dynamic mo dels of segregation. Journal of mathe- matical sociology 1, 2 (1971), 143–186. [26] T erry Winograd, Fernando Flores, et al . [n. d.]. Understanding computers and cognition: A new foundation for design . V ol. 335. [27] Haoqi Zhang. 2024. Searching for the Non-Consequential: Dialectical Activities in HCI and the Limits of Computers. In Proceedings of the 2024 CHI Conference on Human Factors in Computing Systems . 1–13. [28] Jiayi Zhang, Simon Yu, Derek Chong, Anthony Sicilia, Michael R. T omz, Christo- pher D. Manning, and W eiyan Shi. 2025. V erbalized Sampling: How to Mit- igate Mode Collapse and Unlock LLM Diversity . arXiv:2510.01171 [cs.CL] https://arxiv .org/abs/2510.01171 A Edge Grammar Edges formalize relations between nodes, restricted to four types: F → S , F → F , F → I , and I → F . • F->S: Spatial, temporal, and relational markers. Exam- ple edge labels: by , of, at, in, for , to, from, during, using, with, via, on ). These lab els are standardized via a role registry (see below). • F->F: T emporal or causal relations b etween concrete events. Allowed edge labels: " precedes " (chronology only ), " enables " (prerequisite/resource relationship), " causes " (ex- plicit causal implication) • F->I: Derived meanings from factual events. Allowed edge labels: " yields " (factual event produces or giv es rise to an enduring interpretation (b elief/value/stance)), " evokes " (factual event triggers a transient interpretive reaction), " sup- ports " (factual event provides evidence that reinforces an existing interpretation) • I->F: V alues/Beliefs inuencing actions. Allowed e dge la- bels: " guides " (interpretation provides direction or motivation that leads to the factual outcome), " constrains " (interpretation imposes limits/conditions that shape the outcome) A.1 F → S Role Registr y For F → S edges, we role-type the participants and contextual an- chors of factual events. Our role inventory builds on the argument- structure conventions of PropBank [ 22 ] and the contextual frame semantics of FrameNet [ 1 ], but adapts them to the autobiographical domain by introducing interpretive nodes and restricting the set Synonymix CHI EA ’26, April 13–17, 2026, Barcelona, Spain of allowable e dge typ es. The allowed roles include: A GEN T (the initiator of an action), P A TIEN T (the entity acted upon), LOCA- TION (the place where the event occurred), ORGANIZA TION (the institution involved), DISCIPLINE (the activity domain or eld of practice), INSTRUMEN T (a tool or enabling method), RECIPIEN T (the beneciary of the action), and TIME (the time p eriod during which the ev ent occurred). While TIME is not typically repr esented as a subject in linguistic role inventories, we include it here both to anchor autobiographical events chronologically and to support resampling during graph traversal. Attaching temporal markers as roles rather than embedding them in factual node labels allows them to be exibly perturbed or reassigned during unigraph sam- pling, thereby obscuring sensitive details (e.g., shifting the period of a life event) while preserving aggregate temporal patterns. B GSS 2022 Core - Example Items B.1 Ordinal Items { NATSPACY: { "question": "Are we spending too much, too little, or about the right amount on Space exploration?", "options": { "1": "TOO LITTLE", "2": "ABOUT RIGHT", "3": "TOO MUCH" }, "DEMOGRAPHIC": false, "ordinal": true, "options_count": 3, }, "DISCAFFWV}": { "question": "What do you think the chances are these days that a woman won ' t get a job or promotion while an equally or less qualified man gets one instead. Is this very likely, somewhat likely, somewhat unlikely, or very unlikely these days?", "options": { "1": "VERY LIKELY", "2": "SOMEWHAT LIKELY", "3": "NOT VERY LIKELY", "4": "VERY UNLIKELY" }, "DEMOGRAPHIC": false, "ordinal": true, "options_count": 4 }, "CONFED": { "question": "As far as the people running the EXECUTIVE BRANCH OF THE FEDERAL GOVERNMENT in this country are concerned, would you say you have a great deal of confidence, only some confidence, or hardly any confidence at all in them ?", "options": { "1": "A GREAT DEAL", "2": "ONLY SOME", "3": "HARDLY ANY" }, "DEMOGRAPHIC": false, "ordinal": true, "options_count": 3 }, "HAPMAR": { "question": "Taking things all together, how would you describe your marriage? Would you say that your marriage is very happy, pretty happy, or not too happy?", "options": { "1": "VERY HAPPY", "2": "PRETTY HAPPY", "3": "NOT TOO HAPPY" }, "DEMOGRAPHIC": false, "ordinal": true, "options_count": 3 }, "WLTHHSPS": { "question": "On a scale of 1 to 7 with 1 being ' rich ' and 7 being ' poor ' , with 4 being ' somewhere in between ' , do you think people in this group tend to be rich or poor?: Hispanic Americans?", "options": { "1": "1 - RICH", "2": "2", "3": "3", "4": "4", "5": "5", "6": "6", "7": "7 - POOR" }, "DEMOGRAPHIC": false, "ordinal": true, "options_count": 7 }, B.2 Nominal Items { "CAPPUN": { "question": "Do you favor or oppose the death penalty for persons convicted of murder?", "options": { "1": "FAVOR", "2": "OPPOSE" }, "DEMOGRAPHIC": false, "ordinal": false, "options_count": 2 }, CHI EA ’26, April 13–17, 2026, Barcelona, Spain Chen and Kumar "SUICIDE2": { "question": "Do you think a person has the right to end his or her own life if: This person has gone bankrupt?", "options": { "1": "YES", "2": "NO" }, "DEMOGRAPHIC": false, "ordinal": false, "options_count": 2 }, "ABPOOR": { "question": "Please tell me whether or not you think it should be possible for a pregnant woman to obtain a legal abortion if: The family has a very low income and cannot afford any more children?", "options": { "1": "YES", "2": "NO" }, "DEMOGRAPHIC": false, "ordinal": false, "options_count": 2 } C Demographic and Life Stor y Personas C.1 Demographic Persona Example "I am a 26-year-old male. I identify as black or african american and am not Hispanic. I was born in the United States. I live in a medium size central city in Middle Atlantic America (New Y ork, New Jersey , Pennsylvania). " , "I am never married. I have no children. I complete d 4 years of college and attaine d a bachelors degree. I studied optometry . " , "I work full-time. I am employed by a private employer , not the government. I have a supervisor and also sup er- vise others. My family income last year was 90 , 000 𝑡 𝑜 109,999, which I consider average. My p ersonal earnings were 50 , 000 𝑡 𝑜 59,999. I identify as working class. In the last ten years, I have not been un- employed for more than a month. When I w orked in recent years, it was typically full-time. I have no military service. " , "I have no religious preference and was raised protestant. Politically , I identify as independent. When I was 16, I liv e d with my mother only in a small city or town (under 50,000) in the middle atlantic region. At that time, I was living in same state , same city . My family income at that time was below average. My mother completed 12th grade. She worked for pay while I was growing up. Regar ding my parents’ birthplaces: mother yes, father no. I have four siblings. " C.2 Life Story Persona (With McAdam’s Prompt Guidance) "I grew up in a cramped two - bedroom apartment on the north side of a mid - Atlantic city , the kind of place where the hallway always smelled faintly of laundry detergent and the distant rumble of the commuter train could be heard through the thin walls. I was eight, perched on the cracked kitchen oor , watching my mother iron a shirt while the radio played an old Motown record. The heat from the iron warmed my back, and the scent of fresh laundr y mixe d with the faint grease of the fried chicken we ’d just eaten. When she glanced up and said, “Y ou’re the only one who can keep the lights on for us, ” I felt a sudden weight settle in my chest—a mix of pride and the pressure of being the one who had to help. That night I promised myself I would nish school, no matter how hard it got. ", " At fteen, I was standing in the cramped hallway of the public high school’s science lab, the uorescent lights buzzing overhead, the smell of chemicals sharp in the air . My biology teacher , Mr . Alvarez, handed me a slide of a rabbit’s eye and asked, “What do you se e?” I felt my heart race, the buzz of the uorescent tub es matching the buzz in my ears. I whisp ered, “ A future, ” and he smiled, “Then you’ll need to see farther . ” That moment nudged me toward a science track, ev en though my mother , who had only a high - school diploma, warned that “college is a luxury . ” I took that warning as a challenge, not a barrier . " , "When I was twenty - two, I walked into the bustling downtown optometr y clinic where I had just be en hired as an assistant. The waiting room smelled of antiseptic and cheap coee; the hum of the auto - refractor lled the space. My supervisor , a woman with a scar across her cheek, handed me a stack of patient charts and said, “Y ou’re the rst Black guy they’ve hired in years—don’t let them forget why you’re her e. ” My palms sweated, but the click of the keyboard under my ngers felt like a promise. I spent the next year studying for the optometry licensing exam, pulling all - nighters in a tiny studio apartment where the radiator hissed like a low - wail. Passing that exam taught me that the path to a stable paycheck could be forge d with relentless, quiet work. " , "The rst r eal crisis hit when I was twenty - six, three months into my new r ole as a licensed optometrist at a private practice. I was in the break room, the uorescent light ickering, the smell of burnt coee thick, when the oce manager called me into her glass - walled oce. She stared at me over her glasses, voice at, “W e have to cut sta; your position is being eliminated. ” The words hit like a cold draft. I felt my stomach drop, the taste of metal in my mouth, and the fear that the rent I shar ed with my partner , Maya, would slip through our ngers. I left the room clutching a stack of patient les, the weight of the future pressing against my shoulders. That day I learned that security could e vaporate in an instant, and I resolved to build a safety net, even if it meant picking up extra shifts at the local urgent - care center . " , "T wo weeks later , I was at home in our modest two - bedroom apartment, the kitchen light buzzing, the scent of Maya’s jasmine shampoo lling the air . W e sat at the kitchen table, a stack of unpaid bills spread before us, the rent notice glaring in red ink. I could feel the paper’s rough texture against my ngertips, hear the ticking of the wall clo ck, and sense Maya’s hand trembling in mine . W e talked about taking a second job, ab out cutting back on groceries, about the fear that my student loans would never disapp ear . The moment cemented a new resolve: I would start a side business oering vision screenings at community centers, using the equipment I’d saved from the clinic. It taught me that survival sometimes means turning a setback into a side hustle. " , "Now , at twenty - six, I’m working full - time at a private optometry practice, super vising two technicians, while living with Maya in the same house we ’ve made our own. My income sits at ab out $57,000 a year , which, after rent, utilities, and Maya’s student loans, leaves Synonymix CHI EA ’26, April 13–17, 2026, Barcelona, Spain little room for anything bey ond the essentials. I sp end my evenings after the last patient leaves, the clinic lights dimming, the scent of disinfectant lingering, scrolling through insurance forms and budgeting spreadsheets. I’m constantly weighing whether to apply for a loan to buy a second set of lenses for my side business or to put a little extra toward a down - payment on a house. The future feels like a tightrope, but the rope is my own making, and I’m still learning how to keep my balance. " D Persona Method Discussion For p ersona generation and evaluation pipeline, we call gpt-oss- 120b model hosted on Lightning.AI for inference due to compute constraints. D .1 Generating Life Story Persona (L) An issue we noticed during synthetic life story generation is that while the facts of the life stories are dierent, the structure and lin- guistic style is remarkably similar . For example, multiple personas used the same linguistic markers of the "uorescent lamp" and the "kitchen oor" , which we attribute to LLM stylistic mode collapse. T o mitigate this, we augmented the persona generation metho d with v erbalized sampling - a method for sampling response outputs from the tail end of the probability distribution [ 28 ]. However , sim- ilarly low-probability life story outputs still consistently retained similar stylistic markers. While this does not aect our evaluation’s test for internal validity , we would note this here as a core limitation of LLM generated synthetic dataset.

Original Paper

Loading high-quality paper...

Comments & Academic Discussion

Loading comments...

Leave a Comment