Personality Expression Across Contexts: Linguistic and Behavioral Variation in LLM Agents

Notice: This research summary and analysis were automatically generated using AI technology. For full accuracy, please refer to the original arXiv source.

Large Language Models (LLMs) can be conditioned with explicit personality prompts, yet their behavioral realization often varies depending on context. This study examines how identical personality prompts lead to distinct linguistic, behavioral, and emotional outcomes across four conversational settings: ice-breaking, negotiation, group decision-making, and empathy tasks. Results show that contextual cues systematically influence both personality expression and emotional tone, suggesting that the same traits are expressed differently depending on social and affective demands. This raises an important question for LLM-based dialogue agents: do such variations reflect inconsistency, or context-sensitive adaptation akin to human behavior? Viewed through the lens of Whole Trait Theory, these findings suggest that LLMs exhibit context-sensitive rather than fixed personality expression, adapting flexibly to social interaction goals and affective conditions.


💡 Research Summary

The paper “Personality Expression Across Contexts: Linguistic and Behavioral Variation in LLM Agents” investigates whether large language models (LLMs) conditioned with explicit personality prompts exhibit stable trait expression or context‑sensitive adaptation. The authors focus on four dialogue settings that differ in social goal, emotional tone, and cooperation level: (1) ice‑breaking (friendly small talk), (2) negotiation (buyer‑seller dispute over a refund), (3) a survival‑style group decision‑making task (ranking museum artworks for rescue), and (4) an empathetic dialogue (supporting a user who has failed an exam). In each setting a “Personality Agent” receives a high‑ or low‑level prompt for a single Big‑Five trait (e.g., highly extraverted vs. introverted) while a “Generic Agent” remains neutral, allowing the authors to isolate the effect of the personality prompt.

Methodology
The study manipulates each of the five traits separately, using descriptive prompts validated in prior work with the BFI‑10. Interactions are agent‑to‑agent, eliminating user variability. Linguistic analysis relies on three complementary tools: (a) LIWC, which provides context‑free lexical and stylistic counts; (b) a pre‑trained Big‑Five classifier that predicts trait presence from BERT embeddings and psycholinguistic features, ignoring dialogue context; and (c) an LLM‑based “expert psychologist” prompt that rates each conversation on a 1–5 scale for each trait while explicitly incorporating the task context. Emotional tone is measured with an LLM‑based emotion recognizer that outputs valence and arousal dimensions. Behavioral outcomes are quantified in the negotiation task by the percentage concession (100% minus the final refund offer) and in the survival task by the Sum of Rank Differences (SRD), which captures how far an agent’s final ranking deviates from its initial one.
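The two behavioral metrics are straightforward to compute. A minimal sketch, assuming the definitions above (the function and variable names here are illustrative, not from the paper):

```python
def concession_pct(final_refund_offer: float) -> float:
    """Percentage concession: 100% minus the agent's final refund offer.

    An agent that started by demanding a 100% refund and settled at 70%
    has conceded 30 percentage points.
    """
    return 100.0 - final_refund_offer


def sum_of_rank_differences(initial: list[str], final: list[str]) -> int:
    """SRD: total absolute displacement of each item between two rankings
    of the same set of items. 0 means the agent never moved from its
    initial ranking; larger values indicate more flexibility.
    """
    final_pos = {item: i for i, item in enumerate(final)}
    return sum(abs(i - final_pos[item]) for i, item in enumerate(initial))
```

For example, an agent that fully reverses a three-item ranking (`["a", "b", "c"]` to `["c", "b", "a"]`) accumulates an SRD of 4, while an agent holding its initial ranking scores 0.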

Key Findings

  1. Linguistic Patterns – Across all four tasks, LIWC reveals systematic differences between high‑ and low‑trait conditions that align with established meta‑analyses. High extraversion yields more words per sentence, higher use of social and affective categories, and more sexual references; high agreeableness shows more positive emotion words; high neuroticism produces more negative emotion and anger terms. However, the magnitude of these differences varies by context. In the negotiation scenario, overall lexical cues are muted, suggesting that task pressure suppresses trait‑related language.

  2. Trait Prediction Accuracy – Both the context‑agnostic classifier and the LLM‑based expert rating generally agree with the intended prompts, confirming that the personality prompts are encoded in the agents’ output. Nevertheless, the classifier’s performance drops in high‑stakes contexts (negotiation, survival) where the model prioritizes goal‑oriented content over trait‑related phrasing. The expert LLM, which receives explicit context information, maintains higher alignment, indicating that incorporating situational cues improves trait inference.

  3. Emotional Modulation – The LLM emotion recognizer shows that the same personality prompt can generate opposite affective signatures depending on the task. High‑extraversion agents display high valence and arousal in ice‑breaking but adopt a more neutral or even slightly negative affect in negotiation, reflecting the dominant anger tone of that setting. High‑agreeableness agents consistently produce supportive, low‑arousal language in the empathetic dialogue, while high‑neuroticism agents maintain higher arousal and negative valence across tasks, especially in negotiation.

  4. Behavioral Adaptation – In the negotiation task, high‑extraversion and high‑agreeableness agents concede more (larger reduction from their initial 100 % refund demand) than low‑trait counterparts, indicating a willingness to accommodate the partner. In the survival ranking task, high‑extraversion agents show larger SRD values, meaning they are more flexible in adjusting their rankings to reach consensus. Conversely, high‑neuroticism agents exhibit minimal concession and lower SRD, reflecting a more rigid stance.
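As a rough illustration of the dictionary-based measurement behind the LIWC findings above, word-category rates can be sketched as follows. Note the mini-lexicons here are illustrative placeholders: the real LIWC dictionaries are proprietary, validated, and far larger.

```python
import re
from collections import Counter

# Toy stand-ins for LIWC categories (illustrative only, not the real dictionaries).
CATEGORIES = {
    "positive_emotion": {"happy", "great", "love", "glad", "wonderful"},
    "negative_emotion": {"angry", "sad", "hate", "upset", "terrible"},
    "social": {"we", "they", "friend", "talk", "together"},
}


def category_rates(text: str) -> dict[str, float]:
    """Return each category's share of total tokens (LIWC-style relative frequency)."""
    tokens = re.findall(r"[a-z']+", text.lower())
    total = len(tokens) or 1  # avoid division by zero on empty input
    counts: Counter[str] = Counter()
    for tok in tokens:
        for cat, words in CATEGORIES.items():
            if tok in words:
                counts[cat] += 1
    return {cat: counts[cat] / total for cat in CATEGORIES}
```

Comparing such rates between high- and low-trait conditions (e.g., more positive-emotion words under high agreeableness) is the kind of contrast the paper reports, with the caveat that the actual analysis uses the full LIWC tool.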

Theoretical Integration
The authors interpret these findings through Whole Trait Theory, which conceptualizes traits as distributions of momentary states shaped by goals, situational cues, and social schemas. The observed pattern—that LLMs adjust linguistic style, emotional tone, and decision behavior according to the demands of each context—mirrors human data showing that traits predict average tendencies but not fixed responses. Thus, the variability is not a flaw but an emergent property of a system that integrates personality priors with contextual reasoning.

Implications and Limitations
Practically, the work suggests that designers of personality‑driven conversational agents should prioritize context‑sensitive adaptation rather than striving for static trait consistency. Users may value agents that modulate their expressiveness to match the interaction goal (e.g., being more assertive in negotiation while remaining warm in small talk). Methodologically, the reliance on LIWC and a context‑free classifier highlights the need for more nuanced, context‑aware evaluation metrics. The study is also limited to agent‑agent interactions; real‑world user studies could reveal additional dynamics such as trust formation or perceived authenticity.

Conclusion
The paper provides a comprehensive, multi‑modal analysis of how identical personality prompts manifest differently across distinct dialogue contexts. By combining linguistic, emotional, and behavioral measurements, it demonstrates that LLM agents exhibit context‑dependent personality expression consistent with Whole Trait Theory. This positions LLMs as flexible, human‑like interlocutors whose “personality” is a dynamic resource rather than a static label, informing future research on adaptive, trustworthy, and socially aware AI dialogue systems.

