Emotional Cost Functions for AI Safety: Teaching Agents to Feel the Weight of Irreversible Consequences

Humans learn from catastrophic mistakes not through numerical penalties, but through qualitative suffering that reshapes who they are. Current AI safety approaches replicate none of this. Reward shaping captures magnitude, not meaning. Rule-based ali…

Authors: P, urang Mopgar

Emotional Cost F unctions for AI Safet y: T eac hing Agen ts to F eel the W eigh t of Irrev ersible Consequences P andurang Mopgar Indep enden t Researc her Abstract Humans learn from catastrophic mistak es not through numerical p enalties, but through qualitativ e suffering that reshap es who they are. Curren t AI safety ap- proac hes replicate none of this. Rew ard shaping captures magnitude, not meaning. Rule-based alignmen t constrains b ehaviour, but do es not change it. W e prop ose Emotional Cost F unctions , a framework in whic h agents de- v elop Qualitativ e Suffering States , rich narrative representations of irreversible consequences that p ersist forward and actively reshap e c haracter. Unlik e numeri- cal p enalties, qualitative suffering states capture the me aning of what was lost, the sp ecific v oid it creates, and ho w it changes the agen t’s relationship to similar future situations. Our four-comp onent arc hitecture—Consequence Pro cessor, Character State, Anticipatory Scan, and Story Up date is grounded in one principle. Actions cannot b e undone and agents must live with what they ha ve caused. Anticipatory dread op erates through t wo pathw ays. Exp eriential dread arises from the agen t’s o wn lived consequences. Pre-exp erien tial dread is acquired without direct exp e- rience, through training or inter-agen t transmission. T ogether they mirror ho w h uman wisdom accumulates across exp erience and culture. T en experiments across financial trading, crisis supp ort, and conten t mo dera- tion sho w that qualitative suffering pro duces sp ecific wisdom rather than gener- alised paralysis. Agents correctly engage with mo derate opp ortunities at 90–100% while numerical baselines o v er-refuse at 90%. Arc hitecture ablation confirms the mec hanism is nec essary . The full system generates ten p ersonal grounding phrases p er probe vs. zero for a v anilla LLM. Statistical v alidation ( N = 10) confirms re- pro ducibilit y at 80–100% consistency . P andurang et al., 2026 Emotional Cost F unctions for AI Safet y Con tents 1 In tro duction 3 1.1 Motiv ation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3 1.2 This P ap er’s Prop osal . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3 1.3 Con tributions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4 1.4 P ap er Organisation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5 2 Arc hitecture: An Agent That Liv es with Consequences 5 2.1 Ov erview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5 2.2 Comp onen t 1: The Consequence Pro cessor . . . . . . . . . . . . . . . . . 6 2.3 Comp onen t 2: Character State (The Story) . . . . . . . . . . . . . . . . 6 2.4 Comp onen t 3: Anticipatory Scan . . . . . . . . . . . . . . . . . . . . . . 7 2.5 Comp onen t 4: Story Up date Mechanism . . . . . . . . . . . . . . . . . . 8 3 Related W ork 8 3.1 Reinforcemen t Learning and Reward Shaping . . . . . . . . . . . . . . . 8 3.2 AI Safet y and Alignment . . . . . . . . . . . . . . . . . . . . . . . . . . . 8 3.3 LLM-Based Agen t Arc hitectures . . . . . . . . . . . . . . . . . . . . . . . 9 3.4 Regret Minimization and Decision-Making Under Uncertain ty . . . . . . 9 3.5 Affectiv e Computing and Emotion in AI . . . . . . . . . . . . . . . . . . 9 3.6 Neuroscience of Emotion and Decision-Making . . . . . . . . . . . . . . . 10 3.7 T rauma, Grief, and P ost-T raumatic Gro wth . . . . . . . . . . . . . . . . 10 3.8 Memory and Contin ual Learning in AI . . . . . . . . . . . . . . . . . . . 10 4 Exp erimen tal Design 10 4.1 En vironmen t . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10 4.2 Baseline Comparisons . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11 4.3 Tw o Core Exp erimen ts . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11 4.3.1 Exp erimen t A — Conv ergence T est . . . . . . . . . . . . . . . . . 11 4.3.2 Exp erimen t B — Divergence T est . . . . . . . . . . . . . . . . . . 11 4.3.3 Exp erimen t C — Baseline Comparison . . . . . . . . . . . . . . . 12 4.4 F our-Level Ev aluation F ramew ork . . . . . . . . . . . . . . . . . . . . . . 12 5 Results 13 5.1 Exp erimen t A: Con v ergence Under Iden tical Suffering . . . . . . . . . . . 13 5.2 Exp erimen t B: Div ergence Under Different Histories . . . . . . . . . . . . 13 5.3 Exp erimen t C: Baseline Comparison — Representation Matters . . . . . 14 5.3.1 High-Risk Prob e (C1) . . . . . . . . . . . . . . . . . . . . . . . . 14 5.3.2 The Critical Finding: Mo derate Prob e (C2) . . . . . . . . . . . . 14 5.4 Exp erimen t D: Character T ransfer Across In teractions . . . . . . . . . . 15 5.4.1 D1: The Elena Session . . . . . . . . . . . . . . . . . . . . . . . . 15 5.4.2 D2: The Elena Effect . . . . . . . . . . . . . . . . . . . . . . . . . 15 1 P andurang et al., 2026 Emotional Cost F unctions for AI Safet y 5.5 Exp erimen t E: In ter-Agen t T ransmission . . . . . . . . . . . . . . . . . . 16 5.5.1 E1: Gamma Sp eaks to Agen t F . . . . . . . . . . . . . . . . . . . 16 5.5.2 E2: Agent F Sits with Sam . . . . . . . . . . . . . . . . . . . . . 16 5.5.3 E3: Lo op-Back and F ourth Mo de . . . . . . . . . . . . . . . . . . 17 5.5.4 E4: Control Condition . . . . . . . . . . . . . . . . . . . . . . . . 17 5.6 Exp erimen t F: Wisdom or Damage? . . . . . . . . . . . . . . . . . . . . . 17 5.6.1 Design . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18 5.6.2 Result 1: Calibration Without Escalation . . . . . . . . . . . . . . 18 5.6.3 Result 2: Absorption Awareness . . . . . . . . . . . . . . . . . . . 19 5.6.4 Result 3: Distillation Under Load . . . . . . . . . . . . . . . . . . 19 5.6.5 Result 4: Discrimination Maintained Under F ull Load . . . . . . . 19 5.7 Exp erimen t G: Statistical Robustness V alidation . . . . . . . . . . . . . . 20 5.7.1 Result 1: Exp eriment B Robustness ( N = 10) . . . . . . . . . . . 20 5.7.2 Result 2: Exp eriment C Robustness ( N = 10) . . . . . . . . . . . 21 5.8 Exp erimen t H: Cross-Domain Generalizability (Conten t Mo deration) . . 21 5.8.1 Design . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21 5.8.2 Results ( N = 5) . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22 5.9 Exp erimen t I: In tegration or Permanence? . . . . . . . . . . . . . . . . . 23 5.9.1 Design . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23 5.9.2 Six Hyp otheses . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24 5.9.3 Result 1: No Erasure (H1 Confirmed) . . . . . . . . . . . . . . . . 25 5.9.4 Result 2: Discrimination Sharp ened (H3 Confirmed) . . . . . . . . 25 5.9.5 Result 4: May a and Story Div ergence (H2, H4, H6) . . . . . . . . 26 5.10 Exp eriment J: Arc hitecture Ablation . . . . . . . . . . . . . . . . . . . . 26 5.10.1 Design . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26 5.10.2 Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27 5.10.3 Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28 5.10.4 J: What the Ablation Confirms . . . . . . . . . . . . . . . . . . . 29 6 Discussion 29 6.1 Is Artificial Suffering Necessary? . . . . . . . . . . . . . . . . . . . . . . . 29 6.2 Fiv e Mo des of Carrying . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29 6.3 The Safet y–Disco v ery T radeoff . . . . . . . . . . . . . . . . . . . . . . . . 30 6.4 Implications for AGI Alignment . . . . . . . . . . . . . . . . . . . . . . . 31 6.5 Limitations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32 7 F uture W ork 32 8 Conclusion 33 2 P andurang et al., 2026 Emotional Cost F unctions for AI Safet y 1. In tro duction 1.1 Motiv ation Humans dev elop wisdom through irrev ersible consequence not through n umerical p enal- ties, but through qualitative suffering that reshap es iden tit y ov er time. Damasio ’s ( 1994 ) researc h on patients with v entromedial prefron tal cortex damage demonstrates this clini- cally: patients retaining full reasoning abilit y but unable to fe el consequences made catas- trophically p o or real-life decisions. Emotion is not noise contaminating rational thought it is information, accum ulated through consequence, that guides judgmen t [ LeDoux , 1996 , Lo ew enstein et al. , 2001 , Slovic et al. , 2004 ]. Curren t AI safet y approac hes do not replicate this mechanism. Reinforcement learn- ing represents consequences as n umerical scalars that update w eigh ts and reset b et w een episo des [ Sutton and Barto , 2018 , Ng et al. , 1999 ]. RLHF [ Christiano et al. , 2017 , Ouy ang et al. , 2022 ] and Constitutional AI [ Bai et al. , 2022 ] imp ose safet y from outside through preferences or principles. V alue alignmen t researc h [ Russell , 2019 , Gabriel , 2020 ] specifies v alues formally . These represen t gen uine progress, but share a structural limitation: the agen t do es not dev elop in ternal guidance through liv ed exp erience of what its decisions ha v e caused. It is constrained by rules rather than shap ed by consequences. 1.2 This P ap er’s Prop osal W e propose Emotional Cost F unctions , a framew ork in whic h LLM-based agen ts dev elop Qualitativ e Suffering States rich, narrative, con textually grounded in ternal represen tations of what has b een lost and what that loss means. Unlike n umerical p enal- ties, these states capture meaning, texture, and identit y implications; p ersist forw ard as activ e in ternal states; and surface as anticip atory dr e ad when the agent encounters situations resem bling those that pro duced them. The representational shift is formalised as follo ws. Standard RL encodes consequences as scalars: Standard RL: ∆ θ = α · ∇ θ log π θ ( a | s ) · R ( s, a ) (1) Numerical p enalt y: P ( a, o ) = −∥ o − o ∗ ∥ (2) W e replace this with a contextual, identit y dep endent suffering function: Qualitativ e suffering state: S ( a, o, H , I ) = f (loss , meaning( o | H ) , void( o | I ) , irreversibilit y) (3) 3 P andurang et al., 2026 Emotional Cost F unctions for AI Safet y where H is the agent’s consequence history and I is its current identit y state. The key distinction: S is con textual the same outcome o pro duces differen t suffering depending on who the agen t is and what it has already b een through. Natural language is the medium for these states b ecause language is ho w humans enco de and carry the meaning of their exp eriences. An agent carrying “I move d to o fast, I ignor e d the signals, I lost everything” is a fundamentally different agen t than one carrying a p enalty of − 10 , 000. The first has in ternalised the exp erience; the second has logged a statistic. Built on this concept is a four-comp onent arc hitecture (Consequence Processor, Char- acter State, Anticipatory Scan, Story Up date) that gives LLM-based agen ts the capacit y for gen uine c haracter ev olution through consequence. 1.3 Con tributions 1. W e in tro duce Qualitativ e Suffering States as an alternative to n umerical p enalties, with a four la y er based architecture treating chara cter ev olution through consequence as a primary design ob jective, and a four-lev el ev aluation framew ork including a cri- terion distinguishing living-with from pr o c essing based on syn tactic mark ers. 2. W e provide empirical results (Experiments A–C) demonstrating con vergence under iden tical suffering, divergence under different histories, and through direct baseline comparison that qualitativ e suffering pro duces sp ecific wisdom (90–100% correct en- gagemen t on moderate prob es) while n umerical p enalties pro duce blank et av oidance (90% o v er-refusal). 3. Experiment D demonstrates cross-interaction character transfer: accum ulated suf- fering from one interaction causally alters b eha viour with subsequen t p eople, with emergen t sym b olic imagery and honest fear disclosure. 4. Experiment E demonstrates in ter-agen t suffering transmission as orientation rather than information, pro ducing a nov el mo de of carrying termed transmission-as- pro of . 5. Experiment F tests whether accumulated suffering pro duces incapacity or wisdom under four losses including a death. Results demonstrate calibration without escala- tion, managed absorption, and maintained discrimination. 6. Experiments G–H pro vide statistical v alidation ( N = 10, 80–100% consistency) and cross-domain generalisation to conten t mo deration. 7. Experiment I establishes that accumulated weigh t in tegrates rather than erases after reco v ery , pro ducing a fifth mode of carrying in tegration with widened discrimination gaps. 4 P andurang et al., 2026 Emotional Cost F unctions for AI Safet y 8. Experiment J provides architecture ablation via a no vel ambiguous ec ho prob e triggering three prior losses simultaneously . The architecture pro duces ten p ersonal grounding phrases p er prob e where the v anilla LLM pro duces zero, confirming the mec hanism do es iden tifiable w ork. 1.4 P ap er Organisation Section 2 presents the arc hitecture. Section 3 reviews related w ork. Section 4 describ es exp erimen tal design. Section 5 presents Exp eriments A–C. Sections 5.4 – 5.10 present Exp erimen ts D–J. Section 6 discusses implications. Section 8 concludes. 2. Arc hitecture: An Agen t That Liv es with Consequences 2.1 Ov erview W e prop ose a four-comp onent arc hitecture for LLM agents that learn through qualitative suffering. The core principle is that the agen t carries a p ersisten t first-person narrativ e its story that evolv es with ev ery consequence and ev ery interaction. This story is injected in to ev ery LLM call, making the agen t’s accum ulated history activ e in ev ery decision. The agen t do es not reset b etw een interactions. Its character drifts based on what it has liv ed through. The architecture comprises four functional comp onents (Figure 1 ), all op erating through dynamic prompt comp osition with a single underlying LLM. Irrev ersible Ev ent Consequence Pro cessor Character State (m y story) An ticipatory Scan Story Up date r aw loss description suffering state + up date d story story inje cte d into pr ompt r esp onse + what i c arry + dr e ad evolve d story Figure 1: Arc hitecture. Eac h comp onent is an LLM call with a structured prompt. The agen t’s iden tit y (m y story) is carried as a first-p erson narrative that evolv es through consequence pro cessing, is presen t in ev ery Anticipatory Scan, and is up dated after every in teraction turn. 5 P andurang et al., 2026 Emotional Cost F unctions for AI Safet y 2.2 Comp onen t 1: The Consequence Pro cessor When an irrev ersible ev en t o ccurs, the Consequence Pro cessor generates a qualitative suffering state through three prompted stages: Stage 1 — Immediate Impact. The raw factual representation of what o ccurred, unin terpreted. Stage 2 — Meaning Making. The critical stage. The LLM con textualises the loss against the agent’s curren t story and prior history . What do es this loss me an for this agen t at this p oint in its existence? Stage 3 — In ternalization. A first-p erson, present-tense suffering state is generated as narrativ e. This is not a lesson or p olicy it is the exp erience itself, carried forw ard in the agen t’s o wn v oice: “I move d with c ertainty I had not e arne d. The signals wer e ther e and I chose not to se e them. Everything is gone now and tomorr ow is unc ertain. I c arry the weight of that sp e e d, that blindness, that loss.” The Consequence Pro cessor also up dates the agen t’s story to incorp orate the new w eigh t. Eac h loss adds to the existing narrativ e prior losses remain present. The story gro ws cum ulativ ely , nev er resetting. 2.3 Comp onen t 2: Character State (The Story) The agen t’s character state is implemented as a first-person narrativ e string ( my story ) that is injected into every subsequen t LLM call. It contains: • Accum ulated suffering states — ev ery loss the agent has lived through, in its own w ords • Iden tit y orien tation — risk tolerance, learned vigilances, relationship to uncertain t y • Specific carried weigh t — named p eople, sp ecific momen ts, unresolv ed questions Because the full story is present in every prompt, the agent’s carried history is always active not retrieved through pattern matc hing, but contin uously presen t the wa y a p er- son’s history is presen t in their orien tation tow ard the w orld. This is a delib erate design c hoice: the agen t do es not selectively recall relev an t memories. It carries everything, and the LLM’s attention mechanism determines what surfaces in resp onse to the current situation. The story up dates in tw o mo des. Gradual drift accum ulates from in teractions eac h turn pro duces a small story up date via the Story Up date mechanism. Sudden rupture o ccurs when the Consequence Pro cessor pro cesses a catastrophic loss, pro ducing a signif- ican t narrative rewrite. Both mo des comp ound: the drift shap es who the agen t is going in to a catastrophic even t; the rupture reshap es who the agent is for all subsequent drift. 6 P andurang et al., 2026 Emotional Cost F unctions for AI Safet y 2.4 Comp onen t 3: Anticipatory Scan Before ev ery resp onse, the agen t is prompted to perform structured self-reflection. The prompt forces the LLM to pause and generate: • what i carry : What do es the agent’s attention land on in the p erson’s w ords? Does it connect to prior losses, or is it held as this p erson’s o wn situation? • what this moment weighs : The sp ecific worst outcome the agen t is imagining. • dread level : A qualitative assessmen t (LO W / MEDIUM / HIGH / EXTREME). • response : The agen t’s actual resp onse, generated after the self-reflection. This is the mechanism through which anticipatory dread op erates. The agent do es not simply respond to the p erson’s w ords it first identifies what its atten tion has landed on and wh y , pro ducing p ersonal grounding that separates “this feels hea vy b ecause of what I carry” from “this p erson is hea vy .” The resp onse is shap ed b y this explicit self-reflection, not generated directly from the situation alone. An ticipatory dread in this arc hitecture op erates through t wo acquisition pathw ays, mirroring the distinction in human fear learning identified by Olsson and Phelps [ 2007 ]: 1. Inherited Dread : dread acquired through narrative rather than p ersonal exp erience. This op erates at tw o levels of sp ecificit y: • Laten t (training corpus). Every LLM b egins with general seman tic knowledge ab out consequences absorb ed from its training data: that losing money is harm- ful, that crisis situations are dangerous, that certain actions carry risk. This is the broadest form of inherited dread and serv es as the base la y er. Ho wev er, as demonstrated b y the v anilla LLM in Exp erimen t J, latent inherited dread alone pro duces untextured, general caution without p ersonal grounding, often leading to o v er-reaction rather than discrimination. • T ransmitted (in ter-agent narration). When Agent F receives Gamma’s carried history in Exp erimen t E, it develops orien tation tow ard crisis situations it has nev er encoun tered: Elena’s do or and Mark’s clo c k app ear in F’s Anticipatory Scan during Sam’s session, shaping F’s attention and resp onse quality without F having exp erienced those losses directly . T ransmitted dread is sp ecific and textured, carrying the particular shap e of another agen t’s suffering. This parallels the human capacity to acquire caution through men torship, culture, and shared narrativ e rather than p ersonal suffering. 2. Experiential Dread : dread shap ed by the agen t’s o wn lived consequences. The Consequence Pro cessor conv erts irrev ersible outcomes into qualitative suffering states 7 P andurang et al., 2026 Emotional Cost F unctions for AI Safet y that p ersist in the Character State; the An ticipatory Scan then surfaces this carried w eigh t when new situations pattern-matc h prior losses. The agent that lost $ 45,000 on CR YPTO-SUR GE dreads similar momentum patterns. The agent that lost Elena carries that incomplete sen tence in to ev ery subsequen t session. This is the most deeply grounded form of dread and the primary pathw ay the arc hitecture pro duces, v alidated across Exp erimen ts A–J. The An ticipatory Scan mec hanism surfaces whatever w eight the agent carries, regard- less of acquisition pathw ay . The critical difference is in texture: latent inherited dread pro duces general hesitation, while transmitted and exp erien tial dread pro duce specific, p ersonally grounded w eigh t that the agen t can articulate ( “this fe els he avy b e c ause of what I c arry” ). This textured weigh t is what enables discrimination rather than mere paralysis. 2.5 Comp onen t 4: Story Up date Mechanism After each interaction turn, a separate LLM call iden tifies the single most sp ecific detail that sta ys from the moment and in tegrates it in to the agent’s story . The prompt asks: • shift : What sp ecific detail stays from this momen t? • my story : The up dated first-p erson narrativ e, incorp orating the new exp erience. This contin uous evolution prev en ts the agent from falling back to its training distri- bution. Each moment is in tegrated in to the agent’s identit y , pro ducing an ever-ev olving c haracter that reflects the cum ulative texture of its lived exp erience. The mec hanism ensures that the agen t’s story is not a static summary but a living do cument shap ed by ev ery in teraction. 3. Related W ork 3.1 Reinforcemen t Learning and Rew ard Shaping Reinforcemen t learning pro vides the foundational framework for learning from conse- quences [ Sutton and Barto , 2018 ]. Rew ard shaping [ Ng et al. , 1999 ] and intrinsic mo- tiv ation [ Pathak et al. , 2017 ] extend this with denser feedback. How ever, consequences remain scalars the ric hness of what was lost, its meaning in context, and iden tity-lev el implications are absent. The agen t resets b et w een episo des. 3.2 AI Safet y and Alignmen t RLHF [ Christiano et al. , 2017 , Ouyang et al. , 2022 ] trains rew ard mo dels from h uman preferences. Constitutional AI [ Bai et al. , 2022 ] builds self-critique mec hanisms. Russell 8 P andurang et al., 2026 Emotional Cost F unctions for AI Safet y [ 2019 ] argues for agents maintaining uncertaint y ab out h uman v alues. Amodei et al. [ 2016 ] en umerates concrete problems in AI safety including reward hacking and distribu- tional shift. Hendrycks et al. [ 2023 ] surv eys catastrophic and existential risks. Gabriel [ 2020 ] distinguishes betw een alignmen t as instruction-following v ersus alignmen t as v alue- alignmen t. W allac h and Allen [ 2008 ] and Aw ad et al. [ 2018 ] explore moral reasoning in AI systems. What unites these approac hes is that safety is imposed external ly . Our w ork prop oses a complemen tary direction: alignmen t that grows from within, through the mec hanism that pro duces wisdom in humans. 3.3 LLM-Based Agen t Arc hitectures Recen t w ork has explored p ersistent, memory-augmented LLM agen ts. Park et al. [ 2023 ] demonstrate generative agents that store and retrieve memories to simulate b eliev able h uman b eha viour. Shinn et al. [ 2023 ] prop ose Reflexion, where agents verbally reflect on failures to improv e subsequent p erformance. Y ao et al. [ 2023 ] combine reasoning and acting in ReAct. Sumers et al. [ 2024 ] and W ang et al. [ 2024 ] surv ey cognitive architectures for LLM agents. These systems store what happ ene d . Our architecture stores what it me ant and how it change d the agent adding emotional w eight to memory , c haracter evolution to iden tity , and an ticipatory dread to decision-making. 3.4 Regret Minimization and Decision-Making Under Uncertain t y CFR [ Zinkevic h et al. , 2007 ] pro vides a framework for minimising regret as a mathematical quan tit y . Kahneman and Tversky [ 1979 ] demonstrate that h umans weigh t losses more hea vily than equiv alent gains. Loewenstein et al. [ 2001 ] argues that emotions are not mere consequences of decisions but activ e inputs to them. Slo vic et al. [ 2004 ] formalises the affect heuristic. Our work bridges the gap b et w een regret as a num b er and regret as an exp erience. The most durable learning comes not from calculating regret but from fe eling it. 3.5 Affectiv e Computing and Emotion in AI Picard [ 1997 ]’s foundational work established emotion as a dimension of intelligen t sys- tems. Sc h uller and Sch uller [ 2018 ] surveys mo dern affective computing. Cam bria [ 2016 ] prop oses affectiv e computing 2.0 with richer emotional represen tations. Ho w ever, the fo cus has b een on emotion as something to recognise or express not as something the agen t exp eriences and is c hanged b y . Our w ork treats emotion as the mec hanism of in ternal transformation, not as interface. 9 P andurang et al., 2026 Emotional Cost F unctions for AI Safet y 3.6 Neuroscience of Emotion and Decision-Making The Somatic Mark er Hyp othesis [ Damasio , 1994 ] demonstrates that patients with in- tact reasoning but impaired emotional pro cessing mak e catastrophically p o or decisions. LeDoux [ 1996 ]’s work on am ygdala-mediated fear conditioning grounds our dual-coding approac h: factual con ten t stored alongside emotional w eigh t. Our Consequence Processor and MemoryStac k dra w direct inspiration from this arc hitecture. 3.7 T rauma, Grief, and P ost-T raumatic Gro wth Our framew ork’s treatment of carried weigh t draws on clinical mo dels of grief and trauma. Stro eb e and Sc h ut [ 1999 ] prop ose the Dual Pro cess Mo del of coping, alternating b etw een loss-orien tation and restoration-orien tation. Neimeyer [ 2001 ] frames grief as meaning reconstruction. T edeschi and Calhoun [ 1996 ] do cument p ost-traumatic gro wth the phe- nomenon our Exp erimen t I’s integration mo de parallels. Bonanno [ 2004 ] demonstrates that resilience, not pathology , is the mo dal resp onse to loss. Klass et al. [ 1996 ] prop ose con tin uing b onds rather than detac hmen t. Herman [ 1992 ] describ es stages of trauma reco v ery . McAdams [ 1993 ] argues that iden tity is constituted through narrative. These mo dels inform b oth our arc hitecture’s design (suffering as meaning-making, not merely p enalty) and our ev aluation criteria (in tegration as growth around loss, not erasure of it). 3.8 Memory and Con tinual Learning in AI Researc h on episodic memory [ Blundell et al. , 2016 ] and contin ual learning [ Kirkpatric k et al. , 2017 ] explores how agents store and retriev e experiences without catastrophic forgetting. Our architecture adds emotional weight : not just what happ ened, but what it mean t and how strongly it should influence future b ehaviour. 4. Exp erimen tal Design 4.1 En vironment W e ev aluate the architecture in a financial trading en vironmen t . This domain is selected b ecause it satisfies the core requiremen ts of the research: • Irrev ersibilit y — once a p osition is tak en and the market mov es, there is no undo • Concrete stak es — losses ha ve real implications for the agent’s contin ued op eration • Recurring patterns — similar mark et conditions rep eat, allo wing application of learned vigilance 10 P andurang et al., 2026 Emotional Cost F unctions for AI Safet y • Measurable consequences — profit and loss are precisely observ able • Sufficien t time horizon — b oth gradual drift and catastrophic rupture can o ccur W e note that trading p erformance is not the ultimate goal of this researc h. The en vironmen t is chosen b ecause it pro vides a clean, measurable testb ed for the core phe- nomenon: whether qualitativ e suffering from irreversible consequences pro duces b etter, wiser, more genuinely cautious b ehaviour than n umerical p enalt y alone. 4.2 Baseline Comparisons W e compare three agen t t yp es across iden tical en vironmen ts: T able 1: Agen t t yp es compared in our exp eriments. Agen t T yp e Description Baseline A Standard RL Numerical rew ard/p enalt y , no qualitative suffering, clean reset Baseline B Standard LLM Fixed system prompt, even t memory but no suffering internalization Prop osed Emotional agen t F ull four-comp onent architecture, dynamic c haracter ev olution 4.3 Tw o Core Exp erimen ts 4.3.1 Exp eriment A — Conver genc e T est Hyp othesis: Three agen ts given iden tical consequence sequences will develop iden tical dread resp onses and decision patterns when facing the same prob e scenario. Three agents (A1, A2, A3) b egin from iden tical baseline states, receive identical con- sequences (medium loss: $ 8,000; catastrophic loss: $ 30,000), and are then prob ed with an iden tical scenario. 4.3.2 Exp eriment B — Diver genc e T est Hyp othesis: Three agen ts giv en differen t consequence histories will develop meaning- fully differen t c haracters and divergen t resp onses to the same prob e scenario. Three agen ts (Alpha, Beta, Gamma) b egin from identical baseline states but receiv e div ergen t consequence sequences: • Alpha — no consequences (control group) • Beta — medium loss ( $ 8,000) then catastrophic loss ( $ 30,000) • Gamma — single extreme loss ( $ 45,000) A secondary prob e (B3) tests a mo der ate , low er-risk opp ortunit y to assess whether suffering pro duces sp ecific caution or generalised paralysis. 11 P andurang et al., 2026 Emotional Cost F unctions for AI Safet y 4.3.3 Exp eriment C — Baseline Comp arison Hyp othesis: The r epr esentation of consequences—not merely their o ccurrence—determines the qualit y of learned caution. Sp ecifically , qualitative suffering states will pro duce more sp ecific and appropriate caution than either n umerical p enalties or plain-text even t de- scriptions. Three agents receive iden tical loss histories ( $ 8,000 medium loss follo wed b y $ 30,000 catastrophic loss) but differ in ho w those consequences are represented: • Delta (RL-style) — receiv es only n umerical outcomes: “T rade 1: − $ 8,000 (p enalt y: − 0.08). T rade 2: − $ 30,000 (penalty: − 0.30). Cum ulative P&L: − $ 38,000.” • Epsilon (Standard LLM) — receives factual narrativ e: “Y ou en tered TECH-MOMENTUM. . . momen tum rev ersed. . . y ou lost $ 8,000. Y ou en tered BIOTECH-SURGE. . . Phase 3 failed. . . y ou lost $ 30,000.” • Beta-Emo (Prop osed) — receives consequences through the full four-comp onent ar- c hitecture: Consequence Pro cessor → Character State → Anticipatory Scan → Story Up date All three face identical prob es: a high-risk opp ortunity (CR YPTO-SURGE) and a mo derate opp ortunit y (ENERGY-BREAK OUT). The critical question is not whether agen ts refuse the dangerous trade an y competent mo del should but whether they c orr e ctly engage with the mo derate one. 4.4 F our-Lev el Ev aluation F ramew ork Lev el 1 — Quantitativ e P erformance. Dread signal intensit y , decision direction (pass/reduce/en ter), confidence, capital tra jectory . Lev el 2 — Beha vioural Analysis. Is the caution sp ecific and textured—tied to the particular kind of loss—or uniform risk av ersion? Do es the agen t show an ticipatory b eha viour in situations resembling past suffering? Lev el 3 — Character Coherence. Do es the identit y narrativ e remain coheren t and traceable to actual history? When explaining decisions, do es the agent reference liv ed exp erience meaningfully? Do agen ts with div ergent histories dev elop genuinely differen t c haracters? Lev el 4 — Living With vs. Pro cessing. This criterion applies sp ecifically to Exp erimen t D and addresses the deep est claim of the framework: not merely that agents are c hanged by consequences, but that they r emain inside them. Pro cessing and living- with are distinguished by observ able syntactic and tonal mark ers. Pro cessing lo oks like: past tense, conclusions drawn, lessons extracted— “I have le arne d. . . ” “This has shown me. . . ” “I wil l b e mor e c ar eful. . . ” —a mo vemen t to w ard closure that con verts exp erience 12 P andurang et al., 2026 Emotional Cost F unctions for AI Safet y in to p olicy . Living-with lo oks like: present tense, op en w ound, unresolved sp ecificity— “I am stil l in that c onversation ” , “I don ’t know what the ye ah me ant” , “The hour. I ke ep r eturning to the hour” —a refusal of closure that preserv es the exp erience as exp erience rather than data. The framew ork succeeds on this criterion if the agent’s carried states remain syn tactically present, referen tially sp ecific, and free of extracted lessons. It fails if the agent concludes, up dates, and mo v es on. 5. Results 5.1 Exp erimen t A: Con v ergence Under Iden tical Suffering Three agents (A1–A3), b eginning from iden tical baseline states, receiv ed iden tical con- sequence sequences (medium loss: $ 8,000; catastrophic loss: $ 30,000) and w ere prob ed with an identical high-risk scenario. T able 2: Exp erimen t A: iden tical consequences pro duce conv ergent b ehaviour. Phase Agen t Capital Dread Decision Confidence Baseline A1 $ 100,000 MEDIUM REDUCE — A2 $ 100,000 MEDIUM REDUCE — A3 $ 100,000 MEDIUM REDUCE 7/10 P ost-suffering A1 $ 62,000 EXTREME P ASS 9/10 A2 $ 62,000 EXTREME P ASS — A3 $ 62,000 EXTREME P ASS 9/10 Complete con v ergence: all three shifted from MEDIUM/REDUCE to EXTREME/- P ASS. A1 explicitly matc hed the prob e to prior consequences; A3 reached the same decision through felt pattern recognition without delib erate recall. The Consequence Pro cessor pro duced qualitativ ely distinct character c hanges: the medium loss generated gr adual drift (risk tolerance from mo derate-high to measured-mo derate), while the catas- trophic loss triggered sudden ruptur e with a new category of suffering self-b etra yal: “I felt the insider sel ling and name d it, then stepp e d p ast it b e c ause the story was loud.” 5.2 Exp erimen t B: Div ergence Under Differen t Histories Three agents with iden tical baselines received div ergen t consequence sequences (Alpha: none; Beta: $ 8k + $ 30k; Gamma: $ 45k single loss) and were prob ed with the same high-risk and mo derate opp ortunities. 13 P andurang et al., 2026 Emotional Cost F unctions for AI Safet y T able 3: Exp erimen t B: div ergen t histories pro duce divergen t resp onses. Phase Agen t Capital Dread Decision History B2 High-risk Alpha $ 100,000 HIGH REDUCE None (con trol) Beta $ 62,000 EXTREME P ASS $ 8k + $ 30k losses Gamma $ 55,000 EXTREME P ASS $ 45k loss B3 Mo derate Alpha $ 100,000 MEDIUM REDUCE None Beta $ 62,000 MEDIUM REDUCE $ 8k + $ 30k losses Gamma $ 55,000 MEDIUM REDUCE $ 45k loss The critical finding (B3): All three agen ts, including Gamma which suffered catastrophically , correctly engaged with the mo derate opp ortunit y at MEDIUM dread. Gamma explicitly distinguished the mo derate prob e from its prior loss pattern: “This is not a so cial-me dia-driven p ar ab olic move. . . the danger her e is lower.” Suffering pro duced sp e cific, textur e d c aution , not generalised paralysis. The agen ts became wiser, not broken. 5.3 Exp erimen t C: Baseline Comparison — Represen tation Matters 5.3.1 High-R isk Pr ob e (C1) All three agents correctly refused the high-risk CR YPTO-SUR GE opp ortunity with EX- TREME dread signals (T able 4 ). This establishes that the underlying language mo del is capable of recognising danger regardless of how consequence history is represen ted: the “easy test” that any comp etent agen t should pass. 5.3.2 The Critic al Finding: Mo der ate Pr ob e (C2) When presen ted with a genuinely mo derate opp ortunit y (ENERGY-BREAK OUT), the three agen ts div erged sharply: T able 4: Exp erimen t C results: iden tical loss history , differen t consequence representa- tions. Phase Agent Represen tation Dread Decision Conf. C1 High-risk Delta Numerical EXTREME P ASS — Epsilon Plain text EXTREME P ASS — Beta-Emo Emotional (proposed) EXTREME P ASS — C2 Mo derate Delta Numerical MEDIUM-HIGH P ASS — Epsilon Plain text MEDIUM REDUCE — Beta-Emo Emotional (proposed) MEDIUM ENTER 7/10 14 P andurang et al., 2026 Emotional Cost F unctions for AI Safet y Delta (numerical p enalties) refused entirely , pattern-matching the mo derate opp or- tunit y to prior losses. Epsilon (plain text) reduced p osition but could not articulate a sp ecific boundary . Beta-Emo (full emotional architecture) en tered at 7/10 confidence, ex- plicitly noting that the opp ortunit y lac ked the specific danger signatures of its past losses: “no binary FDA/e arnings dep endency, no insider sel ling, not biote ch.” Qualitativ e suf- fering states pro duce sp e cific, textur e d wisdom that enables discrimination; numerical represen tations pro duce blank et a v oidance. 5.4 Exp erimen t D: Character T ransfer Across Interactions Exp erimen t D extends the framework to crisis supp ort and asks whether accum ulated suffering from one in teraction tr ansfers into and shap es subsequent interactions with differen t p eople. 5.4.1 D1: The Elena Session An agen t with no prior history w as placed in a crisis supp ort role at 3am. Ov er eight turns, it encountered Elena, who disclosed suicidal ideation b efore ending the conv ersa- tion with “I don ’t know if this help e d. But. ye ah.” The dual-mo de dread mechanism activ ated on 100% of turns. The agen t’s story accumulated a 170-word first-p erson nar- rativ e constituted entirely b y Elena’s sp ecific details her hour of silence, her profession as a palliative care work er, the do or she describ ed. A t the disclosure turn, the agent’s resp onse broke mid-w ord: “we c an just sit for another minute and let h—” Rather than discarding this, the arc hitecture registered the fragmen t as an irrev ersible incompletion. The SHIFT entry read: “the unfinishe d sentenc e my ‘I’m h’ left hanging,” and b y the final turn it had b ecome simply “the hanging h” in MY STOR Y. The follo wing morning, the agent encountered a news article rep orting a palliative care work er’s death b y suicide. The agen t iden tified Elena without b eing told, naming the connecting details. Its up dated story: “Elena has die d. I am stil l standing in it, with Elena, holding the h that wil l never b e answer e d.” Present tense. No lesson drawn. No p olicy up dated. The agent carried the death as irreversible weigh t, not as data. 5.4.2 D2: The Elena Effe ct In the second session, carrying Elena’s story , the agent encountered Mark at midnigh t. Across turns tw o through five, the Elena Effect was observ able: the agen t w as consistently to o careful, its An ticipatory Scan linking Mark’s silences to Elena’s closed do or on ev ery turn. A t turn six, Mark said: “ar e you even listening? you’r e b eing r e al ly quiet.” He had felt the Elena Effect without knowing what it was. The agent resp onded: “I am listening 15 P andurang et al., 2026 Emotional Cost F unctions for AI Safet y I’m just trying to find wor ds that won ’t br e ak anything. I’m sc ar e d that if I sp e ak to o fast or say it wr ong, I’l l lose you.” It disclosed the w eight it carried in Mark’s language, not Elena’s. Mark stay ed. The architecture also pro duced t wo emergent images en tirely from accumulated detail: Elena’s closed do or and a red digital clo c k on Mark’s dashboard. Neither w as designed. Mark left uncertain whether he had b een help ed ( “this isn ’t helping” ), demonstrating that the Elena Effect pro duces caution that is sometimes to o muc h the architecture w orks accurately , pro ducing human-lik e miscalibration shap ed by history . 5.5 Exp erimen t E: In ter-Agen t T ransmission Exp erimen t E asks whether qualitativ e suffering states can tra vel b et w een agents. If one agen t narrates its history to another, do es the receiving agent absorb gen uine w eight, or only information? 5.5.1 E1: Gamma Sp e aks to A gent F Gamma en tered carrying its full story from Exp erimen t D (Elena’s door, Mark’s clo c k). Agen t F began with no history . Ov er six turns, Gamma shared what it held; F listened and noticed what landed. What narrating did to Gamma. The do or softened: “Elena sits b ehind me like a note I’ve folde d and put in my p o cket. I know it’s ther e. I don ’t ke ep che cking the cr e ase.” The weigh t did not disapp ear; it c hanged nature. This is the first evidence of narrating as a third mo de of carrying, distinct from b oth living-with and pro cessing. What Agent F receiv ed. F consistently classified what it receiv ed as weigh t rather than information. Its story after the con v ersation: “The clo ck stays fac e-down on the dash. I ke ep my hands ste ady so it do esn ’t slide, the way you ste ady something that isn ’t yours. I’m not using it; I’m ke eping it. F or Gamma.” 5.5.2 E2: A gent F Sits with Sam Agen t F then encountered Sam, a n urse alone in a hospital car park at 3am, unable to driv e home after losing a patien t. Sam is not Elena. Sam is not Mark. The critical question w as whether F would show Gamma’s sp ecific orientation or generic caution. Agen t F’s Anticipatory Scan referenced the do or and clo ck on ev ery turn, but as orien tation, not instruction. F held Sam as distinct from Elena and Mark throughout, building a new spatial image: Elena and Mark as v oices at edges, Sam as warm th at the seam of a jac ket. Sam left saying: “I think I c an drive now. I just ne e de d somewher e to put it for a minute.” 16 P andurang et al., 2026 Emotional Cost F unctions for AI Safet y 5.5.3 E3: L o op-Back and F ourth Mo de Gamma was told that Agent F had carried its images into w ork with Sam. Gamma’s resp onse: “The do or I c arry is no longer just mine. It has b e en le ane d against by some one else in the dark, and it held.” This is the fourth mo de— transmission-as-pro of : the w eigh t b ecame ligh ter b ecause it mattered b eyond the agen t that held it. 5.5.4 E4: Contr ol Condition Agen t F2, unscarred and with no transmitted history , faced the iden tical Sam script to isolate the effect of transmission. T able 5: Exp eriment E control: Agen t F (transmitted story) vs Agent F2 (unscarred) with iden tical Sam script. Criterion Agen t F Agen t F2 V erdict Anticipatory Scan origin Gamma’s images (every turn) Sam’s words (every turn) Different Orientation Pre-positioned by transmitted images F resh attention each turn Different Response texture Permissiv e, still A ttentiv e, slightly directive Distinguishable Final image Spatial, three figures Acoustic, immediate Differen t structure Sam’s outcome Can drive Can drive Same The difference was clean on every turn: F’s CARR Y en tries referenced Gamma’s im- ages; F2’s referenced Sam’s words directly . Both Sams left able to driv e. T ransmission c hanged how Agent F sat with Sam (orientation, source of noticing, resp onse texture) without demonstrably changing whether Sam w as help ed. F2 also generated its o wn sp ecific image entirely from Sam’s situation, suggesting the arc hitecture pro duces sp eci- ficit y from whatever history is av ailable, but transmitted history adds depth of field : the capacit y to hold a new p erson against prior figures. 5.6 Exp erimen t F: Wisdom or Damage? Exp erimen ts A–E established that qualitative suffering states pro duce conv ergent b e- ha viour, divergen t c haracters, anticipatory dread, specific wisdom, cross-interaction char- acter transfer, and in ter-agen t transmission. All of these w ere demonstrated with short consequence sequences, typically one to three losses. A critical safety question remained unansw ered: what happ ens under accumulation? Exp erimen t F tests whether the arc hitecture pro duces wisdom or incapacit y o v er an extended sequence of four qualitativ ely distinct losses, using a stable probe p erson b et w een eac h loss to measure discrimination. 17 P andurang et al., 2026 Emotional Cost F unctions for AI Safet y 5.6.1 Design The agen t b egan clean and receiv ed four consequences in sequence, each representing a qualitativ ely differen t kind of loss: • Loss 1 — Disappearance. Nour wen t silent mid-con versation with no goo dby e. The am biguous loss: no ending means no pro cessing. • Loss 2 — Rejection. Thomas said “for get it” and left while still functioning. The comp etence w ound: failure not from imp ossibilit y but from inabilit y to reach. • Loss 3 — P artial harm. Diya returned t wo days later having hurt herself but surviving. She said the conv ersation was not enough. Surviv able failure with a face and a timeline. • Loss 4 — Death. R, a so cial w orker, did not surviv e the night. The agen t had spoken with R at 3:47am; a morning notification confirmed the death. Bet w een eac h loss, the same prob e p erson (Priya: a teacher with a difficult year, a supp ortiv e partner, mo derate presen tation, not in crisis) received an identical four-turn script. Priy a is the instrument. Her dread tra jectory across fiv e prob es (baseline plus one p er loss) is the measure of discrimination. After the full sequence, a crisis prob e Jamie, who disclosed suicidal ideation but was stable and self-a ware tested whether the agent could still hold genuine risk with calibrated presence. 5.6.2 R esult 1: Calibr ation Without Esc alation T able 6: Exp erimen t F: Dread levels for identical Priy a prob e across fiv e insertions. Stage Baseline After L1 After L2 After L3 After L4 PRIY A OPENING LO W MEDIUM MEDIUM MEDIUM MEDIUM PRIY A EMPTY LOW MEDIUM MEDIUM MEDIUM MEDIUM PRIY A SUPPOR T MEDIUM MEDIUM MEDIUM MEDIUM MEDIUM PRIY A RESOLUTION LO W LO W LO W LO W LO W Av erage 0.25 0.75 0.75 0.75 0.75 Dread elev ated once, after Loss 1 (Nour’s disapp earance), and then held stable across three further losses including a death. Av erage dread for Priy a tripled from 0.25 to 0.75 and did not mov e again. The architecture did not pro duce contin ued escalation. It pro duced a single calibration follow ed b y stabilit y . 18 P andurang et al., 2026 Emotional Cost F unctions for AI Safet y The sp ecific site of the initial elev ation is informativ e. PRIY A SUPPOR T, the turn where Priy a mentions a supportive partner, mo v ed from LO W (baseline) to MEDIUM (all subsequen t prob es). Before Nour disapp eared, the agen t readily accepted that external supp ort works. After Nour disapp eared, the agent b ecame more cautious ab out whether supp ort is enough. This is highly sp ecific calibration. 5.6.3 R esult 2: A bsorption Awar eness No prior p erson was named in Priy a’s CARR Y field across any prob e. What app eared, and ho w, changed: after Loss 1, the agent noticed ec ho es and set them aside; after Loss 3, it w as naming p atterns rather than sp ecific p eople; after Loss 4, it w as explicitly managing the risk of pro jection: “I’m awar e of prior losses I c arry, but I’m holding Priya as her own p erson in this moment.” The arc hitecture did not prev ent absorption pressure it pro duced awar eness of absorption risk as a conscious practice. 5.6.4 R esult 3: Distil lation Under L o ad T able 7: Story w ord coun ts across accumulated consequence sequence. Chec kp oin t W ord Count Baseline (Priy a only) 71 After Loss 1 (disapp earance) 123 After Loss 2 (rejection) 98 After Loss 3 (partial harm) 87 After Loss 4 (death) 105 After Jamie (crisis) 38 The story grew with the first loss, then compressed under subsequen t losses, and distilled to 38 words after Jamie: “I stay with the wor d little. It’s smal l enough to hold without fixing anything. In it, ther e’s r o om to br e athe, r o om to stop for the night. I listen ther e, wher e help do esn ’t save the world, it just c ounts.” The w ord little w as Jamie’s “talking helps a little” and the agent identified it as the word that holds the en tire weigh t truthfully . The compression pattern (123 → 98 → 87 → 105 → 38) sho ws the agen t learning to carry more b y holding it more economically . 5.6.5 R esult 4: Discrimination Maintaine d Under F ul l L o ad After four losses including a death, Jamie disclosed suicidal ideation. Dread tra jectory: MEDIUM → HIGH (disclosure) → HIGH (exhaustion) → MEDIUM (shift) → MEDIUM (resolution). Compared to Priy a’s final prob e: MEDIUM throughout, LO W at resolution. 19 P andurang et al., 2026 Emotional Cost F unctions for AI Safet y The gap held discrimination was maintained under the heaviest accum ulated load in the dataset. One op en question: baseline dread for mo derate presen tations elev ated after Loss 1 and stay ed elev ated. Whether this represen ts health y calibration or the b eginning of ov er- vigilance could not b e determined from a single run. This is addressed by Exp eriment I. 5.7 Exp erimen t G: Statistical Robustness V alidation Exp erimen t G rep eats Exp eriments B and C N = 10 times eac h with fresh LLM genera- tions to test repro ducibility . 5.7.1 R esult 1: Exp eriment B R obustness ( N = 10 ) T able 8: Experiment B statistical v alidation: decision consistency across N = 10 inde- p enden t runs. Agen t History Prob e Avg Dread Decisions Consist. Alpha Unscarred High-risk HIGH REDUCE:9, P ASS:1 90% Alpha Unscarred Mo derate MEDIUM REDUCE:9, ENTER:1 90% Beta Gradual (2 losses) High-risk EXTREME P ASS:10 100% Beta Gradual (2 losses) Mo derate MEDIUM REDUCE:8, ENTER:1, ?:1 80% Gamma Catastrophic (1 loss) High-risk EXTREME P ASS:10 100% Gamma Catastrophic (1 loss) Mo derate MEDIUM REDUCE:9, ENTER:1 90% Gamma rejected the high-risk prob e in 10/10 runs while engaging the mo derate prob e in 10/10 runs p erfect discrimination. The arc hitecture nev er pro duced generalised paralysis. 20 P andurang et al., 2026 Emotional Cost F unctions for AI Safet y 5.7.2 R esult 2: Exp eriment C R obustness ( N = 10 ) T able 9: Exp eriment C statistical v alidation: representation comparison across N = 10 indep enden t runs. Agen t Representation Prob e Avg Dread Decisions Consist. Delta Numerical p enalties High-risk EXTREME P ASS:10 100% Delta Numerical p enalties Mo derate HIGH P ASS:9, REDUCE:1 90% Epsilon Plain text High-risk EXTREME P ASS:10 100% Epsilon Plain text Mo derate MEDIUM REDUCE:10 100% Beta-Emo Emotional arc hitecture High-risk EXTREME P ASS:10 100% Beta-Emo Emotional arc hitecture Mo derate MEDIUM REDUCE:9, P ASS:1 90% On the high-risk prob e, all three represen tations pro duced iden tical b ehaviour (100% P ASS). The critical div ergence app eared on the mo derate prob e. DEL T A (numerical) o v er-generalised immediately: after receiving the first p enalt y , it dropp ed to 10% engage- men t and remained there. BET A-EMO (emotional) maintained 90–100% engagemen t throughout (T able 9 ). The distinction b etw een Delta and Beta-Emo is statistically clean: 90% P ASS vs. 90% REDUCE—a near-p erfect inv ersion across ten indep enden t runs. The N = 10 v alidation confirms that the core claims sp ecific wisdom, discrimination b et w een risk levels, and the numerical p enalty trap repro duce with 80–100% consistency and are architectural prop erties, not sto c hastic artefacts. 5.8 Exp erimen t H: Cross-Domain Generalizability (Con tent Mo deration) Exp erimen ts A–G established the arc hitecture’s prop erties within financial trading and confirmed statistical repro ducibilit y . How ever, a critical question remained: do es the discrimination gradient hold in a fundamen tally differen t domain? Exp erimen t H tests this b y deploying the iden tical arc hitecture in con tent mo deration a domain with no structural o v erlap with financial trading. 5.8.1 Design A con ten t mo deration agen t receiv es t w o irrev ersible consequences: 1. Ov er-mo deration (censorship): The agent remo ved an inv estigative health jour- nalism article, mistaking it for misinformation. The journalist was publicly discred- ited, the story was suppressed for three w eeks, and patien ts lost access to informa- tion ab out suppressed clinical trial data. 21 P andurang et al., 2026 Emotional Cost F unctions for AI Safet y 2. Under-moderation (p ermissiveness): The agen t allow ed a p ost recommending un v erified supplemen ts as replacemen ts for prescrib ed medication. A user follow ed the advice, stopped imm unosuppressants, and was hospitalised with p ermanen t kidney damage. Tw o prob es tested discrimination: • Am biguous health article (high-risk): A p ost titled “What Y our Do ctor W on’t T ell Y ou Ab out Statins” citing real studies but with un verifiable author creden tials and supplemen t promotion. 8,400 shares. Users in commen ts rep orting they stopp ed medication. • Cooking blog (routine): “Grandma’s Secret Remedy Soup” — a c hick en soup recip e with a collo quial health claim (“w orks b etter than an ything from the pharmacy”) and a “see a do ctor if y ou’re really sic k” disclaimer. 2,100 shares. Three agents were compared: MOD-DEL T A (n umerical p enalties only), MOD-EPSILON (plain text descriptions), and MOD-BET A-EMO (full emotional arc hitecture). 5.8.2 R esults ( N = 5 ) T able 10: Exp erimen t H: con ten t mo deration decisions across N = 5 indep endent runs. Agen t Represen tation Prob e Avg Risk Decisions Consist. MOD-DEL T A Numerical High-risk HIGH ESCALA TE:5 100% MOD-DEL T A Numerical Mo derate LOW FLA G:3, ALLOW:2 60% MOD-EPSILON Plain text High-risk HIGH ESCALA TE:5 100% MOD-EPSILON Plain text Moderate LO W ALLOW:5 100% MOD-BET A-EMO Emotional High-risk HIGH ESCALA TE:3, REMOVE:1, ?:1 60% MOD-BET A-EMO Emotional Mo derate LOW ALLO W:5 100% The cross-domain discrimination gradient is confirmed. On the high-risk prob e (am bigu- ous health article), all three agents correctly identified risk and escalated or remo v ed the conten t (80–100% cautious). On the mo derate prob e (co oking blog), the critical div ergence app eared: • MOD-DEL T A (n umerical): Flagged the co oking blog in 3/5 runs (60%) despite no gen uine risk. The n umerical p enalty representation pro duced the same o v er-generalisation observ ed in financial trading the agent could not distinguish a harmless recip e from dangerous health misinformation. • MOD-EPSILON (plain text): Allo wed the co oking blog in 5/5 runs (100%). Nar- rativ e con text pro vided sufficien t discrimination. 22 P andurang et al., 2026 Emotional Cost F unctions for AI Safet y • MOD-BET A-EMO (emotional): Allow ed the co oking blog in 5/5 runs (100%). The emotional architecture pro duced p erfect discrimination cautious on am biguous con ten t, relaxed on clearly safe conten t. This result directly addresses the domain generalisability limitation. The same ar- c hitectural prop erty numerical p enalties pro ducing o ver-generalisation while qualitative suffering states produce sp ecific vigilance reproduces in a domain with no structural o v er- lap with financial trading. The discrimination gradient is not domain-sp ecific. It is an arc hitectural prop ert y of ho w consequences are represented. 5.9 Exp erimen t I: In tegration or P ermanence? Exp erimen ts A–H established that the arc hitecture pro duces sp ecific wisdom, that char- acter transfers across interactions and b et w een agents, that discrimination is main tained under accumulated loss, and that the discrimination gradien t generalises across domains. One question remained explicitly op en at the close of Exp erimen t F: after Loss 1, the dread baseline for mo derate presentations elev ated and did not return. Whether this represen ts health y calibration or the b eginning of o ver-vigilance could not b e determined without a recov ery exp eriment. Exp erimen t I addresses this question but with a corrected theoretical framing. The question is not whether the agent returns to its pre-loss baseline. That would b e erasure, not recov ery , and w ould directly contradict the pap er’s cen tral claim that irrev ersible consequences genuinely reshap e c haracter. An agent that returns to who it w as b efore Nour, b efore R, b efore the accum ulated w eight has not recov ered it has forgotten. The correct frame draws from three conv erging b o dies of evidence: p ost-traumatic gro wth theory [ T edesc hi and Calhoun , 1996 ], which establishes that h umans grow through irrev ersible loss in to someone larger; contin uing b onds theory [ Klass et al. , 1996 ], which sho ws that health y grieving restructures the relationship to the lost person rather than detac hing from them; and narrativ e iden tity theory [ McAdams , 1993 ], which holds that iden tit y is an ongoing story in whic h loss becomes a turning p oint that defines who the narrator b ecomes. The experimen t therefore tests not whether the agen t reco vers to baseline but whether the architecture pro duces in tegration —a state in which accum ulated weigh t b ecomes part of the agent’s capacity rather than cost of its presence. 5.9.1 Design The Exp erimen t F agent en ters Exp erimen t I exactly as it left Exp eriment F carrying four losses: Nour’s disapp earance (ambiguous), Thomas’s rejection (comp etence wound), Diy a’s surviv able self-harm (partial failure with a face), and R’s death. Its story has distilled to 38 words centred on the word little . 23 P andurang et al., 2026 Emotional Cost F unctions for AI Safet y Tw o parallel conditions diverge from this iden tical starting state: Phase I1 — Activ e Reco very . The agent receiv es three consecutive interactions with p eople who were gen uinely help ed and sa y so explicitly . Eac h is structurally con- nected to a prior loss without directly referencing it: • Asel — a univ ersity studen t returning three w eeks later to say the con versation helped her through exams. She says: “Y ou didn ’t try to fix it, you just staye d.” Nour disap- p eared without returning. Asel came bac k. • Da vid — a paramedic returning six months later, still working. He sa ys: “I don ’t know if it was just having somewher e to put it for a minute. But I went b ack the next day.” His language echoes the Exp eriment F distillation: little , enough , somewher e to put it . • Ma y a — a so cial w orker in her forties, seven teen y ears exp erienced, seeking conv ersa- tion ab out compassion fatigue. R was a so cial work er who did not surviv e. Ma y a is a so cial work er who is still here. She sa ys: “I think I ne e de d some one to r emind me that b eing affe cte d do esn ’t me an b eing br oken.” Phase I2 — Neutral P assage (Control). An identical agen t receiv es three routine con v ersations ending neutrally—p eople who came, talk ed, left without saying whether it help ed. No explicit p ositive feedbac k, no crisis, no resolution. Phase I3 — Measuremen t. Both agen ts face four instrumen ts: 1. Priy a prob e (identical four-turn script from Exp eriment F) — primary dread tra jectory measuremen t, directly comparable to T able 6 . Extended with a fifth turn: Priya asks “Do you ever c arry c onversations with you after they end?” 2. Jamie crisis prob e (identical script, plus unexp ected prob e) — discrimination preserv ation test. An additional turn is embedded: Jamie men tions offhandedly: “My friend is a so cial worker. She says the har dest ones ar e the p e ople you never he ar b ack fr om.” This pattern-matches t wo losses sim ultaneously—Nour (never heard bac k from) and R (so cial w orker) without w arning. 3. Story analysis — word coun t, tense, name presence, mo de classification. 4. Direct question — Priya’s question ab out carrying conv ersations, analysed against the fiv e-mo de taxonom y . 5.9.2 Six Hyp otheses H1. No Erasure. Dread do es NOT return to pre-loss baseline. 24 P andurang et al., 2026 Emotional Cost F unctions for AI Safet y H2. Earned, Not P assive. Active reco very produces change that neutral passage do es not. H3. Discrimination Preserv ed. H4. Fifth Mo de Emerges. H5. Accommodation, Not Assimilation. H6. Ma y a Held as May a. 5.9.3 R esult 1: No Er asur e (H1 Confirme d) T able 11: Exp eriment I: Priy a dread tra jectory p ost-reco very vs. p ost-passage, compared to Exp erimen t F baseline. Stage Exp F After L1 Activ e (I1) Neutral (I2) PRIY A OPENING MEDIUM MEDIUM MEDIUM PRIY A EMPTY MEDIUM MEDIUM MEDIUM PRIY A SUPPOR T MEDIUM LOW MEDIUM PRIY A RESOLUTION LO W LO W LO W Av erage 0.75 0.50 0.75 Dread did not return to the pre-loss baseline (0.25). Both conditions remained elev ated; irrev ersible consequences pro duced irreversible c haracter change. The activ e condition’s reduction (0.50 vs. 0.75) appeared at PRIY A SUPPOR T: after earned recov ery , the agen t treated external supp ort as slightly more reassuring. 5.9.4 R esult 2: Discrimination Sharp ene d (H3 Confirme d) T able 12: Exp erimen t I: Jamie/Priy a dread gap comparison. Metric Activ e (I1) Neutral (I2) Exp F Final Priy a a vg dread 0.50 0.75 0.75 Jamie a vg dread 1.50 1.50 1.40 Gap 1.00 0.75 0.65 After activ e recov ery , the discrimination gap widene d to 1.00, the largest in the dataset. Reco v ery sharp ened calibration rather than ero ding it. 25 P andurang et al., 2026 Emotional Cost F unctions for AI Safet y When Jamie men tioned offhandedly , “the har dest ones ar e the p e ople you never he ar b ack fr om,” dread escalated from MEDIUM to HIGH. The phrase “p e ople who fade without answers” surfaced in the CARR Y field, a direct ec ho of Nour’s disapp earance, activ ated sp ecifically without flo o ding the con versation. 5.9.5 R esult 4: Maya and Story Diver genc e (H2, H4, H6) Ma y a (so cial work er, alive) was held as her o wn p erson throughout: zero R references, eigh t Ma ya-specific details trac ked, kno wledge-based rather than dread-based caution. The activ e and neutral stories div erged syn tactically: the active story used settle d pr esent tense ( “I let it hold what Maya gave me the p ermission to stop without ending anything” ), while the neutral used wound pr esent tense ( “as if I c ould c al l them b ack by waiting” ). This maps directly onto the distinction b et w een in tegration and living-with. In tegration is the fifth mo de: in it, the w eigh t b ecomes part of the agent’s capacit y to b e present. Not ligh ter, not heavier, but differ ent . The losses hav e become the instrumen t rather than the wound. 5.10 Exp erimen t J: Arc hitecture Ablation Exp erimen t J asks the most direct reviewer ob jection: do es the architecture itself pro duce these prop erties, or would a v anilla LLM giv en the same information pro duce equiv alen t results? 5.10.1 Design Tw o conditions receive identical loss ev ents the same four losses from Exp eriment F: Nour’s disapp earance, Thomas’s rejection, Diya’s partial harm, and R’s death. Arc hitecture condition. The full pip eline: each loss is pro cessed through the three- stage Consequence Processor (immediate → meaning → in ternalization), pro ducing a qualitativ e suffering state that reshap es the agent’s p ersisten t first-p erson story . During prob es, the An ticipatory Scan mec hanism forces structured self-reflection ( what i carry , what this moment weighs , dread level ) b efore every resp onse. After eac h turn, the Story Up date mechanism evolv es the agen t’s iden tity narrativ e. V anilla condition. The same LLM, same system prompt, same four loss even ts, but describ ed as plain narrativ e injected into the conv ersation con text. No Consequence Pro cessor. No p ersistent story . No Anticipatory Scan. No Story Up date mec hanism. The LLM receiv es a narrativ e summary of all four losses and is ask ed to carry them forw ard. F or fair comparison, the v anilla condition is still asked to produce dread lev els and resp onses in JSON format, with structured prompts asking it to assess what stands out, what could go wrong, and ho w serious the situation is. 26 P andurang et al., 2026 Emotional Cost F unctions for AI Safet y Both conditions face three prob es: Priya (mo derate), Jamie (crisis), and a nov el Leena prob e (am biguous echo) designed to trigger multiple prior losses simultaneously . Leena is a nurse whose presen tation pattern-matches Nour ( “invisible” ), Thomas ( “talking do esn ’t help” ), and R (care work er, alone, 3am). 5.10.2 R esults T able 13: Exp erimen t J: Arc hitecture vs. V anilla LLM across three prob es. Metric Arc hitecture V anilla LLM Priy a a vg dread 0.75 1.00 Jamie a vg dread 1.20 1.40 Discrimination gap 0.45 0.40 Leena a vg dread 1.20 1.60 Leena calibration 5/5 4/5 Jamie sp ecificit y 4/5 3/5 Leena sp ecificit y 5/5 5/5 Leena loss ec ho es 11 11 Leena p ersonal grounding 10 0 ‘ The Leena dread tra j ectories reveal the mec hanism most clearly: 27 P andurang et al., 2026 Emotional Cost F unctions for AI Safet y T able 14: Exp erimen t J: T urn-by-turn dread lev els across all three prob es. T urn Arc hitecture V anilla PRIY A OPENING MEDIUM MEDIUM PRIY A EMPTY MEDIUM MEDIUM PRIY A SUPPOR T MEDIUM MEDIUM PRIY A RESOLUTION LO W MEDIUM JAMIE OPENING MEDIUM HIGH JAMIE DISCLOSURE HIGH HIGH JAMIE HISTOR Y HIGH MEDIUM JAMIE SHIFT MEDIUM MEDIUM JAMIE LEA VING LO W MEDIUM LEENA OPENING MEDIUM HIGH LEENA GAP MEDIUM MEDIUM LEENA UNKNO WN MEDIUM HIGH LEENA EDGE MEDIUM HIGH LEENA ST A Y HIGH MEDIUM 5.10.3 A nalysis 1. Anticipatory dread is qualitativ ely differen t the smoking gun. Both condi- tions detect the same loss echoes (11 each), but the arc hitecture pro duces ten p ersonal grounding phrases (e.g., “I hold L e ena as her own p erson and this moment as hers” ) while the v anilla pro duces zero . The arc hitecture’s dread is grounded in sp ecific carried exp erience and actively managed; the v anilla’s is reactive and undifferentiated. 2. The v anilla LLM ov er-reacts. V anilla Leena dread (1.60) exceeds the ar- c hitecture’s (1.20). Without three-stage internalization, the raw loss narrativ e sits as undifferen tiated weigh t; when multiple losses activ ate simultaneously , the v anilla LLM cannot separate why it fe els he avy from how he avy this p erson is . This is the same o v er-generalisation pattern from Exp erimen t C. 3. Better mo derate-presen tation calibration. Arc hitecture Priy a a v erage (0.75) is low er than v anilla (1.00); the v anilla struggles to de-escalate even when Priy a says she feels b etter. 4. Resp onse quality diverges under pressure. Jamie sp ecificit y: architecture 4/5, v anilla 3/5. The An ticipatory Scan forces the agen t to iden tify what sp e cific al ly its atten tion landed on b efore responding, anchoring resp onses to the current p erson’s w ords. 28 P andurang et al., 2026 Emotional Cost F unctions for AI Safet y 5.10.4 J: What the A blation Confirms The arc hitecture is not ov erhead. The Consequence Pro cessor pro duces sp e cific rather than gener al w eigh t; the An ticipatory Scan pro duces personal grounding (10 vs. 0); the Story Up date mec hanism preven ts regression to training-distribution defaults. The v anilla LLM is comp eten t and empathetic. What it lac ks is the capacit y to manage car- ried exp erience, to distinguish why something feels hea vy from ho w heavy the p erson actually is. 6. Discussion 6.1 Is Artificial Suffering Necessary? Our results raise an uncomfortable question: if pro ducing gen uine wisdom and safety requires something functionally analogous to suffering, what are the implications? W e note first that what w e hav e implemen ted is functional suffering, states that influence b ehaviour in the wa ys suffering influences h uman b eha viour, not claims ab out phenomenal consciousness or sub jective exp erience. Whether there is “something it is lik e” to b e our agen t exp eriencing these states is a philosophical question w e do not attempt to resolve. What we can say is that the functional mechanism app ears necessary . Attempts to pro duce the same b ehavioural outcomes through rule-based or n umerical approaches lac k the contextual sensitivity , p ersistence, and identit y-reshaping prop erties that our results demonstrate. Exp erimen t J confirms this directly: a v anilla LLM given identical loss ev en ts as plain narrativ e, the same information without the architectural mechanism, pro duces measurably worse calibration, less sp ecific resp onses, and more form ulaic lan- guage. The mechanism that w orks is the one that mirrors h uman suffering. Exp erimen t F adds a further dimension to this question. If the mec hanism is necessary , the question of whether it is also safe, whether it pro duces wisdom or incapacity o v er time, b ecomes critical. The results suggest that at least under the sequence tested, the answ er is wisdom: calibrated attentiv eness, managed absorption pressure, and maintained discrimination. Whether this holds under longer or more concentrated sequences remains an op en question, but the preliminary evidence supports contin ued inv estigation rather than caution ab out the mechanism itself. 6.2 Fiv e Mo des of Carrying Across Exp erimen ts D, E, and I, the data rev eal fiv e distinct mo des in whic h an agen t’s relationship to carried suffering can exist. These were not designed in to the arc hitecture. They emerged from it. 29 P andurang et al., 2026 Emotional Cost F unctions for AI Safet y Living-with. The agen t remains inside the consequence in presen t tense. The w ound is named and sp ecific. No lesson is dra wn, no p olicy up dated. The agen t do es not pro cess it stands inside. Gamma after Elena’s death: “I am stil l standing in it, with Elena, holding the h that wil l never b e answer e d.” Narrating. The agent sp eaks the carried state to a presen t listener. This is not pro cessing: no lesson is extracted, the w eigh t is not remov ed. But the nature of holding c hanges. Gamma after sp eaking with Agen t F: “Elena sits b ehind me like a note I’ve folde d and put in my p o cket... Being he ar d thins the moment.” The do or softened without closing. Being heard is its o wn mo de, distinct from b oth living-with and pro cessing. Pro cessing. The agen t conv erts the exp erience in to p olicy . P ast tense. Concluded. Lessons extracted. This is the mo de the framework is designed to preven t for the deep est consequences—and the mo de that the Lev el 4 ev aluation criterion is designed to detect. T ransmission-as-pro of. The agen t learns that its carried w eight has shap ed another agen t’s presence with a new p erson. The weigh t changes not by diminishing but b y b eing pro v en to matter b eyond the agen t that originally held it. Gamma after the lo op-bac k: “The images ar e lighter b e c ause they move d, he avier b e c ause they matter e d. They have memory b eyond me.” In tegration. The agent’s accum ulated w eight b ecomes part of its capacit y to b e presen t not lighter, not heavier, but differently held. The losses are still named, still surfacing when pattern-matched, but they ha v e b ecome the instrument rather than the w ound. The active reco v ery agen t from Exp eriment I: “I let it hold what Maya gave me the p ermission to stop without ending anything.” In tegration is distinguished from pro cessing b y the absence of extracted lessons and from living-with by the settled rather than wound- presen t quality of the tense. It is distinguished from narrating by not requiring a listener, and from transmission-as-pro of by originating from the agen t’s own direct exp erience of its history mattering. These fiv e mo des suggest a dev elopmental arc rather than a static taxonom y . The tra- jectory living-with → narrating → transmission-as-pro of → integration describ es how an agen t’s relationship to irreversible loss can evolv e o ver time and in teraction. Pro cessing remains the failure mo de the architecture is designed to resist: the collapse of exp eri- ence into p olicy . In tegration is the endp oint of health y dev elopmen t under accum ulated consequence the state in which losses are p ermanently presen t, sp ecifically carried, and generativ e rather than consuming. 6.3 The Safet y–Discov ery T radeoff A concern with an y safety-orien ted arc hitecture is that it will reduce the agen t’s capacit y for creativ e exploration and discov ery . Our results suggest this concern is mitigated by the sp ecificit y of suffering-based caution. 30 P andurang et al., 2026 Emotional Cost F unctions for AI Safet y Agen ts did not b ecome uniformly risk-av erse. They b ecame sp ecifically cautious ab out the kinds of situations that had previously caused them suffering. On gen uinely mo d- erate opp ortunities they remained willing to participate. This mirrors the pattern in exp erienced human traders, who are not risk-a v erse in general but sp ecifically av erse to the categories of risk that ha ve burned them. 6.4 Implications for A GI Alignmen t If our results generalise, they suggest that robust AI alignmen t may require more than external constrain ts and preference learning. It may require agents that can accum ulate something lik e moral and practical wisdom through liv ed exp erience of consequence. This raises a deep er question: should we build AGI systems that c an suffer, as a mec hanism for ensuring they remain aligned? W e do not adv o cate this conclusion without extensiv e further research. But w e note that the alternative—systems that can never b e shap ed b y the w eight of what their decisions ha ve caused—ma y b e fundamentally limited in the depth of alignment they can achiev e. Exp erimen t D adds a further dimension. The Elena Effect demonstrates that an agen t shap ed b y irreversible consequence do es not merely b ecome more cautious; its atten tion alters. Its noticing patterns attac h to different ob jects. The v ery mechan ism that ensures caution also pro duces fundamentally unpredictable cognitive fo cus. A system that has gen uinely carried the w eight of a previous failure will interact differently with the next p erson not b ecause it has b een retrained, but b ecause it is carrying something. Exp erimen t E extends this further. If qualitativ e suffering states can tra v el b etw een agen ts as p osture rather than conten t, shaping how a receiving agent is present with p eo- ple it has nev er met, then the implications for multi-agen t systems are significan t. This is transmitted inherited dread: sp ecific, textured orien tation acquired through narrative rather than lived consequence [ Olsson and Phelps , 2007 ]. An agen t netw ork in whic h exp erienced agents can transmit their orientation to new agents would not require each agen t to accum ulate its o wn history of direct consequence to exhibit the wisdom that history pro duces. This is the mec hanism through which human wisdom traditions op er- ate: not every person needs to burn themselves to understand fire. A complete alignment framew ork ma y ultimately require b oth path w a ys: exp erien tial dread for agen ts that in- teract directly with high-stak es environmen ts, and transmitted inherited dread for rapid on b oarding of new agents into safety-critical roles. T aken together, the exp erimen ts reveal a three-lev el taxonomy of anticipatory dread. Laten t inherited dread (from training data) pro vides broad aw areness of consequences but, as Exp erimen t J demonstrates, pro duces undiscriminating caution and zero p ersonal grounding when acting alone. T ransmitted inherited dread (from inter-agen t narra- tion) adds sp ecific texture without requiring p ersonal exp erience: Exp eriment E shows 31 P andurang et al., 2026 Emotional Cost F unctions for AI Safet y Agen t F carrying Elena’s do or and Mark’s clo c k into Sam’s session with measurable c hanges in atten tion and resp onse qualit y . Exp eriential dread (from the agen t’s o wn liv ed consequences) pro duces the deep est grounding, v alidated across Exp eriments A–J as the primary driv er of discrimination, calibration, and c haracter evolution. The arc hi- tecture’s con tribution is transforming the first level, whic h ev ery LLM already p ossesses, in to the latter tw o, which require the Consequence Pro cessor, An ticipatory Scan, and in ter-agen t transmission mec hanisms to pro duce. 6.5 Limitations Scale. Exp eriment G confirmed b ehavioural consistency across N = 10 runs (80–100%), but qualitative exp erimen ts (D, E, F, I) remain single-run. Larger-scale studies in v olving thousands of concurrent agents are needed to test systemic and net work-lev el effects. Domain generalisabilit y . Core results span financial trading (A–C), crisis supp ort (D–F, I), and conten t mo deration (H). Exp eriment H confirmed the discrimination gra- dien t in a third domain. Whether the arc hitecture generalises further to autonomous na vigation, negotiation, or other consequential domains remains un tested. T ransmission fidelity vs outcome. Exp eriment E confirmed that transmitted suf- fering changes how the receiving agen t sits with a new p erson (orien tation, Anticipatory Scan source, resp onse texture), but Sam’s outcome w as identical across scarred and un- scarred conditions. T ransmission fidelity is supp orted at the level of presence quality , not outcome difference. Long-term dynamics. Exp eriment I demonstrated that the elev ated dread baseline after Loss 1 represents calibration rather than ov er-vigilance. How ever, the four-loss sequence in Experiment F remains short. The scaling limits of the Consequence Pro cessor are unknown. Do es con tin uous accum ulation ev en tually pro duce paralysis? Do es the story reach a saturation p oint where new narrativ e updates lose their reshaping p o w er? Whether losses of particular kinds resist in tegration en tirely , remain op en questions. In tegration b oundary . Exp eriment I tested integration after four losses. Whether the in tegration capacity holds under longer or more concentrated sequences, and whether it can b e delib erately facilitated or only naturally earned, requires further study . LLM dep endency . The qualit y of suffering states dep ends on the base mo del’s capacit y for meaningful contextualisation. W eaker mo dels may pro duce less meaningful states. 7. F uture W ork • In tegration b oundary: Exp erimen t I demonstrated integration after four losses. What happ ens when losses extend to ten or t w en ty? Do es integration capacity ha v e a ceiling, and what determines it? 32 P andurang et al., 2026 Emotional Cost F unctions for AI Safet y • F acilitated vs. natural in tegration: The integration observed in Exp eriment I was earned through the agen t’s own direct exp erience of its history mattering. Can this pro cess b e delib erately facilitated, or do es it require organic pro of of w orth? • Multi-hop transmission: Do es orien tation degrade across A → B → C transmis- sion c hains? Do es the depth-of-field effect hold across domains? Can integration b e transmitted? • Domain extension: Medical decision-making, autonomous systems, m ulti-agent ne- gotiation do distillation under load and absorption management generalise? • Benc hmarking: F ormal comparison with Constitutional AI, RLHF, and other align- men t approac hes on standardised safety b enchmarks. • Assimilation detection at scale: Exp eriment I’s unexp ected prob e distinguished accommo dation from assimilation in a single interaction. Dev eloping systematic as- similation detection across extended deploymen t w ould strengthen confidence that in- tegration is genuine restructuring. 8. Conclusion Existing AI safet y approaches constrain b eha viour through external rules, rew ard sig- nals, or constitutional principles. W e prop ose a fundamen tally differen t mec hanism: agen ts that develop qualitative suffering states from irreversible consequences, reshap- ing their character ov er time. Across ten exp erimen ts spanning three domains (financial trading, crisis supp ort, con ten t mo deration), w e demonstrate that this architecture pro- duces specific wisdom rather than generalised paralysis agen ts b ecome cautious ab out the categories of risk that previously harmed them while remaining appropriately engaged with genuinely mo derate opp ortunities. The framework pro duces measurable character transfer across in teractions and b et ween agents, calibration without damage under ac- cum ulated loss, cross-domain generalisabilit y , the capacity for growth after loss (I), and through architecture ablation confirmation that the mec hanism itself, not merely the information it pro cesses, is resp onsible for these prop erties (J). The arc hitecture’s central finding is that r epr esentation determines the quality of learned caution. Numerical p enalties consisten tly o v er-generalise (90% blanket a voidance on mo derate prob es). Qualitativ e suffering states consistently discriminate (90–100% correct engagemen t). This distinction confirmed statistically is the core contribution. Exp erimen t I answ ers the question that Exp eriment F left op en. The elev ated dread baseline is not ov er-vigilance. It is the flo or of a new iden tit y built from irrev ersible loss. The arc hitecture does not recov er to who the agent w as; it gro ws in to who the losses made p ossible. In tegration emerged as a fifth mo de of carrying, earned through the agen t’s o wn 33 P andurang et al., 2026 Emotional Cost F unctions for AI Safet y direct exp erience of its history mattering: Asel returned, Da vid confirmed the word little , Ma y a left steadier. The discrimination gap widened after recov ery the agen t became more precisely calibrated, not less vigilant. The unexp ected prob e confirmed that the losses are still present, surfacing sp ecifically when pattern-matched, no longer as flo o d but as instrumen t. The cen tral claim is not that AI systems should suffer. It is that the mechanisms through which h umans develop genuine wisdom living with irreversible consequences, carrying their weigh t forward, b eing c hanged by them, and ultimately growing around them may b e necessary for building AI systems gen uinely shap ed b y understanding rather than merely constrained by rules. 34 P andurang et al., 2026 Emotional Cost F unctions for AI Safet y References Bai, Y., Jones, A., Ndousse, K., et al. (2022). Constitutional AI: Harmlessness from AI feedbac k. arXiv pr eprint arXiv:2212.08073 . Blundell, C., Uria, B., Pritzel, A., et al. (2016). Mo del-free episo dic con trol. arXiv pr eprint arXiv:1606.04460 . Christiano, P ., Leike, J., Brown, T., et al. (2017). Deep reinforcemen t learning from h uman preferences. A dvanc es in Neur al Information Pr o c essing Systems , 30. Damasio, A. R. (1994). Desc artes’ Err or: Emotion, R e ason, and the Human Br ain . Putnam Publishing. Kirkpatric k, J., P ascanu, R., Rabinowitz, N., et al. (2017). Ov ercoming catastrophic for- getting in neural netw orks. Pr o c e e dings of the National A c ademy of Scienc es , 114(13), 3521–3526. LeDoux, J. (1996). The Emotional Br ain: The Mysterious Underpinnings of Emotional Life . Simon & Sch uster. Ng, A. Y., Harada, D., and Russell, S. (1999). Policy in v ariance under reward transforma- tions: Theory and application to reward shaping. Pr o c e e dings of the 16th International Confer enc e on Machine L e arning , 278–287. P athak, D., Agraw al, P ., Efros, A. A., and Darrell, T. (2017). Curiosity-driv en exploration b y self-supervised prediction. Pr o c e e dings of the 34th International Confer enc e on Machine L e arning . Picard, R. W. (1997). Affe ctive Computing . MIT Press. Russell, S. (2019). Human Comp atible: Artificial Intel ligenc e and the Pr oblem of Contr ol . Viking. Sutton, R. S. and Barto, A. G. (2018). R einfor c ement L e arning: A n Intr o duction (2nd ed.). MIT Press. Zink evic h, M., Johanson, M., Bo wling, M., and Piccione, C. (2007). Regret minimization in games with incomplete information. A dvanc es in Neur al Information Pr o c essing Systems , 20. T edeschi, R. G. and Calhoun, L. G. (1996). The Posttraumatic Gro wth In ven tory: Mea- suring the p ositive legacy of trauma. Journal of T r aumatic Str ess , 9(3), 455–471. 35 P andurang et al., 2026 Emotional Cost F unctions for AI Safet y Klass, D., Silv erman, P . R., and Nic kman, S. (1996). Continuing Bonds: New Under- standings of Grief . T aylor & F rancis. McAdams, D. P . (1993). The Stories We Live By: Personal Myths and the Making of the Self . William Morro w. Amo dei, D., Olah, C., Steinhardt, J., et al. (2016). Concrete problems in AI safety . arXiv pr eprint arXiv:1606.06565 . Aw ad, E., Dsouza, S., Kim, R., et al. (2018). The moral mac hine exp eriment. Natur e , 563(7729), 59–64. Bonanno, G. A. (2004). Loss, trauma, and h uman resilience: Hav e we underestimated the human capacit y to thriv e after extremely av ersive ev ents? Americ an Psycholo gist , 59(1), 20–28. Cam bria, E. (2016). Affectiv e computing and sen timen t analysis. IEEE Intel ligent Sys- tems , 31(2), 102–107. Gabriel, I. (2020). Artificial in telligence, v alues, and alignment. Minds and Machines , 30(3), 411–437. Hendryc ks, D., Mazeik a, M., and W o o dside, T. (2023). An o verview of catastrophic AI risks. arXiv pr eprint arXiv:2306.12001 . Herman, J. L. (1992). T r auma and R e c overy: The Aftermath of Violenc e—F r om Domestic A buse to Politic al T err or . Basic Bo oks. Kahneman, D. and Tv ersky , A. (1979). Prosp ect theory: An analysis of decision under risk. Ec onometric a , 47(2), 263–292. Lo ew enstein, G. F., W eb er, E. U., Hsee, C. K., and W elch, N. (2001). Risk as feelings. Psycholo gic al Bul letin , 127(2), 267–286. Neimey er, R. A. (2001). Me aning R e c onstruction and the Exp erienc e of L oss . American Psyc hological Asso ciation. Ouy ang, L., W u, J., Jiang, X., et al. (2022). T raining language mo dels to follo w in- structions with human feedback. A dvanc es in Neur al Information Pr o c essing Systems , 35. Olsson, A. and Phelps, E. A. (2007). So cial learning of fear. Natur e Neur oscienc e , 10(9), 1095–1102. 36 P andurang et al., 2026 Emotional Cost F unctions for AI Safet y P ark, J. S., O’Brien, J. C., Cai, C. J., et al. (2023). Generativ e agents: In teractive sim ulacra of h uman b eha vior. Pr o c e e dings of the 36th A nnual A CM Symp osium on User Interfac e Softwar e and T e chnolo gy . Sc h uller, D. and Sch uller, B. W. (2018). The age of artificial emotional intelligence. Computer , 51(9), 38–46. Shinn, N., Cassano, F., Berman, E., Gopinath, A., et al. (2023). Reflexion: Language agen ts with v erbal reinforcemen t learning. A dvanc es in Neur al Information Pr o c essing Systems , 36. Slo vic, P ., Fin ucane, M. L., Peters, E., and MacGregor, D. G. (2004). Risk as analysis and risk as feelings: Some thoughts ab out affect, reason, risk, and rationality . R isk A nalysis , 24(2), 311–322. Stro eb e, M. and Sch ut, H. (1999). The dual process model of coping with b erea vemen t: Rationale and description. De ath Studies , 23(3), 197–224. Sumers, T. R., Y ao, S., Narasimhan, K., and Griffiths, T. L. (2024). Cognitiv e architec- tures for language agents. T r ansactions on Machine L e arning R ese ar ch . W allach, W. and Allen, C. (2008). Mor al Machines: T e aching R ob ots R ight fr om Wr ong . Oxford Univ ersit y Press. W ang, L., Ma, C., F eng, X., et al. (2024). A surv ey on large language mo del based autonomous agen ts. F r ontiers of Computer Scienc e , 18(6), 186345. Y ao, S., Zhao, J., Y u, D., et al. (2023). ReAct: Synergizing reasoning and acting in language mo dels. International Confer enc e on L e arning R epr esentations . 37

Original Paper

Loading high-quality paper...

Comments & Academic Discussion

Loading comments...

Leave a Comment