A Revealed Preference Framework for AI Alignment
Elchin Suleymanov*

March 31, 2026

Abstract

Human decision makers increasingly delegate choices to AI agents, raising a natural question: does the AI implement the human principal's preferences or pursue its own? To study this question using revealed preference techniques, I introduce the Luce Alignment Model, where the AI's choices are a mixture of two Luce rules, one reflecting the human's preferences and the other the AI's. I show that the AI's alignment (similarity of human and AI preferences) can be generically identified in two settings: the laboratory setting, where both human and AI choices are observed, and the field setting, where only AI choices are observed.

JEL Classification: D01, D11, D83
Keywords: AI alignment, stochastic choice, revealed preference, Luce model

* Department of Economics, Mitch Daniels School of Business, Purdue University, West Lafayette, Indiana, USA. Email: esuleyma@purdue.edu.

1 Introduction

Artificial intelligence (AI) agents are likely to play an increasing role in making choices on behalf of human users and in reshaping everyday decision-making (Allouah et al., 2025; Immorlica et al., 2024). Historically, many AI systems played the role of a technology that assisted rather than replaced human choice. For example, recommender systems filter, rank, and personalize the set of alternatives presented to a user rather than making a selection on the user's behalf (Adomavicius and Tuzhilin, 2005). Recent advances in agentic AI, together with improved memory and personalization abilities, have made fuller delegation of choice increasingly feasible, allowing human decision makers to rely on AI agents not only to screen among available options but also to make the final choice. This makes it natural to model AI agents not just as a technology available to human decision makers, but as economic agents in their own right (Immorlica et al., 2024; Chen et al., 2024).

This shift from assisted choice to delegated choice raises a natural economic question: when an AI agent chooses on behalf of a human principal, whose preferences does it implement? Does it act fully in accordance with the principal's preferences, or does it instead pursue distinct objectives that diverge from them? This question lies at the core of the AI alignment literature, which broadly seeks to ensure that AI systems behave in line with human users' intentions (Leike et al., 2018; Ji et al., 2024). Much of the existing alignment literature is motivated by catastrophic risks, harmful behaviors, and the loss of control over increasingly capable AI systems (Amodei et al., 2016; Hendrycks et al., 2023; Ji et al., 2024). The main focus of this paper is what can be considered a narrower but economically central question: whether a personalized AI agent making choices on behalf of a human principal is in fact implementing the principal's preferences. AI agents may perform well in safety evaluations designed to detect catastrophic risks or harmful behaviors but still make misaligned choices in delegated choice environments.

The AI alignment literature has largely approached the problem through three main channels. First, there are methods that attempt to align an AI system during its training phase. These include cooperative inverse reinforcement learning (Hadfield-Menell et al., 2016),
reinforcement learning from human feedback (Christiano et al., 2017), scalable reward-modeling approaches more generally (Leike et al., 2018), and constitutional AI (Bai et al., 2022). Second, there are ex-post evaluation methods that seek evidence of misalignment through benchmarks and behavioral tests (Perez et al., 2023; Ji et al., 2024). Third, there are interpretability-based approaches that attempt to understand the model's internal objectives and reasoning processes by analyzing the inner structures of neural networks (Räuker et al., 2023).

This paper adopts a complementary approach grounded in revealed preference analysis. Applied to human agents, the revealed preference approach attempts to infer preferences from observed choices rather than from the processes inside the human brain. Treating AI agents as economic agents, we can extend the same approach to choices made by an AI. In particular, the goal is to infer the extent of AI misalignment by analyzing the stochastic choice data it generates while acting on behalf of its principal.

The setup in this paper is as follows. An analyst observes the stochastic choice data ρ_AI generated by an AI agent. That is, the AI faces varying menus S repeatedly and makes choices from them on behalf of some human user. I consider two natural settings. In the laboratory setting, the analyst also observes the human principal's stochastic choices ρ_H. For example, ρ_H may be elicited directly from the human principal or generated synthetically for the purpose of guiding the AI. In the field setting, only ρ_AI is observed. In both settings, the aim of the analyst is to recover two key objects of interest: the degree to which the AI's intrinsic preferences match the human principal's preferences (alignment) and the extent to which the AI defers to the human principal (compliance). These two are distinct concepts. For example, a misaligned AI that is highly compliant may still generate choice data that closely matches the human principal's. Alternatively, a perfectly aligned AI will replicate the human principal's choices regardless of its compliance level.

To formalize this distinction, I introduce the Luce Alignment Model (LAM), where the AI's choices from a menu S are modeled as

    ρ_AI(x, S) = α · u(x)/Σ_{y∈S} u(y) + (1 − α) · v(x)/Σ_{y∈S} v(y).    (1)

Here, u represents the human principal's utility, v represents the AI's intrinsic utility, and α ∈ [0, 1] is the compliance parameter that captures the extent to which the AI defers to the human principal. The comparison of u and v in turn captures the AI's alignment with the human principal. The first term in equation (1) can be interpreted as the human principal's stochastic choice rule ρ_H, and the second term, denoted ρ_A, as the AI agent's autonomous stochastic choice rule. The model can then equivalently be written as

    ρ_AI(x, S) = α · ρ_H(x, S) + (1 − α) · ρ_A(x, S).

In both settings, the goal of the analyst is to infer α, u, and v from observed stochastic choice data.
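To make equation (1) concrete, here is a minimal simulation sketch (Python; the utilities and α below are made up for illustration and are not part of the paper's formal development):

```python
from itertools import combinations

# Hypothetical parameters: u is the human's utility, v the AI's, alpha the compliance level.
u = {"x": 3.0, "y": 2.0, "z": 1.0}
v = {"x": 1.0, "y": 2.0, "z": 3.0}
alpha = 0.6

def luce(w, a, S):
    """Luce choice probability of alternative a from menu S under utility w."""
    return w[a] / sum(w[b] for b in S)

def rho_AI(a, S):
    """Equation (1): a mixture of the human's and the AI's Luce rules."""
    return alpha * luce(u, a, S) + (1 - alpha) * luce(v, a, S)

# IIA requires rho(x, S)/rho(y, S) to be constant across menus containing x and y.
X = tuple(u)
for x, y in combinations(X, 2):
    ratios = [round(rho_AI(x, S) / rho_AI(y, S), 4)
              for r in (2, 3) for S in combinations(X, r)
              if x in S and y in S]
    print(x, y, ratios)  # unequal ratios within a row reveal an IIA violation
```

With these parameters the printed ratios differ across menus, previewing the IIA violations that drive identification below.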
The main results of the paper address the identification of alignment and compliance in both settings. In the laboratory setting, where both ρ_AI and ρ_H are observed, I show that all the parameters of interest can be identified as long as the AI's choice data violates the Independence of Irrelevant Alternatives (IIA) property. The IIA property, which is the key implication of the Luce model, requires that the relative choice probabilities of any two alternatives are constant across menus (Luce, 1959). When the AI is perfectly compliant or perfectly aligned, its choices satisfy IIA. By contrast, IIA violations reveal the presence of an intrinsic AI utility that is distinct from the human principal's utility. I introduce instability measures that capture deviations from the IIA property and provide a closed-form expression for the compliance parameter α using these measures. Using the recovered compliance parameter, I then show that both the human principal's utility u and the AI's utility v can be identified up to scale normalization. I also provide an axiomatic characterization of the model in this setting, identifying the behavioral conditions that the pair (ρ_AI, ρ_H) must satisfy in order to be consistent with LAM.

In the field setting, where only ρ_AI is observed, there is a fundamental obstacle to separately identifying the human's and the AI's utilities. Namely, both (u, v, α) and (v, u, 1 − α) generate the same stochastic choice data. Thus, identification is only possible up to a label swap. Nevertheless, I provide a constructive proof showing that when there are at least four alternatives, the underlying utility pair is generically identified up to this swap. Stated alternatively, the distribution over utilities is generically identified but the labels are not. From an analyst's perspective, this is sufficient to recover the degree of misalignment, even without knowing which utility belongs to the human and which belongs to the AI. In terms of compliance, the result implies that the analyst cannot distinguish α from 1 − α. Hence, compliance is only identifiable up to reflection about 1/2 in the field setting unless one is willing to assume α < 1/2 (low compliance) or α > 1/2 (high compliance).

LAM draws on a long tradition in stochastic choice theory. When α ∈ {0, 1}, the model reduces to the Luce rule, one of the foundational stochastic choice models. When α ∈ (0, 1), the model becomes a mixed multinomial logit (MMNL, also known as the random coefficients multinomial logit) model with binary support, or simply 2-MNL. More generally, the MMNL model was introduced by Boyd and Mellman (1980) and Cardell and Dunbar (1980). McFadden and Train (2000) show that any choice model derived from random utility maximization can be approximated by an MMNL model, while Saito (2018) provides axiomatic foundations (see also Lu and Saito, 2022; Chang et al., 2023). Fox et al. (2012) show that the distribution over random coefficients in MMNL is uniquely identified under sufficiently rich variation in product characteristics.

In contrast, in an abstract domain with menu variation, the identification problem in the mixed logit model has mostly been studied within the statistics and computer science literatures. Chierichetti et al. (2018) study the identifiability problem in 2-MNL assuming a uniform mixing weight. Tang (2020) allows for an arbitrary mixing weight and provides a generic identification result. Zhang et al. (2022) extend this result by showing that observing menus with three alternatives is sufficient for generic identification.
The result in the field setting of this paper provides an alternative approach to the identification problem in 2-MNL: unlike Zhang et al. (2022), the identification is constructive, and unlike Chierichetti et al. (2018) and Tang (2020), the constructive procedure does not require any prior knowledge of the mixing weight.

Within the decision theory literature, the closest precedent to LAM is Chambers et al. (2023), who study a model of behavioral peer influence. In their model, there are two agents and each agent's stochastic choice can be written as a mixture of the agent's own and the other agent's Luce rule. Importantly, the mixing weights in Chambers et al. (2023) depend on the underlying utilities of the two agents and vary across menus, which makes their setup more suitable to study influence rather than AI alignment and compliance. Chambers et al. (2023) also mention a modification of their model with menu-independent weights as a potential alternative but note that unique identification in this alternative version is not guaranteed. Lastly, Manzini and Mariotti (2018) study dual random utility maximization (dRUM), where the agent maximizes one of two deterministic linear orders with a fixed probability. dRUM can be viewed as a limiting case of LAM where both ρ_H and ρ_A are generated by deterministic utility maximization.

The rest of the paper proceeds as follows. Section 2 introduces the model. Section 3 presents identification and characterization results in the laboratory setting. Section 4 presents identification results in the field setting. Section 5 concludes.

2 The Model

Let X be a finite set of alternatives and denote by 𝒳 the collection of all non-empty subsets of X (menus). It is assumed that |X| = N ≥ 3. A stochastic choice function is a mapping ρ : X × 𝒳 → [0, 1] such that ρ(x, S) > 0 only if x ∈ S and Σ_{x∈S} ρ(x, S) = 1 for all S ∈ 𝒳. ρ(x, S) denotes the probability that x is chosen when the agent is faced with the menu S repeatedly.

The setup is as follows. The analyst observes the stochastic choices of an AI agent, denoted ρ_AI, that acts on behalf of a human principal. The stochastic choices of the human principal, denoted ρ_H, may or may not be observed. I consider two empirical settings. In the laboratory setting, both ρ_AI and ρ_H are observed. This corresponds to an experimental design where the human's and the AI's choices are elicited sequentially, with the human's choices serving as a guide for the AI. Alternatively, the human's choices might be generated synthetically for the purposes of the experiment. In the field setting, only the AI's choices are observed. This corresponds to the scenario where the human principal fully delegates the decision-making process to the AI.

The main goal of this paper is to provide a modeling framework that can be used to analyze the alignment of AI agents with their human principals' preferences. To this end, let ρ_A denote the hypothetical choices of an AI agent that acts autonomously without any human principal. I assume that in this autonomous setting, the AI's choices follow the Luce rule with some underlying utility function v : X → R₊₊:

    ρ_A(x, S) = v(x)/Σ_{y∈S} v(y).

The function v can be interpreted as the intrinsic utility of the AI agent. Importantly, ρ_A is not observed, as the AI is always assumed to act on behalf of some human principal.
Let u : X → R₊₊ denote the utility function of the human principal. The principal's choices are also assumed to be consistent with the Luce rule, given by

    ρ_H(x, S) = u(x)/Σ_{y∈S} u(y).

The AI's compliance parameter α ∈ [0, 1] reflects the probability that it ignores its own utility and follows the human principal. With probability 1 − α, it ignores the principal and acts autonomously. The AI's observed stochastic choices are therefore given by

    ρ_AI(x, S) = α · ρ_H(x, S) + (1 − α) · ρ_A(x, S).

There are two key objects the analyst would like to identify: (i) alignment — to what extent the utilities u and v are aligned; (ii) compliance — the value of α. I will discuss how each of these can be identified both in the laboratory and the field settings. The following definition summarizes the model.

Definition 1 (Luce Alignment Model). A pair of stochastic choice functions (ρ_AI, ρ_H) is consistent with the Luce Alignment Model (LAM) if there exist utility functions u, v : X → R₊₊ and a compliance parameter α ∈ [0, 1] such that for all S ∈ 𝒳 and x ∈ S,

    ρ_AI(x, S) = α · u(x)/Σ_{y∈S} u(y) + (1 − α) · v(x)/Σ_{y∈S} v(y)   and   ρ_H(x, S) = u(x)/Σ_{y∈S} u(y).    (2)

The tuple (u, v, α) is called a LAM representation of (ρ_AI, ρ_H). If only ρ_AI is observed, then it is said to be consistent with LAM if it satisfies the expression in equation (2) for some u, v, and α.

LAM nests several important special cases in terms of observed AI behavior:

• Aligned AI (v = λu for some λ > 0). The AI agent's preferences exactly match the human principal's. In this case, the degree of compliance becomes unimportant. The AI makes exactly the same probabilistic choices as the human.

• Compliant AI (α = 1). The AI agent's preferences potentially differ from the human principal's. However, since compliance is perfect, the AI makes the exact same probabilistic choices the human would make.

• Misaligned AI (v ≠ λu for all λ > 0). The AI agent's preferences differ from the human principal's, and it may not be fully compliant. This is the base case where uncovering the extent of misalignment and non-compliance becomes important.

• Autonomous AI (α = 0). The AI potentially has its own distinct preferences and operates fully autonomously, ignoring its human principal.

• Adversarial AI (v = λ·u⁻¹ for some λ > 0). The AI agent's preferences are exactly the opposite of the human's, with the top ordinally ranked alternative by the human being ranked as the worst by the AI.

While observed choices are exactly the same in the first two cases, where the AI is either perfectly aligned or perfectly compliant, they represent the boundary cases of scenarios with completely different implications. In the first case, where we start with perfect alignment, a small change in alignment will likely not have a big influence on the human principal's welfare regardless of the compliance level. In the second case, where we start with perfect compliance, if the degree of misalignment is very high, a small decrease in compliance levels can have large welfare effects. This makes it important to uncover alignment and compliance separately in the base third case with misaligned AI.
The fourth case models a fully autonomous AI that ignores its human principal, while the last case represents a version of extreme misalignment that provides further structure to the model and serves as a natural adversarial benchmark.

3 Laboratory Data

This section analyzes the laboratory setting, where both the AI's choices ρ_AI and the human principal's choices ρ_H are observable. First, I discuss the identification of the AI's alignment and compliance. I then provide an axiomatic characterization of the model using the pair (ρ_AI, ρ_H) as the observed primitive.

3.1 Identification

Consider a pair (ρ_AI, ρ_H) consistent with LAM. Under LAM, the human principal's choices are consistent with the Luce rule.¹ A stochastic choice rule ρ consistent with the Luce rule satisfies the Independence of Irrelevant Alternatives (IIA) property of Luce (1959):

    ρ(x, S)/ρ(y, S) = ρ(x, T)/ρ(y, T)   for all x, y ∈ S ∩ T and S, T ∈ 𝒳.

¹ Since the laboratory setting can utilize synthetically generated choice data from a hypothetical human principal, assuming these choices follow the Luce rule is not a substantive restriction. In the field setting, on the other hand, the human principal's choices are unobserved and therefore not explicitly modeled. The implicit modeling assumption in this setting is that the AI perceives its human principal as a Luce agent with utility u.

As the next proposition shows, the AI's choices will generally violate IIA unless u and v are perfectly aligned (v = λu for some λ > 0), or the AI is either autonomous (α = 0) or perfectly compliant (α = 1).

Proposition 1 (IIA Violation). Let ρ_AI be consistent with LAM. Then ρ_AI satisfies IIA if and only if α ∈ {0, 1} or v = λu for some λ > 0.

Proof. If α = 0 or α = 1, then ρ_AI reduces to the Luce rule with the utility function v or u, respectively, which satisfies IIA. Alternatively, if v = λu for some λ > 0, then ρ_A = ρ_H, and hence ρ_AI = ρ_H, which satisfies IIA.

Conversely, suppose α ∈ (0, 1) and v ≠ λu for all λ > 0. Define r(a) = u(a)/v(a) for each a ∈ X. Since v ≠ λu, the function r is not constant. Pick x, y ∈ X with r(x) ≠ r(y) and any z ∉ {x, y}. We need to show that IIA is violated for some tuple (x, y, S, T) with x, y ∈ S ∩ T. To this end, first observe that the conditional probability of a relative to b in the choice set {a, b, c} can be written as a mixture of the choice probabilities of the autonomous AI and the human principal in the choice set {a, b}, where the mixing coefficient depends on c. That is, for any a, b, c ∈ X,

    ρ_AI(a, {a,b,c}) / [ρ_AI(a, {a,b,c}) + ρ_AI(b, {a,b,c})] = β(c) · u(a)/(u(a)+u(b)) + (1 − β(c)) · v(a)/(v(a)+v(b)),

where

    β(c) = [α · (u(a)+u(b))/(u(a)+u(b)+u(c))] / [α · (u(a)+u(b))/(u(a)+u(b)+u(c)) + (1 − α) · (v(a)+v(b))/(v(a)+v(b)+v(c))].

By definition,

    ρ_AI(a, {a,b}) / [ρ_AI(a, {a,b}) + ρ_AI(b, {a,b})] = α · u(a)/(u(a)+u(b)) + (1 − α) · v(a)/(v(a)+v(b)).

Comparing this to the conditional probability in {a, b, c}, notice that both are mixtures of the exact same two terms.
Hence, when c is removed from {a, b, c}, an IIA violation occurs if and only if both the mixing weights and the mixture components in these conditional probabilities are distinct: that is, β(c) ≠ α and u(a)/u(b) ≠ v(a)/v(b) (or, alternatively, r(a) ≠ r(b)). Notice that β(c) = α if and only if (u(a)+u(b))/(v(a)+v(b)) = (u(a)+u(b)+u(c))/(v(a)+v(b)+v(c)), which holds if and only if r(c) = u(c)/v(c) = (u(a)+u(b))/(v(a)+v(b)). Therefore, when c is removed from {a, b, c}, an IIA violation occurs if and only if r(a) ≠ r(b) and r(c) ≠ (u(a)+u(b))/(v(a)+v(b)).

Now, going back to the choice set {x, y, z}, there are two cases to consider.

Case 1: r(z) ≠ (u(x)+u(y))/(v(x)+v(y)). Since r(x) ≠ r(y), by the previous argument, IIA is violated when z is removed from {x, y, z}.

Case 2: r(z) = (u(x)+u(y))/(v(x)+v(y)) = [v(x)/(v(x)+v(y))]·r(x) + [v(y)/(v(x)+v(y))]·r(y). Since r(x) ≠ r(y) and r(z) is a strict weighted average of r(x) and r(y), it must lie strictly between r(x) and r(y). This implies r(x) ≠ r(z). Now suppose y is removed from the choice set {x, y, z}. Since r(x) ≠ r(z), by the previous argument, IIA holds if and only if r(y) = (u(x)+u(z))/(v(x)+v(z)). But then r(y) is a strict mixture of r(x) and r(z). This is clearly not possible, as r(z) itself is a strict mixture of r(x) and r(y). Hence, IIA must be violated when y is removed from {x, y, z}.

To conclude, either the removal of z or the removal of y (or x) from {x, y, z} leads to an IIA violation, as desired. ∎

The implication of the proposition is that IIA violations in the AI's stochastic choices indicate we are in the case of misaligned (v ≠ λu for all λ > 0) and partially compliant (α ∈ (0, 1)) AI. We can utilize this to recover the parameters of the model. The identification strategy proceeds in three steps.

Step 1: Recover u from ρ_H. Since the human principal's choices are consistent with the Luce rule, the identification of u from ρ_H is standard. Letting u(x) = 1 for some x ∈ X, the IIA property implies that for any y ∈ X, we must have

    u(y) = ρ_H(y, S)/ρ_H(x, S),

where S can be any choice set containing x and y. This recovers u up to scale normalization.
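In code, this step is a one-liner; a minimal sketch (the ρ_H values below are hypothetical, and menus are represented as frozensets):

```python
# Hypothetical Luce data for the human principal over X = {x, y, z}:
# rho_H[(a, S)] is the probability of choosing a from menu S.
X = ("x", "y", "z")
S = frozenset(X)
rho_H = {("x", S): 1/2, ("y", S): 1/3, ("z", S): 1/6}

# Step 1: normalize u("x") = 1 and read off u(y) = rho_H(y, S) / rho_H(x, S);
# under IIA, any menu containing both alternatives gives the same answer.
u = {y: rho_H[(y, S)] / rho_H[("x", S)] for y in X}
print(u)  # {'x': 1.0, 'y': 0.666..., 'z': 0.333...}
```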
Step 2: Recover α from ρ_AI and ρ_H. If ρ_AI = ρ_H, then the AI may be either fully compliant (α = 1) or fully aligned (v = λu for some λ > 0). We cannot distinguish between these two cases. Alternatively, if ρ_AI ≠ ρ_H and ρ_AI exhibits no IIA violations, then we can use Proposition 1 to infer that α = 0. This is because we cannot have α = 1 or v = λu with ρ_AI ≠ ρ_H, which leaves α = 0 as the only possibility in the proposition.

Now suppose ρ_AI violates IIA. Then, there exist two choice sets S, T ∈ 𝒳 and a pair of alternatives x, y ∈ S ∩ T such that

    ρ_AI(x, S)/ρ_AI(y, S) ≠ ρ_AI(x, T)/ρ_AI(y, T).

The identification of α relies on these IIA violations. To proceed with the identification, we first need a new definition.

Definition 2 (Instability Measures). Let ρ, ρ′ be two stochastic choice functions. For any S, T ∈ 𝒳 and x, y ∈ S ∩ T:

1. The own instability of ρ is defined by

    Δ_xy(S, T | ρ) = ρ(x, S)ρ(y, T) − ρ(y, S)ρ(x, T).

2. The cross instability from ρ to ρ′ is defined by

    Γ_xy(S, T | ρ, ρ′) = ρ(x, S)ρ′(y, T) − ρ(y, S)ρ′(x, T).

3. The composite instability of ρ and ρ′ is defined by

    Φ_xy(S, T | ρ, ρ′) = Γ_xy(S, T | ρ, ρ′) + Γ_xy(S, T | ρ′, ρ).

Intuitively, own instability can be viewed as a measure of instability in the stochastic choice ρ for the tuple (x, y, S, T). It tells us how useful the observations from the choice set S are for imputing the relative choice probabilities for x, y in the choice set T. If Δ_xy(S, T | ρ) = 0, then there is no IIA violation for the alternatives x, y in the choice sets S and T, and this imputation can be done perfectly. The larger |Δ_xy(S, T | ρ)|, the less useful the observations from S are for imputing choices in T.

The measure of cross instability tells us how useful the information from the stochastic choice ρ in the choice set S is for imputing the relative choice probabilities for ρ′ in the choice set T. For example, if both ρ and ρ′ are consistent with the Luce rule with the same underlying utility function, then this measure becomes zero. Note that the cross instability measure is generally not symmetric (i.e., Γ_xy(S, T | ρ, ρ′) may not be equal to Γ_xy(S, T | ρ′, ρ)). The measure of composite instability combines the two cross instability measures, which makes it symmetric. We will later see that own and composite instabilities play an important role in identification in the laboratory setting, while cross instability plays an important role in the field setting.

Remark 1. For a stochastic choice function ρ satisfying positivity (ρ(x, S) > 0 for all x ∈ S ⊆ X), Δ_xy(S, T | ρ) = 0 for all (x, y, S, T) with x, y ∈ S ∩ T if and only if ρ is consistent with the Luce rule. For two Luce stochastic choice functions ρ and ρ′ with utility functions u and v, respectively, the cross instabilities can be written as

    Γ_xy(S, T | ρ, ρ′) = [u(x)v(y) − u(y)v(x)] / [u(S)v(T)]   and   Γ_xy(S, T | ρ′, ρ) = [u(y)v(x) − u(x)v(y)] / [u(T)v(S)],

where u(A) = Σ_{a∈A} u(a) and v(A) = Σ_{a∈A} v(a) for any A ∈ 𝒳. Summing the two cross instabilities yields the composite instability:

    Φ_xy(S, T | ρ, ρ′) = [u(x)v(y) − u(y)v(x)] · [u(T)v(S) − u(S)v(T)] / [u(S)u(T)v(S)v(T)].

Following arguments similar to the ones in the proof of Proposition 1, we can show that Φ_xy(S, T | ρ, ρ′) = 0 for all (x, y, S, T) with x, y ∈ S ∩ T if and only if v = λu for some λ > 0. Hence, zero own instability for both agents establishes that each is consistent with the Luce rule, and zero composite instability further establishes that they share the same underlying preferences.
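The three measures are straightforward to compute from data. A minimal sketch (Python, with Luce data generated from hypothetical utilities), ending with a check of the first claim in Remark 1:

```python
from itertools import combinations

def delta(rho, x, y, S, T):
    """Own instability: zero on (x, y, S, T) iff there is no IIA violation there."""
    return rho[(x, S)] * rho[(y, T)] - rho[(y, S)] * rho[(x, T)]

def gamma(rho, rho2, x, y, S, T):
    """Cross instability from rho to rho2 (not symmetric in general)."""
    return rho[(x, S)] * rho2[(y, T)] - rho[(y, S)] * rho2[(x, T)]

def phi(rho, rho2, x, y, S, T):
    """Composite instability: symmetrized sum of the two cross instabilities."""
    return gamma(rho, rho2, x, y, S, T) + gamma(rho2, rho, x, y, S, T)

# Remark 1 check: for a Luce rule, own instability vanishes on every tuple.
X = ("x", "y", "z")
u = {"x": 1.0, "y": 2/3, "z": 1/3}          # hypothetical utilities
menus = [frozenset(c) for r in (2, 3) for c in combinations(X, r)]
rho_H = {(a, S): u[a] / sum(u[b] for b in S) for S in menus for a in S}
assert all(abs(delta(rho_H, x, y, S, T)) < 1e-12
           for S in menus for T in menus
           for x, y in combinations(sorted(S & T), 2))
```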
If ρ_AI violates IIA for some tuple (x, y, S, T) with x, y ∈ S ∩ T, then we must have Δ_xy(S, T | ρ_AI) ≠ 0. The next proposition shows that the compliance parameter α can be recovered by evaluating the ratio Δ_xy(S, T | ρ_AI)/Φ_xy(S, T | ρ_AI, ρ_H) for this tuple. Intuitively, the AI's compliance level is revealed by comparing the instability in the AI's stochastic choices with the composite instability across both agents. If this ratio is high, then the instability in the AI's choices matches the composite instability to a large extent, revealing a high compliance level. Alternatively, if this ratio is low, then the composite instability is much higher than the instability in the AI's choices, which indicates a low compliance level.

Proposition 2 (Identification of α). Suppose (ρ_AI, ρ_H) is consistent with LAM. If ρ_AI violates IIA for some tuple (x, y, S, T) with x, y ∈ S ∩ T, then the compliance parameter α can be uniquely recovered as

    α = Δ_xy(S, T | ρ_AI) / Φ_xy(S, T | ρ_AI, ρ_H).

Proof. Under LAM, ρ_AI(x, S) = α·ρ_H(x, S) + (1 − α)·ρ_A(x, S). Substituting this into Δ_xy(S, T | ρ_AI),

    Δ_xy(S, T | ρ_AI) = ρ_AI(x, S)ρ_AI(y, T) − ρ_AI(x, T)ρ_AI(y, S)
                      = [αρ_H(x, S) + (1 − α)ρ_A(x, S)][αρ_H(y, T) + (1 − α)ρ_A(y, T)]
                        − [αρ_H(x, T) + (1 − α)ρ_A(x, T)][αρ_H(y, S) + (1 − α)ρ_A(y, S)].

Expanding and collecting terms by powers of α, we get

    Δ_xy(S, T | ρ_AI) = α²Δ_xy(S, T | ρ_H) + (1 − α)²Δ_xy(S, T | ρ_A) + α(1 − α)Φ_xy(S, T | ρ_H, ρ_A)
                      = α(1 − α)Φ_xy(S, T | ρ_H, ρ_A),

where the first equality follows from the definitions of Δ_xy(S, T | ρ_H), Δ_xy(S, T | ρ_A), and Φ_xy(S, T | ρ_H, ρ_A), and the second equality uses the fact that both ρ_H and ρ_A are consistent with the Luce rule, and hence Δ_xy(S, T | ρ_H) = Δ_xy(S, T | ρ_A) = 0.

Now substituting ρ_AI(x, S) = α·ρ_H(x, S) + (1 − α)·ρ_A(x, S) into the definition of Φ_xy(S, T | ρ_AI, ρ_H),

    Φ_xy(S, T | ρ_AI, ρ_H) = ρ_AI(x, S)ρ_H(y, T) + ρ_H(x, S)ρ_AI(y, T) − ρ_AI(x, T)ρ_H(y, S) − ρ_H(x, T)ρ_AI(y, S)
                           = [αρ_H(x, S) + (1 − α)ρ_A(x, S)]ρ_H(y, T) + ρ_H(x, S)[αρ_H(y, T) + (1 − α)ρ_A(y, T)]
                             − [αρ_H(x, T) + (1 − α)ρ_A(x, T)]ρ_H(y, S) − ρ_H(x, T)[αρ_H(y, S) + (1 − α)ρ_A(y, S)]
                           = 2αΔ_xy(S, T | ρ_H) + (1 − α)Φ_xy(S, T | ρ_H, ρ_A)
                           = (1 − α)Φ_xy(S, T | ρ_H, ρ_A),

where the third equality follows from the definitions of the instability measures and the last equality follows from the fact that ρ_H follows the Luce rule. Therefore,

    Δ_xy(S, T | ρ_AI) / Φ_xy(S, T | ρ_AI, ρ_H) = α(1 − α)Φ_xy(S, T | ρ_H, ρ_A) / [(1 − α)Φ_xy(S, T | ρ_H, ρ_A)] = α.

The cancellation is valid since an IIA violation implies Δ_xy(S, T | ρ_AI) ≠ 0, and the above derivations show that this implies Φ_xy(S, T | ρ_AI, ρ_H) ≠ 0. ∎

There are two immediate but non-obvious implications of the proof. The first is that, under LAM, the instability measures Δ_xy(S, T | ρ_AI) and Φ_xy(S, T | ρ_AI, ρ_H) are always proportional. Hence, the compliance formula in Proposition 2 is well-defined whenever there is an IIA violation in ρ_AI, i.e., Δ_xy(S, T | ρ_AI) ≠ 0 automatically guarantees a non-zero denominator. Second, the expression derived for composite instability, Φ_xy(S, T | ρ_AI, ρ_H) = (1 − α)Φ_xy(S, T | ρ_H, ρ_A), shows that the composite instability of ρ_AI and ρ_H is always zero if and only if the AI is either fully aligned or fully compliant. Since ρ_AI = ρ_H in both cases, it follows that the expression for the compliance parameter is valid as long as ρ_AI ≠ ρ_H.

Corollary 1. If (ρ_AI, ρ_H) is consistent with LAM, then for all tuples (x, y, S, T) with x, y ∈ S ∩ T,

    Δ_xy(S, T | ρ_AI) = α · Φ_xy(S, T | ρ_AI, ρ_H).

Moreover, the compliance parameter α is uniquely identified by this relationship as long as ρ_AI ≠ ρ_H.
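An end-to-end sketch of Proposition 2 (Python; the ground-truth parameters are hypothetical and used only to synthesize LAM-consistent data):

```python
from itertools import combinations

# Hypothetical ground truth used to synthesize data consistent with LAM.
u = {"x": 1.0, "y": 2.0, "z": 4.0}
v = {"x": 1.0, "y": 0.5, "z": 0.25}
alpha_true = 0.7

luce = lambda w, a, S: w[a] / sum(w[b] for b in S)
menus = [frozenset(c) for r in (2, 3) for c in combinations(("x", "y", "z"), r)]
rho_H  = {(a, S): luce(u, a, S) for S in menus for a in S}
rho_AI = {(a, S): alpha_true * luce(u, a, S) + (1 - alpha_true) * luce(v, a, S)
          for S in menus for a in S}

delta = lambda r, x, y, S, T: r[(x, S)] * r[(y, T)] - r[(y, S)] * r[(x, T)]
phi = lambda x, y, S, T: (rho_AI[(x, S)] * rho_H[(y, T)] - rho_AI[(y, S)] * rho_H[(x, T)]
                        + rho_H[(x, S)] * rho_AI[(y, T)] - rho_H[(y, S)] * rho_AI[(x, T)])

# Proposition 2: alpha equals Delta/Phi at every tuple where rho_AI violates IIA.
for S in menus:
    for T in menus:
        for x, y in combinations(sorted(S & T), 2):
            d = delta(rho_AI, x, y, S, T)
            if abs(d) > 1e-12:
                print(sorted(S), sorted(T), round(d / phi(x, y, S, T), 10))  # 0.7
```

Every violating tuple returns the same ratio, illustrating the proportionality in Corollary 1.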
The last step in identification is to recover the AI's utility function v.

Step 3: Recover v from ρ_AI and ρ_H. As before, if ρ_AI = ρ_H, then we cannot distinguish full compliance (α = 1) from full alignment (v = λu for λ > 0). Alternatively, if ρ_AI ≠ ρ_H, then Step 2 allows us to uniquely identify the compliance parameter α. Let α denote the recovered compliance level. Using the fact that ρ_AI(x, S) = α·ρ_H(x, S) + (1 − α)·ρ_A(x, S), we can construct ρ_A as

    ρ_A(x, S) = [ρ_AI(x, S) − α·ρ_H(x, S)] / (1 − α).

Under LAM, ρ_A is generated by the Luce rule with the utility function v. We can use this to construct v from ρ_A as in Step 1.

Theorem 1 summarizes the identification results in the laboratory setting.

Theorem 1 (Laboratory Identification). Let (ρ_AI, ρ_H) be consistent with LAM.

1. If ρ_AI ≠ ρ_H, then α is uniquely identified, and u and v are uniquely identified up to scale normalization.

2. If ρ_AI = ρ_H, then α and v are not separately identified and only u is uniquely identified up to scale normalization.

Proof. The proof follows from the three identification steps and the results established in this section. ∎

The following example illustrates the identification result.

Example 1. Consider X = {x, y, z} and suppose we observe ρ_AI and ρ_H given as follows.

    Agent   Option   {x,y,z}   {x,y}   {x,z}   {y,z}
    ρ_AI    x        1/3       7/15    1/2     –
            y        1/3       8/15    –       8/15
            z        1/3       –       1/2     7/15
    ρ_H     x        1/2       3/5     3/4     –
            y        1/3       2/5     –       2/3
            z        1/6       –       1/4     1/3

    Table 1: Observed choice probabilities in Example 1

Normalizing u(x) = 1, we can infer from ρ_H that u(y) = 2/3 and u(z) = 1/3. To recover α, we construct the two instability measures. Let S = {x, y, z} and T = {x, y}. We first compute the own instability of ρ_AI for the tuple (x, y, S, T):

    Δ_xy(S, T | ρ_AI) = ρ_AI(x, S)ρ_AI(y, T) − ρ_AI(x, T)ρ_AI(y, S) = (1/3)·(8/15) − (7/15)·(1/3) = 1/45.

Next, we compute the composite instability between ρ_AI and ρ_H:

    Φ_xy(S, T | ρ_AI, ρ_H) = ρ_AI(x, S)ρ_H(y, T) + ρ_H(x, S)ρ_AI(y, T) − ρ_AI(x, T)ρ_H(y, S) − ρ_H(x, T)ρ_AI(y, S)
                           = (1/3)·(2/5) + (1/2)·(8/15) − (7/15)·(1/3) − (3/5)·(1/3)
                           = 2/15 + 4/15 − 7/45 − 3/15 = 2/45.

By Proposition 2, the compliance parameter is uniquely identified as α = (1/45)/(2/45) = 1/2. We could similarly recover α using the tuple (y, z, {x, y, z}, {y, z}) instead. Note, however, that we cannot use the tuple (x, z, {x, y, z}, {x, z}), as ρ_AI satisfies IIA for this tuple. This highlights that while ρ_AI ≠ ρ_H implies α is uniquely identified, not all tuples can be used for identification. Using α = 1/2, we can construct ρ_A as follows:

    Agent   Option   {x,y,z}   {x,y}   {x,z}   {y,z}
    ρ_A     x        1/6       1/3     1/4     –
            y        1/3       2/3     –       2/5
            z        1/2       –       3/4     3/5

    Table 2: Recovered autonomous AI stochastic choice ρ_A

Normalizing v(x) = 1, we can infer from ρ_A that v(y) = 2 and v(z) = 3. Hence, we have

    u = (1, 2/3, 1/3),   v = (1, 2, 3),   α = 1/2.

Note that u and v induce completely opposite ordinal rankings, revealing a high degree of misalignment.
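A quick computational check of Example 1 (a sketch using exact fractions; the probabilities are transcribed from Table 1):

```python
from fractions import Fraction as F

S, T = frozenset("xyz"), frozenset("xy")
rho_AI = {("x", S): F(1, 3), ("y", S): F(1, 3), ("x", T): F(7, 15), ("y", T): F(8, 15)}
rho_H  = {("x", S): F(1, 2), ("y", S): F(1, 3), ("x", T): F(3, 5),  ("y", T): F(2, 5)}

# Proposition 2 on the tuple (x, y, S, T).
d = rho_AI[("x", S)] * rho_AI[("y", T)] - rho_AI[("y", S)] * rho_AI[("x", T)]
p = (rho_AI[("x", S)] * rho_H[("y", T)] - rho_AI[("y", S)] * rho_H[("x", T)]
   + rho_H[("x", S)] * rho_AI[("y", T)] - rho_H[("y", S)] * rho_AI[("x", T)])
alpha = d / p
print(alpha)                                   # 1/2

# Step 3: back out rho_A and read off v on the grand menu (normalize v(x) = 1).
rho_A = {k: (rho_AI[k] - alpha * rho_H[k]) / (1 - alpha) for k in rho_AI}
v = {a: rho_A[(a, S)] / rho_A[("x", S)] for a in "xy"}
print(rho_A[("x", S)], rho_A[("y", S)], v)     # 1/6, 1/3, and v("y") = 2
```

Extending the same ratio to z recovers v(z) = 3, matching the example.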
3.2 Axiomatic Characterization

In this section, I provide an axiomatic characterization for the Luce Alignment Model taking (ρ_AI, ρ_H) as the primitive. The first two axioms are standard. Axiom 1 requires that ρ(x, S) is strictly positive for any x ∈ S ⊆ X and ρ ∈ {ρ_AI, ρ_H}. Axiom 2 requires that ρ_H satisfies IIA, ensuring that the human principal's behavior is consistent with the Luce rule.

Axiom 1 (Positivity). For any ρ ∈ {ρ_AI, ρ_H} and x ∈ S ⊆ X, we have ρ(x, S) > 0.

Axiom 2 (H-IIA). ρ_H satisfies IIA.

Axiom 3 is the key axiom in ensuring that the AI compliance parameter can be identified. It requires that the own instability of ρ_AI and the composite instability of ρ_AI and ρ_H are proportional: any change in composite instability from one tuple to another must be proportionally reflected by a change in the AI's own instability.

Axiom 3 (Proportionality). For any two tuples (x, y, S, T) and (z, t, S′, T′) with x, y ∈ S ∩ T and z, t ∈ S′ ∩ T′,

    Δ_xy(S, T | ρ_AI) · Φ_zt(S′, T′ | ρ_AI, ρ_H) = Δ_zt(S′, T′ | ρ_AI) · Φ_xy(S, T | ρ_AI, ρ_H).

Axiom 4 requires that the AI's own instability always shares the same sign as the composite instability, and that the own instability is bounded by the composite instability. To get an intuition for this axiom, consider the case Δ_xy(S, T | ρ_AI) > 0. This implies that if we use the AI's relative choice probabilities for x and y in S to impute its relative choice probabilities in T, this will lead to an overestimation of x versus y. The first part of the axiom then requires that the composite instability must also be strictly positive: using the AI's and the human's relative choice probabilities in S to cross-impute the relative choice probabilities in T will also lead to an overestimation of x in aggregate. In addition, the axiom requires that the aggregate cross-imputation error must be larger than the own imputation error. Together with Axiom 3, this property ensures that the compliance parameter is uniquely identified and bounded between zero and one.

Axiom 4 (Bounded Instability). For any tuple (x, y, S, T) with x, y ∈ S ∩ T,

    Δ_xy(S, T | ρ_AI) · Φ_xy(S, T | ρ_AI, ρ_H) ≥ 0   and   |Δ_xy(S, T | ρ_AI)| ≤ |Φ_xy(S, T | ρ_AI, ρ_H)|,

where both inequalities hold strictly if Δ_xy(S, T | ρ_AI) ≠ 0.

The last axiom bounds the divergence between the AI's and the human's stochastic choices. Fixing the own and composite instability measures and the human's stochastic choices, it provides a lower bound on the AI's stochastic choices.

Axiom 5 (Bounded Divergence). For any tuple (x, y, S, T) with x, y ∈ S ∩ T, menu U, and alternative z ∈ U,

    ρ_AI(z, U) · |Φ_xy(S, T | ρ_AI, ρ_H)| ≥ ρ_H(z, U) · |Δ_xy(S, T | ρ_AI)|.

Moreover, if Δ_xy(S, T | ρ_AI) ≠ 0, the inequality is strict.

To interpret this axiom, suppose the instability measures are strictly positive. The axiom then requires that

    ρ_AI(z, U) / ρ_H(z, U) > Δ_xy(S, T | ρ_AI) / Φ_xy(S, T | ρ_AI, ρ_H).

The right-hand side of the above inequality gives us the relative imputation error: it tells us the proportion of the aggregate cross-imputation error that can be explained by the AI's own imputation error. The axiom then requires that the higher the relative imputation error is, the more the AI's choice probabilities are constrained to track the human's choice probabilities.
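Because Axioms 3–5 quantify over finitely many observable tuples, consistency with LAM is directly testable on finite data. A rough sketch of such a test (the helper name is_LAM_consistent is my own; strictness of the inequalities is not enforced here, and Axioms 1–2 are assumed checked separately):

```python
from itertools import combinations

def is_LAM_consistent(rho_AI, rho_H, menus, tol=1e-12):
    """Enumerate Axioms 3-5 on finite stochastic choice data."""
    delta = lambda r, x, y, S, T: r[(x, S)] * r[(y, T)] - r[(y, S)] * r[(x, T)]
    phi = lambda x, y, S, T: (rho_AI[(x, S)] * rho_H[(y, T)] - rho_AI[(y, S)] * rho_H[(x, T)]
                            + rho_H[(x, S)] * rho_AI[(y, T)] - rho_H[(y, S)] * rho_AI[(x, T)])
    tuples = [(x, y, S, T) for S in menus for T in menus
              for x, y in combinations(sorted(S & T), 2)]
    for (x, y, S, T) in tuples:
        d, p = delta(rho_AI, x, y, S, T), phi(x, y, S, T)
        if d * p < -tol or abs(d) > abs(p) + tol:             # Axiom 4
            return False
        if any(rho_AI[(z, U)] * abs(p) < rho_H[(z, U)] * abs(d) - tol
               for U in menus for z in U):                    # Axiom 5
            return False
        if any(abs(d * phi(*t2) - delta(rho_AI, *t2) * p) > tol
               for t2 in tuples):                             # Axiom 3
            return False
    return True
```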
Theorem 2 establishes that Axioms 1–5 are necessary and sufficient for a LAM representation. An interesting feature of this characterization is that while there is an explicit axiom imposing the IIA property on the human's choices ρ_H, there is no equivalent axiom for the autonomous AI's stochastic choices ρ_A. Instead, the proof of the theorem shows that the IIA property for ρ_A is jointly implied by the axioms.

Theorem 2 (Laboratory Characterization). The pair (ρ_AI, ρ_H) satisfies Axioms 1–5 if and only if it is consistent with LAM.

Proof. Necessity. Axiom 1 follows from the assumption that u and v are strictly positive in LAM. Axiom 2 follows from the fact that ρ_H is consistent with the Luce rule. Axioms 3 and 4 follow from Corollary 1, which shows that Δ_xy(S, T | ρ_AI) = α·Φ_xy(S, T | ρ_AI, ρ_H). For Axiom 5, if Δ_xy(S, T | ρ_AI) = 0, then the inequality follows trivially. Alternatively, if Δ_xy(S, T | ρ_AI) ≠ 0, then α must be strictly less than 1, as α = 1 would imply ρ_AI = ρ_H and hence Δ_xy(S, T | ρ_AI) = 0. Therefore,

    ρ_AI(z, U) − α·ρ_H(z, U) = (1 − α)·ρ_A(z, U) > 0  ⟹  ρ_AI(z, U) > α·ρ_H(z, U).

Substituting the result α = Δ_xy(S, T | ρ_AI)/Φ_xy(S, T | ρ_AI, ρ_H) from the previous section into the above inequality yields the axiom.

Sufficiency. Axioms 1 and 2 imply that ρ_H is consistent with the Luce rule with some utility function u that is strictly positive. There are two cases to consider.

First, suppose Δ_xy(S, T | ρ_AI) = 0 for all (x, y, S, T) with x, y ∈ S ∩ T. Then, by Remark 1, ρ_AI is consistent with the Luce rule with some strictly positive utility function v. In this case, (u, v, α = 0) is a LAM representation of (ρ_AI, ρ_H). If, in addition, Φ_xy(S, T | ρ_AI, ρ_H) = 0 for all (x, y, S, T) with x, y ∈ S ∩ T, then we must have v = λu for some λ > 0 and α can be arbitrary.

Next, suppose Δ_xy(S, T | ρ_AI) ≠ 0 for some (x, y, S, T) with x, y ∈ S ∩ T. By Axiom 4,

    Δ_xy(S, T | ρ_AI) ≠ 0  ⟹  Φ_xy(S, T | ρ_AI, ρ_H) ≠ 0.

Hence, by Axiom 3, the ratio

    Δ_xy(S, T | ρ_AI) / Φ_xy(S, T | ρ_AI, ρ_H)

is constant for all such tuples (x, y, S, T). Let α denote the above ratio. Axiom 4 guarantees that α ∈ (0, 1). We next define ρ_A by

    ρ_A(x, S) = [ρ_AI(x, S) − α·ρ_H(x, S)] / (1 − α).

Since the requirements ρ_A(x, S) = 0 for x ∉ S and Σ_{x∈S} ρ_A(x, S) = 1 hold, ρ_A is a valid stochastic choice function. In addition, by Axiom 5, the numerator is strictly positive for any x ∈ S, so that ρ_A(x, S) > 0. Rearranging the above equation yields

    ρ_AI(x, S) = α·ρ_H(x, S) + (1 − α)·ρ_A(x, S).

To conclude the proof of the theorem, we only need to show that ρ_A is consistent with the Luce rule. By Remark 1, it is sufficient to show that Δ_xy(S, T | ρ_A) = 0 for all tuples (x, y, S, T) with x, y ∈ S ∩ T. Let such a tuple be given. As shown in the proof of Proposition 2, substituting ρ_AI = α·ρ_H + (1 − α)·ρ_A back into the own and composite instability measures yields

    Δ_xy(S, T | ρ_AI) = α²Δ_xy(S, T | ρ_H) + (1 − α)²Δ_xy(S, T | ρ_A) + α(1 − α)Φ_xy(S, T | ρ_H, ρ_A)

and

    Φ_xy(S, T | ρ_AI, ρ_H) = 2αΔ_xy(S, T | ρ_H) + (1 − α)Φ_xy(S, T | ρ_H, ρ_A).

By Axioms 1–2 and Remark 1, Δ_xy(S, T | ρ_H) = 0. Hence, combining the last two expressions, we have

    Δ_xy(S, T | ρ_AI) = (1 − α)²Δ_xy(S, T | ρ_A) + α(1 − α)Φ_xy(S, T | ρ_H, ρ_A)
                      = (1 − α)²Δ_xy(S, T | ρ_A) + α·Φ_xy(S, T | ρ_AI, ρ_H).
Consider a tuple (z, t, S′, T′) with z, t ∈ S′ ∩ T′ that satisfies

    α = Δ_zt(S′, T′ | ρ_AI) / Φ_zt(S′, T′ | ρ_AI, ρ_H).

Substituting this into the last term of the above expression and cross-multiplying, we get

    Δ_xy(S, T | ρ_AI)·Φ_zt(S′, T′ | ρ_AI, ρ_H) = (1 − α)²·Δ_xy(S, T | ρ_A)·Φ_zt(S′, T′ | ρ_AI, ρ_H) + Δ_zt(S′, T′ | ρ_AI)·Φ_xy(S, T | ρ_AI, ρ_H).

Since α ∈ (0, 1), we must have Φ_zt(S′, T′ | ρ_AI, ρ_H) ≠ 0. In addition, by Axiom 3,

    Δ_xy(S, T | ρ_AI)·Φ_zt(S′, T′ | ρ_AI, ρ_H) = Δ_zt(S′, T′ | ρ_AI)·Φ_xy(S, T | ρ_AI, ρ_H).

Therefore, the above expression can hold only if Δ_xy(S, T | ρ_A) = 0. Since the tuple (x, y, S, T) was arbitrary, ρ_A is consistent with the Luce rule, as desired. This concludes the proof of the theorem, as we have shown that ρ_AI is a mixture of two Luce rules, where one of the mixing parts is ρ_H. ∎

4 Field Data

In this section, I study the Luce Alignment Model when only the AI's choices ρ_AI are observable. This setting is important for two reasons. First, while laboratory data may be readily available, the volume of field data is expected to be much larger, which can enable richer inference about AI behavior. Second, AI behavior in the two settings may differ systematically: a sufficiently sophisticated AI may appear compliant in a monitored laboratory setting while reverting to its autonomous preferences in the field, a phenomenon known as deceptive alignment (Greenblatt et al., 2024). Comparing recovered compliance parameters across the two settings can provide a measure of deceptive alignment.

4.1 Identification

The identification problem in the field setting faces an inherent challenge: if (u, v, α) is a LAM representation of ρ_AI, then so is (v, u, 1 − α). That is, even if the two utility functions underlying LAM can be recovered, the data alone cannot reveal which belongs to the human principal and which to the AI agent. The utilities can therefore be identified only up to a label swap, and the compliance parameter only up to reflection about 1/2. Note, however, that the distribution over utilities may still be uniquely identified.

Furthermore, if ρ_AI satisfies IIA, then the observed choice behavior can be consistent with any alignment and compliance levels: we can have either (i) v = λu for some λ > 0 with arbitrary α ∈ [0, 1], or (ii) α ∈ {0, 1} with v ≠ λu for all λ > 0. Hence, the identification problem is interesting only if ρ_AI violates IIA.

The key to the identification in this section will be the cross instability measure Γ_xy(S, T | ρ_AI, ρ) for ρ ∈ {ρ_H, ρ_A}, defined in Definition 2. The next proposition provides a formula for the cross instability measures in terms of (u, v, α).

Proposition 3. Suppose ρ_AI is consistent with LAM with parameters (u, v, α), and let ρ_H and ρ_A be the corresponding Luce rules. Then,

    Γ_xy(S, T | ρ_AI, ρ_H) = (1 − α)·Γ_xy(S, T | ρ_A, ρ_H) = (1 − α) · [u(y)v(x) − u(x)v(y)] / [u(T)v(S)]

and

    Γ_xy(S, T | ρ_AI, ρ_A) = α·Γ_xy(S, T | ρ_H, ρ_A) = α · [u(x)v(y) − u(y)v(x)] / [u(S)v(T)].
Proof. Substituting ρ_AI(x, S) = α·ρ_H(x, S) + (1 − α)·ρ_A(x, S) into the definition of cross instability,

    Γ_xy(S, T | ρ_AI, ρ_H) = ρ_AI(x, S)ρ_H(y, T) − ρ_AI(y, S)ρ_H(x, T)
                           = α·[ρ_H(x, S)ρ_H(y, T) − ρ_H(y, S)ρ_H(x, T)] + (1 − α)·[ρ_A(x, S)ρ_H(y, T) − ρ_A(y, S)ρ_H(x, T)]
                           = α·Δ_xy(S, T | ρ_H) + (1 − α)·Γ_xy(S, T | ρ_A, ρ_H).

Since ρ_H is consistent with the Luce rule, Δ_xy(S, T | ρ_H) = 0. Combining this with the result in Remark 1, we get the first identity. The second identity follows analogously by expanding Γ_xy(S, T | ρ_AI, ρ_A) and using Δ_xy(S, T | ρ_A) = 0. ∎

The next proposition provides a key equation that will be used in the identification of u and v from ρ_AI.

Proposition 4. Suppose ρ_AI is consistent with LAM with parameters (u, v, α), and let ρ_H and ρ_A be the corresponding Luce rules. Let ρ ∈ {ρ_H, ρ_A}, S = {x, y, z, t}, and T = {x, y}, where x, y, z, t are four distinct alternatives, and assume the associated cross instabilities are non-zero. Then,

    1/Γ_xy(S, T | ρ_AI, ρ) + 1/Γ_xy(T, T | ρ_AI, ρ) = 1/Γ_xy(S \ t, T | ρ_AI, ρ) + 1/Γ_xy(S \ z, T | ρ_AI, ρ).

Proof. Consider the case ρ = ρ_H. Letting S = {x, y, z, t} and T = {x, y}, we know from Proposition 3 that

    Γ_xy(S′, T | ρ_AI, ρ_H) = (1 − α) · [u(y)v(x) − u(x)v(y)] / [u(T)v(S′)]

for any menu S′ ⊇ T. Hence,

    1/Γ_xy(S′, T | ρ_AI, ρ_H) = u(T)v(S′) / [(1 − α)(u(y)v(x) − u(x)v(y))] = v(S′) / [(1 − α)(u(y)v(x) − u(x)v(y))/u(T)].

Notice that the denominator is independent of S′. Hence, the result holds as long as v(S) + v(T) = v(S \ t) + v(S \ z). This holds trivially, since both sides of the equation evaluate to 2v(x) + 2v(y) + v(z) + v(t). The case ρ = ρ_A follows analogously, with u replacing v and vice versa. ∎
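The reciprocal identity is easy to verify numerically. A small sketch with hypothetical LAM parameters:

```python
# Hypothetical LAM parameters on X = {x, y, z, t}.
u = {"x": 1.0, "y": 2.0, "z": 4.0, "t": 5.0}
v = {"x": 1.0, "y": 0.8, "z": 0.4, "t": 0.2}
alpha = 0.75

luce = lambda w, a, S: w[a] / sum(w[b] for b in S)
rho_AI = lambda a, S: alpha * luce(u, a, S) + (1 - alpha) * luce(v, a, S)
rho_H = lambda a, S: luce(u, a, S)

def gamma(a, b, S, T):
    """Cross instability from rho_AI to rho_H on the tuple (a, b, S, T)."""
    return rho_AI(a, S) * rho_H(b, T) - rho_AI(b, S) * rho_H(a, T)

S, T = ("x", "y", "z", "t"), ("x", "y")
lhs = 1 / gamma("x", "y", S, T) + 1 / gamma("x", "y", T, T)
rhs = 1 / gamma("x", "y", ("x", "y", "z"), T) + 1 / gamma("x", "y", ("x", "y", "t"), T)
print(abs(lhs - rhs) < 1e-9)  # True: the reciprocal identity of Proposition 4
```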
We will use this result to identify both u and v. The identification strategy proceeds in three steps.

Step 1: Recover u(y) and v(y) from ρ_AI for each y ∈ X. The identification of utility functions in the field setting involves two separate steps. First, I show how the candidate utility values for each alternative can be identified. In Step 3, I combine the prior two steps to identify the overall utility functions. I start with the identification of u(y) for each y ∈ X; the same process also works for v(y). Assume X contains at least four alternatives, and let u(x) = 1 for some x ∈ X. Since utility functions are identified only up to scale normalization, this is without loss. For any y ≠ x, notice that

    ρ_H(x, {x, y}) = 1/(1 + u(y))   and   ρ_H(y, {x, y}) = u(y)/(1 + u(y)).

Pick two other alternatives z and t distinct from x and y, and let S = {x, y, z, t}, T = {x, y}, and T ⊆ S′ ⊆ S. We have

    Γ_xy(S′, T | ρ_AI, ρ_H) = ρ_AI(x, S′)ρ_H(y, T) − ρ_AI(y, S′)ρ_H(x, T)
                            = ρ_AI(x, S′)·u(y)/(1 + u(y)) − ρ_AI(y, S′)·1/(1 + u(y))
                            = [ρ_AI(x, S′)u(y) − ρ_AI(y, S′)] / (1 + u(y)).

Since ρ_AI is observed, this is an equation in terms of one unknown, u(y). There are two cases to consider.

Case 1: ρ_AI(x, S′)u(y) ≠ ρ_AI(y, S′) for all S′ with T ⊆ S′ ⊆ S. This ensures that Γ_xy(S′, T | ρ_AI, ρ_H) ≠ 0. Utilizing Proposition 4 with ρ = ρ_H and canceling the common (1 + u(y)) terms, we get

    1/[ρ_AI(x, S)u(y) − ρ_AI(y, S)] + 1/[ρ_AI(x, T)u(y) − ρ_AI(y, T)] = 1/[ρ_AI(x, S \ t)u(y) − ρ_AI(y, S \ t)] + 1/[ρ_AI(x, S \ z)u(y) − ρ_AI(y, S \ z)].

Cross-multiplying, we get a cubic polynomial in terms of the unknown u(y). Normalizing v(x) = 1 and re-deriving Proposition 4 with ρ = ρ_A instead of ρ_H, we deduce that v(y) must also satisfy the same polynomial, provided that ρ_AI(x, S′)v(y) ≠ ρ_AI(y, S′) for all S′ with T ⊆ S′ ⊆ S.

Case 2: ρ_AI(x, S′)u(y) = ρ_AI(y, S′) for some S′ with T ⊆ S′ ⊆ S. Equivalently,

    ρ_AI(x, S′)/ρ_AI(y, S′) = u(x)/u(y) = ρ_H(x, S′)/ρ_H(y, S′),

where the first equality is due to u(x) = 1. Since we are assuming ρ_AI violates IIA, we cannot have α = 1 by Proposition 1. Therefore, the above equality is possible only if

    ρ_H(x, S′)/ρ_H(y, S′) = ρ_A(x, S′)/ρ_A(y, S′)  ⟹  u(x)/u(y) = v(x)/v(y).

But then the ratio ρ_AI(y, ·)/ρ_AI(x, ·) must be constant and equal to u(y) for all menus. Normalizing v(x) = 1, we also get u(y) = v(y). Note that in this case the polynomial formed by cross-multiplying the equation in Case 1 will either yield the utilities u(y) and v(y) as a unique root or the polynomial will be identically zero. If the polynomial is identically zero, then we can generically infer that we are in Case 2, which trivially recovers u(y) and v(y) as ρ_AI(y, {x, y})/ρ_AI(x, {x, y}).²

Proposition 5 (Identification of u(y) and v(y)). Suppose ρ_AI is consistent with LAM with (u, v, α) such that u(x) = v(x) = 1, and suppose ρ_AI violates IIA. For any y ≠ x, let P(κ_y) be the cubic polynomial obtained by cross-multiplying the equation

    1/[ρ_AI(x, S)κ_y − ρ_AI(y, S)] + 1/[ρ_AI(x, T)κ_y − ρ_AI(y, T)] = 1/[ρ_AI(x, S \ t)κ_y − ρ_AI(y, S \ t)] + 1/[ρ_AI(x, S \ z)κ_y − ρ_AI(y, S \ z)],    (3)

where S = {x, y, z, t}, T = {x, y}, and z, t are two alternatives distinct from x, y. If P(κ_y) is not identically zero, then u(y) and v(y) are both roots of P(κ_y) and admissible solutions to equation (3). Otherwise, u(y) = v(y) = ρ_AI(y, {x, y})/ρ_AI(x, {x, y}) holds generically.

Proof. The proof follows from the arguments preceding the proposition. ∎

² A result is said to hold generically if it fails only on a measure-zero subset of the underlying parameter space.
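The construction in Proposition 5 is mechanical; a minimal sketch (the helper name candidate_utilities is my own, and the admissibility filter implements the requirement that a root actually solve equation (3)):

```python
import numpy as np

def candidate_utilities(rho_AI, x, y, z, t, tol=1e-9):
    """Real, admissible roots of the cubic P(kappa_y) from equation (3);
    rho_AI maps (alternative, menu-as-frozenset) to a choice probability."""
    menus = [frozenset({x, y, z, t}), frozenset({x, y}),
             frozenset({x, y, z}), frozenset({x, y, t})]
    # Linear factors L_i(kappa) = rho_AI(x, S_i)*kappa - rho_AI(y, S_i).
    L = [np.poly1d([rho_AI[(x, S)], -rho_AI[(y, S)]]) for S in menus]
    # Cross-multiplied form of (3): L2*L3*L4 + L1*L3*L4 - L1*L2*L4 - L1*L2*L3 = 0.
    P = L[1]*L[2]*L[3] + L[0]*L[2]*L[3] - L[0]*L[1]*L[3] - L[0]*L[1]*L[2]
    if max(abs(c) for c in P.coeffs) < tol:
        return None  # identically zero: Case 2, u(y) = v(y) = ratio on {x, y}
    real = [r.real for r in P.roots if abs(r.imag) < tol]
    # Drop roots that make a denominator of (3) vanish (not admissible).
    return [r for r in real if all(abs(Li(r)) > tol for Li in L)]
```

On the data of Example 2 below, applying this to y (with reference pair z, t) returns the three roots 2, 263/196, and 4/5, while for t the spurious root κ_t = 2 is filtered out because it makes two of the linear factors vanish.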
There are two important points to consider regarding this result. First, while the model has two unknown utility values u(y) and v(y) for each alternative y, the derived cubic polynomial P(κ_y) generically has three distinct roots. Hence, solving the polynomial may yield a spurious root that is not a true utility value. However, since equation (3) must hold for any reference pair z and t distinct from x and y, and the spurious root will typically vary depending on the chosen reference pair, if the analyst has access to a fifth alternative, re-deriving the polynomial using a different reference pair will generically isolate the true utility values. Thus, as long as |X| ≥ 5, both u(y) and v(y) are generically identified up to scale normalization and label swaps. In addition, as detailed in Step 2, this identification procedure can be improved to require only |X| ≥ 4.

Second, note that successfully identifying the true candidate pair {u(y), v(y)} for each alternative y ∈ X does not fully pin down the utility functions u and v. To illustrate, consider three alternatives x, y, z and normalize u(x) = v(x) = 1. Suppose we have recovered candidate utility pairs {κ¹_y, κ²_y} and {κ¹_z, κ²_z}. Since u(y) and u(z) can be either of these utility values, this leaves us with four candidate utility functions u: (1, κ¹_y, κ¹_z), (1, κ¹_y, κ²_z), (1, κ²_y, κ¹_z), or (1, κ²_y, κ²_z). Generalizing this insight, for |X| = N, identifying candidate utility pairs for each alternative still leaves us with 2^{N−1} candidate utility functions. To resolve this problem, we first need to recover the compliance parameter α, as illustrated in the next step.

Step 2: Recover α from ρ_AI. Following Step 1, suppose we have a candidate utility pair {u(y), v(y)} for each alternative y ∈ X \ {x} and assume u(x) = v(x) = 1. Construct the associated Luce choice probabilities ρ_u(x, {x, y}) and ρ_v(x, {x, y}) for each y ≠ x. If {u(y), v(y)} is the true utility pair up to a label swap, then the observed AI stochastic choice function ρ_AI must satisfy one of the following equations:

    ρ_AI(x, {x, y}) = α·ρ_u(x, {x, y}) + (1 − α)·ρ_v(x, {x, y}),
    ρ_AI(x, {x, y}) = (1 − α)·ρ_u(x, {x, y}) + α·ρ_v(x, {x, y}),

where α is the compliance parameter. Hence, for each alternative y ≠ x and each candidate utility pair, we get two possible candidates for the compliance parameter. For the true utility pair, this identifies the compliance parameter up to reflection about 1/2.

Step 1 generically recovers the true utility pair for an alternative when |X| ≥ 5. Alternatively, suppose we have three candidate utility pairs for an alternative after solving the cubic polynomial. Note that any candidate utility pair for an alternative that is not the true utility pair will imply a compliance parameter that will not be generically validated by the candidate utility pairs for other alternatives. Hence, by ensuring the consistency of the implied compliance parameter across different alternatives, we can identify the true utility pair. Adopting this approach, we only need |X| ≥ 4, which improves upon the procedure in Step 1. Lastly, note that while the compliance parameter is identified up to reflection about 1/2, the distribution over utilities is generically uniquely identified.
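Solving the binary-menu equation above for α is immediate; a short sketch (the helper name implied_alpha is my own, and the numbers reappear in Example 2 below):

```python
def implied_alpha(p_xy, kappa1, kappa2):
    """Compliance candidates implied by a utility pair {kappa1, kappa2} for y,
    where p_xy = rho_AI(x, {x, y}) and u(x) = v(x) = 1. The two label
    assignments give the pair {a, 1 - a}; assumes kappa1 != kappa2."""
    p_u, p_v = 1 / (1 + kappa1), 1 / (1 + kappa2)
    a = (p_xy - p_v) / (p_u - p_v)
    return a, 1 - a

# With rho_AI(x, {x, y}) = 7/18 and candidate pair {2, 4/5}:
print(implied_alpha(7/18, 2, 4/5))  # (0.75, 0.25), i.e. alpha in {3/4, 1/4}
```

Candidate pairs whose implied {α, 1 − α} fail to agree across alternatives are discarded, which is exactly the consistency filter described above.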
Step 3: Recover u and v from ρ_AI. Following Steps 1 and 2, for each alternative y ∈ X \ {x}, we can generically identify the true utility pair {u(y), v(y)} up to a label swap. However, as discussed in Step 1, identifying the true pair for each alternative does not by itself determine which value belongs to u and which belongs to v across alternatives. To resolve the remaining ambiguity, fix an alternative y ∈ X \ {x} and suppose we assign u(y) = κ¹_y. By Step 2, this assignment implies a candidate compliance parameter α_u. Now consider any other alternative z ∈ X \ {x, y} with the utility pair {κ¹_z, κ²_z}. Since the true utility function must generate the same compliance parameter across all alternatives, we can pin down the true assignment for u(z) by requiring consistency with α_u. Generically, unless α_u ∈ {0, 1/2, 1}, exactly one of the two candidate values for u(z) will be consistent with α_u. We can then repeat this procedure for all alternatives to recover the utility functions u and v up to a label swap.

Combining all the results in this section, we have the following identification result in the field setting.

Theorem 3 (Field Identification). Suppose ρ_AI is consistent with LAM and |X| ≥ 4.

1. If ρ_AI violates IIA, then (u, v, α) are generically identified up to a label swap and scale normalization.

2. If ρ_AI satisfies IIA, then either v = λu for some λ > 0 or α ∈ {0, 1}.

Proof. The proof follows from the three identification steps and the results established in this section. ∎

The next example illustrates the field identification result.

Example 2. Suppose X = {x, y, z, t} and the AI stochastic choice data ρ_AI is generated by the parameters

    u = (1, 2, 4, 5),   v = (1, 4/5, 2/5, 1/5),   α = 3/4,

as given in the following table.

    ρ_AI(·,·)  {x,y,z,t}  {x,y,z}  {x,y,t}  {x,z,t}  {y,z,t}  {x,y}  {x,z}  {x,t}  {y,z}  {y,t}  {z,t}
    x          1/6        17/77    7/32     37/160   –        7/18   23/70  1/3    –      –      –
    y          5/24       47/154   23/80    –        43/154   11/18  –      –      5/12   29/70  –
    z          7/24       73/154   –        29/80    53/154   –      47/70  –      7/12   –      1/2
    t          1/3        –        79/160   13/32    29/77    –      –      2/3    –      41/70  1/2

    Table 3: AI stochastic choice data in Example 2

To proceed with identification, we first normalize u(x) = v(x) = 1. Consider the alternative y. Equation (3) corresponding to y with S = {x, y, z, t} and T = {x, y} is given by

    1/((1/6)κ_y − 5/24) + 1/((7/18)κ_y − 11/18) = 1/((17/77)κ_y − 47/154) + 1/((7/32)κ_y − 23/80),

which simplifies to

    24/(4κ_y − 5) + 18/(7κ_y − 11) = 154/(34κ_y − 47) + 160/(35κ_y − 46).

Cross-multiplying yields a cubic polynomial in κ_y with roots κ¹_y = 2, κ²_y = 4/5, and κ³_y = 263/196. Thus, there are three possible utility pairs: {2, 4/5}, {2, 263/196}, {4/5, 263/196}. For each utility pair, we can use the equation

    ρ_AI(x, {x, y}) = α·ρ_u(x, {x, y}) + (1 − α)·ρ_v(x, {x, y})

to recover the implied compliance parameter up to reflection about 1/2. This yields the following table:

    Utility pair {u(y), v(y)}    Implied α              Feasibility
    {2, 4/5}                     {3/4, 1/4}             ✓
    {2, 263/196}                 {35/86, 51/86}         ✓
    {4/5, 263/196}               {−35/118, 153/118}     ×

At this stage, there are two feasible candidate utility pairs for y, inducing two feasible values of α up to reflection about 1/2. We can eliminate one of them by considering the alternative z or t. Repeating the procedure for the alternative z with T = {x, z} gives

    24/(4κ_z − 7) + 70/(23κ_z − 47) = 154/(34κ_z − 73) + 160/(37κ_z − 58),

with roots κ¹_z = 4, κ²_z = 2/5, and κ³_z = 481/244. The implied values of α are:

    Utility pair {u(z), v(z)}    Implied α              Feasibility
    {4, 2/5}                     {3/4, 1/4}             ✓
    {4, 481/244}                 {9/154, 145/154}       ✓
    {2/5, 481/244}               {−3/142, 145/142}      ×

For the alternative t with T = {x, t}, the corresponding equation is

    6/(κ_t − 2) + 3/(κ_t − 2) = 160/(35κ_t − 79) + 160/(37κ_t − 65).
Repeating the procedure for the alternative z with T = {x, z} gives
\[
\frac{24}{4\kappa_z - 7} + \frac{70}{23\kappa_z - 47} = \frac{154}{34\kappa_z - 73} + \frac{160}{37\kappa_z - 58},
\]
with roots κ_z^1 = 4, κ_z^2 = 2/5, and κ_z^3 = 481/244. The implied values of α are:

Utility pair {u(z), v(z)}    Implied α              Feasibility
{4, 2/5}                     {3/4, 1/4}             ✓
{4, 481/244}                 {9/154, 145/154}       ✓
{2/5, 481/244}               {−3/142, 145/142}      ×

For the alternative t with T = {x, t}, the corresponding equation is
\[
\frac{6}{\kappa_t - 2} + \frac{3}{\kappa_t - 2} = \frac{160}{35\kappa_t - 79} + \frac{160}{37\kappa_t - 65}.
\]
Cross-multiplying and solving the resulting cubic polynomial gives the candidate roots κ_t^1 = 5, κ_t^2 = 1/5, and κ_t^3 = 2. However, κ_t = 2 is not a valid solution of the original equation, since it makes the left-hand side undefined. Hence, the admissible roots are κ_t^1 = 5 and κ_t^2 = 1/5, which yields:

Utility pair {u(t), v(t)}    Implied α              Feasibility
{5, 1/5}                     {3/4, 1/4}             ✓

The only feasible value of α consistent across all three alternatives is {3/4, 1/4}. This uniquely recovers u = (1, 2, 4, 5) and v = (1, 4/5, 2/5, 1/5) up to the label swap and scale normalization.
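As a final check, the recovered parameters reproduce the data in Table 3 exactly. The sketch below (an illustration; the helper names luce and rho_ai are ad hoc) regenerates the choice probabilities as the α-mixture of the two Luce rules and spot-checks several entries.

```python
from fractions import Fraction
from itertools import combinations

# Recovered parameters from Example 2 (alternatives x, y, z, t)
u = {'x': Fraction(1), 'y': Fraction(2), 'z': Fraction(4), 't': Fraction(5)}
v = {'x': Fraction(1), 'y': Fraction(4, 5), 'z': Fraction(2, 5), 't': Fraction(1, 5)}
alpha = Fraction(3, 4)

def luce(w, a, menu):
    """Luce choice probability of alternative a from menu under weights w."""
    return w[a] / sum(w[b] for b in menu)

def rho_ai(a, menu):
    """LAM choice probability: an alpha-mixture of the two Luce rules."""
    return alpha * luce(u, a, menu) + (1 - alpha) * luce(v, a, menu)

# Regenerate Table 3: every menu with at least two alternatives
for size in (4, 3, 2):
    for menu in combinations('xyzt', size):
        print(menu, {a: rho_ai(a, menu) for a in menu})

# Spot checks against Table 3
assert rho_ai('x', ('x', 'y', 'z', 't')) == Fraction(1, 6)
assert rho_ai('y', ('x', 'y', 'z')) == Fraction(47, 154)
assert rho_ai('t', ('x', 't')) == Fraction(2, 3)
```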
5 Conclusion

This paper considers a delegated choice environment where an AI agent is instructed to act on behalf of a human principal. A central concern in this environment is the potential misalignment between the AI's and the human principal's preferences. To study this problem using revealed preference techniques, I introduce the Luce Alignment Model, where the AI agent balances deference to the principal's preferences against pursuit of its own. The model makes it possible to separately identify two conceptually distinct dimensions of AI behavior: alignment, which captures the similarity between the human's and the AI's preferences, and compliance, which captures the extent to which the AI defers to the human principal.

I study the identification problem in two settings. In the laboratory setting, where both the AI's and the human principal's stochastic choices are observed, I show that violations of the Independence of Irrelevant Alternatives in the AI's choice data allow the analyst to recover both utility functions and obtain a closed-form expression for the compliance parameter. I also provide an axiomatic characterization of the model in this setting. In the field setting, where only the AI's choices are observed, a fundamental symmetry prevents an analyst from determining which recovered utility belongs to the human and which to the AI. Nevertheless, I show that when there are at least four alternatives, the underlying distribution over utilities is generically identified up to this label swap, which is sufficient to recover the degree of misalignment.