AI Runtime Infrastructure

We introduce AI Runtime Infrastructure, a distinct execution-time layer that operates above the model and below the application, actively observing, reasoning over, and intervening in agent behavior to optimize task success, latency, token efficiency…

Authors: Christopher Cruz

Christopher Cruz, February 2026

Abstract

Agentic AI systems increasingly operate over long horizons, interact with external tools, and must adapt to dynamic environments during execution. While significant progress has been made in model serving infrastructure, orchestration frameworks, and post-hoc observability, these approaches do not address failures, inefficiencies, and safety risks that emerge during agent execution. In practice, many of the most costly agent failures occur at runtime, after planning has begun and outside the scope of static orchestration or offline analysis. We introduce AI Runtime Infrastructure, a distinct execution-time layer that operates above the model and below the application, actively observing, reasoning over, and intervening in agent behavior to optimize task success, latency, token efficiency, reliability, and safety while the agent is running. Unlike model-level optimizations or passive logging systems, runtime infrastructure treats execution itself as an optimization surface, enabling adaptive memory management, failure detection, recovery, and policy enforcement over long-horizon agent workflows. We formalize the scope, responsibilities, and boundaries of AI runtime infrastructure, distinguishing it from related areas such as inference optimization, agent orchestration, and observability tooling. We outline core design principles for runtime systems, including execution-time intervention, long-horizon state awareness, and integrated recovery mechanisms. Finally, we describe Adaptive Focus Memory (AFM) and VIGIL as early instantiations of this layer, demonstrating how runtime infrastructure can materially improve agent robustness and efficiency in real-world settings.
We argue that AI runtime infrastructure represents a foundational component of scalable and reliable agentic systems, and that formalizing this layer is necessary for the next generation of production-grade AI agents.

1 Introduction

Agentic AI systems are increasingly deployed to perform complex tasks over long horizons, interacting with external tools, APIs, and environments while operating under latency, cost, and safety constraints. Unlike single-turn model inference, these systems execute multi-step workflows in which decisions made early in execution can have cascading effects on downstream behavior, resource consumption, and failure modes. As a result, many of the most significant challenges in production agentic systems arise not at model invocation time, but during execution itself.

Existing infrastructure has largely addressed adjacent concerns. Model serving and inference infrastructure focuses on optimizing the execution of individual model calls through techniques such as batching, caching, and hardware-aware scheduling. Agent orchestration frameworks provide abstractions for composing tools, prompts, and control flow, enabling developers to specify how agents should act. Observability and AgentOps tooling captures logs, traces, and metrics to support debugging and offline analysis. Safety mechanisms are often applied post-hoc, filtering or moderating outputs after generation. While each of these layers is essential, none is designed to actively intervene in agent behavior during execution.

In practice, agent failures frequently emerge after execution has begun: context windows overflow, intermediate reasoning drifts off task, tool interactions compound errors, or latent safety risks surface mid-workflow. Because these failures occur at runtime, static orchestration logic and post-hoc analysis are insufficient to prevent or mitigate them.
Once an agent has entered an unrecoverable execution state, logging the failure provides insight but does not restore correctness, efficiency, or safety.

This gap suggests the need for a distinct execution-time layer that treats agent runtime behavior itself as a first-class optimization surface. Such a layer must be capable of observing execution state over long horizons, reasoning about emerging failure modes, and intervening dynamically to adjust memory, control flow, resource usage, or policy enforcement while the agent is running. Importantly, this functionality is orthogonal to model-level optimization and application-specific logic, and cannot be reduced to either.

In this work, we introduce AI Runtime Infrastructure, a systems layer that operates above the model and below the application, providing active execution-time oversight and intervention for agentic systems. We argue that formalizing this layer is necessary for building scalable, reliable, and safe AI agents, and that its absence represents a structural limitation in current agent deployments. The remainder of this paper defines the scope and boundaries of AI runtime infrastructure, situates it relative to prior work, and describes early instantiations that demonstrate its practical value.

2 Defining AI Runtime Infrastructure

We define AI Runtime Infrastructure as an execution-time systems layer that actively observes, reasons over, and intervenes in the behavior of agentic AI systems while they are running. This layer operates above the model and below the application, and is responsible for optimizing agent execution with respect to task success, latency, token efficiency, reliability, and safety over long horizons.

Unlike model serving infrastructure, which focuses on optimizing the performance of individual inference calls, AI runtime infrastructure treats agent execution itself as a first-class object.
Its scope includes monitoring evolving execution state, detecting emerging failure modes, and applying corrective actions during runtime rather than after execution has completed. This distinction is critical in agentic systems, where errors often compound across steps and cannot be addressed through static orchestration or post-hoc analysis alone.

Formally, an AI runtime infrastructure system satisfies three necessary properties. First, it operates during execution, maintaining continuous visibility into agent state, intermediate outputs, and environmental interactions across multiple steps. Second, it performs active intervention, modifying execution behavior through actions such as adaptive memory management, control-flow adjustment, recovery triggering, or policy enforcement. Third, it reasons over long-horizon context, incorporating execution history rather than relying solely on the current prompt or model invocation.

AI runtime infrastructure is distinct from several adjacent categories of systems. It does not encompass inference optimization techniques such as caching, batching, or hardware-aware scheduling, which improve model execution but do not reason about agent behavior. It is not equivalent to agent orchestration frameworks, which define control flow and tool composition but lack execution-time introspection and adaptive intervention. It is also separate from observability and AgentOps tooling, which capture execution traces for offline analysis but do not influence outcomes while an agent is running. Finally, runtime infrastructure differs from post-hoc safety layers, as it addresses safety risks as they emerge during execution rather than filtering outputs after generation.
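
The three necessary properties can be made concrete with a small sketch. It is an illustration under stated assumptions, not the interface of AFM or VIGIL: the names `StepEvent`, `Action`, `RuntimeLayer`, and the threshold `max_consecutive_tool_failures` are all hypothetical. The layer is called once per agent step (operates during execution), returns an action that changes what happens next (active intervention), and bases that decision on the recorded history rather than on the latest event alone (long-horizon context).

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Optional


class Action(Enum):
    """Hypothetical control signals a runtime layer could return."""
    CONTINUE = "continue"
    TRIGGER_RECOVERY = "trigger_recovery"


@dataclass
class StepEvent:
    """One observed agent step (hypothetical schema)."""
    step: int
    tool: Optional[str]  # name of the tool invoked, or None for a pure model step
    ok: bool             # whether the step succeeded


@dataclass
class RuntimeLayer:
    """Toy runtime layer: observes each step, keeps history, may intervene."""
    max_consecutive_tool_failures: int = 3  # assumed threshold, not from the paper
    history: List[StepEvent] = field(default_factory=list)

    def observe(self, event: StepEvent) -> Action:
        # Property 1: invoked during execution, once per agent step.
        self.history.append(event)
        # Property 3: the decision uses execution history, not just this event.
        streak = 0
        for past in reversed(self.history):
            if past.tool is not None and not past.ok:
                streak += 1
            else:
                break
        # Property 2: the returned action alters subsequent execution.
        if streak >= self.max_consecutive_tool_failures:
            return Action.TRIGGER_RECOVERY
        return Action.CONTINUE
```

In this sketch an agent loop would call `layer.observe(...)` after every step and branch on the returned `Action`; a passive logger, by contrast, would record the same events but return nothing the loop could act on.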

The responsibilities of AI runtime infrastructure include, but are not limited to: maintaining execution-time state representations; identifying deviations from task objectives or safety constraints; allocating and compressing contextual information; triggering recovery or rollback mechanisms; and enforcing runtime policies that balance efficiency, robustness, and risk. Importantly, this layer is designed to be application-agnostic, providing general-purpose execution control rather than encoding domain-specific logic.

By formalizing AI runtime infrastructure as a distinct systems layer, we aim to clarify the architectural requirements of reliable agentic AI and to provide a foundation for principled design and evaluation. The following sections situate this layer within the broader AI systems stack and outline core design principles for effective runtime infrastructure.

3 Architectural Positioning

AI runtime infrastructure occupies a distinct position within the agentic systems stack, operating between model execution and application-level logic. This placement is intentional: runtime infrastructure must have sufficient proximity to the model to observe intermediate outputs and resource utilization, while remaining abstracted from application-specific objectives and domain logic. Figure 1 illustrates the full agentic systems stack and the architectural role of AI runtime infrastructure within it.

At the lowest level of the stack, model serving and inference infrastructure is responsible for executing individual model calls efficiently. This includes concerns such as batching, caching, hardware scheduling, and latency optimization. These components expose inference capabilities but do not reason about multi-step agent behavior or execution history.

Above this layer, AI runtime infrastructure maintains continuous visibility into the agent's execution state across steps.
It consumes signals such as intermediate model outputs, tool responses, memory utilization, and policy constraints, and uses these signals to make execution-time decisions. Crucially, this layer is empowered to intervene during execution, for example by modifying contextual inputs, adjusting control flow, triggering recovery mechanisms, or enforcing runtime policies. These interventions occur without requiring changes to the underlying model or to the application logic that invokes the agent.

At the top of the stack, application logic specifies task objectives, user interaction patterns, and domain-specific behavior. Applications define what an agent should accomplish, but they typically lack mechanisms to monitor or correct execution failures as they unfold. By decoupling execution oversight from application logic, AI runtime infrastructure enables reusable, application-agnostic control over agent behavior.

Architecturally, AI runtime infrastructure can be implemented as an execution-time control plane that interfaces with both the agent execution loop and external system resources. The agent produces execution artifacts (such as intermediate reasoning steps, tool invocations, and partial outputs), which are observed by the runtime layer. In response, the runtime layer may emit control signals that alter subsequent execution, forming a closed feedback loop that persists for the duration of the agent's operation. This structure distinguishes runtime infrastructure from static orchestration pipelines, which define execution paths but do not adapt based on observed outcomes.

Importantly, AI runtime infrastructure does not replace existing layers but composes with them.
Model serving infrastructure remains responsible for efficient inference, orchestration frameworks continue to manage high-level task decomposition, and observability systems provide retrospective analysis without influencing execution-time behavior. Runtime infrastructure complements these components by providing execution-time intelligence that bridges the gap between planning and outcome. By explicitly formalizing this architectural role, we clarify how adaptive, reliable, and safe agentic systems can be constructed without entangling concerns across layers.

4 Design Principles for AI Runtime Infrastructure

AI runtime infrastructure introduces a distinct set of design requirements that differ from those of model serving systems, orchestration frameworks, and observability tooling. To clarify what constitutes effective runtime infrastructure for agentic systems, we outline a set of core design principles. These principles are not tied to specific implementations, but instead characterize the essential properties required for execution-time oversight and control.

4.1 Execution-Time Intervention

AI runtime infrastructure must be capable of intervening during agent execution rather than operating solely before or after a run. Many agent failures emerge only after execution has begun, when intermediate reasoning, tool interactions, or accumulated context diverge from intended objectives. Systems that observe failures but cannot alter execution behavior in response do not satisfy this requirement. Runtime infrastructure must therefore support mechanisms that can modify agent inputs, control flow, or execution state while the agent is actively running.

4.2 Long-Horizon State Awareness

Agentic systems frequently operate over extended horizons involving dozens or hundreds of steps.
AI runtime infrastructure must maintain visibility into execution history across these horizons, rather than relying exclusively on the current prompt or most recent model output. This includes tracking intermediate decisions, memory utilization, tool outcomes, and prior interventions. Without long-horizon state awareness, runtime systems are unable to reason about cumulative failure modes or compounding inefficiencies.

4.3 Closed-Loop Control

Effective runtime infrastructure forms a closed feedback loop between observation and action. Execution signals produced by the agent, such as intermediate outputs, latency measurements, or tool responses, are continuously evaluated and used to inform subsequent interventions. This closed-loop structure distinguishes runtime infrastructure from static orchestration pipelines, which define execution paths in advance but do not adapt based on observed outcomes during execution.

4.4 Model-Agnostic Operation

AI runtime infrastructure should operate independently of specific model architectures or providers. While it must interface closely with model execution to observe outputs and resource usage, it should not require modification of the underlying model or rely on model-specific internals. This separation enables runtime infrastructure to generalize across different models and to evolve independently as model capabilities change.

4.5 Application-Agnostic Control

Runtime infrastructure is designed to provide execution-time control that is reusable across applications. It should not encode domain-specific task logic or application-level objectives, which remain the responsibility of the application layer. By maintaining this separation, runtime infrastructure can serve as a general-purpose control plane that supports diverse agentic workloads without entangling execution oversight with business logic.
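
As a minimal sketch of Sections 4.3 to 4.5 taken together, the loop below wraps an arbitrary agent step, observes each output, and adjusts the context passed to the next step. Everything here is assumed for illustration: `AgentStep`, `trim_context`, `run_with_runtime`, and the character budget (a crude stand-in for a token budget) are hypothetical names and choices, and the keep-the-most-recent-entries rule is a deliberately simple placeholder rather than AFM's memory policy.

```python
from typing import Callable, List, Tuple

# A model-agnostic agent step: takes the current context, returns (output, done).
# The runtime never inspects how the step is implemented.
AgentStep = Callable[[List[str]], Tuple[str, bool]]


def trim_context(context: List[str], budget_chars: int) -> List[str]:
    """Keep the first entry (the task) plus the most recent entries that fit."""
    if not context:
        return []
    task, rest = context[0], context[1:]
    kept: List[str] = []
    used = len(task)
    for item in reversed(rest):
        if used + len(item) > budget_chars:
            break  # older entries are dropped once the budget is reached
        kept.append(item)
        used += len(item)
    return [task] + list(reversed(kept))


def run_with_runtime(step: AgentStep, task: str,
                     budget_chars: int = 200, max_steps: int = 20) -> List[str]:
    """Closed loop: observe each output, then adjust the next step's input."""
    context = [task]
    for _ in range(max_steps):
        output, done = step(context)                   # agent acts
        context.append(output)                         # runtime observes
        context = trim_context(context, budget_chars)  # runtime intervenes
        if done:
            break
    return context
```

Because `step` is just a callable, the same wrapper works with any model or provider, and because the wrapper knows nothing about the task beyond its text, it encodes no application logic. A static pipeline would differ in that the context handed to each step would be fixed in advance rather than recomputed from what was observed.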

4.6 Safety, Cost, and Reliability as Runtime Concerns

Safety, efficiency, and reliability constraints must be enforced as part of execution-time decision making rather than solely through static policies or post-hoc filtering. AI runtime infrastructure enables these concerns to be evaluated dynamically as execution unfolds, allowing systems to respond to emerging risks, escalating costs, or degraded performance before failures become irreversible. Treating these dimensions as runtime concerns is essential for deploying agentic systems in production environments.

Together, these principles define AI runtime infrastructure as an execution-time control layer that complements existing components of the agentic systems stack. Systems that satisfy these criteria can actively shape agent behavior as it unfolds, enabling adaptive, robust, and scalable agentic AI beyond what static orchestration or offline analysis alone can provide.

5 Early Systems and Precursors

The formalization of AI runtime infrastructure is motivated by practical challenges encountered in long-horizon agentic systems, where failures, inefficiencies, and safety risks emerge during execution rather than at planning time. Prior to the explicit definition of runtime infrastructure as an execution-time control layer, several systems addressed aspects of runtime behavior without fully satisfying the criteria outlined in Section 4. In this section, we describe two such systems, VIGIL and Adaptive Focus Memory (AFM), to illustrate the progression from runtime-aware precursors to a fully realized instantiation of AI runtime infrastructure.

5.1 VIGIL: A Runtime-Aware Precursor

VIGIL [1] is a reflective runtime system designed to diagnose and respond to failures in long-running agent workflows.
It analyzes structured execution logs and traces to detect anomalous behavior, degraded performance, or violations of expected execution patterns, and can trigger remediation actions or human-in-the-loop escalation. By reasoning over execution histories that span many agent steps, VIGIL demonstrates the limitations of purely post-hoc observability in agentic systems.

While VIGIL is explicitly runtime-aware, it operates primarily outside the agent execution loop. Its diagnostic and recovery mechanisms are invoked after failures have been detected, and its influence on agent behavior occurs through external remediation rather than continuous, in-loop control. As a result, VIGIL does not perform execution-time intervention in the sense required for AI runtime infrastructure. Instead, it serves as a precursor system that exposes the need for tighter integration between execution monitoring and control, and motivates the development of runtime infrastructure capable of intervening directly during agent execution.

5.2 Adaptive Focus Memory: AI Runtime Infrastructure

Adaptive Focus Memory (AFM) [2] represents an early instantiation of AI runtime infrastructure as defined in this work. AFM operates directly within the agent execution loop, continuously observing execution state and intervening in real time to manage contextual information over long horizons. By dynamically allocating, compressing, and reweighting memory during execution, AFM actively shapes agent behavior while tasks are in progress.

AFM satisfies the core properties of AI runtime infrastructure. It performs execution-time intervention by modifying the contextual inputs provided to the model as execution unfolds. It reasons over long-horizon state by maintaining and adapting memory representations across many agent steps.
Finally, it participates in a closed-loop control process, where execution signals inform subsequent interventions that directly influence agent behavior. These operations occur without requiring changes to the underlying model or application logic, positioning AFM as an execution-time control layer rather than an orchestration or observability component.

5.3 From Precursors to Runtime Infrastructure

Together, VIGIL and AFM illustrate the evolution from runtime-aware monitoring toward fully integrated execution-time control. VIGIL demonstrates that post-hoc diagnostics and recovery are insufficient for managing long-horizon agent failures, while AFM operationalizes the principles of AI runtime infrastructure by embedding adaptive control directly into agent execution. This progression underscores the necessity of formalizing runtime infrastructure as a distinct systems layer and clarifies the architectural and functional boundary between precursors and true execution-time control systems.

6 Related Work

AI runtime infrastructure intersects with several established areas of research and engineering, including model serving infrastructure, agent orchestration frameworks, observability and AgentOps tooling, and AI safety systems. While these domains address important aspects of agentic system deployment, they do not provide execution-time control over agent behavior as defined in this work.

6.1 Model Serving and Inference Infrastructure

A large body of work focuses on optimizing the execution of individual model invocations through techniques such as batching, caching, quantization, and hardware-aware scheduling. These systems aim to improve throughput, latency, and cost efficiency at inference time, and are critical for deploying large-scale language models in production environments.
However, model serving infrastructure treats each inference call largely in isolation and does not reason about multi-step agent execution, long-horizon state, or task-level outcomes. As a result, inference optimization alone is insufficient for managing failures or inefficiencies that emerge during extended agent workflows.

6.2 Agent Orchestration Frameworks

Agent orchestration frameworks provide abstractions for composing prompts, tools, and control flow into structured agent behaviors. These frameworks enable developers to specify execution graphs, routing logic, and tool usage patterns, and have been instrumental in accelerating the development of agentic systems. However, orchestration frameworks primarily define execution plans rather than execution-time control. Once an agent is running, orchestration logic typically executes as specified, with limited ability to adapt based on observed runtime behavior. In contrast, AI runtime infrastructure reasons over execution state as it unfolds and intervenes dynamically to influence agent behavior during operation.

6.3 Observability and AgentOps Tooling

Observability and AgentOps systems capture logs, traces, metrics, and evaluation artifacts from agent executions to support debugging, monitoring, and offline analysis. These tools provide valuable insight into agent performance and failure modes, particularly in production settings. However, they are inherently retrospective: execution data is collected for inspection after the fact, and does not directly influence agent behavior during execution. While runtime-aware precursors such as VIGIL [1] demonstrate the limitations of purely post-hoc analysis, observability tooling alone does not satisfy the requirements of execution-time intervention and closed-loop control.

6.4 AI Safety and Policy Enforcement Systems

AI safety mechanisms are often implemented as static policies or post-hoc filters that constrain or moderate model outputs. These approaches play an important role in mitigating harmful behavior, but typically operate outside the agent execution loop and lack visibility into long-horizon execution state. More recent work explores adaptive safety mechanisms that respond to contextual signals, but these are rarely integrated as general-purpose execution-time control layers. AI runtime infrastructure treats safety, reliability, and efficiency as runtime concerns, enabling dynamic intervention as risks emerge during execution rather than solely at output time.

6.5 Positioning AI Runtime Infrastructure

AI runtime infrastructure complements, rather than replaces, these existing systems. Model serving infrastructure remains responsible for efficient inference, orchestration frameworks continue to define high-level behavior, observability tooling supports retrospective analysis, and safety systems enforce constraints. Runtime infrastructure addresses a distinct gap by providing execution-time oversight and control across long-horizon agent workflows. By formalizing this layer, we clarify architectural boundaries and enable principled design of systems that actively shape agent behavior as it unfolds.

7 Implications and Future Directions

Formalizing AI runtime infrastructure as a distinct execution-time layer has several implications for the design, evaluation, and deployment of agentic systems. By treating agent execution itself as an optimization surface, runtime infrastructure enables new classes of adaptive behavior that are difficult or impossible to achieve through static orchestration, model-level optimization, or post-hoc analysis alone.
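
As one illustration of such adaptive behavior, a runtime policy might track cumulative cost and risk across steps and decide on each step before it runs, rather than filtering its output afterwards. The sketch below is hypothetical throughout: `RuntimePolicy`, `admit`, the default limits, and the idea that a caller supplies per-step token and risk estimates are assumptions made for this example, not a mechanism described for AFM or VIGIL.

```python
from dataclasses import dataclass


@dataclass
class RuntimePolicy:
    """Tracks cumulative token spend and risk across steps (assumed limits)."""
    max_tokens: int = 10_000
    max_risk: float = 1.0
    tokens_used: int = 0
    risk_accumulated: float = 0.0

    def admit(self, est_tokens: int, est_risk: float) -> str:
        """Decide on the next step before it runs: 'allow', 'escalate', or 'halt'."""
        if self.tokens_used + est_tokens > self.max_tokens:
            return "halt"      # the step would exceed the token budget
        if self.risk_accumulated + est_risk > self.max_risk:
            return "escalate"  # cumulative risk too high; hand off to a human
        # Only admitted steps count toward the running totals.
        self.tokens_used += est_tokens
        self.risk_accumulated += est_risk
        return "allow"
```

The point of the sketch is that the decision depends on accumulated state: a step that would be acceptable in isolation can still be escalated or halted because of what the agent has already done, which an output filter applied to single generations cannot express.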

7.1 Scalable Reliability for Long-Horizon Agents

As agentic systems are deployed to perform increasingly long-horizon tasks, failure modes that compound over time become a dominant source of cost and unreliability. AI runtime infrastructure provides a mechanism for addressing these failures during execution, before they propagate into irrecoverable states. This suggests a shift from reactive debugging toward proactive execution-time control as a foundation for scalable agent reliability.

7.2 Runtime-Aware Safety and Governance

Treating safety and policy enforcement as runtime concerns enables more nuanced and adaptive governance of agent behavior. Rather than relying solely on static constraints or output filtering, runtime infrastructure can respond dynamically to evolving execution context, emerging risks, or changes in environmental conditions. This opens opportunities for safety mechanisms that are sensitive to long-horizon behavior and cumulative risk, rather than isolated model outputs.

7.3 Evaluation Beyond Post-Hoc Metrics

The presence of an execution-time control layer also motivates new approaches to evaluating agentic systems. Traditional metrics that summarize outcomes after execution may fail to capture the benefits of runtime intervention, such as avoided failures or reduced recovery costs. Future evaluation frameworks may need to account for execution trajectories, intervention timing, and counterfactual outcomes enabled by runtime infrastructure.

7.4 Open Research Directions

AI runtime infrastructure introduces several open research challenges. These include designing principled policies for intervention under uncertainty, balancing competing objectives such as efficiency and safety at runtime, and developing abstractions that generalize across diverse agent architectures and environments.
Additionally, understanding how runtime infrastructure interacts with learning-based adaptation remains an open question, particularly in systems that combine execution-time control with online or continual learning.

More broadly, formalizing runtime infrastructure highlights the need for clearer architectural boundaries in agentic AI systems. As agents become more autonomous and are entrusted with higher-impact tasks, execution-time control is likely to become a foundational requirement rather than an optional enhancement.

8 Conclusion

Agentic AI systems increasingly operate over long horizons, interact with external environments, and must satisfy constraints on reliability, efficiency, and safety during execution. While existing infrastructure addresses model execution, orchestration, observability, and post-hoc evaluation, these components do not provide execution-time control over agent behavior. As a result, many critical failure modes remain unaddressed until after execution has already degraded or failed.

In this work, we formalized AI runtime infrastructure as a distinct execution-time systems layer that operates above the model and below the application. We defined its scope, responsibilities, and architectural boundaries, and identified core design principles that distinguish runtime infrastructure from adjacent systems. Through the examination of runtime-aware precursors and early instantiations, we illustrated how execution-time intervention enables adaptive control that cannot be achieved through static orchestration or retrospective analysis alone.

By explicitly naming and formalizing this layer, we aim to clarify the architectural requirements of scalable, reliable agentic systems and to provide a foundation for principled system design and evaluation.
As agentic AI continues to move toward more autonomous and high-impact deployments, execution-time control is likely to become a foundational requirement rather than an optional enhancement. AI runtime infrastructure provides a framework for meeting this requirement and for advancing the next generation of production-grade agentic systems.

References

[1] Cruz, C. (2025). V.I.G.I.L: A Reflective Runtime for Self-Healing LLM Agents. arXiv preprint arXiv:2512.07094.
[2] Cruz, C. (2025). Adaptive Focus Memory for Long-Horizon Agentic Systems. arXiv preprint arXiv:2511.12712.
[3] Shinn, N., et al. (2023). Reflexion: Language Agents with Verbal Reinforcement Learning. Advances in Neural Information Processing Systems.
[4] Madaan, A., et al. (2023). Self-Refine: Iterative Refinement with Self-Feedback. arXiv preprint arXiv:2303.17651.
[5] Yao, S., et al. (2022). ReAct: Synergizing Reasoning and Acting in Language Models. arXiv preprint arXiv:2210.03629.
[6] Park, J., et al. (2023). Generative Agents: Interactive Simulacra of Human Behavior. arXiv preprint arXiv:2304.03442.
[7] Wang, X., et al. (2023). Voyager: An Open-Ended Embodied Agent with LLMs. arXiv preprint arXiv:2305.16291.
[8] Zhou, M., et al. (2023). SWE-agent: Autonomously Coding in the Wild. arXiv preprint arXiv:2305.18276.
[9] Torantulino, S. (2023). Auto-GPT: An Autonomous GPT-4 Experiment. GitHub repository. https://github.com/Torantulino/Auto-GPT.
[10] Liu, J., et al. (2023). MemPrompt: Memory-Augmented Prompting for Language Models. arXiv preprint arXiv:2305.10417.
[11] Chen, E., et al. (2023). Teaching Large Language Models to Self-Debug. arXiv preprint arXiv:2304.05128.

Figure 1: The full agentic AI systems stack.
AI runtime infrastructure operates as an execution-time control layer between agent orchestration and model serving, observing execution state and intervening during runtime to optimize task success, efficiency, reliability, and safety. Observability and evaluation systems span the stack but do not influence execution-time behavior.
