AI-Assisted Engineering Should Track the Epistemic Status and Temporal Validity of Architectural Decisions


This position paper argues that AI-assisted software engineering requires explicit mechanisms for tracking the epistemic status and temporal validity of architectural decisions. LLM coding assistants generate decisions faster than teams can validate them, yet no widely adopted framework distinguishes conjecture from verified knowledge, prevents trust inflation through conservative aggregation, or detects when evidence expires. We propose three requirements for responsible AI-assisted engineering: (1) epistemic layers that separate unverified hypotheses from empirically validated claims, (2) conservative assurance aggregation grounded in the Gödel t-norm that prevents weak evidence from inflating confidence, and (3) automated evidence decay tracking that surfaces stale assumptions before they cause failures. We formalize these requirements as the First Principles Framework (FPF), ground its aggregation semantics in fuzzy logic, and define a quintet of invariants that any valid aggregation operator must satisfy. Our retrospective audit applying FPF criteria to two internal projects found that 20-25% of architectural decisions had stale evidence within two months, validating the need for temporal accountability. We outline research directions including learnable aggregation operators, federated evidence sharing, and SMT-based claim validation.
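The conservative aggregation the abstract describes has a very direct reading: under the Gödel t-norm, the assurance of a composite claim is the minimum of its parts' reliabilities, so a single weak link caps the whole chain. A minimal sketch (the function name and input format are our own, not from the paper):

```python
def goedel_aggregate(reliabilities: list[float]) -> float:
    """Combine per-claim reliabilities with the Gödel t-norm (min).

    A composite claim can never be more trusted than its weakest
    supporting claim, which prevents many pieces of weak evidence
    from stacking up into inflated confidence.
    """
    if not reliabilities:
        raise ValueError("no evidence to aggregate")
    if any(not 0.0 <= r <= 1.0 for r in reliabilities):
        raise ValueError("reliabilities must lie in [0, 1]")
    return min(reliabilities)

# Two strong claims and one 0.55 conjecture: the conjecture caps the decision.
print(goedel_aggregate([0.92, 0.88, 0.55]))  # 0.55
```

Contrast this with the product t-norm (0.92 × 0.88 × 0.55 ≈ 0.45) or averaging (≈ 0.78): min is the only choice of these that neither punishes adding corroborating evidence nor lets strong claims mask a weak one.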


💡 Research Summary

The paper addresses a pressing problem in modern software development: large language model (LLM) coding assistants such as GitHub Copilot, Cursor, Claude Code, and Gemini Code Assist can generate architectural recommendations far faster than human engineers can validate them. This speed creates a gap where decisions are made without clear epistemic qualification, and the underlying evidence can become stale as libraries, benchmarks, or operational contexts evolve. The authors argue that AI‑assisted engineering must incorporate explicit mechanisms to (1) distinguish conjecture from empirically verified knowledge, (2) aggregate evidence conservatively to avoid confidence inflation, and (3) track the temporal validity of evidence automatically.
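The third mechanism, automatic tracking of temporal validity, amounts to stamping each piece of evidence with a collection time and a time-to-live, then flagging it once the TTL lapses. A hedged sketch of this idea (the field names and TTL policy are illustrative, not taken from the paper):

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

@dataclass
class Evidence:
    description: str
    collected_at: datetime
    ttl_days: int  # how long the measurement is assumed to remain valid

    def is_stale(self, now: datetime) -> bool:
        """Evidence expires once its time-to-live has elapsed."""
        return now - self.collected_at > timedelta(days=self.ttl_days)

# A benchmark collected on Jan 10 with a 60-day TTL is stale by Apr 1.
bench = Evidence("p99 latency benchmark of library v2.3",
                 collected_at=datetime(2025, 1, 10), ttl_days=60)
print(bench.is_stale(datetime(2025, 4, 1)))  # True
```

A real implementation would run such checks continuously (e.g. in CI) and surface expired evidence to decision owners, rather than waiting for a stale assumption to cause a failure.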

To meet these needs they propose the First Principles Framework (FPF), which introduces three core constructs. First, every claim is annotated with a three‑dimensional trust tuple (F‑G‑R). Formality (F) captures the rigor of the claim on a scale from informal observation (F0) to formal proof (F3), each with a predefined reliability ceiling (e.g., F0 ≤ 70%). Scope (G) encodes the context where the claim applies, expressed as a path‑tag set, and is penalized through Congruence Levels (CL) that quantify how well the evidence matches the target environment (e.g., CL3 = no penalty, CL0 = 90% penalty). Reliability (R) is a numeric confidence score in [0, 1].
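Putting the three dimensions together, an F‑G‑R tuple can be evaluated by first applying the congruence penalty to the raw reliability and then capping the result at the formality ceiling. In the sketch below, only the F0 ceiling (70%) and the CL3/CL0 penalties (0%/90%) come from the summary above; the remaining ceilings and penalties, and the composition order, are placeholder assumptions for illustration:

```python
from dataclasses import dataclass

# F0..F3 reliability ceilings: only F0's 70% is stated in the text;
# F1-F3 values are illustrative placeholders.
FORMALITY_CEILING = {0: 0.70, 1: 0.80, 2: 0.90, 3: 1.00}

# CL3..CL0 scope-mismatch penalties: CL3 (0%) and CL0 (90%) are stated;
# CL1/CL2 values are illustrative placeholders.
CL_PENALTY = {3: 0.0, 2: 0.3, 1: 0.6, 0: 0.9}

@dataclass
class TrustTuple:
    formality: int     # F0..F3
    congruence: int    # CL0..CL3
    reliability: float # raw confidence in [0, 1]

    def effective_reliability(self) -> float:
        """Discount for scope mismatch, then cap at the formality ceiling."""
        discounted = self.reliability * (1.0 - CL_PENALTY[self.congruence])
        return min(discounted, FORMALITY_CEILING[self.formality])

# An informal observation (F0) in a perfectly matching context (CL3):
# the raw 0.95 confidence is capped at F0's 70% ceiling.
print(TrustTuple(formality=0, congruence=3, reliability=0.95)
      .effective_reliability())  # 0.7
```

Even a formal proof (F3) measured in a mismatched environment (CL0) would score only 0.9 × (1 − 0.9) = 0.09 under this scheme: neither rigor nor relevance alone buys trust.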

