Phoenix: A Modular and Versatile Framework for C/C++ Pointer Analysis


We present Phoenix, a modular pointer analysis framework for C/C++ that unifies multiple state-of-the-art alias analysis algorithms behind a single, stable interface. Phoenix addresses the fragmentation of today’s C/C++ pointer analysis ecosystem by cleanly separating IR construction, constraint generation, solver backends, and client-facing queries, making analyses easy to compare, swap, and compose while exposing explicit precision-performance trade-offs. We evaluate Phoenix against SVF under two representative configurations: a flow- and context-insensitive setting and a more precise flow- and context-sensitive setting, on 28 GNU coreutils programs. Phoenix delivers robust speedups in the baseline configuration (up to 2.88x) and remains competitive, and often faster, even in the stronger precision regime (up to 2.91x), without a systematic runtime penalty. In production, Phoenix serves as the analysis substrate for static analysis and fuzzing tools that have uncovered hundreds of new bugs and enabled deployments reporting more than 1000 bugs found in an industrial toolchain.


💡 Research Summary

The paper introduces Phoenix, a modular and versatile framework for pointer (alias) analysis of C/C++ programs. The authors observe that the current ecosystem for C/C++ pointer analysis is highly fragmented: most existing tools hard‑code a single analysis style, expose internal data structures, and lack a common intermediate representation (IR) or query interface. This fragmentation hampers reproducibility, comparative evaluation, and the integration of new analyses. To address these issues, Phoenix is built on top of the Lotus infrastructure and follows the IDEA principles—Integrate, Diversify, Extend, and Advance. Its architecture is deliberately layered into four distinct components: (1) IR construction, which translates LLVM IR into a unified representation; (2) constraint generation, which maps program statements to a common constraint language; (3) solver back‑ends, which implement a variety of propagation strategies and work‑list policies; and (4) an interface layer that provides a stable, algorithm‑agnostic API for querying pointer information.

The framework supports a broad algorithmic portfolio. Inclusion‑based analyses (e.g., Andersen's subset‑based analysis) are provided in flow‑insensitive, flow‑sensitive, context‑insensitive, and context‑sensitive variants, and context sensitivity can be tuned by call‑string depth (1‑CF, 2‑CF, etc.), enabling systematic precision‑performance studies. Unification‑based analyses are also supported via equality constraints, enabling Dyck‑style variants that recover precision for common C/C++ idioms while retaining near‑linear solving time. Crucially, the constraint language and result adapters decouple the solvers from the client code, so a client can switch between subset‑based and equality‑based solvers, or between flow‑insensitive and flow‑sensitive configurations, without any code changes.

Solver back‑ends are highly configurable. Phoenix treats simplifications (e.g., Hardekopf and Lin's hash‑based value numbering (HVN), lazy cycle detection) and propagation strategies (wave, deep, diff, partial update) as first‑class plug‑ins. Work‑list iteration orders include FIFO, LIFO, least‑recently‑fired (LRF), 2‑phase LRF, and topological order, giving researchers fine‑grained control over performance. Points‑to sets can be represented either with BDDs for compactness or with SIMD‑accelerated sparse bitvectors for fast bulk operations; the framework's abstraction allows swapping these implementations transparently.

Phoenix’s query interface exposes four core primitives: MayAlias(p,q), PointedBy(p,o), GetPointsToSet(p), and GetAliasSet(v). These primitives are built on top of the normalized result adapters, ensuring that client analyses never depend on solver internals. In addition, Phoenix provides access to several intermediate representations—Program Dependence Graph (PDG), Inter‑procedural Control‑Flow Graph (ICFG), MemorySSA, and Static Single Information (SSI)—which serve as shared substrates for downstream analyses such as program slicing, memory‑dependency tracking, and abstract interpretation.

The authors evaluate Phoenix on 28 GNU coreutils programs, comparing two configurations against the widely used SVF framework: (a) a flow‑ and context‑insensitive baseline, and (b) a more precise flow‑ and context‑sensitive configuration. In the baseline, Phoenix outperforms SVF on every benchmark, achieving up to 2.88× speedup while using comparable memory. In the precision‑focused configuration, Phoenix remains competitive and is often faster (up to 2.91×), despite operating in a strictly stronger analysis regime (flow‑ and context‑sensitive versus SVF’s flow‑sensitive, context‑insensitive). The authors note that the modest slowdowns observed on a few benchmarks do not dominate the overall suite, indicating that the modular design does not impose a systematic runtime penalty.

Beyond benchmarks, Phoenix has been integrated into several production tools within the authors’ Lotus ecosystem. It underpins a null‑pointer detector, a taint‑tracking engine, a numeric abstract interpreter that combines bit‑level and word‑level reasoning, and directed fuzzing pipelines that steer input generation toward memory‑critical regions. These applications have collectively uncovered hundreds of previously unknown defects in open‑source software and have been deployed at Huawei, where they contributed to the discovery of more than 1,000 bugs in an industrial toolchain.

The paper acknowledges some limitations. The SIMD‑accelerated sparse bitvector backend is still experimental, and large‑scale scalability tests beyond the coreutils suite are limited. The current context‑sensitivity models are restricted to simple call‑site abstractions; more sophisticated object‑sensitive or hybrid contexts remain future work. Nevertheless, the authors release Phoenix as open‑source (https://tinyurl.com/433z9y66), providing documentation and examples to facilitate adoption and further research.

In summary, Phoenix delivers a comprehensive, extensible, and high‑performance platform for C/C++ pointer analysis. By cleanly separating IR construction, constraint generation, solving, and querying, it enables researchers and practitioners to experiment with a wide spectrum of analysis algorithms, conduct fair comparative studies, and integrate pointer analysis seamlessly into downstream static analysis, verification, and fuzzing tools. Its empirical results demonstrate that modularity does not sacrifice speed or precision, and its real‑world deployments validate its practical impact.

