Harpocrates: A Statically Typed Privacy Conscious Programming Framework
In this paper, we introduce Harpocrates, a compiler plugin and framework pair for Scala that binds privacy policies to data at creation time in the form of oblivious membranes. Harpocrates eliminates raw data of a policy-protected type from the application, ensuring that it can exist only in protected form, and centralizes policy checking at the policy declaration site, making the privacy logic easy to maintain and verify. Instead of approaching privacy from an information-flow verification perspective, Harpocrates allows data to flow freely throughout the application inside the policy membranes, but enforces the policies whenever the data is accessed, mutated, declassified, or passed through the application boundary. Centralizing the policies allows maintainers to change the enforced logic simply by updating a single function while the rest of the application remains oblivious to the change. In particular, where a data definition is shared by multiple applications, the publisher can update the policies without requiring the dependent applications to make any changes beyond updating the dependency version.
💡 Research Summary
The paper introduces Harpocrates, a novel privacy‑preserving programming framework for Scala that consists of a compiler plugin and a supporting runtime library. The central idea is to bind privacy policies directly to data at the point of creation by wrapping the data in an “oblivious membrane”. This membrane ensures that raw, unprotected data of a policy‑protected type never exists in the application code; instead, only the protected wrapper can be used. Policy enforcement is centralized at the declaration site, which means that the logic for checking whether data may be accessed, mutated, declassified, or sent across application boundaries resides in a single place.
Policy classes implement a Policy trait and must provide a check method that encodes the de‑classification conditions. Developers mix a policy class into a data type’s constructor using an enforce keyword. The compiler plugin, which runs in fifteen phases, transforms the constructor so that any instance of the type can only be created in a protected state. It also copies all methods of the original type into the generated policy class, preserving structural equality. When a copied method’s body cannot be proven side‑effect‑free, the plugin wraps the call in a conditional that invokes the policy’s check function. Implicit conversions are injected to automatically lift raw values to their policy‑wrapped counterparts and to allow standard library functions that expect raw types to accept protected values transparently.
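The membrane idea described above can be approximated by hand in plain Scala, without the compiler plugin. The following is only a minimal sketch under stated assumptions: `Context`, `Membrane`, `protect`, `declassify`, and `OwnerOnly` are hypothetical names chosen for illustration, and the signature of `check` is assumed, since the summary does not give the real one.

```scala
object MembraneSketch {
  // Hypothetical request context; the framework's actual context type is not specified here.
  final case class Context(role: String)

  // Assumed shape of a policy: one check that decides whether the value may be declassified.
  trait Policy[A] {
    def check(value: A, ctx: Context): Boolean
  }

  // Hand-written stand-in for the wrapper the plugin would generate.
  // The constructor and the raw field are private, so raw data cannot be
  // created or read outside the membrane.
  final class Membrane[A] private (private val raw: A, policy: Policy[A]) {
    // Data may be transformed freely while it stays inside the membrane.
    def map(f: A => A): Membrane[A] = new Membrane(f(raw), policy)

    // The only way out: the policy is consulted at the point of declassification.
    def declassify(ctx: Context): Option[A] =
      if (policy.check(raw, ctx)) Some(raw) else None

    // Avoid leaking the raw value through logging or string interpolation.
    override def toString: String = "Membrane(<protected>)"
  }

  object Membrane {
    // Values are protected at creation time.
    def protect[A](value: A, policy: Policy[A]): Membrane[A] = new Membrane(value, policy)
  }

  // Example policy (an assumption for this sketch): only the owner may read the value.
  object OwnerOnly extends Policy[String] {
    def check(value: String, ctx: Context): Boolean = ctx.role == "owner"
  }
}
```

In this sketch, `MembraneSketch.Membrane.protect("a@example.com", MembraneSketch.OwnerOnly)` yields a value that can be mapped over freely but only released through `declassify`. Changing who may read the data means editing `OwnerOnly.check` alone, which mirrors the centralization the summary describes. Unlike the real plugin, this version changes the static type seen by callers and does nothing automatically.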
Because the policy checks are performed at runtime, Harpocrates can handle dynamic policies that depend on external services or user‑specific context, such as the right to withdraw consent under the GDPR. The compiler automatically inserts calls to the check method at identified application boundaries (e.g., database reads/writes, HTTP requests). This eliminates the need for developers to manually sprinkle guard clauses throughout the codebase.
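A dynamic, consent-based policy of this kind might look roughly as follows. This is a sketch under stated assumptions rather than the framework's API: `ConsentStore`, `ConsentPolicy`, `Profile`, and `writeToBoundary` are hypothetical names, the in-memory set stands in for an external consent service, and the boundary call is written by hand here, whereas the summary says the compiler inserts it.

```scala
import scala.collection.mutable

object ConsentSketch {
  // Stand-in for an external consent service; a real one would likely be a remote call.
  final class ConsentStore {
    private val consenting = mutable.Set.empty[String]
    def grant(userId: String): Unit = consenting += userId
    def withdraw(userId: String): Unit = consenting -= userId
    def hasConsent(userId: String): Boolean = consenting.contains(userId)
  }

  final case class Profile(userId: String, email: String)

  // Dynamic policy: the result depends on the store's state when the check runs,
  // not on anything known at compile time.
  final class ConsentPolicy(store: ConsentStore) {
    def check(p: Profile): Boolean = store.hasConsent(p.userId)
  }

  // Hypothetical application boundary (for example, an outgoing response body).
  // The check happens once here instead of at every use site.
  def writeToBoundary(p: Profile, policy: ConsentPolicy): Either[String, String] =
    if (policy.check(p)) Right(s"email=${p.email}")
    else Left("denied: consent not given")
}
```

Because the check is evaluated on each boundary crossing, a call to `withdraw` takes effect on the next write without any change to the calling code.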
The authors evaluated Harpocrates by integrating it into Vision, a real‑world micro‑service‑based project for musicians with over 20 000 lines of code. The integration required only a few hundred lines of modifications, mainly policy class definitions and constructor annotations. Compilation time increased by roughly 5–10 %, and runtime overhead was limited to the boundary checks, resulting in negligible impact on overall latency.
In comparison with related work, the paper discusses monad‑based privacy wrappers, static information‑flow control (IFC) systems such as JFlow and LIFTY, dynamic IFC approaches like Jacqueline and Resin, and contract‑based membranes in Racket. Monadic solutions still expose raw data and demand extensive type changes; static IFC requires exhaustive annotations and cannot express dynamic, context‑dependent policies; dynamic IFC often needs custom runtimes or incurs high overhead; and contract systems require developers to define propagation logic manually. Harpocrates addresses these shortcomings by providing transparent, type‑preserving wrappers, centralized policy definitions, and automatic enforcement without requiring a bespoke runtime.
Overall, Harpocrates demonstrates that binding privacy policies to types at compile time, while allowing free data flow inside protected membranes, yields a practical balance between security, developer ergonomics, and performance. The approach is especially suited for environments where multiple services share data schemas and where regulatory requirements evolve frequently, as policy updates can be made by changing a single policy class without touching dependent applications. Future work includes extending the policy language, supporting additional host languages, and exploring richer static analyses to reduce the need for runtime checks.