FirecREST v2: lessons learned from redesigning an API for scalable HPC resource access
Introducing FirecREST v2, the next generation of our open-source RESTful API for programmatic access to HPC resources. FirecREST v2 delivers a 100x performance improvement over its predecessor. This paper explores the lessons learned from redesigning FirecREST from the ground up, with a focus on integrating enhanced security and high throughput as core requirements. We provide a detailed account of our systematic performance testing methodology, highlighting common bottlenecks in proxy-based APIs with intensive I/O operations. Key design and architectural changes that enabled these performance gains are presented. Finally, we demonstrate the impact of these improvements, supported by independent peer validation, and discuss opportunities for further improvements.
💡 Research Summary
The paper “FirecREST v2: lessons learned from redesigning an API for scalable HPC resource access” presents a comprehensive architectural overhaul of an open-source RESTful API designed for programmatic access to High-Performance Computing (HPC) resources. The primary objective of this redesign was to achieve a 100-fold performance increase while integrating enhanced security and high throughput as fundamental architectural requirements.
The authors identify and systematically address three critical bottlenecks that plagued the predecessor, FirecREST v1. The first bottleneck involved the limitations of the Python Global Interpreter Lock (GIL) and the thread overhead inherent in the Gunicorn-based sequential processing model. To overcome this, FirecREST v2 transitioned to an asynchronous paradigm using the ASGI standard, specifically leveraging FastAPI and the Uvicorn server with the uvloop event loop. This allows the API to handle thousands of concurrent I/O-bound requests within a single thread by utilizing non-blocking asynchronous execution.
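The core idea behind this switch can be illustrated with a few lines of plain `asyncio` (the mechanism underneath FastAPI/Uvicorn). This is a minimal sketch, not FirecREST code: `fake_io_request` is a hypothetical stand-in for a non-blocking backend call such as an SSH or HTTP round-trip.

```python
import asyncio
import time

async def fake_io_request(i: int) -> int:
    # Stand-in for a non-blocking backend call (e.g. an SSH or HTTP round-trip).
    await asyncio.sleep(0.05)
    return i

async def handle_many(n: int) -> list:
    # All n requests overlap on one event loop, in one thread:
    # no GIL contention and no per-request thread overhead.
    return await asyncio.gather(*(fake_io_request(i) for i in range(n)))

start = time.perf_counter()
results = asyncio.run(handle_many(100))
elapsed = time.perf_counter() - start
# 100 overlapping 50 ms waits complete in roughly one wait's worth of
# wall-clock time, whereas a sequential model would need ~5 seconds.
```

The same pattern is what an `async def` endpoint in FastAPI gives you: while one request awaits its backend I/O, the event loop services the others.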
The second bottleneck was the latency introduced by the online introspection of JWT tokens. In the previous version, every request required a network round-trip to the identity provider for token validation, creating significant delays. The redesigned architecture implements offline signature verification by locally caching OIDC (OpenID Connect) certificates. By assuming the use of short-lived tokens, the system can verify authenticity locally, reducing the authentication overhead to near zero.
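The shape of offline verification can be sketched with the standard library alone. Real OIDC deployments use asymmetric signatures (e.g. RS256) with keys fetched once from the provider's JWKS endpoint and cached; the HS256 variant and the `SECRET`/`make_token` helpers below are simplifications so the example is self-contained, but the structural point is the same: every check is local.

```python
import base64
import hashlib
import hmac
import json
import time

def b64url(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode()

def b64url_decode(s: str) -> bytes:
    return base64.urlsafe_b64decode(s + "=" * (-len(s) % 4))

# Stand-in for the signing key material cached locally from the identity provider.
SECRET = b"shared-signing-key"

def make_token(payload: dict) -> str:
    header = b64url(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
    body = b64url(json.dumps(payload).encode())
    sig = b64url(hmac.new(SECRET, f"{header}.{body}".encode(), hashlib.sha256).digest())
    return f"{header}.{body}.{sig}"

def verify_offline(token: str):
    # No network round-trip to the identity provider: recompute the signature
    # with the cached key, then enforce the short-lived expiry claim.
    header, body, sig = token.split(".")
    expected = b64url(hmac.new(SECRET, f"{header}.{body}".encode(), hashlib.sha256).digest())
    if not hmac.compare_digest(sig, expected):
        return None
    claims = json.loads(b64url_decode(body))
    if claims.get("exp", 0) < time.time():
        return None
    return claims

token = make_token({"sub": "alice", "exp": time.time() + 300})
claims = verify_offline(token)
header, body, _sig = token.split(".")
tampered = verify_offline(f"{header}.{body}.AAAA")  # forged signature is rejected
```

Because token lifetimes are short, skipping online introspection costs little in revocation latency while removing a network round-trip from every request.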
The third bottleneck was the high cost of establishing new SSH sessions for every file-system operation. The original implementation frequently hit the MaxStartups and MaxSessions limits of the SSH daemon, leading to massive request queues. FirecREST v2 introduces an asynchronous SSH client implementation using AsyncSSH and implements a user-specific connection pooling mechanism. This allows the API to reuse existing SSH connections for continuous requests from the same user, significantly reducing the load on the SSH daemon and increasing overall throughput.
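A minimal sketch of per-user connection reuse follows. `FakeSSHConnection` is a hypothetical stand-in for an AsyncSSH connection object (the real pool would also handle connection limits, expiry, and dropped sessions); the point shown is that repeated requests from the same user pay the connection-establishment cost only once.

```python
import asyncio

class FakeSSHConnection:
    """Stand-in for an AsyncSSH connection (assumed, simplified interface)."""
    opened = 0  # counts how many real connections were ever dialed

    def __init__(self, user: str):
        FakeSSHConnection.opened += 1  # the expensive step: SSH handshake + auth
        self.user = user

    async def run(self, command: str) -> str:
        await asyncio.sleep(0.01)  # simulated network round-trip
        return f"{self.user}$ {command}"

class ConnectionPool:
    """Keep one live connection per user instead of dialing per request,
    staying well under the SSH daemon's MaxStartups/MaxSessions limits."""

    def __init__(self):
        self._by_user = {}
        self._lock = asyncio.Lock()

    async def get(self, user: str) -> FakeSSHConnection:
        async with self._lock:
            if user not in self._by_user:
                self._by_user[user] = FakeSSHConnection(user)
            return self._by_user[user]

async def main():
    pool = ConnectionPool()
    outputs = []
    for _ in range(10):  # ten requests from the same user
        conn = await pool.get("alice")
        outputs.append(await conn.run("ls"))
    return outputs

outputs = asyncio.run(main())
```

With the pool in place, ten requests translate into one SSH handshake rather than ten, which is where the reduction in sshd load comes from.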
Furthermore, the paper introduces a sophisticated 4-layer architecture that clearly separates the authentication/authorization layer from the forwarding layer. This separation enables the system to validate state and resource readiness before initiating any costly backend calls, thereby minimizing unnecessary backend traffic.
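The validate-before-forward idea can be reduced to a small pipeline sketch. The function names and the two checks shown here are illustrative, not the paper's actual layer interfaces; the property being demonstrated is that the costly forwarding step is reached only after the cheap local checks pass.

```python
BACKEND_CALLS = 0  # counts how often the expensive forwarding layer is hit

def authenticate(request: dict) -> bool:
    # Cheap local check (e.g. offline token verification) before anything else.
    return request.get("token") == "valid"

def resource_ready(request: dict) -> bool:
    # State/readiness check for the target system before forwarding.
    return request.get("system") == "alps"

def forward(request: dict) -> str:
    global BACKEND_CALLS
    BACKEND_CALLS += 1  # costly backend call: only reached after validation
    return "ok"

def handle(request: dict) -> str:
    if not authenticate(request):
        return "401"  # rejected without any backend traffic
    if not resource_ready(request):
        return "503"  # rejected without any backend traffic
    return forward(request)

results = [
    handle({"token": "bad", "system": "alps"}),   # fails auth layer
    handle({"token": "valid", "system": "down"}), # fails readiness layer
    handle({"token": "valid", "system": "alps"}), # reaches the backend
]
```

Of the three requests above, only the fully validated one generates backend traffic, which is exactly the separation of concerns the layered design is buying.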
The performance impact of these changes is quantitatively demonstrated through rigorous testing on the CSCS Alps cluster using Postman-based scenarios. The results are striking: while the original version saturated at just 50 requests per second, FirecREST v2 maintains an average latency of under 20 ms even under a load of 5,000 requests per second. This architectural evolution provides a robust and scalable model for integrating HPC-centric data pipelines and complex workflow engines such as AiiDA.