Enabling Object Storage via shims for Grid Middleware
The Object Store model has quickly become the basis of most commercially successful mass storage infrastructure, backing so-called “Cloud” storage such as Amazon S3, but also underlying the implementation of most parallel distributed storage systems. Many of the assumptions in Object Store design are similar, but not identical, to concepts in the design of Grid Storage Elements, although the requirement for “POSIX-like” filesystem structures on top of SEs makes the disjunction seem larger. As modern Object Stores provide many features that most Grid SEs do not (block-level striping, parallel access, automatic file repair, etc.), it is of interest to see how easily we can provide interfaces to typical Object Stores via plugins and shims for Grid tools, and how well experiments can adapt their data models to them. We present an evaluation of, and first deployment experiences with, Xrootd-Ceph interfaces for direct object-store access, as part of an initiative within GridPP\cite{GridPP} hosted at RAL. Additionally, we discuss the tradeoffs and experience of developing plugins for the currently popular {\it Ceph} parallel distributed filesystem for the GFAL2 access layer at Glasgow.
Research Summary
The paper investigates how modern object storage systems, exemplified by Ceph, can be integrated into traditional Grid middleware such as XRootD and GFAL2 through the development of shims and plugins. It begins by contrasting the hierarchical, POSIX‑like namespace used by Grid Storage Elements (SEs) with the flat, GUID‑based namespace of object stores, highlighting that the latter offers features like block‑level striping, parallel access, and automatic repair which are largely absent in conventional SEs.
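The namespace contrast above can be made concrete with a small sketch: a shim must map the hierarchical, POSIX-like paths that Grid SEs expose onto the flat, GUID-style keys of an object store. The SHA-1 hashing scheme and GUID grouping below are illustrative assumptions, not the mapping used by any particular SE or by the systems discussed in the paper.

```python
import hashlib

def path_to_object_id(posix_path: str) -> str:
    """Map a hierarchical POSIX-like path to a flat object identifier.

    An object store keeps no directory tree: the path is just a key.
    Here we hash it into a GUID-like string; the scheme is purely
    illustrative (a real shim might also store the reverse mapping
    in a separate namespace database).
    """
    digest = hashlib.sha1(posix_path.encode("utf-8")).hexdigest()
    # Group as 8-4-4-4-12 hex digits for a GUID-style appearance.
    return "-".join([digest[0:8], digest[8:12], digest[12:16],
                     digest[16:20], digest[20:32]])

oid = path_to_object_id("/grid/atlas/data/file.root")
```

Because the mapping is deterministic, any client holding the logical path can recompute the object key without consulting a central catalogue, which is one reason flat namespaces scale well.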
The authors describe two main implementation efforts. The first, carried out at the UK Tier-1 site (RAL), builds an XRootD-Ceph interface using the libradosstriper library. Ceph’s native RADOS interface stores each object as a single extent on a single OSD, which is inefficient for multi-gigabyte physics files. libradosstriper splits a file into multiple chunks and stores each chunk as a separate object, enabling striping across OSDs (for example, over a 16+2 erasure-coded pool). The XRootD-Ceph plugin (originally by Sebastien Ponce) then provides direct object-store access to Grid jobs. Performance tests compare three access patterns (copy-plus-read, direct sequential read, and direct random read) against a reference CASTOR deployment. Initial results showed Ceph lagging by two orders of magnitude. Detailed profiling identified two root causes: (1) the XRootD-Ceph plugin used a very small asynchronous I/O segment size, and (2) the client configuration did not enable 10-way parallelism. After increasing the segment size and explicitly turning on parallel streams, the Ceph pool saturated its I/O capacity and achieved performance comparable to CASTOR. These fixes have been incorporated into subsequent releases of the plugin, XRootD, and Ceph.
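The striping idea can be sketched as a layout computation: cut a large file into fixed-size chunks and give each chunk its own object name, so the cluster's placement algorithm can spread them over different OSDs. The 64 MiB stripe unit and the zero-padded-suffix naming convention below are illustrative assumptions, not libradosstriper's actual on-disk layout.

```python
def stripe_layout(object_name, size, stripe_unit=64 * 1024 * 1024):
    """Return (chunk_name, offset, length) tuples for a striped file.

    Instead of one extent on a single OSD, the file becomes many
    stripe_unit-sized objects, each independently placeable, so reads
    and writes can proceed in parallel across OSDs. The suffix scheme
    is a stand-in for whatever a real striper library uses.
    """
    chunks = []
    offset = 0
    index = 0
    while offset < size:
        length = min(stripe_unit, size - offset)
        chunks.append((f"{object_name}.{index:016x}", offset, length))
        offset += length
        index += 1
    return chunks

# A hypothetical 5 GiB physics file splits into 80 chunks of 64 MiB.
layout = stripe_layout("run123.root", 5 * 1024**3)
```

Each chunk being a first-class object is also what enables per-chunk erasure coding and automatic repair: losing one OSD costs fragments of many chunks, not one whole file.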
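A back-of-the-envelope model shows why a tiny asynchronous I/O segment size and missing parallelism dominate transfer time, as the profiling described above found. All numbers here (1 ms round trip, 100 MiB/s per stream, segment sizes) are made-up illustrative values, not the paper's measurements.

```python
def transfer_time(file_size, segment_size, rtt, per_stream_bw, streams=1):
    """Crude cost model: every segment pays one round trip, and
    parallel streams overlap both latency and bandwidth.
    Illustrative reasoning aid, not a benchmark of any real system."""
    segments = -(-file_size // segment_size)      # ceiling division
    per_stream_segments = -(-segments // streams)
    return per_stream_segments * rtt + file_size / (per_stream_bw * streams)

MiB = 1024**2
GiB = 1024**3
# Hypothetical: 4 GiB file, 1 ms RTT, 100 MiB/s per stream.
slow = transfer_time(4 * GiB, 64 * 1024, 0.001, 100 * MiB)             # 64 KiB segments, 1 stream
fast = transfer_time(4 * GiB, 8 * MiB, 0.001, 100 * MiB, streams=10)   # 8 MiB segments, 10 streams
```

Under these assumed numbers the small-segment, single-stream case spends more time on per-segment round trips than on moving bytes, which mirrors the two fixes the authors report: larger segments and explicit parallel streams.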
The second effort focuses on the GFAL2 layer, which serves as a common abstraction for many Grid data‑access protocols. The authors note that the pseudo‑URI format used by the XRootD‑Ceph plugin (`rados