OSIRIS: Bridging Analog Circuit Design and Machine Learning with Scalable Dataset Generation
The automation of analog integrated circuit (IC) design remains a longstanding challenge, primarily due to the intricate interdependencies among physical layout, parasitic effects, and circuit-level performance. These interactions impose complex constraints that are difficult to accurately capture and optimize using conventional design methodologies. Although recent advances in machine learning (ML) have shown promise in automating specific stages of the analog design flow, the development of holistic, end-to-end frameworks that integrate these stages and iteratively refine layouts using post-layout, parasitic-aware performance feedback is still in its early stages. Furthermore, progress in this direction is hindered by the limited availability of open, high-quality datasets tailored to the analog domain, restricting both the benchmarking and the generalizability of ML-based techniques. To address these limitations, we present OSIRIS, a scalable dataset generation pipeline for analog IC design. OSIRIS systematically explores the design space of analog circuits while producing comprehensive performance metrics and metadata, thereby enabling ML-driven research in electronic design automation (EDA). In addition, we release a dataset consisting of 87,100 circuit variations generated with OSIRIS, accompanied by a reinforcement learning (RL)-based baseline method that exploits OSIRIS for analog design optimization.
💡 Research Summary
The paper introduces OSIRIS, a scalable pipeline that automatically generates a large‑scale, high‑quality dataset of analog integrated‑circuit (IC) layouts together with post‑layout performance metrics. The authors argue that analog back‑end design remains a major bottleneck because existing automation tools typically perform a single‑pass place‑and‑route, which cannot explore the vast design space shaped by layout‑dependent parasitics, matching constraints, and manufacturability rules. To address this, OSIRIS defines two orthogonal degrees of freedom for systematic exploration: (1) transistor “finger” permutations, where the gate of a MOSFET is split into multiple parallel fingers, thereby affecting area, parasitic capacitance, and matching; and (2) a “halo” mechanism that wraps each component (transistor, resistor, capacitor) in a movable bounding box, allowing the component to be displaced freely, but only within a predefined region.
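The halo idea can be illustrated with a minimal sketch. It assumes, purely for illustration, that each component has a baseline position and a single `halo` value bounding its displacement along each axis; the `Component` class, the `perturb` function, and the units are hypothetical and are not taken from the OSIRIS code base.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Component:
    """Hypothetical placed component; not an OSIRIS data structure."""
    name: str
    x: float      # baseline x position (arbitrary units)
    y: float      # baseline y position (arbitrary units)
    halo: float   # assumed maximum displacement along each axis


def perturb(components, rng):
    """Return a new placement with every component shifted inside its halo."""
    moved = []
    for c in components:
        dx = rng.uniform(-c.halo, c.halo)
        dy = rng.uniform(-c.halo, c.halo)
        moved.append(Component(c.name, c.x + dx, c.y + dy, c.halo))
    return moved


# Baseline placement, e.g. as produced by an initial place-and-route run.
baseline = [
    Component("M1", 0.0, 0.0, 0.5),
    Component("M2", 2.0, 0.0, 0.5),
    Component("C1", 1.0, 3.0, 1.0),
]

# Seeded generator so the set of variants is reproducible.
rng = random.Random(0)
variants = [perturb(baseline, rng) for _ in range(5)]
```

A real flow would additionally have to reject or repair variants that overlap or violate design rules; the sketch only shows the bounded random displacement.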
The pipeline starts from a netlist template (TP) and a list of matching transistor pairs (Ps). In the Fingers Permutation stage, all valid finger count combinations that respect matching requirements and process design‑kit (PDK) minimum dimensions are enumerated, producing M distinct netlists. Each netlist is then fed into the Variants Generation stage, which first runs a baseline place‑and‑route (P&R) flow to obtain a DRC‑clean GDS layout and a Quality‑of‑Solution (QoS) report. Afterwards, random halo‑based movements generate N additional layout variants per netlist. Consequently, OSIRIS produces M × N layout candidates, each of which is simulated with a user‑provided testbench to extract electrical metrics such as gain, bandwidth, power, and area. All artifacts—netlists, GDS files, simulation results, QoS reports, and metadata (finger counts, component coordinates, etc.)—are stored in a well‑structured directory hierarchy.
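The enumeration step and the resulting M × N count can be sketched as follows. This is a toy model under stated assumptions, not the authors' implementation: it assumes that a finger count is valid when the per-finger width stays at or above a minimum width, and that matched transistors must share the same finger count. The function names, device names, and numeric values are hypothetical.

```python
from itertools import product


def finger_options(total_width, min_finger_width, max_fingers):
    """Finger counts whose per-finger width respects the assumed minimum."""
    return [
        nf for nf in range(1, max_fingers + 1)
        if total_width / nf >= min_finger_width
    ]


def enumerate_configs(widths, matched_pairs, min_finger_width, max_fingers=8):
    """Enumerate finger-count assignments that keep matched pairs identical."""
    names = sorted(widths)
    options = [
        finger_options(widths[n], min_finger_width, max_fingers) for n in names
    ]
    configs = []
    for combo in product(*options):
        cfg = dict(zip(names, combo))
        # Assumed matching rule: both devices of a pair use the same count.
        if all(cfg[a] == cfg[b] for a, b in matched_pairs):
            configs.append(cfg)
    return configs


# Hypothetical template: M1/M2 form a matched pair, M3 is unconstrained.
widths = {"M1": 4.0, "M2": 4.0, "M3": 2.0}
configs = enumerate_configs(
    widths, matched_pairs=[("M1", "M2")], min_finger_width=1.0, max_fingers=4
)

M = len(configs)   # number of distinct netlists
N = 10             # halo-based variants generated per netlist (example value)
total_layouts = M * N
print(M, total_layouts)
```

In this toy case M1 and M2 each admit four finger counts but must agree, and M3 admits two, so the enumeration yields 4 × 2 = 8 netlists and 80 layout candidates for N = 10.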
Using this infrastructure, the authors release a dataset of 87,100 layout variations covering several analog building blocks. This is, to the best of their knowledge, the largest publicly available analog back‑end dataset, and it includes full post‑layout verification (DRC/LVS) and SPICE‑based performance data, making it directly usable for training graph neural networks, variational autoencoders, reinforcement‑learning agents, or any data‑driven EDA technique.
To demonstrate the utility of the dataset, the paper presents a reinforcement‑learning (RL) baseline. An RL agent observes the current layout state (finger configuration and component positions) and selects actions such as adding/removing fingers or moving a component within its halo. The reward function combines multiple objectives: minimizing layout area while satisfying electrical specifications (gain, bandwidth, power) obtained from post‑layout simulation. In experiments, the RL agent outperforms random exploration by achieving roughly 12% average area reduction and 8% improvement in target performance metrics. Compared with open‑source analog layout frameworks such as ALIGN and MAGICAL, the RL‑optimized layouts are reported to show comparable or better quality, highlighting the potential of closed‑loop, performance‑aware optimization.
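One way such a multi-objective reward could be written is sketched below. The summary does not give the actual formulation, so everything here is an assumption: the normalized area term, the linear penalty on specification violations, the treatment of gain and bandwidth as lower bounds and power as an upper bound, and all names and numbers.

```python
def reward(metrics, specs, area_ref, penalty=1.0):
    """Hypothetical reward: reward area savings, penalize spec violations."""
    # Positive when the layout is smaller than the reference area.
    area_term = (area_ref - metrics["area"]) / area_ref

    # Normalized shortfall for each specification; zero when the spec is met.
    violation = 0.0
    violation += max(0.0, (specs["gain_min"] - metrics["gain"]) / specs["gain_min"])
    violation += max(
        0.0,
        (specs["bandwidth_min"] - metrics["bandwidth"]) / specs["bandwidth_min"],
    )
    violation += max(0.0, (metrics["power"] - specs["power_max"]) / specs["power_max"])

    return area_term - penalty * violation


# Illustrative specification and post-layout results (made-up values).
specs = {"gain_min": 40.0, "bandwidth_min": 1e6, "power_max": 1e-3}
passing = {"area": 90.0, "gain": 42.0, "bandwidth": 1.2e6, "power": 0.8e-3}
failing = {"area": 80.0, "gain": 30.0, "bandwidth": 1.2e6, "power": 0.8e-3}

r_pass = reward(passing, specs, area_ref=100.0)
r_fail = reward(failing, specs, area_ref=100.0)
```

With these values the layout that meets every specification scores higher than the smaller layout that misses the gain target, which is the behavior a constraint-aware area objective is meant to encode.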
Strengths of OSIRIS include (i) a clear, extensible definition of design degrees of freedom that enables systematic data generation, (ii) inclusion of full parasitic‑aware post‑layout metrics, and (iii) open‑source release of both code and dataset, which supports reproducibility and invites community contributions. Limitations are also acknowledged: the current implementation only varies finger count and halo‑based placement, leaving out other important physical parameters such as metal stack choices, via configurations, power‑grid design, and advanced matching networks. The RL baseline uses a relatively simple reward formulation and single‑agent exploration, which may not scale efficiently to multi‑objective or multi‑constraint scenarios common in real‑world analog design.
Future work suggested by the authors includes expanding the set of controllable physical variables (e.g., metal layers, routing styles, supply‑rail topology), integrating meta‑learning or multi‑agent RL to improve sample efficiency and generalization across different circuit families, coupling OSIRIS with generative language models to translate natural‑language specifications directly into layout‑ready designs, and modularizing the pipeline to support multiple process nodes and design rule sets.
In summary, OSIRIS provides a much‑needed infrastructure for data‑driven analog back‑end research. By delivering a massive, fully verified layout dataset and a demonstrable RL optimization loop, it lowers the entry barrier for machine‑learning‑based EDA studies and opens new avenues for automating the most labor‑intensive part of analog IC design.