Adaptive Federated Learning to Optimize Integrated Flows in Cyber-Physical Data Centers
Data centers play an increasingly critical role in societal digitalization, yet their rapidly growing energy demand poses significant challenges for sustainable operation. To enhance the energy efficiency of geographically distributed data centers, this paper formulates a multi-period optimization model that captures the interdependence of electricity, heat, and data flows. Optimizing such integrated multi-domain flows inherently involves mixed-integer formulations and access to proprietary or sensitive datasets, which respectively increase computational complexity and raise data-privacy concerns. To address these challenges, an adaptive federated learning-to-optimization approach is proposed that accounts for the heterogeneity of datasets across distributed data centers. To safeguard privacy, cryptographic techniques are leveraged in both the learning and optimization processes. A model acceptance criterion with a convergence guarantee is developed to improve learning performance and filter out potentially contaminated data, while a verifiable double aggregation mechanism is further proposed to simultaneously ensure the privacy and integrity of shared data during optimization. Theoretical analysis and numerical simulations demonstrate that the proposed approach preserves the privacy and integrity of shared data, achieves near-optimal performance, and exhibits high computational efficiency, making it suitable for large-scale data center optimization under privacy constraints.
💡 Research Summary
This paper addresses the critical challenge of optimizing energy efficiency in geographically distributed, hyperscale data centers while rigorously preserving data privacy and integrity. The authors identify two major limitations of conventional centralized optimization approaches: the computational intractability of large-scale, multi-period Mixed-Integer Nonlinear Programming (MINLP) problems involving electricity, heat, and data flows; and the significant privacy concerns arising from the need to collect sensitive operational data (e.g., workload patterns, consumption data) from multiple independent stakeholders.
To overcome these challenges, the paper first formulates a holistic, cyber-physical model for data center energy management (P1). This model integrates server power consumption (idle and dynamic), cooling system thermodynamics with temperature dynamics, data queue evolution with Service Level Agreement (SLA) constraints, internal power generation, battery energy storage systems (ESS), and interaction with wholesale electricity markets. The initially nonlinear model is reformulated into a Mixed-Integer Linear Programming (MILP) problem (P2) using McCormick envelopes, making it computationally more manageable but still requiring centralized data access.
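To illustrate the linearization step, the sketch below evaluates the four standard McCormick inequalities for a bilinear term w = x*y with box bounds on x and y. The function name `mccormick_bounds` and the example bounds are illustrative assumptions; this summary does not state which products in P1 are linearized this way.

```python
def mccormick_bounds(x, y, xl, xu, yl, yu):
    """Bounds on w = x*y implied by the McCormick envelope.

    Assumes xl <= x <= xu and yl <= y <= yu. In a MILP, w would be a new
    variable constrained by these four linear inequalities.
    """
    # Two under-estimators: w >= xl*y + x*yl - xl*yl and w >= xu*y + x*yu - xu*yu
    lower = max(xl * y + x * yl - xl * yl, xu * y + x * yu - xu * yu)
    # Two over-estimators: w <= xu*y + x*yl - xu*yl and w <= xl*y + x*yu - xl*yu
    upper = min(xu * y + x * yl - xu * yl, xl * y + x * yu - xl * yu)
    return lower, upper


# Hypothetical example: x is an on/off decision, y is a power level in [0, 10].
for x in (0, 1):
    lo, hi = mccormick_bounds(x, 7.0, 0, 1, 0.0, 10.0)
    print(f"x={x}: {lo} <= w <= {hi}")

# Two continuous variables: the envelope only brackets the true product.
print(mccormick_bounds(0.5, 5.0, 0.0, 1.0, 0.0, 10.0))
```

When x is binary, the lower and upper bounds coincide, so the envelope reproduces the product exactly and the reformulation loses nothing. When both variables are continuous, the envelope is only a relaxation of the bilinear term.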
The core contribution of the paper is a novel adaptive federated learning-to-optimization framework designed to solve P2 in a distributed, privacy-preserving manner. The framework operates in two main phases:
- Adaptive Federated Learning Phase: Instead of sharing raw data, each data center trains a local ensemble of neural networks on its own historical optimal operational decisions. These local models learn to predict key decision variables (e.g., server allocation, cooling setpoints) based on environmental inputs (e.g., electricity prices, incoming workload). Critically, only the parameters of the first few layers of these local models are shared with a central aggregator (e.g., a utility company). To handle the inherent statistical heterogeneity (non-IID data) across data centers and defend against malicious or poor-quality updates, the authors introduce a model acceptance criterion with convergence guarantees. This criterion filters local model updates before aggregation, ensuring only beneficial updates contribute to the global model, thereby improving training stability and final performance.
- Privacy-Preserving Optimization with Verifiable Integrity: After the learning phase, the local models are used to infer decisions. The subsequent coordination for system-wide constraints (like power balance) still requires sharing some aggregated information. To protect privacy and ensure trust in this step, the authors propose a verifiable double-aggregation mechanism based on cryptographic techniques like secret sharing. This mechanism allows data centers to collaboratively compute aggregated values (e.g., total power demand) without revealing their individual private inputs. Furthermore, it enables participants to verify the correctness of the aggregation process, ensuring that neither the inputs nor the final result have been tampered with, thus guaranteeing both privacy and data integrity.
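The two phases above can be sketched in a few lines, under loud assumptions. The paper's actual acceptance criterion and its verifiable double-aggregation protocol are not specified in this summary, so the code substitutes two simple stand-ins: a loss-based filter that keeps an update only if it does not score worse than the current global model, and plain additive secret sharing with no verification step. All names (`aggregate_shared_layers`, `make_shares`, `secure_sum`), the modulus, and the example numbers are hypothetical.

```python
import random

PRIME = 2**61 - 1  # assumed field modulus; private values must lie in [0, PRIME)


def aggregate_shared_layers(global_w, client_ws, loss_fn, tol=0.0):
    """Average only the accepted shared-layer parameters.

    Stand-in acceptance rule (an assumption, not the paper's criterion):
    accept a client's parameters if their loss is at most the global
    model's loss plus `tol`.
    """
    baseline = loss_fn(global_w)
    accepted = [w for w in client_ws if loss_fn(w) <= baseline + tol]
    if not accepted:
        return global_w, 0  # nothing acceptable: keep the old global model
    n = len(accepted)
    return [sum(col) / n for col in zip(*accepted)], n


def make_shares(value, n, rng):
    """Split an integer into n additive shares that sum to it modulo PRIME."""
    shares = [rng.randrange(PRIME) for _ in range(n - 1)]
    shares.append((value - sum(shares)) % PRIME)
    return shares


def secure_sum(private_values, rng):
    """Sum private integers while each party only publishes a partial sum."""
    n = len(private_values)
    # Row i holds the shares that party i sends out, one per party.
    share_matrix = [make_shares(v, n, rng) for v in private_values]
    # Party j adds the shares it received (column j) and reveals only that.
    partials = [sum(row[j] for row in share_matrix) % PRIME for j in range(n)]
    return sum(partials) % PRIME


if __name__ == "__main__":
    # Phase 1: two plausible updates and one contaminated update.
    loss = lambda w: sum((wi - 1.0) ** 2 for wi in w)  # toy validation loss
    updates = [[0.5, 0.5], [1.5, 1.5], [10.0, -10.0]]
    print(aggregate_shared_layers([0.0, 0.0], updates, loss))

    # Phase 2: total power demand (kW) without revealing any single site.
    # random.Random is for a reproducible demo only; use `secrets` in practice.
    print(secure_sum([1200, 850, 430], random.Random(0)))
```

In this toy run the contaminated update is rejected because its loss exceeds the baseline, and the aggregator recovers the total demand from partial sums that individually look uniformly random. Integrity checks on the shares, which the summary attributes to the verifiable mechanism, are deliberately left out.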
The proposed method is supported by theoretical analysis and comprehensive numerical simulations. The results demonstrate that the adaptive federated learning approach achieves near-optimal operational costs compared to the centralized baseline. Simultaneously, it provides strong privacy guarantees by preventing raw data exposure. The verifiable aggregation mechanism successfully maintains data integrity with minimal computational overhead. In conclusion, this work presents a scalable, efficient, and secure framework for the coordinated energy management of large-scale, distributed data center networks under strict privacy constraints, offering a practical path forward for sustainable digital infrastructure.