An adaptive adjoint-oriented neural network for solving parametric optimal control problems with singularities
In this work, we present an adaptive adjoint-oriented neural network (adaptive AONN) for solving parametric optimal control problems governed by partial differential equations. The proposed method integrates deep adaptive sampling techniques with the adjoint-oriented neural network (AONN) framework. It alleviates AONN's difficulty in handling low-regularity solutions and extends the applicability of deep adaptive sampling for surrogate modeling without labeled data ($\text{DAS}^2$). The effectiveness of the adaptive AONN is demonstrated on numerical examples involving singularities.
💡 Research Summary
This paper introduces an adaptive adjoint‑oriented neural network (adaptive AONN) for solving parametric optimal control problems (OCPs) constrained by partial differential equations (PDEs), especially when the solutions exhibit singularities or low regularity. The method builds upon the previously proposed adjoint‑oriented neural network (AONN) framework, which employs three separate deep neural networks to approximate the state y, control u, and adjoint p functions that satisfy the Karush‑Kuhn‑Tucker (KKT) optimality conditions. AONN enforces boundary conditions without penalty terms by decomposing each network into a boundary‑satisfying component and an interior component multiplied by a distance‑like factor.
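The penalty-free boundary treatment described above can be illustrated with a minimal sketch. The decomposition below is generic, not the paper's exact construction: the network output is combined with a lift `g` that matches the Dirichlet data and a distance-like factor `d` that vanishes on the boundary, so the boundary condition holds exactly for any network weights. The tiny random MLP stands in for the state network.

```python
import numpy as np

# Random two-layer MLP standing in for the interior network component.
rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(1, 16)), rng.normal(size=16)
W2, b2 = rng.normal(size=(16, 1)), rng.normal(size=1)

def mlp(x):
    return np.tanh(x @ W1 + b1) @ W2 + b2

def y(x, a=0.0, b=2.0):
    """State ansatz on [0, 1] with y(0)=a, y(1)=b enforced exactly."""
    g = a + (b - a) * x      # lift matching the boundary data
    d = x * (1.0 - x)        # distance-like factor, zero at x = 0 and x = 1
    return g + d * mlp(x)

x_bnd = np.array([[0.0], [1.0]])
print(y(x_bnd).ravel())  # -> [0. 2.] regardless of the network weights
```

Because the boundary values are exact by construction, no boundary penalty term competes with the residual loss during training, which is the stability advantage the summary refers to.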
The novel contribution of this work is the integration of deep adaptive sampling for surrogate modeling without labeled data (DAS²). In DAS², the residual r(x, ξ; θ)²—computed from the current neural network approximations—is interpreted as a probability density function. New collocation points are then sampled from this residual‑induced distribution, concentrating training data in regions where the model error is large (e.g., near singularities). This adaptive sampling loop is tightly coupled with the AONN training: after each AONN update, the residual distribution is recomputed, new samples are generated, and the networks are retrained on the enriched dataset. The process iterates until convergence, with step‑size c and epoch counts gradually adjusted.
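The residual-driven sampling idea can be sketched in a few lines. DAS² fits a generative model (e.g., a flow-based sampler) to the residual-induced density; the sketch below replaces that step with simple self-normalized resampling from a uniform candidate pool, and the residual function is a toy stand-in for $r(x, \xi; \theta)$ with a spike at $x = 0.5$ mimicking a singularity.

```python
import numpy as np

def residual(x):
    # Toy stand-in for r(x, xi; theta): large near a "singularity" at x = 0.5.
    return 1.0 / (np.abs(x - 0.5) + 1e-2)

def das_resample(n_new, rng, pool_size=10_000):
    """Draw new collocation points approximately from the density ~ r(x)^2."""
    pool = rng.uniform(0.0, 1.0, size=pool_size)  # uniform candidate pool
    w = residual(pool) ** 2                       # unnormalized density r^2
    return rng.choice(pool, size=n_new, p=w / w.sum())

rng = np.random.default_rng(0)
samples = das_resample(2000, rng)
# Most new points concentrate where the residual (hence model error) peaks.
print(np.mean(np.abs(samples - 0.5) < 0.1))
```

In the actual method this resampling step runs after each AONN update, so the training set tracks wherever the current networks are least accurate.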
Algorithmically, the procedure starts with an initial random sample set S₀ and initial network parameters θ⁽⁰⁾. At iteration i, the three networks are trained sequentially (state → adjoint → control) using empirical loss functions derived from the L² norms of the state, adjoint, and control residuals. The control update incorporates a projection onto the admissible set U_ad(ξ) and a gradient‑descent step based on the derivative of the objective functional, ensuring that inequality constraints are satisfied exactly. After the networks are updated, the residual r² is evaluated over the current training points, and DAS² generates a new sample set Sᵢ₊₁ by sampling from the estimated residual distribution (e.g., via MCMC or a variational auto‑encoder). This new set replaces or augments the previous one, and the training proceeds to the next iteration.
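The projected-gradient control update can be sketched concretely. The gradient formula used here, $\alpha u - p$ for a tracking-type objective with Tikhonov weight $\alpha$, is an assumption for illustration (the exact form and sign depend on the problem); with box constraints, projection onto $U_{ad}$ reduces to a pointwise clip, which is how the inequality constraints end up satisfied exactly.

```python
import numpy as np

def control_step(u, p, alpha=0.1, c=0.5, u_a=-1.0, u_b=1.0):
    """One projected gradient-descent step on the control.

    alpha*u - p is an assumed gradient of the reduced objective;
    np.clip implements the pointwise projection onto U_ad = [u_a, u_b].
    """
    grad = alpha * u - p
    return np.clip(u - c * grad, u_a, u_b)

u = np.array([0.0, 0.9, -0.9])   # current control values at sample points
p = np.array([0.5, 1.0, -5.0])   # current adjoint values at the same points
u_next = control_step(u, p)
print(u_next)  # -> [ 0.25  1.   -1.  ]  (second and third entries hit the box)
```

The clip guarantees feasibility after every update, so no penalty on the control constraint is needed either.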
The authors validate the adaptive AONN on two benchmark problems featuring singularities: (1) a geometry‑parametric 2‑D heat conduction problem where the domain shape changes sharply with the parameter, and (2) a nonlinear reaction‑diffusion control problem with box constraints on the control. In both cases, the adaptive method achieves significantly lower L² errors and objective values compared with the original AONN that relies on uniform random sampling. Moreover, the number of training epochs required for convergence is reduced by roughly 30%, and the method remains effective when the parameter dimension is increased to five, demonstrating scalability.
Key advantages of the proposed approach include:
- Improved handling of low‑regularity solutions through residual‑driven sampling that automatically focuses computational effort near singularities.
- Label‑free data generation, as DAS² does not require pre‑computed high‑fidelity solutions; the residual itself drives the sampling process.
- Penalty‑free enforcement of boundary conditions via the AONN decomposition, which enhances stability for complex geometries.
- Fast online inference once the surrogate models are trained, enabling rapid evaluation of optimal states and controls for new parameter values without solving the PDE repeatedly.
Limitations are acknowledged: the residual‑based density estimation introduces an extra optimization step (e.g., kernel density estimation or training a variational sampler), which adds computational overhead. In very high‑dimensional parameter spaces (>10), the efficiency of the adaptive sampling may degrade, and the stochastic variability inherent in Monte Carlo sampling can affect reproducibility.
Future research directions suggested include: extending the adaptive sampling to more sophisticated variational techniques for high‑dimensional spaces, hybridizing with physics‑based priors to further reduce sample requirements, developing online updating mechanisms for real‑time control, and applying the framework to other PDE families such as Navier‑Stokes or wave equations.
In summary, the adaptive AONN framework successfully merges the structural strengths of AONN (multiple networks, penalty‑free boundary handling) with the efficiency of DAS² adaptive sampling, delivering a robust and scalable solution methodology for parametric PDE‑constrained optimal control problems, particularly those involving singularities.