Multi-Scale Control of Large Agent Populations: From Density Dynamics to Individual Actuation
We review a body of recent work by the author and collaborators on controlling the spatial organisation of large agent populations across multiple scales. A central theme is the systematic bridging of microscopic agent-level dynamics and macroscopic …
Authors: Mario di Bernardo
Affiliation: Department of Electrical Engineering and Information Technology, University of Naples Federico II, and Scuola Superiore Meridionale, Naples, Italy (e-mail: mario.dibernardo@unina.it)

Abstract: We review a body of recent work by the author and collaborators on controlling the spatial organisation of large agent populations across multiple scales. A central theme is the systematic bridging of microscopic agent-level dynamics and macroscopic density descriptions, enabling control design at the most natural level of abstraction and subsequent translation across scales. We show how this multi-scale perspective provides a unified approach to both direct control, where every agent is actuated, and indirect control, where few leaders or herders steer a larger uncontrolled population. The review covers continuification-based control with robustness under limited sensing and decentralised implementation via distributed density estimation; leader–follower density regulation with dual-feedback stability guarantees and bio-inspired plasticity; optimal-transport methods for coverage control and macro-to-micro discretisation; nonreciprocal field theory for collective decision-making; mean-field control barrier functions for population-level safety; and hierarchical reinforcement learning for settings where closed-form solutions are intractable. Together, these results demonstrate the breadth and versatility of a multi-scale control framework that integrates analytical methods, learning, and physics-inspired approaches for large agent populations.
Keywords: Multi-agent systems; continuification; density control; mean-field control; shepherding; leader–follower systems; large-scale systems; PDE control; optimal transport; control barrier functions; reinforcement learning.

This work has been partially supported by the European Union (EU HORIZON SHARESPACE GA 101092889) and by the MUR PRIN project MENTOR (CUP: E53D23001160006).

1. INTRODUCTION

Controlling the collective behaviour of large groups of interacting agents is a central challenge across swarm robotics, traffic management, and synthetic biology (D'Souza et al., 2023). When the number of agents $N$ is large, designing individual control inputs becomes intractable. Mean-field theory overcomes this by reformulating the problem in terms of density functions governed by partial differential equations (PDEs), with complexity independent of $N$ (Fornasier and Solombrino, 2014). The key paradigm is that of multi-scale control: closing the feedback loop between macroscopic observables, such as the spatial density of the agents, and microscopic actuation, as illustrated in Fig. 1.

Over the past few years, our group has developed a systematic framework based on continuification, a three-step pipeline originally proposed in Nikitin et al. (2022) that (i) lifts the multi-agent system to a macroscopic PDE via a mean-field limit, (ii) designs a control law at the density level, and (iii) discretises the result back to agent inputs. We have shown that this pipeline provides a unified approach to two fundamentally different classes of problems:

• Direct control: every agent in a single population receives a control input, and the goal is to shape the collective density towards a desired spatial profile.
• Indirect control: a small number of controllable agents (leaders or herders) must steer a larger population of uncontrolled agents towards a target configuration, through inter-population interactions.

In both cases, continuification provides the common multi-scale backbone: the control goal is formulated at the macroscopic density level, but actuation is exerted at the microscopic agent level (see Fig. 1). The remainder of this paper reviews the main contributions developed in our group along these two directions.

2. THE CONTINUIFICATION PIPELINE

The continuification (or continuation) approach, originally proposed in Nikitin et al. (2022), consists of three steps (Fig. 1).

Step 1 – Continuification. A mean-field limit maps the microscopic agent dynamics, e.g.

$$\dot{x}_i = \sum_{j=1}^{N} f(\{x_i, x_j\}_\pi) + u_i, \quad i = 1, \dots, N, \tag{1}$$

where $f$ is a pairwise interaction kernel and $u_i$ is the velocity control input, to a macroscopic mass conservation PDE

$$\rho_t + [\rho\,(f * \rho + U)]_x = 0, \tag{2}$$

where $\rho(x,t)$ is the agents' density, $U(x,t)$ is the macroscopic control velocity field, and '$*$' denotes convolution.

[Fig. 1. The continuification control pipeline, as introduced in Maffettone et al. (2022, 2023). The three steps apply to both direct and indirect control problems. Panels: Microscopic, $\dot{x}_i = \sum_j f(\{x_i, x_j\}_\pi) + u_i$; Macroscopic, $\rho_t + [\rho\,(f * \rho)]_x = q(x,t)$; Control design, $q = K_p e - [e V_d]_x - [\rho V_e]_x$; Discretisation, $u_i(t) = U(x_i,t)$. The panels are linked by Step 1 (continuification), Step 2 (PDE control) and Step 3 (discretise), with the target density $\rho_d(x,t)$ as input and the agent inputs $u_i$ deployed as output.]

Step 2 – Macroscopic control design. A control input is designed at the PDE level so that $\rho(x,t)$ tends to some desired configuration encoded by the target density $\rho_d(x,t)$.

Step 3 – Discretisation. The macroscopic control action is sampled back to the agent level so that each agent receives $u_i(t) = U(x_i,t)$.
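As a rough illustration of how the micro-to-macro and macro-to-micro interfaces of the pipeline might look in code, the sketch below estimates a density from agent positions on a ring and samples a grid-defined velocity field at those positions (Step 3). It is a minimal sketch under stated assumptions, not the implementation used in the cited papers: the histogram estimator, the uniform grid, the linear interpolation, and the function names `empirical_density`, `sample_periodic` and `discretise` are all hypothetical choices made here for illustration.

```python
import math

def empirical_density(positions, n_cells, length):
    """Histogram estimate of the agents' density on the ring [0, length).

    Assumption: a plain histogram stands in for the density rho(x, t).
    The returned values integrate to 1 (cell width = length / n_cells).
    """
    dx = length / n_cells
    counts = [0] * n_cells
    for x in positions:
        cell = int((x % length) / dx) % n_cells  # wrap onto the ring
        counts[cell] += 1
    return [c / (len(positions) * dx) for c in counts]

def sample_periodic(field, x, length):
    """Linearly interpolate a grid function on the ring at position x.

    field[k] is the value at the grid node k * length / len(field).
    """
    n = len(field)
    s = (x % length) / length * n      # fractional grid index
    k = int(math.floor(s)) % n         # left node (wrapped)
    w = s - math.floor(s)              # interpolation weight
    return (1.0 - w) * field[k] + w * field[(k + 1) % n]

def discretise(field, positions, length):
    """Step 3: each agent receives u_i = U(x_i, t), sampled from the grid."""
    return [sample_periodic(field, x, length) for x in positions]
```

A typical call would first build a velocity field `U` on the grid (Step 2) and then evaluate `discretise(U, positions, length)` at every sampling instant; the interpolation order is a design choice that trades accuracy against cost.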
The critical insight is that this pipeline applies regardless of whether $u_i$ acts on every agent (direct control) or only on a subset of leaders/herders that influence the remaining population through interaction forces (indirect control). In both cases, the control design is performed on a PDE describing the density of the controlled population, and the resulting macroscopic action is discretised to obtain feasible agent-level inputs.

3. DIRECT CONTROL

In the direct control setting, all $N$ agents are actuated and the objective is to steer the population density $\rho$ towards a desired profile $\rho_d$.

In Maffettone et al. (2022), we introduced a continuification-based control law for agents interacting on a ring. The macroscopic control source

$$q = K_p e - [e V_d]_x - [\rho V_e]_x, \quad e = \rho_d - \rho, \tag{3}$$

where $V_d = f * \rho_d$ and $V_e = f * e$, is shown to guarantee global asymptotic convergence to $\rho_d$ via Lyapunov analysis. The macroscopic control velocity $U$ is recovered from $[\rho U]_x = -q$ and discretised to each agent as $u_i = U(x_i,t)$ via spatial collocation.

Since $V_e = f * e$ requires global knowledge of the error field, in Maffettone et al. (2023) we studied agents with finite sensing radius $\Delta$ and proved semiglobal asymptotic stability: for any compact set of initial conditions, choosing $K_p$ sufficiently large ensures convergence. Bounded convergence is also established under spatio-temporal velocity disturbances and interaction kernel perturbations, with the residual steady-state error made arbitrarily small by increasing $K_p$.

A further step towards full scalability was taken in Di Lorenzo et al. (2025b), where we replaced centralised density knowledge with a distributed estimation scheme.
Each agent $i$ maintains a local estimate $\hat{\rho}^{(i)}$ of the global density, constructed from the agents' own positions via kernel density evaluation and updated through a proportional-integral consensus protocol over a communication graph. The local estimates converge to the true density with bounded error over any strongly connected graph, so that each agent can compute its own control input from its local estimate alone, closing the loop across scales without any centralised computation. The decentralised strategy matches the performance of its centralised counterpart while relying only on local communication.

In Maffettone et al. (2024), we validated the continuification pipeline experimentally on a swarm of mobile robots operating in a mixed-reality environment, showing that continuification control is not merely a theoretical construct but can be deployed on hardware. The setup couples a small number of physical differential-drive robots with a larger virtual population, all interacting in real time through a shared spatial domain. The macroscopic density is estimated online from the joint physical-virtual positions, and the continuification control law (3) is discretised at each sampling step. In experiments with up to 100 agents tracking unimodal, bimodal, and time-varying target distributions, the closed-loop density converges reliably to $\rho_d$, with errors consistent with the theoretical predictions of Maffettone et al. (2023).
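To make the per-sampling-step computation more concrete, the following sketch evaluates the control source of Eq. (3) on a uniform grid of the ring and converts it into a velocity field through $[\rho U]_x = -q$. It is only an illustration under stated assumptions: the centred finite differences, the Riemann-sum convolution, the choice of zero flux at the first grid node as integration constant, and the use of `math.sin` as interaction kernel are placeholders chosen here, not details taken from the cited papers.

```python
import math

def periodic_convolution(kernel, g, dx):
    """(kernel * g)(x_k) ~ sum_j kernel(x_k - x_j) g(x_j) dx on a ring grid."""
    n = len(g)
    return [sum(kernel((k - j) * dx) * g[j] for j in range(n)) * dx
            for k in range(n)]

def ddx(g, dx):
    """Centred finite difference with periodic wrap-around."""
    n = len(g)
    return [(g[(k + 1) % n] - g[k - 1]) / (2.0 * dx) for k in range(n)]

def control_source(rho, rho_d, kernel, k_p, dx):
    """Eq. (3): q = K_p e - [e V_d]_x - [rho V_e]_x, with e = rho_d - rho."""
    e = [d - r for d, r in zip(rho_d, rho)]
    v_d = periodic_convolution(kernel, rho_d, dx)   # V_d = f * rho_d
    v_e = periodic_convolution(kernel, e, dx)       # V_e = f * e
    term_d = ddx([ei * vi for ei, vi in zip(e, v_d)], dx)
    term_e = ddx([ri * vi for ri, vi in zip(rho, v_e)], dx)
    return [k_p * ei - a - b for ei, a, b in zip(e, term_d, term_e)]

def velocity_from_source(q, rho, dx):
    """Recover U from [rho U]_x = -q by cumulative integration.

    Assumptions: rho > 0 on the grid, and the flux rho*U is set to zero
    at the first grid node (the integration constant is a free choice).
    """
    flux = 0.0
    velocity = []
    for qk, rk in zip(q, rho):
        velocity.append(flux / rk)
        flux -= qk * dx
    return velocity

# Usage: uniform initial density, single-bump target, hypothetical kernel sin.
n = 64
length = 2.0 * math.pi
dx = length / n
rho = [1.0 / length] * n
rho_d = [(1.0 + 0.5 * math.cos(k * dx)) / length for k in range(n)]
q = control_source(rho, rho_d, math.sin, k_p=2.0, dx=dx)
U = velocity_from_source(q, rho, dx)
```

On a ring the recovered flux is single-valued only if $q$ integrates to zero, which holds here because $\rho$ and $\rho_d$ carry the same mass and the derivative terms cancel over one period.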
The experimental validation confirms that the continuification pipeline is robust to the delays, quantisation errors, and communication imperfections inherent in a physical swarm implementation.

3.1 Optimal transport for density control

When the target density $\rho_d$ varies in time, the continuification control law (3) may produce velocity fields that are neither mass-preserving nor energy-efficient. In Napolitano and di Bernardo (2026), we reformulated the direct control problem through optimal transport (OT) theory. Given a current density $\rho(\cdot,t)$ and a desired density $\rho_d(\cdot,t+\Delta t)$, one solves for the transport map $T$ attaining

$$\inf_{T:\; T_\# \rho = \rho_d} \int \|x - T(x)\|^2 \rho(x)\,dx, \tag{4}$$

where $T_\# \rho$ denotes the push-forward of $\rho$ through $T$. The resulting velocity field $U_{OT}(x,t) = (T(x) - x)/\Delta t$ is mass-preserving by construction and minimises the kinetic energy of the transport. For one-dimensional populations, closed-form solutions via the quantile function are derived, while for higher dimensions entropy-regularised formulations yield smooth, computationally tractable plans. The OT-based approach provides a principled alternative to the Lyapunov-based law (3) when optimality of the transport plan is desired, and naturally interfaces with the continuification pipeline through Step 3 discretisation: each agent receives $u_i = U_{OT}(x_i,t)$.

The OT perspective also opens a geometric viewpoint: the space of density functions, equipped with the Wasserstein-2 metric, becomes a Riemannian manifold on which the controlled density traces a trajectory. This connection suggests a path towards Wasserstein-space Lyapunov analysis and geometric control design that we are actively pursuing.
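For one-dimensional populations on the real line, the quantile-based solution of (4) can be mimicked at the agent level by monotone matching: the $k$-th agent from the left is sent to the $k$-th quantile of the target. The sketch below illustrates this idea only; the midpoint quantile levels $(k + 0.5)/N$, the Gaussian target, and the function name `ot_velocities` are assumptions introduced here and are not taken from Napolitano and di Bernardo (2026).

```python
from statistics import NormalDist

def ot_velocities(positions, target_quantile, dt):
    """Agent-level velocities u_i = (T(x_i) - x_i) / dt for a 1D population.

    T is approximated by monotone (quantile) matching: the agent of rank k
    is mapped to target_quantile((k + 0.5) / N). The midpoint levels are an
    illustrative choice.
    """
    n = len(positions)
    order = sorted(range(n), key=lambda i: positions[i])  # ranks of the agents
    u = [0.0] * n
    for rank, i in enumerate(order):
        destination = target_quantile((rank + 0.5) / n)
        u[i] = (destination - positions[i]) / dt
    return u

# Usage: four agents steered towards a standard Gaussian target in one step.
positions = [0.3, -1.0, 2.0, 0.0]
target = NormalDist(mu=0.0, sigma=1.0)
dt = 0.5
u = ot_velocities(positions, target.inv_cdf, dt)
moved = [x + dt * ui for x, ui in zip(positions, u)]
```

Because the matching is monotone, agents never overtake one another, which is the discrete counterpart of the order-preserving structure of the one-dimensional optimal map.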
4. INDIRECT CONTROL

In indirect control, only a small subset of agents is actuated: these leaders or herders must steer a larger population of uncontrolled agents through inter-population interactions. We have shown that continuification naturally extends to this setting by lifting the coupled multi-population dynamics to a system of PDEs and designing the macroscopic control for the actuated population alone.

4.1 Leader–follower density control

In Maffettone et al. (2025), we formulated the leader–follower density control problem. Controllable leaders (density $\rho_L$) steer uncontrolled followers (density $\rho_F$) through a repulsive interaction kernel $f$. The coupled macroscopic model reads

$$\partial_t \rho_L + [\rho_L u]_x = 0, \tag{5}$$

$$\partial_t \rho_F + [\rho_F\,(f * \rho_L)]_x = D \rho_{F,xx}, \tag{6}$$

where $u(x,t)$ is the leaders' velocity field to be designed and $D > 0$ models stochastic follower behaviour. The goal is to find $u$ such that $\rho_F(x,t) \to \bar{\rho}_F(x)$.

We derived feasibility conditions on the minimum leader mass as a function of the desired follower profile, diffusion, and kernel parameters. A feedforward scheme achieves exponential leader convergence, with global follower convergence proven via the Poincaré–Wirtinger inequality. A reference-governor (RG) dual-feedback scheme adapts the leaders' reference as $\hat{\rho}_L = \bar{\rho}_L + \alpha(t) W$ based on the follower tracking error, with $\alpha(t) \in [0,1]$ ensuring positivity and mass conservation. The RG scheme reduces steady-state errors by up to 90% compared to feedforward under disturbances. Crucially, the leaders' control $u$ is ultimately discretised via Step 3 of the continuification pipeline: each leader agent receives $u_j = u(x_j,t)$.

4.2 Bio-inspired plasticity and heterogeneous populations

In Maffettone et al.
(2026a), we extended the leader–follower framework with bio-inspired plasticity: the coupling adapts online, mimicking biological swarms where interaction strengths evolve with experience. The resulting adaptive architecture provides improved robustness to modelling errors and time-varying environments.

In Maffettone et al. (2026b), we further generalised the framework to heterogeneous populations with varying dynamics and interaction kernels. The key challenge is that density-level control design assumes a homogeneous population, while real swarms exhibit inter-agent variability in speed, sensing radius, and interaction strength. We modelled this variability as a matched perturbation of the nominal macroscopic dynamics and designed a robust control law that guarantees ultimate boundedness of the density tracking error. The error bound is explicit in the heterogeneity level and can be made arbitrarily small by increasing the control gain, at the cost of higher actuation effort. The robust formulation also applies to scenarios with unknown bounded disturbances, for instance unmodelled environmental forces, extending the framework's applicability to real-world conditions where precise model knowledge is unavailable.

4.3 Shepherding and nonreciprocal field theory

Shepherding in swarm robotics represents an extreme form of indirect control: $M$ herders must drive $N_T \gg M$ target agents toward a goal region. In Lama and di Bernardo (2024), we established scaling laws for herdability: the minimum number of herders satisfies $M^* \sim N_T^{\alpha}$ with $\alpha < 1$, confirming that sublinear sparse actuation suffices. A critical density threshold below which herdability degrades was identified and explained through percolation of a herdability graph.

In Lama et al. (2025), we developed a nonreciprocal field theory for the shepherding problem.
This is the first step required to apply a continuification-based approach to the problem. At the microscopic level, each herder $i$ selects a target via a soft-max rule that approximates the choice of the furthest target from the goal within its sensing radius $\xi$:

$$T_i^* = \frac{\sum_{a \in \mathcal{N}_{i,\xi}} e^{\gamma(|T_a| - |H_i|)}\, T_a}{\sum_{a \in \mathcal{N}_{i,\xi}} e^{\gamma(|T_a| - |H_i|)}}, \tag{7}$$

where $\gamma \geq 0$ controls the selection specificity (from centre-of-mass averaging at $\gamma = 0$ to furthest-target selection as $\gamma \to \infty$). The herder then positions itself behind the selected target via the feedback control input

$$u_i = -\left(H_i - T_i^* - \delta\, \hat{T}_i^*\right), \tag{8}$$

where $\delta \geq 0$ controls the goal-directedness of the trajectory planning and $\hat{T}_i^* = T_i^* / |T_i^*|$. The two decision-making parameters $\gamma$ (target selection) and $\delta$ (trajectory planning) introduce nonreciprocal and three-body couplings that have no counterpart in the targets' dynamics.

By deriving a mean-field limit, we obtained a novel set of coupled field equations for the target density $\rho_T$ and herder density $\rho_H$:

$$\partial_t \rho_T = \nabla \cdot \left[ D_T(\rho_T) \nabla \rho_T + \tilde{k}_T\, \rho_T \nabla \rho_H \right], \tag{9}$$

$$\partial_t \rho_H = \nabla \cdot \left[ D_H(\rho_H) \nabla \rho_H - v_1(x)\, \rho_H \rho_T - v_2(x)\, \rho_H \nabla \rho_T \right], \tag{10}$$

where $D_A(\rho_A)$ are renormalised diffusivities arising from noise and short-range repulsion. The coupling functions $v_1(x)$ and $v_2(x)$ encode the decision-making: $v_2$ maintains a constant sign reflecting consistent herder-target attraction, while $v_1$ changes sign depending on the herder position relative to the goal, encoding goal-directed shepherding. The bilinear nonreciprocal term $v_1(x)\, \rho_H \rho_T$ is absent in standard nonreciprocal field theories and arises uniquely from the control-oriented, three-body character of the decision-making rules (7)–(8).
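The microscopic decision rules (7)–(8) can be sketched in a few lines. The code below is an illustration under stated assumptions rather than the authors' implementation: positions are planar and measured from the goal, which is placed at the origin so that $|T_a|$ is the distance of target $a$ from the goal; the neighbourhood $\mathcal{N}_{i,\xi}$ is taken to be the set of targets within Euclidean distance $\xi$ of the herder; and the function names are hypothetical.

```python
import math

def select_target(herder, targets, gamma, xi):
    """Soft-max target selection of Eq. (7), with the goal at the origin.

    Returns the weighted target position T*, or None when no target lies
    within the sensing radius xi of the herder.
    """
    hx, hy = herder
    near = [(tx, ty) for tx, ty in targets
            if math.hypot(tx - hx, ty - hy) <= xi]
    if not near:
        return None
    dists = [math.hypot(tx, ty) for tx, ty in near]
    d_max = max(dists)
    # The common factor exp(-gamma * |H_i|) cancels in the ratio of Eq. (7);
    # shifting by d_max instead keeps the exponentials from overflowing.
    weights = [math.exp(gamma * (d - d_max)) for d in dists]
    total = sum(weights)
    sel_x = sum(w * tx for w, (tx, _) in zip(weights, near)) / total
    sel_y = sum(w * ty for w, (_, ty) in zip(weights, near)) / total
    return (sel_x, sel_y)

def herder_input(herder, selected, delta):
    """Feedback input of Eq. (8): u = -(H - T* - delta * T*/|T*|)."""
    tx, ty = selected
    norm = math.hypot(tx, ty)
    # If T* sits exactly on the goal the direction is undefined; use zero.
    dir_x, dir_y = (tx / norm, ty / norm) if norm > 0.0 else (0.0, 0.0)
    return (-(herder[0] - tx - delta * dir_x),
            -(herder[1] - ty - delta * dir_y))
```

With `gamma = 0` the selection reduces to the centroid of the sensed targets, and for large `gamma` it approaches the sensed target furthest from the goal; the input vanishes when the herder sits at distance `delta` behind the selected point along the ray from the goal.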
The mean-field analysis reveals that decision-making is itself a fundamental source of nonreciprocity, driving phase transitions between homogeneous and confined target configurations.

In Di Lorenzo et al. (2025a), we applied the continuification pipeline directly to the shepherding problem: the target population is described by a conservation PDE, a macroscopic herding strategy is designed at the density level, and the resulting control is discretised to the herder agents. This enables regulation of arbitrarily large target populations with few herders. In Napolitano et al. (2025), we proposed a hierarchical learning-based architecture that combines high-level planning with low-level learned policies for robust shepherding of stochastic agents under uncertainty and partial observability.

4.4 Population-level safety

In realistic deployment scenarios, multi-agent systems must satisfy safety constraints (collision avoidance with obstacles, confinement to admissible regions, and inter-agent spacing guarantees) while pursuing their density control objective. A natural approach within the multi-scale framework is to lift the Control Barrier Function (CBF) methodology to the macroscopic level, defining barrier functionals directly on the population density rather than on individual agent states. The resulting mean-field CBFs (MF-CBFs) impose safety constraints on the macroscopic PDE via a minimally invasive safety filter based on quadratic programming (QP), independent of the number of agents. We are currently developing this framework for both direct and indirect control, including shepherding in cluttered environments. In a complementary direction, in Punzo et al. (2025) we showed that a driving policy trained via deep RL in a minimal single-obstacle scenario transfers directly to complex multi-agent settings, producing collision-free trajectories without retraining.
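The notion of a "minimally invasive" QP-based safety filter can be illustrated in its simplest generic form. The sketch below is not the MF-CBF formulation, which the text describes as under development; it is the standard finite-dimensional CBF-style filter with a single affine constraint $a^\top u \geq b$, for which the QP $\min_u \|u - u_{\mathrm{nom}}\|^2$ has a closed-form solution. The vectors `a`, the bound `b`, and the function name `safety_filter` are hypothetical stand-ins for whatever constraint a barrier condition would generate.

```python
def safety_filter(u_nom, a, b):
    """Closed-form solution of  min ||u - u_nom||^2  s.t.  a . u >= b.

    Generic single-constraint safety filter: the nominal input is returned
    unchanged when it already satisfies the constraint, otherwise it is
    projected onto the constraint boundary.
    """
    slack = b - sum(ai * ui for ai, ui in zip(a, u_nom))
    norm2 = sum(ai * ai for ai in a)
    if slack <= 0.0:
        return list(u_nom)       # constraint already satisfied
    if norm2 == 0.0:
        return list(u_nom)       # a = 0 and b > 0: infeasible, nothing to do
    scale = slack / norm2
    return [ui + scale * ai for ui, ai in zip(u_nom, a)]

# Usage: the nominal input violates 3*u0 + 4*u1 >= 5 and is minimally corrected.
u_safe = safety_filter([0.0, 0.0], [3.0, 4.0], 5.0)
```

The correction is parallel to `a` and vanishes whenever the nominal input is already safe, which is the sense in which such filters modify the nominal controller as little as possible.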
5. CONCLUSIONS AND OPEN CHALLENGES

The body of work reviewed in this paper demonstrates that multi-scale control, the systematic bridging of macroscopic density descriptions and microscopic agent actuation, provides a versatile and rigorous framework for controlling large agent populations. The continuification pipeline, optimal transport, nonreciprocal field theory, mean-field control barrier functions, and hierarchical reinforcement learning each address different aspects of the problem, and their combination within the macro-to-micro and micro-to-macro architectures yields a control toolbox of considerable breadth.

Key open challenges include tightening the formal closed-loop guarantees when estimation and control interact across scales, extending the safety framework to richer classes of constraints and system dynamics, and understanding when analytical, learning-based, and transport-theoretic methods should be combined rather than used in isolation. On the application side, the generality of the multi-scale paradigm invites validation beyond swarm robotics, in domains where large interacting populations and sparse actuation are the norm.

ACKNOWLEDGEMENTS

The author thanks all the co-authors of the papers presented here for their contributions. During the preparation of this work the author used AI tools to assist with manuscript formatting and revision.

REFERENCES

B. Di Lorenzo, G.C. Maffettone, and M. di Bernardo. A continuification-based control solution for large-scale shepherding. European Journal of Control, 86(A):101324, 2025a.
B. Di Lorenzo, G.C. Maffettone, and M. di Bernardo. Decentralized continuification control of multi-agent systems via distributed density estimation. IEEE Control Systems Letters, 9:1580–1585, 2025b.
R.M. D'Souza, M. di Bernardo, and Y.-Y. Liu. Controlling complex networks with complex nodes.
Nature Reviews Physics, 5:250–262, 2023.
M. Fornasier and F. Solombrino. Mean-field optimal control. ESAIM: COCV, 20(4):1123–1152, 2014.
A. Lama and M. di Bernardo. Shepherding and herdability in complex multiagent systems. Physical Review Research, 6(3):L032012, 2024.
A. Lama, M. di Bernardo, and S.H.L. Klapp. Nonreciprocal field theory for decision-making in multi-agent control systems. Nature Communications, 16:8450, 2025.
G.C. Maffettone, A. Boldini, M. di Bernardo, and M. Porfiri. Continuification control of large-scale multiagent systems in a ring. IEEE Control Systems Letters, 7:841–846, 2022.
G.C. Maffettone, M. Porfiri, and M. di Bernardo. Continuification control of large-scale multiagent systems under limited sensing and structural perturbations. IEEE Control Systems Letters, 7:2425–2430, 2023.
G.C. Maffettone, L. Liguori, E. Palermo, M. di Bernardo, and M. Porfiri. Mixed reality environment and high-dimensional continuification control for swarm robotics. IEEE Transactions on Control Systems Technology, 32(6):2484–2491, 2024.
G.C. Maffettone, A. Boldini, M. Porfiri, and M. di Bernardo. Leader–follower density control of spatial dynamics in large-scale multi-agent systems. IEEE Transactions on Automatic Control, 70(10):6783–6798, 2025.
G.C. Maffettone, A. Boldini, M. di Bernardo, and M. Porfiri. Bio-inspired density control of multi-agent swarms via leader–follower plasticity. Automatica, accepted for publication, 2026a.
G.C. Maffettone, D. Salzano, and M. di Bernardo. Robust macroscopic density control of heterogeneous multi-agent systems. IEEE Transactions on Automatic Control, submitted, 2026b.
I. Napolitano, S. Covone, A. Lama, F. De Lellis, and M. di Bernardo. Hierarchical learning-based control for multi-agent shepherding of stochastic autonomous agents.
IEEE Transactions on Control Systems Technology, submitted, 2025.
I. Napolitano and M. di Bernardo. Optimal transport for time-varying multi-agent coverage control. Automatica, submitted, 2026.
D. Nikitin, C. Canudas-de-Wit, and P. Frasca. A continuation method for large-scale modeling and control: from ODEs to PDE, a round trip. IEEE Transactions on Automatic Control, 67(10):5118–5133, 2022.
C. Punzo, I. Napolitano, C. Tomaselli, and M. di Bernardo. Decentralized shepherding of non-cohesive swarms through cluttered environments via deep reinforcement learning. arXiv preprint arXiv:2511.21405, 2025.