Multi-Scale Control of Large Agent Populations: From Density Dynamics to Individual Actuation

Mario di Bernardo

Department of Electrical Engineering and Information Technology, University of Naples Federico II, and Scuola Superiore Meridionale, Naples, Italy (e-mail: mario.dibernardo@unina.it)

Abstract: We review a body of recent work by the author and collaborators on controlling the spatial organisation of large agent populations across multiple scales. A central theme is the systematic bridging of microscopic agent-level dynamics and macroscopic density descriptions, enabling control design at the most natural level of abstraction and subsequent translation across scales. We show how this multi-scale perspective provides a unified approach to both direct control, where every agent is actuated, and indirect control, where few leaders or herders steer a larger uncontrolled population. The review covers continuification-based control with robustness under limited sensing and decentralised implementation via distributed density estimation; leader–follower density regulation with dual-feedback stability guarantees and bio-inspired plasticity; optimal-transport methods for coverage control and macro-to-micro discretisation; nonreciprocal field theory for collective decision-making; mean-field control barrier functions for population-level safety; and hierarchical reinforcement learning for settings where closed-form solutions are intractable. Together, these results demonstrate the breadth and versatility of a multi-scale control framework that integrates analytical methods, learning, and physics-inspired approaches for large agent populations.
Keywords: Multi-agent systems; continuification; density control; mean-field control; shepherding; leader–follower systems; large-scale systems; PDE control; optimal transport; control barrier functions; reinforcement learning.

This work has been partially supported by the European Union (EU HORIZON SHARESPACE GA 101092889) and by the MUR PRIN project MENTOR (CUP: E53D23001160006).

1. INTRODUCTION

Controlling the collective behaviour of large groups of interacting agents is a central challenge across swarm robotics, traffic management, and synthetic biology (D'Souza et al., 2023). When the number of agents $N$ is large, designing individual control inputs becomes intractable. Mean-field theory overcomes this by reformulating the problem in terms of density functions governed by partial differential equations (PDEs), with complexity independent of $N$ (Fornasier and Solombrino, 2014). The key paradigm is that of multi-scale control: closing the feedback loop between macroscopic observables, such as the spatial density of the agents, and microscopic actuation, as illustrated in Fig. 1.

Over the past few years, our group has developed a systematic framework based on continuification, a three-step pipeline originally proposed in Nikitin et al. (2022) that (i) lifts the multi-agent system to a macroscopic PDE via a mean-field limit, (ii) designs a control law at the density level, and (iii) discretises the result back to agent inputs. We have shown that this pipeline provides a unified approach to two fundamentally different classes of problems:

• Direct control: every agent in a single population receives a control input, and the goal is to shape the collective density towards a desired spatial profile.
• Indirect control: a small number of controllable agents (leaders or herders) must steer a larger population of uncontrolled agents towards a target configuration, through inter-population interactions.

In both cases, continuification provides the common multi-scale backbone: the control goal is formulated at the macroscopic density level, but actuation is exerted at the microscopic agent level (see Fig. 1). The remainder of this paper reviews the main contributions developed in our group along these two directions.

2. THE CONTINUIFICATION PIPELINE

The continuification (or continuation) approach, originally proposed in Nikitin et al. (2022), consists of three steps (Fig. 1).

[Fig. 1. The continuification control pipeline, as introduced in Maffettone et al. (2022, 2023). The three steps apply to both direct and indirect control problems. Blocks: Microscopic, $\dot{x}_i = \sum_j f(\{x_i, x_j\}_\pi) + u_i$; Macroscopic (after Step 1, continuification), $\rho_t + [\rho (f * \rho)]_x = q(x, t)$; Control design (Step 2, PDE control, with target $\rho_d(x, t)$), $q = K_p e - [e V_d]_x - [\rho V_e]_x$; Discretisation (Step 3, discretise and deploy $u_i$), $u_i(t) = U(x_i, t)$.]

Step 1 – Continuification. A mean-field limit maps the microscopic agent dynamics, e.g.

\[ \dot{x}_i = \sum_{j=1}^{N} f(\{x_i, x_j\}_\pi) + u_i, \quad i = 1, \dots, N, \tag{1} \]

where $f$ is a pairwise interaction kernel and $u_i$ is the velocity control input, to a macroscopic mass conservation PDE

\[ \rho_t + [\rho (f * \rho + U)]_x = 0, \tag{2} \]

where $\rho(x, t)$ is the agents' density, $U(x, t)$ is the macroscopic control velocity field, and '$*$' denotes convolution.

Step 2 – Macroscopic control design. A control input is designed at the PDE level so that $\rho(x, t)$ tends to some desired configuration encoded by the target density $\rho_d(x, t)$.

Step 3 – Discretisation. The macroscopic control action is sampled back to the agent level so that each agent receives $u_i(t) = U(x_i, t)$.
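Step 3 can be illustrated with a minimal sketch, assuming a one-dimensional periodic domain $[-\pi, \pi)$ and a velocity field that is already available on a uniform grid. The function name `discretise_control`, the grid size, and the placeholder field $U = \sin x$ are illustrative assumptions, not taken from the cited works; linear interpolation stands in for whatever sampling scheme an actual implementation would use.

```python
import numpy as np


def discretise_control(agent_positions, grid, U_grid, period=2 * np.pi):
    """Step 3 (sketch): sample a gridded velocity field at the agents' positions.

    Uses periodic linear interpolation, so positions near +pi wrap to -pi.
    """
    return np.interp(agent_positions, grid, U_grid, period=period)


# Uniform grid on the ring [-pi, pi); the endpoint is omitted because it
# coincides with -pi on a periodic domain.
grid = np.linspace(-np.pi, np.pi, 256, endpoint=False)

# Placeholder macroscopic velocity field (assumption, for illustration only).
U_grid = np.sin(grid)

# Hypothetical agent positions, including one close to the wrap-around point.
x_agents = np.array([-3.0, 0.0, 1.0, 3.13])

# Each agent receives u_i = U(x_i, t).
u = discretise_control(x_agents, grid, U_grid)
```

In this sketch the cost of Step 3 grows linearly with the number of agents, while the grid on which $U$ is designed does not depend on $N$.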
The critical insight is that this pipeline applies regardless of whether $u_i$ acts on every agent (direct control) or only on a subset of leaders/herders that influence the remaining population through interaction forces (indirect control). In both cases, the control design is performed on a PDE describing the density of the controlled population, and the resulting macroscopic action is discretised to obtain feasible agent-level inputs.

3. DIRECT CONTROL

In the direct control setting, all $N$ agents are actuated and the objective is to steer the population density $\rho$ towards a desired profile $\rho_d$.

In Maffettone et al. (2022), we introduced a continuification-based control law for agents interacting on a ring. The macroscopic control source

\[ q = K_p e - [e V_d]_x - [\rho V_e]_x, \quad e = \rho_d - \rho, \tag{3} \]

where $K_p > 0$ is the control gain, $V_d = f * \rho_d$ and $V_e = f * e$, is shown to guarantee global asymptotic convergence to $\rho_d$ via Lyapunov analysis. The macroscopic control velocity $U$ is recovered from $[\rho U]_x = -q$ and discretised to each agent as $u_i = U(x_i, t)$ via spatial collocation.

Since $V_e = f * e$ requires global knowledge of the error field, in Maffettone et al. (2023) we studied agents with finite sensing radius $\Delta$ and proved semiglobal asymptotic stability: for any compact set of initial conditions, choosing $K_p$ sufficiently large ensures convergence. Bounded convergence is also established under spatio-temporal velocity disturbances and interaction kernel perturbations, with the residual steady-state error made arbitrarily small by increasing $K_p$.

A further step towards full scalability was taken in Di Lorenzo et al. (2025b), where we replaced centralised density knowledge with a distributed estimation scheme.
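Before turning to the details of the distributed scheme, the centralised law (3) can be made concrete with a short sketch that evaluates $q$ on a uniform grid of the ring and recovers $U$ from $[\rho U]_x = -q$. This is an illustration under stated assumptions, not the implementation used in the cited papers: the kernel `kernel` and its length scale, the central-difference derivative, the example densities, and the choice of starting the cumulative integral at $x = -\pi$ (which fixes the integration constant) are all hypothetical.

```python
import numpy as np


def wrap(z):
    """Wrap a displacement to the interval [-pi, pi)."""
    return (z + np.pi) % (2 * np.pi) - np.pi


def kernel(z, length=0.5):
    """Hypothetical odd, repulsive interaction kernel (an assumption)."""
    return np.sign(z) * np.exp(-np.abs(z) / length)


def periodic_conv(f_vals, g, h):
    """Circular convolution (f * g) on the ring, computed via FFT."""
    return h * np.real(np.fft.ifft(np.fft.fft(f_vals) * np.fft.fft(g)))


def ddx(a, h):
    """Periodic central-difference approximation of the x-derivative."""
    return (np.roll(a, -1) - np.roll(a, 1)) / (2 * h)


def control_source(rho, rho_d, f_vals, h, Kp):
    """Control source of eq. (3): q = Kp*e - [e*V_d]_x - [rho*V_e]_x."""
    e = rho_d - rho
    V_d = periodic_conv(f_vals, rho_d, h)
    V_e = periodic_conv(f_vals, e, h)
    return Kp * e - ddx(e * V_d, h) - ddx(rho * V_e, h)


def velocity_from_source(q, rho, h):
    """Recover U from [rho*U]_x = -q, assuming rho > 0 on the whole grid."""
    flux = -np.cumsum(q) * h  # rho*U, integrated from x = -pi
    return flux / rho


n = 256
h = 2 * np.pi / n
x = -np.pi + h * np.arange(n)
f_vals = kernel(wrap(h * np.arange(n)))  # kernel sampled at wrapped offsets

rho = np.full(n, 1 / (2 * np.pi))        # uniform current density
rho_d = np.exp(2 * np.cos(x))            # example unimodal target
rho_d /= rho_d.sum() * h                 # normalise to unit mass

q = control_source(rho, rho_d, f_vals, h, Kp=10.0)
U = velocity_from_source(q, rho, h)
```

Because $\rho$ and $\rho_d$ carry the same mass, $q$ integrates to zero over the ring, so the recovered flux $\rho U$ is itself periodic; the resulting $U$ can then be sampled at the agents' positions as in Step 3.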
Each agent $i$ maintains a local estimate $\hat{\rho}^{(i)}$ of the global density, constructed from the agents' own positions via kernel density evaluation and updated through a proportional-integral consensus protocol over a communication graph. The local estimates converge to the true density with bounded error over any strongly connected graph, so that each agent can compute its own control input from its local estimate alone, closing the loop across scales without any centralised computation. The decentralised strategy matches the performance of its centralised counterpart while relying only on local communication.

In Maffettone et al. (2024), we validated the continuification pipeline experimentally on a swarm of mobile robots operating in a mixed-reality environment, showing that continuification control is not merely a theoretical construct but a deployable technology. The mixed-reality setup couples a small number of physical differential-drive robots with a larger virtual population, all interacting in real time through a shared spatial domain. The macroscopic density is estimated online from the joint physical-virtual positions, and the continuification control law (3) is discretised at each sampling step. In experiments with up to 100 agents tracking unimodal, bimodal, and time-varying target distributions, the closed-loop density converges reliably to $\rho_d$, with errors consistent with the theoretical predictions of Maffettone et al. (2023).
This validation confirms that the continuification pipeline is robust to the delays, quantisation errors, and communication imperfections inherent in a physical swarm implementation.

3.1 Optimal transport for density control

When the target density $\rho_d$ varies in time, the continuification control law (3) may produce velocity fields that are neither mass-preserving nor energy-efficient. In Napolitano and di Bernardo (2026), we reformulated the direct control problem through optimal transport (OT) theory. Given a current density $\rho(\cdot, t)$ and a desired density $\rho_d(\cdot, t + \Delta t)$, one solves for the transport map $T$ attaining

\[ \inf_{T :\, T_\# \rho = \rho_d} \int \| x - T(x) \|^2 \rho(x) \, dx, \tag{4} \]

where $T_\# \rho$ denotes the push-forward of $\rho$ through $T$. The resulting velocity field $U_{OT}(x, t) = (T(x) - x) / \Delta t$ is mass-preserving by construction and minimises the kinetic energy of the transport. For one-dimensional populations, closed-form solutions via the quantile function are derived, while for higher dimensions entropy-regularised formulations yield smooth, computationally tractable plans. The OT-based approach provides a principled alternative to the Lyapunov-based law (3) when optimality of the transport plan is desired, and naturally interfaces with the continuification pipeline through Step 3 discretisation: each agent receives $u_i = U_{OT}(x_i, t)$.

The OT perspective also opens a geometric viewpoint: the space of density functions, equipped with the Wasserstein-2 metric, becomes a Riemannian manifold on which the controlled density traces a trajectory. This connection suggests a path towards Wasserstein-space Lyapunov analysis and geometric control design that we are actively pursuing.

4. INDIRECT CONTROL

In indirect control, only a small subset of agents is actuated: these leaders or herders must steer a larger population of uncontrolled agents through inter-population interactions. We have shown that continuification naturally extends to this setting by lifting the coupled multi-population dynamics to a system of PDEs and designing the macroscopic control for the actuated population alone.

4.1 Leader–follower density control

In Maffettone et al. (2025), we formulated the leader–follower density control problem. Controllable leaders (density $\rho_L$) steer uncontrolled followers (density $\rho_F$) through a repulsive interaction kernel $f$. The coupled macroscopic model reads

\[ \partial_t \rho_L + [\rho_L u]_x = 0, \tag{5} \]

\[ \partial_t \rho_F + [\rho_F (f * \rho_L)]_x = D \rho_{F,xx}, \tag{6} \]

where $u(x, t)$ is the leaders' velocity field to be designed and $D > 0$ models stochastic follower behaviour. The goal is to find $u$ such that $\rho_F(x, t) \to \bar{\rho}_F(x)$.

We derived feasibility conditions on the minimum leader mass as a function of the desired follower profile, diffusion, and kernel parameters. A feedforward scheme achieves exponential leader convergence, with global follower convergence proven via the Poincaré–Wirtinger inequality. A reference-governor (RG) dual-feedback scheme adapts the leaders' reference as $\hat{\rho}_L = \bar{\rho}_L + \alpha(t) W$ based on the follower tracking error, with $\alpha(t) \in [0, 1]$ ensuring positivity and mass conservation. The RG scheme reduces steady-state errors by up to 90% compared to feedforward under disturbances. Crucially, the leaders' control $u$ is ultimately discretised via Step 3 of the continuification pipeline: each leader agent receives $u_j = u(x_j, t)$.

4.2 Bio-inspired plasticity and heterogeneous populations

In Maffettone et al. (2026a), we extended the leader–follower framework with bio-inspired plasticity: the coupling adapts online, mimicking biological swarms where interaction strengths evolve with experience. The resulting adaptive architecture provides improved robustness to modelling errors and time-varying environments.

In Maffettone et al. (2026b), we further generalised the framework to heterogeneous populations with varying dynamics and interaction kernels. The key challenge is that density-level control design assumes a homogeneous population, while real swarms exhibit inter-agent variability in speed, sensing radius, and interaction strength. We modelled this variability as a matched perturbation of the nominal macroscopic dynamics and designed a robust control law that guarantees ultimate boundedness of the density tracking error. The error bound is explicit in the heterogeneity level and can be made arbitrarily small by increasing the control gain, at the cost of higher actuation effort. The robust formulation also applies to scenarios with unknown bounded disturbances, for instance unmodelled environmental forces, extending the framework's applicability to real-world conditions where precise model knowledge is unavailable.

4.3 Shepherding and nonreciprocal field theory

Shepherding in swarm robotics represents an extreme form of indirect control: $M$ herders must drive $N_T \gg M$ target agents toward a goal region. In Lama and di Bernardo (2024), we established scaling laws for herdability: the minimum number of herders satisfies $M^* \sim N_T^{\alpha}$ with $\alpha < 1$, confirming that sublinear sparse actuation suffices. A critical density threshold below which herdability degrades was identified and explained through percolation of a herdability graph.

In Lama et al. (2025), we developed a nonreciprocal field theory for the shepherding problem.
This is the first step required to apply a continuification-based approach to the problem. At the microscopic level, each herder $i$ selects a target via a soft-max rule that approximates the choice of the furthest target from the goal within its sensing radius $\xi$:

\[ T_i^* = \frac{\sum_{a \in \mathcal{N}_{i,\xi}} e^{\gamma (|T_a| - |H_i|)} \, T_a}{\sum_{a \in \mathcal{N}_{i,\xi}} e^{\gamma (|T_a| - |H_i|)}}, \tag{7} \]

where $H_i$ and $T_a$ denote the positions of herder $i$ and target $a$, $\mathcal{N}_{i,\xi}$ is the set of targets within the sensing radius of herder $i$, and $\gamma \geq 0$ controls the selection specificity (from centre-of-mass averaging at $\gamma = 0$ to furthest-target selection as $\gamma \to \infty$). The herder then positions itself behind the selected target via the feedback control input

\[ u_i = -\left( H_i - T_i^* - \delta \, \widehat{T}_i^* \right), \tag{8} \]

where $\delta \geq 0$ controls the goal-directedness of the trajectory planning and $\widehat{T}_i^* = T_i^* / |T_i^*|$. The two decision-making parameters $\gamma$ (target selection) and $\delta$ (trajectory planning) introduce nonreciprocal and three-body couplings that have no counterpart in the targets' dynamics.

By deriving a mean-field limit, we obtained a novel set of coupled field equations for the target density $\rho_T$ and herder density $\rho_H$:

\[ \partial_t \rho_T = \nabla \cdot \left[ D_T(\rho_T) \nabla \rho_T + \tilde{k}_T \, \rho_T \nabla \rho_H \right], \tag{9} \]

\[ \partial_t \rho_H = \nabla \cdot \left[ D_H(\rho_H) \nabla \rho_H - v_1(x) \, \rho_H \rho_T - v_2(x) \, \rho_H \nabla \rho_T \right], \tag{10} \]

where $D_A(\rho_A)$ are renormalised diffusivities arising from noise and short-range repulsion. The coupling functions $v_1(x)$ and $v_2(x)$ encode the decision-making: $v_2$ maintains a constant sign, reflecting consistent herder-target attraction, while $v_1$ changes sign depending on the herder position relative to the goal, encoding goal-directed shepherding. The bilinear nonreciprocal term $v_1(x) \rho_H \rho_T$ is absent in standard nonreciprocal field theories and arises uniquely from the control-oriented, three-body character of the decision-making rules (7)–(8).
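The microscopic decision rules (7)–(8) can be sketched in a few lines. This is a hedged illustration rather than the simulation code of the cited work: it assumes planar positions measured from a goal placed at the origin (so that $|T_a|$ is a target's distance from the goal), and the function names `select_target` and `herder_input`, the example coordinates, and the behaviour when no target is in range (returning `None`) are hypothetical choices.

```python
import numpy as np


def select_target(H_i, targets, xi, gamma):
    """Soft-max target selection of eq. (7) for one herder (sketch).

    H_i: herder position, shape (2,). targets: target positions, shape (n, 2).
    Returns the weighted target position, or None if no target lies within
    the sensing radius xi (a hypothetical fallback, not from the source).
    """
    in_range = np.linalg.norm(targets - H_i, axis=1) <= xi
    nearby = targets[in_range]
    if nearby.shape[0] == 0:
        return None
    exponents = gamma * (np.linalg.norm(nearby, axis=1) - np.linalg.norm(H_i))
    # Subtracting the maximum avoids overflow; a common shift of the
    # exponents cancels between numerator and denominator of eq. (7).
    weights = np.exp(exponents - exponents.max())
    return weights @ nearby / weights.sum()


def herder_input(H_i, T_star, delta):
    """Feedback law of eq. (8): steer towards a point behind the target.

    Assumes T_star is not at the goal, so that its unit vector is defined.
    """
    T_hat = T_star / np.linalg.norm(T_star)
    return -(H_i - T_star - delta * T_hat)


# Hypothetical configuration: one herder, two targets in range, one out of range.
H = np.array([0.0, 2.0])
targets = np.array([[0.0, 1.0], [0.0, 3.0], [5.0, 5.0]])

T_avg = select_target(H, targets, xi=1.5, gamma=0.0)   # centre-of-mass limit
T_far = select_target(H, targets, xi=1.5, gamma=50.0)  # near furthest-target limit
u = herder_input(H, T_far, delta=0.5)
```

With these inputs, $\gamma = 0$ averages the two in-range targets, while a large $\gamma$ effectively selects the one furthest from the goal; the input $u$ then points to a location a distance $\delta$ beyond that target on the side opposite the goal.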
The mean-field analysis reveals that decision-making is itself a fundamental source of nonreciprocity, driving phase transitions between homogeneous and confined target configurations.

In Di Lorenzo et al. (2025a), we applied the continuification pipeline directly to the shepherding problem: the target population is described by a conservation PDE, a macroscopic herding strategy is designed at the density level, and the resulting control is discretised to the herder agents. This enables regulation of arbitrarily large target populations with few herders. In Napolitano et al. (2025), we proposed a hierarchical learning-based architecture that combines high-level planning with low-level learned policies for robust shepherding of stochastic agents under uncertainty and partial observability.

4.4 Population-level safety

In realistic deployment scenarios, multi-agent systems must satisfy safety constraints (collision avoidance with obstacles, confinement to admissible regions, and inter-agent spacing guarantees) while pursuing their density control objective. A natural approach within the multi-scale framework is to lift the Control Barrier Function (CBF) methodology to the macroscopic level, defining barrier functionals directly on the population density rather than on individual agent states. The resulting mean-field CBFs (MF-CBFs) impose safety constraints on the macroscopic PDE via a minimally invasive safety filter based on quadratic programming (QP), independent of the number of agents. We are currently developing this framework for both direct and indirect control, including shepherding in cluttered environments. In a complementary direction, in Punzo et al. (2025) we showed that a driving policy trained via deep reinforcement learning (RL) in a minimal single-obstacle scenario transfers directly to complex multi-agent settings, producing collision-free trajectories without retraining.

5. CONCLUSIONS AND OPEN CHALLENGES

The body of work reviewed in this paper demonstrates that multi-scale control, the systematic bridging of macroscopic density descriptions and microscopic agent actuation, provides a versatile and rigorous framework for controlling large agent populations. The continuification pipeline, optimal transport, nonreciprocal field theory, mean-field control barrier functions, and hierarchical reinforcement learning each address different aspects of the problem, and their combination within the macro-to-micro and micro-to-macro architectures yields a control toolbox of considerable breadth.

Key open challenges include tightening the formal closed-loop guarantees when estimation and control interact across scales, extending the safety framework to richer classes of constraints and system dynamics, and understanding when analytical, learning-based, and transport-theoretic methods should be combined rather than used in isolation. On the application side, the generality of the multi-scale paradigm invites validation beyond swarm robotics, in domains where large interacting populations and sparse actuation are the norm.

ACKNOWLEDGEMENTS

The author thanks all the co-authors of the papers presented here for their contributions. During the preparation of this work the author used AI tools to assist with manuscript formatting and revision.

REFERENCES

B. Di Lorenzo, G.C. Maffettone, and M. di Bernardo. A continuification-based control solution for large-scale shepherding. European Journal of Control, 86(A):101324, 2025a.
B. Di Lorenzo, G.C. Maffettone, and M. di Bernardo. Decentralized continuification control of multi-agent systems via distributed density estimation. IEEE Control Systems Letters, 9:1580–1585, 2025b.
R.M. D'Souza, M. di Bernardo, and Y.-Y. Liu. Controlling complex networks with complex nodes. Nature Reviews Physics, 5:250–262, 2023.
M. Fornasier and F. Solombrino. Mean-field optimal control. ESAIM: COCV, 20(4):1123–1152, 2014.
A. Lama and M. di Bernardo. Shepherding and herdability in complex multiagent systems. Physical Review Research, 6(3):L032012, 2024.
A. Lama, M. di Bernardo, and S.H.L. Klapp. Nonreciprocal field theory for decision-making in multi-agent control systems. Nature Communications, 16:8450, 2025.
G.C. Maffettone, A. Boldini, M. di Bernardo, and M. Porfiri. Continuification control of large-scale multiagent systems in a ring. IEEE Control Systems Letters, 7:841–846, 2022.
G.C. Maffettone, M. Porfiri, and M. di Bernardo. Continuification control of large-scale multiagent systems under limited sensing and structural perturbations. IEEE Control Systems Letters, 7:2425–2430, 2023.
G.C. Maffettone, L. Liguori, E. Palermo, M. di Bernardo, and M. Porfiri. Mixed reality environment and high-dimensional continuification control for swarm robotics. IEEE Transactions on Control Systems Technology, 32(6):2484–2491, 2024.
G.C. Maffettone, A. Boldini, M. Porfiri, and M. di Bernardo. Leader–follower density control of spatial dynamics in large-scale multi-agent systems. IEEE Transactions on Automatic Control, 70(10):6783–6798, 2025.
G.C. Maffettone, A. Boldini, M. di Bernardo, and M. Porfiri. Bio-inspired density control of multi-agent swarms via leader–follower plasticity. Automatica, accepted for publication, 2026a.
G.C. Maffettone, D. Salzano, and M. di Bernardo. Robust macroscopic density control of heterogeneous multi-agent systems. IEEE Transactions on Automatic Control, submitted, 2026b.
I. Napolitano, S. Covone, A. Lama, F. De Lellis, and M. di Bernardo. Hierarchical learning-based control for multi-agent shepherding of stochastic autonomous agents. IEEE Transactions on Control Systems Technology, submitted, 2025.
I. Napolitano and M. di Bernardo. Optimal transport for time-varying multi-agent coverage control. Automatica, submitted, 2026.
D. Nikitin, C. Canudas-de-Wit, and P. Frasca. A continuation method for large-scale modeling and control: from ODEs to PDE, a round trip. IEEE Transactions on Automatic Control, 67(10):5118–5133, 2022.
C. Punzo, I. Napolitano, C. Tomaselli, and M. di Bernardo. Decentralized shepherding of non-cohesive swarms through cluttered environments via deep reinforcement learning. arXiv preprint arXiv:2511.21405, 2025.
