Sensing-Assisted Adaptive Beam Probing with Calibrated Multimodal Priors and Uncertainty-Aware Scheduling

JOURNAL OF LATEX CLASS FILES, VOL. 14, NO. 8, AUGUST 2021

Abidemi Orimogunje, Vukan Ninkovic, Ognjen Kundacina, Hyunwoo Park, Sunwoo Kim, Dejan Vukobratovic, Evariste Twahirwa, and Gaspard Gashema

Abstract: Highly directional mmWave/THz links require rapid beam alignment, yet exhaustive codebook sweeps incur prohibitive training overhead. This letter proposes a sensing-assisted adaptive probing policy that maps multimodal sensing (radar/LiDAR/camera) to a calibrated prior over beams, predicts per-beam reward with a deep Q-ensemble whose disagreement serves as a practical epistemic-uncertainty proxy, and schedules a small probe set using a Prior-Q upper-confidence score. The probing budget is adapted from prior entropy, explicitly coupling sensing confidence to communication overhead, while a margin-based safety rule prevents low signal-to-noise ratio (SNR) locks. Experiments on DeepSense-6G (train: scenarios 42 and 44; test: 43) with a 21-beam discrete Fourier transform (DFT) codebook achieve Top-1/Top-3 of 0.81/0.99 with $\mathbb{E}[K] \approx 2$ probes per sweep and zero observed outages at $\theta = 0$ dB with $\Delta = 3$ dB. The results show that multimodal priors with ensemble uncertainty match the link quality of the ablated variants, improve reliability, and cut overhead through a better predictive model.

Index Terms: Bandit algorithm, Beam management, 6G, Multimodal sensing, Q-ensembles

I. INTRODUCTION

Directional millimeter wave (mmWave) and terahertz (THz) communication systems rely on narrow beams to overcome severe path loss, but aligning these beams during initial access or beam tracking incurs large training overhead [1]–[5].
In conventional frameworks, the transmitter and receiver must sweep through different beam directions to find the best alignment, leading to latency, energy expenditure, and potential link outage if alignment fails [6]–[8]. To handle these challenges, future 6G systems are expected to integrate communication and sensing functionalities, leveraging external sensors to aid beam alignment [9], [10]. Recent research has shown that multimodal sensing features from camera, radar, light detection and ranging (LiDAR), and global positioning system (GPS) sensors can be exploited to predict the optimal beam, thereby minimizing the need for exhaustive search [11]–[13]. For example, using camera images and position data to guide beam selection ranks the correct beam first in 75% of cases (top-1 accuracy) and includes it among the top three predictions in 100% of cases (top-3 accuracy) in realistic vehicular scenarios [10]. Likewise, radar-aided beam prediction has demonstrated over 90% top-5 accuracy (correct beam within the top five predictions) while saving 93% of the beam training overhead [14]. These sensor-aided approaches highlight the potential of external multimodal sensing sources to drastically cut alignment overhead and latency [15], [16]. However, previous work often optimizes beam classification accuracy but (i) does not explicitly quantify uncertainty to control probing resources, and (ii) lacks a reliability mechanism that reports outage-related metrics under a clear policy.

This paper was produced by the IEEE Publication Technology Group. They are in Piscataway, NJ. Manuscript received April 19, 2021; revised August 16, 2021.
In this letter, we propose a sensing-assisted beam-probing policy that couples sensing confidence to communication probing overhead, with the following contributions:
• A calibrated multimodal beam prior that converts sensing features into a per-sweep probability mass function (PMF) over beams (with temperature scaling driven by prior sharpness).
• A Q-ensemble that predicts per-beam reward and uses ensemble disagreement as an epistemic uncertainty proxy [17].
• An upper confidence bound (UCB)-style scheduler that selects a small probing set $\mathcal{S}_t$ and adapts $K_t$ from the entropy of the calibrated prior, making the sensing–communication resource coupling explicit.
• A lightweight safety rule and reporting protocol that logs both threshold-outage and margin-outage, plus the activation frequency of the safety mechanism.

II. SYSTEM MODEL AND PROBLEM FORMULATION

We consider a single downlink in which an access point (AP) performs beam selection from a discrete Fourier transform (DFT) codebook of size $B$, with beam index set $\mathcal{B} = \{1, \dots, B\}$. At each sweep $t$, an AP-based platform provides sensing side information in the form of a fused feature vector $x_t$ (from frequency modulated continuous wave (FMCW) radar, camera, and LiDAR processing). The AP can probe only a small subset of beams $\mathcal{S}_t \subset \mathcal{B}$ with cardinality $|\mathcal{S}_t| = K_t \ll B$ by transmitting pilots and collecting per-beam in-phase and quadrature (IQ) recordings $Z_{t,b}$ for $b \in \mathcal{S}_t$. Here, $K_t$ is the beam-training overhead (control-plane cost) per sweep. We treat sensing as given side information (i.e., sensing-assisted communications) and focus on how sensing confidence can reduce communication probing overhead while preserving link reliability.

A. SNR proxy and oracle beam

From the IQ recording $Z_{t,b}$, we compute a robust absolute-SNR proxy using a percentile power ratio:

$$\mathrm{SNR}_{\mathrm{dB}}(Z_{t,b}) = 10 \log_{10}\!\left( \frac{\mathrm{perc}_{p_s}\!\left(|Z_{t,b}|^2\right)}{\mathrm{perc}_{p_n}\!\left(|Z_{t,b}|^2\right) + \varepsilon} \right), \tag{1}$$

with $(p_s, p_n) = (99.7, 20)$ and $\varepsilon > 0$ for numerical stability. For reporting Top-$k$ metrics, we define the oracle best beam as the maximizer over the full codebook:

$$b_t^{\star} = \arg\max_{b \in \mathcal{B}} \mathrm{SNR}_{\mathrm{dB}}(Z_{t,b}). \tag{2}$$

Importantly, $b_t^{\star}$ is used only to form evaluation/training targets (available in the dataset via exhaustive logging), whereas the online policy probes only $K_t$ beams.

B. Probing decision, locking rule, and reliability metrics

Given a probed set $\mathcal{S}_t$, the best probed candidate is

$$b_t^{\mathrm{best}} = \arg\max_{b \in \mathcal{S}_t} \mathrm{SNR}_{\mathrm{dB}}(Z_{t,b}). \tag{3}$$

To avoid low-margin locks, we apply a lightweight safety shield with outage threshold $\theta$ and safety margin $\Delta \ge 0$:

$$b_t^{\mathrm{lock}} = \begin{cases} b_t^{\mathrm{best}}, & \text{if } \mathrm{SNR}_{\mathrm{dB}}(Z_{t,b_t^{\mathrm{best}}}) \ge \theta + \Delta, \\ b_t^{\mathrm{safe}}, & \text{otherwise}, \end{cases} \tag{4}$$

where $b_t^{\mathrm{safe}}$ is selected from a local neighbor set around the previous lock to preserve continuity. Let $\mathcal{N}(b_{t-1}^{\mathrm{lock}}) = \{ b' \in \mathcal{B} : \mathrm{circdist}(b', b_{t-1}^{\mathrm{lock}}) \le w \}$ denote a circular neighborhood of width $w$; the shield picks the neighbor with the highest last-known $\mathrm{SNR}_{\mathrm{dB}}$ among those satisfying $\mathrm{SNR}_{\mathrm{dB}} \ge \theta + \Delta$ (otherwise it retains $b_{t-1}^{\mathrm{lock}}$). To transparently quantify reliance on the safety fallback, we define the shield activation indicator $I_t^{\mathrm{shield}} = \mathbb{1}\{ b_t^{\mathrm{lock}} \ne b_t^{\mathrm{best}} \}$ and report the shield activation rate $\mathbb{E}[I_t^{\mathrm{shield}}]$ in the results. We also define the threshold-outage event for a locked beam as

$$I_t^{\mathrm{out}} = \mathbb{1}\{ \mathrm{SNR}_{\mathrm{dB}}(Z_{t,b_t^{\mathrm{lock}}}) < \theta \}, \tag{5}$$

and report its empirical rate to quantify reliability under the chosen $\theta$.

For offline learning, we form a per-sweep IQ-derived reward vector $R_t^{\mathrm{IQ}} \in \mathbb{R}^B$ with $[R_t^{\mathrm{IQ}}]_b = \mathrm{SNR}_{\mathrm{dB}}(Z_{t,b})$.
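The SNR proxy in (1) and the locking rule in (4) can be illustrated with a short Python sketch. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names `snr_db_proxy`, `circ_dist`, and `lock_beam` are hypothetical, beams are indexed from 0 rather than 1, and the dictionary `last_known_snr` (beam index to most recent SNR in dB) is an assumed bookkeeping structure that the caller maintains across sweeps.

```python
import numpy as np


def snr_db_proxy(iq, p_s=99.7, p_n=20.0, eps=1e-12):
    """Percentile power-ratio SNR proxy in dB, following Eq. (1)."""
    power = np.abs(np.asarray(iq)) ** 2
    signal = np.percentile(power, p_s)   # high percentile: signal-dominated power
    noise = np.percentile(power, p_n)    # low percentile: noise-floor power
    return float(10.0 * np.log10(signal / (noise + eps)))


def circ_dist(a, b, num_beams):
    """Circular index distance between two beams."""
    d = abs(a - b) % num_beams
    return min(d, num_beams - d)


def lock_beam(probed_snr, prev_lock, last_known_snr, num_beams,
              theta=0.0, delta=3.0, w=1):
    """Margin-based lock of Eq. (4); returns (locked beam, shield indicator).

    probed_snr: dict beam -> SNR (dB) measured in this sweep.
    last_known_snr: dict beam -> most recent SNR (dB); assumed bookkeeping.
    """
    b_best = max(probed_snr, key=probed_snr.get)          # Eq. (3)
    if probed_snr[b_best] >= theta + delta:
        return b_best, False
    # Fallback: best last-known neighbor of the previous lock with enough margin.
    candidates = [
        b for b in range(num_beams)
        if circ_dist(b, prev_lock, num_beams) <= w
        and last_known_snr.get(b, float("-inf")) >= theta + delta
    ]
    b_safe = max(candidates, key=lambda b: last_known_snr[b]) if candidates else prev_lock
    return b_safe, b_safe != b_best
```

For instance, with `theta=0.0` and `delta=3.0`, a best probed beam measured at 2 dB fails the margin test, so the sketch falls back to a neighbor of the previous lock or, if none qualifies, to the previous lock itself.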
We optionally reduce label variance by blending IQ rewards with a sensing-derived beam prior (before calibration) to form a normalized hybrid target $R_t^{\mathrm{hyb}}$ used to train the Q-ensemble:

$$R_t^{\mathrm{hyb}} = \alpha \, \mathrm{zscore}_t(R_t^{\mathrm{IQ}}) + (1 - \alpha) \, \mathrm{zscore}_t(\pi_t^{\mathrm{sense}}), \tag{6}$$

where $\alpha \in [0, 1]$, $\mathrm{zscore}_t(\cdot)$ denotes per-sweep standardization across beams, and $\pi_t^{\mathrm{sense}}$ is a beam prior derived from sensing.

C. Problem formulation: overhead-reliability and link-quality trade-off

Given sensing features $x_t$ (and past decisions/measurements), the policy $\pi_{\mathrm{sel}}$ selects a probing set $\mathcal{S}_t$ (equivalently, $K_t$ and which beams to probe) and outputs the final locked beam $b_t^{\mathrm{lock}}$. Our objective is communication-centric: maximize locked-beam quality while controlling beam-training overhead and outage risk. This is stated as the constrained problem (P1) and, equivalently, in the penalized form (P2):

$$\text{(P1) Constrained:} \quad \max_{\pi_{\mathrm{sel}}} \; \mathbb{E}\!\left[ \mathrm{SNR}_{\mathrm{dB}}(Z_{t,b_t^{\mathrm{lock}}}) \right] \quad \text{s.t.} \quad \mathbb{E}[K_t] \le \bar{K}, \quad \Pr\!\left\{ \mathrm{SNR}_{\mathrm{dB}}(Z_{t,b_t^{\mathrm{lock}}}) < \theta \right\} \le \delta, \tag{7}$$

$$\text{(P2) Penalized:} \quad \max_{\pi_{\mathrm{sel}}} \; \mathbb{E}\!\left[ \mathrm{SNR}_{\mathrm{dB}}(Z_{t,b_t^{\mathrm{lock}}}) - c_K K_t - c_{\mathrm{out}} \, \mathbb{1}\!\left\{ \mathrm{SNR}_{\mathrm{dB}}(Z_{t,b_t^{\mathrm{lock}}}) < \theta \right\} \right], \tag{8}$$

where $\bar{K}$ is the average probing budget, $\delta$ is the outage-risk tolerance, and $c_K$ and $c_{\mathrm{out}}$ are the penalty weights on probing overhead and outage. This formulation makes the sensing–communication coupling explicit: sensing affects the policy's confidence about beams (via a sensing-derived prior), and that confidence is used to adapt $K_t$ so as to satisfy the overhead constraint while maintaining reliability.

III. PROPOSED METHOD

A. Calibrated multimodal prior

A multimodal prior network maps sensing features $x_t$ to logits $z_t(b) \in \mathbb{R}$ for $b \in \mathcal{B}$. We standardize logits using training statistics $(\mu_{\mathrm{train}}, \sigma_{\mathrm{train}})$:

$$\tilde{z}_t(b) = \frac{z_t(b) - \mu_{\mathrm{train}}}{\sigma_{\mathrm{train}} + \epsilon}. \tag{9}$$

Temperature scaling [18] yields a calibrated PMF:

$$p_t(b) = \frac{\exp\!\left( \tilde{z}_t(b) / T_{\mathrm{eff}} \right)}{\sum_{b' \in \mathcal{B}} \exp\!\left( \tilde{z}_t(b') / T_{\mathrm{eff}} \right)}, \tag{10}$$

where the effective temperature is

$$T_{\mathrm{eff}} = \mathrm{clip}\!\left( T_0 \left( \frac{\hat{s}_t}{s_{\min}} \right)^{-\alpha} \left( 1 + \gamma \, u_t^{\mathrm{norm}} \right), \; T_{\min}, \; T_{\max} \right).$$
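As a hedged illustration of (9) and (10), the sketch below standardizes logits, applies a clipped effective temperature, and returns a PMF. The function names and all default parameter values are hypothetical. The arguments `sharpness` and `uncertainty` stand in for $\hat{s}_t$ and $u_t^{\mathrm{norm}}$, which the text above does not define further, so their interpretation here is an assumption.

```python
import numpy as np


def effective_temperature(sharpness, uncertainty, t0=1.0, s_min=1.0,
                          alpha=0.5, gamma=0.5, t_min=0.5, t_max=5.0):
    """Clipped effective temperature; defaults are illustrative placeholders."""
    raw = t0 * (sharpness / s_min) ** (-alpha) * (1.0 + gamma * uncertainty)
    return float(np.clip(raw, t_min, t_max))


def calibrated_prior(logits, mu_train, sigma_train, t_eff, eps=1e-8):
    """Standardize logits as in Eq. (9), then temperature-scaled softmax as in Eq. (10)."""
    z = (np.asarray(logits, dtype=float) - mu_train) / (sigma_train + eps)
    z = z / t_eff
    z = z - z.max()              # shift for numerical stability; softmax is unchanged
    p = np.exp(z)
    return p / p.sum()
```

A larger `t_eff` flattens the PMF and therefore raises its entropy, which is the quantity the probing budget is later derived from.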
We also compute the prior entropy

$$H(p_t) = - \sum_{b \in \mathcal{B}} p_t(b) \log p_t(b), \tag{11}$$

and use it to set the probing budget $K_t$ (Sec. III-D).

B. Q-ensemble reward prediction and uncertainty

We use an ensemble of $M$ predictors $\{Q_m\}_{m=1}^{M}$, where $Q_m(x_t, b)$ predicts a scalar reward proxy. The ensemble mean and standard deviation are as follows:

$$\mu_t(b) = \frac{1}{M} \sum_{m=1}^{M} Q_m(x_t, b), \tag{12}$$

$$\tau_t(b) = \sqrt{ \frac{1}{M} \sum_{m=1}^{M} \left( Q_m(x_t, b) - \mu_t(b) \right)^2 }. \tag{13}$$

We use $\tau_t(b)$ as a practical epistemic uncertainty proxy in the standard deep-ensemble sense [17]; we do not claim formal regret guarantees under general non-stationary mmWave dynamics, but we validate its utility empirically through ablations (Sec. V). For scheduling, we normalize the uncertainty per sweep:

$$\hat{\sigma}_t(b) = \frac{\tau_t(b)}{\max_{b' \in \mathcal{B}} \tau_t(b') + \epsilon}. \tag{14}$$

C. Prior-Q UCB scheduling and adaptive probing

We score each beam by

$$s_t(b) = (1 - \lambda) \, \mathrm{zscore}_t(\mu_t(b)) + \lambda \log p_t(b) + \beta \, \hat{\sigma}_t(b), \tag{15}$$

where $\lambda \in [0, 1]$ and $\beta \ge 0$. We form $\mathcal{S}_t$ by ranking beams in descending $s_t(b)$ and greedily selecting beams while enforcing a minimum circular separation $d_\theta$.

Algorithm 1 Sensing-assisted Prior–Q UCB scheduling with adaptive probing and safety shield (per sweep $t$)
1: Inputs: fused feature $x_t$; prior logits $z_t(\cdot)$; Q-ensemble $\{Q_m\}_{m=1}^{M}$; codebook $\mathcal{B}$; previous locked beam $b_{t-1}^{\mathrm{lock}}$.
2: Parameters: $\lambda, \beta$; $d_\theta$; $H_{\mathrm{low}}, H_{\mathrm{high}}$; $K_{\min}, K_{\mathrm{mid}}, K_{\max}$; $\tau_{\mathrm{gap}}$; $\theta, \Delta$; neighbor width $w$; $\epsilon$.
3: (A) Prior: compute $\tilde{z}_t(b)$ via (9); compute $p_t(b)$ via (10); compute $H(p_t)$ via (11).
4: (B) Q-ensemble: compute $\mu_t(b)$, $\tau_t(b)$ via (12)–(13); normalize $\hat{\sigma}_t(b)$ via (14).
5: (C) Score: compute $s_t(b)$ via (15).
6: (D) Budget: set $K_t$ via (16); optional bump if $p_t^{(1)} - p_t^{(3)} < \tau_{\mathrm{gap}}$.
7: (E) Select: build $\mathcal{S}_t$ by descending $s_t(b)$ with minimum separation $d_\theta$ until $|\mathcal{S}_t| = K_t$.
8: (F) Probe: measure $\mathrm{SNR}_{\mathrm{dB}}(Z_{t,b})$ via (1) for $b \in \mathcal{S}_t$; set $b_t^{\mathrm{best}}$ via (3).
9: (G) Lock: lock $b_t^{\mathrm{lock}}$ via (4).
10: Output: $b_t^{\mathrm{lock}}$, $\mathcal{S}_t$, $K_t$.
11: Complexity (per sweep): compute $O(MB + B \log B)$; shield scan $O(w)$; probing cost is $K_t$ beam measurements with $K_t \ll B$.

D. Adaptive probe budget and safety shield

We set $K_t$ deterministically from the prior entropy:

$$K_t = \begin{cases} K_{\min}, & H(p_t) \le H_{\mathrm{low}}, \\ K_{\max}, & H(p_t) \ge H_{\mathrm{high}}, \\ K_{\mathrm{mid}}, & \text{otherwise}, \end{cases} \tag{16}$$

with $K_{\min} < K_{\mathrm{mid}} < K_{\max}$ and $H_{\mathrm{low}} < H_{\mathrm{high}}$. While the entropy $H(p_t)$ captures global prior uncertainty, it may not detect near-ties among the most likely beams. We therefore sort the top prior masses $p_t^{(1)} \ge p_t^{(2)} \ge p_t^{(3)}$ and define $g_t \triangleq p_t^{(1)} - p_t^{(3)}$. If $g_t < \tau_{\mathrm{gap}}$, we increase the probing budget by one beam, $K_t \leftarrow \min\{K_t + 1, K_{\max}\}$. Setting $\tau_{\mathrm{gap}} = 0$ disables this step. The complete per-sweep procedure (prior calibration, scoring, adaptive $K_t$, diverse set selection, probing, and safety lock) is summarized in Algorithm 1.

We use $\theta = 0$ dB as a conservative connectivity boundary (signal and noise powers comparable) and $\Delta = 3$ dB as a standard engineering safety margin (approximately a factor-of-two power buffer) to avoid low-margin locks. We report both the outage rate (5) and the shield activation rate to quantify how often reliability is provided by the fallback.

E. Computational complexity

Per sweep, inference requires one prior forward pass and $M$ ensemble forward passes producing $B$ scores, i.e., $O(MB)$, plus sorting at $O(B \log B)$. The dominant practical cost in beam training is measurement overhead, which scales with only $K_t \ll B$ probes (versus exhaustive sweeping with $K = B$).

IV. EXPERIMENTAL SETUP AND EVALUATION PROTOCOL

We evaluate on the real-world DeepSense-6G dataset [19]. All models were run with CPU inference for a fair comparison of complexity.
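For reference, steps (B) through (E) of Algorithm 1 can be sketched in Python as follows. This is an illustrative sketch under stated assumptions rather than the authors' code: the function names, the 0-based beam indexing, and every default value (`k_min`, `k_mid`, `k_max`, `lam`, `beta`, `d_theta`) are hypothetical placeholders, and `q_preds` is assumed to be an array of shape (M, B) holding the ensemble outputs for one sweep.

```python
import numpy as np


def zscore(v, eps=1e-8):
    """Per-sweep standardization across beams."""
    v = np.asarray(v, dtype=float)
    return (v - v.mean()) / (v.std() + eps)


def entropy(prior, eps=1e-12):
    """Shannon entropy of the calibrated prior, Eq. (11), in nats."""
    p = np.asarray(prior, dtype=float)
    return float(-(p * np.log(p + eps)).sum())


def probe_budget(prior, h_low, h_high, k_min=1, k_mid=2, k_max=3, tau_gap=0.0):
    """Entropy-based budget of Eq. (16) with the optional near-tie bump."""
    h = entropy(prior)
    if h <= h_low:
        k = k_min
    elif h >= h_high:
        k = k_max
    else:
        k = k_mid
    top = np.sort(np.asarray(prior, dtype=float))[::-1]
    if top.size >= 3 and top[0] - top[2] < tau_gap:
        k = min(k + 1, k_max)
    return k


def select_probes(q_preds, prior, k, lam=0.5, beta=0.5, d_theta=1, eps=1e-8):
    """Score beams with Eq. (15) and greedily pick k beams that are well separated."""
    q = np.asarray(q_preds, dtype=float)          # shape (M, B)
    mu = q.mean(axis=0)                           # Eq. (12)
    tau = q.std(axis=0)                           # Eq. (13), 1/M normalization
    sigma_hat = tau / (tau.max() + eps)           # Eq. (14)
    log_prior = np.log(np.asarray(prior, dtype=float) + 1e-12)
    score = (1.0 - lam) * zscore(mu) + lam * log_prior + beta * sigma_hat
    num_beams = score.size
    chosen = []
    for b in np.argsort(-score):                  # descending score
        b = int(b)
        dists = [min(abs(b - c), num_beams - abs(b - c)) for c in chosen]
        if all(d >= d_theta for d in dists):
            chosen.append(b)
        if len(chosen) == k:
            break
    return chosen
```

The separation check skips beams adjacent to an already selected one, so a second probe is spent on a distinct direction instead of a near-duplicate of the top-ranked beam.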
The Top-1 and Top-3 accuracies are computed with respect to the oracle defined over the full DFT codebook, while our online policy probes only $K_t \ll B$ beams per sweep. We focus on four key performance metrics: Top-1/Top-3 locked-beam accuracy, average probes per sweep, outage probability, and the mean and 5th percentile (p05) of the locked-beam SNR.

1) Dataset and Split

We use DeepSense-6G scenarios 42 and 44 for training, while scenario 43 is used for testing. Each sweep uses a $B = 21$ beam codebook. The state vector $x_t$ fuses radar features (64 × 64 heatmap compression), a LiDAR polar histogram and bearing, a camera 21-bin horizontal edge histogram and bearing, and meta features. Training statistics (feature standardization and prior logit statistics) are saved and reused at evaluation.

2) Training and Inference

The prior is optimized with class-weighted cross-entropy and soft targets derived from the per-sweep IQ-SNR rewards; standardized feature scaling is persisted and reused at test time. Temperature scaling parameters are learned from global prior-logit statistics saved during training.

3) Evaluation Metrics

The following metrics are defined to evaluate the performance of our framework:
(i) Top-1/Top-3: the fraction of sweeps in which $b_t^{\mathrm{lock}}$ equals the oracle best beam (Top-1) or lies among the top-3 beams by oracle SNR (Top-3).
(ii) Overhead: mean probes per sweep, $\mathbb{E}[K_t]$.
(iii) Outage: $\Pr\{ \mathrm{SNR}_{\mathrm{dB}}(Z_{t,b_t^{\mathrm{lock}}}) < \theta \}$ with $\theta = 0$ dB.
(iv) SNR statistics: mean and p05 of the locked-beam absolute SNR proxy.

V. RESULTS AND DISCUSSION

A. Main performance and baselines

Table I reports locked-beam Top-1/Top-3 accuracy, average probing overhead $\mathbb{E}[K_t]$, and locked-beam SNR/outage under the probe-and-lock protocol defined in Sec. II.
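The metrics of Sec. IV can be computed from per-sweep logs roughly as in the sketch below. The function name `evaluate` and the log layout are assumptions made for illustration: `snr_full` is taken to be a (T, B) array of exhaustively logged SNR proxies in dB, `locked` holds the 0-based locked beam index per sweep, and `num_probes` holds $K_t$ per sweep.

```python
import numpy as np


def evaluate(snr_full, locked, num_probes, theta=0.0):
    """Top-1/Top-3, overhead, outage, and locked-beam SNR statistics from logs."""
    snr_full = np.asarray(snr_full, dtype=float)      # shape (T, B)
    locked = np.asarray(locked, dtype=int)            # shape (T,)
    sweeps = np.arange(len(locked))
    order = np.argsort(-snr_full, axis=1)             # beams by descending oracle SNR
    top1 = np.mean(order[:, 0] == locked)
    top3 = np.mean((order[:, :3] == locked[:, None]).any(axis=1))
    locked_snr = snr_full[sweeps, locked]
    return {
        "top1": float(top1),
        "top3": float(top3),
        "mean_probes": float(np.mean(num_probes)),
        "outage": float(np.mean(locked_snr < theta)),
        "snr_mean": float(locked_snr.mean()),
        "snr_p05": float(np.percentile(locked_snr, 5)),
    }
```

Because the oracle ranking uses the full logged codebook while the policy only probes $K_t$ beams, this evaluation requires exhaustive logs and is an offline computation.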
We compare against Random selection, a prior-only argmax baseline (selecting $\arg\max_b p_t(b)$ with $K_t = 1$), and a contextual bandit baseline (LinUCB) at a matched probing budget ($K_t = 2$). To isolate the contribution of adaptive budgeting, we additionally include a fixed-budget control for our method ($K_t \equiv 2$).

The proposed sensing-assisted Prior–Q policy achieves Top-1/Top-3 = 0.81/0.99 with $\mathbb{E}[K_t] = 1.99$, corresponding to a $B / \mathbb{E}[K_t] \approx 21 / 1.99 \approx 10.6\times$ reduction in probing relative to exhaustive sweeping ($K_t = B = 21$), while maintaining 0% outage at $\theta = 0$ dB with safety margin $\Delta = 3$ dB. Fig. 1 shows that the locked-beam SNR proxy in (1) is tightly concentrated (mean 14.18 dB; p05 14.13 dB), consistent with stable link quality under very low measurement overhead. The fixed-$K_t$ control matches the adaptive result, indicating that performance is driven primarily by the learned beam ranking (calibrated prior fused with Q-ensemble prediction and an uncertainty bonus), while the entropy rule mainly selects a low-overhead operating point.

TABLE I
MAIN PERFORMANCE ON DEEPSENSE-6G SCENARIO 43 (ORACLE DEFINED OVER B = 21 BEAMS) UNDER THE PROBE-AND-LOCK PROTOCOL

| Method | Top-1 | Top-3 | E[K_t] | SNR avg (dB) | SNR p05 (dB) | Outage |
|---|---|---|---|---|---|---|
| Random (K_t = 1) | 0.03 | 0.14 | 1.00 | – | – | – |
| Prior-only argmax (K_t = 1) | 0.56 | 0.83 | 1.00 | – | – | – |
| LinUCB (K_t = 2) | 0.08 | 0.27 | 2.00 | 14.17 | 14.13 | 0.00 |
| Ours (fixed K_t = 2) | 0.81 | 0.99 | 2.00 | 14.18 | 14.13 | 0.00 |
| Ours (adaptive K_t) | 0.81 | 0.99 | 1.99 | 14.18 | 14.13 | 0.00 |

[Figure 1: empirical CDF versus locked-beam SNR (dB). In-plot annotations: N = 100, E[K] = 1.99, Out@θ = 0.0%, Out@Δ = 0.0%, Shield = 0.0%, θ = 0.0 dB, Δ = 3.0 dB; p05 = 14.13 dB, mean = 14.18 dB.]
Fig. 1. Empirical CDF of the locked-beam SNR proxy (1) for the proposed adaptive policy on DeepSense-6G scenario 43 ($B = 21$). Vertical markers denote the mean and the 5th percentile (p05), illustrating a tight locked-beam SNR distribution under $\mathbb{E}[K_t] \approx 2$ probes per sweep.

B. Ablation: score terms and safety mechanism

Fig. 2 isolates the contribution of the probing-and-lock components at a matched budget ($K_t = 2$). The full policy (prior–Q fusion with uncertainty bonus and the margin-based lock rule) maintains high Top-1/Top-3 accuracy while preserving the same high-SNR operating point. Importantly, the measured shield activation rate is negligible in this scenario (consistent with Fig. 1), indicating that the reported reliability is not achieved by frequent rule-based overrides; rather, the learned ranking and targeted probing already produce high-margin candidates.

[Figure 2: bar charts for three variants: full (N = 99, E[K] = 2.00), no-shield (N = 34, E[K] = 2.00), and linucb (N = 100, E[K] = 2.00). Left panel, accuracy: Top-1 = 0.81 / 0.82 / 0.08 and Top-3 = 0.99 / 1.00 / 0.27. Right panel, SNR (dB): mean = 14.18 / 14.18 / 14.17 and p05 = 14.13 / 14.14 / 14.13, with outage (θ) and shield rate on a secondary rate axis.]
Fig. 2. Ablation at matched probing budget ($K_t = 2$). Left: locked-beam Top-1/Top-3 accuracy. Right: locked-beam SNR statistics (mean and p05) with reliability indicators (outage at $\theta$ and observed shield activation rate).

C. Modality-dropout robustness

To isolate the impact of individual sensing modalities, Table II reports modality-dropout results under a fixed probing budget ($K_t = 2$), while Fig. 3 complements the table by summarizing both Top-$k$ and SNR statistics under (i) adaptive probing and (ii) fixed-$K_t$ probing. Removing the camera reduces Top-1 accuracy from 0.81 to 0.62, and the radar-only setting degrades sharply (Top-1 of 0.10), whereas the camera-only setting matches the all-sensor result. This indicates that some single modalities, notably radar, may be insufficient to reliably rank beams in this scenario. At the same time, the locked-beam SNR statistics remain tightly clustered across modality subsets (Fig. 3(c)–(d)), reflecting that the probe-and-lock step confirms the final decision using measured SNR and that many neighboring beams yield similar link quality in this test configuration.

TABLE II
MODALITY-DROPOUT ABLATION UNDER FIXED PROBING BUDGET (K_t = 2) ON DEEPSENSE-6G SCENARIO 43 (B = 21)

| Setting | Top-1 | Top-3 | SNR avg (dB) | SNR p05 (dB) | K_t |
|---|---|---|---|---|---|
| All sensors | 0.81 | 0.99 | 14.18 | 14.13 | 2 |
| No camera | 0.62 | 1.00 | 14.17 | 14.13 | 2 |
| No LiDAR | 0.81 | 0.99 | 14.18 | 14.13 | 2 |
| No radar | 0.81 | 0.99 | 14.18 | 14.13 | 2 |
| Radar only | 0.10 | 0.65 | 14.16 | 14.13 | 2 |
| LiDAR only | 0.76 | 0.96 | 14.18 | 14.13 | 2 |
| Camera only | 0.81 | 0.99 | 14.18 | 14.13 | 2 |

D. Codebook-size robustness

Table III reports a uniform subcodebook sweep with $B \in \{9, 13, 17, 21\}$. In this dataset configuration, Top-$k$ and locked-beam SNR/outage remain unchanged across these subcodebooks, suggesting that the dominant beams are preserved under sub-sampling for this scenario. We include this sweep to demonstrate that the pipeline executes consistently under varying $B$; broader mobility regimes and multi-user interference remain important directions for future evaluation.

E. Limitations and Future Work

This letter evaluates a single-link scenario on a fixed real-world dataset split. While we strengthen breadth via LinUCB, modality dropout, and codebook sweeps, multi-user interference, broader mobility conditions, and different array configurations remain open directions. We treat multimodal sensing measurements as given side information and focus on resource coupling.
A natural extension is closed-loop joint sensing/communication design where sensing waveform and communication training are co-optimized, and uncertainty is propagated into both subsystems.

[Figure 3: four bar-chart panels over the modality subsets All_sensors, No_camera, No_lidar, No_radar, Radar_only, LiDAR_only, and Camera_only. (a) Top-k accuracy (Top-1, Top-3) with adaptive probing ($K_t$ from entropy). (b) Top-k accuracy with fixed probing ($K_t \equiv 2$). (c) Locked-beam SNR statistics in dB (mean and p05) with adaptive probing. (d) Locked-beam SNR statistics in dB (mean and p05) with fixed probing ($K_t \equiv 2$).]
Fig. 3. Modality-dropout robustness on DeepSense-6G scenario 43. Panels (a)–(b) report locked-beam Top-1/Top-3 accuracy across modality subsets; panels (c)–(d) report locked-beam SNR proxy statistics (mean and p05). Adaptive probing mitigates low-confidence sensing states by allocating more probes when the calibrated prior is diffuse, while the probe-and-lock step preserves high locked-beam SNR across settings.

TABLE III
CODEBOOK-SIZE SWEEP USING UNIFORM SUBCODEBOOKS OF THE B = 21 DFT CODEBOOK (ADAPTIVE POLICY)

| B used | Top-1 | Top-3 | E[K_t] | SNR avg (dB) | Outage |
|---|---|---|---|---|---|
| 9 | 0.81 | 0.99 | 1.99 | 14.18 | 0.00 |
| 13 | 0.81 | 0.99 | 1.99 | 14.18 | 0.00 |
| 17 | 0.81 | 0.99 | 1.99 | 14.18 | 0.00 |
| 21 | 0.81 | 0.99 | 1.99 | 14.18 | 0.00 |

VI. CONCLUSION

We proposed a sensing-assisted adaptive beam probing framework that combines calibrated multimodal priors with a Q-ensemble and UCB-style uncertainty-aware scheduling. The policy explicitly couples sensing confidence to the probing budget. On DeepSense-6G scenario 43 with $B = 21$, the proposed policy achieves Top-1/Top-3 = 0.81/0.99 using $\mathbb{E}[K_t] = 1.99$ probes per sweep, with 0% outage under $\theta = 0$ dB (margin $\Delta = 3$ dB) and mean/p05 locked-beam SNR 14.18/14.13 dB. Fixed-$K$ control, modality dropout, a codebook-size sweep, and a LinUCB baseline support reproducibility and practical assessment.

ACKNOWLEDGEMENT

This work was jointly supported by the African Center of Excellence in Internet of Things (ACEIoT) University of Rwanda, Regional Scholarship and Innovation Fund (RSIF), and National Research Foundation of Korea under Grant RS-2024-00409492.

REFERENCES

[1] M. Giordani, M. Polese, A. Roy, D. Castor, and M. Zorzi, “A Tutorial on Beam Management for 3GPP NR at mmWave Frequencies,” IEEE Commun. Surveys Tuts., vol. 21, no. 1, pp. 173–196, 2019.
[2] C. Liu, S. Chen, and L. Hanzo, “Beam Training and Tracking in MmWave Communication: A Survey,” IEEE Commun. Surveys Tuts., vol. 23, no. 2, pp. 631–655, 2021.
[3] H. Park, J. Kang, S. Lee, J. W. Choi, and S. Kim, “Deep Q-Network Based Beam Tracking for Mobile Millimeter-Wave Communications,” IEEE Trans. Wireless Commun., vol. 22, no. 2, pp. 961–971, Feb. 2023.
[4] P. Xue, Y. Huang, D. Zhu, Y. Zhao, and C. Sun, “Reducing the System Overhead of Millimeter-Wave Beamforming With Neural Networks for 5G and Beyond,” IEEE Access, vol. 9, pp. 165956–165965, 2021.
[5] P. Wang, C. Sun, K. Ma, Y. Bai, and Z. Wang, “MmWave Beam Prediction Under Hand Blockage,” IEEE Wireless Commun. Lett., vol. 13, no. 12, pp. 3598–3602, 2024.
[6] S. Rangan, T. S. Rappaport, and E. Erkip, “Beam management and self-alignment for 5G millimeter-wave mobile communication systems.”
[7] A. Alkhateeb, Y.-H. Nam, M. S. Rahman, J. Zhang, and R. W. Heath, “Initial Beam Association in Millimeter Wave Cellular Systems: Analysis and Design Insights,” IEEE Trans. Wireless Commun., vol. 16, no. 5, pp. 2807–2821, 2017.
[8] R. Gupta, K. Lakshmanan, and A. K. Sah, “Beam Alignment for mmWave Using Non-Stationary Bandits,” IEEE Commun. Lett., vol. 24, no. 11, pp. 2619–2622, 2020.
[9] Q. Xue, C. Ji, S. Ma, J. Guo, Y. Xu, Q. Chen, and W. Zhang, “A Survey of Beam Management for mmWave and THz Communications Towards 6G,” IEEE Commun. Surveys Tuts., vol. 26, no. 3, pp. 1520–1559, 2024.
[10] G. S. Charan, U. Demirhan, and A. Alkhateeb, “Vision-Position Multi-Modal Beam Prediction Using Real Millimeter Wave Datasets,” in Proc. IEEE WCNC, 2022, pp. 2727–2731.
[11] M. Ghassemi, H. Zhang, A. Afana, A. B. Sediq, and M. Erol-Kantarci, “Multi-Modal Transformer and Reinforcement Learning-Based Beam Management,” IEEE Netw. Lett., vol. 6, no. 4, pp. 222–226, 2024.
[12] K. Patel and R. W. Heath, “Harnessing Multimodal Sensing for Multi-User Beamforming in mmWave Systems,” IEEE Trans. Wireless Commun., vol. 23, no. 12, pp. 18725–18739, 2024.
[13] C. Zheng, J. He, C. G. Kang, G. Cai, Z. Yu, and M. Debbah, “M2BeamLLM: Multimodal Sensing-empowered mmWave Beam Prediction with Large Language Models,” 2025. [Online]. Available: https://arxiv.org/abs/2506.14532
[14] U. Demirhan and A. Alkhateeb, “Radar Aided 6G Beam Prediction: Deep Learning Algorithms and Real-World Demonstration,” in Proc. IEEE WCNC, 2022.
[15] Y. Guo, W. Qin, Y. Xu, Y. Gu, C. Yin, and B. Xia, “Predictive Beam Tracking and Power Allocation With Cooperative Sensing for V2I Communication,” IEEE Open J. Commun. Soc., vol. 5, pp. 6048–6063, 2024.
[16] W. Chen, L. Li, Z. Chen, T. Quek, and S. Li, “Enhancing THz/mmWave Network Beam Alignment With Integrated Sensing and Communication,” IEEE Commun. Lett., vol. 26, no. 7, pp. 1698–1702, 2022.
[17] B. Lakshminarayanan, A. Pritzel, and C. Blundell, “Simple and Scalable Predictive Uncertainty Estimation using Deep Ensembles,” 2017. [Online]. Available: https://arxiv.org/abs/1612.01474
[18] C. Guo, G. Pleiss, Y. Sun, and K. Q. Weinberger, “On Calibration of Modern Neural Networks,” in Int. Conf. on Mach. Learn. PMLR, 2017, pp. 1321–1330.
[19] A. Alkhateeb, U. Demirhan, C. K. R. Tumuluru, K. V. Mishra, G. S. Saha, I. Guvenc, A. Dwarakanath, F. Ahmed, H. S. Dhillon, and M. Bennis, “DeepSense 6G: A Large-Scale Real-World Multi-Modal Sensing and Communication Dataset,” IEEE Commun. Mag., vol. 61, no. 9, pp. 36–42, 2023.
