Transmission Delay Minimization for NOMA-Based F-RANs

Yuan Ai, Xidong Mu, Member, IEEE, Pengbo Si, Senior Member, IEEE, Yuanwei Liu, Fellow, IEEE

Abstract—A novel non-orthogonal multiple access (NOMA) based low-delay service framework is proposed for fog radio access networks (F-RANs). Fog access points (FAPs) leverage NOMA for local delivery of cached content, while the cloud access point employs NOMA to simultaneously push content to FAPs and directly serve users. Based on this model, a delay minimization problem is formulated by jointly optimizing user association, cache placement, and power allocation. To address this non-convex mixed-integer nonlinear programming problem, an alternating optimization (AO) algorithm is developed, which decomposes the original problem into two subproblems, namely joint user association and cache placement, and power allocation. In particular, a low-complexity algorithm is designed to optimize the user association and cache placement strategy using the McCormick envelope theory and Lagrangian partial relaxation. The power allocation is optimized by invoking successive convex approximation. Simulation results reveal that: 1) the proposed AO-based algorithm effectively balances the achieved performance against computational efficiency, and 2) the proposed NOMA-based F-RANs framework significantly outperforms orthogonal multiple access-based F-RANs in terms of average transmission delay in different scenarios.

Index Terms—Non-orthogonal multiple access (NOMA), fog radio access networks, delay minimization.

I. INTRODUCTION

The rapid expansion of data demand, coupled with the emergence of bandwidth-intensive applications, is driving the development of the next generation of mobile communication networks.
This evolution presents substantial challenges, including the management of heterogeneous service data traffic, accommodating ultra-massive connectivity, and achieving ultra-high throughput alongside minimal latency [1]. To meet these challenges, the development of advanced multiple access technology, collectively termed next generation multiple access (NGMA), is anticipated. These schemes are expected to support a vast number of users more efficiently in terms of both resource utilization and computational complexity compared to current multiple access technologies [2]. Among these advanced schemes, power-domain non-orthogonal multiple access (NOMA) has emerged as a crucial technology, garnering significant attention from both academia and industry [3]. By efficiently assigning multiple user signals to the same resource block via superposition coding, NOMA offers considerable performance enhancements over orthogonal multiple access (OMA) methods, particularly in terms of network capacity, user connectivity, and latency reduction [4].

Y. Ai and P. Si are with the School of Information Science and Technology, Beijing University of Technology, Beijing 100124, China (e-mail: aiyuan@bjut.edu.cn; sipengbo@bjut.edu.cn). (Corresponding author: Pengbo Si.) X. Mu is with the Centre for Wireless Innovation (CWI), Queen's University Belfast, Belfast, BT3 9DT, U.K. (e-mail: x.mu@qub.ac.uk). Y. Liu is with the Department of Electrical and Electronic Engineering, The University of Hong Kong, Hong Kong (e-mail: yuanwei@hku.hk).

On the other hand, fog radio access networks (F-RANs) have emerged as a critical technology and architectural innovation in 6G, offering solutions for the diverse demands of heterogeneous devices through advanced and efficient resource management strategies [5].
F-RANs are designed to address the latency and signaling overhead challenges inherent in cloud radio access networks (C-RANs) by partially shifting network intelligence, such as computing and storage capabilities, closer to edge devices [6]. This is achieved through the deployment of fog access points (FAPs) that combine distributed caching and advanced signal processing functionalities, enabling collaborative cloud-edge processing. To further enhance these benefits, integrating power-domain NOMA technology within F-RANs is expected to meet the extraordinary requirements for high access speeds and low latency. NOMA can be incorporated into F-RANs to facilitate content delivery from FAPs to mobile users, as well as to manage task offloading when computational tasks are requested by users. As a result, the fusion of NOMA with F-RANs has sparked a new research area known as NOMA-based F-RANs, which holds significant promise for enhancing spectral efficiency and reducing transmission delays in next-generation wireless systems [7].

A. Related Works

1) Studies on NOMA-based F-RANs: Recent research has extensively investigated the potential benefits of integrating NOMA to improve the performance of F-RANs. In [8], a detailed analysis of the network architecture and core modules of NOMA F-RANs was provided, with a focus on the application of artificial intelligence (AI)-enabled methods to optimize resource allocation through latent feature extraction and cooperative caching. Similarly, Guo et al. [9] investigated the benefits of integrating cache-aided multicast transmissions with NOMA in F-RANs, using index coding to minimize transmission energy. Furthermore, Ai et al. [10] introduced a cost-efficient resource allocation framework specifically designed for NOMA-based F-RANs, achieving a significant trade-off between network throughput and computational complexity.
2) Studies on Low-Delay Optimization: Minimizing transmission delay remains a critical challenge in the design of F-RANs, particularly when integrating NOMA. Recent advancements have explored the synergy of NOMA with caching and computing to address these multi-faceted challenges. Shen et al. [11] proposed a comprehensive framework integrating task offloading and caching in NOMA-assisted networks, utilizing deep reinforcement learning to optimize resource allocation and reduce latency. Moreover, Ye et al. [12] introduced a hierarchical network slicing approach for 6G systems, integrating NOMA and multi-dimensional resource allocation to enhance efficiency and meet the stringent requirements of latency-critical applications. Yin et al. [13] devised a joint optimization strategy for user pairing, power allocation, and content server placement in NOMA-assisted wireless caching networks, improving hit probabilities under dynamic channel conditions. Furthermore, Dong et al. [14] focused on reliability-aware services in NOMA-based networks, incorporating power allocation and task offloading to ensure ultra-reliable low-latency communications. Lim et al. [15] tackled the challenge of imperfect successive interference cancellation (SIC) in mmWave-NOMA, proposing a cross-entropy-based clustering and beamforming algorithm to optimize system performance. Qin et al. [16] proposed a cluster-NOMA framework for space-air-ground edge computing, utilizing multi-agent learning to optimize task offloading, caching, and routing for reduced IoT latency. Fang et al. [17] developed a joint offloading and caching scheme in multi-cell NOMA networks, employing deep reinforcement learning and clustering to minimize content delivery delay. Dou et al. [18] introduced a NOMA-assisted system for integrated sensing and offloading, optimizing beamforming and resource allocation to minimize energy while preserving sensing performance.
Despite these contributions, existing research often addresses isolated optimization dimensions, such as caching or task offloading, without fully exploiting their interplay, leading to suboptimal solutions or high computational complexity. In contrast, our proposed framework comprehensively integrates user association, cache placement, and power allocation in NOMA-enabled F-RANs. By employing NOMA at both FAPs and the cloud access point (CP), we enhance cloud-fog collaboration to achieve low-delay services. Different from existing works that apply NOMA exclusively at the fog/edge tier, the proposed framework innovatively introduces power-domain NOMA at the CP to realize a dual-purpose "push-and-deliver" transmission strategy. This enables the CP to simultaneously push uncached popular files to multiple FAPs and deliver requested content directly to UEs using a single superimposed signal, drastically reducing fronthaul latency compared with orthogonal schemes that require sequential transmissions. Table I compares our work with these recent studies, highlighting our unique contributions in jointly optimizing multiple dimensions and leveraging NOMA across both edge and cloud tiers for superior latency performance.

B. Motivations and Contributions

While NOMA offers potential benefits in terms of efficient resource allocation, the integration of NOMA with F-RANs introduces new challenges that must be addressed to fully realize these benefits. The tightly coupled nature of computation and communication resources in NOMA-based F-RANs complicates the management of these resources. Variability in user demand, the limited capacity of FAPs, and the coexistence of diverse user access modes all complicate the resource allocation process. These challenges are further exacerbated by the need for real-time decision-making, which is essential for meeting the stringent low-delay requirements of modern networks.
As a result, joint user association and resource allocation strategies become critical, as they must dynamically account for the varying nature of both communication and computation resources to minimize transmission delay and optimize network performance.

Despite extensive research on NOMA and F-RANs individually, and some exploration of their integration, there remains a critical gap in addressing these combined challenges from a delay perspective. Existing studies on NOMA-based F-RANs often overlook the complexities introduced by the need for joint resource management, particularly the non-convex optimization problem that arises when designing user association, cache placement, and power allocation strategies under practical constraints such as FAP caching capacity and power budgets. Traditional approaches often fail to adequately address these issues or result in solutions that are computationally prohibitive, limiting their practical applicability.

Given the challenges in managing resource allocation and reducing delays in F-RANs, this paper introduces a novel NOMA-based F-RAN framework. This framework leverages NOMA's flexible resource management capabilities within a fog-cloud architecture. A joint resource allocation problem is formulated that integrates user association, cache management, and power allocation, addressing these interconnected challenges through advanced optimization techniques. The solution balances the need for performance enhancement with computational efficiency, as demonstrated by the significant reduction in transmission delay observed in our simulations. The main contributions are as follows:

1) We propose a novel low-delay service framework for NOMA-based F-RANs, where FAPs leverage NOMA for local delivery of cached content, while the CP employs NOMA to simultaneously push content to FAPs and directly serve users.
This framework explicitly accounts for the constraints of FAP caching capacity and power allocation. We formulate a non-convex mixed-integer nonlinear programming (MINLP) problem aimed at minimizing the average delay through joint optimization of user association, cache placement, and power allocation.

2) Given the non-deterministic polynomial-time (NP)-hard nature of the problem, an alternating optimization (AO) algorithm is developed, which decomposes the original problem into two subproblems, namely a) the joint user association and cache placement subproblem, and b) the power allocation subproblem. For the joint user association and cache placement, we introduce a low-complexity algorithm based on the McCormick envelope theory and Lagrangian partial relaxation, effectively addressing the multidimensional resource constraints. For the non-convex power allocation problem, we propose a successive convex approximation (SCA)-based algorithm, thereby significantly reducing the computational complexity while maintaining near-optimal performance.

3) We demonstrate that the proposed NOMA-based F-RANs significantly outperform OMA-based F-RANs in terms of average transmission delay in different scenarios. Additionally, the proposed AO algorithm strikes an effective balance between the achieved system performance and computational efficiency, making it a viable candidate for practical deployment in future wireless networks.

TABLE I: Comparison of Our Contributions with the State-of-the-Art Works

Aspect                               [11] [12] [13] [14] [15] [16] [17] [18] Proposed
User Association                      ×    ×    ✓    ×    ✓    ×    ✓    ×    ✓
Cache Placement                       ×    ✓    ✓    ✓    ×    ✓    ×    ×    ✓
Power Allocation                      ✓    ×    ✓    ✓    ✓    ✓    ✓    ✓    ✓
NOMA in FAP                           ✓    ✓    ✓    ✓    ✓    ✓    ✓    ✓    ✓
NOMA in CP                            ×    ×    ×    ×    ×    ×    ×    ×    ✓
Low Delay Service                     ×    ✓    ×    ×    ×    ✓    ✓    ×    ✓
Cloud-Fog Collaboration               ×    ✓    ×    ×    ×    ✓    ✓    ✓    ✓
Joint Assoc./Caching/Power Alloc.     ×    ×    ×    ×    ×    ×    ×    ×    ✓

C. Organization

The structure of this paper is organized as follows. Section II presents the system model for NOMA-based F-RANs and formulates the problem addressed in this study. In Section III, we develop a low-complexity algorithm that integrates the subproblems of user association, cache placement, and power allocation into a cohesive framework. Section IV provides numerical results and assesses the performance of the proposed algorithm. Finally, Section V concludes the paper with a summary of the findings.

II. SYSTEM MODEL

A. System Description

As depicted in Fig. 1, we examine a downlink NOMA-based F-RAN configuration consisting of a centralized CP, N FAPs, and K user equipments (UEs).¹ The CP and FAPs are situated at predetermined positions within a circular region D with radius R_D, while the UEs are distributed uniformly throughout this area. Each device is assumed to be equipped with a single antenna. Meanwhile, the collection of FAPs and the CP is represented by the set N = {0, 1, 2, ..., N}, where the element 0 in N specifically denotes the CP. The set of FAPs (excluding the CP) is denoted by N+ = N \ {0}. The set of UEs is denoted by K = {1, 2, 3, ..., K}. Each FAP has caching capabilities and can deliver data to UEs by utilizing cached content or through edge processing. Multiple UEs are simultaneously served by each FAP using the NOMA protocol. The FAPs are connected to the CP via a shared wireless fronthaul link. The NOMA protocol allows the CP to execute the push-and-deliver strategy [19], enhancing

¹This architecture is particularly suited for practical scenarios such as IoT networks, smart cities, and vehicular networks, where low-latency content delivery and efficient spectrum utilization are critical.
For instance, in IoT networks, FAPs serve numerous devices with cached content, while in vehicular networks, roadside units acting as FAPs provide localized processing in high-mobility settings.

Fig. 1: An illustration of the NOMA-based F-RANs.

spectrum efficiency in wireless caching. This strategy enables concurrent wireless fronthaul transmissions between the CP and FAPs and wireless access transmissions from the CP to UEs. If the requested content is not available at the local FAP, the UE is served directly by the CP. NOMA is applied to achieve dual objectives: content pushing to FAPs for future use and immediate content delivery to UEs by the CP. The CP and FAPs operate on separate frequency bands to eliminate cross-tier interference. The CP uses its dedicated band for dual-purpose NOMA transmission, while the FAPs share a distinct band for NOMA-based access. This design decouples interference while maximizing intra-tier spectral efficiency via NOMA.

B. Cache Model

Consider a set of popular content files denoted by F = {1, 2, 3, ..., F}, which are distributed at the CP according to the Zipf distribution. Each requested content file f has a packet size of s_f. The CP can access the entire content library, while each FAP is capable of prefetching and storing selected content files from the CP's library in its local memory. When serving users, if the requested content is not stored locally, the FAPs can directly download the data from the CP. Additionally, FAPs can decode the data from the received signals and deliver the requested content to users, thereby fulfilling their demands. Each FAP is equipped with a cache of capacity S_n. To optimize content delivery, each FAP n initially downloads the most popular content files f ∈ F from the CP and stores them within its cache according to its capacity.
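The Zipf-distributed popularity and capacity-limited prefetching described above can be sketched in a few lines. This is a minimal illustration, not the paper's optimized placement: the Zipf exponent, file sizes, and cache capacity are assumed values, and `greedy_cache` is a hypothetical helper implementing the most-popular-first rule.

```python
import numpy as np

def zipf_popularity(F, delta=0.8):
    """Zipf request probabilities over a library of F files (rank 1 = most
    popular). The exponent delta is an illustrative assumption."""
    ranks = np.arange(1, F + 1)
    weights = ranks ** (-delta)
    return weights / weights.sum()

def greedy_cache(sizes, capacity, popularity):
    """Most-popular-first prefetching: cache files in descending popularity
    while the FAP capacity S_n permits; returns c_{n,f} in {0, 1}."""
    c = np.zeros(len(sizes), dtype=int)
    used = 0.0
    for f in np.argsort(-popularity):      # iterate in descending popularity
        if used + sizes[f] <= capacity:
            c[f] = 1
            used += sizes[f]
    return c

p = zipf_popularity(5)
sizes = np.array([2.0, 2.0, 3.0, 1.0, 4.0])   # packet sizes s_f (illustrative)
c = greedy_cache(sizes, capacity=5.0, popularity=p)
```

Because the Zipf weights are already sorted by rank, the greedy pass fills the cache with the first files that fit, skipping any file whose size would exceed the remaining capacity.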
The network controller determines which content should be cached at each FAP n. The caching decision is represented by the binary variable c_{n,f} ∈ {0, 1}, which indicates whether content f is cached at FAP n:

c_{n,f} ∈ {0, 1}, f ∈ F, n ∈ N+.   (1)

In this context, c_{n,f} = 1 indicates that content f is stored at FAP n, while c_{n,f} = 0 indicates that it is not. If content f is already cached at FAP n, there is no need for FAP n to retrieve it from the CP when it is requested by associated UEs. However, if c_{n,f} = 0, FAP n must fetch the content from the CP upon receiving a request from its associated UEs. The cache size constraint for each FAP n is expressed as

Σ_{f∈F} c_{n,f} s_f ≤ S_n, n ∈ N+.   (2)

The caching decisions for all FAPs are collected in the vector c ≜ (c_{n,f})_{f∈F, n∈N+}.

C. NOMA-based Communication Model

1) Transmission from FAP to UE: For n ∈ N, k ∈ K, let x_{n,k} denote whether UE k is associated with node n, where

x_{n,k} ∈ {0, 1}, n ∈ N, k ∈ K.   (3)

Here, x_{n,k} = 1 means that the k-th UE is served by access node n, and x_{n,k} = 0 otherwise. We denote x ≜ (x_{n,k})_{n∈N, k∈K} as the UE association vector. Each UE can be associated with only one access node at a time. The UEs associated with the same node are considered as a cluster. Let K_n denote the cluster formed by node n.
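The association variables and the resulting clusters can be illustrated with a toy binary matrix; the helper `clusters` and the example association below are illustrative assumptions, not part of the paper's algorithm.

```python
# Form the clusters K_n from a binary association matrix x (rows: nodes
# 0..N with node 0 the CP, columns: UEs), checking that each UE attaches
# to exactly one node, as the model requires.

def clusters(x):
    K = len(x[0])
    for k in range(K):                      # each UE picks exactly one node
        assert sum(x[n][k] for n in range(len(x))) == 1
    return {n: [k for k in range(K) if x[n][k] == 1] for n in range(len(x))}

# node 0 is the CP; nodes 1 and 2 are FAPs; 4 UEs (illustrative)
x = [
    [0, 0, 0, 1],   # UE 3 served directly by the CP
    [1, 1, 0, 0],   # UEs 0 and 1 form FAP 1's NOMA cluster
    [0, 0, 1, 0],   # UE 2 associated with FAP 2
]
K_n = clusters(x)
```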
In any cluster K_n that contains t UEs, where t = |K_n|, the received signal at UE k associated with FAP n is given by

y_{n,k} = h_{n,k} √(p_{n,k}) s_{n,k} + h_{n,k} Σ_{i=1, i≠k}^{t} √(p_{n,i}) s_{n,i} + Σ_{j=1, j≠n}^{N} Σ_{i=1}^{t} h_{j,k} √(p_{j,i}) s_{j,i} + n_{n,k},   (4)

where the second term is the intra-FAP interference and the third term is the inter-FAP interference. Here, h_{n,k} ∈ C represents the complex channel gain between UE k and FAP n, p_{n,k} ≥ 0 denotes the power allocated to UE k by FAP n, s_{n,k} is the transmitted signal for UE k, normalized such that E[|s_{n,k}|²] = 1, and n_{n,k} ~ CN(0, σ²) is the additive white Gaussian noise (AWGN) at UE k, with variance σ² > 0. In addition, the channel coefficient for UE k in cluster K_n is given by |h_{n,k}|² = |g_{n,k}|² L_{n,k}^{-1}, where g_{n,k} ∈ C denotes the small-scale fading, with |g_{n,k}|² ~ Exp(1) for Rayleigh fading, and L_{n,k} represents the large-scale path loss coefficient. The power coefficients within any cluster K_n are constrained by Σ_{i=1}^{t} p_{n,i} = p_n and p_{n,i} ≥ 0 for all i ∈ K_n. The inter-FAP interference I_{n,k} is defined as

I_{n,k} = Σ_{j=1, j≠n}^{N} Σ_{i=1}^{t} |h_{j,k}|² p_{j,i} = Σ_{j=1, j≠n}^{N} |h_{j,k}|² p_j,   (5)

where p_j = Σ_{i∈K_j} p_{j,i} is the total transmit power of FAP j.

In the proposed NOMA-based F-RANs, intra-FAP interference results from co-channel transmissions under the NOMA scheme, and SIC is employed to mitigate this interference [20]. According to the principles of SIC and NOMA, a UE with strong channel conditions cancels out the interference from UEs with weaker conditions, while UEs with weaker channels must treat the signals of stronger UEs as additional noise. Consequently, the signal-to-interference-plus-noise ratio (SINR) for UE k associated with FAP n is given by

γ_{n,k} = |h_{n,k}|² p_{n,k} / ( Σ_{i∈K_n\{k}: π(i)>π(k)} |h_{n,k}|² p_{n,i} + I_{n,k} + σ² ),   (6)

where π(k) ∈ {1, 2, ..., t} denotes the decoding order position of UE k, with a higher π(k) indicating a stronger channel condition.

Note that the conventional decoding order in single-cell NOMA systems, which relies solely on the order of channel gains, cannot be directly applied in the proposed multi-FAP scenario. In particular, both intra-FAP interference and inter-FAP interference must be accounted for in the effective channel conditions. As discussed in [21], one of the challenges in establishing the SIC decoding order lies in the fact that the interference in a given FAP depends on the power and time-frequency resources allocated not only locally but also in other FAPs. Hence, the exact decoding order is generally not known a priori, because it is inherently coupled with the interference levels arising from resource usage decisions across multiple FAPs. Despite these complications, the fundamental principle for SIC decoding in NOMA still applies. Formally, consider two UEs, k and m, within the same cluster K_n. Suppose UE k has a better channel condition than UE m, denoted by m < k. Perfect SIC occurs if the decoding rate of UE k for UE m's signal meets or exceeds the rate at which UE m decodes its own signal, that is, R_{n,k→m} ≥ R_{n,m→m}. In our NOMA-based F-RAN setup, each FAP applies SIC on a per-cluster basis, while treating the aggregated inter-FAP interference as noise, similar to the single-cell case. Building on this principle from [22], we adopt the following optimal decoding order for cluster K_n:

Q(K_n) ≜ (I_{n,1} + σ²)/|h_{n,1}|² ≥ (I_{n,2} + σ²)/|h_{n,2}|² ≥ ... ≥ (I_{n,t} + σ²)/|h_{n,t}|².   (7)

Based on (6), the transmission rate of UE k at FAP n can be expressed as

R_{n,k} = B log₂(1 + γ_{n,k}),   (8)

where B > 0 is the total channel bandwidth in hertz (Hz).
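The decoding order of (7) and the resulting SINRs and rates of (6) and (8) can be sketched as follows; the channel gains, interference levels, power split, and noise power are illustrative assumptions.

```python
import math

SIGMA2 = 1e-3  # noise power sigma^2 (assumed value)

def decoding_order(h2, I):
    """Order UE indices per Eq. (7): descending (I_{n,k} + sigma^2)/|h_{n,k}|^2,
    so the weakest effective channel comes first and pi(k) grows with strength."""
    return sorted(range(len(h2)), key=lambda k: (I[k] + SIGMA2) / h2[k],
                  reverse=True)

def cluster_rates(h2, I, p, B=1.0):
    """SINR per Eq. (6) and rate per Eq. (8): UE k is interfered only by UEs
    decoded after it, i.e., those with stronger effective channels."""
    order = decoding_order(h2, I)
    pi = {k: pos for pos, k in enumerate(order)}   # position in the SIC sequence
    rates = {}
    for k in range(len(h2)):
        interf = sum(h2[k] * p[i] for i in range(len(h2)) if pi[i] > pi[k])
        sinr = h2[k] * p[k] / (interf + I[k] + SIGMA2)
        rates[k] = B * math.log2(1.0 + sinr)
    return order, rates

# two-UE cluster: UE 1 has the stronger effective channel (illustrative numbers)
order, rates = cluster_rates(h2=[0.01, 0.1], I=[0.0, 0.0], p=[0.8, 0.2])
```

Consistent with the NOMA principle, the weaker UE receives the larger power share yet the stronger UE, after cancelling the weaker UE's signal, still achieves the higher rate.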
2) Transmission from CP to UE: If the content requirements of a UE cannot be met in the FAP mode, the system utilizes the CP mode to process the UE's request, i.e., x_{0,k} = 1. In this scenario, the CP employs NOMA technology to execute the push-and-deliver strategy, ensuring efficient content distribution. Specifically, the CP transmits a superimposed signal comprising the following two components: 1) direct content delivery: files directly requested by UEs that cannot be served by FAPs; 2) content pushing to FAPs: files that are not cached at the FAPs but are requested by UEs associated with those FAPs. This simultaneous transmission on the same resource block is enabled exclusively by NOMA's superposition coding. Under OMA, these operations would require two orthogonal time slots, significantly increasing the fronthaul delay.

We define the set of files that need to be pushed by the CP as F_push. A file f ∈ F is included in F_push if it satisfies the following condition:

F_push = { f ∈ F : Σ_{n∈N} Σ_{k∈K} x_{n,k} r_{k,f} (1 − c_{n,f}) ≥ 1 },   (9)

where f ∈ F represents a file in the content library, x_{n,k} = 1 indicates that UE k is associated with FAP n, r_{k,f} = 1 indicates that UE k requests file f, and c_{n,f} = 0 indicates that file f is not cached at FAP n. Thus, F_push contains all files that need to be pushed by the CP to the FAPs.

Given the decoding complexity associated with NOMA technology, this paper assumes a simplified scenario where the CP serves only one UE for direct content delivery while simultaneously pushing the required content to FAPs. In this setup, the received signal at UE k served directly by the CP is given by

y_{0,k} = h_{0,k} √(p_{0,k}) s_{0,k} + Σ_{f∈F_push} √(p_{0,f}) h_{0,k} s_{0,f} + n_{0,k},   (10)

where h_{0,k} ∈ C is the complex channel gain between the CP and UE k, with |h_{0,k}|² = |g_{0,k}|² L_{0,k}^{-1}, similar to h_{n,k}.
Here, s_{0,k} is the transmitted signal for the k-th user, s_{0,f} denotes the signal carrying the information contained in file f, normalized such that E[|s_{0,k}|²] = 1 and E[|s_{0,f}|²] = 1, p_{0,k} is the power allocated to UE k by the CP, and p_{0,f} is the power allocated to file f ∈ F_push. For the CP, the power allocation coefficients satisfy p_{0,k} + Σ_{f∈F_push} p_{0,f} ≤ p_0, where p_0 > 0 is the total transmit power of the CP, p_{0,k} ≥ 0, and p_{0,f} ≥ 0, ∀f ∈ F. n_{0,k} ~ CN(0, σ²) is the AWGN at UE k, with variance σ² > 0. The signal intended for pushing files to the FAPs is treated as interference by UE k, and the received SINR of UE k directly served by the CP can be calculated as

γ_{0,k} = |h_{0,k}|² p_{0,k} / ( |h_{0,k}|² Σ_{f∈F_push} p_{0,f} + σ² ).   (11)

Based on (11), the transmission rate of UE k at the CP can be expressed as

R_{0,k} = B log₂(1 + γ_{0,k}).   (12)

3) Transmission from CP to FAP: Each FAP n is capable of decoding the additional files f that are pushed from the CP through the superimposed coded signals. The SIC decoding order is based on the popularity of the files, where a file j with higher popularity (i.e., j < f) is decoded before a file f with lower popularity. Assuming that all files with higher popularity, denoted by j with j < f, have been successfully decoded and their effects subtracted, the SINR at FAP n when decoding file f is given by

γ^f_{0,n} = |h_{0,n}|² p_{0,f} / ( |h_{0,n}|² Σ_{i=f+1}^{|F_push|} p_{0,i} + σ² ),   (13)

where |h_{0,n}|² = L(d_{0,n})^{-1} and L(d_{0,n}) is the path loss function. In the channel model between the CP and the FAP, small-scale multipath fading is not considered because the FAP and the CP can be connected via a line-of-sight (LoS) link, where large-scale path loss dominates. The corresponding downlink data rate is given by

R^f_{0,n} = B log₂(1 + γ^f_{0,n}).   (14)
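The push-and-deliver stage can be made concrete with a toy computation of the pushed-file set in (9) and the SINRs in (11) and (13). Every input below (associations, requests, caches, gains, power splits) is an illustrative assumption.

```python
SIGMA2 = 1e-3  # noise power sigma^2 (assumed value)

def push_set(x, r, c, fap_ids):
    """F_push, Eq. (9): files requested by some UE whose serving FAP lacks them."""
    F = len(r[0])
    return {f for f in range(F)
            if sum(x[n][k] * r[k][f] * (1 - c[n][f])
                   for n in fap_ids for k in range(len(r))) >= 1}

def sinr_cp_to_ue(h2_0k, p_0k, p_push):
    """Eq. (11): the superposed push signal is interference at the direct UE."""
    return h2_0k * p_0k / (h2_0k * sum(p_push) + SIGMA2)

def sinr_cp_to_fap(h2_0n, p_push, f):
    """Eq. (13): after SIC in popularity order, only files f+1, ... interfere."""
    return h2_0n * p_push[f] / (h2_0n * sum(p_push[f + 1:]) + SIGMA2)

# UE 0 -> FAP 1 (its file 0 is cached there); UE 1 -> FAP 2 (file 1 not cached)
x = {0: [0, 0], 1: [1, 0], 2: [0, 1]}
r = [[1, 0, 0], [0, 1, 0]]
c = {1: [1, 0, 0], 2: [0, 0, 0]}
F_push = push_set(x, r, c, fap_ids=[1, 2])
```

Only file 1 lands in `F_push`: file 0 is a cache hit at FAP 1, so pushing it would waste fronthaul power.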
If R^f_{0,n} ≥ R_f, then file f can be decoded and subtracted correctly at FAP n, where R_f denotes the target data rate of file f.

D. Delay Model

When UE k selects the edge cloud mode to access FAP n, the transmission delay depends on whether the requested content is cached at the FAP. If the content is cached, the UE retrieves it directly from the FAP's local cache without incurring additional delay over the fronthaul link. Otherwise, the FAP must fetch the content from the CP, introducing fronthaul delay. However, practical F-RAN systems incur additional latencies beyond pure transmission time, particularly due to the requisite control signaling and protocol interactions for link establishment [23]. To account for these practical constraints while maintaining the tractability of the system model, we introduce a constant overhead term D_ov, which represents the aggregated latency attributable to backhaul signaling and connection setup overhead [24].

To formulate the delay model, we define the request indicator variable r_{k,f} ∈ {0, 1}, where r_{k,f} = 1 if UE k requests content f, and 0 otherwise, with the constraint Σ_{f=1}^{F} r_{k,f} = 1 ensuring that each UE requests exactly one content file. Following the standard transmission models in [25], the total delay from FAP n to UE k, denoted D_{n,k}, is given by

D_{n,k} = D^F_{n,k} + (1 − c_{n,f}) D^B_{n,k} + D_ov = Σ_{f=1}^{F} ( r_{k,f} s_f / R_{n,k} + r_{k,f} (1 − c_{n,f}) s_f / R^f_{0,n} ) + D_ov,   (15)

where:
• D^F_{n,k} = Σ_{f=1}^{F} r_{k,f} s_f / R_{n,k} is the access delay, representing the time to transmit the requested content from FAP n to UE k.
• D^B_{n,k} = Σ_{f=1}^{F} r_{k,f} s_f / R^f_{0,n} is the fronthaul delay, representing the time to fetch content f from the CP to FAP n, incurred only when c_{n,f} = 0.
• D_ov is the constant signaling and protocol overhead delay.

Remark 1: In this study, we focus on minimizing the average transmission delay, which consists of the access delay and the wireless fronthaul delay.
Following the established conventions in physical-layer resource management for F-RANs, queuing delay and SIC processing delay are omitted based on the following justifications: (i) Queuing delay: we adopt a snapshot-based optimization approach focusing on the delivery of requested content files under a saturated buffer or a scheduled transmission period. In such a deterministic framework, queuing delay (which depends on stochastic arrival processes) is typically secondary to the transmission time when handling large content files. (ii) SIC processing delay: in practical NOMA systems with limited cluster sizes (e.g., 2 or 3 users), the iterative decoding latency is on the order of microseconds, which is negligible compared to the transmission and signaling overhead D_ov. Consequently, it is absorbed into the constant term D_ov without loss of generality.

Despite the increase in the number of FAPs available for content service, the edge cloud remains unable to meet all user demands due to the limited resources of FAPs. Consequently, a significant amount of content must still be delivered through the centralized CP. The delay from the centralized CP to UE k can be expressed as

D_{0,k} = Σ_{f=1}^{F} r_{k,f} s_f / R_{0,k} + D_ov.   (16)

E. Problem Formulation

To improve the performance of the proposed NOMA-based F-RANs, an optimization problem is formulated to minimize the average system delay. By optimizing user association, cache placement, and power allocation, the problem can be formulated as follows:

min_{x, c, p} (1/K) Σ_{n=0}^{N} Σ_{k=1}^{K} x_{n,k} D_{n,k}   (17a)

s.t.
Q(K_n), ∀k, ∀n,   (17b)
Σ_{f=1}^{F} c_{n,f} s_f ≤ S_n, ∀n ∈ N+,   (17c)
Σ_{n∈N} x_{n,k} = 1, ∀k ∈ K,   (17d)
Σ_{k∈K} x_{n,k} ≤ F_n, ∀n ∈ N,   (17e)
c_{n,f} ∈ {0, 1}, ∀n ∈ N+, ∀f ∈ F,   (17f)
x_{n,k} ∈ {0, 1}, ∀n ∈ N, ∀k ∈ K,   (17g)
Σ_{k=1}^{K} p_{n,k} ≤ p_n^max, ∀n ∈ N+,   (17h)
p_{n,k} ≥ 0, ∀n ∈ N, ∀k ∈ K,   (17i)
p_{0,k} + Σ_{f∈F_push} p_{0,f} ≤ p_0^max,   (17j)
p_{0,k} ≥ 0, p_{0,f} ≥ 0, ∀k ∈ K, ∀f ∈ F.   (17k)

Constraint (17b) ensures the optimal decoding order within each FAP's user cluster, guaranteeing that any UE can successfully decode the signal of a UE with weaker channel conditions. Constraint (17c) imposes the cache capacity limit of each FAP. Constraint (17d) represents the user association constraint, stipulating that each user can associate with only one access node. Constraint (17e) limits the number of users associated with an access point to a maximum of F_n. Constraint (17f) specifies that the cache decision variables are binary, and constraint (17g) indicates that the user association decision variables are binary as well. Constraint (17h) specifies the transmit power constraint for FAP n, where the maximum allowable transmit power is denoted by p_n^max, and constraint (17i) asserts that the power allocation variables are non-negative. Constraint (17j) defines the power constraint of the CP, with a maximum transmit power limit of p_0^max. Finally, constraint (17k) asserts that the power allocation variables for the CP are non-negative.

The formulated problem is a non-convex MINLP problem. The user association variable x_{n,k} and the cache placement variable c_{n,f} are binary, as specified in (17f) and (17g), which renders (17c), (17d), and (17e) integer constraints. Moreover, the objective function in (17a) is non-convex, making the optimization problem inherently combinatorial.
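To make the objective concrete, the per-UE delays of (15) and (16) and the average in (17a) can be evaluated for a fixed toy configuration; the rates, file size, and `D_OV` value below are illustrative assumptions, not simulation parameters from the paper.

```python
D_OV = 0.01  # constant signaling/protocol overhead D_ov (assumed value)

def fap_delay(s_f, R_access, R_fronthaul, cached):
    """Eq. (15) for a single requested file: access delay plus, on a cache
    miss, the fronthaul fetch delay."""
    return s_f / R_access + (0.0 if cached else s_f / R_fronthaul) + D_OV

def cp_delay(s_f, R_cp):
    """Eq. (16): direct delivery from the CP."""
    return s_f / R_cp + D_OV

# three UEs: cache hit at a FAP, cache miss at a FAP, and direct CP service
delays = [
    fap_delay(10.0, R_access=5.0, R_fronthaul=20.0, cached=True),
    fap_delay(10.0, R_access=5.0, R_fronthaul=20.0, cached=False),
    cp_delay(10.0, R_cp=4.0),
]
avg_delay = sum(delays) / len(delays)   # objective (17a) with K = 3
```

The cache-hit UE avoids the fronthaul term entirely, which is exactly the coupling between c_{n,f} and the objective that makes the joint design worthwhile.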
While techniques such as exhaustive search can guarantee finding the global optimum, they are typically associated with exponentially increasing computational complexity.

III. PROPOSED LOW-COMPLEXITY SOLUTION

To overcome the aforementioned issue, problem (17) is decomposed into two subproblems, namely 1) the user association and cache placement subproblem, and 2) the power allocation subproblem. These subproblems are then solved alternately in an iterative manner. It should be noted that the constant signaling overhead D_{ov}, introduced in the delay model, is omitted in the following objective function formulations. Since D_{ov} is a constant term independent of the optimization variables, its exclusion simplifies the mathematical derivation without affecting the optimal solution. In Section III-A and Section III-B, a low-complexity joint user association and cache placement algorithm based on McCormick envelope theory and the Lagrangian partial relaxation method is proposed, along with a low-complexity power allocation algorithm based on SCA.

A. Proposed algorithm for user association and cache placement

In this subsection, we propose the algorithms that solve the user association and cache placement subproblem. Given that the user association and cache placement subproblems are NP-hard, their complexity is exceedingly high. To reduce the problem's complexity, this subsection proposes a distributed algorithm. First, leveraging McCormick envelope theory, the optimization problem is equivalently transformed. Next, the transformed problem is solved using a Lagrangian partial relaxation method, breaking it down into several subproblems. The cache placement variables c_{n,f} and the user association variables x_{n,k} are coupled in the objective function (17a), making the problem challenging to solve.
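The alternating structure described above can be sketched generically before specializing its two halves. In the sketch below, `solve_assoc_cache` and `solve_power` are hypothetical placeholders standing in for the two subproblem solvers; the convergence test on the average delay is an assumption of this sketch, not a detail fixed by the paper.

```python
def alternating_optimization(init_power, solve_assoc_cache, solve_power,
                             avg_delay, max_outer=20, tol=1e-3):
    """Generic AO outer loop: alternate the two subproblems until the
    average delay stops improving. The two solver arguments are placeholders
    for the association/caching step and the power allocation step."""
    p = init_power
    prev = float("inf")
    for _ in range(max_outer):
        x, c = solve_assoc_cache(p)   # association + caching, power fixed
        p = solve_power(x, c)         # power allocation, (x, c) fixed
        cur = avg_delay(x, c, p)
        if abs(prev - cur) < tol:     # objective has (nearly) stabilized
            break
        prev = cur
    return x, c, p
```

Each step can only decrease (or keep) the objective, so the sequence of average delays is monotone non-increasing and the loop terminates.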
To decouple the user association and cache placement variables in problem (17), a new variable z^k_{f,n} is introduced, where z^k_{f,n} = (1 - c_{n,f}) x_{n,k}. Consequently, the optimization subproblem for a fixed power allocation can be reformulated as follows:

min_{x, c, z} (1/K) { \sum_{n=1}^{N} \sum_{k=1}^{K} (x_{n,k} D^F_{n,k} + z^k_{f,n} D^B_{n,k}) + \sum_{k=1}^{K} x_{0,k} D_{0,k} } (18a)

s.t. (17b)-(17g), (18b)
z^k_{f,n} = (1 - c_{n,f}) x_{n,k}, ∀n ∈ N^+, ∀k ∈ K, ∀f ∈ F. (18c)

Here, N^+ = N \ {0} denotes the set of FAPs excluding the CP, and (18c) is a non-convex constraint. By utilizing McCormick envelope theory [26], it is relaxed into the following linear inequalities:

z^k_{f,n} ≥ x_{n,k} - c_{n,f}, ∀n ∈ N^+, ∀k ∈ K, ∀f ∈ F, (19)
z^k_{f,n} ≥ 0, ∀k ∈ K, ∀f ∈ F, (20)
z^k_{f,n} ≤ x_{n,k}, ∀k ∈ K, ∀f ∈ F, (21)
z^k_{f,n} ≤ 1 - c_{n,f}, ∀k ∈ K, ∀f ∈ F. (22)

Given the binary nature of the optimization variables c_{n,f} and x_{n,k}, it can be rigorously proven that the equality z^k_{f,n} = (1 - c_{n,f}) x_{n,k} is equivalent to the constraints (19)-(22), as shown in Table II.

TABLE II: Equivalent Transformation between Equality (18c) and Convex Relaxation Conditions

x_{n,k}   c_{n,f}   z^k_{f,n}   max(x_{n,k} - c_{n,f}, 0)   min(x_{n,k}, 1 - c_{n,f})
0         0         0           0                            0
0         1         0           0                            0
1         0         1           1                            1
1         1         0           0                            0

Therefore, the optimization problem (18) can be further expressed as:

min_{x, c, z} (1/K) { \sum_{n=1}^{N} \sum_{k=1}^{K} (x_{n,k} D^F_{n,k} + z^k_{f,n} D^B_{n,k}) + \sum_{k=1}^{K} x_{0,k} D_{0,k} } (23a)

s.t. (17b)-(17g), (19)-(22). (23b)

To solve the new optimization problem (23), the Lagrangian partial relaxation method is employed [27]. Specifically, constraints (19), (21), and (22) are relaxed, and a set of non-negative Lagrangian multipliers is introduced:

λ^k_{f,n} ≥ 0, ∀n ∈ N^+, ∀k ∈ K, ∀f ∈ F, (24)
μ^k_{f,n} ≥ 0, ∀n ∈ N^+, ∀k ∈ K, ∀f ∈ F, (25)
ψ^k_{f,n} ≥ 0, ∀n ∈ N^+, ∀k ∈ K, ∀f ∈ F.
(26)

Thus, the Lagrangian function can be expressed as:

L(λ, μ, ψ, x, c, z) = (1/K) { \sum_{n=1}^{N} \sum_{k=1}^{K} (x_{n,k} D^F_{n,k} + z^k_{f,n} D^B_{n,k}) + \sum_{k=1}^{K} x_{0,k} D_{0,k} } + \sum_{n=1}^{N} \sum_{k=1}^{K} \sum_{f=1}^{F} [ μ^k_{f,n} (x_{n,k} - c_{n,f} - z^k_{f,n}) + λ^k_{f,n} (z^k_{f,n} - x_{n,k}) + ψ^k_{f,n} (z^k_{f,n} + c_{n,f} - 1) ]. (27)

The dual problem corresponding to the original optimization formulation can be derived by introducing the dual variables, leading to the following dual optimization problem:

max_{λ, μ, ψ} min_{x, c, z} L(λ, μ, ψ, x, c, z) (28a)

s.t. (17b)-(17g), (20), (24)-(26). (28b)

In this formulation, the objective function of the dual problem is separable and is given by

L(λ, μ, ψ, x, c, z) = f(x) + g(c) + h(z), (29)

where f(x), g(c), and h(z) correspond to the subproblems P1, P2, and P3, respectively. Moreover, the feasible region of problem (28) decomposes into three distinct regions, namely {(17b), (17d), (17e), (17g)}, {(17c), (17f)}, and {(20)}. As a result, problem (28) is decomposed into three separate subproblems, denoted as P1, P2, and P3, which can be formulated as follows:

P1: min_{x} (1/K) { \sum_{n=1}^{N} \sum_{k=1}^{K} x_{n,k} D^F_{n,k} + \sum_{k=1}^{K} x_{0,k} D_{0,k} } + \sum_{n=1}^{N} \sum_{k=1}^{K} \sum_{f=1}^{F} (μ^k_{f,n} x_{n,k} - λ^k_{f,n} x_{n,k}) (30a)
s.t. (17b), (17d), (17e), (17g). (30b)

P2: max_{c} \sum_{n=1}^{N} \sum_{k=1}^{K} \sum_{f=1}^{F} (μ^k_{f,n} c_{n,f} - ψ^k_{f,n} c_{n,f}) (31a)
s.t. (17c), (17f). (31b)

P3: min_{z} (1/K) \sum_{n=1}^{N} \sum_{k=1}^{K} z^k_{f,n} D^B_{n,k} + \sum_{n=1}^{N} \sum_{k=1}^{K} \sum_{f=1}^{F} (λ^k_{f,n} z^k_{f,n} + ψ^k_{f,n} z^k_{f,n} - μ^k_{f,n} z^k_{f,n}) (32a)
s.t. (20). (32b)

After this decomposition, the joint optimization problem is effectively reduced to separate optimization tasks, removing the coupling between the user association and cache placement variables. Given a specific set of Lagrange multipliers, these three subproblems are integer programming problems.
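The exactness of the McCormick linearization (19)-(22) underlying this decomposition (summarized in Table II) can be confirmed by direct enumeration of the binary combinations. The helper name below is ours:

```python
from itertools import product

def mccormick_bounds(x, c):
    """Tightest bounds on z implied by the McCormick inequalities (19)-(22)
    for binary x (association) and c (caching)."""
    lower = max(x - c, 0)   # from (19) and (20)
    upper = min(x, 1 - c)   # from (21) and (22)
    return lower, upper

# Enumeration check of Table II: the bounds pin z exactly to (1 - c) * x,
# so the relaxation is exact for binary variables.
for x, c in product((0, 1), repeat=2):
    lo, hi = mccormick_bounds(x, c)
    assert lo == hi == (1 - c) * x
```

Since the lower and upper bounds coincide at every binary point, no integrality is lost by replacing the bilinear equality (18c) with the four linear inequalities.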
Specifically, P1, which involves user association, can be addressed using the Hungarian algorithm [28]. Meanwhile, P2, which involves the cache placement decisions, can be efficiently solved via integer linear programming [29]. For P3, z^k_{f,n} is set to 1 if (1/K) D^B_{n,k} + λ^k_{f,n} + ψ^k_{f,n} - μ^k_{f,n} < 0, and to 0 otherwise. Upon solving these subproblems, the optimal values of x, c, and z are obtained. Subsequently, the dual problem is addressed using the subgradient method, where the subgradients of the dual function are computed as follows:

∇μ^k_{f,n}(t) = x_{n,k}(t) - c_{n,f}(t) - z^k_{f,n}(t), ∀n ∈ N^+, ∀k ∈ K, ∀f ∈ F, (33)
∇λ^k_{f,n}(t) = z^k_{f,n}(t) - x_{n,k}(t), ∀n ∈ N^+, ∀k ∈ K, ∀f ∈ F, (34)
∇ψ^k_{f,n}(t) = z^k_{f,n}(t) + c_{n,f}(t) - 1, ∀n ∈ N^+, ∀k ∈ K, ∀f ∈ F. (35)

In the (t+1)-th iteration, the subgradients ∇μ^k_{f,n}(t), ∇λ^k_{f,n}(t), and ∇ψ^k_{f,n}(t) are used to update the three dual variables, yielding the following update expressions:

μ^k_{f,n}(t+1) = [ μ^k_{f,n}(t) + ξ(t) ∇μ^k_{f,n}(t) ]^+, ∀n ∈ N^+, ∀k ∈ K, ∀f ∈ F, (36)
λ^k_{f,n}(t+1) = [ λ^k_{f,n}(t) + ξ(t) ∇λ^k_{f,n}(t) ]^+, ∀n ∈ N^+, ∀k ∈ K, ∀f ∈ F, (37)
ψ^k_{f,n}(t+1) = [ ψ^k_{f,n}(t) + ξ(t) ∇ψ^k_{f,n}(t) ]^+, ∀n ∈ N^+, ∀k ∈ K, ∀f ∈ F, (38)

where [x]^+ = max{x, 0} denotes the non-negative part of x, and ξ(t) = 0.01/√t is the positive step size at the (t+1)-th iteration. This step size is diminishing and non-summable; as demonstrated in [30], such a step size ensures that the algorithm converges to the optimal dual solution.

Algorithm 1 Joint User Association and Cache Placement Algorithm
1: Set μ^k_{f,n}(1) = 0.1 for files f requested by user k, else 0; λ^k_{f,n}(1) = 0, ψ^k_{f,n}(1) = 0, W(1) = 0, t = 1, t_max = 20, ε = 10^{-3}.
Compute the initial decoding order and F_push.
2: While t ≤ t_max and |W(t) - W(t-1)| ≥ ε do
3:   Solve subproblem P1 for x(t).
4:   Solve subproblem P2 for c(t).
5:   Solve subproblem P3 for z(t).
6:   Update the decoding order and F_push based on x(t) and c(t).
7:   Compute the objective value W(t) = (1/K) { \sum_{n=1}^{N} \sum_{k=1}^{K} (x_{n,k} D^F_{n,k} + z^k_{f,n} D^B_{n,k}) + \sum_{k=1}^{K} x_{0,k} D_{0,k} }.
8:   Update the multipliers via (36)-(38) with ξ(t) = 0.01/√t.
9:   t = t + 1
10: End While
11: Output: x* = x(t), c* = c(t)

The proposed algorithm, outlined in Algorithm 1, addresses the joint user association and cache placement optimization problem. Its overall computational complexity is evaluated under the assumption that subproblems P1, P2, and P3 are executed in parallel within each iteration. The analysis considers the per-iteration complexity and the number of iterations required for convergence. P1 is solved using the Hungarian algorithm, optimizing the user assignments across the N + 1 nodes. Its complexity is O(K³), driven by the cubic scaling of the Hungarian method with the number of UEs K, which dominates other computations such as the decoding order and interference calculations. P2 optimizes the caching decisions of each of the N FAPs via integer linear programming. Using practical solvers (e.g., interior-point methods for the relaxed problems), the complexity per FAP is O(F^{3.5}), where F is the number of files, so the total complexity across the N FAPs is O(N F^{3.5}). P3 optimizes the auxiliary variables over the FAP-UE-file triplets, with a complexity of O(NKF), as it scales linearly with the number of FAPs N, UEs K, and files F. When P1, P2, and P3 are solved in parallel, the per-iteration complexity is governed by the most computationally intensive subproblem. Since the algorithm converges in O(1/ε) iterations, the total time complexity is O((1/ε) · max{K³, N F^{3.5}, NKF}).
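The two lightweight pieces of Algorithm 1 — the closed-form solution of P3 and the projected subgradient updates of lines 8 — can be sketched directly from (32)-(38). The tensor layout (N, K, F) and the function names are implementation choices of this sketch, not specified in the paper:

```python
import numpy as np

def solve_p3(D_B, lam, psi, mu, K):
    """Closed-form P3: set z^k_{f,n} = 1 exactly when its coefficient in
    (32a), (1/K) D^B_{n,k} + lambda + psi - mu, is negative.
    D_B has shape (N, K); the multipliers have shape (N, K, F)."""
    coeff = D_B[:, :, None] / K + lam + psi - mu
    return (coeff < 0).astype(float)

def update_multipliers(mu, lam, psi, x, c, z, t):
    """Projected subgradient step (33)-(38) with diminishing step 0.01/sqrt(t).
    x: (N, K) association, c: (N, F) caching, z: (N, K, F) auxiliary."""
    xi = 0.01 / np.sqrt(t)
    xb, cb = x[:, :, None], c[:, None, :]            # broadcast onto (N, K, F)
    mu = np.maximum(mu + xi * (xb - cb - z), 0.0)    # (33), (36)
    lam = np.maximum(lam + xi * (z - xb), 0.0)       # (34), (37)
    psi = np.maximum(psi + xi * (z + cb - 1), 0.0)   # (35), (38)
    return mu, lam, psi
```

The projection `np.maximum(·, 0.0)` implements [x]^+ elementwise, keeping all multipliers dual-feasible.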
An exhaustive search would instead yield a complexity of O((N+1)^K), which scales exponentially with K. In contrast, the proposed algorithm's polynomial complexity offers substantial efficiency, making it practical for large-scale wireless networks.

B. Proposed power allocation algorithm

In this subsection, we address the power allocation subproblem under fixed user association and cache placement, as determined by the solutions to the subproblems discussed in Section III-A. It is important to highlight that the model considered here does not account for cross-layer interference between the CP and the FAPs, focusing only on intra-cell and inter-cell interference; for each layer, the power allocation coefficients within that layer do not influence the performance of the other layer. The objective is to minimize the average content delivery delay across a wireless network comprising a CP and N FAPs, while adhering to the power constraints and accounting for interference and caching effects. The power allocation coefficients are re-optimized within each iteration of the proposed joint user association and cache placement algorithm, enabling the simultaneous improvement of both aspects. Under fixed user association and cache placement, the original problem (17) reduces to:

min_{p_{n,k}} (1/K) { \sum_{n=1}^{N} \sum_{k∈K_n} D_{n,k} + \sum_{k∈K_0} D_{0,k} } (39a)

s.t. \sum_{k∈K_n} p_{n,k} ≤ p_n^{max}, ∀n ∈ N^+, (39b)
p_{n,k} ≥ 0, ∀n ∈ N^+, ∀k ∈ K_n, (39c)
p_{0,k} + \sum_{f∈F_push} p_{0,f} ≤ p_0^{max}, (39d)
p_{0,k} ≥ 0, p_{0,f} ≥ 0, ∀k ∈ K_0, ∀f ∈ F_push. (39e)

This problem is non-convex due to the interference-coupled rate expressions. To make it tractable, we decompose it into two interdependent subproblems: FAP power allocation with interference coordination, and CP power allocation with a push-delivery tradeoff. The FAP power allocation subproblem must account for inter-FAP interference while satisfying the per-FAP power constraints.
This subproblem is formulated as follows:

min_{p_{n,k}} \sum_{n=1}^{N} \sum_{k∈K_n} \sum_{f=1}^{F} \frac{r_{k,f} s_f}{R_{n,k}} (40a)

s.t. \sum_{k∈K_n} p_{n,k} ≤ p_n^{max}, ∀n ∈ N^+, (40b)
p_{n,k} ≥ 0, ∀n ∈ N^+, ∀k ∈ K_n. (40c)

The CP power allocation subproblem balances direct content delivery to users and proactive content pushing to the FAPs. It is expressed as follows:

min_{p_{0,k}, p_{0,f}} \sum_{k∈K_0} \sum_{f=1}^{F} \frac{r_{k,f} s_f}{R_{0,k}} + \sum_{n=1}^{N} \sum_{k∈K^f_n} \sum_{f∈F_push} \frac{r_{k,f} s_f}{R^f_{0,n}}, (41a)

s.t. p_{0,k} + \sum_{f∈F_push} p_{0,f} ≤ p_0^{max}, (41b)
p_{0,k} ≥ 0, p_{0,f} ≥ 0, ∀k ∈ K_0, ∀f ∈ F_push. (41c)

Solving these two subproblems provides the solution to the original power allocation problem (39); the CP power allocation problem is conceptually similar to that of the FAPs. Problem (40) is non-convex due to its non-convex objective function (40a), although the constraints are convex. This type of problem is NP-hard, and a globally optimal solution can in principle be found using the branch-and-bound (BB) algorithm [31]. However, the exhaustive-search nature of this method results in high computational complexity and inefficiency. To overcome these challenges, we propose a low-complexity solution based on the SCA technique, which effectively addresses problem (40). First, we derive an equivalent reformulation of (40); the non-convex constraint is then approximated by its first-order expansion to obtain a convex constraint, and an iterative algorithm is obtained by applying SCA to the transformed problem. Specifically, in the l-th iteration of the proposed algorithm, problem (40) is equivalently transformed into the following problem:

min_{p_{n,k}, \hat{D}, τ_{n,k}, υ_{n,k}, ς_{n,k}} \hat{D} (42a)

s.t.
\sum_{n=1}^{N} \sum_{k∈K_n} \sum_{f=1}^{F} \frac{r_{k,f} s_f}{τ_{n,k}} ≤ \hat{D}, (42b)
τ_{n,k} ≤ B log_2(1 + υ_{n,k}), (42c)
υ_{n,k} ≤ \frac{|h_{n,k}|² p_{n,k}}{ς_{n,k}}, (42d)
|h_{n,k}|² \sum_{i=k+1}^{t} p_{n,i} + \sum_{j=1, j≠n}^{N} |h_{j,k}|² p_j + σ² ≤ ς_{n,k}, (42e)
(40b), (40c), (42f)

where n ∈ N^+, f ∈ F, k ∈ K. Let (p*_{n,k}, τ*_{n,k}, υ*_{n,k}, ς*_{n,k}) and \hat{D}* denote the optimal solution and the corresponding optimal value, respectively.

Lemma 1 (Equivalence between problem (40) and problem (42)): Problem (40) is equivalent to problem (42).

Proof: Problem (40) is a non-convex optimization problem, primarily due to the non-convexity of its objective function. To aid in solving it, we introduce a new variable \hat{D}, allowing the original problem to be equivalently reformulated as follows:

min_{p_{n,k}, \hat{D}} \hat{D} (43a)

s.t. \sum_{n=1}^{N} \sum_{k∈K_n} \sum_{f=1}^{F} \frac{r_{k,f} s_f}{R_{n,k}} ≤ \hat{D}, (43b)
(40b), (40c). (43c)

The equivalence between problem (43) and problem (40) follows directly, since constraint (43b) holds with equality at the optimum. By introducing the new variables τ ≜ (τ_{n,k})_{n∈N, k∈K} and focusing on constraint (43b), it is equivalent to the following two constraints:

\sum_{n=1}^{N} \sum_{k∈K_n} \sum_{f=1}^{F} \frac{r_{k,f} s_f}{τ_{n,k}} ≤ \hat{D}, (44)
τ_{n,k} ≤ R_{n,k}, (45)

where n ∈ N^+, f ∈ F, k ∈ K. Next, by introducing two further variables υ ≜ (υ_{n,k})_{n∈N, k∈K} and ς ≜ (ς_{n,k})_{n∈N, k∈K}, constraint (45) can be equivalently rewritten as:

τ_{n,k} ≤ B log_2(1 + υ_{n,k}), (46)
υ_{n,k} ≤ \frac{|h_{n,k}|² p_{n,k}}{ς_{n,k}}, (47)
|h_{n,k}|² \sum_{i=k+1}^{t} p_{n,i} + \sum_{j=1, j≠n}^{N} |h_{j,k}|² p_j + σ² ≤ ς_{n,k}. (48)

With the above analysis, the equivalence transformation of problem (40) follows, since constraints (44) and (46)-(48) hold with equality at the optimum [32]. The proof is completed. ■

All constraints in problem (42) are convex except for (42d).
First, consider the transformation of constraint (42d), where the bilinear term υ_{n,k} ς_{n,k} can be rewritten as follows:

υ_{n,k} ς_{n,k} = \frac{1}{4} [ (υ_{n,k} + ς_{n,k})² - (υ_{n,k} - ς_{n,k})² ]. (49)

After this transformation, the expression within the brackets is a difference of two convex functions, a non-convex form that can be efficiently addressed using SCA [33]. The core idea behind SCA is to approximate the non-convex part with a first-order Taylor expansion, which provides a convex lower or upper bound. In (49), the second term in the brackets is replaced by its linear lower bound. Applying the SCA method, the Taylor expansion at the point (υ^{(Φ)}_{n,k}, ς^{(Φ)}_{n,k}) is derived as follows:

f(υ_{n,k}, ς_{n,k}) = (υ_{n,k} - ς_{n,k})² ≥ (υ^{(Φ)}_{n,k} - ς^{(Φ)}_{n,k})² + 2 (υ^{(Φ)}_{n,k} - ς^{(Φ)}_{n,k}) [ (υ_{n,k} - υ^{(Φ)}_{n,k}) - (ς_{n,k} - ς^{(Φ)}_{n,k}) ] ≜ g(υ_{n,k}, ς_{n,k}, υ^{(Φ)}_{n,k}, ς^{(Φ)}_{n,k}). (50)

At the point (υ_{n,k}, ς_{n,k}) = (υ^{(Φ)}_{n,k}, ς^{(Φ)}_{n,k}), the affine approximation coincides with the function f(υ_{n,k}, ς_{n,k}). During the iterative process, the variable values (υ_{n,k}, ς_{n,k}) are updated to (υ^{(Φ+1)}_{n,k}, ς^{(Φ+1)}_{n,k}), satisfying the following conditions:

f(υ_{n,k}, ς_{n,k}) ≥ g(υ_{n,k}, ς_{n,k}, υ^{(Φ)}_{n,k}, ς^{(Φ)}_{n,k}),
f(υ^{(Φ)}_{n,k}, ς^{(Φ)}_{n,k}) = g(υ^{(Φ)}_{n,k}, ς^{(Φ)}_{n,k}, υ^{(Φ)}_{n,k}, ς^{(Φ)}_{n,k}),
∇f(υ_{n,k}, ς_{n,k}) |_{(υ^{(Φ)}_{n,k}, ς^{(Φ)}_{n,k})} = ∇g(υ_{n,k}, ς_{n,k}, υ^{(Φ)}_{n,k}, ς^{(Φ)}_{n,k}) |_{(υ^{(Φ)}_{n,k}, ς^{(Φ)}_{n,k})}, (51)

where ∇f denotes the gradient of the function. Consequently, problem (42) can be transformed into

min_{p, \hat{D}, τ, υ, ς} \hat{D} (52a)

s.t. (υ_{n,k} + ς_{n,k})² - g(υ_{n,k}, ς_{n,k}, υ^{(Φ)}_{n,k}, ς^{(Φ)}_{n,k}) ≤ 4 |h_{n,k}|² p_{n,k}, (52b)
(40b), (40c), (42b), (42c), (42e). (52c)

The optimization problem (52) is convex and can be efficiently solved in the Φ-th iteration, given the initial values υ^{(Φ)}_{n,k} and ς^{(Φ)}_{n,k} [27].
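The three surrogate properties in (51) — global lower bound, tightness at the expansion point, and gradient match there — are easy to sanity-check numerically for f(υ, ς) = (υ - ς)². The test points below are arbitrary:

```python
# Sanity check of the SCA surrogate in (50)-(51): g is the first-order
# expansion of the convex function f(u, s) = (u - s)^2 at (u0, s0), so it
# lower-bounds f everywhere and is tight at the expansion point.
def f(u, s):
    return (u - s) ** 2

def g(u, s, u0, s0):
    return (u0 - s0) ** 2 + 2 * (u0 - s0) * ((u - u0) - (s - s0))

u0, s0 = 2.0, 0.5
for u, s in [(2.0, 0.5), (3.0, 1.0), (0.0, 4.0), (-1.5, 2.25)]:
    assert f(u, s) >= g(u, s, u0, s0) - 1e-12   # global lower bound
assert f(u0, s0) == g(u0, s0, u0, s0)           # tight at the point

# gradient match at (u0, s0), checked by central finite differences
eps = 1e-6
dfdu = (f(u0 + eps, s0) - f(u0 - eps, s0)) / (2 * eps)
dgdu = (g(u0 + eps, s0, u0, s0) - g(u0 - eps, s0, u0, s0)) / (2 * eps)
assert abs(dfdu - dgdu) < 1e-6
```

Because the surrogate under-estimates the subtracted term in (49), each SCA iterate stays feasible for (42), which is what guarantees the monotone improvement exploited below.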
As established in [22], the SCA-based approach to solving problem (40) converges to a local optimum; with an increasing number of iterations, the algorithm approaches a KKT point of the original problem (42). Similar to Lemma 1, the CP power allocation subproblem (41) can be equivalently transformed into the optimization problem (53) through variable substitution and constraint reformulation. The structural similarity between problems (40) and (41) allows the SCA methodology developed for the FAPs to be applied directly to the CP case. Specifically, the non-convex terms in (41a) can be handled using the same convex approximation technique presented in (49)-(51); the detailed derivation is therefore omitted for brevity. The CP optimization problem can be transformed into

min_{p_{0,k}, p_{0,f}, \hat{D}_{CP}, τ_{0,k}, τ^f_{0,n}, υ_{0,k}, υ^f_{0,n}, ς_{0,k}, ς^f_{0,n}} \hat{D}_{CP} (53a)

s.t. \sum_{k∈K_0} \sum_{f=1}^{F} \frac{r_{k,f} s_f}{τ_{0,k}} + \sum_{n=1}^{N} \sum_{k∈K^f_n} \sum_{f∈F_push} \frac{r_{k,f} s_f}{τ^f_{0,n}} ≤ \hat{D}_{CP}, (53b)
τ_{0,k} ≤ B log_2(1 + υ_{0,k}), (53c)
τ^f_{0,n} ≤ B log_2(1 + υ^f_{0,n}), (53d)
υ_{0,k} ≤ \frac{|h_{0,k}|² p_{0,k}}{ς_{0,k}}, (53e)
ς_{0,k} ≥ |h_{0,k}|² \sum_{f∈F_push} p_{0,f} + σ², (53f)
υ^f_{0,n} ≤ \frac{L^{-1}_{0,n} p_{0,f}}{ς^f_{0,n}}, (53g)
ς^f_{0,n} ≥ L^{-1}_{0,n} \sum_{i=f+1}^{|F_push|} p_{0,i} + σ², (53h)
(41b), (41c), (53i)

where n ∈ {0}, f ∈ F, k ∈ K. Let (p*_{0,k}, p*_{0,f}), (τ*_{0,k}, τ^{f*}_{0,n}), (υ*_{0,k}, υ^{f*}_{0,n}), (ς*_{0,k}, ς^{f*}_{0,n}), and \hat{D}*_{CP} denote the optimal solution and the corresponding optimal value, respectively.

Algorithm 2 SCA-based Power Allocation Algorithm
1: Initialize Φ = 0, FAP powers P^{(0)}_{FAP}, CP powers P^{(0)}_{CP}, and auxiliary variables (υ^{(0)}_{n,k}, ς^{(0)}_{n,k}).
2: FAP Power Allocation:
3: While Φ < Φ_max and not converged do
4:   Update (υ^{(Φ)}_{n,k}, ς^{(Φ)}_{n,k}) using P^{(Φ)}_{FAP} and the NOMA decoding order.
5:   Solve the convex approximation (52) for P^{(Φ+1)}_{FAP}.
6:   Check convergence with tolerance ε_FAP. Set Φ = Φ + 1.
7: End While
8: CP Power Allocation:
9: Set Φ = 0.
10: While Φ < Φ_max and not converged do
11:   Update (υ^{(Φ)}_{CP,k}, ς^{(Φ)}_{CP,k}) using P^{(Φ)}_{CP}.
12:   Solve the convex approximation (53) for P^{(Φ+1)}_{CP}.
13:   Check convergence with tolerance ε_CP. Set Φ = Φ + 1.
14: End While
15: Output: Optimized powers P*_{FAP} and P*_{CP}.

The computational complexity of the proposed SCA-based power allocation algorithm, outlined in Algorithm 2, is driven by the convex optimization steps for the FAP and CP power allocations. The algorithm optimizes the power for K users, N FAPs, and F content files across N + 1 nodes (the CP and the FAPs). Using CVX with an interior-point method, the per-iteration complexity is as follows. The FAP step optimizes P_FAP ∈ R^{N×K} for the users served by the N FAPs; assuming K/N users per FAP, the complexity per FAP is O((K/N)^{3.5}), yielding a total of O(K^{3.5}/N^{2.5}) across the N FAPs. The CP step optimizes the direct-service powers p_{0,k} and the push powers p_{0,f} for the F files, with complexity O((1+F)^{3.5}). Combining the sequential FAP and CP optimizations, the per-iteration complexity is O(K^{3.5}/N^{2.5} + (1+F)^{3.5}). With a maximum of Φ_max iterations, the total complexity is O(Φ_max (K^{3.5}/N^{2.5} + (1+F)^{3.5})). Compared to an exhaustive search with complexity O(2^{NK+K+F}), the proposed algorithm's polynomial complexity ensures scalability.

C. Overall Algorithm

In this subsection, we present a comprehensive overview of the proposed AO algorithm, which integrates user association, cache placement, and power allocation to minimize the average delay. The algorithm iteratively solves the subproblems to achieve a near-optimal solution, balancing computational complexity and performance. The iterative optimization process is outlined as follows: 1) Initialization: begin with an initial configuration of user association, cache placement, and power allocation.
2) User association and cache placement optimization: employ Algorithm 1 to optimize the user association and cache placement based on the current power allocation coefficients. 3) Power allocation optimization: utilize Algorithm 2 to refine the power allocation coefficients given the fixed user association and cache placement; this step leverages SCA to iteratively improve the power allocation. 4) Iteration: alternate between Algorithm 1 and Algorithm 2, updating the user association, cache placement, and power allocation in each iteration.

The total complexity depends on the inner iterations of the subproblems and the outer iterations of the AO algorithm, which alternates between Algorithm 1 and Algorithm 2. Let T_outer denote the number of outer iterations. The total complexity of the AO algorithm is then O(T_outer ((1/ε) max{K³, N F^{3.5}, NKF} + Φ_max (K^{3.5}/N^{2.5} + (1+F)^{3.5}))). Although this expression involves multiple terms, it is dominated by the low-order polynomial terms O(K^{3.5} + N F^{3.5}) per iteration in practical settings where K, N, and F are moderate (e.g., tens of users and files, as is typical in edge networks). This scaling aligns with standard interior-point methods for convex optimization [27], and is fundamentally different from exponential-complexity global solvers (e.g., the branch-and-bound method used in LINGO) or the NP-hard exhaustive search with complexity O(2^{NKF} · K!), which are computationally prohibitive even for small-scale instances due to the combinatorial nature of the MINLP [31]. Thus, the proposed JACPM+SCA framework achieves an excellent trade-off between near-optimal performance and practical scalability, justifying its classification as a low-complexity solution suitable for real-time deployment in 6G edge networks.

IV.
NUMERICAL RESULTS AND DISCUSSIONS

This section presents the simulation results and validates the effectiveness of the proposed algorithm. In the simulation setup, a circular area with a radius of R_D = 500 m is considered, with the CP located at the center. The three fixed FAPs are positioned at (0, 0.5 R_D), (0.4 R_D, -0.3 R_D), and (-0.4 R_D, -0.3 R_D). The K = 7 UEs are uniformly and independently distributed. Users request content files according to the Zipf probability distribution: under this file popularity model, the request probability of the i-th file is given by P(F_i) = i^{-r} / \sum_{p=1}^{F} p^{-r}, where r > 0 represents the skewness of the content popularity; the larger the Zipf parameter, the more uneven the popularity distribution. Each simulation result is averaged over 100 channel realizations. The transmit power, noise power, path loss model, and other physical-layer parameters adhere to 3GPP standards and are informed by state-of-the-art research on F-RANs [34]. Unless otherwise specified, the other simulation parameters are summarized in Table III.²

The proposed NOMA-based F-RANs framework is evaluated against three benchmark schemes: OMA-based F-RANs, fixed-power NOMA-based F-RANs, and the Most Popular Content Maximum SINR (MCP-MS) scheme.

• OMA: In this scheme, TDMA is employed as the multiple-access technique at the access points. The time slots for the UEs' received signals are evenly distributed, with each UE assigned a time-slot allocation factor of 1/K.

² In this simulation section, we focus on quantifying the performance gains achieved specifically by the proposed resource allocation algorithm.
Since D_{ov} is a constant determined by protocol standards and cannot be reduced through the optimization of the decision variables, and considering that it is generally negligible compared to the transmission delay in the examined scenarios, we assume D_{ov} ≈ 0 in the following numerical results.

TABLE III: System Simulation Parameters

Parameter                            Value
Transmit power of the CP             40 dBm
Transmit power of the FAP            30 dBm
Path loss model for the CP           15.3 + 37.6 log10(d), d (m)
Path loss model for the FAP          38.46 + 20 log10(d), d (m)
Noise power spectral density σ²      -174 dBm/Hz
Bandwidth                            10 MHz
File size                            10 Kb
Cache capacity S_n                   20 Kb
User capacity F_n                    (1, 2, 2, 2)
Number of files                      10
Number of users                      7
Zipf skewness (γ)                    0.8

• Fixed-NOMA: For any cluster K_n containing t UEs in a cell, the power allocation coefficient of the i-th UE is defined as p_{n,i} = (t - i + 1)/μ, where μ = \sum_{i=1}^{t} i ensures \sum_{i=1}^{t} p_{n,i} = 1.

• MCP-MS: In the benchmark scheme MCP-MS [35], users select the access node based on the maximum-SINR criterion, and the most popular file content is cached.

The effectiveness of the proposed algorithm, combining joint user association and cache placement (JACPM) with successive convex approximation (SCA) power allocation (JACPM+SCA), is evaluated against these benchmark schemes.

Fig. 2: Average delay under different maximum transmission power of the FAP; (a) cache capacity = 0 Kb, (b) cache capacity = 20 Kb. (Curves: JACPM+SCA vs. LINGO's global solver.)

Fig. 2 evaluates the average transmission delay of the proposed JACPM+SCA algorithm against the LINGO global solver under varying FAP maximum transmit powers.
The evaluation considers two cache capacity scenarios: 0 Kb and 20 Kb. The LINGO solver serves as a benchmark [31]: employing a general branch-and-bound approach, it provides a baseline for the achievable delay, at a significantly higher computational cost. The proposed JACPM+SCA algorithm consistently reduces the average delay across all tested transmit power levels (0 dBm to 30 dBm). Although LINGO yields a slightly lower delay owing to its global optimization nature, the marginal gap demonstrates that JACPM+SCA effectively approximates the optimal solution with substantially lower computational overhead. Fig. 2(b) presents the delay performance for a cache capacity of 20 Kb. Here, JACPM+SCA records an average delay of 12.7835 ms at 30 dBm, compared to LINGO's 11.7235 ms, a difference of 1.0600 ms (approximately 9.0% higher). The smaller performance gap at the larger cache capacity highlights JACPM+SCA's ability to leverage caching effectively, further reducing transmission delays and approaching the globally optimal solution provided by LINGO.

TABLE IV: Computation Time Comparison at FAP Maximum Power of 30 dBm

Cache Capacity (Kb)    JACPM+SCA (s)    LINGO (s)
0                      12.89            7162
20                     3.83             1898

The computational efficiency of JACPM+SCA is further substantiated in Table IV, which compares the computation times at a FAP transmit power of 30 dBm. For a cache capacity of 0 Kb, JACPM+SCA requires only 12.89 s, while LINGO demands 7162 s (approximately 2 hours), a computational speedup exceeding 550 times. With a cache capacity of 20 Kb, JACPM+SCA completes in 3.83 s, compared to LINGO's 1898 s, a speedup of nearly 500 times. This dramatic reduction in computation time positions JACPM+SCA as a highly practical solution for real-time applications in large-scale wireless networks.
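For reference, the Zipf request-popularity model driving the caching results in this section (skewness 0.8 in Table III) can be generated in a few lines; the function name is illustrative:

```python
import numpy as np

def zipf_popularity(F, r):
    """Request probability of the i-th most popular file:
    P(F_i) = i^{-r} / sum_{p=1}^{F} p^{-r} (Zipf distribution with skewness r)."""
    ranks = np.arange(1, F + 1, dtype=float)
    weights = ranks ** (-r)
    return weights / weights.sum()

popularity = zipf_popularity(10, 0.8)   # F = 10 files, r = 0.8 as in Table III
```

A larger r concentrates the request mass on the first few files, which is exactly the effect behind the delay reduction studied later versus the Zipf parameter (Fig. 7).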
Fig. 3: Average transmission delay versus FAP maximum transmission power for different schemes (JACPM+OMA, JACPM+Fixed-NOMA, JACPM+SCA, MCP-MS+OMA, MCP-MS+Fixed-NOMA, MCP-MS+SCA).

Fig. 3 evaluates the average transmission delay of the different schemes under FAP maximum transmission powers ranging from 0 dBm to 30 dBm. The proposed JACPM+SCA consistently outperforms all benchmark schemes across the tested power range. Specifically, at a FAP maximum transmit power of 30 dBm, the proposed scheme achieves an average delay of 12.7835 ms, compared to 13.4967 ms for JACPM+OMA, 14.1593 ms for JACPM+Fixed-NOMA, and 15.9902 ms for MCP-MS+SCA, corresponding to delay reductions of approximately 5.3%, 9.7%, and 20.1%, respectively. The comparison with the OMA benchmark acts as an ablation study, clearly quantifying the benefit brought by NOMA: the achieved latency reduction confirms that power-domain multiplexing successfully enables concurrent content pushing and user request servicing, thereby overcoming the resource-partitioning overhead of OMA and markedly improving system-wide efficiency. Furthermore, the analysis reveals that the transmission delay decreases noticeably as the FAP transmission power increases within the low-power region. For instance, the delay of JACPM+SCA drops from 13.3416 ms at 0 dBm to 12.8598 ms at 10 dBm, a reduction of approximately 3.6%. However, this improvement diminishes in the high-power region, where the delay only decreases from 12.8090 ms to 12.7835 ms; further delay reduction becomes marginal due to the interference-limited nature of the NOMA clusters.
When the FAP transmit power exceeds approximately 10 dBm, both the desired signal and the intra-cluster/inter-cluster interference increase proportionally with the transmit power, resulting in an almost constant SINR and thus a saturation of the achievable rate and delay.

Fig. 4: Average transmission delay versus cell radius for different schemes (JACPM+OMA, JACPM+Fixed-NOMA, JACPM+SCA, MCP-MS+OMA, MCP-MS+Fixed-NOMA, MCP-MS+SCA).

In the following, we conduct a comprehensive sensitivity analysis to evaluate how the system performance responds to variations in key physical and network parameters; this analysis validates the robustness of the proposed model and algorithm under diverse network conditions. Fig. 4 analyzes the impact of the cell radius on the average transmission delay, evaluating the proposed JACPM+SCA scheme against the benchmarks as the cell radius varies from 500 m to 4000 m. As the cell radius increases, the average channel gain diminishes due to the increased path loss, leading to reduced transmission rates and elevated delays across all schemes. At a radius of 4000 m, JACPM+SCA records a delay of 12.9229 ms, compared to 13.6018 ms for JACPM+OMA (a 5.0% relative increase) and 16.1382 ms for MCP-MS+SCA (a 24.8% relative increase). The widening performance gap highlights the adaptability of JACPM+SCA to worsening channel conditions. As the radius decreases, the rise in signal power is largely offset by a corresponding increase in interference power, stabilizing the SINR across the range of cell radii. Since the transmission rate depends logarithmically on the SINR and the delay is inversely related to the rate, this stability translates into a gradual rather than pronounced change in delay. This trend is consistent across all schemes in Fig.
4, where the shallow slopes of the delay curves reflect the counterbalancing effects of channel gain and interference.

[Fig. 5: Average transmission delay versus FAP cache capacity for different schemes.]

Fig. 5 illustrates the relationship between average transmission delay and FAP cache capacity. The proposed JACPM + SCA scheme consistently achieves the lowest transmission delay across all cache capacities. At a cache capacity of 20 Kb, JACPM + SCA records an average delay of 12.7835 ms, reducing the delay by 29.2% compared to MCP-MS + Fixed-NOMA (18.0235 ms), 20.1% compared to MCP-MS + SCA (15.9902 ms), and 9.7% compared to JACPM + Fixed-NOMA (14.1593 ms). Beyond 20 Kb, the delay of all schemes stabilizes, indicating that the cache capacity is sufficient to store all requested content, eliminating additional fronthaul delays. The decrease in delay with increasing cache capacity is attributed to the expanded feasible region for content caching: larger cache capacities allow FAPs to store more files, increasing the likelihood of satisfying user requests locally and reducing reliance on fronthaul links, which incur higher delays.

[Fig. 6: Average transmission delay versus number of users for different schemes.]

Fig. 6 examines the impact of the number of users K on the average transmission delay. As shown in Fig. 6, the proposed JACPM + SCA scheme consistently achieves the lowest transmission delay across all user counts. Notably, at K = 16, JACPM + SCA reduces the delay by 14.3% compared to JACPM + Fixed-NOMA, 86.1% compared to MCP-MS + SCA, and 91.6% compared to MCP-MS + Fixed-NOMA, highlighting its superior scalability.
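The caching behavior in Fig. 5 can be made concrete with a toy model: a FAP caches the most popular files first, a request hits the cache with the aggregate popularity of the cached files, and a miss pays an extra fronthaul hop. Every parameter below (library size, file size, per-hop delays, Zipf exponent) is an illustrative assumption, not a value from the paper's simulation setup.

```python
# Toy model of expected delivery delay vs. FAP cache capacity under a
# Zipf popularity law. All parameters are illustrative assumptions.
F = 50                            # files in the content library
FILE_KB = 1.0                     # uniform file size, Kb
D_LOCAL, D_FRONTHAUL = 10.0, 8.0  # ms: FAP delivery vs. extra fronthaul hop

def zipf_popularity(alpha: float, n_files: int = F) -> list[float]:
    """Request probability of each file, most popular first."""
    weights = [rank ** -alpha for rank in range(1, n_files + 1)]
    total = sum(weights)
    return [w / total for w in weights]

def expected_delay(cache_kb: float, alpha: float = 0.9) -> float:
    """Cache the most popular files first; a cache miss pays the fronthaul hop."""
    cached = min(int(cache_kb // FILE_KB), F)
    hit = sum(zipf_popularity(alpha)[:cached])
    return D_LOCAL + (1.0 - hit) * D_FRONTHAUL

for cache in (0, 10, 20, 50):
    print(f"cache = {cache:2d} Kb -> expected delay = {expected_delay(cache):.2f} ms")

# Delay falls steeply at first, then flattens once the cache covers most of
# the request mass. A larger Zipf exponent (more skewed popularity) lets a
# small cache capture more requests, foreshadowing the trend in Fig. 7:
for alpha in (0.5, 0.9, 1.5):
    print(f"alpha = {alpha}: delay with 10 Kb cache = {expected_delay(10, alpha):.2f} ms")
```

The flattening beyond full library coverage matches the saturation seen beyond 20 Kb in Fig. 5, though the absolute numbers here are arbitrary.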
The increase in delay with more users is driven by heightened content request demands, which strain the limited FAP cache capacity. As K grows, the diversity of file requests increases, reducing cache hit rates and necessitating more fronthaul transmissions, which incur higher delays. Additionally, in NOMA-based F-RANs, the increased number of users per cluster amplifies intra-cluster interference, further reducing transmission rates.

[Fig. 7: Average transmission delay versus Zipf parameter for different schemes.]

Fig. 7 evaluates the impact of content popularity skewness, characterized by the Zipf parameter, on the average transmission delay. The average transmission delay decreases as the Zipf parameter increases, reflecting a more skewed content popularity distribution. At a Zipf parameter of 0.9, JACPM + SCA achieves a delay of 13.6052 ms, reducing the delay by 25.8% compared to MCP-MS + Fixed-NOMA (18.4117 ms), 19.1% compared to MCP-MS + SCA (16.8120 ms), and 6.6% compared to JACPM + Fixed-NOMA (14.5593 ms). The reduction in delay with an increasing Zipf parameter is attributed to enhanced caching efficiency: a higher Zipf parameter concentrates user requests on a smaller subset of popular files, increasing the likelihood that requested content is cached at FAPs and reducing reliance on fronthaul links, which incur higher delays.

V. CONCLUSION

In this work, a novel low-delay service framework for NOMA-based F-RANs was proposed, leveraging the capabilities of NOMA within a fog-cloud architecture. The framework was designed to address the complex challenges associated with joint resource allocation, including user association, cache placement, and power allocation.
By employing an alternating optimization algorithm that decomposed the original problem into two subproblems, we effectively tackled the non-convex MINLP problem. The proposed framework demonstrated significant improvements over conventional OMA-based F-RAN systems, particularly in reducing the average transmission delay across various scenarios. The insights gained from our simulations underscored the importance of a low-complexity approach to optimizing user association and cache placement, as well as the effectiveness of SCA in power allocation. Moreover, the proposed algorithm achieved an effective balance between performance and computational efficiency, highlighting its potential for practical deployment in next-generation wireless networks.
Yuan Ai received the Ph.D. degree in information and communication engineering from the Beijing University of Posts and Telecommunications (BUPT), Beijing, China, in 2023. From 2023 to 2025, he was a Postdoctoral Researcher with the School of Information Science and Technology, Beijing University of Technology (BJUT), Beijing. From 2024 to 2025, he was also a Visiting Research Associate with the Department of Electrical and Electronic Engineering, The University of Hong Kong (HKU). He is currently a Lecturer with the School of Information Science and Technology, BJUT. His research interests include flexible-antenna technologies, next-generation multiple access (NGMA), AI for B5G/6G, and edge intelligence.

Xidong Mu (Member, IEEE, https://xidongmu.github.io/) received the Ph.D. degree in Information and Communication Engineering from the Beijing University of Posts and Telecommunications, Beijing, China, in 2022. He was a Postdoctoral Researcher with the School of Electronic Engineering and Computer Science, Queen Mary University of London, from 2022 to 2024. Since August 2024, he has been a Lecturer (Assistant Professor) with the Centre for Wireless Innovation (CWI), School of Electronics, Electrical Engineering and Computer Science, Queen's University Belfast, U.K. His research interests include flexible-antenna technologies, reconfigurable surface aided communications, next generation multiple access (NGMA), integrated sensing and communications, and optimization theory. Xidong Mu is a Web of Science Highly Cited Researcher. He received the IEEE ComSoc Outstanding Young Researcher Award for the EMEA region in 2023 and the IEEE ComSoc Wireless Communications Technical Committee (WTC) Outstanding Young Researcher Award in 2025.
He is the recipient of the 2024 IEEE Communications Society Heinrich Hertz Award, the Best Paper Award in ISWCS 2022, the 2022 IEEE SPCC-TC Best Paper Award, and the Best Student Paper Award in IEEE VTC2022-Fall. He serves as the secretary of the IEEE ComSoc Technical Committee on Cognitive Networks (TCCN), the secretary of the IEEE ComSoc NGMA Emerging Technology Initiative, and the URSI UK Early Career Representative (ECR) for Commission C. He also serves as an Editor of IEEE Transactions on Communications, a Guest Editor for IEEE Journal on Selected Areas in Communications, IEEE Transactions on Cognitive Communications and Networking, and IEEE Internet of Things Journal, and the "Mobile and Wireless Networks" symposium co-chair of IEEE GLOBECOM 2025.

Pengbo Si (Senior Member, IEEE) received the B.S. and Ph.D. degrees from the Beijing University of Posts and Telecommunications (BUPT), Beijing, China, in 2004 and 2009, respectively. He joined the Beijing University of Technology (BJUT), Beijing, in 2009, where he is currently a Professor. From 2007 to 2008, he was a Visiting Student with Carleton University, Ottawa, ON, Canada. From 2014 to 2015, he was a Visiting Scholar with the University of Florida, Gainesville, FL, USA. His research interests include blockchain, SDN, resource management, and cognitive radio networks. Dr. Si serves as an Associate Editor for the International Journal on Ad Hoc Networking Systems and an Editorial Board Member for Ad Hoc & Sensor Wireless Networks. He was a Symposium Chair for IEEE GLOBECOM 2019. He has served as a Guest Editor for Advances in Mobile Cloud Computing and a Special Issue of IEEE Transactions on Emerging Topics in Computing. He was also the TPC Co-Chair of IEEE ICCC'13-GMCN, the Program Vice Chair of IEEE GreenCom'13, and a TPC Member for numerous international conferences.
Yuanwei Liu (S'13-M'16-SM'19-F'24, https://www.eee.hku.hk/~yuanwei/) is a tenured full Professor in the Department of Electrical and Electronic Engineering (EEE) at The University of Hong Kong (HKU) and a visiting professor at Queen Mary University of London (QMUL). Prior to that, he was a Senior Lecturer (Associate Professor) (2021-2024) and a Lecturer (Assistant Professor) (2017-2021) at QMUL, London, U.K., and a Postdoctoral Research Fellow (2016-2017) at King's College London (KCL), London, U.K. He received the Ph.D. degree from QMUL in 2016. His research interests include non-orthogonal multiple access, reconfigurable intelligent surfaces, near-field communications, integrated sensing and communications, and machine learning. Yuanwei Liu is a Fellow of the IEEE, a Fellow of AAIA, a Fellow of AIIA, a Web of Science Highly Cited Researcher, an IEEE Communications Society Distinguished Lecturer, an IEEE Vehicular Technology Society Distinguished Lecturer, the rapporteur of the ETSI Industry Specification Group on Reconfigurable Intelligent Surfaces on the work item "Multi-functional Reconfigurable Intelligent Surfaces (RIS): Modelling, Optimisation, and Operation", and the UK representative for URSI Commission C on "Radiocommunication Systems and Signal Processing" (2023-2024). He was listed as one of the 35 Innovators Under 35 China in 2022 by MIT Technology Review. He received the IEEE ComSoc Outstanding Young Researcher Award for EMEA in 2020, the 2020 IEEE Signal Processing and Computing for Communications (SPCC) Technical Committee Early Achievement Award, and the IEEE Communication Theory Technical Committee (CTTC) 2021 Early Achievement Award. He received the IEEE ComSoc Outstanding Nominee for Best Young Professionals Award in 2021.
He is the co-recipient of the 2024 IEEE Communications Society Heinrich Hertz Award, the Best Student Paper Award in IEEE VTC2022-Fall, the Best Paper Award in ISWCS 2022, the 2022 IEEE SPCC-TC Best Paper Award, the 2023 IEEE ICCT Best Paper Award, and the 2023 IEEE ISAP Best Emerging Technologies Paper Award. He serves as the Co-Editor-in-Chief of the IEEE ComSoc TC Newsletter, an Area Editor of IEEE Transactions on Communications and IEEE Communications Letters, and an Editor of IEEE Communications Surveys & Tutorials, IEEE Transactions on Wireless Communications, IEEE Transactions on Vehicular Technology, IEEE Transactions on Network Science and Engineering, and IEEE Transactions on Cognitive Communications and Networking. He serves as the (leading) Guest Editor for Proceedings of the IEEE on Next Generation Multiple Access, IEEE JSAC on Next Generation Multiple Access, IEEE JSTSP on Intelligent Signal Processing and Learning for Next Generation Multiple Access, and IEEE Network on Next Generation Multiple Access for 6G. He served as the Publicity Co-Chair for IEEE VTC 2019-Fall, the Panel Co-Chair for IEEE WCNC 2024, and a Symposium Co-Chair for several flagship conferences such as IEEE GLOBECOM, ICC, and VTC. He serves as the academic Chair for the Next Generation Multiple Access Emerging Technology Initiative, and as vice chair of SPCC and the Technical Committee on Cognitive Networks (TCCN) (2023-2024).