A Digital Predistortion Scheme Exploiting Degrees-of-Freedom for Massive MIMO Systems

The primary source of nonlinear distortion in wireless transmitters is the power amplifier (PA). Conventional digital predistortion (DPD) schemes use high-order polynomials to accurately approximate and compensate for the nonlinearity of the PA. This…

Authors: Miao Yao, Munawwar Sohul, R

Miao Yao, Munawwar Sohul, Randall Nealy, Vuk Marojevic and Jeffrey Reed
Wireless@VT, Dept. of Electrical and Computer Engineering, Virginia Tech
Email: {miaoyao, mmsohul, rnealy, maroje, reedjh}@vt.edu

Abstract: The primary source of nonlinear distortion in wireless transmitters is the power amplifier (PA). Conventional digital predistortion (DPD) schemes use high-order polynomials to accurately approximate and compensate for the nonlinearity of the PA. This is not practical for scaling to tens or hundreds of PAs in massive multiple-input multiple-output (MIMO) systems. There is more than one candidate precoding matrix in a massive MIMO system because of the excess degrees-of-freedom (DoFs), and each precoding matrix requires a different DPD polynomial order to compensate for the PA nonlinearity. This paper proposes a low-order DPD method achieved by exploiting massive DoFs of next-generation front ends. We propose a novel indirect learning structure which adapts the channel and PA distortion iteratively by cascading adaptive zero forcing precoding and DPD. Our solution uses a 3rd order polynomial to achieve the same performance as the conventional DPD using an 11th order polynomial for a 100 × 10 massive MIMO configuration. Experimental results show a 70% reduction in computational complexity, enabling ultra-low latency communications.

I. INTRODUCTION

The power amplifier (PA) is the most power-hungry component in a cellular communications network. The energy efficiency of a base station (BS) significantly depends on the PA efficiency. High input power back-off (IBO) is often required to keep the signal within the amplifier's linear region and thereby avoid the amplitude/amplitude modulation (AM/AM) and amplitude/phase modulation (AM/PM) nonlinear distortion that causes in-band distortion, out-of-band distortion and spectral regrowth.
However, operating with high IBO results in low energy efficiency for communications and for cooling because of the high heat dissipation. Therefore, the tradeoff between power efficiency and linearity of the PA has motivated the development of linearization techniques for energy-efficient cellular communications. Digital predistortion (DPD), which is used for PA nonlinearity compensation, is a popular solution among the linearization schemes because of its flexibility, accuracy and ability to adapt to the time-variant characteristic of the PA. It enables the use of low-cost PAs and operation at higher signal levels for increased energy efficiency without sacrificing linearity. This is especially needed for the BS of 5G new radio (NR) since it will use one or two orders of magnitude more PAs than a 4G LTE BS.

The DPD aims to realize a linear response for the combined DPD-PA block by cascading the PA and its inverse response. DPD schemes can generally be categorized into two groups: polynomial-based schemes [1] and look-up table (LUT)-based schemes [2]. The Volterra series represents one of the most popular polynomial models of a nonlinear system. Moreover, the PA power response is time-variant when dealing with wideband signal inputs such as 10 or 20 MHz LTE. Therefore, the memory-based Volterra model is widely applied to represent the behavior of a PA in wideband scenarios. The literature shows that a wide range of PA nonlinearities can be approximated with considerable precision by a Volterra series-based filter of sufficient order and memory depth in both single-input single-output [1] and multiple-input multiple-output (MIMO) systems [3], [4]. The Volterra series-based method can be a complete representation of an unknown nonlinear dynamic system, but has the drawback of requiring a large number of basis functions [5], [6], [7].
The associated computational complexity limits its practical application, especially for large-scale MIMO (a.k.a. massive MIMO) systems.

MIMO technology has gained popularity in the development and deployment of modern wireless communication systems because of the improved signal-to-noise ratio (SNR), link coverage and spectral efficiency. To accommodate the increasing demand of smart devices in emerging economies as well as heterogeneous networks, proposals for 5G NR wireless communications standards consider massive MIMO with up to hundreds of transmit antennas. This means that up to hundreds of PAs, one per antenna, are needed. The prohibitive cost of highly linear PAs for massive deployment requires the use of low-cost PAs. In massive MIMO systems, a digital predistorter is needed for each low-cost PA. The implementation of digital predistorters in massive MIMO systems is challenging for the following reasons:
• The use of full-order and hence highly precise Volterra series-based DPD is computationally impractical;
• The required number of Volterra series terms grows exponentially with the number of antennas when the notable RF/antenna crosstalk in MIMO systems is considered [3]; and
• The input of the PA is derived from the precoder/beamformer, which combines multiple data streams and hence creates an interplay between the dynamic precoder and the DPD.

Fig. 1: Conventional DPD for massive MIMO systems (left) and proposed low-complexity, precoding-aware DPD solution (right). TX hardware: transmitter hardware; TX CoProc: transmitter coprocessor. The feedback loop includes the PAs, the channel matrix H (derived from the TDD uplink channel estimation), the adaptive precoding matrix P, and the adaptive predistorter.

Therefore, it is critical to extend the operation of the low-cost PAs in massive MIMO systems into their weakly nonlinear region and hence reduce the required order of the Volterra series [8], [9].
This paper proposes a low-complexity and scalable DPD architecture and algorithm that take advantage of the spatial degrees-of-freedom (DoFs) of massive MIMO to facilitate energy-efficient 5G NR BSs.

The rest of the paper is organized as follows. Section II introduces the system and signal model. Section III presents the proposed successive refinement filtering structure and its computational complexity in terms of hardware implementation. Section IV discusses the numerical and experimental results, and Section V concludes the paper.

II. SYSTEM MODEL AND PROBLEM FORMULATION

Consider a downlink massive MIMO system which has one BS equipped with $N_t$ antennas. The number of BS antennas is significantly larger than the number of single-antenna users, i.e. $N_t \gg M_r$. When the BS transmits signals to $M_r$ users in parallel by means of spatial multiplexing, the received signal of user $k$ can be represented as

$$r_k = \mathbf{h}_k \left[ f_1(\mathbf{p}_1 \mathbf{s}), f_2(\mathbf{p}_2 \mathbf{s}), \cdots, f_{N_t}(\mathbf{p}_{N_t} \mathbf{s}) \right]^T + n_k, \tag{1}$$

where $\mathbf{h}_k$ is the $k$th row of the channel matrix and represents the $1 \times N_t$ channel vector from the BS to user $k$, $\mathbf{p}_i$ the $1 \times M_r$ precoding/beam-weight vector of the $i$th antenna, $\mathbf{s}$ the $M_r \times 1$ vector of information symbols, $f_i(\cdot)$ the nonlinear amplification operation of the $i$th PA, $n_k$ the zero-mean complex circular-symmetric additive white Gaussian noise (AWGN) at user $k$, and $T$ the matrix transpose operation. The information symbols at the BS are mapped to the appropriate transmit antennas so that the information received by each user has minimal interference from the signals of the other users. Having more spatial DoFs than zero forcing requires allows the precoder to be selected from a larger signal space while still minimizing the multiuser interference. It also allows the PAs to work near the nonlinear operating region. In this section, we start with an ideal PA model by assuming that the PAs are perfectly linear and ignoring the memory effect to simplify the analysis.
Zero forcing precoding (ZFPC) is employed for multiuser signal transmission because of its simplicity and outstanding performance. The beam-weights of ZFPC need to satisfy the constraint

$$\mathbf{H}\mathbf{P} = \mathbf{I}_{M_r}, \tag{2}$$

where $\mathbf{H} = [\mathbf{h}_1^T, \cdots, \mathbf{h}_{M_r}^T]^T$ denotes the $M_r \times N_t$ channel matrix between the BS and all users, $\mathbf{P} = [\mathbf{p}_1^T, \cdots, \mathbf{p}_{N_t}^T]^T$ the $N_t \times M_r$ precoding matrix and $\mathbf{I}_{M_r}$ the $M_r \times M_r$ identity matrix.

In practice, the PA nonlinearity and the RF coupling between the different paths of the transmitters, which are the two major sources of ZFPC condition violation and transmitter impairment, need to be considered for massive MIMO deployments. Therefore, an extra nonlinear function is needed to preprocess the input signal of the PA and thus linearize the overall cascaded DPD-PA amplified signal. In order to determine this nonlinear function, the behavioral model and inverse behavioral model of the PAs are needed. Suppose that $f_i(x_i)$ is the nonlinear model of the transmitter for the baseband input signal $x_i = \mathbf{p}_i \mathbf{s}$ and $z_i$ is the equivalent complex signal at the output of the $i$th PA. The predistortion function $g_i(x_i)$ then has to satisfy

$$z_i = f_i(g_i(x_i)) = G_0 x_i, \tag{3}$$

where $G_0$ is the equivalent linear gain of the cascaded DPD-PA system.

A Volterra series-based memory polynomial predistorter is applied to compensate for the dynamic nonlinear behavior. This model, known as the parallel Hammerstein model in the literature [3], is a parallelization of a nonlinear function followed by a linear memory. Moreover, apart from the typical odd-order polynomial representation, even-order polynomials are also included to enrich the basis set and improve the modeling accuracy and DPD performance [3].
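As a hedged illustration of condition (3), consider a memoryless third-order PA and a third-order predistorter. The coefficients $a_1$, $a_3$ and $c_3$ are hypothetical symbols introduced only for this example and do not appear in the paper. Expanding the cascade shows which term a low-order predistorter can cancel and what residual it leaves:

```latex
\begin{align*}
f(u) &= a_1 u + a_3 |u|^2 u, \qquad g(x) = x + c_3 |x|^2 x, \\
f(g(x)) &= a_1 x + (a_1 c_3 + a_3)\,|x|^2 x + \mathcal{O}\!\left(|x|^4 x\right), \\
c_3 &= -\frac{a_3}{a_1} \;\Rightarrow\; f(g(x)) = G_0 x + \mathcal{O}\!\left(|x|^4 x\right), \qquad G_0 = a_1 .
\end{align*}
```

Under these assumptions the residual is of fifth order in $|x|$, so it shrinks quickly when the PA input stays in the weakly nonlinear region. This is the regime in which a low-order predistorter can be expected to suffice.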
The discrete baseband-equivalent form of the Volterra series with memory effect consists of a sum of multidimensional convolutions that can be written as [3]

$$y(n) = \sum_{q=0}^{Q} \sum_{k=1}^{K} \omega_{k,q} \, |x(n-q)|^{k-1} x(n-q), \tag{4}$$

where $x(n)$ and $y(n)$ are the input and output complex envelopes, $\omega_{k,q}$ the polynomial coefficient of the filter tap for the $k$th order and $q$th delay, $K$ the nonlinear degree, and $Q$ the memory depth. By defining a new sequence,

$$r_{k,q}(n) = |x(n-q)|^{k-1} x(n-q), \tag{5}$$

we can rewrite (4) as

$$\mathbf{y} = \mathbf{R}\boldsymbol{\omega}, \tag{6}$$

where $\mathbf{y} = [y(n), y(n+1), \cdots, y(n+N-1)]^T$, $\boldsymbol{\omega} = [\omega_{1,0}, \cdots, \omega_{1,1}, \cdots, \omega_{K,Q}]^T$, $\mathbf{R} = [\mathbf{r}_{1,0}, \cdots, \mathbf{r}_{1,1}, \cdots, \mathbf{r}_{K,Q}]$, and $\mathbf{r}_{k,q} = [r_{k,q}(n), r_{k,q}(n+1), \cdots, r_{k,q}(n+N-1)]^T$. Notice that the proposed Volterra-based models are linear in the coefficients $\boldsymbol{\omega}$. We can then estimate the model parameters by minimizing the cost function $\|\mathbf{y} - \mathbf{R}\tilde{\boldsymbol{\omega}}\|^2$ [3]:

$$\boldsymbol{\omega} = \arg\min_{\tilde{\boldsymbol{\omega}}} \|\mathbf{y} - \mathbf{R}\tilde{\boldsymbol{\omega}}\|^2. \tag{7}$$

In this paper, we apply the least mean squares (LMS) adaptive algorithm for DPD, where the DPD coefficients are updated every sample. The LMS algorithm is based on the minimum mean square error rule with steepest descent, which can be described as [3]

$$\mathbf{e}(m) = \mathbf{y}(m) - \mathbf{R}(m)\tilde{\boldsymbol{\omega}}(m-1), \tag{8a}$$

$$\tilde{\boldsymbol{\omega}}(m) = \tilde{\boldsymbol{\omega}}(m-1) + \mu \, \mathbf{e}^H(m)\mathbf{R}(m). \tag{8b}$$

The weights of the LMS filter, $\boldsymbol{\omega}(m) = \mathrm{LMS}(x(m), y(m))$, are a function of the input $x(m)$ and output $y(m)$ of the LMS algorithm, initialized as $\boldsymbol{\omega}(0) = \boldsymbol{\omega}_0$. The expression $\mathbf{e}^H(m)$ is the Hermitian transpose of the error vector and $\mu$ the step size.

III. PROPOSED ARCHITECTURE AND ALGORITHM

The predistorter is trained using the system identification architecture, where the PA characteristics are identified in the feedback path [3]. A conventional indirect learning structure applied to a massive MIMO system is shown in Fig. 1 (left) [10], [11].
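The memory polynomial (4) and the sample-by-sample LMS update (8a)-(8b) of Section II can be sketched in a few lines of plain Python. This is a minimal illustration, not the authors' implementation: the function names are hypothetical, the coefficient ordering (delay-major) is an arbitrary choice, and the update uses the standard complex LMS form w ← w + μ e r*, which is assumed to correspond to (8b) up to the transpose/conjugation convention.

```python
def basis_row(x, n, K, Q):
    """Basis terms r_{k,q}(n) = |x(n-q)|^(k-1) x(n-q) of Eq. (5),
    ordered delay-major: (k=1..K for q=0), (k=1..K for q=1), ..."""
    row = []
    for q in range(Q + 1):
        xq = x[n - q] if n - q >= 0 else 0j  # zero initial conditions
        for k in range(1, K + 1):
            row.append(abs(xq) ** (k - 1) * xq)
    return row


def model_output(x, n, w, K, Q):
    """Memory polynomial output y(n) of Eq. (4) for coefficient list w."""
    return sum(r * wi for r, wi in zip(basis_row(x, n, K, Q), w))


def lms_fit(x, y, K, Q, mu, passes=1):
    """Estimate the coefficients with one LMS update per sample.
    Assumption: standard complex LMS, w <- w + mu * e * conj(r)."""
    w = [0j] * (K * (Q + 1))
    for _ in range(passes):
        for n in range(len(x)):
            r = basis_row(x, n, K, Q)
            e = y[n] - sum(ri * wi for ri, wi in zip(r, w))  # error, cf. (8a)
            w = [wi + mu * e * ri.conjugate() for wi, ri in zip(w, r)]
    return w
```

In an indirect learning setup the same routine would presumably be called with the roles swapped, i.e. with the (gain-normalized) PA output as `x` and the PA input as `y`, so that the fitted coefficients describe the inverse model.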
The conventional structure duplicates the DPD-PA structure for each RF path, and the LMS filters associated with the PAs are updated independently. This architecture fails to consider the influence of ZFPC on DPD.

Since the number of BS antennas in our model is significantly larger than the number of users, the precoding matrix is underdetermined. This implies that there is more than one solution for the precoding matrix satisfying the constraint (2). The number of solutions depends on the rank of the channel matrix. The objective of our proposed scheme is to find the appropriate precoding matrix which enables low-complexity DPD with low-order basis functions. Therefore, instead of the traditional one-stage indirect learning structure of Fig. 1 (left), we propose a novel learning structure which adapts the channel and PA distortion iteratively by cascading adaptive ZFPC and DPD. We modify the conventional indirect learning architecture by incorporating the channel matrix into the feedback path. To be more specific, the feedback loop in Fig. 1 (right) includes the channel matrix, adaptive ZFPC, adaptive DPD and PAs.

This new indirect learning architecture lends itself to efficient, yet flexible implementation for 5G NR. The forward paths, which are always functional, can be implemented in hardware (e.g. ASIC or FPGA), exploiting parallel processing. The feedback paths, which need to be functional during the weight updating phase only, can be implemented by a general-purpose processor-based coprocessor. It is critical to reduce the complexity of the forward paths, especially the number of complex multipliers, to enable ultra-low latency communications in 5G NR. The proposed scheme exploits the enormous DoFs in a massive MIMO system to reduce the number of basis functions and to mitigate the effects of crosstalk.
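The non-uniqueness of the precoder under constraint (2) can be made concrete with a small numerical sketch. The code below is an illustration under stated assumptions, not the paper's adaptive method (which finds the precoder with LMS): it uses the closed-form minimum-norm solution $P_0 = H^H (H H^H)^{-1}$, assumes a full-row-rank channel with $M_r = 2$ users so that a 2×2 inverse suffices, and builds a second valid precoder as $P_0 + (I - P_0 H) W$ for an arbitrary matrix $W$. All function names and the example matrices are hypothetical.

```python
def matmul(A, B):
    """Product of two matrices stored as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]


def hermitian(A):
    """Conjugate transpose of a matrix."""
    return [[A[i][j].conjugate() for i in range(len(A))]
            for j in range(len(A[0]))]


def inverse_2x2(A):
    """Closed-form inverse of a 2x2 matrix (assumes nonzero determinant)."""
    det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    return [[A[1][1] / det, -A[0][1] / det],
            [-A[1][0] / det, A[0][0] / det]]


def zf_min_norm(H):
    """Minimum-norm ZF precoder P0 = H^H (H H^H)^-1 for a 2 x N_t channel."""
    Hh = hermitian(H)
    return matmul(Hh, inverse_2x2(matmul(H, Hh)))


def zf_alternative(H, W):
    """Another ZF precoder, P0 + (I - P0 H) W; the added term lies in the
    null space of H, so H P is unchanged."""
    P0 = zf_min_norm(H)
    n_t = len(H[0])
    P0H = matmul(P0, H)
    proj = [[(1 if i == j else 0) - P0H[i][j] for j in range(n_t)]
            for i in range(n_t)]
    PW = matmul(proj, W)
    return [[P0[i][j] + PW[i][j] for j in range(len(W[0]))]
            for i in range(n_t)]


def zf_residual(H, P):
    """Largest entry of |H P - I|, i.e. the violation of constraint (2)."""
    HP = matmul(H, P)
    return max(abs(HP[i][j] - (1 if i == j else 0))
               for i in range(len(HP)) for j in range(len(HP)))


def frobenius_norm(A):
    return sum(abs(v) ** 2 for row in A for v in row) ** 0.5


# Toy 2-user, 3-antenna example (values chosen only for illustration).
H = [[1 + 0j, 1j, 0j], [0j, 1 + 0j, 1 + 0j]]
W = [[1 + 0j, 0j], [0j, 0j], [0j, 1 + 0j]]
P0 = zf_min_norm(H)
P1 = zf_alternative(H, W)
```

Both `P0` and `P1` satisfy (2) up to rounding, yet they drive the antennas (and hence the PAs) with different signals. With 100 antennas and 10 users the null space is far larger, which is the freedom the proposed scheme exploits.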
A cascaded FIR filtering scheme, named successive refinement filtering, is proposed because it provides fast convergence of the filter weights to their steady-state values [12], [13]. Algorithm 1 depicts the flow of successive refinement filtering.

Algorithm 1 Successive Refinement Filtering
Input: channel matrix $\mathbf{H}$, modulated symbol $\mathbf{s}$
Output: predistortion matrix $\mathbf{G}$, precoding matrix $\mathbf{P}$
Initialization: $\mathbf{P} = \mathbf{P}_0$
1: while error ≥ threshold do
2:   $\mathbf{G}(m) = \mathrm{LMS}(z'(m), y(m))$;
3:   DPD update: DPD ← $\mathbf{G}(m)$;
4:   $\mathbf{P}(m) = \mathrm{LMS}(r(m), x(m))$;
5:   ZFPC update: ZFPC ← $\mathbf{P}(m)$;
6:   $m \leftarrow m + 1$;
7: end while
8: return $\mathbf{G}$ and $\mathbf{P}$

As shown in Fig. 1 (right), $y(m)$ represents the input signal of a PA for the $m$th iteration, $r(m)$ the input of the ZFPC $\mathbf{P}$, $z'(m)$ the input of the adaptive DPD, and $x(m)$ the output of the precoding in the forward path of the proposed structure.

In order to give a quantitative measure of the complexity of the transmitter hardware, we evaluate the floating point operations (FLOPs) [14] required for the architecture of Fig. 1 (right). The complex filtering operation requires six FLOPs per filter tap: four real multiplications and two summations. For a given delay tap $q$, the DPD output for a $(K+1)$-order Volterra series can be represented as

$$y_q^{K+1}(n) = \sum_{k=1}^{K+1} \omega_{k,q} \, |x(n-q)|^{k-1} x(n-q) = \omega_{1,q} \, x(n-q) + |x(n-q)| \underbrace{\sum_{k=1}^{K} \omega_{k+1,q} \, |x(n-q)|^{k-1} x(n-q)}_{\tilde{y}_q^{K}(n)}. \tag{9}$$

There is one complex-real multiplication for $|x(n-q)| \, \tilde{y}_q^{K}(n)$, one complex multiplication for $\omega_{1,q} \, x(n-q)$ (which can be pre-calculated), and one complex summation (see footnote 1). Therefore, the total savings of the proposed DPD compared with the conventional DPD is $4(K_C - K_P) Q N_t$ FLOPs, where $K_C$ denotes the order of the conventional independent DPD scheme, $K_P$ the order of the proposed scheme, $Q$ the memory depth, and $N_t$ the total number of transmit antennas in the downlink.
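The savings expression $4(K_C - K_P) Q N_t$ is easy to evaluate for the configuration used later in the paper ($N_t = 100$, $Q = 5$). The helper below is a small sketch with a hypothetical function name. The factor 4 presumably counts the two real multiplications of the complex-real product and the two real additions of the complex summation saved per polynomial order, delay tap and antenna.

```python
def dpd_flop_savings(k_conv, k_prop, memory_depth, n_tx):
    """FLOPs saved by lowering the DPD order from k_conv to k_prop,
    following the expression 4 (K_C - K_P) Q N_t from the text."""
    return 4 * (k_conv - k_prop) * memory_depth * n_tx


# Configuration of Table I: N_t = 100 antennas, memory depth Q = 5.
saved_vs_11th = dpd_flop_savings(11, 3, 5, 100)  # 16000 FLOPs
saved_vs_9th = dpd_flop_savings(9, 3, 5, 100)    # 12000 FLOPs
```

These values agree with the differences between the complexity figures reported in Table I: 2.3 × 10^4 − 7 × 10^3 = 1.6 × 10^4 FLOPs for the 11th order case and 1.9 × 10^4 − 7 × 10^3 = 1.2 × 10^4 FLOPs for the 9th order case.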
Notice that $N_t$ is usually very large for massive MIMO configurations and hence considerable FLOPs are saved.

IV. SIMULATION AND EXPERIMENTAL RESULTS

A. Simulation Results

Simulations are performed with the polynomial PA model containing memory effects. A 100 × 10 massive MIMO configuration with 100 transmit antenna elements is used to evaluate the proposed DPD algorithm. The simulation setup consists of RF sources that represent the transmitting paths and accurately capture the effects of crosstalk on the performance of the massive MIMO transmitter. The crosstalk effect is simulated by coupling the signal in each RF path to its adjacent paths with 20 dB attenuation. The PA input and output are assumed to obey the Saleh model with parameters $\alpha_a = 2$, $\beta_a = 2.2$, $\alpha_\phi = 2$, and $\beta_\phi = 1$ [15]. The center frequency is set to 3.5 GHz and the baseband bandwidth to 10 MHz.

Fig. 2 shows the PSD output of the PA for the conventional polynomial predistorter with polynomial orders $K = 3$ and $K = 9$ along with the proposed polynomial predistorter with polynomial order $K = 3$. The highest trace (purple) shows the spectral regrowth of the PA output without predistortion and is seen to be approximately 50 dB below the in-band power level. The black trace shows the output spectrum with the conventional memory polynomial model with nonlinear degree $K = 3$ and memory depth $Q = 5$. It shows an out-of-band emission of −70 dB with respect to the in-band signal power level. The green trace is for the conventional memory polynomial model with nonlinear degree $K = 9$ and memory depth $Q = 5$, which reduces the regrowth to −90 dB relative to the in-band power level. The blue trace shows the output spectrum for the proposed DPD scheme described in Algorithm 1 with nonlinear degree $K = 3$ and memory depth $Q = 5$.
We observe that the proposed algorithm outperforms the conventional memory polynomial solution ($K = 9$, $Q = 5$) and reduces the out-of-band emission by another 10 dB.

Footnote 1: The computational complexities of $\tilde{y}_q^{K}(n)$ and $y_q^{K}(n)$ are the same.

Fig. 2: Predistortion linearization performance in terms of spectral regrowth suppression. The PA output PSD is shown for different cases with crosstalk set to −20 dB for adjacent antennas. Also shown is the original signal without crosstalk.

B. Hybrid Experimental and Simulation Results

It is clear that the hardware implementation of the entire massive MIMO system requires a large number of PAs (and digital converters, mixers, etc.) since each antenna element needs an independent PA. In this subsection, we therefore introduce a hybrid experimental and simulation setup which exploits both the real and theoretical PA models to examine the proposed DPD performance with limited hardware resources (Fig. 3).

Fig. 3: Hybrid experimental and simulation setup (one Mini-Circuits ZHL-42 PA (33 dB gain at 3.5 GHz and 15 V) and two USRPs (N210) are employed).

Our testbed features a real PA for one of the 100 transmit antenna elements, whereas the others are emulated in the host PC using a theoretical PA model. Synchronization is necessary between the real path and the virtual paths. The proposed hybrid experimental and simulation setup consists of a Mini-Circuits ZHL-42 PA (33 dB gain at 3.5 GHz and 15 V) and two Ettus USRPs (model N210) which are equipped with SBX daughterboards that support radio frequencies from 400 MHz up to 4.4 GHz. A real-time spectrum analyzer (RSA3408A) is used to show the output spectrum of the ZHL-42 in the experiment. MATLAB Simulink is connected to the USRPs to run Algorithm 1 and emulates the Saleh model for the other PAs. The USRPs are instantiated in Simulink. As shown in Fig. 3, one transmit signal is sent to a USRP via Ethernet and from the USRP to the ZHL-42 PA after the 3.5 GHz upconversion. The amplified signal is attenuated and sent to another USRP, and its output is sent to the feedback path in the host PC via an Ethernet-PCIe card after the downconversion. The rest of the transmit signals are processed by the Saleh model and go to the internal feedback path directly.

The normalized mean square error (NMSE) between the output of the ZHL-42 and the undistorted input signal, for the conventional and the proposed DPD, is calculated and presented in Table I. The table also shows the complexity comparison in terms of the number of FLOPs.

TABLE I: Performance comparison of no DPD, conventional DPD and the proposed DPD ($N_t = 100$ and $Q = 5$).

Scheme                        NMSE        Complexity
ZHL-42 only (no DPD)          -18.41 dB   N/A
Conventional 3rd order DPD    -27.38 dB   7 × 10^3 FLOPs
Conventional 9th order DPD    -40.17 dB   1.9 × 10^4 FLOPs
Conventional 11th order DPD   -45.30 dB   2.3 × 10^4 FLOPs
Proposed 3rd order DPD        -45.89 dB   7 × 10^3 FLOPs

The proposed 3rd order DPD scheme achieves gains of 18.5, 5.7 and 0.6 dB over the conventional 3rd order, 9th order and 11th order DPD, respectively. Note that the proposed 3rd order DPD scheme needs only 30% of the computational power needed by the conventional 11th order DPD scheme to achieve equivalent NMSE performance.

V. CONCLUSION

This paper has introduced a low-complexity predistorter for nonlinear PAs with memory. By exploiting the excess DoFs, our approach outperforms the conventional approach in terms of accuracy and complexity. Both simulations and hardware experiments confirm that the proposed DPD model can linearize PAs better than the conventional approach. The proposed solution achieves high performance at low complexity, making it practical for 5G NR massive MIMO system deployments.

REFERENCES

[1] D. Mirri, G. Luculano, F. Filicori, G. Pasini, G. Vannini, and G. Gabriella, "A modified Volterra series approach for nonlinear dynamic systems modeling," IEEE Transactions on Circuits and Systems I: Fundamental Theory and Applications, vol. 49, no. 8, pp. 1118–1128, 2002.
[2] H. Zhi-yong, G. Jian-hua, G. Shu-Jian, and W. Gang, "An improved look-up table predistortion technique for HPA with memory effects in OFDM systems," IEEE Transactions on Broadcasting, vol. 52, no. 1, pp. 87–91, 2006.
[3] P. M. Suryasarman and A. Springer, "A comparative analysis of adaptive digital predistortion algorithms for multiple antenna transmitters," IEEE Transactions on Circuits and Systems I: Regular Papers, vol. 62, no. 5, pp. 1412–1420, 2015.
[4] A. Abdelhafiz, L. Behjat, F. M. Ghannouchi, M. Helaoui, and O. Hammi, "A high-performance complexity reduced behavioral model and digital predistorter for MIMO systems with crosstalk," IEEE Transactions on Communications, vol. 64, no. 5, pp. 1996–2004, 2016.
[5] C. Yu, L. Guan, E. Zhu, and A. Zhu, "Band-limited Volterra series-based digital predistortion for wideband RF power amplifiers," IEEE Transactions on Microwave Theory and Techniques, vol. 60, no. 12, pp. 4198–4208, 2012.
[6] Y. Ma, Y. Yamao, Y. Akaiwa, and K. Ishibashi, "Wideband digital predistortion using spectral extrapolation of band-limited feedback signal," IEEE Transactions on Circuits and Systems I: Regular Papers, vol. 61, no. 7, pp. 2088–2097, 2014.
[7] S. Bensmida, O. Hammi, A. Kwan, M. S. Sharawi, K. A. Morris, and F. M. Ghannouchi, "Extending the characterization bandwidth of dynamic nonlinear transmitters with application to digital predistortion," IEEE Transactions on Microwave Theory and Techniques, vol. 64, no. 8, pp. 2640–2651, 2016.
[8] J. Peng, S. He, B. Wang, Z. Dai, and J. Pang, "Digital predistortion for power amplifier based on sparse Bayesian learning," IEEE Transactions on Circuits and Systems II: Express Briefs, vol. 63, no. 9, pp. 828–832, 2016.
[9] J. Wood, "System-level design considerations for digital pre-distortion of wireless base station transmitters," IEEE Transactions on Microwave Theory and Techniques, vol. 65, no. 5, pp. 1880–1890, 2017.
[10] Y. Liu, W. Pan, S. Shao, and Y. Tang, "A general digital predistortion architecture using constrained feedback bandwidth for wideband power amplifiers," IEEE Transactions on Microwave Theory and Techniques, vol. 63, no. 5, pp. 1544–1555, 2015.
[11] J. Zhai, L. Zhang, J. Zhou, X.-W. Zhu, and W. Hong, "A nonlinear filter-based Volterra model with low complexity for wideband power amplifiers," IEEE Microwave and Wireless Components Letters, vol. 24, no. 3, pp. 203–205, 2014.
[12] P. Prandoni and M. Vetterli, "An FIR cascade structure for adaptive linear prediction," IEEE Transactions on Signal Processing, vol. 46, no. 9, pp. 2566–2571, 1998.
[13] R. Yu and C. C. Ko, "Lossless compression of digital audio using cascaded RLS-LMS prediction," IEEE Transactions on Speech and Audio Processing, vol. 11, no. 6, pp. 532–537, 2003.
[14] S. Afsardoost, T. Eriksson, and C. Fager, "Digital predistortion using a vector-switched model," IEEE Transactions on Microwave Theory and Techniques, vol. 60, no. 4, pp. 1166–1174, 2012.
[15] M. O'Droma, S. Meza, and Y. Lei, "New modified Saleh models for memoryless nonlinear power amplifier behavioural modelling," IEEE Communications Letters, vol. 13, no. 6, 2009.
