Capacity of Sparse Wideband Channels with Partial Channel Feedback
This paper studies the ergodic capacity of wideband multipath channels with limited feedback. Our work builds on recent results that have established the possibility of significant capacity gains in the wideband/low-SNR regime when there is perfect c…
Authors: Gautham Hariharan, Vasanthan Raghavan, Akbar M. Sayeed
Abstract

This paper studies the ergodic capacity of wideband multipath channels with limited feedback. Our work builds on recent results that have established the possibility of significant capacity gains in the wideband/low-SNR regime when there is perfect channel state information (CSI) at the transmitter. Furthermore, the perfect CSI benchmark gain can be obtained with the feedback of just one bit per channel coefficient. However, the input signals used in these methods are peaky, that is, they have a large peak-to-average power ratio. Signal peakiness is related to channel coherence, and many recent measurement campaigns show that, in contrast to previous assumptions, wideband channels exhibit a sparse multipath structure that naturally leads to coherence in time and frequency. In this work, we first show that even an instantaneous power constraint is sufficient to achieve the benchmark gain when perfect CSI is available at the receiver. In the more realistic non-coherent setting, we study the performance of a training-based signaling scheme. We show that multipath sparsity can be leveraged to achieve the benchmark gain under both average as well as instantaneous power constraints as long as the channel coherence scales at a sufficiently fast rate with signal space dimensions. We also present rules of thumb on choosing signaling parameters as a function of the channel parameters so that the full benefits of sparsity can be realized.

This work was supported in part by the NSF through grant #CCF-0431088. G. Hariharan is with Qualcomm Inc., San Diego, CA 92121, USA (gharihar@qualcomm.com). V. Raghavan is with the Coordinated Science Laboratory and the Department of Electrical and Computer Engineering, University of Illinois at Urbana-Champaign, Urbana, IL 61801, USA (vasanthan_raghavan@ieee.org). A. M. Sayeed is with the Department of Electrical and Computer Engineering, University of Wisconsin-Madison, Madison, WI 53706, USA (akbar@engr.wisc.edu).

I. INTRODUCTION

Recent research on the fundamental limits of wideband/low-SNR communications has focused on the non-coherent regime where the impact of channel state information (CSI) on the achievable rates is critical. From a capacity perspective, spreading signals has been shown to be sub-optimal [1] and peaky or flash signaling schemes are necessary [2], [3] to achieve the non-coherent wideband capacity. Recent work by Zheng et al. [4] has emphasized the crucial role of channel coherence in the low-SNR regime and the importance of implicit/explicit channel learning schemes that can bridge the gap between the coherent and the non-coherent extremes. However, these results have been derived based on an implicit assumption of rich multipath where the number of independent degrees of freedom (DoF) in the delay domain scales linearly with bandwidth.

Recent measurement campaigns in the case of ultrawideband systems show that the number of independent DoF does not scale linearly with bandwidth [5]–[11]. In fact, the physical layer channel model proposed by the IEEE 802.15 working group for ultrawideband communication systems exhibits sparsity in the delay domain (see, for example, the measurement data in [12, p. 15]). Motivated by these works, we introduced the notion of multipath sparsity in [13] as a source of channel coherence and proposed a channel modeling framework to capture the impact of sparsity in delay and Doppler on achievable rates. The analysis in [13] shows that multipath sparsity can help in reducing/eliminating the need for peaky signaling in achieving wideband capacity.
In this work, we build on the results in [13] and study the impact of channel state feedback on achievable rates in sparse wideband channels. Although earlier works (for example, [14]–[16] and references therein) have explored capacity with transmitter CSI, it is only recently [2], [17], [18] that the impact of feedback in the low-SNR, non-coherent regime has received attention. In particular, in the low-SNR regime, it is shown in [2], [17] that with an average power constraint, the capacity gain with perfect transmitter and receiver CSI over the case when there is only perfect receiver CSI is $\log(1/\mathrm{SNR})$. More interestingly, it is shown that a limited feedback scheme where only one bit per independent DoF is available at the transmitter can also achieve a gain of $\log(1/\mathrm{SNR})$ [2], [17].

However, for both the optimal waterfilling scheme [14], [19] as well as the one-bit limited feedback scheme, the input signal tends to be peaky (or bursty) in time, leading to a high peak-to-average power ratio and difficulties from an implementation standpoint. The need to reliably estimate the channel at the receiver leads to the use of peaky training followed by communication in [17]. Similar results have also been reported in [18] where the authors study the optimization of the training length, average training power and spreading bandwidth in a wideband setting.

The focus of this work is on leveraging multipath sparsity to overcome or reduce the need for peaky signaling schemes. We work towards this goal by providing a concise description of the sparse channel model [13] in Sec. II. We then study the performance in the case where the receiver has perfect CSI and the transmitter has one bit (per independent DoF) in Sec. III.
In contrast to [2], [17], [18], which study the performance only under an average (or long-term) power constraint, we also consider an instantaneous (or short-term) power constraint. We restrict our attention to causal signaling schemes that can be realized in practice. We show that an optimal threshold of the form $h_t = \lambda \log(1/\mathrm{SNR})$ for any $\lambda \in (0, 1)$ provides a measure of achievable rate (see footnote 1) which behaves as $(1 + h_t)\,\mathrm{SNR}$ in the wideband limit. Thus when $\lambda$ approaches 1, we achieve the perfect transmitter CSI capacity, which is the benchmark for all limited feedback schemes.

We derive a sufficient condition under which this benchmark can be approached even with an instantaneous power constraint. A key parameter that determines this condition is $E[D_{\mathrm{eff}}]$, the average number of active independent channel dimensions, that is, the number of independent channel coefficients that exceed the threshold in the power allocation scheme. In particular, with an instantaneous power constraint, the benchmark capacity gain is achieved when $E[D_{\mathrm{eff}}] - h_t \to \infty$ as $\mathrm{SNR} \to 0$. We discuss the feasibility of the above condition when the channel is rich as well as sparse.

In Sec. IV, the focus is on the case where the receiver has no CSI a priori and a training-based signaling scheme is employed. Along the same lines as in [17], [18], we study the rates achievable with this scheme, albeit for sparse channels. With an average power constraint, it is shown that as long as the channel coherence dimension $N_c$ scales with $\mathrm{SNR}$ as $N_c = 1/\mathrm{SNR}^{\mu}$ for some $\mu > 1$, the rate achievable with the training scheme converges to the capacity with perfect transmitter CSI, the performance benchmark, in the wideband limit. Furthermore, this condition is achievable only when the channel is sparse, and we provide guidelines on choosing the signal space parameters (signaling/packet duration, bandwidth and transmit power) such that $\mu > 1$ is realized.
The critical role of channel sparsity is further revealed when we impose an instantaneous power constraint. In contrast to peaky signaling that violates the finiteness constraint on the peak-to-average power, channel sparsity is necessary to realize the conditions required to approach the performance gain with an instantaneous power constraint: $\mu > 1$ and $E[D_{\mathrm{eff}}] - h_t \to \infty$. We summarize the paper in Sec. V by highlighting our contributions and placing them in the context of [2], [17], [18].

Footnote 1: All logarithms are assumed to be base $e$ and the units for all rate quantities are assumed to be nats per channel use.

II. SYSTEM MODEL

In this section, we elucidate the model developed in [13] for sparse multipath channels. Our results are based on an orthogonal short-time Fourier (STF) signaling framework [20], [21] that naturally relates multipath sparsity in delay-Doppler to coherence in time and frequency.

A. Sparse Multipath Channel Modeling

A discrete, physical multipath channel can be modeled as

$$y(t) = \int_{0}^{T_m} \int_{-W_d/2}^{W_d/2} h(\tau, \nu)\, x(t - \tau)\, e^{j 2\pi \nu t}\, d\nu\, d\tau + w(t) \tag{1}$$

$$h(\tau, \nu) = \sum_{n} \beta_n\, \delta(\tau - \tau_n)\, \delta(\nu - \nu_n), \qquad y(t) = \sum_{n} \beta_n\, x(t - \tau_n)\, e^{j 2\pi \nu_n t} + w(t) \tag{2}$$

where $h(\tau, \nu)$ is the delay-Doppler spreading function of the channel, and $\beta_n$, $\tau_n \in [0, T_m]$ and $\nu_n \in [-W_d/2, W_d/2]$ denote the complex path gain, delay and Doppler shift associated with the $n$-th path. $T_m$ and $W_d$ denote the delay and the Doppler spreads, respectively. The quantities $x(t)$, $y(t)$ and $w(t)$ denote the transmitted, received and additive white Gaussian noise waveforms, respectively. Throughout this paper, we assume an underspread channel where $T_m W_d \ll 1$.
We use a virtual representation [22], [23] of the physical model in (2) that captures the channel characteristics in terms of resolvable paths and greatly facilitates system analysis from a communication-theoretic perspective. The virtual representation uniformly samples the multipath in delay and Doppler at a resolution commensurate with signaling bandwidth $W$ and signaling duration $T$, respectively. Thus, we have

$$y(t) = \sum_{\ell=0}^{L} \sum_{m=-M}^{M} h_{\ell,m}\, x(t - \ell/W)\, e^{j 2\pi m t/T} + w(t) \tag{3}$$

$$h_{\ell,m} \approx \sum_{n \in S_{\tau,\ell} \cap S_{\nu,m}} \beta_n \tag{4}$$

where $L = \lceil T_m W \rceil$ and $M = \lceil T W_d / 2 \rceil$. The sampled representation (3) is linear and is characterized by the virtual delay-Doppler channel coefficients $\{h_{\ell,m}\}$ in (4). Each $h_{\ell,m}$ consists of the sum of gains of all paths whose delay and Doppler shifts lie within the $(\ell, m)$-th delay-Doppler resolution bin $S_{\tau,\ell} \cap S_{\nu,m}$ of size $\Delta\tau \times \Delta\nu$, with $\Delta\tau = 1/W$ and $\Delta\nu = 1/T$, as illustrated in Fig. 1(a). Distinct $h_{\ell,m}$'s correspond to approximately disjoint subsets of paths and are hence approximately statistically independent. In this work, we assume that the channel coefficients $\{h_{\ell,m}\}$ are perfectly independent. We also assume Rayleigh fading (see footnote 2), in which $\{h_{\ell,m}\}$ are zero-mean Gaussian random variables.

Let $D$ denote the number of non-zero channel coefficients. It reflects the (dominant) statistically independent DoF in the channel and also signifies the delay-Doppler diversity afforded by the channel [22]. We decompose $D$ as $D = D_T D_W$ where $D_T$ denotes the Doppler/time diversity and $D_W$ denotes the frequency/delay diversity. The channel DoF or delay-Doppler diversity is bounded as

$$D = D_T D_W \le D_{\max} \triangleq D_{T,\max}\, D_{W,\max} \tag{5}$$

$$D_{T,\max} = \lceil T W_d \rceil, \qquad D_{W,\max} = \lceil T_m W \rceil \tag{6}$$

where $D_{T,\max}$ denotes the maximum Doppler diversity and $D_{W,\max}$ denotes the maximum delay diversity.
Note that $D_{T,\max}$ and $D_{W,\max}$ increase linearly with $T$ and $W$, respectively, and thus represent a rich multipath environment in which each resolution bin in Fig. 1(a) corresponds to a dominant channel coefficient.

However, there is growing experimental evidence [5]–[11] that the dominant channel coefficients get sparser in delay as the bandwidth increases. Furthermore, we are also interested in modeling scenarios with Doppler effects due to motion. In such cases, as we consider large bandwidths and/or long signaling durations, the resolution of paths in both delay and Doppler domains gets finer, leading to the scenario in Fig. 1(a) where the delay-Doppler resolution bins are sparsely populated with paths, i.e., $D \ll D_{\max}$. In this work, we model multipath sparsity by a sub-linear scaling of $D_T$ and $D_W$ with $T$ and $W$, respectively:

$$D_W \sim g_1(W), \qquad D_T \sim g_2(T) \tag{7}$$

where $g_1$ and $g_2$ are arbitrary sub-linear functions. As a concrete example, we will focus on a power-law scaling for the rest of this paper:

$$D_T = (T W_d)^{\delta_1}, \qquad D_W = (W T_m)^{\delta_2} \tag{8}$$

for some $\delta_1, \delta_2 \in (0, 1)$. But the results derived here hold true for any general sub-linear scaling law. Note that (6) and (7) imply that in sparse multipath, the total number of delay-Doppler DoF, $D = D_T D_W$, scales sub-linearly with the signal space dimension $N = TW$.

Footnote 2: Note that the Rayleigh fading assumption is used only for mathematical tractability. The general theme of results will continue to hold as long as the fading distributions have an exponential tail. See [17] for details and [13] for a discussion on modeling issues.

Remark 1: With perfect CSI at the receiver, the parameter $D$ denotes the delay-Doppler diversity afforded by the channel, whereas with no CSI, it reflects the level of channel uncertainty, that is, the number of channel parameters that need to be learned at the receiver for coherent processing.
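As an illustrative sketch of the power-law model in (8), the following Python snippet evaluates $D_T$, $D_W$ and $D$ against the rich-multipath bound $D_{\max}$ of (5)-(6) as the bandwidth grows. The function name `sparse_dof` and all numerical values (delay spread, Doppler spread, duration, exponents) are assumptions chosen only for illustration and do not come from measurement data.

```python
import math

def sparse_dof(T, W, T_m, W_d, delta1, delta2):
    """Return (D_T, D_W, D, D_max) under the power-law scaling in (8)."""
    D_T = (T * W_d) ** delta1                          # Doppler/time diversity
    D_W = (W * T_m) ** delta2                          # delay/frequency diversity
    D_max = math.ceil(T * W_d) * math.ceil(T_m * W)    # rich-multipath bound, (5)-(6)
    return D_T, D_W, D_T * D_W, D_max

# Hypothetical channel: 1 us delay spread, 100 Hz Doppler spread, 0.1 s duration.
T_m, W_d, T = 1e-6, 100.0, 0.1
for W in (1e7, 1e8, 1e9):
    D_T, D_W, D, D_max = sparse_dof(T, W, T_m, W_d, 0.5, 0.5)
    # D grows with W, but the fraction D / N of occupied dimensions shrinks.
    print(f"W={W:.0e} Hz  D={D:.1f}  D_max={D_max}  D/N={D / (T * W):.2e}")
```

With these assumed values, $D$ stays below $D_{\max}$ and the ratio $D/N$ decreases as $W$ increases, which is the sub-linear behavior the sparse model is meant to capture.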
Fig. 1. (a) Delay-Doppler sampling commensurate with signaling bandwidth and duration. (b) Time-frequency coherence subspaces in STF signaling.

B. Orthogonal Short-Time Fourier Signaling

We consider signaling using an orthonormal short-time Fourier (STF) basis [20], [21] that is a natural generalization (see footnote 3) of orthogonal frequency-division multiplexing (OFDM) for time-varying channels. An orthogonal STF basis $\{\phi_{\ell m}(t)\}$ for the signal space is generated from a fixed prototype waveform $g(t)$ via time and frequency shifts: $\phi_{\ell m}(t) = g(t - \ell T_o)\, e^{j 2\pi m W_o t}$, where $T_o W_o = 1$, $\ell = 0, \cdots, N_T - 1$, $m = 0, \cdots, N_W - 1$ and $N = N_T N_W = TW$ with $N_T = T/T_o$, $N_W = W/W_o$. The transmitted signal can be represented as

$$x(t) = \sum_{\ell=0}^{N_T - 1} \sum_{m=0}^{N_W - 1} x_{\ell m}\, \phi_{\ell m}(t), \qquad 0 \le t \le T \tag{9}$$

where $\{x_{\ell m}\}$ denote the $N$ transmitted symbols that are modulated onto the STF basis waveforms. The received signal is projected onto the STF basis waveforms to yield

$$y_{\ell m} = \langle y, \phi_{\ell m} \rangle = \sum_{\ell', m'} h_{\ell m, \ell' m'}\, x_{\ell' m'} + w_{\ell m}. \tag{10}$$

We can represent the system using an $N$-dimensional matrix equation [20], [21]

$$\mathbf{y} = \mathbf{H}\mathbf{x} + \mathbf{w} \tag{11}$$

where $\mathbf{w}$ is the additive noise vector whose entries are i.i.d. $\mathcal{CN}(0, 1)$. The $N \times N$ matrix $\mathbf{H}$ consists of the channel coefficients $\{h_{\ell m, \ell' m'}\}$ in (10). We assume that the input symbols that form the transmit codeword $\mathbf{x}$ satisfy an average power constraint

$$\frac{1}{T} \cdot E\left[\|\mathbf{x}\|^2\right] \le P. \tag{12}$$

Since there are $N = TW$ symbols per codeword, we define the parameter $\mathrm{SNR}$ (transmit energy per modulated symbol) for a given average transmit power $P$ as $\mathrm{SNR} = \frac{TP}{TW} = \frac{P}{W}$. In this work, the focus is on the wideband regime where $\mathrm{SNR} \to 0$ as $W \to \infty$ for a fixed $P$.

For sufficiently underspread channels, the parameters $T_o$ and $W_o$ can be matched to $T_m$ and $W_d$ so that the STF basis waveforms serve as approximate eigenfunctions of the channel [20], [21]; that is, (10) simplifies to (see footnote 4) $y_{\ell m} \approx h_{\ell m} x_{\ell m} + w_{\ell m}$. Thus the channel matrix $\mathbf{H}$ is approximately diagonal. In this work, we assume that $\mathbf{H}$ is exactly diagonal; that is,

$$\mathbf{H} = \mathrm{diag}\Big[\underbrace{h_{11} \cdots h_{1 N_c}}_{\text{Subspace } 1},\ \underbrace{h_{21} \cdots h_{2 N_c}}_{\text{Subspace } 2},\ \cdots,\ \underbrace{h_{D1} \cdots h_{D N_c}}_{\text{Subspace } D}\Big]. \tag{13}$$

The diagonal entries of $\mathbf{H}$ in (13) admit an intuitive block fading interpretation in terms of time-frequency coherence subspaces [20] illustrated in Fig. 1(b). The signal space is partitioned as $N = TW = N_c D$ where $D$ represents the number of statistically independent time-frequency coherence subspaces, reflecting the DoF in the channel, and $N_c$ represents the dimension of each coherence subspace, which we refer to as the coherence dimension. In the block fading model in (13), the channel coefficients over the $i$-th coherence subspace, $h_{i1}, \cdots, h_{i N_c}$, are assumed to be identical (denoted by $h_i$), whereas the coefficients across different coherence subspaces are independent and identically distributed. Thus, the channel is characterized by the $D$ distinct STF channel coefficients, $\{h_i\}$, that are i.i.d. zero-mean Gaussian random variables (Rayleigh fading) with (normalized) variance equal to $E[|h_i|^2] = \sum_n E[|\beta_n|^2] = 1$ [20].

Footnote 3: STF signaling can be treated as OFDM signaling over a block of OFDM symbol periods with an appropriately chosen symbol duration.

Footnote 4: The STF channel coefficients are different from the delay-Doppler coefficients, even though we are reusing the same symbols.
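The block fading structure in (13) can be sketched in a few lines of Python. This is a minimal illustration, assuming unit-variance complex Gaussian coefficients as stated above; the helper name `block_fading_diagonal`, the seed, and the small values of $D$ and $N_c$ are hypothetical and chosen only to keep the example readable.

```python
import math
import random

def block_fading_diagonal(D, N_c, rng):
    """Diagonal of H in (13): D i.i.d. CN(0,1) coefficients, each repeated N_c times."""
    diag = []
    for _ in range(D):
        # CN(0,1): real and imaginary parts are independent N(0, 1/2).
        h_i = complex(rng.gauss(0.0, math.sqrt(0.5)), rng.gauss(0.0, math.sqrt(0.5)))
        diag.extend([h_i] * N_c)
    return diag

rng = random.Random(0)
diag = block_fading_diagonal(D=4, N_c=3, rng=rng)      # N = N_c * D = 12 dimensions
x = [1.0 + 0.0j] * len(diag)                           # placeholder transmit symbols
w = [complex(rng.gauss(0.0, math.sqrt(0.5)), rng.gauss(0.0, math.sqrt(0.5)))
     for _ in diag]                                    # i.i.d. CN(0,1) noise
y = [h * s + n for h, s, n in zip(diag, x, w)]         # y = Hx + w with diagonal H, (11)
```

Each group of $N_c$ consecutive entries of `diag` shares one coefficient $h_i$, so only $D$ independent values describe the whole $N$-dimensional channel.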
Using the DoF scaling for sparse channels in (7), the scaling behavior for the coherence dimension can be computed as

$$W_{\mathrm{coh}} = \frac{W}{D_W} \sim f_1(W), \qquad T_{\mathrm{coh}} = \frac{T}{D_T} \sim f_2(T) \tag{14}$$

$$N_c = W_{\mathrm{coh}}\, T_{\mathrm{coh}} \sim f_1(W)\, f_2(T) \tag{15}$$

where $T_{\mathrm{coh}}$ is the coherence time and $W_{\mathrm{coh}}$ is the coherence bandwidth of the channel, as illustrated in Fig. 1(b). As a consequence of the sub-linearity of $g_1$ and $g_2$ in (7), $f_1$ and $f_2$ are also sub-linear. In particular, corresponding to the power-law scaling in (8), we obtain

$$T_{\mathrm{coh}} = \frac{T^{1 - \delta_1}}{W_d^{\delta_1}}, \qquad W_{\mathrm{coh}} = \frac{W^{1 - \delta_2}}{T_m^{\delta_2}}. \tag{16}$$

Remark 2: Note that when the channel is sparse, both $N_c$ and $D$ increase sub-linearly with $N$, whereas when the channel is rich, $D$ scales linearly with $N$, while $N_c$ is fixed.

In this work, the focus is on computing achievable rates in the non-coherent setting with feedback and, as we will see in Sec. III and IV, the rates turn out to be a function only of the parameters $N_c$ and $\mathrm{SNR}$. Thus, in order to analyze the low-SNR asymptotics, the following relation between $N_c$ and $\mathrm{SNR}$ $(= P/W)$ plays a key role:

$$N_c = \frac{1}{\mathrm{SNR}^{\mu}}, \qquad \mu > 0 \tag{17}$$

where the parameter $\mu$ reflects the level of channel coherence. We will revisit (17) and discuss its achievability and implications in Sec. IV.

III. ACHIEVABLE RATES WITH PERFECT RECEIVER CSI AND LIMITED CHANNEL STATE FEEDBACK

In this section, we study the scenario when there is perfect CSI at the receiver. We assume throughout this paper that both the transmitter and the receiver have statistical CSI, that is, knowledge of $T_m$, $W_d$, $g_1$, $g_2$, $f_1$ and $f_2$, so that the scaling in $D$ and $N_c$ are known. On one extreme, with perfect receiver CSI and no transmitter CSI (no feedback), the coherent capacity per dimension (in nats/s/Hz) equals

$$C_{\mathrm{coh},0}(\mathrm{SNR}) = \sup_{\mathbf{Q}:\ \mathrm{Tr}(\mathbf{Q}) \le TP} \frac{E\left[\log\det\left(\mathbf{I}_{N_c D} + \mathbf{H}\mathbf{Q}\mathbf{H}^{H}\right)\right]}{N_c D}. \tag{18}$$

The optimization is over the set of $N_c D$-dimensional positive definite input covariance matrices $\mathbf{Q} = E[\mathbf{x}\mathbf{x}^{H}]$ satisfying the average power constraint in (12). Due to the diagonal nature of $\mathbf{H}$ in (13), the optimal $\mathbf{Q}$ is also diagonal. Furthermore, with no transmitter CSI, the uniform power allocation $\mathbf{Q} = \frac{TP}{N_c D}\, \mathbf{I}_{N_c D} = \mathrm{SNR} \cdot \mathbf{I}_{N_c D}$ achieves this optimum. The corresponding capacity in the limit of low SNR is [2], [4]

$$C_{\mathrm{coh},0}(\mathrm{SNR}) \approx \mathrm{SNR} - \mathrm{SNR}^2. \tag{19}$$

On the other extreme is the case of perfect receiver and transmitter CSI, where the receiver instantaneously feeds back all the channel coefficients, $\{h_i\}_{i=1}^{D}$, corresponding to the $D$ independent coherence subspaces to the transmitter. The optimum transmitter power allocation in this case is waterfilling [14], [19] over the different coherence subspaces. In the low-SNR extreme, it is shown in [2], [17] that the capacity with perfect transmitter CSI scales as $\log(1/\mathrm{SNR}) \cdot \mathrm{SNR}$. That is, the capacity gain (compared with the receiver CSI only case) is directly proportional to the waterfilling threshold, $h_w \sim \log(1/\mathrm{SNR})$, and this gain serves as a benchmark for all limited feedback schemes. More interestingly, it is shown in [2], [17] that this maximum capacity gain can be achieved with just one bit of feedback per channel coefficient.

In the case of limited feedback, both the transmitter and the receiver have a priori knowledge of a common threshold denoted by $h_t$. The receiver compares the channel strength ($|h_i|^2$, $i = 1, 2, \cdots, D$) in each coherence subspace with $h_t$, and feeds back

$$b_i = \begin{cases} 1 & \text{if } |h_i|^2 \ge h_t \\ 0 & \text{if } |h_i|^2 < h_t. \end{cases} \tag{20}$$

At the transmitter, power allocation is uniform across the coherence subspaces for which $b_i = 1$ and no power is allocated to those subspaces for which $b_i = 0$. The input power allocation is conditioned on the partial CSI available at the transmitter (denoted by $\mathrm{CSI}$), which is $\{b_i\}_{i=1}^{D}$.
This power allocation, which we still denote by $\mathbf{Q}$ with an abuse of notation, takes the form

$$\mathbf{Q}(\mathrm{CSI}) = \mathrm{diag}\left(E[|x_1|^2 \mid \mathrm{CSI}],\ E[|x_2|^2 \mid \mathrm{CSI}],\ \cdots,\ E[|x_N|^2 \mid \mathrm{CSI}]\right) \tag{21}$$

$$= \mathrm{diag}\Big(\underbrace{q_1, \cdots, q_1}_{N_c},\ \underbrace{q_2, \cdots, q_2}_{N_c},\ \cdots,\ \underbrace{q_D, \cdots, q_D}_{N_c}\Big) \tag{22}$$

$$q_i = \mathcal{P} \cdot \chi\left(|h_i|^2 \ge h_t\right). \tag{23}$$

The choice of $\mathcal{P}$ depends on the type of power constraint and also on the nature of feedback. To explore this further, let $D_{\mathrm{eff}}$ denote the number of active subspaces, those which exceed the threshold $h_t$. We have

$$D_{\mathrm{eff}} = \sum_{i=1}^{D} \chi\left(|h_i|^2 \ge h_t\right) \tag{24}$$

$$E[D_{\mathrm{eff}}] \overset{(a)}{=} D\, E\left[\chi\left(|h|^2 \ge h_t\right)\right] \overset{(b)}{=} D e^{-h_t} \tag{25}$$

where (a) is due to the fact that $\{h_i\}_{i=1}^{D}$ are i.i.d. and (b) is due to the fact that for a standard complex Gaussian, $E[\chi(|h_i|^2 \ge h_t)] = \Pr(|h_i|^2 \ge h_t) = e^{-h_t}$.

If we assume knowledge of $\{b_i\}_{i=1}^{D}$ at the beginning of each codeword, albeit non-causally, at the transmitter, then we can uniformly divide power among the active subspaces. That is,

$$\mathcal{P}_{\mathrm{nc}} = \frac{TP}{N_c D_{\mathrm{eff}}}. \tag{26}$$

The rate achievable with this power allocation, denoted by $C_{\mathrm{coh},1,\mathrm{LT}}(\mathrm{SNR})$, is

$$C_{\mathrm{coh},1,\mathrm{LT}}(\mathrm{SNR}) = \max_{h_t} \frac{1}{D} \sum_{i=1}^{D} E\left[\log\left(1 + \frac{TP}{N_c D_{\mathrm{eff}}} \cdot |h_i|^2\, \chi\left(|h_i|^2 \ge h_t\right)\right)\right]. \tag{27}$$

The power allocation in (26) satisfies the power constraint instantaneously as well as on average. To see this, note that

$$P_{\mathrm{inst,nc}} = \frac{N_c}{T} \sum_{i=1}^{D} q_i = \frac{N_c}{T} \sum_{i=1}^{D} \frac{TP}{N_c D_{\mathrm{eff}}}\, \chi\left(|h_i|^2 \ge h_t\right) = P \tag{28}$$

and clearly $E[P_{\mathrm{inst,nc}}] = P$ as well. The non-causality of the scheme is more relevant in the scenario when the receiver estimates the channel coefficients $\{h_i\}_{i=1}^{D}$ and feeds back $\{b_i\}_{i=1}^{D}$ based on these estimates. This motivates us to instead consider a causal power allocation scheme, one in which for all $i = 1, \cdots, D$, $q_i$ in (23) depends on $b_i$ only through the indicator function and $\mathcal{P}$ is independent of $\{b_i\}_{i=1}^{D}$.
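The one-bit feedback rule (20), the active-subspace count (24)-(25) and the non-causal allocation (26)-(28) can be checked with a short Monte Carlo sketch. The snippet below assumes Rayleigh fading, so that $|h_i|^2$ is exponentially distributed with unit mean; the function names, the seed and the values of $D$, $N_c$, $T$, $P$ and $h_t$ are hypothetical and serve only as an illustration.

```python
import math
import random

def feedback_bits(gains, h_t):
    """One bit per coherence subspace, as in (20): 1 if |h_i|^2 >= h_t, else 0."""
    return [1 if g >= h_t else 0 for g in gains]

def noncausal_power(bits, T, P, N_c):
    """Per-symbol power (26) on each active subspace; zero if no subspace is active."""
    d_eff = sum(bits)
    return T * P / (N_c * d_eff) if d_eff > 0 else 0.0

def instantaneous_power(bits, per_symbol_power, T, N_c):
    """Instantaneous transmit power as in (28): (N_c / T) * sum_i q_i."""
    return (N_c / T) * sum(b * per_symbol_power for b in bits)

rng = random.Random(1)
D, N_c, T, P, h_t = 200, 10, 1.0, 1.0, 1.5             # hypothetical values
trials = 2000
mean_d_eff = 0.0
for _ in range(trials):
    gains = [rng.expovariate(1.0) for _ in range(D)]   # |h_i|^2 ~ Exp(1) under Rayleigh fading
    bits = feedback_bits(gains, h_t)
    mean_d_eff += sum(bits) / trials

# Empirical mean of D_eff next to the closed form D * exp(-h_t) from (25).
print(mean_d_eff, D * math.exp(-h_t))
# Non-causal allocation for the last realization: instantaneous power equals P.
print(instantaneous_power(bits, noncausal_power(bits, T, P, N_c), T, N_c))
```

Because the non-causal power level is normalized by the realized $D_{\mathrm{eff}}$, the instantaneous power matches $P$ for every realization with at least one active subspace, which is the content of (28).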
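From (23), we have

$$E\left[\|\mathbf{x}\|^2\right] = N_c \sum_{i=1}^{D} E[q_i] = N_c \sum_{i=1}^{D} \mathcal{P} \cdot E\left[\chi\left(|h_i|^2 \ge h_t\right)\right] = N_c\, \mathcal{P}\, E[D_{\mathrm{eff}}]. \tag{29}$$

Thus to satisfy $E[\|\mathbf{x}\|^2] \le TP$, the power allocation for the causal scheme is given by

$$\mathcal{P}_{\mathrm{c}} = \frac{TP}{N_c\, E[D_{\mathrm{eff}}]} = \frac{TP}{N_c D e^{-h_t}} \tag{30}$$

and the corresponding rate, $\widehat{C}_{\mathrm{coh},1,\mathrm{LT}}(\mathrm{SNR})$, is given by

$$\widehat{C}_{\mathrm{coh},1,\mathrm{LT}}(\mathrm{SNR}) = \max_{h_t} \frac{1}{D} \sum_{i=1}^{D} E\left[\log\left(1 + \frac{TP}{N_c D e^{-h_t}}\, |h_i|^2\, \chi\left(|h_i|^2 \ge h_t\right)\right)\right]. \tag{31}$$

The causal power allocation policy in (30) satisfies the average power constraint but can have a large instantaneous power. This is because

$$P_{\mathrm{inst,c}} = \frac{N_c}{T} \sum_{i=1}^{D} \frac{TP}{N_c D e^{-h_t}}\, \chi\left(|h_i|^2 \ge h_t\right) = \frac{D_{\mathrm{eff}}}{D e^{-h_t}}\, P. \tag{32}$$

Thus $E[P_{\mathrm{inst,c}}] = P$, but unlike (28), $P_{\mathrm{inst,c}} \in [0, \infty)$ depending on the choice of $h_t$. We will address this issue in Sec. III-B, but first, we study the average power constraint case more carefully.

A. Achievable Rates under Average Power Constraint

The following theorem establishes that a threshold of the form $h_t \sim \lambda \log(1/\mathrm{SNR})$ for some $\lambda \in (0, 1)$ provides the solution to (31).

Theorem 1: Given any $\lambda \in (0, 1)$, a causal on-off signaling scheme under an average power constraint achieves $\widehat{C}_{\mathrm{LB}} \le \widehat{C}_{\mathrm{coh},1,\mathrm{LT}}(\mathrm{SNR}) \le \widehat{C}_{\mathrm{UB}}$ with an optimal threshold of the form

$$\lim_{\mathrm{SNR} \to 0} \frac{h_t}{\lambda \log\left(\frac{1}{\mathrm{SNR}}\right)} = 1 \tag{33}$$

where

$$\widehat{C}_{\mathrm{UB}} = \mathrm{SNR}^{\lambda} \cdot \left[\log\left(1 + \lambda\, \mathrm{SNR}^{1-\lambda} \log\frac{1}{\mathrm{SNR}}\right) + \log\left(1 + \frac{\mathrm{SNR}^{1-\lambda}}{1 + \lambda\, \mathrm{SNR}^{1-\lambda} \log\frac{1}{\mathrm{SNR}}}\right)\right] \tag{34}$$

$$\widehat{C}_{\mathrm{LB}} = \mathrm{SNR}^{\lambda} \cdot \left[\log\left(1 + \lambda\, \mathrm{SNR}^{1-\lambda} \log\frac{1}{\mathrm{SNR}}\right) + \frac{1}{2} \log\left(1 + \frac{2\, \mathrm{SNR}^{1-\lambda}}{1 + \lambda\, \mathrm{SNR}^{1-\lambda} \log\frac{1}{\mathrm{SNR}}}\right)\right]. \tag{35}$$

Proof: Starting from (31), we have

$$\widehat{C}_{\mathrm{coh},1,\mathrm{LT}}(\mathrm{SNR}) = \max_{h_t} \frac{1}{D} \sum_{i=1}^{D} E\left[\log\left(1 + \frac{TP}{N_c D e^{-h_t}}\, |h_i|^2\, \chi\left(|h_i|^2 \ge h_t\right)\right)\right] \tag{36}$$

$$\overset{(a)}{=} E\left[\log\left(1 + \mathrm{SNR}\, e^{h_t}\, |h|^2\, \chi\left(|h|^2 \ge h_t\right)\right)\right] \tag{37}$$

where (a) follows from the fact that $\{h_i\}$ are i.i.d. $\mathcal{CN}(0, 1)$ and $h$ is a generic $\mathcal{CN}(0, 1)$ random variable. The expectation in (37) can be computed using [24, 4.337(1), p. 574].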
With $\alpha \triangleq \frac{1 + \mathrm{SNR}\, h_t e^{h_t}}{\mathrm{SNR}\, e^{h_t}}$, we have

$$\widehat{C}_{\mathrm{coh},1,\mathrm{LT}}(\mathrm{SNR}) = e^{-h_t} \cdot \left[\log\left(1 + \mathrm{SNR}\, h_t e^{h_t}\right) + \exp(\alpha) \int_{\alpha}^{\infty} \frac{e^{-t}}{t}\, dt\right] \tag{38}$$

$$= e^{-h_t} \cdot \left[\log\left(1 + \mathrm{SNR}\, h_t e^{h_t}\right) + \nu_{\alpha}\right] \tag{39}$$

where $\nu_{\alpha} \triangleq \exp(\alpha) \int_{\alpha}^{\infty} \frac{e^{-t}}{t}\, dt$. As $\alpha \to \infty$, the following bounds hold for $\nu_{\alpha}$ [25, 5.1.20, p. 229]:

$$\frac{1}{2} \log\left(1 + \frac{2}{\alpha}\right) \le \nu_{\alpha} \le \log\left(1 + \frac{1}{\alpha}\right). \tag{40}$$

It can be checked that the choice of $h_t$ maximizing (39) is obtained by setting its derivative to zero and satisfies

$$\Delta \triangleq 1 - \log\left(1 + \mathrm{SNR}\, h_t e^{h_t}\right) - \frac{1}{\mathrm{SNR}\, e^{h_t}} \cdot \nu_{\alpha} = 0. \tag{41}$$

Now, if $h_t$ is such that $\lim_{\mathrm{SNR} \to 0} \frac{h_t}{\lambda \log(1/\mathrm{SNR})} = 1$ for some $\lambda \in (0, 1)$, then as $\mathrm{SNR} \to 0$, we have $\mathrm{SNR}\, h_t e^{h_t} \to 0$ and $\alpha \to \infty$. Thus using (40), we can approximate $\nu_{\alpha}$ as $\nu_{\alpha} \approx \frac{1}{\alpha}$. With this approximation in (41), we have $\frac{1}{\mathrm{SNR}\, e^{h_t}} \cdot \nu_{\alpha} \approx \frac{1}{1 + \mathrm{SNR}\, h_t e^{h_t}} \to 1$. Using the choice of $h_t$ as in (33), it follows that as $\mathrm{SNR} \to 0$, $\Delta \to 0$. Substituting this choice of $h_t$ in (39) and using the upper and lower bounds on $\nu_{\alpha}$ in (40), we obtain the bounds in (34) and (35).

It can also be shown that the rate achievable with the causal scheme is asymptotically (in low SNR) the same as the non-causal capacity in (27). That is, $\widehat{C}_{\mathrm{coh},1,\mathrm{LT}}(\mathrm{SNR})$ is a tight bound to $C_{\mathrm{coh},1,\mathrm{LT}}(\mathrm{SNR})$ and for all $\lambda \in (0, 1)$, we have

$$\lim_{\mathrm{SNR} \to 0} \frac{C_{\mathrm{coh},1,\mathrm{LT}}(\mathrm{SNR}) - \widehat{C}_{\mathrm{coh},1,\mathrm{LT}}(\mathrm{SNR})}{C_{\mathrm{coh},1,\mathrm{LT}}(\mathrm{SNR})} = 0. \tag{42}$$

The proof of the above statement can be found in Appendix A.

Corollary 1: The capacity gain for the $D$-bit channel state feedback, causal power allocation scheme over the capacity with only receiver CSI in (19) is

$$\lim_{\mathrm{SNR} \to 0} \frac{\widehat{C}_{\mathrm{coh},1,\mathrm{LT}}(\mathrm{SNR})}{C_{\mathrm{coh},0}(\mathrm{SNR})} = (1 + h_t) = 1 + \lambda \log\frac{1}{\mathrm{SNR}}. \tag{43}$$

Proof: A Taylor series expansion of the upper and lower bounds in (34) and (35) shows that they are equal up to first order. This common term is such that

$$\widehat{C}_{\mathrm{coh},1,\mathrm{LT}}(\mathrm{SNR}) = \mathrm{SNR} \left(1 + \lambda \log\frac{1}{\mathrm{SNR}}\right) = (1 + h_t)\, \mathrm{SNR}. \tag{44}$$

On the other hand, with CSI at the receiver alone, we have from (19), $\frac{C_{\mathrm{coh},0}(\mathrm{SNR})}{\mathrm{SNR}} = 1 + o(1)$. Thus the desired result follows.

Remark 3: The capacity gain due to feedback is directly proportional to $h_t$, and the highest gain is obtained by choosing $\lambda \to 1$, which equals the benchmark where perfect CSI is available at both ends [17]. Statements analogous to those in Theorem 1 and Corollary 1 are well-known from prior work; see [2], [17], [18] for details.

We now return to the instantaneous transmit power case described in (32). Note that as $D \to \infty$, $P_{\mathrm{inst,c}} \to P$ as a consequence of the law of large numbers. However, for any finite $D$, $P_{\mathrm{inst,c}}$ may be much larger than $P$. This is a serious issue in practical systems that typically operate with peak power limitations. Thus it is important to analyze the impact of constraints on the instantaneous power in (32), as discussed next.

B. Achievable Rates under Instantaneous Power Constraint

In addition to the average power constraint, let us impose a constraint on the instantaneous transmit power of the form

$$P_{\mathrm{inst,c}} \overset{\mathrm{a.s.}}{\le} A P \tag{45}$$

where $A > 1$ is finite. With this short-term constraint, we now compute the rate, $\widehat{C}_{\mathrm{coh},1,\mathrm{ST}}(\mathrm{SNR})$, achievable with the causal signaling scheme. We are particularly interested in exploring conditions under which $\widehat{C}_{\mathrm{coh},1,\mathrm{ST}}(\mathrm{SNR}) \approx \widehat{C}_{\mathrm{coh},1,\mathrm{LT}}(\mathrm{SNR})$. To this end, we employ the following power allocation:

$$\mathbf{Q} = \mathrm{diag}\Big(\underbrace{q_1, \cdots, q_1}_{N_c},\ \underbrace{q_2, \cdots, q_2}_{N_c},\ \cdots,\ \underbrace{q_D, \cdots, q_D}_{N_c}\Big) \tag{46}$$

$$q_i = \mathcal{P}_{\mathrm{c}}\, \chi\left(|h_i|^2 \ge h_t\right)\, \chi\left(\sum_{j=1}^{i} \chi\left(|h_j|^2 \ge h_t\right) \le A D e^{-h_t}\right). \tag{47}$$

The second indicator function in (47) checks for the constraint in (45) causally, during each time-frequency coherence slot, and allocates power only if this constraint is met.
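The capped causal allocation in (47) can be sketched as a single pass over the coherence subspaces that keeps a running count of active subspaces. This is a minimal illustration, assuming Rayleigh fading and the causal power level of (30); the function name `capped_causal_allocation`, the seed and all numerical values are hypothetical.

```python
import math
import random

def capped_causal_allocation(gains, h_t, A, P_c):
    """Per-subspace powers q_i following (47): on-off signaling with a causal cap."""
    D = len(gains)
    cap = A * D * math.exp(-h_t)       # largest allowed running count of active subspaces
    active_so_far = 0
    q = []
    for g in gains:
        on = g >= h_t                  # first indicator in (47)
        if on:
            active_so_far += 1         # running sum over j = 1, ..., i
        # second indicator in (47): allocate only while the running count is within the cap
        q.append(P_c if (on and active_so_far <= cap) else 0.0)
    return q

rng = random.Random(2)
D, N_c, T, P, h_t, A = 500, 10, 1.0, 1.0, 2.0, 1.5    # hypothetical values
P_c = T * P / (N_c * D * math.exp(-h_t))              # causal power level from (30)
gains = [rng.expovariate(1.0) for _ in range(D)]      # |h_i|^2 ~ Exp(1) under Rayleigh fading
q = capped_causal_allocation(gains, h_t, A, P_c)
P_inst = (N_c / T) * sum(q)                           # instantaneous power, cf. (32)
print(P_inst, A * P)
```

Since at most $A D e^{-h_t}$ subspaces ever receive the power level $\mathcal{P}_{\mathrm{c}}$, the resulting instantaneous power cannot exceed $AP$, which is the constraint in (45).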
Note that the choice of $q_i$ in (47) meets the average power constraint with an inequality and hence, $q_i$ can be enhanced further. On the other hand, the right-hand side of the argument within the second indicator function has to be reduced by the factor $\frac{T_i}{T}$ where $T_i$ corresponds to the time duration over which the $i$ coherence subspaces under consideration are encountered. We will not bother with these secondary issues in the ensuing analysis. We then have

$$\widehat{C}_{\mathrm{coh},1,\mathrm{ST}}(\mathrm{SNR}) = \frac{1}{D}\, E\left[\sum_{i=1}^{D} \log\left(1 + \frac{TP}{N_c} \cdot \frac{|h_i|^2\, \chi\left(|h_i|^2 \ge h_t\right)}{D e^{-h_t}}\, \chi\left(\sum_{j=1}^{i} \chi\left(|h_j|^2 \ge h_t\right) \le A D e^{-h_t}\right)\right)\right]$$

$$= \frac{1}{D} \sum_{i=1}^{D} E\left[\log\left(1 + \mathrm{SNR} \cdot e^{h_t} \cdot |h_i|^2\, \chi\left(|h_i|^2 \ge h_t\right)\right) \chi\left(\sum_{j=1}^{i} \chi\left(|h_j|^2 \ge h_t\right) \le A D e^{-h_t}\right)\right]$$

$$= \frac{1}{D} \sum_{i=1}^{D} \Pr\left(\sum_{j=1}^{i} \chi\left(|h_j|^2 \ge h_t\right) \le A D e^{-h_t}\right) \cdot E\left[\log\left(1 + \mathrm{SNR} \cdot e^{h_t} \cdot |h_i|^2\, \chi\left(|h_i|^2 \ge h_t\right)\right)\right]$$

$$\overset{(a)}{=} E\left[\log\left(1 + \mathrm{SNR} \cdot e^{h_t} \cdot |h|^2\, \chi\left(|h|^2 \ge h_t\right)\right)\right] \cdot \frac{\sum_{i=1}^{D} \Pr\left(\sum_{j=1}^{i} \chi\left(|h_j|^2 \ge h_t\right) \le A D e^{-h_t}\right)}{D}$$

$$= \widehat{C}_{\mathrm{coh},1,\mathrm{LT}}(\mathrm{SNR}) \cdot \frac{\sum_{i=1}^{D} p_i}{D}$$

where $\widehat{C}_{\mathrm{coh},1,\mathrm{LT}}(\mathrm{SNR})$ is the rate achievable with only an average power constraint, and (a) follows from the fact that $\{h_i\}$ are i.i.d. and

$$p_i \triangleq \Pr\left(\sum_{j=1}^{i} \chi\left(|h_j|^2 \ge h_t\right) \le A D e^{-h_t}\right). \tag{48}$$

Thus, characterizing $\widehat{C}_{\mathrm{coh},1,\mathrm{ST}}(\mathrm{SNR})$ is equivalent to computing $p_i$. In particular, under what condition does $\frac{\sum_{i=1}^{D} p_i}{D} \to 1$? This is discussed in the following proposition.

Proposition 1: With $h_t \sim \lambda \log\frac{1}{\mathrm{SNR}}$ as in (33), we have $\frac{\sum_{i=1}^{D} p_i}{D} \ge L$ where

$$L \approx 1 - \frac{4}{\mathrm{SNR}^{\lambda} \left(1 + \mathrm{SNR}^{\lambda}/4\right)^{\frac{AD}{2} - 1}} - \frac{D (1 - A/2)}{\left(1 + \mathrm{SNR}^{\lambda}/4\right)^{\frac{D(A-1)}{2}}} \tag{49}$$

if $1 < A < 2$, and if $A > 2$, we have

$$L \approx 1 - \frac{4}{\mathrm{SNR}^{\lambda} \left(1 + \mathrm{SNR}^{\lambda}/4\right)^{D(A-1)}}. \tag{50}$$

In particular, if

$$E[D_{\mathrm{eff}}] - h_t = D e^{-h_t} - h_t \sim D\, \mathrm{SNR}^{\lambda} + \lambda \log(\mathrm{SNR}) \to \infty \quad \text{as } \mathrm{SNR} \to 0, \tag{51}$$

then $L \to 1$ for all $A > 1$ and $\widehat{C}_{\mathrm{coh},1,\mathrm{ST}}(\mathrm{SNR}) \to \widehat{C}_{\mathrm{coh},1,\mathrm{LT}}(\mathrm{SNR})$.

Proof: See Appendix B.

C. Discussion: Rich vs. Sparse Multipath

The result of Theorem 1 implies that the rate achievable with the $D$-bit channel state feedback scheme approaches the benchmark, the perfect transmitter CSI capacity, when $\lambda \to 1$. Furthermore, this benchmark can be attained in the wideband limit, even when there is an instantaneous power constraint. As described in Prop. 1, $E[D_{\mathrm{eff}}] - h_t \to \infty$ provides a sufficient condition. We now discuss the feasibility of satisfying these conditions when the channel is rich and when it is sparse. The behavior of $E[D_{\mathrm{eff}}]$ provides key insights in this regard.

A1) Rich multipath: For a rich channel, from (6) we note that $D$ scales linearly with $T$ and $W$. For a fixed $T$, $D \sim \mathrm{SNR}^{-1}$ (since $\mathrm{SNR} = \frac{P}{W}$). That is, $E[D_{\mathrm{eff}}] - h_t = D\, \mathrm{SNR}^{\lambda} + \lambda \log(\mathrm{SNR}) \to \infty$ for $0 < \lambda < 1$. We can thus conclude that for rich multipath the perfect CSI benchmark is attained trivially with both average and instantaneous power constraints.

A2) Sparse multipath: From the power-law scaling in (8), ignoring the constant factors, we have $D \sim T^{\delta_1} W^{\delta_2}$ and therefore

$$E[D_{\mathrm{eff}}] - h_t \sim T^{\delta_1}\, \mathrm{SNR}^{\lambda - \delta_2} + \lambda \log(\mathrm{SNR}). \tag{52}$$

For a fixed $T$, as $\mathrm{SNR} \to 0$, we have

$$E[D_{\mathrm{eff}}] - h_t \to \begin{cases} \infty & \text{if } 0 < \lambda < \delta_2 \\ -\infty & \text{if } 1 > \lambda \ge \delta_2. \end{cases} \tag{53}$$

While we can approach the benchmark capacity with an average power constraint, (53) suggests a cap on $\lambda$, the highest achievable gain with an instantaneous power constraint.

D. Capacity Optimal Packet Configurations

From (53), we see that the perfect CSI gain is not always achievable when there is an instantaneous power constraint. However, we note that (53) is derived assuming a fixed choice of $T$, while we know that sparsity in Doppler facilitates any desired scaling in the DoF with increasing $T$. Leveraging both delay and Doppler sparsities, we propose the following solution to get around the restriction in A2.
Instead of signaling with a fixed duration $T$, let us suppose that we maintain a scaling relationship for $T$ as a function of $W$. For example, let $T \sim W^{\rho}$ for some $\rho > 0$. Consequently, $D \sim T^{\delta_1} W^{\delta_2} \sim W^{\delta_2 + \rho \delta_1}$ and we have

$$
\mathbb{E}[D_{\mathrm{eff}}] - h_t \sim \mathsf{SNR}^{\lambda - \delta_2 - \rho \delta_1} + \lambda \log(\mathsf{SNR}). \tag{54}
$$

Thus in the limit as $\mathsf{SNR} \rightarrow 0$, the asymptotic behavior of $\mathbb{E}[D_{\mathrm{eff}}] - h_t$ is given by

$$
\mathbb{E}[D_{\mathrm{eff}}] - h_t \rightarrow
\begin{cases}
\infty & \text{if } 0 < \lambda < \delta_2 + \rho \delta_1 \\
-\infty & \text{if } 1 > \lambda \geq \delta_2 + \rho \delta_1.
\end{cases} \tag{55}
$$

Note that in (55), we have

$$
\delta_2 + \rho \delta_1 \geq 1 \iff \rho \geq \frac{1 - \delta_2}{\delta_1} \tag{56}
$$

which consequently leads to the desired result that $\mathbb{E}[D_{\mathrm{eff}}] - h_t \rightarrow \infty$ for all $\lambda \in (0, 1)$. Thus the benchmark gain is achievable even under an instantaneous power constraint.

To further illustrate this idea, we present an example when channel sparsity follows the power-law scaling in (8). For simplicity, let us assume that $\delta_1 = \delta_2 = \delta$. From (56), we require $T \sim W^{\rho}$ with $\rho \geq \frac{1 - \delta}{\delta}$ to achieve the benchmark performance. With $N = TW$, the capacity optimal $(T, W)$ packet configuration is then given by

$$
T \sim N^{\frac{\rho}{1 + \rho}}, \qquad W \sim N^{\frac{1}{1 + \rho}}. \tag{57}
$$

Fig. 2 illustrates the optimal packet configuration relationship for a rich multipath channel $(\delta \rightarrow 1)$, for a medium sparse channel $(\delta = 0.5)$ and for a very sparse channel $(\delta \rightarrow 0)$. These configurations show that in sparse multipath channels, the perfect CSI capacity gain is achievable with limited feedback under both average and instantaneous constraints on the transmission power by appropriate signaling strategies. These guidelines can be easily extended to generic sub-linear scaling laws.

IV. ACHIEVABLE RATES WITH CHANNEL ESTIMATION AT THE RECEIVER

In contrast to the perfect receiver CSI case, we now consider the more realistic case where no CSI is available a priori.
We first consider only an average power constraint and show that the first-order term of the benchmark capacity can be achieved if the channel is sparse and the channel coherence dimension, $N_c$, scales with $\mathsf{SNR}$ at an appropriate rate, allowing the receiver to learn the channel reliably. We also show that this is infeasible when the channel is rich, due to poor channel estimation.

More specifically, the focus here is on a training-based signaling scheme where the transmitted signals include training symbols to enable channel estimation and coherent detection. The restriction to training schemes is motivated by their easy realizability. The total energy available for training and communication is $PT$, of which a fraction $\eta$ is used for training and the remaining fraction $(1 - \eta)$ is used in communication. With the block fading model, this means that one signal space dimension in each coherence subspace is used for training and the remaining $(N_c - 1)$ are used in communication. This is pictorially illustrated in Fig. 3. We consider minimum mean-squared error (MMSE) channel estimation and the reader is referred to [13, Sec. IIc] for more details on the training scheme.

[Fig. 2. Optimal packet configurations with perfect receiver CSI and limited feedback as a function of richness of the channel. Three cases are illustrated here: rich multipath $(\delta \rightarrow 1)$, medium sparsity $(\delta = 0.5)$ and very high sparsity $(\delta \rightarrow 0)$. Axes: $T$ and $W$.]

A. Achievable Rates under Average Power Constraint

Let $\widehat{C}_{\mathrm{train},1,\mathrm{LT}}(\mathsf{SNR})$ denote the average mutual information achievable (per dimension) with the causal training scheme under the average power constraint. We proceed along the same lines as the no feedback case [13, Lemma 1] to characterize $\widehat{C}_{\mathrm{train},1,\mathrm{LT}}(\mathsf{SNR})$.
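Let $\mathbf{H}$ be the actual channel, $\widehat{\mathbf{H}}$ be the estimated channel and $\mathbf{\Delta} = \mathbf{H} - \widehat{\mathbf{H}}$ denote the estimation error matrix. We begin with the following well-known lower bound [26] to $\widehat{C}_{\mathrm{train},1,\mathrm{LT}}(\mathsf{SNR})$:

$$
\widehat{C}_{\mathrm{train},1,\mathrm{LT}}(\mathsf{SNR}) \geq \sup_{\mathbf{Q}} \frac{\mathbb{E}\left[ \log \det\left( \mathbf{I}_{(N_c - 1) D} + \widehat{\mathbf{H}} \mathbf{Q} \widehat{\mathbf{H}}^{H} \left( \mathbf{I} + \mathbf{\Sigma}_{\mathbf{\Delta} \mathbf{x}} \right)^{-1} \right) \right]}{N_c D} \tag{58}
$$

where the supremum is over $\{ \mathbf{Q} : \mathrm{Tr}(\mathbf{Q}) \leq (1 - \eta) T P \}$. The optimal $\mathbf{Q}$ is again diagonal and, analogous to (23), equals

$$
\mathbf{Q} = \mathrm{diag}\Big( \underbrace{q_1, \cdots, q_1}_{N_c - 1},\ \underbrace{q_2, \cdots, q_2}_{N_c - 1},\ \cdots,\ \underbrace{q_D, \cdots, q_D}_{N_c - 1} \Big) \tag{59}
$$

$$
q_i = \frac{(1 - \eta) T P}{(N_c - 1) D} \cdot \frac{\chi\left( |\widehat{h}_i|^2 \geq h_t^{\mathrm{train}} \right)}{\mathbb{E}\left[ \chi\left( |\widehat{h}|^2 \geq h_t^{\mathrm{train}} \right) \right]} \tag{60}
$$

where $h_t^{\mathrm{train}}$ is the threshold in the training case. The following theorem describes conditions under which the rates achievable with the training scheme converge to those in the coherent case.

[Fig. 3. Training-based signaling scheme in the STF domain. The $D$ estimated channel coefficients determine the $D$ feedback bits for the communication scheme with limited feedback. Axes: $t$ (extent $T$) and $f$ (extent $W$); a training symbol is placed in each coherence interval of duration $T_{\mathrm{coh}}$.]

Theorem 2: If $N_c = \frac{1}{\mathsf{SNR}^{\mu}}$ for some $\mu > 1$, then

$$
\lim_{\mathsf{SNR} \rightarrow 0} \frac{\widehat{C}_{\mathrm{train},1,\mathrm{LT}}(\mathsf{SNR})}{\widehat{C}_{\mathrm{coh},1,\mathrm{LT}}(\mathsf{SNR})} = 1. \tag{61}
$$

Proof: Using the choice of $\mathbf{Q}$ from (60) in (58) and proceeding along the lines of (48), we obtain

$$
\widehat{C}_{\mathrm{train},1,\mathrm{LT}}\left( h_t^{\mathrm{train}}, \eta, N_c, \mathsf{SNR} \right) = \kappa_1 \cdot \left[ \log\left( 1 + \frac{(1 - \eta)(1 + \eta N_c \mathsf{SNR})\, h_t^{\mathrm{train}}\, \mathsf{SNR}}{(1 - \eta) \mathsf{SNR} + \kappa_1 \kappa_2} \right) + \nu\left( \frac{(1 - \eta)(1 + \eta N_c \mathsf{SNR})\, h_t^{\mathrm{train}}\, \mathsf{SNR} + (1 - \eta) \mathsf{SNR} + \kappa_1 \kappa_2}{\eta (1 - \eta) N_c \mathsf{SNR}^2} \right) \right], \tag{62}
$$

$$
\kappa_1 = e^{- \frac{h_t^{\mathrm{train}} (1 + \eta N_c \mathsf{SNR})}{\eta N_c \mathsf{SNR}}}, \qquad \kappa_2 = \eta (N_c - 1) \mathsf{SNR} + 1 - \frac{1}{N_c} \tag{63}
$$

where $\nu(\cdot)$ is as defined following (39). The tightest lower bound from (62) is obtained by maximizing $\widehat{C}_{\mathrm{train},1,\mathrm{LT}}\left( h_t^{\mathrm{train}}, \eta, N_c, \mathsf{SNR} \right)$ over $\eta$, the fraction of energy spent on training, and over $h_t^{\mathrm{train}}$:

$$
C^{*}_{\mathrm{train},1,\mathrm{LT}} = \max_{h_t^{\mathrm{train}}} \max_{\eta}\ \widehat{C}_{\mathrm{train},1}\left( h_t^{\mathrm{train}}, \eta, N_c, \mathsf{SNR} \right). \tag{64}
$$

Performing the optimization in (64) seems difficult. Motivated by our study in Sec. III, we now assume a specific form for the threshold: $h_t^{\mathrm{train}} = \epsilon \log\left( \frac{1}{\mathsf{SNR}} \right)$. It is shown in Appendix C that with this choice of $h_t^{\mathrm{train}}$, the optimal choice for $\eta$ and $N_c$ can be obtained in closed form and the desired result in (61) is established.

Alternatively, we demonstrate a sub-optimal, but simpler approach that suffices to obtain (61). This approach uses the choice of $\eta$ that optimizes the average mutual information in the no feedback case [13, Lemma 2]. This choice, denoted by $\eta^{*}$, is given as

$$
\eta^{*} = \frac{N_c \mathsf{SNR} + N_c - 1}{(N_c - 2) N_c \mathsf{SNR}} \cdot \left[ \sqrt{1 + \frac{N_c \mathsf{SNR} (N_c - 2)}{N_c \mathsf{SNR} + N_c - 1}} - 1 \right]. \tag{65}
$$

Let $h_t^{\mathrm{train},\star} = \frac{\eta^{*} N_c \mathsf{SNR}}{1 + \eta^{*} N_c \mathsf{SNR}}\, h_t$ where $h_t \sim \lambda \log\left( \frac{1}{\mathsf{SNR}} \right)$, $\kappa_1^{\star} = \kappa_1 \big|_{\eta^{*},\, h_t^{\mathrm{train},\star}}$ and $\kappa_2^{\star} = \kappa_2 \big|_{\eta^{*}}$. If we define

$$
A_1 = \frac{(1 - \eta^{*})(1 + \eta^{*} N_c \mathsf{SNR})\, h_t^{\mathrm{train},\star}\, \mathsf{SNR}}{(1 - \eta^{*}) \mathsf{SNR} + \kappa_1^{\star} \kappa_2^{\star}}, \tag{66}
$$

$$
A_2 = \frac{(1 - \eta^{*})(1 + \eta^{*} N_c \mathsf{SNR})\, h_t^{\mathrm{train},\star}\, \mathsf{SNR} + (1 - \eta^{*}) \mathsf{SNR} + \kappa_1^{\star} \kappa_2^{\star}}{\eta^{*} (1 - \eta^{*}) N_c \mathsf{SNR}^2}, \tag{67}
$$

it is cumbersome, but straightforward, to show that

$$
\lim_{\mathsf{SNR} \rightarrow 0} A_1 = 0 \quad \text{and} \quad \lim_{\mathsf{SNR} \rightarrow 0} \frac{1}{A_2} = 0 \tag{68}
$$

for any $\mu > 0$. From (62), we then have

$$
\begin{aligned}
\max_{h_t^{\mathrm{train}},\, \eta} \widehat{C}_{\mathrm{train},1,\mathrm{LT}}\left( h_t^{\mathrm{train}}, \eta, N_c, \mathsf{SNR} \right)
&\geq \widehat{C}_{\mathrm{train},1,\mathrm{LT}}\left( h_t^{\mathrm{train},\star}, \eta^{*}, N_c, \mathsf{SNR} \right) && (69) \\
&= \kappa_1^{\star} \cdot \left[ \log(1 + A_1) + \nu(A_2) \right] && (70) \\
&\stackrel{(a)}{\geq} \kappa_1^{\star} \cdot \left[ \log(1 + A_1) + \frac{1}{2} \log\left( 1 + \frac{2}{A_2} \right) \right] && (71) \\
&\stackrel{(b)}{\approx} \kappa_1^{\star} \cdot \left[ A_1 + \frac{1}{A_2} \right] && (72)
\end{aligned}
$$

where (a) follows from (40) and (b) is the low-$\mathsf{SNR}$ approximation to (71). Substituting for $h_t^{\mathrm{train},\star}$ and simplifying, we can reduce the lower bound in (72) to

$$
\widehat{C}_{\mathrm{train},1,\mathrm{LT}}(\mathsf{SNR}) \geq (1 - \eta^{*})\, \frac{N_c}{N_c - 1}\, \frac{\eta^{*} N_c \mathsf{SNR}}{1 + \eta^{*} N_c \mathsf{SNR}}\, [1 + h_t]\, \mathsf{SNR}. \tag{73}
$$

Substituting for $\eta^{*}$ from (65) and $N_c = \frac{1}{\mathsf{SNR}^{\mu}}$, it can be checked that when $\mu > 1$ the leading term is $[1 + h_t]\, \mathsf{SNR}$, which equals the first-order term of the coherent capacity as described by Corollary 1.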
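As a numerical sanity check on (65) and (73), the following minimal sketch evaluates $\eta^{*}$ and the multiplicative factor in front of $[1 + h_t]\, \mathsf{SNR}$ in (73) when $N_c = \mathsf{SNR}^{-\mu}$. The function names `eta_star` and `bound_factor`, as well as the sample values of $\mathsf{SNR}$ and $\mu$, are illustrative assumptions and are not part of the original analysis.

```python
import math


def eta_star(n_c: float, snr: float) -> float:
    """Training-energy fraction eta* from (65); assumes n_c > 2."""
    g = (n_c * snr + n_c - 1.0) / ((n_c - 2.0) * n_c * snr)
    # sqrt(1 + 1/g) equals the square-root term written out in (65)
    return g * (math.sqrt(1.0 + 1.0 / g) - 1.0)


def bound_factor(snr: float, mu: float) -> float:
    """Factor multiplying [1 + h_t] * SNR in (73), with N_c = SNR**(-mu)."""
    n_c = snr ** (-mu)
    eta = eta_star(n_c, snr)
    e_tr = eta * n_c * snr  # training energy term eta * N_c * SNR
    return (1.0 - eta) * n_c / (n_c - 1.0) * e_tr / (1.0 + e_tr)


if __name__ == "__main__":
    # Illustrative (assumed) values of mu and SNR
    for mu in (0.5, 1.5, 2.0):
        row = [round(bound_factor(s, mu), 4) for s in (1e-2, 1e-3, 1e-4)]
        print(f"mu={mu}: {row}")
```

Under these assumed values, the factor moves toward one as $\mathsf{SNR}$ decreases when $\mu > 1$ and shrinks toward zero when $\mu < 1$, in line with the role of $\mu$ in Theorem 2.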
On the other hand, when $\mu < 1$, the leading term takes the form $O\left( \mathsf{SNR}^{\frac{3 - \mu}{2}} \right)$ and hence, $\mu > 1$ is necessary. Having established the result with an average power constraint, let us consider the instantaneous power constraint case.

B. Achievable Rates under Instantaneous Power Constraint

We impose a constraint as in (45) for the communication phase of the training scheme. With the same power allocation scheme as in (47) (Sec. III-B), we obtain

$$
\begin{aligned}
\widehat{C}_{\mathrm{train},1,\mathrm{ST}}(\mathsf{SNR})
&= \left( 1 - \frac{1}{N_c} \right) \frac{1}{D} \sum_{i=1}^{D} \mathbb{E}\left[ \log\left( 1 + \frac{|\widehat{h}_i|^2\, q_i\, (1 + E_{\mathrm{tr}})}{1 + q_i + E_{\mathrm{tr}}} \right) \times \chi\left( \sum_{j=1}^{i} \chi\left( |\widehat{h}_j|^2 \geq h_t^{\mathrm{train}} \right) \leq \frac{A D\, e^{- \frac{h_t^{\mathrm{train}} (1 + \eta N_c \mathsf{SNR})}{\eta N_c \mathsf{SNR}}}}{1 - \eta} \right) \right] && (74) \\
&= \widehat{C}_{\mathrm{train},1,\mathrm{LT}}(\mathsf{SNR}) \cdot \frac{\sum_{i=1}^{D} p_i^{\mathrm{train}}}{D} && (75)
\end{aligned}
$$

where $E_{\mathrm{tr}} = \eta N_c \mathsf{SNR}$ and

$$
p_i^{\mathrm{train}} = \Pr\left( \sum_{j=1}^{i} \chi\left( |\widehat{h}_j|^2 \geq h_t^{\mathrm{train}} \right) \leq \frac{A D\, e^{- \frac{h_t^{\mathrm{train}} (1 + \eta N_c \mathsf{SNR})}{\eta N_c \mathsf{SNR}}}}{1 - \eta} \right).
$$

Understanding when $\frac{\sum_{i=1}^{D} p_i^{\mathrm{train}}}{D} \rightarrow 1$ is similar to the case studied in Sec. III-B. Taking recourse to the analysis of Prop. 1 by using a threshold of the form $h_t^{\mathrm{train},\star} = \frac{\eta^{*} N_c \mathsf{SNR}}{1 + \eta^{*} N_c \mathsf{SNR}}\, h_t$, where $\eta^{*}$ is as in (65) and $h_t \sim \lambda \log\left( \frac{1}{\mathsf{SNR}} \right)$, it can be shown that $\frac{\sum_{i=1}^{D} p_i^{\mathrm{train}}}{D}$ is lower bounded by the same expressions as in (49) and (50) with $A$ replaced by $\frac{A}{1 - \eta^{*}}$. After some simplifications, we can conclude that if $\frac{\mathbb{E}[D_{\mathrm{eff}}]}{1 - \eta^{*}} - h_t \rightarrow \infty$, then $\widehat{C}_{\mathrm{train},1,\mathrm{ST}}(\mathsf{SNR}) \rightarrow \widehat{C}_{\mathrm{train},1,\mathrm{LT}}(\mathsf{SNR})$. Note that the condition in the perfect CSI case is more stringent than in the training setting. That is, if the channel is such that $\mathbb{E}[D_{\mathrm{eff}}] - h_t \rightarrow \infty$, then it automatically ensures that $\frac{\mathbb{E}[D_{\mathrm{eff}}]}{1 - \eta^{*}} - h_t \rightarrow \infty$.

C. Discussion

The analysis in Sec. IV-A and IV-B reveals that the following conditions are critical:

C1) The channel coherence dimension, $N_c$, scales with $\mathsf{SNR}$ according to $N_c \sim \frac{1}{\mathsf{SNR}^{\mu}}$, $\mu > 1$, and

C2) The independent degrees of freedom (DoF), $D$, in the channel scale with $\mathsf{SNR}$ such that $\frac{\mathbb{E}[D_{\mathrm{eff}}]}{1 - \eta^{*}} - h_t = \frac{D e^{-h_t}}{1 - \eta^{*}} - h_t \rightarrow \infty$ as $\mathsf{SNR} \rightarrow 0$.

With only an average power constraint, C1 is necessary and sufficient so that $\widehat{C}_{\mathrm{train},1,\mathrm{LT}}(\mathsf{SNR}) \rightarrow \widehat{C}_{\mathrm{coh},1,\mathrm{LT}}(\mathsf{SNR})$. In particular, with $\lambda \rightarrow 1$, we approach the perfect CSI benchmark. When there is an instantaneous power constraint, we need to satisfy both C1 and C2 so that the benchmark can be attained. We now study the implications of these conditions.

Note that C1 requires a certain minimum channel coherence level to ensure the fidelity of the training performance. That is, the larger the value of $\mu$ and hence, $N_c$, the easier it is to meet the benchmark. On the other hand, C2 describes the required growth rate in the DoF, $D$, so that $\mathbb{E}[D_{\mathrm{eff}}] - h_t \rightarrow \infty$ and the instantaneous power constraint is satisfied without any rate loss. That is, the larger the value of $D$, the easier it is to meet the benchmark. It is clear that the two conditions are somewhat conflicting in nature since for a richer channel, it is easier to increase $D$ but more difficult to increase $N_c$, while for a sparser channel, it is the reverse. Therefore a natural question is whether they can be satisfied simultaneously.

To understand this, we first study the achievability of C1. What are the conditions on the channel parameters ($T_m$, $W_d$, $\delta_1$ and $\delta_2$) and how do they interact with the signal space parameters ($T$, $W$ and $P$) so that $\mu > 1$ is feasible? As we discuss next, by leveraging delay and Doppler sparsities and using peaky signaling (when necessary), $\mu > 1$ is achievable.
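Condition C2 can be illustrated numerically before turning to the individual cases. The minimal sketch below assumes, purely for illustration, that the DoF grow as $D = \mathsf{SNR}^{-d}$ for some exponent $d > 0$ and that $h_t = \lambda \log\left( \frac{1}{\mathsf{SNR}} \right)$; the function name `c2_metric`, the exponent `d` and the default $\eta = 0$ are assumptions introduced here, not quantities fixed by the analysis.

```python
import math


def c2_metric(snr: float, lam: float, d: float, eta: float = 0.0) -> float:
    """Evaluate D * exp(-h_t) / (1 - eta) - h_t for the assumed scaling D = snr**(-d)."""
    h_t = lam * math.log(1.0 / snr)   # threshold h_t = lambda * log(1/SNR)
    dof = snr ** (-d)                 # assumed DoF growth (hypothetical exponent d)
    return dof * math.exp(-h_t) / (1.0 - eta) - h_t


if __name__ == "__main__":
    # Assumed (lambda, d) pairs: one with lambda < d, one with lambda > d
    for lam, d in ((0.5, 0.9), (0.9, 0.5)):
        vals = [round(c2_metric(s, lam, d), 2) for s in (1e-2, 1e-4, 1e-8)]
        print(f"lambda={lam}, d={d}: {vals}")
```

With these assumed exponents, the metric grows without bound as $\mathsf{SNR}$ decreases when $\lambda < d$ and drifts toward $-\infty$ when $\lambda > d$, which mirrors the dichotomy in (53) and (55).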
B1) Rich multipath: When the channel is rich in both delay and Doppler, $N_c = \frac{1}{T_m W_d}$ is fixed and does not scale with $\mathsf{SNR}$. Thus we can never maintain the scaling relationship in $N_c$ as in Theorem 2 and C1 can never be satisfied. Therefore, we cannot attain the benchmark even under an average power constraint.

B2) Doppler sparsity only: In this case $W_{\mathrm{coh}} = \frac{1}{T_m}$ is fixed and the scaling in $N_c$ is only through $T_{\mathrm{coh}} \sim f_2(T)$ (see (15)). Therefore, by scaling $T$ with $W$ according to $T \sim f_2^{-1}(W^{\mu})$ and choosing $\mu > 1$, we have $N_c \sim T_{\mathrm{coh}} \sim f_2\left( f_2^{-1}(W^{\mu}) \right) \sim \frac{1}{\mathsf{SNR}^{\mu}}$. For the power-law scaling in (16), we obtain

$$
T \sim W^{\frac{\mu}{1 - \delta_1}}. \tag{76}
$$

Note that as $\delta_1$ increases and the channel gets richer, the required $T$ in (76) increases monotonically.

B3) Delay sparsity only: In this case, $T_{\mathrm{coh}} = \frac{1}{W_d}$ and $N_c = W_{\mathrm{coh}} T_{\mathrm{coh}}$ scales with $\mathsf{SNR}$ only through $W_{\mathrm{coh}} \sim f_1\left( \frac{1}{\mathsf{SNR}} \right)$. Therefore, for any sub-linear function $f_1(\cdot)$, we cannot satisfy $\mu > 1$. A possible solution to overcome this difficulty is to use peaky signaling where training and communication are performed only on a subset of the $D$ coherence subspaces. Modeling peakiness as in [4], [13] and defining $\zeta = \mathsf{SNR}^{\gamma}$, $\gamma > 0$, as the fraction of $D$ over which signaling is performed, it can be shown [13, Lemma 3] that the condition for asymptotic coherence gets relaxed to $N_c = \frac{1}{\mathsf{SNR}^{\mu_{\mathrm{peaky}}}}$ from the original $N_c = \frac{1}{\mathsf{SNR}^{\mu}}$, where $\mu_{\mathrm{peaky}} = \mu + \gamma$. We require $\mu_{\mathrm{peaky}} > 1$, which is the same as $\mu > 1 - \gamma$. For the power-law scaling in (16), we have $N_c \sim f_1(W) \sim W^{1 - \delta_2} \sim \frac{1}{\mathsf{SNR}^{1 - \delta_2}}$. Thus, if the peakiness coefficient $\gamma$ satisfies $\gamma > \delta_2$, we can satisfy the desired condition.

B4) Delay and Doppler sparsity: Using (15), we have $W_{\mathrm{coh}} \sim f_1(W)$ and $T_{\mathrm{coh}} \sim f_2(T)$.
Therefore, if we scale $T$ with $W$ according to $T \sim f_3(W)$ with

$$
f_3(x) = f_2^{-1}\left( \frac{x^{\mu}}{f_1(x)} \right), \tag{77}
$$

we have $N_c = W_{\mathrm{coh}} T_{\mathrm{coh}} \sim f_1(W)\, f_2(f_3(W)) = f_1(W)\, f_2\left( f_2^{-1}\left( \frac{W^{\mu}}{f_1(W)} \right) \right) \sim \frac{1}{\mathsf{SNR}^{\mu}}$. Thus with $\mu > 1$ in (77), we attain the desired scaling of $N_c$ with $\mathsf{SNR}$. For the power-law scaling in (16), the desired scaling in $N_c$ can be obtained by choosing $T$, $W$ and $P$ according to the following canonical relationship that is obtained using (16) in (77):

$$
T = \left( T_m^{\delta_2} W_d^{\delta_1} \right)^{\frac{1}{1 - \delta_1}} \frac{W^{\frac{\mu - 1 + \delta_2}{1 - \delta_1}}}{P^{\frac{\mu}{1 - \delta_1}}}. \tag{78}
$$

From the above discussion, it is clear that channel sparsity is necessary and, in addition, we also require a specific scaling relationship between $T$ and $W$ as defined in (78). But this is necessary for achieving the benchmark capacity with an average power constraint (satisfying C1). We now study how this scaling law impacts the scaling of $D$ with $\mathsf{SNR}$, as in the instantaneous power case. This is critical in determining the achievability of C2, which we discuss next. We recall that by definition

$$
D = \frac{TW}{N_c} = TW\, \mathsf{SNR}^{\mu}. \tag{79}
$$

Using (78) in (79) and simplifying, we obtain the induced scaling behavior on $D$ with $\mathsf{SNR}$ as

$$
D \sim \mathsf{SNR}^{\frac{\delta_1 (1 - \mu) - \delta_2}{1 - \delta_1}}. \tag{80}
$$

Therefore, we have $\mathbb{E}[D_{\mathrm{eff}}] - h_t = \mathsf{SNR}^{\lambda + \frac{\delta_1 (1 - \mu) - \delta_2}{1 - \delta_1}} + \lambda \log(\mathsf{SNR})$ and consequently

$$
\mathbb{E}[D_{\mathrm{eff}}] - h_t \rightarrow
\begin{cases}
\infty & \text{if } 0 < \lambda < \frac{\delta_2 + (\mu - 1) \delta_1}{1 - \delta_1} \\
-\infty & \text{if } 1 > \lambda \geq \frac{\delta_2 + (\mu - 1) \delta_1}{1 - \delta_1}.
\end{cases} \tag{81}
$$

It is easily seen that

$$
\frac{\delta_2 + (\mu - 1) \delta_1}{1 - \delta_1} > 1 \iff \mu > \frac{1 - \delta_2}{\delta_1} \tag{82}
$$

which yields $\mathbb{E}[D_{\mathrm{eff}}] - h_t \rightarrow \infty$ for all $\lambda \in (0, 1)$, and C2 is satisfied as desired. The special cases of Doppler sparsity only and delay sparsity only (as in B2 and B3) are simple extensions and follow naturally.

To summarize,

$$
\mu > 1 \implies \text{C1 is achievable} \tag{83}
$$

$$
\mu > \frac{1 - \delta_2}{\delta_1} \implies \text{C2 is achievable}. \tag{84}
$$

Therefore,

$$
\mu > \max\left( 1, \frac{1 - \delta_2}{\delta_1} \right) \implies \text{C1 and C2 are achievable}. \tag{85}
$$

We now elucidate the optimal packet configurations for different levels of channel sparsity. Analogous to the discussion in Sec. III-D, we focus on the power-law scaling and illustrate rules of thumb for choosing $T$ and $W$ for a given $N = TW$. Assuming symmetrical sparsity ($\delta_1 = \delta_2 = \delta$), we note the following two cases:

$$
\text{Case 1: } \frac{1 - \delta}{\delta} > 1 \iff \delta < 0.5, \qquad T \sim W^{\rho},\ \rho > \frac{1 - \delta}{\delta} \tag{86}
$$

$$
\text{Case 2: } \frac{1 - \delta}{\delta} < 1 \iff \delta > 0.5, \qquad T \sim W^{\rho},\ \rho > \frac{\delta}{1 - \delta}. \tag{87}
$$

The corresponding packet configurations are shown in Fig. 4 for $\delta \rightarrow 0$, $\delta = 0.5$ and $\delta \rightarrow 1$. It is observed that the slowest scaling in $T$ with $W$ is obtained for $\delta = 0.5$, when the DoF follow a square-root scaling law with signal space dimension. On either extreme of this square-root law, the required scaling in $T$ with $W$ only gets worse. This conclusion is expected and is consistent with the contradictory requirements presented by C1 and C2. When $\delta < 0.5$, the channel conditions are more favorable towards scaling $N_c$ as a function of $\mathsf{SNR}$ (specified by C1). However, the required scaling of $D$ with $\mathsf{SNR}$ (specified by C2) is non-trivial and ultimately dominates the required scaling of $T$ with $W$. On the other hand, when $\delta > 0.5$, the relatively less sparse channel conditions are favorably disposed towards the scaling of $D$ as a function of $\mathsf{SNR}$, but this is at the cost of scaling in $N_c$. For the case of asymmetrically sparse channels, it can be shown that this desirable condition (slowest scaling of $T$ with $W$) generalizes to $\delta_1 + \delta_2 = 1$.

[Fig. 4. Optimal packet configurations in the non-coherent scenario with limited feedback. Three cases illustrated here are rich multipath $(\delta \rightarrow 1)$, medium sparsity $(\delta = 0.5)$ and very high sparsity $(\delta \rightarrow 0)$. Axes: $T$ and $W$.]

V. CONCLUDING REMARKS

In this paper, we studied the achievable rates of sparse multipath channels with limited feedback.
The focus of our analysis is on the wideband/low-$\mathsf{SNR}$ regime. Our investigation includes constraining both the average and the instantaneous transmit powers. We first analyzed the case when the receiver has perfect CSI and when one bit (per channel coefficient) of this CSI is known perfectly at the transmitter. We established conditions under which the rates achievable with this scheme approach the capacity with perfect receiver and transmitter CSI. For sparse channels, these conditions translate to certain optimal packet configurations for signaling. When the receiver has no CSI a priori, we studied the performance of a training scheme. It is shown that with only an average power constraint, channel sparsity is necessary to attain the coherent performance. With an instantaneous power constraint, we established conditions on optimal packet configurations in order to approach the benchmark capacity gain asymptotically as $\mathsf{SNR} \rightarrow 0$.

TABLE I
CONDITIONS NECESSARY TO ACHIEVE THE PERFECT CSI BENCHMARK OF $\log\left( \frac{1}{\mathsf{SNR}} \right) \mathsf{SNR}$.

| CSI Rx. | CSI Tx. | Power Const. | Necessary Conditions | Signaling Parameters |
|---|---|---|---|---|
| Perf. | Perf. | - | $h_w \sim \log\left( \frac{1}{\mathsf{SNR}} \right)$ | Waterfilling; see [2], [17] |
| Perf. | 1 bit | Avg. | $h_t = \lambda \log\left( \frac{1}{\mathsf{SNR}} \right)$, $\lambda \rightarrow 1$ | No constraints on richness or $T$, $W$; see [2], [17], [18] |
| Perf. | 1 bit | Inst. | $h_t = \lambda \log\left( \frac{1}{\mathsf{SNR}} \right)$ for $\lambda < 1$, and $\mathbb{E}[D_{\mathrm{eff}}] - h_t \rightarrow \infty$ | Rich channel: no constraint on $T$ or $W$; Sparse ($T$ fixed): $\lambda < \delta_2$ limits rates; Sparse (general): $T \sim W^{\rho}$, $\rho \geq \frac{1 - \delta_2}{\delta_1}$ |
| Train. | 1 bit | Avg. | $N_c \sim \frac{1}{\mathsf{SNR}^{\mu}}$, $\mu > 1$ | Rich channel: impossible; Sparsity (Doppler): non-peaky scheme with $T \sim W^{\frac{\mu}{1 - \delta_1}}$; Sparsity (delay): peaky scheme with peakiness coefficient $\gamma > \delta_2$; Sparsity (both): non-peaky scheme, see (77) and (78) |
| Train. | 1 bit | Inst. | $N_c \sim \frac{1}{\mathsf{SNR}^{\mu}}$, $\mu > 1$, and $\frac{\mathbb{E}[D_{\mathrm{eff}}]}{1 - \eta^{*}} - h_t \rightarrow \infty$ | Rich channel: impossible; Sparse (both): $\mu > \frac{1 - \delta_2}{\delta_1}$ for no rate loss, else $\lambda < \frac{\delta_2 + (\mu - 1) \delta_1}{1 - \delta_1}$ |

We contrast the results of this work with recent observations in [17], [18]. The focus in [17], [18] is on training schemes and on scenarios where $T_{\mathrm{coh}}$ increases as $\mathsf{SNR}$ decreases, although there is no mention of how such a scaling law can be realized in practice. In particular, the authors show that capacity scales as $\log(T_{\mathrm{coh}})\, \mathsf{SNR}$ if $\log(T_{\mathrm{coh}}) \ll \log\left( \frac{1}{\mathsf{SNR}} \right)$ and equals the coherent capacity, $\log\left( \frac{1}{\mathsf{SNR}} \right) \mathsf{SNR}$, when $\log(T_{\mathrm{coh}}) \gg \log\left( \frac{1}{\mathsf{SNR}} \right)$. On the other hand, we have shown that when the channel is sparse, channel coherence scales naturally with $T$ and $W$ and the benchmark gain, $\log\left( \frac{1}{\mathsf{SNR}} \right)$, can always be achieved by appropriately choosing $T$ and $W$. Furthermore, while [17], [18] considered only an average power constraint, we have established achievability under both average and instantaneous power constraints. Also, peaky training schemes are necessary in the framework of [17] to achieve perfect training performance. Such schemes would violate any finite instantaneous power constraint. Our findings here reveal that channel sparsity is a degree of freedom that can be exploited to obtain near-coherent performance with non-peaky training schemes. Table I provides a short summary of our contributions and places them in the context of [2], [17], [18].

Finally, we note that the results obtained here closely parallel our earlier work [13], where we studied the achievable rates with training and no feedback. We showed that when $N_c = \frac{1}{\mathsf{SNR}^{\mu}}$ with $\mu > 1$, the channel is asymptotically coherent; channel estimation performance is near-perfect at a vanishing energy cost.
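Analogous to [13], we have shown here that under the assumption of an error-free $D$-bit feedback link, the rate achievable with the training scheme converges to the perfect CSI benchmark. Furthermore, the cost of feedback, measured in terms of the number of feedback bits per dimension ($D/N$), converges asymptotically to zero in a sparse channel.

APPENDIX

A. Tightness of $\widehat{C}_{\mathrm{coh},1,\mathrm{LT}}(\mathsf{SNR})$ to $C_{\mathrm{coh},1,\mathrm{LT}}(\mathsf{SNR})$ as $\mathsf{SNR} \rightarrow 0$

Let $\chi_i$ denote the random variable $\chi(|h_i|^2 \geq h_t)$. Defining $\gamma \triangleq \frac{\left| C_{\mathrm{coh},1,\mathrm{LT}}(\mathsf{SNR}) - \widehat{C}_{\mathrm{coh},1,\mathrm{LT}}(\mathsf{SNR}) \right|}{C_{\mathrm{coh},1,\mathrm{LT}}(\mathsf{SNR})}$, we have

$$
\begin{aligned}
\gamma &= \left| \frac{1}{D} \sum_{i=1}^{D} \mathbb{E}\left[ \log\left( 1 + \frac{\frac{TP\, |h_i|^2 \chi_i \left( D e^{-h_t} - \sum_{j} \chi_j \right)}{\sum_{j} \chi_j\, N_c D e^{-h_t}}}{1 + \frac{TP\, |h_i|^2 \chi_i}{N_c D e^{-h_t}}} \right) \right] \right| && (88) \\
&\leq \frac{1}{D} \sum_{i=1}^{D} \mathbb{E}\left[ \left| \log\left( 1 + \frac{\frac{TP\, |h_i|^2 \chi_i \left( D e^{-h_t} - \sum_{j} \chi_j \right)}{\sum_{j} \chi_j\, N_c D e^{-h_t}}}{1 + \frac{TP\, |h_i|^2 \chi_i}{N_c D e^{-h_t}}} \right) \right| \right] && (89) \\
&\stackrel{(a)}{\leq} \frac{1}{D} \sum_{i=1}^{D} \mathbb{E}\left[ \left| \frac{\frac{TP\, |h_i|^2 \chi_i \left( D e^{-h_t} - \sum_{j} \chi_j \right)}{\sum_{j} \chi_j\, N_c D e^{-h_t}}}{1 + \frac{TP\, |h_i|^2 \chi_i}{N_c D e^{-h_t}}} \right| \right] && (90) \\
&= \frac{TP}{N_c D^2 e^{-h_t}} \sum_{i=1}^{D} \mathbb{E}\left[ \frac{|h_i|^2 \chi_i \left| D e^{-h_t} - \sum_{j} \chi_j \right|}{\sum_{j} \chi_j \left( 1 + \frac{TP\, |h_i|^2 \chi_i}{N_c D e^{-h_t}} \right)} \right] && (91) \\
&\stackrel{(b)}{=} \frac{TP}{N_c D e^{-h_t}}\, \mathbb{E}\left[ \frac{|h_1|^2 \chi_1 \left| D e^{-h_t} - \sum_{j} \chi_j \right|}{\sum_{j} \chi_j \left( 1 + \frac{TP\, |h_1|^2 \chi_1}{N_c D e^{-h_t}} \right)} \right] \triangleq \gamma_0 && (92)
\end{aligned}
$$

where (a) follows from the log-inequality and (b) from the fact that $\{h_i\}$ are i.i.d. Conditioning on $\chi_1$, we now have

$$
\begin{aligned}
\gamma_0 &= \frac{TP}{N_c D e^{-h_t}}\, \mathbb{E}[\chi_1]\; \mathbb{E}_{h_1, \{\chi_j,\, j > 1\}}\left[ \frac{|h_1|^2 \left| D e^{-h_t} - \left( 1 + \sum_{j > 1} \chi_j \right) \right|}{\left( 1 + \sum_{j > 1} \chi_j \right) \left( 1 + \frac{TP\, |h_1|^2}{N_c D e^{-h_t}} \right)} \right] && (93) \\
&= \mathsf{SNR} \cdot \mathbb{E}_{h_1, \{\chi_j,\, j > 1\}}\left[ \frac{|h_1|^2 \left| D e^{-h_t} - \left( 1 + \sum_{j > 1} \chi_j \right) \right|}{\left( 1 + \sum_{j > 1} \chi_j \right) \left( 1 + \frac{TP\, |h_1|^2}{N_c D e^{-h_t}} \right)} \right] && (94) \\
&\stackrel{(a)}{=} \mathsf{SNR} \cdot \mathbb{E}_{h_1}\left[ \frac{|h_1|^2}{1 + \frac{TP\, |h_1|^2}{N_c D e^{-h_t}}} \right] \cdot \mathbb{E}_{\{\chi_j,\, j > 1\}}\left[ \frac{\left| D e^{-h_t} - \left( 1 + \sum_{j > 1} \chi_j \right) \right|}{1 + \sum_{j > 1} \chi_j} \right] && (95) \\
&\leq \mathsf{SNR} \cdot \mathbb{E}\left[ |h_1|^2 \right] \cdot \mathbb{E}_{\{\chi_j,\, j > 1\}}\left[ \frac{\left| D e^{-h_t} - \left( 1 + \sum_{j > 1} \chi_j \right) \right|}{1 + \sum_{j > 1} \chi_j} \right] \triangleq \gamma_1 && (96)
\end{aligned}
$$

where (a) follows from the fact that $h_1$ and $\{\chi_j,\, j > 1\}$ are independent.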
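To show the closeness of $\widehat{C}_{\mathrm{coh},1,\mathrm{LT}}(\mathsf{SNR})$ to $C_{\mathrm{coh},1,\mathrm{LT}}(\mathsf{SNR})$, we now produce an upper bound for $\gamma_1$ that tends to $0$ as $\mathsf{SNR} \rightarrow 0$. Our goal is to show that given any choice of $D$, $\frac{\gamma_1}{\mathsf{SNR}}$ is bounded. Consider

$$
\mathbb{E}_{\{\chi_j,\, j > 1\}}\left[ \frac{\left| D e^{-h_t} - \left( 1 + \sum_{j > 1} \chi_j \right) \right|}{1 + \sum_{j > 1} \chi_j} \right]
= \mathbb{E}_{\{\chi_j,\, j > 1\}}\left[ \left| \frac{D e^{-h_t}}{1 + \sum_{j > 1} \chi_j} - 1 \right| \right]
\stackrel{(a)}{\leq} \underbrace{\sqrt{ \mathbb{E}_{\chi_j}\left[ \left( \frac{D e^{-h_t}}{1 + \sum_{j > 1} \chi_j} \right)^2 + 1 - \frac{2 D e^{-h_t}}{1 + \sum_{j > 1} \chi_j} \right] }}_{\triangleq\, \gamma_2}
$$

where (a) is a consequence of the Cauchy-Schwarz inequality. Let $\mathcal{E}$ denote $e^{-h_t}$. We then have

$$
\gamma_2 \stackrel{(b)}{\leq} \sqrt{ 1 + D^2 \mathcal{E}^2 \cdot \mathbb{E}_{\chi_j}\left[ \frac{1}{\left( 1 + \sum_{j > 1} \chi_j \right)^2} \right] - \frac{2 D \mathcal{E}}{1 + (D - 1) \mathcal{E}} } \tag{97}
$$

where in (b) we have used the fact that $\mathbb{E}\left[ \frac{1}{X} \right] \geq \frac{1}{\mathbb{E}[X]}$ for a positive random variable $X$. We now estimate $\alpha \triangleq \mathbb{E}_{\chi_j}\left[ \frac{1}{\left( 1 + \sum_{j > 1} \chi_j \right)^2} \right]$. It is easy to check that

$$
\alpha = \sum_{i=0}^{D-1} \binom{D-1}{i} \frac{\mathcal{E}^{i} (1 - \mathcal{E})^{D - 1 - i}}{(i + 1)^2}. \tag{98}
$$

Noting that

$$
(1 + y)^{D - 1} = \sum_{i=0}^{D-1} \binom{D-1}{i} y^{i} \tag{99}
$$

and integrating both sides of (99) twice with respect to $y$ (and dropping the non-positive terms that arise from the constants of integration for $y \geq 0$), we have

$$
\frac{(1 + y)^{D + 1}}{D (D + 1)} \geq \sum_{i=0}^{D-1} \binom{D-1}{i} \frac{y^{i + 2}}{(i + 1)(i + 2)}. \tag{100}
$$

Using $y = \frac{\mathcal{E}}{1 - \mathcal{E}}$ in (100), we have

$$
\frac{1}{D (D + 1) \mathcal{E}^2} \geq \sum_{i=0}^{D-1} \binom{D-1}{i} \frac{\mathcal{E}^{i} (1 - \mathcal{E})^{D - 1 - i}}{(i + 1)(i + 2)}. \tag{101}
$$

Observe that $\frac{1}{(i + 1)^2} \leq \frac{2}{(i + 1)(i + 2)}$ for all $i \geq 0$, so that an upper bound for $\gamma_2$ is

$$
\gamma_2 \leq \sqrt{ 1 + \frac{2 D^2 \mathcal{E}^2}{D (D + 1) \mathcal{E}^2} - \frac{2 D \mathcal{E}}{1 + (D - 1) \mathcal{E}} } = \sqrt{ \frac{D^2 \mathcal{E} - 4 D \mathcal{E} + 3 D - \mathcal{E} + 1}{(D + 1)(D \mathcal{E} - \mathcal{E} + 1)} } \tag{102}
$$

which is bounded for any choice of $D$. (In fact, the upper bound converges to $1$ as $D \rightarrow \infty$.) Note that the bound in (102) is loose and one might expect that $\frac{\gamma_1}{\mathsf{SNR}} \rightarrow 0$ as $D \rightarrow \infty$ as a consequence of the law of large numbers. However, for our purpose, the proposed loose upper bound in (102) is sufficient.

B. Proof of Proposition 1

To compute $p_i \triangleq \Pr\left( \sum_{j=1}^{i} \chi(|h_j|^2 \geq h_t) \leq A D e^{-h_t} \right)$, we need the following result [27, Theorem 2.8, p. 57] on the tail probability of a sum of independent random variables.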
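Lemma 1: Let $X_i$, $i = 1, \cdots, n$ be independent random variables with $\mathbb{E}[X_i] = 0$ and $\mathbb{E}[X_i^2] = \sigma_i^2$. Define $B_n = \sum_{i=1}^{n} \sigma_i^2$. If there exists a positive constant $H$ such that

$$
\mathbb{E}[X_i^m] \leq \frac{1}{2}\, m!\, \sigma_i^2 H^{m - 2} \tag{103}
$$

for all $i$, then for $x \geq \frac{B_n}{H}$ we have $\Pr\left( \sum_{i=1}^{n} X_i > x \right) \leq \exp\left( - \frac{x}{4H} \right)$. If $x \leq \frac{B_n}{H}$, then we have $\Pr\left( \sum_{i=1}^{n} X_i > x \right) \leq \exp\left( - \frac{x^2}{4 B_n} \right)$.

To apply Lemma 1, we set $n = i$ and $X_j = \chi(|h_j|^2 \geq h_t) - \mathbb{E}[\chi(|h_j|^2 \geq h_t)] = \chi(|h_j|^2 \geq h_t) - e^{-h_t} = \chi_j - \mathcal{E}$ for $j = 1, \cdots, i$. Then, a simple computation of the higher moments of $X_j$ implies that

$$
\mathbb{E}[X_j^2] = \sigma_j^2 = \mathcal{E}(1 - \mathcal{E}), \qquad B_i = i\, \mathcal{E}(1 - \mathcal{E}), \qquad \mathbb{E}[X_j^m] = \mathcal{E}(1 - \mathcal{E}) \cdot \left( (1 - \mathcal{E})^{m - 1} + (-1)^{m} \mathcal{E}^{m - 1} \right).
$$

It can be checked that $H = (1 - \mathcal{E})$ is sufficient to satisfy the conditions of Lemma 1. With this setting, we have

$$
\Pr\left( \sum_{j=1}^{i} \chi(|h_j|^2 \geq h_t) - i\, \mathcal{E} > (AD - i)\, \mathcal{E} \right) \leq
\begin{cases}
\exp\left( - \frac{(AD - i)\, \mathcal{E}}{4 (1 - \mathcal{E})} \right) & \text{if } i \leq \left\lfloor \frac{AD}{2} \right\rfloor, \\
\exp\left( - \frac{(AD - i)^2\, \mathcal{E}}{4 i (1 - \mathcal{E})} \right) & \text{if } i \geq \left\lfloor \frac{AD}{2} \right\rfloor + 1.
\end{cases} \tag{104}
$$

If $1 < A < 2$, with $\kappa = \frac{\mathcal{E}}{4 (1 - \mathcal{E})}$ and using (104), the following lower bound, $L$, holds for $\frac{\sum_{i=1}^{D} p_i}{D}$:

$$
\begin{aligned}
L &= 1 - \left[ e^{-AD\kappa} \sum_{i \leq \lfloor \frac{AD}{2} \rfloor} e^{i \kappa} + \sum_{i \geq \lfloor \frac{AD}{2} \rfloor + 1} e^{- \frac{(AD - i)^2 \kappa}{i}} \right] && (105) \\
&\stackrel{(a)}{\geq} 1 - \left[ e^{-\kappa (AD - 1)} \cdot \frac{e^{\kappa \lfloor \frac{AD}{2} \rfloor} - 1}{e^{\kappa} - 1} + \left( D - \left\lfloor \frac{AD}{2} \right\rfloor \right) e^{-(A - 1)^2 D \kappa} \right] && (106) \\
&\geq 1 - \left[ \frac{1}{e^{\kappa} - 1} \cdot e^{-\kappa \left( \frac{AD}{2} - 1 \right)} + \left( 1 + D (1 - A/2) \right) e^{-(A - 1)^2 D \kappa} \right] && (107)
\end{aligned}
$$

where (a) follows by first using $\frac{(AD - i)^2}{i} \geq (A - 1)^2 D$ for all $1 \leq i \leq D$ and then, upon further simplification, using the sum of a geometric series. If $A \geq 2$, we have the following lower bound to $\frac{\sum_{i=1}^{D} p_i}{D}$:

$$
L = 1 - \exp(-AD\kappa) \sum_{1 \leq i \leq D} e^{i \kappa} \approx 1 - e^{-\kappa \left( D (A - 1) - 1 \right)} \cdot \frac{1}{e^{\kappa} - 1}. \tag{108}
$$

With $h_t = \lambda \log\left( \frac{1}{\mathsf{SNR}} \right)$ as in (33), the dominant term of $\mathcal{E}$ is $\mathsf{SNR}^{\lambda}$ and hence that of $\kappa$ is $\frac{\mathsf{SNR}^{\lambda}}{4}$. With this choice of $h_t$ in (107) and (108) and simplifying, we obtain the desired bounds in (49) and (50).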
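The bounds in (107) and (108) are simple to evaluate numerically. The following minimal sketch computes the lower bound $L$ with $\mathcal{E} = \mathsf{SNR}^{\lambda}$ and $\kappa = \frac{\mathcal{E}}{4(1 - \mathcal{E})}$; the function name `lower_bound_L` and the parameter values in the example are hypothetical choices made only for illustration.

```python
import math


def lower_bound_L(dof: int, a: float, lam: float, snr: float) -> float:
    """Lower bound L on (1/D) * sum_i p_i, following (107) and (108)."""
    e = snr ** lam                       # E = exp(-h_t) with h_t = lam * log(1/snr)
    kappa = e / (4.0 * (1.0 - e))
    geo = 1.0 / math.expm1(kappa)        # 1 / (exp(kappa) - 1)
    if 1.0 < a < 2.0:
        # right-hand side of (107)
        first = geo * math.exp(-kappa * (a * dof / 2.0 - 1.0))
        second = (1.0 + dof * (1.0 - a / 2.0)) * math.exp(-((a - 1.0) ** 2) * dof * kappa)
        return 1.0 - first - second
    if a >= 2.0:
        # approximate form on the right-hand side of (108)
        return 1.0 - geo * math.exp(-kappa * (dof * (a - 1.0) - 1.0))
    raise ValueError("the analysis assumes A > 1")


if __name__ == "__main__":
    # Assumed values: SNR = 0.01 and lambda = 0.5, so that E = 0.1
    for dof in (50, 500, 2000):
        print(dof, lower_bound_L(dof, 1.5, 0.5, 1e-2), lower_bound_L(dof, 3.0, 0.5, 1e-2))
```

For small $D\, \mathsf{SNR}^{\lambda}$ the bound can be negative and hence vacuous; it becomes informative only once $D\, \mathsf{SNR}^{\lambda}$ is large, which is the regime singled out by (51).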
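It is also straightforward to check that when $D$ satisfies $D\, \mathsf{SNR}^{\lambda} + \lambda \log(\mathsf{SNR}) \rightarrow \infty$ as $\mathsf{SNR} \rightarrow 0$, $L \rightarrow 1$ in both cases.

C. Completing the Proof of Theorem 2

The choice of $h_t$ we study is $h_t = \epsilon \log\left( \frac{1}{\mathsf{SNR}} \right)$ for some $\epsilon > 0$. First, with this fixed choice of $h_t$, note that maximizing $\widehat{C}_{\mathrm{train},1,\mathrm{LT}}(\eta, N_c, \mathsf{SNR})$ is equivalent to setting its derivative (with respect to $\eta$) to zero. Then, it is straightforward to check that the derivative is

$$
\begin{aligned}
& \underbrace{\nu(\beta)\, \frac{h_t}{\eta}}_{\mathrm{I}}
+ \underbrace{\frac{h_t}{\eta} \log_e\left( 1 + \frac{(1 - \eta)(1 + \eta N_c \mathsf{SNR})\, h_t\, \mathsf{SNR}}{(1 - \eta) \mathsf{SNR} + \kappa_1 \kappa_2} \right)}_{\mathrm{II}} \\
& + \underbrace{\frac{\nu(\beta) - \frac{1}{\beta}}{\mathsf{SNR}\, \eta} \left\{ \kappa_1 \left( 1 - \frac{1}{N_c} \right) \left[ \frac{N_c \eta^2 \mathsf{SNR} + 2 \eta - 1}{(1 - \eta)^2} + \frac{h_t (1 + \eta N_c \mathsf{SNR})}{\eta N_c \mathsf{SNR} (1 - \eta)} \right] - \mathsf{SNR} (h_t + 1) \right\}}_{\mathrm{III}} \\
& + \underbrace{\frac{h_t\, \mathsf{SNR}^2 N_c\, \eta}{(1 - \eta) \mathsf{SNR} + \kappa_1 \kappa_2} \cdot \frac{N_c \mathsf{SNR}^2 (1 - \eta)^2 - \kappa_1 \kappa_2 (1 + \eta\, \mathsf{SNR}\, N_c) \left( 1 + \frac{h_t (1 - \eta)}{N_c \eta^2 \mathsf{SNR}} \right)}{(1 - \eta) \mathsf{SNR} + \kappa_1 \kappa_2 + (1 - \eta)(1 + \eta N_c \mathsf{SNR})\, h_t\, \mathsf{SNR}}}_{\mathrm{IV}}.
\end{aligned} \tag{109}
$$

For simplicity, we will denote the four terms in (109) by I, II, III and IV. We will further assume that $\eta = \mathsf{SNR}^{x}$, $x \geq 0$ and $N_c = \frac{1}{\mathsf{SNR}^{y}}$, $y > 0$. For a given choice of $\epsilon$, our goal is to determine the relationship between $x$ and $y$ such that the derivative in (109) can be zero. We consider three cases: i) $y > 1 + x$, ii) $y < 1 + x$ and iii) $y = 1 + x$.

Case i: First, note that $\eta N_c \mathsf{SNR} = \mathsf{SNR}^{-z}$ for some $z > 0$. The dominant terms of $\beta$ can be seen to be $\frac{1}{\mathsf{SNR}^{1 - \epsilon}} + \epsilon \log\left( \frac{1}{\mathsf{SNR}} \right)$ and thus, up to first order, $\beta = \frac{1}{\mathsf{SNR}^{1 - \epsilon}}$. Similarly, $(1 - \eta) \mathsf{SNR} + \kappa_1 \kappa_2$ up to first order equals $\mathsf{SNR}^{\epsilon - z}$. Note from [25, 5.1.20, p. 229] that $\nu(\beta) = O\left( \frac{1}{\beta} \right)$ if $\beta \rightarrow \infty$ and hence I is $\epsilon \log\left( \frac{1}{\mathsf{SNR}} \right) \frac{1}{\mathsf{SNR}^{\epsilon + x - 1}}$. It can also be checked that II is $\left( \epsilon \log\left( \frac{1}{\mathsf{SNR}} \right) \right)^2 \frac{1}{\mathsf{SNR}^{\epsilon + x - 1}}$, that $\nu(\beta) - \frac{1}{\beta} = O\left( \frac{1}{\beta^2} \right)$ and hence that III is $\epsilon \log\left( \frac{1}{\mathsf{SNR}} \right) \frac{1}{\mathsf{SNR}^{\epsilon + x - 1}}$ as long as $y < 1 + 2x$. Under the same assumption, $y < 1 + 2x$, IV is $- \left( \epsilon \log\left( \frac{1}{\mathsf{SNR}} \right) \right)^2 \frac{1}{\mathsf{SNR}^{\epsilon + x - 1}}$. Thus, by adjusting the constants, the derivative can be set to zero in this case.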
If $y \geq 1 + 2x$, I and II remain unchanged, but III is $\mathsf{SNR}^{2+x-y-\epsilon}$ and IV is $-\epsilon \log\left(\frac{1}{\mathsf{SNR}}\right) \mathsf{SNR}^{2+x-y-\epsilon}$. By comparing the coefficients, we see that the only way the derivative can be zero is if $y = 1 + 2x$.

Case ii: In this case, the first-order terms show the following behavior. With $w = 1 + x - y > 0$, I is $\mathsf{SNR}^{w-x}$, II is $\epsilon \log\left(\frac{1}{\mathsf{SNR}}\right) \log\log\left(\frac{1}{\mathsf{SNR}}\right) \frac{1}{\mathsf{SNR}^{x}}$, III is $-\mathsf{SNR}^{2w-x} \frac{1}{\epsilon \log\left(\frac{1}{\mathsf{SNR}}\right)}$, and IV is $\mathsf{SNR}^{2-2y+x}$. It can be seen that the derivative can never be zero and hence this case is ruled out.

Case iii: In this case, based on a similar analysis, we see that the derivative can again be set to zero.

Therefore, if $\epsilon \in (0, 1)$, $x \geq 0$ and $1 + x < y \leq 1 + 2x$, we have

$$\widehat{C}_{\mathrm{train},1,\mathrm{LT}}(\mathsf{SNR}) \geq \mathsf{SNR}^{\epsilon} \log\left(1 + \frac{\epsilon \log\left(\frac{1}{\mathsf{SNR}}\right) \mathsf{SNR}^{1-\epsilon} \left(1 - \mathsf{SNR}^{x}\right)}{1 - \mathsf{SNR}^{y}}\right) + \mathsf{SNR}. \tag{110}$$

Thus, $\widehat{C}_{\mathrm{train},1,\mathrm{LT}}(\mathsf{SNR})$ is, up to first order, the same as $\widehat{C}_{\mathrm{coh},1,\mathrm{LT}}(\mathsf{SNR})$ and $C_{\mathrm{coh},1,\mathrm{LT}}(\mathsf{SNR})$.

If $y = 1 + x$ and $\eta N_c \mathsf{SNR} = a$ for some choice of $a$ (positive, finite and independent of $\mathsf{SNR}$), we need $a > \frac{\epsilon}{1-\epsilon}$ and we have

$$\widehat{C}_{\mathrm{train},1,\mathrm{LT}}(\mathsf{SNR}) \geq \mathsf{SNR}^{\frac{\epsilon(1+a)}{a}} \log\left(1 + \epsilon\, \mathsf{SNR}^{1 - \frac{\epsilon(1+a)}{a}} \log\left(\frac{1}{\mathsf{SNR}}\right)\right) + \frac{a}{1+a} \cdot \mathsf{SNR}. \tag{111}$$

If $y < 1 + x$, the training scheme is strictly sub-optimal (in the low-$\mathsf{SNR}$ limit) from an ergodic capacity point of view. Putting things together, we obtain the desired condition, $\mu > 1$.
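As a numerical illustration of the boundary case $y = 1 + x$, the following minimal sketch evaluates the right-hand side of the bound in (111) and compares it with the expression $\epsilon\, \mathsf{SNR} \log(1/\mathsf{SNR}) + \frac{a}{1+a} \mathsf{SNR}$ obtained by replacing $\log(1+z)$ with $z$. The function names `train_bound_111` and `first_order_111` are hypothetical, natural logarithms are assumed (the base is not fixed in this passage), and the formula coded here is a reading of (111) rather than a verified implementation of the authors' analysis.

```python
import math


def train_bound_111(snr: float, eps: float, a: float) -> float:
    """Right-hand side of (111) as read here (assumption: natural log).

    snr : per-dimension SNR, 0 < snr < 1
    eps : threshold exponent, 0 < eps < 1
    a   : the constant eta * N_c * SNR in the case y = 1 + x
    """
    if not (0.0 < snr < 1.0):
        raise ValueError("snr must lie in (0, 1)")
    if not (0.0 < eps < 1.0):
        raise ValueError("eps must lie in (0, 1)")
    # The text requires a > eps / (1 - eps); this is equivalent to the
    # exponent eps * (1 + a) / a being strictly below 1.
    if a <= eps / (1.0 - eps):
        raise ValueError("need a > eps / (1 - eps)")

    eps_eff = eps * (1.0 + a) / a
    log_inv_snr = math.log(1.0 / snr)
    z = eps * snr ** (1.0 - eps_eff) * log_inv_snr
    return snr ** eps_eff * math.log1p(z) + a / (1.0 + a) * snr


def first_order_111(snr: float, eps: float, a: float) -> float:
    """Value of (111) with log(1 + z) replaced by z."""
    return eps * snr * math.log(1.0 / snr) + a / (1.0 + a) * snr


if __name__ == "__main__":
    for snr in (1e-2, 1e-4, 1e-6):
        bound = train_bound_111(snr, eps=0.2, a=1.0)
        approx = first_order_111(snr, eps=0.2, a=1.0)
        print(f"SNR={snr:.0e}  bound={bound:.4e}  first-order={approx:.4e}")
```

Because $\log(1+z) \leq z$, the bound never exceeds the first-order expression, and the two approach each other as $\mathsf{SNR}$ decreases. The factor $\frac{a}{1+a}$ on the linear term is where this case differs from (110), whose linear term is $\mathsf{SNR}$ itself.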