
An Improved Lower Bound on Cardinality of Support of the Amplitude-Constrained AWGN Channel

Haiyang Wang, Luca Barletta, Alex Dytso

March 26, 2026

Abstract

We study the amplitude-constrained additive white Gaussian noise channel. It is well known that the capacity-achieving input distribution for this channel is discrete and supported on finitely many points. The best known bounds show that the support size of the capacity-achieving distribution is lower-bounded by a term of order $A$ and upper-bounded by a term of order $A^2$, where $A$ denotes the amplitude constraint. It was conjectured in [1] that the linear scaling is optimal. In this work, we establish a new lower bound of order $A\sqrt{\log A}$, improving the known bound and ruling out the conjectured linear scaling. To obtain this result, we quantify the fact that the capacity-achieving output distribution is close to the uniform distribution in relative entropy. Next, we introduce a wrapping operation that maps the problem to a compact domain and develop a theory of best approximation of the uniform distribution by finite Gaussian mixtures. These approximation bounds are then combined with stability properties of capacity-achieving distributions to yield the final support-size lower bound.

1 Introduction

We consider an additive white Gaussian noise (AWGN) channel subject to a peak-power constraint. The channel output is given by

    Y = X + Z,   (1)

where the input random variable $X$ satisfies the constraint $|X| \le A$ almost surely (a.s.), and $Z$ is a standard normal random variable independent of $X$. We are interested in the channel capacity, defined as

    C(A) = \max_{X : |X| \le A} I(X; Y),   (2)

where $I(X; Y)$ denotes the mutual information between $X$ and $Y$. We denote by $X^*$ a capacity-achieving input random variable and by $Y^*$ the corresponding induced output random variable.
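Although $C(A)$ has no known closed form, it can be approximated numerically, e.g., via the Blahut–Arimoto algorithm [10, 11] on a discretized channel. The following Python sketch is purely illustrative and not part of the paper: the grids, iteration count, and function name are our own ad hoc choices.

```python
import numpy as np

def awgn_capacity_ba(A, n_in=41, n_out=801, iters=500):
    """Crude Blahut-Arimoto estimate (in nats) of C(A) for Y = X + Z with
    |X| <= A and Z ~ N(0, 1), using fixed input/output grids."""
    x = np.linspace(-A, A, n_in)
    y = np.linspace(-A - 6.0, A + 6.0, n_out)
    # discretized channel law p(y | x); each row normalized to a pmf
    W = np.exp(-0.5 * (y[None, :] - x[:, None]) ** 2)
    W /= W.sum(axis=1, keepdims=True)
    p = np.full(n_in, 1.0 / n_in)          # start from the uniform input
    for _ in range(iters):
        q = p @ W                          # induced output distribution
        # D(p(.|x) || q) for every grid input x
        d = np.where(W > 0, W * np.log(W / q[None, :]), 0.0).sum(axis=1)
        p *= np.exp(d)                     # Blahut-Arimoto update
        p /= p.sum()
    q = p @ W
    d = np.where(W > 0, W * np.log(W / q[None, :]), 0.0).sum(axis=1)
    return float(p @ d)                    # mutual information of the final input

print(round(awgn_capacity_ba(1.0), 3))  # lies between Shannon's lower and McKellips' upper bound
```

As discussed later in the introduction, such numerical estimates become delicate for large $A$, which is one reason the conjectured scaling laws in the literature disagree.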
In general, both the exact value of the capacity $C(A)$ and the precise structure of the capacity-achieving distribution $X^*$ remain unknown. This channel was first studied by Shannon in his seminal work [2], where initial upper and lower bounds on the capacity were derived. Shannon also showed that, in the low signal-to-noise ratio regime, the peak-power constrained capacity exhibits the same asymptotic behavior as the capacity under an average-power (second-moment) constraint.

Subsequent progress on characterizing both the capacity and the structure of the optimal input distribution was made by Smith [3, 4]. In particular, Smith established that the capacity-achieving input distribution is unique, symmetric about the origin, and discrete with finitely many mass points. Moreover, Smith showed that for all $A < 0.1$, the capacity-achieving distribution is equiprobable on the two-point set $\{\pm A\}$, which effectively established the capacity for this regime. This result was later sharpened by Sharma and Shamai [5], who showed that an equiprobable binary input supported on $\{\pm A\}$ is optimal if and only if $A \le \bar{A} \approx 1.665$, where $\bar{A}$ is characterized as the solution to a certain integral equation. Furthermore, they demonstrated that a ternary input supported on $\{-A, 0, A\}$ is capacity-achieving for all $\bar{A} \le A \le \bar{\bar{A}} \approx 2.786$.

* Haiyang Wang is with the Department of Applied and Computational Mathematics, Yale University, New Haven, CT 06511, USA (e-mail: haiyang.wang1024@gmail.com).
† Luca Barletta is with the Dipartimento di Elettronica, Informazione e Bioingegneria, Politecnico di Milano, 20133 Milano, Italy (e-mail: luca.barletta@polimi.it).
‡ Alex Dytso is with Qualcomm Flarion Technology, Inc., Bridgewater, NJ 08807, USA (e-mail: odytso2@gmail.com).
They also conjectured that, as $A$ increases, the support of the capacity-achieving distribution grows by at most one point at a time, with new mass points always appearing at zero.

Zhang [6] studied the asymptotic behavior of the capacity-achieving input and output distributions. He proved bulk asymptotic uniformity of the output on $[-A, A]$ [6, Thm. 4.16], argued that the least favorable prior approaches the Jeffreys prior in the bulk with a discrete boundary pattern near $\pm A$ [6, p. 56], and, for equally weighted, equally spaced priors on $[-A, A]$, showed that the optimal spacing is $\frac{2\pi(1 + o(1))}{\sqrt{\ln A}}$ [6, p. 91]. He also remarked that on $[0, \infty)$ the spacing of the improper asymptotic prior $W^*$ should be proportional to $1/\sqrt{\ln x}$ [6, p. 95]; together with the boundary-corrected constructions on p. 96, this suggests heuristic support growth $\Theta(A\sqrt{\ln A})$, not a rigorous non-asymptotic lower bound. In contrast, our work gives explicit non-asymptotic bounds.

Dytso et al. [1] established rigorous upper and lower bounds on the cardinality of the capacity-achieving input distribution for the amplitude-constrained AWGN channel, of orders $A^2$ and $A$, respectively. Their upper bounds rely on Karlin's oscillation theorem [7] together with complex-analytic zero-counting arguments. The lower bound, on the other hand, relied on the fact that mutual information is bounded by the logarithm of the cardinality of the input support. Based on these results and supporting numerical evidence, the authors conjectured that the support size scales linearly with $A$.

Subsequent numerical investigations have suggested alternative asymptotic behaviors. In particular, Mattingly et al. [8] reported an empirical scaling of order $A^{4/3}$ as $A \to \infty$, based on experiments involving not only the AWGN channel but also several non-Gaussian models, including the binomial channel and certain two-dimensional channels.
A follow-up work by Abbott and Machta [9] provided a heuristic, physics-inspired argument in support of this scaling. As noted earlier, this is in addition to Zhang's spacing-based heuristic of roughly $A\sqrt{\ln A}$ support growth [6, pp. 91, 95, 96].

The diversity of these conjectured scaling laws can be partly attributed to the extreme numerical sensitivity of the underlying optimization problem. Numerical studies in this regime are highly susceptible to algorithmic bias and rounding errors, particularly for large $A$. In this limit, the nearly optimal output distribution in the interior approaches the uniform distribution, with meaningful deviations occurring only at the scale of numerical precision. Moreover, standard implementations based on the Blahut–Arimoto algorithm [10, 11] require nested numerical integrations of log-output probabilities, which further exacerbate numerical instability and bias. As a result, different numerical methodologies can lead to markedly different empirical scaling laws.

In addition to studying the structure of the capacity-achieving distribution, a large body of work has focused on upper and lower bounds on the capacity in (2). Broadly speaking, existing capacity upper bounds fall into three categories. The first relies on the maximum entropy principle [12, Chapter 12], upper bounding the output differential entropy $h(Y)$ under suitable moment constraints [13, 14]. The second is based on a dual characterization of capacity, where maximizing mutual information over the input distribution is reformulated as minimizing relative entropy over the output distribution; a suboptimal choice of the latter yields an explicit upper bound. Notable examples in this class include the McKellips bound [15] and the bound of Thangaraj and Kramer [16]; see also [17] for an in-depth exposition.
The third approach exploits the representation of mutual information as an integral involving the minimum mean square error (MMSE) [18], leading to an upper bound obtained by replacing the optimal estimator with a suboptimal one [19].

There also exists extensive literature on extending Smith's original proof strategy [3], as well as the associated cardinality bounds in [1], to a wide range of channel models. For complex and vector Gaussian channels, discreteness of the capacity-achieving input distribution and explicit cardinality bounds are established in [1, 20–24]. For additive noise channels with sufficiently regular noise densities, discreteness of the optimal input distribution is shown in [25, 26]. Proofs of discreteness and cardinality bounds for practically relevant Rayleigh fading channels can be found in [27–29]. Similar results for non-additive channels include the Poisson channel [13, 30] and the binomial channel [31]. A broader survey of optimization-based techniques for establishing discreteness of capacity-achieving distributions is provided in [32]. Extensions to multiuser channels can be found in [33–35]. In addition, a comprehensive overview of capacity results for point-to-point Gaussian and related channels is given in [36].

Our results also rely on recent advances in the theory of best approximation by finite Gaussian mixtures; see [37] and references therein. Indeed, our Theorem 2 closely parallels the arguments developed in [37].

1.1 Outline and Contributions

In what follows, Section 2 presents our main results and provides some discussion. Section 3 collects the main proofs. In particular, Section 3.1 introduces a wrapping operation that maps real-valued random variables onto the circle and establishes lower bounds on how well a uniform distribution on $[-\pi, \pi)$ can be approximated by wrapped Gaussian mixtures. Section 3.2 provides several new properties of the optimal output distribution.
In particular, we establish an upper bound on the difference between the output distribution induced by a uniform input on $[-A, A]$ and that induced by $X^*$. Section 3.3 presents the proof of the main result by combining the bounds developed in the preceding sections. The proof of Proposition 1 is deferred to Appendix A.

To give some intuition, the argument proceeds in two steps. First, we show that if the input has $K$ mass points, then the induced output distribution cannot approximate the uniform distribution on $[-A, A]$ better than $\exp(-c (K/A)^2)$; this is done via a wrapping argument and bounds on approximation by finite Gaussian mixtures. Second, we show that the output distribution induced by the capacity-achieving input is within $O(1/A)$ of the uniform distribution. Putting these two statements together yields the desired lower bound on $K$. Section 4 concludes the paper.

We conclude this section by presenting the relevant notation.

1.2 Notation

Throughout the paper, deterministic scalar quantities are denoted by lower-case letters and random variables are denoted by upper-case letters. We denote the distribution of a random variable $X$ by $P_X$. The support set of $P_X$ is denoted and defined as

    supp(P_X) = \{ x : \text{for every open set } D \ni x \text{ we have that } P_X(D) > 0 \}.   (3)

The notation $|\cdot|$, depending on the context, denotes either the absolute value or the cardinality of a set. All logarithms are taken with base $e$. The density of a standard normal is denoted by

    \varphi(x) = \frac{1}{\sqrt{2\pi}} e^{-\frac{x^2}{2}}, \quad x \in \mathbb{R}.

Given two probability distributions $P$ and $Q$, $P \ll Q$ denotes that $P$ is absolutely continuous with respect to $Q$, and $P \ll\gg Q$ denotes that $P \ll Q$ and $Q \ll P$. Let $p$ and $q$ be the probability density functions (pdfs) associated with $P$ and $Q$, respectively.
Then, we will require the following distances:

    Relative entropy: D(P \| Q) = \int p(x) \log \frac{p(x)}{q(x)} \, dx,   (4)

    \chi^2 divergence: \chi^2(P \| Q) = \int \frac{(p(x) - q(x))^2}{q(x)} \, dx,   (5)

with the understanding that $D$ and $\chi^2$ are equal to infinity if $P$ is not absolutely continuous with respect to $Q$.

2 Main Result

The main result of this work is the following theorem.

Theorem 1. Fix some $A > 0$ and let $P_{X^*}$ be the capacity-achieving input distribution in (2). Then,

    |supp(P_{X^*})| \ge \max\left\{ \frac{A \sqrt{\log^+(cA)}}{2\pi}, \; \sqrt{1 + \frac{2}{\pi e} A^2}, \; 2 \right\}   (6)

for some explicit constant $c > 0$, where $\log^+(x) := \max\{\log(x), 0\}$.

A few remarks are in order:

• The bound in (6) provides an asymptotic improvement over the previously known lower bound of order $A$ derived in [1]. Additionally, it disproves the conjecture made in [1] that the support scales as $A$.

• Our result rules out linear growth of the support size and shows that any valid scaling must be superlinear. Whether the correct scaling is $A\sqrt{\log A}$, $A \log A$, $A^{4/3}$, $A^2$, or another intermediate rate remains open.

• In (6), the constant can be taken to be

    c = \frac{1}{2 \zeta(2e)} \sqrt{\frac{2}{\pi e}} \approx 0.0337, \quad \zeta(t) = \frac{(t - 1)^2}{t - 1 - \log(t)}.   (7)

Consequently, the logarithmic term becomes positive once $A > \frac{1}{c} \approx 29.67$. Moreover, the $\frac{A \sqrt{\log^+(cA)}}{2\pi}$ lower bound becomes better than the $\sqrt{1 + \frac{2}{\pi e} A^2}$ lower bound once $A$ is roughly $3 \times 10^5$ or larger.

3 Proofs and Techniques

In this section, we collect the main proofs and techniques needed to show the main theorem. We begin by presenting results related to the best approximation with Gaussian mixtures. Next, we present a few new properties of the capacity-achieving input and output distributions. The proof of Proposition 1 is deferred to Appendix A. This section concludes with the proof of the main theorem.
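Before turning to the proofs, the explicit constants in Theorem 1 are easy to sanity-check numerically. The short Python sketch below is our own verification, not part of the paper; it evaluates $c$ from (7), the threshold $1/c$, and the crossover point at which the $\sqrt{\log}$ term of (6) overtakes the linear term.

```python
import math

def zeta(t):
    # zeta(t) = (t - 1)^2 / (t - 1 - log t), as in (7)
    return (t - 1) ** 2 / (t - 1 - math.log(t))

c = math.sqrt(2 / (math.pi * math.e)) / (2 * zeta(2 * math.e))
print(round(c, 4))      # 0.0337
print(round(1 / c, 1))  # just under 30: where log^+(cA) turns positive

def bound_sqrtlog(A):   # first term in the max of (6)
    return A * math.sqrt(max(math.log(c * A), 0.0)) / (2 * math.pi)

def bound_linear(A):    # second term in the max of (6)
    return math.sqrt(1 + 2 * A ** 2 / (math.pi * math.e))

A = 10.0
while bound_sqrtlog(A) <= bound_linear(A):
    A *= 1.01           # geometric search for the crossover
print(f"sqrt-log bound takes over near A = {A:.2e}")  # roughly 3e5
```

This matches the remarks above: the new bound only improves on the bound of [1] for quite large amplitudes, even though it wins asymptotically.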
3.1 A Best Approximation Theory With Gaussian Mixtures

We begin by introducing a wrapping operation that maps real-valued random variables onto the circle, allowing approximation questions to be studied on a compact domain. Given a continuous random variable $W \in \mathbb{R}$ and a parameter $B > 0$, we define the following wrapping operator:

    \langle W \rangle_B = \frac{\pi}{B} (W \bmod 2B) \in [-\pi, \pi),   (8)

where $W \bmod 2B := W - 2B \left\lfloor \frac{W + B}{2B} \right\rfloor$. The next results summarize some properties of the wrapping operation that will be important in our derivations.

Lemma 1. Suppose that $W \in \mathbb{R}$ is a continuous random variable with pdf $f_W$. Then, the following statements hold for $B > 0$:

• The density of $\langle W \rangle_B$ is given by

    f_{\langle W \rangle_B}(\theta) = \frac{B}{\pi} \sum_{m \in \mathbb{Z}} f_W\left( \frac{B}{\pi} (\theta + 2\pi m) \right), \quad \theta \in [-\pi, \pi).   (9)

• Suppose that $U \sim \mathcal{U}(-B, B)$ is independent of $Z$. Then

    f_{\langle U + Z \rangle_B}(\theta) = \frac{1}{2\pi}, \quad \theta \in [-\pi, \pi).   (10)

• For any $X \in [-B, B]$ independent of $Z$, we have the following Fourier coefficients: for $n \in \mathbb{Z}$,

    \hat{f}_{\langle X + Z \rangle_B}(n) = \int_{-\pi}^{\pi} f_{\langle X + Z \rangle_B}(\theta) \exp(i n \theta) \, d\theta = \exp\left( -\frac{1}{2} \left( \frac{\pi n}{B} \right)^2 \right) \mathbb{E}\left[ \exp\left( i n \langle X \rangle_B \right) \right].   (11)

Proof. We only show the last statement. Fix $B > 0$ and let $V = \langle X + Z \rangle_B \in [-\pi, \pi)$. Since the map $w \mapsto \langle w \rangle_B$ is a reduction modulo $2\pi$ after scaling by $\pi/B$, and since $\theta \mapsto \exp(i n \theta)$ is $2\pi$-periodic, we have

    \exp(i n V) = \exp\left( i n \frac{\pi}{B} (X + Z) \right) \quad \text{a.s.}   (12)

Therefore, for $n \in \mathbb{Z}$,

    \hat{f}_{\langle X + Z \rangle_B}(n) = \int_{-\pi}^{\pi} f_{\langle X + Z \rangle_B}(\theta) \exp(i n \theta) \, d\theta   (13)
    = \mathbb{E}[\exp(i n V)]   (14)
    = \mathbb{E}\left[ \exp\left( i n \frac{\pi}{B} (X + Z) \right) \right]   (15)
    = \mathbb{E}\left[ \exp\left( i n \frac{\pi}{B} X \right) \right] \mathbb{E}\left[ \exp\left( i n \frac{\pi}{B} Z \right) \right],   (16)

where we used the independence of $X$ and $Z$. Finally, since $Z \sim \mathcal{N}(0, 1)$, $\mathbb{E}[\exp(i t Z)] = \exp(-t^2/2)$ for $t \in \mathbb{R}$, and substituting $t = \frac{\pi n}{B}$ yields

    \hat{f}_{\langle X + Z \rangle_B}(n) = \exp\left( -\frac{1}{2} \left( \frac{\pi n}{B} \right)^2 \right) \mathbb{E}\left[ \exp\left( i n \frac{\pi}{B} X \right) \right].   (17)

Since $X \in [-B, B]$ implies $\langle X \rangle_B = \frac{\pi}{B} X$ a.s., the last expectation equals $\mathbb{E}[\exp(i n \langle X \rangle_B)]$, which gives (11).
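Both parts of Lemma 1 are easy to check by simulation. The Python sketch below is an illustration under our own arbitrary choice of $B$ and of a three-point input; it implements the wrapping operator (8), confirms that a uniform input plus Gaussian noise wraps to an exactly flat density as in (10), and checks the Fourier-coefficient factorization (11) against Monte Carlo estimates.

```python
import numpy as np

rng = np.random.default_rng(0)
B = 3.0  # arbitrary illustrative choice of the wrapping parameter

def wrap(w, B):
    # <W>_B = (pi/B) * (W mod 2B) in [-pi, pi), as in (8)
    return (np.pi / B) * (np.mod(w + B, 2 * B) - B)

# Property (10): Unif(-B, B) plus N(0, 1) noise wraps to the uniform
# distribution on the circle, with constant density 1/(2*pi).
n = 400_000
u = rng.uniform(-B, B, n)
z = rng.standard_normal(n)
theta = wrap(u + z, B)
hist, _ = np.histogram(theta, bins=20, range=(-np.pi, np.pi), density=True)
print(np.max(np.abs(hist - 1 / (2 * np.pi))))  # small: the histogram is flat

# Property (11): for a discrete input X on {-B/2, 0, B/2} (an arbitrary
# example), the Fourier coefficient of the wrapped output factors into a
# Gaussian damping term times the trigonometric moment of <X>_B.
x = rng.choice([-B / 2, 0.0, B / 2], size=n)
v = wrap(x + z, B)
for k in (1, 2):
    empirical = np.mean(np.exp(1j * k * v))
    predicted = np.exp(-0.5 * (np.pi * k / B) ** 2) * np.mean(np.exp(1j * k * wrap(x, B)))
    print(abs(empirical - predicted))          # Monte Carlo error only
```

The exponential damping factor $\exp(-\frac{1}{2}(\pi n / B)^2)$ visible in the check is exactly what drives the approximation bounds developed next.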
The key result for providing a lower bound is the following theorem. It quantifies the best possible approximation of the wrapped uniform distribution by a wrapped Gaussian mixture with finitely many components. Its proof adopts the trigonometric moment method of [37, Thm. 7]; see also [38, Thm. 3].

Theorem 2. Let $X \in [-A, A]$ be a discrete random variable with $K > 1$ mass points and $U \sim \mathcal{U}[-A, A]$. Then,

    \chi^2\left( P_{\langle X + Z \rangle_A} \,\|\, P_{\langle U + Z \rangle_A} \right) \ge \frac{1}{2} \exp\left( -\frac{4 \pi^2 K^2}{A^2} \right).   (18)

Proof. Let us first introduce some preliminary notation. The $n$-th trigonometric moment of a random variable $W$ supported on the circle $[-\pi, \pi)$ is defined as $t_n(W) = \mathbb{E}[\exp(i n W)]$. Furthermore, we can define its $n$-th trigonometric moment matrix as follows:

    T_n(W) = \begin{pmatrix} t_0(W) & t_1(W) & \cdots & t_n(W) \\ t_{-1}(W) & t_0(W) & \cdots & t_{n-1}(W) \\ \vdots & \vdots & \ddots & \vdots \\ t_{-n}(W) & t_{-n+1}(W) & \cdots & t_0(W) \end{pmatrix},   (19)

which is an $(n+1) \times (n+1)$ Hermitian matrix. Additionally, note that, given a discrete random variable $W$ with distribution $P_W$ that has $K$ mass points, we can see that $T_n(W) = \sum_{k=1}^{K} P_W(w_k) T_n(w_k)$ is the sum of $K$ rank-1 matrices; therefore, it has rank at most $K$.

Now, using (11), the Fourier coefficients of the underlying densities are given by: for $n \in \mathbb{Z}$,

    \hat{f}_{\langle U + Z \rangle_A}(n) = \begin{cases} 1, & n = 0, \\ 0, & n \neq 0, \end{cases}   (20a)

    \hat{f}_{\langle X + Z \rangle_A}(n) = \exp\left( -\frac{\sigma^2 n^2}{2} \right) t_n(\langle X \rangle_A),   (20b)

where $\sigma^2 = \frac{\pi^2}{A^2}$. Consequently, we have the following characterization of the $\chi^2$-distance:

    \chi^2\left( P_{\langle X + Z \rangle_A} \,\|\, P_{\langle U + Z \rangle_A} \right) = 2\pi \int_{-\pi}^{\pi} \left( f_{\langle X + Z \rangle_A}(\theta) - \frac{1}{2\pi} \right)^2 d\theta   (21)
    = \sum_{n \in \mathbb{Z} : n \neq 0} \exp\left( -\sigma^2 n^2 \right) \left| t_n(\langle X \rangle_A) \right|^2,   (22)

where (21) follows from (10); and (22) follows from Parseval's theorem and the Fourier expressions in (20).
Furthermore, the $\chi^2$ divergence can be lower bounded as follows:

    \chi^2\left( P_{\langle X + Z \rangle_A} \,\|\, P_{\langle U + Z \rangle_A} \right) \ge \sum_{n \neq 0, |n| \le 2K} \exp\left( -\sigma^2 n^2 \right) \left| t_n(\langle X \rangle_A) \right|^2   (23)
    \ge \exp\left( -4 \sigma^2 K^2 \right) \sum_{n \neq 0, |n| \le 2K} \left| t_n(\langle X \rangle_A) \right|^2   (24)
    \ge \frac{\exp\left( -4 \sigma^2 K^2 \right)}{2K + 1} \sum_{n \neq 0, |n| \le 2K} (2K + 1 - |n|) \left| t_n(\langle X \rangle_A) \right|^2   (25)
    = \frac{\exp\left( -4 \sigma^2 K^2 \right)}{2K + 1} \left\| T_{2K}(\langle X \rangle_A) - I \right\|_F^2,   (26)

where $\|\cdot\|_F$ is the Frobenius norm of a matrix. Now, as was argued before, $T_{2K}(\langle X \rangle_A)$ is a matrix with rank at most $K$; therefore, by the Eckart–Young–Mirsky theorem [39], we have that

    \left\| T_{2K}(\langle X \rangle_A) - I \right\|_F^2 \ge \min_{B \in \mathbb{C}^{(2K+1) \times (2K+1)} : \, \operatorname{rank}(B) \le K} \| B - I \|_F^2 = K + 1.   (27)

Substituting (27) into (26), we obtain

    \chi^2\left( P_{\langle X + Z \rangle_A} \,\|\, P_{\langle U + Z \rangle_A} \right) \ge \frac{\exp\left( -4 \sigma^2 K^2 \right) (K + 1)}{2K + 1} \ge \frac{1}{2} \exp\left( -\frac{4 \pi^2 K^2}{A^2} \right),

which concludes the proof of the theorem.

If, in addition, the wrapped density is bounded, then we can upgrade the above result to a lower bound on the reverse relative entropy.

Corollary 1. Suppose that the assumption of Theorem 2 holds and assume that $\sup_x f_{\langle X + Z \rangle_A}(x) \le M$ for some constant $M > 0$. Then,

    D\left( P_{\langle U + Z \rangle_A} \,\|\, P_{\langle X + Z \rangle_A} \right) \ge \frac{1}{2 \zeta(2\pi M)} \exp\left( -\frac{4 \pi^2 K^2}{A^2} \right),   (28)

where

    \zeta(t) = \frac{(t - 1)^2}{t - 1 - \log(t)}.   (29)

Proof. The proof will require the following two inequalities. Suppose that $P \ll\gg Q$, $P \neq Q$, and $\beta_1 \in (0, 1)$, where

    \beta_1 = \inf_x \frac{dQ}{dP}(x);   (30)

then the following inequalities hold:

1. Bounding $\chi^2$ with relative entropy [40, Eq. (169)]:

    \frac{\chi^2(P \| Q)}{D(P \| Q)} \le \frac{1}{\kappa_2(\beta_1^{-1})},   (31)

where

    \kappa_2(t) = \frac{t \log t + (1 - t)}{(1 - t)^2}.   (32)

2. Ratio of relative entropies [40, Thm. 6]:

    \frac{D(P \| Q)}{D(Q \| P)} \le \kappa(\beta_1^{-1}),   (33)

with $\beta_1$ as in (30) and

    \kappa(t) = \frac{t \log t + (1 - t)}{t - 1 - \log(t)}.   (34)

Combining inequalities (31) and (33), we arrive at

    \frac{\chi^2(P \| Q)}{D(Q \| P)} = \frac{\chi^2(P \| Q)}{D(P \| Q)} \cdot \frac{D(P \| Q)}{D(Q \| P)} \le \frac{\kappa(\beta_1^{-1})}{\kappa_2(\beta_1^{-1})} = \zeta(\beta_1^{-1}),   (35)

where $\zeta(t)$ is defined in (29).

Now we apply the above inequality to our specific case and let $P = P_{\langle X + Z \rangle_A}$ and $Q = P_{\langle U + Z \rangle_A}$. First, note that by (9) and the positivity of the Gaussian density, the density of $P$ is positive and continuous on the circle, so $P \ll\gg Q$. Second, since $Q$ has density $q(x) = \frac{1}{2\pi}$ on $[-\pi, \pi)$ and $P$ has density $p(x) = f_{\langle X + Z \rangle_A}(x)$, the bound on $p$ implies

    \beta_1 = \inf_{x \in [-\pi, \pi)} \frac{dQ}{dP}(x) \ge \frac{1}{2\pi M}.   (36)

Since $t \mapsto \zeta(t)$ is increasing for $t \ge 1$, we have $\zeta(\beta_1^{-1}) \le \zeta(2\pi M)$. Combining this with the inequality in (35) and the bound in Theorem 2, we arrive at the bound in (28).

3.2 Some Properties of the Capacity-Achieving Distribution

In this section, we summarize some of the known properties of the capacity and capacity-achieving distributions. We begin by presenting the following well-known stability result [41, 42].

Lemma 2. Given a channel $P_{Y|X}$, suppose that $P_{X^*}$ is a capacity-achieving distribution. Then, for any $P_X$ we have that

    D(P_Y \| P_{Y^*}) \le I(X^*; Y^*) - I(X; Y),   (37)

where $P_X$ and $P_{X^*}$ induce $P_Y$ and $P_{Y^*}$, respectively, through the channel $P_{Y|X}$.¹

A result related to Lemma 2 is the following set of KKT conditions [4, 44].

Lemma 3. Consider the amplitude-constrained scalar additive Gaussian channel $Y = X + Z$, where the input $X$, satisfying $|X| \le A$ a.s., is independent of the noise $Z \sim \mathcal{N}(0, 1)$. The capacity-achieving distribution $P_{X^*}$ and induced output distribution $P_{Y^*}$ satisfy the following for $A > 0$:

    D(P_{Y|X}(\cdot|x) \| P_{Y^*}) \le C(A), \quad x \in [-A, A],   (38)
    D(P_{Y|X}(\cdot|x) \| P_{Y^*}) = C(A), \quad x \in \operatorname{supp}(P_{X^*}).   (39)

The next result provides upper and lower bounds on the capacity.
It also quantifies the stability of the optimal output distribution by bounding the relative entropy between the optimal output distribution $P_{X^* + Z}$ and the output distribution induced by a uniform input, $P_{U + Z}$.

Lemma 4. Let $U \sim \mathcal{U}[-A, A]$. Then, for $A > 0$,

    \frac{1}{2} \log\left( 1 + \frac{2 A^2}{\pi e} \right) \le I(U; U + Z) \le C(A) \le \log\left( 1 + \frac{\sqrt{2} A}{\sqrt{\pi e}} \right).   (40)

Moreover,

    D(P_{U + Z} \| P_{X^* + Z}) \le \sqrt{\frac{\pi e}{2}} \, \frac{1}{A}.   (41)

¹ If the capacity-achieving distribution does not exist, $I(X^*; Y^*)$ should be replaced by the capacity value. Note that, as shown in [43], the capacity-achieving output distribution $P_{Y^*}$ always exists and is unique. In our setting, the capacity-achieving input distribution $P_{X^*}$ exists and is unique, as shown in [3].

Proof. The lower bound in (40) is due to Shannon [2, Section 25] and the upper bound is due to McKellips [15]; see also [16, Sec. IV.A] for a complete proof of the McKellips bound. To show the bound on the relative entropy of the output distributions, note that

    D(P_{U + Z} \| P_{X^* + Z}) \le C(A) - I(U; U + Z)   (42)
    \le \log\left( 1 + \frac{\sqrt{2} A}{\sqrt{\pi e}} \right) - \frac{1}{2} \log\left( 1 + \frac{2 A^2}{\pi e} \right)   (43)
    = \log \frac{1 + \frac{\sqrt{2} A}{\sqrt{\pi e}}}{\sqrt{1 + \frac{2 A^2}{\pi e}}}   (44)
    \le \log\left( 1 + \frac{1}{\sqrt{\frac{2}{\pi e}} \, A} \right)   (45)
    \le \frac{1}{\sqrt{\frac{2}{\pi e}} \, A},   (46)

where (42) follows from Lemma 2; (43) follows from the bounds in (40); and (46) follows from the inequality $1 + u \le e^u$, $u \ge 0$.

We now present a few new properties of the optimal distributions.

Proposition 1. The capacity-achieving input and output distributions satisfy the following for $A > 0$:

(P1) Suppose that $y_0$ is a global maximum of $f_{X^* + Z}$; then there exists a point $x \in \operatorname{supp}(P_{X^*})$ such that

    |y_0 - x| \le 1.   (47)

(P2) Let $M_A = \max_{y \in \mathbb{R}} f_{X^* + Z}(y)$; then

    \frac{1}{\sqrt{2 \pi e} + 2A} \le e^{-C(A) - h(Z)} \le M_A \le e^{-C(A) - h(Z) + 1} \le \frac{e}{\sqrt{2 \pi e + 4 A^2}},   (48)

where $h(Z)$ is the differential entropy of $Z$.

(P3) For $|y| \ge A$,

    f_{X^* + Z}(y) \le M_A e^{-\frac{(|y| - A)^2}{2}}.   (49)

(P4)

    \sup_{\theta \in [-\pi, \pi)} f_{\langle X^* + Z \rangle_A}(\theta) \le \frac{e}{\pi}.   (50)
Proof. See Appendix A.

Remark 1. In [6, Thm. 4.16], Zhang showed that for any $0 < B < A$ with $A - B \to \infty$,

    \lim_{A \to \infty} \sup_{|y| \le B} \left| 2A f_{X^* + Z}(y) - 1 \right| = 0,   (51)

which implies that the output distribution $P_{X^* + Z}$ is asymptotically uniform in the bulk interior of $[-A, A]$. He also argued that the least favorable prior approaches the Jeffreys prior in the bulk with a discrete boundary correction near $\pm A$ [6, p. 56]. In the equally weighted, equally spaced class on $[-A, A]$, his Theorem 5.3 gives spacing $\frac{2\pi(1 + o(1))}{\sqrt{\ln A}}$ [6, p. 91]; for the least favorable improper prior over $[0, \infty)$ he further remarks that the spacing of $W^*$ should be proportional to $1/\sqrt{\ln x}$ [6, p. 95]. Together with the boundary-corrected constructions on p. 96, these heuristics point to support growth $\Theta(A \sqrt{\ln A})$ [6, p. 96]. In contrast, the bound in (48), which shows that $\max_y f_{X^* + Z}(y) = \Theta(1/A)$, that is, bounded above and below by positive constant multiples of $1/A$, is stronger in the sense that it provides a uniform, non-asymptotic guarantee. On the other hand, it is weaker in that it does not recover the optimal asymptotic constant.

3.3 Proof of the Main Theorem

We only show the first lower bound. The lower bound $\sqrt{1 + \frac{2}{\pi e} A^2}$ was shown in [1]. The bound $|\operatorname{supp}(P_{X^*})| \ge 2$ is immediate because $C(A) > 0$ for every $A > 0$ by (40), whereas any one-point input induces zero mutual information.

Denote $K = |\operatorname{supp}(P_{X^*})|$. From property (P4) of Proposition 1, we have that $f_{\langle X^* + Z \rangle_A} \le \frac{e}{\pi}$, so $2\pi M = 2e$ in Corollary 1. Therefore,

    \frac{1}{2 \zeta(2e)} \exp\left( -\frac{4 \pi^2 K^2}{A^2} \right) \le D\left( P_{\langle U + Z \rangle_A} \,\|\, P_{\langle X^* + Z \rangle_A} \right)   (52)
    \le D(P_{U + Z} \| P_{X^* + Z})   (53)
    \le \sqrt{\frac{\pi e}{2}} \, \frac{1}{A},   (54)

where (53) follows from the data processing inequality [45]; and (54) follows from (41). Rearranging the above bounds, we have that

    \frac{4 \pi^2 K^2}{A^2} \ge \log\left( \frac{1}{2 \zeta(2e)} \sqrt{\frac{2}{\pi e}} \, A \right).
Since the left-hand side is nonnegative, we may take the positive part and conclude that

    K \ge \frac{A}{2\pi} \sqrt{\log^+\left( \frac{1}{2 \zeta(2e)} \sqrt{\frac{2}{\pi e}} \, A \right)},

which proves Theorem 1 with $c = \frac{1}{2 \zeta(2e)} \sqrt{\frac{2}{\pi e}} \approx 0.0337$.

4 Conclusion

In this work, we derived a new lower bound on the cardinality of the capacity-achieving input distribution for the amplitude-constrained AWGN channel. Our result improves the previously known linear lower bound and establishes that the support size must grow superlinearly with the amplitude constraint, thereby ruling out linear scaling.

Several questions remain open. While our result shows superlinear growth, the exact asymptotic scaling, whether $A\sqrt{\log A}$, $A \log A$, $A^{4/3}$, $A^2$, or something in between, remains unresolved. Further refinements of the constants and a deeper understanding of the boundary behavior of the capacity-achieving distribution are of interest. Extending these techniques to other channel models, including vector Gaussian and fading channels, is another promising direction.

Beyond the refined bound itself, the main conceptual contribution of this work lies in the methodology used in the proof. Our approach combines a comparison between the output distribution induced by the capacity-achieving input and that induced by a uniform input with a wrapping argument that maps the problem to a compact domain. This transformation allows the problem to be studied through approximation of the uniform distribution by finite Gaussian mixtures, for which sharp lower bounds are available. We believe that this technique may be useful for studying other structural properties of optimal input distributions and for related problems involving Gaussian mixtures and approximation on compact domains.
In particular, it would be interesting to investigate whether a similar approach could be used to derive bounds on the support of the least-favorable distribution in the classical problem of estimating a bounded normal mean [46–48].

A Proof of Proposition 1

Proof. We begin with (P1). We claim that there exists a support point $x_k \in \operatorname{supp}(P_{X^*})$ such that $|y_0 - x_k| \le 1$. Recall that $f_{X^* + Z}(y) = \sum_i w_i \varphi(y - x_i)$, where $w_i > 0$. Suppose that $|y_0 - x_i| > 1$ for all $i$. Then

    f''_{X^* + Z}(y_0) = \sum_i w_i \left( (x_i - y_0)^2 - 1 \right) \varphi(y_0 - x_i) > 0,

which contradicts the fact that $y_0$ is a maximum. Thus, there must be at least one support point $x_k$ with $|y_0 - x_k| \le 1$.

We now prove (P2). Let $y_0$ be a global maximum of $f_{X + Z}$. Using a generalization of Tweedie's formula, also known as the Hatsell–Nolte identity [49], we have that [50, Eq. (54)]:

    \frac{d^2}{dy^2} \log f_{X + Z}(y) = \operatorname{Var}(X \mid X + Z = y) - 1 \ge -1,   (55)

since $\operatorname{Var}(X \mid X + Z = y) \ge 0$. Therefore, since $y_0$ is a global maximum, for any $y$ we have that

    \log f_{X + Z}(y) \ge \log f_{X + Z}(y_0) - \frac{1}{2} (y - y_0)^2.   (56)

Therefore, $f_{X + Z}(y) \ge M_A e^{-\frac{(y - y_0)^2}{2}}$, where $M_A := f_{X + Z}(y_0)$.

We now upper bound $M_A$. Starting with the KKT condition in (39), we have that for $x \in \operatorname{supp}(P_{X^*})$,

    -C(A) - h(Z) = -D(P_{Y|X}(\cdot|x) \| P_{X^* + Z}) - h(Z)   (57)
    = \int_{-\infty}^{\infty} \varphi(y - x) \log f_{X^* + Z}(y) \, dy   (58)
    \ge \int_{-\infty}^{\infty} \varphi(y - x) \left( \log M_A - \frac{1}{2} (y - y_0)^2 \right) dy   (59)
    = \log M_A - \frac{1}{2} \int_{-\infty}^{\infty} \varphi(y - x) (y - y_0)^2 \, dy   (60)
    = \log M_A - \frac{1}{2} \left( 1 + (x - y_0)^2 \right)   (61)
    \ge \log M_A - 1,   (62)

where (59) follows from the lower bound in (56); and (62) follows from the bound in (47). Consequently, we have that

    \log(M_A) \le -C(A) - h(Z) + 1   (63)
    \le -\frac{1}{2} \log\left( 1 + \frac{2 A^2}{\pi e} \right) - h(Z) + 1   (64)
    = \log\left( \frac{e}{\sqrt{2 \pi e + 4 A^2}} \right),   (65)

where in (64) we have used the lower bound on $C(A)$ in (40).
This concludes the proof of the upper bounds in (P2). To show the lower bound, we follow similar steps and note that for $x \in \operatorname{supp}(P_{X^*})$,

    -C(A) - h(Z) = -D(P_{Y|X}(\cdot|x) \| P_{X^* + Z}) - h(Z)   (66)
    = \int_{-\infty}^{\infty} \varphi(y - x) \log f_{X^* + Z}(y) \, dy   (67)
    \le \log f_{X^* + Z}(y_0) = \log(M_A).   (68)

The proof of the lower bounds in (P2) is concluded by using the upper bound $C(A) \le \log\left( 1 + \frac{\sqrt{2} A}{\sqrt{\pi e}} \right)$ in (40).

We now prove (P3). First, we establish a pointwise bound for the density in the tail region $y > A$. For any $x \in [-A, A]$ and $y > A$, we have $y - x \ge y - A > 0$. Let $y = A + t$ with $t > 0$ and note that

    (y - x)^2 - (A - x)^2 = (A + t - x)^2 - (A - x)^2   (69)
    = (A - x)^2 + 2t(A - x) + t^2 - (A - x)^2   (70)
    = 2t(A - x) + t^2.   (71)

Since $x \le A$ and $t > 0$, we have $2t(A - x) \ge 0$. Therefore, $(y - x)^2 - (A - x)^2 \ge t^2 = (y - A)^2$, which implies that

    \frac{\varphi(y - x)}{\varphi(A - x)} \le \exp\left( -\frac{1}{2} (y - A)^2 \right).   (72)

With the upper bound (72), we have that

    f_{X^* + Z}(y) = \int_{-A}^{A} \varphi(y - x) \, dP_{X^*}(x)   (73)
    \le \int_{-A}^{A} e^{-\frac{(y - A)^2}{2}} \varphi(A - x) \, dP_{X^*}(x)   (74)
    = e^{-\frac{(y - A)^2}{2}} f_{X^* + Z}(A)   (75)
    \le M_A e^{-\frac{(y - A)^2}{2}},   (76)

where in the last inequality we used $M_A = \max_{y \in \mathbb{R}} f_{X^* + Z}(y)$. Mirroring the same argument for $y < -A$, we arrive at the desired bound in (P3).

We now prove (P4). Starting with the expression for the wrapped density in (9),

    f_{\langle X^* + Z \rangle_A}(\theta) = \frac{A}{\pi} \sum_{k \in \mathbb{Z}} f_{X^* + Z}\left( \frac{A}{\pi} (\theta + 2\pi k) \right)   (77)
    = \frac{A}{\pi} f_{X^* + Z}\left( \frac{A}{\pi} \theta \right) + \frac{A}{\pi} \sum_{k \in \mathbb{Z}, k \neq 0} f_{X^* + Z}\left( \frac{A}{\pi} (\theta + 2\pi k) \right)   (78)
    \le \frac{A}{\pi} \frac{e}{\sqrt{2 \pi e + 4 A^2}} \left( 1 + \sum_{k \in \mathbb{Z}, k \neq 0} e^{-\frac{A^2 \left( \left| \frac{\theta}{\pi} + 2k \right| - 1 \right)^2}{2}} \right)   (79)
    = \frac{A}{\pi} \frac{e}{\sqrt{2 \pi e + 4 A^2}} \left( 1 + \sum_{m \in \mathbb{Z}} e^{-\frac{A^2 \left( 2m + 1 - \frac{\theta}{\pi} \right)^2}{2}} \right)   (80)
    \le \frac{A}{\pi} \frac{e}{\sqrt{2 \pi e + 4 A^2}} \left( 2 + 2 \sum_{m=1}^{\infty} e^{-2 A^2 m^2} \right)   (81)
    = \frac{A}{\pi} \frac{2e}{\sqrt{2 \pi e + 4 A^2}} \left( 1 + \sum_{m=1}^{\infty} e^{-2 A^2 m^2} \right)   (82)
    \le \frac{e}{\pi}.   (83)

Here (79) follows from (48) and (49).
To obtain (80), set $t = \frac{\theta}{\pi} \in [-1, 1)$ and reindex the tail terms by odd integers. For (81), write $u = \frac{1 - t}{2}$ so that the bracket becomes $1 + \sum_{m \in \mathbb{Z}} \exp(-2 A^2 (m + u)^2)$. By Poisson summation,

    \sum_{m \in \mathbb{Z}} \exp\left( -2 A^2 (m + u)^2 \right) = \frac{\sqrt{\pi}}{\sqrt{2} A} \left( 1 + 2 \sum_{n=1}^{\infty} e^{-\frac{\pi^2 n^2}{2 A^2}} \cos(2 \pi n u) \right),

which is maximized at $u = 0$; equivalently, the bracket is largest at $\theta = \pm \pi$ and equals $2 + 2 \sum_{m=1}^{\infty} e^{-2 A^2 m^2}$.

To conclude, we split into two cases. If $A \ge 1$, then $m^2 \ge m$ for $m \ge 1$ gives

    \sum_{m=1}^{\infty} e^{-2 A^2 m^2} \le \sum_{m=1}^{\infty} e^{-2 A^2 m} = \frac{e^{-2 A^2}}{1 - e^{-2 A^2}},

so

    \frac{A}{\pi} \frac{2e}{\sqrt{2 \pi e + 4 A^2}} \left( 1 + \sum_{m=1}^{\infty} e^{-2 A^2 m^2} \right) \le \frac{2e A}{\pi \sqrt{2 \pi e + 4 A^2}} \cdot \frac{1}{1 - e^{-2 A^2}}.

Using $1 - e^{-x} \ge \frac{x}{1 + x}$ for $x \ge 0$, we obtain $\frac{1}{1 - e^{-2 A^2}} \le \frac{1 + 2 A^2}{2 A^2}$, hence

    \frac{2e A}{\pi \sqrt{2 \pi e + 4 A^2}} \cdot \frac{1}{1 - e^{-2 A^2}} \le \frac{e}{\pi} \cdot \frac{1 + 2 A^2}{A \sqrt{2 \pi e + 4 A^2}} \le \frac{e}{\pi},

where the last step follows from $(1 + 2 A^2)^2 \le A^2 (2 \pi e + 4 A^2)$ for $A \ge 1$.

If $0 < A \le 1$, then the function $x \mapsto e^{-2 A^2 x^2}$ is decreasing on $[0, \infty)$, so

    \sum_{m=1}^{\infty} e^{-2 A^2 m^2} \le \int_0^{\infty} e^{-2 A^2 x^2} \, dx = \frac{\sqrt{\pi}}{2 \sqrt{2} A}.

Therefore,

    \frac{A}{\pi} \frac{2e}{\sqrt{2 \pi e + 4 A^2}} \left( 1 + \sum_{m=1}^{\infty} e^{-2 A^2 m^2} \right) \le \frac{2e A + e \sqrt{\pi/2}}{\pi \sqrt{2 \pi e + 4 A^2}}.

Finally, since $A \le 1$,

    \left( 2A + \sqrt{\pi/2} \right)^2 = 4 A^2 + 4 A \sqrt{\pi/2} + \pi/2 \le 4 A^2 + 4 \sqrt{\pi/2} + \pi/2 < 4 A^2 + 2 \pi e,

which implies

    \frac{2e A + e \sqrt{\pi/2}}{\pi \sqrt{2 \pi e + 4 A^2}} \le \frac{e}{\pi}.

This concludes the proof of (P4) and of the proposition.

References

[1] A. Dytso, S. Yagli, H. V. Poor, and S. Shamai (Shitz), "The capacity achieving distribution for the amplitude constrained additive Gaussian channel: An upper bound on the number of mass points," IEEE Trans. Inf. Theory, vol. 66, no. 4, pp. 2006–2022, 2020.

[2] C. E. Shannon, "A mathematical theory of communication," Bell Syst. Tech. J., vol. 27, pp. 379–423, 623–656, 1948.

[3] J. G. Smith, "On the Information Capacity of Peak and Average Power Constrained Gaussian Channels," Ph.D. dissertation, University of California, 1969.
[4] ——, "The information capacity of amplitude- and variance-constrained scalar Gaussian channels," Inform. and Contr., vol. 18, no. 3, pp. 203–219, 1971.

[5] N. Sharma and S. Shamai (Shitz), "Transition points in the capacity-achieving distribution for the peak-power limited AWGN and free-space optical intensity channels," Probl. Inf. Transm., vol. 46, no. 4, pp. 283–299, 2010.

[6] Z. Zhang, "Discrete noninformative priors," Ph.D. dissertation, Yale University, New Haven, CT, 1994.

[7] S. Karlin, "Decision theory for Pólya type distributions. Case of two actions, I," in Proceedings of the Third Berkeley Symposium on Mathematical Statistics and Probability, Volume 1: Contributions to the Theory of Statistics. Berkeley, Calif.: University of California Press, 1956, pp. 115–128.

[8] H. H. Mattingly, M. K. Transtrum, M. C. Abbott, and B. B. Machta, "Maximizing the information learned from finite data selects a simple model," Proceedings of the National Academy of Sciences, vol. 115, no. 8, pp. 1760–1765, 2018.

[9] M. C. Abbott and B. B. Machta, "A scaling law from discrete to continuous solutions of channel capacity problems in the low-noise limit," Journal of Statistical Physics, vol. 176, no. 1, pp. 214–227, 2019.

[10] R. Blahut, "Computation of channel capacity and rate-distortion functions," IEEE Trans. Inf. Theory, vol. 18, no. 4, pp. 460–473, 1972.

[11] S. Arimoto, "An algorithm for computing the capacity of arbitrary discrete memoryless channels," IEEE Trans. Inf. Theory, vol. 18, no. 1, pp. 14–20, 1972.

[12] T. Cover and J. Thomas, Elements of Information Theory: Second Edition. Wiley, 2006.

[13] S. Shamai (Shitz), "Capacity of a pulse amplitude modulated direct detection photon channel," IEE Proceedings I (Communications, Speech and Vision), vol. 137, no. 6, pp. 424–430, 1990.

[14] A. Dytso, M. Goldenbaum, H. V. Poor, and S. Shamai (Shitz), "Amplitude constrained MIMO channels: Properties of optimal input distributions and bounds on the capacity," Entropy, vol. 21, no. 2, p. 200, 2019.

[15] A. L. McKellips, "Simple tight bounds on capacity for the peak-limited discrete-time channel," in Proc. IEEE Int. Symp. Inf. Theory, Chicago, IL, 2004, pp. 348–348.

[16] A. Thangaraj, G. Kramer, and G. Böcherer, "Capacity bounds for discrete-time, amplitude-constrained, additive white Gaussian noise channels," IEEE Trans. Inf. Theory, vol. 63, no. 7, pp. 4172–4182, 2017.

[17] A. Lapidoth and S. M. Moser, "Capacity bounds via duality with applications to multiple-antenna systems on flat-fading channels," IEEE Trans. Inf. Theory, vol. 49, no. 10, pp. 2426–2467, 2003.

[18] D. Guo, S. Shamai (Shitz), and S. Verdú, "Mutual information and minimum mean-square error in Gaussian channels," IEEE Trans. Inf. Theory, vol. 51, no. 4, pp. 1261–1282, 2005.

[19] A. Dytso, R. Bustin, H. V. Poor, and S. Shamai (Shitz), "A view of information-estimation relations in Gaussian networks," Entropy, vol. 19, no. 8, p. 409, 2017.

[20] S. Shamai (Shitz) and I. Bar-David, "The capacity of average and peak-power-limited quadrature Gaussian channels," IEEE Trans. Inf. Theory, vol. 41, no. 4, pp. 1060–1071, 1995.

[21] T. H. Chan, S. Hranilovic, and F. R. Kschischang, "Capacity-achieving probability measure for conditionally Gaussian channels with bounded inputs," IEEE Trans. Inf. Theory, vol. 51, no. 6, pp. 2073–2088, 2005.

[22] B. Rassouli and B. Clerckx, "On the capacity of vector Gaussian channels with bounded inputs," IEEE Trans. Inf. Theory, vol. 62, no. 12, pp. 6884–6903, December 2016.

[23] A. Dytso, M. Al, H. V. Poor, and S. Shamai (Shitz), "On the capacity of the peak power constrained vector Gaussian channel: An estimation theoretic perspective," IEEE Trans. Inf. Theory, vol. 65, no. 6, pp. 3907–3921, 2019.

[24] A. Dytso, L. Barletta, and G. Kramer, "On 2 × 2 MIMO Gaussian channels with a small discrete-time peak-power constraint," in Proc. IEEE Int. Symp. Inf. Theory, 2024, pp. 2365–2370.

[25] A. Tchamkerten, "On the discreteness of capacity-achieving distributions," IEEE Trans. Inf. Theory, vol. 50, no. 11, pp. 2773–2778, 2004.

[26] J. Fahs and I. Abou-Faycal, "On properties of the support of capacity-achieving distributions for additive noise channel models with input cost constraints," IEEE Trans. Inf. Theory, vol. 64, no. 2, pp. 1178–1198, 2017.

[27] I. C. Abou-Faycal, M. D. Trott, and S. Shamai (Shitz), "The capacity of discrete-time memoryless Rayleigh-fading channels," IEEE Trans. Inf. Theory, vol. 47, no. 4, pp. 1290–1301, 2001.

[28] M. Katz and S. Shamai (Shitz), "On the capacity-achieving distribution of the discrete-time noncoherent and partially coherent AWGN channels," IEEE Trans. Inf. Theory, vol. 50, no. 10, pp. 2257–2270, 2004.

[29] A. Favano, L. Barletta, A. Dytso, and G. Kramer, "Non-coherent Rayleigh fading channels: Properties of the capacity-achieving input," IEEE Trans. Inf. Theory, vol. 71, no. 7, pp. 5258–5276, 2025.

[30] A. Dytso, L. Barletta, and S. Shamai (Shitz), "Properties of the support of the capacity-achieving distribution of the amplitude-constrained Poisson noise channel," IEEE Trans. Inf. Theory, vol. 67, no. 11, pp. 7050–7066, 2021.

[31] L. Barletta, I. Zieder, A. Favano, and A. Dytso, "Binomial channel: On the capacity-achieving distribution and bounds on the capacity," in Proc. IEEE Int. Symp. Inf. Theory, 2024, pp. 711–716.

[32] A. Dytso, M. Goldenbaum, H. V. Poor, and S. Shamai (Shitz), "When are discrete channel inputs optimal? - Optimization techniques and some new results," in Proc. Conf. on Inf. Sci. and Sys., Princeton, NJ, USA, March 2018, pp. 1–6.

[33] B. Mamandipoor, K. Moshksar, and A. K. Khandani, "Capacity-achieving distributions in Gaussian multiple access channel with peak power constraints," IEEE Trans. Inf. Theory, vol. 60, no. 10, pp. 6080–6092, 2014.

[34] O. Ozel, E. Ekrem, and S. Ulukus, "Gaussian wiretap channel with amplitude and variance constraints," IEEE Trans. Inf. Theory, vol. 61, no. 10, pp. 5553–5563, 2015.

[35] A. Dytso, M. Egan, S. M. Perlaza, H. V. Poor, and S. Shamai (Shitz), "Optimal inputs for some classes of degraded wiretap channels," in Proc. IEEE Inf. Theory Workshop, 2018, pp. 1–5.

[36] S. Verdú, "Fifty years of Shannon theory," IEEE Trans. Inf. Theory, vol. 44, no. 6, pp. 2057–2078, 1998.

[37] Y. Ma, Y. Wu, and P. Yang, "On the best approximation by finite Gaussian mixtures," IEEE Trans. Inf. Theory, vol. 71, no. 7, pp. 5469–5492, 2025.

[38] H. Wang, "Super-linear growth of the capacity-achieving input support for the amplitude-constrained AWGN channel," arXiv preprint arXiv:2510.20723, 2025.

[39] L. Mirsky, "Symmetric gauge functions and unitarily invariant norms," Quarterly Journal of Mathematics, vol. 11, pp. 50–59, 1960.

[40] I. Sason and S. Verdú, "f-divergence inequalities," IEEE Trans. Inf. Theory, vol. 62, no. 11, pp. 5973–6006, 2016.

[41] F. Topsøe, "An information theoretical identity and a problem involving capacity," Studia Scientiarum Mathematicarum Hungarica, vol. 2, pp. 291–292, 1967.

[42] I. Csiszár and J. Körner, Information Theory: Coding Theorems for Discrete Memoryless Systems. Cambridge University Press, 2011.

[43] J. Kemperman, "On the Shannon capacity of an arbitrary channel," in Indagationes Mathematicae (Proceedings), vol. 77, no. 2. North-Holland, 1974, pp. 101–115.

[44] A. Dytso, M. Goldenbaum, H. V. Poor, and S. Shamai (Shitz), "When are discrete channel inputs optimal?: Optimization techniques and some new results," in Proc. IEEE 52nd Annu. Conf. Info. Scien. and Syst. (CISS), Princeton, USA, 2018, pp. 1–6.

[45] Y. Polyanskiy and Y. Wu, Information Theory: From Coding to Learning. Cambridge, United Kingdom; New York, NY: Cambridge University Press, 2025.

[46] A. Dytso, H. V. Poor, R. Bustin, and S. Shamai, "On the structure of the least favorable prior distributions," in 2018 IEEE International Symposium on Information Theory (ISIT), 2018, pp. 1081–1085.

[47] J. C. Berry, "Minimax estimation of a bounded normal mean vector," Journal of Multivariate Analysis, vol. 35, no. 1, pp. 130–139, 1990.

[48] G. Casella and W. E. Strawderman, "Estimating a bounded normal mean," Ann. Statist., pp. 870–878, 1981.

[49] C. Hatsell and L. Nolte, "Some geometric properties of the likelihood ratio (Corresp.)," IEEE Transactions on Information Theory, vol. 17, no. 5, pp. 616–618, 1971.

[50] A. Dytso, H. V. Poor, and S. Shamai (Shitz), "Conditional mean estimation in Gaussian noise: A meta derivative identity with applications," IEEE Trans. Inf. Theory, vol. 69, no. 3, pp. 1883–1898, 2022.
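For completeness, the tail bound (P3) established above also admits a quick numerical check. The derivation (72)–(76) uses only $|X| \le A$, so it applies to any input supported on $[-A, A]$; the three-point input below is a hypothetical stand-in for the (unknown) capacity-achieving distribution:

```python
import math

A = 2.0
# Hypothetical discrete input on [-A, A]; steps (73)-(76) only require |X| <= A,
# not that this is the capacity-achieving distribution.
points = [-A, 0.0, A]
probs = [0.3, 0.4, 0.3]

def phi(z: float) -> float:
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def f_out(y: float) -> float:
    """Output density f_{X+Z}(y) of the Gaussian mixture."""
    return sum(p * phi(y - x) for p, x in zip(probs, points))

# M_A = max_y f_{X+Z}(y), approximated on a fine grid covering [-A-5, A+5]
# (the grid contains the support points, so the estimate is a valid bound
# on f_out(A) used in the check below).
grid = [i * 0.001 for i in range(-7000, 7001)]
M_A = max(f_out(y) for y in grid)

# (P3): f_{X+Z}(A + t) <= M_A * exp(-t^2 / 2) for all t > 0.
for t in [0.1, 0.5, 1.0, 2.0, 5.0]:
    assert f_out(A + t) <= M_A * math.exp(-0.5 * t * t)
print("tail bound (P3) holds at the sampled points")
```

By the symmetric argument in the proof, the same check with $y = -A - t$ verifies the left tail.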
