Redundancy Estimates for Word-Based Encoding of Sequences Produced by a Bernoulli Source


The efficiency of a code is estimated by its redundancy $R$, while its complexity is estimated by its average delay $\bar N$. In this work we construct word-based codes for which $R \lesssim \bar N^{-5/3}$; word-based codes can therefore attain the same redundancy as block codes while being much less complex. We also consider codes that are uniform on the output, whose benefit is the absence of running synchronization error. For such codes $R \asymp \bar N^{-1}$, except in the case when all input symbols are equiprobable, where $R \leqslant \bar N^{-2}$ for infinitely many $\bar N$.


💡 Research Summary

The paper investigates lossless compression of infinite sequences generated by a memoryless (Bernoulli) source using word-based codes, i.e., variable-length “words” drawn from the input alphabet that are mapped to variable-length codewords over a finite output alphabet of size $n$. The authors introduce two performance metrics: the average delay $\bar N = \sum_j p(A_j)\,|A_j|$ (the expected number of input symbols per codeword) as a measure of complexity, and the redundancy $R = \bar N^{-1}\sum_j p(A_j)\,|\phi(A_j)| - H/\log_2 n$ as a measure of inefficiency relative to the source entropy $H$.
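To make the two metrics concrete, here is a minimal Python sketch that computes $\bar N$ and $R$ for a toy word-based code. The source, the word set, and the codeword lengths below are invented for illustration and do not come from the paper:

```python
from math import log2

# Hypothetical Bernoulli source over {a, b} (not from the paper)
p = {"a": 0.8, "b": 0.2}
n = 2  # output alphabet size (binary)

# A complete prefix set of input words A_j, mapped to assumed
# codeword lengths |phi(A_j)| (e.g. "aa" -> "0", "ab" -> "10", "b" -> "11")
words = {"aa": 1, "ab": 2, "b": 2}  # word -> codeword length

def word_prob(w):
    """p(A_j): product of symbol probabilities, since the source is memoryless."""
    prob = 1.0
    for s in w:
        prob *= p[s]
    return prob

# Average delay: \bar N = sum_j p(A_j) |A_j|
N_bar = sum(word_prob(w) * len(w) for w in words)

# Source entropy per input symbol, converted to n-ary units: H / log2(n)
H_n = -sum(q * log2(q) for q in p.values()) / log2(n)

# Redundancy: R = \bar N^{-1} sum_j p(A_j) |phi(A_j)| - H / log2(n)
R = sum(word_prob(w) * length for w, length in words.items()) / N_bar - H_n

print(f"average delay = {N_bar:.3f}, redundancy = {R:.4f}")
```

Note that the word probabilities of a complete prefix set sum to 1, so both averages are genuine expectations; the redundancy comes out positive, as it must for any uniquely decodable code.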

A central technical tool is the Kraft inequality for the output lengths, expressed through the slack $\delta = 1 - \sum_j n^{-|\phi(A_j)|} > 0$. By defining the error term $\varepsilon_j = |\phi(A_j)| + \log_n p(A_j)$ and bounding it, …
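The Kraft slack $\delta$ and the error terms $\varepsilon_j$ can be checked numerically. The probabilities and lengths below are a hypothetical example chosen so that the Kraft inequality holds strictly; they are not taken from the paper:

```python
from math import log

n = 2  # output alphabet size
# (p(A_j), |phi(A_j)|) for a hypothetical prefix code with strict Kraft slack
code = [(0.64, 1), (0.16, 2), (0.20, 3)]

# Kraft slack: delta = 1 - sum_j n^{-|phi(A_j)|}
delta = 1 - sum(n ** -length for _, length in code)

def log_n(x):
    """Logarithm to base n."""
    return log(x) / log(n)

# Per-word error terms: eps_j = |phi(A_j)| + log_n p(A_j);
# eps_j measures how far |phi(A_j)| is from the ideal length -log_n p(A_j)
eps = [length + log_n(p_j) for p_j, length in code]

print(f"delta = {delta:.3f}")  # positive slack => strict Kraft inequality
print("eps_j =", [round(e, 3) for e in eps])
```

A positive $\delta$ confirms the lengths are realizable by a prefix code, and the signs of the $\varepsilon_j$ show which words are coded longer or shorter than their ideal lengths.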

