ItDPDM: Information-Theoretic Discrete Poisson Diffusion Model
Generative modeling of non-negative, discrete data, such as symbolic music, remains challenging due to two persistent limitations in existing methods. First, many approaches rely on modeling continuous embeddings, which is suboptimal for inherently discrete data distributions. Second, most models optimize variational bounds rather than the exact data likelihood, resulting in inaccurate likelihood estimates and degraded sampling quality. While recent diffusion-based models have addressed these issues separately, we tackle them jointly. In this work, we introduce the Information-Theoretic Discrete Poisson Diffusion Model (ItDPDM), inspired by the photon-arrival process, which combines exact likelihood estimation with fully discrete-state modeling. Central to our approach is an information-theoretic Poisson Reconstruction Loss (PRL) that has a provable exact relationship with the true data likelihood. ItDPDM achieves improved likelihood and sampling performance over prior discrete and continuous diffusion models on a variety of synthetic discrete datasets. Furthermore, on real-world datasets such as symbolic music and images, ItDPDM attains superior likelihood estimates and competitive generation quality, demonstrating a proof of concept for distribution-robust discrete generative modeling.
💡 Research Summary
The paper introduces ItDPDM (Information‑theoretic Discrete Poisson Diffusion Model), a novel diffusion‑based generative framework designed specifically for non‑negative discrete data such as symbolic music or quantized images. Existing diffusion models fall into four categories based on timestep (discrete vs. continuous) and latent space (discrete vs. continuous). Continuous‑state models (CTCS, DTCS) embed discrete data into a continuous space, which introduces a “discretization gap” and forces the model to learn probability density functions (pdfs) instead of the true probability mass functions (pmfs). Discrete‑time discrete‑state (DTDS) models such as Learning‑to‑Jump (LTJ) avoid this gap but rely on variational ELBO objectives that do not correspond exactly to the data log‑likelihood, leading to sub‑optimal likelihood estimates and degraded sample quality.
ItDPDM tackles both issues simultaneously by defining a Poisson noise channel: given a non‑negative integer input $x$, the noisy observation at signal‑to‑noise ratio (SNR) $\gamma$ is drawn from a Poisson distribution, $z_\gamma \sim \mathrm{Poisson}(\gamma x)$. Unlike Gaussian noise, Poisson corruption is non‑additive and non‑separable, which makes denoising more challenging but also aligns naturally with the discrete nature of the data.
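The Poisson channel above is simple to simulate. The sketch below (a minimal illustration, not code from the paper) corrupts non‑negative integer data at two SNR levels; note that $z_\gamma$ stays non‑negative and integer‑valued, and that an input of 0 is always mapped to 0 since $\mathrm{Poisson}(0)$ is degenerate at zero:

```python
import numpy as np

rng = np.random.default_rng(0)

def poisson_corrupt(x, gamma):
    """Pass non-negative integer data x through a Poisson channel
    at SNR gamma: z_gamma ~ Poisson(gamma * x)."""
    return rng.poisson(gamma * np.asarray(x))

x = np.array([0, 1, 4, 16])          # non-negative integer data
z_low = poisson_corrupt(x, 0.5)      # low SNR: heavy corruption
z_high = poisson_corrupt(x, 50.0)    # high SNR: z / gamma concentrates near x
print(z_low, z_high)
```

At high SNR, the normalized observation $z_\gamma / \gamma$ concentrates around $x$ (mean $x$, variance $x/\gamma$), which is what makes $\gamma \to \infty$ play the role of the noiseless endpoint of the diffusion.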
The core technical contribution is the Poisson Reconstruction Loss (PRL), an information‑theoretic loss that, unlike variational ELBO objectives, bears a provable exact relationship to the true data log‑likelihood.
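The paper's exact PRL definition is not reproduced in this summary. As an illustration of the kind of loss that is natural for a Poisson channel, the sketch below implements the Bregman divergence generated by the Poisson negative log‑likelihood, $\ell(x,\hat{x}) = \hat{x} - x - x\log(\hat{x}/x)$, which is non‑negative and zero only when $\hat{x} = x$ (this is an assumed stand‑in, not necessarily the paper's PRL):

```python
import numpy as np

def poisson_bregman_loss(x, x_hat, eps=1e-12):
    """Bregman divergence of the Poisson negative log-likelihood:
    l(x, x_hat) = x_hat - x - x * log(x_hat / x),
    with the convention 0 * log 0 = 0 so that x = 0 gives l = x_hat."""
    x = np.asarray(x, dtype=float)
    x_hat = np.asarray(x_hat, dtype=float)
    x_log_x = np.where(x > 0, x * np.log(np.maximum(x, eps)), 0.0)
    return x_hat - x + x_log_x - x * np.log(np.maximum(x_hat, eps))
```

Unlike squared error, this loss is asymmetric and scales with the count magnitude, matching the signal‑dependent variance of Poisson noise.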