Dynamic Novel View Synthesis in High Dynamic Range
High Dynamic Range Novel View Synthesis (HDR NVS) seeks to learn an HDR 3D model from Low Dynamic Range (LDR) training images captured under conventional imaging conditions. Current methods primarily focus on static scenes, implicitly assuming that all scene elements remain stationary and non-living. However, real-world scenarios frequently feature dynamic elements such as moving objects, varying lighting conditions, and other temporal events, presenting a significantly more challenging setting. To address this gap, we propose a more realistic problem named HDR Dynamic Novel View Synthesis (HDR DNVS), where the additional dimension "Dynamic" emphasizes the necessity of jointly modeling temporal radiance variations alongside sophisticated 3D translation between LDR and HDR. To tackle this complex, intertwined challenge, we introduce HDR-4DGS, a Gaussian Splatting-based architecture featuring an innovative dynamic tone-mapping module that explicitly connects the HDR and LDR domains, maintaining temporal radiance coherence by dynamically adapting tone-mapping functions to the evolving radiance distributions across the temporal dimension. As a result, HDR-4DGS achieves both temporal radiance consistency and spatially accurate color translation, enabling photorealistic HDR renderings from arbitrary viewpoints and time instances. Extensive experiments demonstrate that HDR-4DGS surpasses existing state-of-the-art methods in both quantitative performance and visual fidelity. Source code is available at https://github.com/prinasi/HDR-4DGS.
💡 Research Summary
The paper introduces a new problem setting called High‑Dynamic‑Range Dynamic Novel View Synthesis (HDR‑DNVS), which aims to reconstruct temporally coherent HDR radiance fields and dynamic geometry from only low‑dynamic‑range (LDR) multi‑exposure image sequences. Existing HDR‑NVS methods assume static scenes, while Dynamic NVS approaches operate solely in the LDR domain, leaving a gap for real‑world scenarios that involve both motion and extreme lighting. To fill this gap, the authors propose HDR‑4DGS, a framework built on Gaussian Splatting (GS) that extends the 3DGS representation to four dimensions (3‑D space + time) and incorporates a novel Dynamic Tone‑Mapper (DTM) that learns per‑channel tone‑mapping functions conditioned on temporal radiance context.
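To make the 4D extension concrete: a 4D Gaussian over (x, y, z, t) can be sliced at a query time t by standard Gaussian conditioning, where the temporal marginal density scales the Gaussian's contribution (how "active" it is at t) and the conditional spatial mean shifts with time to model motion. The sketch below uses NumPy with hypothetical toy values; it illustrates the general conditioning math, not the authors' implementation.

```python
import numpy as np

def slice_4d_gaussian(mu, cov, t):
    # Condition a 4D Gaussian (x, y, z, t) on a query time t.
    # Returns the conditional 3D mean/covariance and the marginal
    # temporal density, which scales the Gaussian's contribution at t.
    mu_xyz, mu_t = mu[:3], mu[3]
    s_xx = cov[:3, :3]          # spatial block
    s_xt = cov[:3, 3]           # spatio-temporal coupling
    s_tt = cov[3, 3]            # temporal variance
    cond_mu = mu_xyz + s_xt * (t - mu_t) / s_tt
    cond_cov = s_xx - np.outer(s_xt, s_xt) / s_tt
    # Marginal density over t: how "active" the Gaussian is at time t.
    temporal_w = np.exp(-0.5 * (t - mu_t) ** 2 / s_tt) / np.sqrt(2 * np.pi * s_tt)
    return cond_mu, cond_cov, temporal_w

# Toy Gaussian whose x-coordinate is correlated with time (i.e. it moves).
mu = np.array([0.0, 1.0, 2.0, 0.5])
cov = np.diag([0.1, 0.1, 0.1, 0.04])
cov[0, 3] = cov[3, 0] = 0.02
m, C, w = slice_4d_gaussian(mu, cov, t=0.7)
print(m)  # conditional mean: x has drifted from 0.0 to 0.1
```

Note how the nonzero x-t covariance makes the conditional mean drift linearly in time, which is how a single 4D Gaussian encodes simple motion without explicit deformation fields.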
In the 4DGS backbone, each Gaussian is parameterized by a mean μ = (μ_x, μ_y, μ_z, μ_t) and a 4×4 covariance matrix, allowing the model to represent both spatial position and the probability of a point being active at a given timestamp. Appearance is encoded with 4D spherical harmonics (4D-SH) combined with Fourier series, enabling smooth, time-varying color. The DTM first aggregates the average HDR color of all Gaussians at each time step into a radiance signature r_h(t). A sliding window of the past k signatures is fed into a Dynamic Radiance Context Learner (DRCL), which can be an RNN, LSTM, GRU, or Transformer, producing a context embedding f_t. The log-scaled HDR color is concatenated with the log exposure time log e_t and f_t, then passed through a lightweight per-channel MLP g_θ to obtain the corresponding LDR color. This design mimics the exposure-dependent camera response function while dynamically adapting to evolving illumination, thereby preserving temporal radiance consistency.
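The DTM data flow described above can be sketched end to end. The snippet below assumes a GRU as the DRCL and uses randomly initialized toy weights; all shapes, dimensions, and helper names are illustrative assumptions, not the authors' code.

```python
import numpy as np

rng = np.random.default_rng(0)

def radiance_signature(hdr_colors):
    # hdr_colors: (N, 3) HDR colors of all Gaussians at one timestep.
    # The signature r_h(t) is taken here as the per-channel mean radiance.
    return hdr_colors.mean(axis=0)

def gru_cell(x, h, W, U, b):
    # Minimal GRU cell standing in for the DRCL (the paper also allows
    # an RNN, LSTM, or Transformer here).
    def sig(v):
        return 1.0 / (1.0 + np.exp(-v))
    z = sig(W[0] @ x + U[0] @ h + b[0])          # update gate
    r = sig(W[1] @ x + U[1] @ h + b[1])          # reset gate
    n = np.tanh(W[2] @ x + U[2] @ (r * h) + b[2])  # candidate state
    return (1 - z) * h + z * n

def dynamic_tone_map(signatures, log_hdr, log_exposure, dim=8):
    # signatures: (k, 3) sliding window of past radiance signatures.
    # log_hdr: (3,) log-scaled HDR color; log_exposure: scalar log e_t.
    # Returns a per-channel LDR color in [0, 1]. Weights are random toys.
    W = rng.standard_normal((3, dim, 3)) * 0.1
    U = rng.standard_normal((3, dim, dim)) * 0.1
    b = np.zeros((3, dim))
    f_t = np.zeros(dim)
    for r_h in signatures:            # roll the window through the GRU
        f_t = gru_cell(r_h, f_t, W, U, b)
    ldr = np.empty(3)
    for c in range(3):                # per-channel MLP g_theta
        inp = np.concatenate([[log_hdr[c], log_exposure], f_t])
        W1 = rng.standard_normal((16, inp.size)) * 0.1
        W2 = rng.standard_normal(16) * 0.1
        hidden = np.maximum(W1 @ inp, 0.0)              # ReLU
        ldr[c] = 1.0 / (1.0 + np.exp(-(W2 @ hidden)))   # sigmoid -> [0, 1]
    return ldr

sigs = np.stack([radiance_signature(rng.random((100, 3))) for _ in range(4)])
out = dynamic_tone_map(sigs, np.log(rng.random(3) + 1e-4), np.log(0.125))
print(out.shape)
```

The sigmoid output mirrors the saturating shape of a camera response function, and conditioning on f_t is what lets the mapping drift as the scene's radiance distribution evolves over time.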
Training optimizes a combined loss L_total = L_ldr + α L_hdr. L_ldr consists of L1 + D‑SSIM terms between rendered LDR images (both from the tone‑mapped 2D rasterization and the direct 3DGS rendering) and the ground‑truth LDR inputs. When HDR ground truth is available, L_hdr measures the same loss between the rendered HDR image and the HDR reference; otherwise α = 0 and the model learns solely from LDR supervision. The dual‑supervision strategy stabilizes CRF learning and improves generalization, as demonstrated in ablation studies.
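The combined objective can be sketched as follows. Here `ssim_global` is a simplified windowless SSIM standing in for the full D-SSIM term, and the 0.2 mixing weight follows common Gaussian Splatting practice; both are assumptions for illustration, not values taken from the paper.

```python
import numpy as np

def ssim_global(a, b, c1=0.01**2, c2=0.03**2):
    # Simplified global SSIM (no sliding window), for illustration only.
    mu_a, mu_b = a.mean(), b.mean()
    va, vb = a.var(), b.var()
    cov = ((a - mu_a) * (b - mu_b)).mean()
    num = (2 * mu_a * mu_b + c1) * (2 * cov + c2)
    den = (mu_a**2 + mu_b**2 + c1) * (va + vb + c2)
    return num / den

def photometric_loss(pred, gt, lam=0.2):
    # L1 + D-SSIM mix; lam = 0.2 is the usual 3DGS weighting (assumed here).
    l1 = np.abs(pred - gt).mean()
    d_ssim = (1.0 - ssim_global(pred, gt)) / 2.0
    return (1 - lam) * l1 + lam * d_ssim

def total_loss(ldr_pred, ldr_gt, hdr_pred=None, hdr_gt=None, alpha=0.1):
    # L_total = L_ldr + alpha * L_hdr; alpha is effectively 0 when no
    # HDR ground truth is available.
    loss = photometric_loss(ldr_pred, ldr_gt)
    if hdr_pred is not None and hdr_gt is not None:
        loss += alpha * photometric_loss(hdr_pred, hdr_gt)
    return loss

rng = np.random.default_rng(1)
img = rng.random((32, 32, 3))
print(total_loss(img, img))  # identical images -> zero loss
```

In the paper's dual-supervision setup, `photometric_loss` would be applied to both the tone-mapped rasterization and the direct rendering against the LDR ground truth, with the optional HDR term added when references exist.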
To evaluate the approach, the authors release two benchmark suites: HDR‑4D‑Syn (8 synthetic scenes) and HDR‑4D‑Real (4 real‑world captures). Each sequence provides multi‑exposure LDR frames, synchronized HDR frames, and per‑frame 3D geometry. Compared against state‑of‑the‑art static HDR‑NVS methods (HDR‑NeRF, HDR‑GS, GaussHDR) and dynamic LDR‑NVS methods (D‑NeRF, Hyper‑NeRF, 4DGS variants), HDR‑4DGS achieves superior PSNR, SSIM, and HDR‑VDP‑2 scores, often improving by 2–3 dB. Qualitative results show better preservation of highlights, shadows, and color fidelity under rapid illumination changes. Importantly, HDR‑4DGS renders at 30–60 fps on a single RTX 4090, approaching real‑time performance.
Ablation experiments explore the impact of the context window length, the choice of DRCL architecture, and the weighting α. Longer windows capture slower illumination trends but may lag behind sudden lighting shifts; Transformers yield marginally higher accuracy than LSTMs at higher computational cost. Removing the DTM leads to noticeable temporal flickering and loss of HDR detail, confirming its critical role.
In summary, HDR‑4DGS offers a practical solution for dynamic HDR view synthesis from ordinary LDR video streams. By marrying the efficiency of Gaussian Splatting with a temporally adaptive tone‑mapping module, it simultaneously resolves geometric‑temporal coherence and radiometric fidelity—challenges that have limited prior NVS approaches. Future work may address more complex non‑linear lighting, camera calibration errors, and scaling to large‑scale streaming applications.