A framework for LISA population inference
The Laser Interferometer Space Antenna (LISA) is expected to have a source-rich data stream containing signals from large numbers of sources of many different types. This will include both individually resolvable signals and overlapping stochastic backgrounds, a regime intermediate between current ground-based detectors and pulsar timing arrays. The resolved sources and backgrounds will be fitted together in a high-dimensional Global Fit. To extract information about the astrophysical populations to which the sources belong, we need to decode the information in the Global Fit, which requires new methodology that has not been required for the analysis of current gravitational wave detectors. Here, we present a hierarchical Bayesian framework to infer the properties of astrophysical populations directly from the output of a LISA Global Fit, consistently accounting for information encoded in both the resolved sources and the unresolved background. Using a simplified model of the Global Fit, we illustrate how the interplay between resolved and unresolved components affects population inference and highlight the impact of data analysis choices, such as the signal-to-noise threshold for resolved sources, on the results. Our approach provides a practical foundation for population inference using LISA data.
Research Summary
The paper addresses a fundamental statistical challenge that will arise with the Laser Interferometer Space Antenna (LISA): the simultaneous presence of individually resolvable gravitational‑wave sources and a stochastic confusion background generated by the same underlying astrophysical population. Existing hierarchical Bayesian methods, developed for ground‑based detectors (LIGO/Virgo) and pulsar‑timing arrays (PTAs), treat either a set of independent resolved events or a purely stochastic signal, but they cannot accommodate a mixed regime in which the same population contributes to both components.
The authors first review the standard hierarchical Bayesian formalism used in LIGO/Virgo analyses, introducing the population differential rate N(θ|Λ), the total expected number of events N(Λ), and the selection function ξ(Λ) that accounts for detection thresholds. They point out that these formulations assume a clear separation between data chunks that contain detectable events and those that do not—a premise that fails for LISA, where essentially all data contain overlapping signals over long observation times.
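As a rough illustration of the standard formalism reviewed above, the sketch below evaluates an inhomogeneous-Poisson population log-likelihood for a one-parameter toy population. Several things here are assumptions made for illustration and are not taken from the paper: the power-law shape of N(θ|Λ), the hard detection cut on θ, the interpretation of ξ(Λ) as the detectable fraction (so that the expected number of detections is N(Λ)ξ(Λ)), and the neglect of measurement uncertainty on each θ. All function names are hypothetical.

```python
import math

def powerlaw_pdf(theta, alpha, lo, hi):
    """Normalised p(theta) proportional to theta**(-alpha) on [lo, hi], alpha != 1."""
    if not (lo <= theta <= hi):
        return 0.0
    norm = (hi ** (1 - alpha) - lo ** (1 - alpha)) / (1 - alpha)
    return theta ** (-alpha) / norm

def selection_fraction(alpha, lo, hi, threshold):
    """Toy xi(Lambda): fraction of the population with theta above the threshold."""
    t = min(max(threshold, lo), hi)
    num = hi ** (1 - alpha) - t ** (1 - alpha)
    den = hi ** (1 - alpha) - lo ** (1 - alpha)
    return num / den

def log_population_likelihood(detected, n_total, alpha, lo, hi, threshold):
    """log L = sum_i log N(theta_i | Lambda) - N(Lambda) * xi(Lambda).

    Assumes perfectly measured theta_i (no per-event posterior integrals).
    """
    log_l = -n_total * selection_fraction(alpha, lo, hi, threshold)
    for theta in detected:
        rate = n_total * powerlaw_pdf(theta, alpha, lo, hi)
        if theta < threshold or rate <= 0.0:
            return -math.inf  # an event outside the detectable region
        log_l += math.log(rate)
    return log_l
```

The exponential term penalises hyper-parameters that predict more detectable events than were observed, which is how the selection function enters the inference.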
To overcome this, the paper develops a general hierarchical Bayesian framework that explicitly models two sub‑populations: (i) resolvable sources with differential rate N₁(θ|Λ) defined on a region S₁ of parameter space, and (ii) unresolved sources that form a Gaussian stochastic background with rate N₂(θ|Λ) on the complementary region S₂. Both sub‑populations are treated as independent Poisson processes. The joint probability of the data, the set of source parameters {θₙ}, the total number of sources n, and the noise/background parameters Σ, conditioned on the hyper‑parameters Λ, is written as
p(d,{θₙ},n,Σ|Λ) ∝ p(d|{θₙ},n,Σ) p(Σ|{θₙ},n,Λ) ∏_{i=1}^{n} N(θ_i|Λ) e^{-N(Λ)}.
Crucially, the framework allows the number of resolved sources to vary, which is achieved in practice by reversible‑jump Markov chain Monte Carlo (RJMCMC). In an RJMCMC sample, adding a resolved source reduces the contribution of the stochastic background and vice‑versa, naturally encoding the correlation between the two components that is absent in previous approximations.
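To make the idea of a variable number of resolved sources concrete, here is a minimal birth-death (trans-dimensional) Metropolis-Hastings sketch. It is not the paper's sampler and it omits the stochastic background entirely. The toy setup is an assumption: each source sits in one frequency bin and adds a known amplitude `amp` to that bin, the data have white Gaussian noise of width `sigma`, and the number of sources has a Poisson prior with mean `rate`. All names are hypothetical.

```python
import math
import random

def log_likelihood(bins, data, amp, sigma):
    """Gaussian log-likelihood: each source adds `amp` to the mean of its bin."""
    model = [0.0] * len(data)
    for b in bins:
        model[b] += amp
    return -0.5 * sum(((d - m) / sigma) ** 2 for d, m in zip(data, model))

def birth_death_chain(data, amp, sigma, rate, n_steps, seed=0):
    """Return the sampled number of sources at every step.

    Births draw the new source's bin from its uniform prior, so the
    acceptance ratio is L'/L * rate/(n+1) for a birth and
    L'/L * n/rate for a death.
    """
    rng = random.Random(seed)
    bins = []                          # current sources, one bin index each
    log_l = log_likelihood(bins, data, amp, sigma)
    counts = []
    for _ in range(n_steps):
        n = len(bins)
        if rng.random() < 0.5:         # birth move: add one source
            proposal = bins + [rng.randrange(len(data))]
            log_prior_ratio = math.log(rate / (n + 1))
        elif n > 0:                    # death move: remove a random source
            k = rng.randrange(n)
            proposal = bins[:k] + bins[k + 1:]
            log_prior_ratio = math.log(n / rate)
        else:                          # death proposed with no sources: stay
            counts.append(n)
            continue
        new_log_l = log_likelihood(proposal, data, amp, sigma)
        log_u = math.log(1.0 - rng.random())   # log of a uniform draw in (0, 1]
        if log_u < new_log_l - log_l + log_prior_ratio:
            bins, log_l = proposal, new_log_l
        counts.append(len(bins))
    return counts
```

A histogram of the returned counts approximates the posterior on the number of sources. In a fuller treatment the background level would be sampled alongside the source list, which is where the anticorrelation described above would appear.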
The authors then consider a “separable” special case where the boundary between S₁ and S₂ is sharp and known, deriving an explicit expression for the marginal population likelihood (their Eq. 3.16). They discuss how, in realistic LISA analyses, the division is not sharp; instead, a signal‑to‑noise ratio (SNR) threshold determines whether a source is modelled individually or absorbed into the background. This choice introduces a systematic dependence of the inferred hyper‑parameters on the threshold.
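The effect of an SNR threshold on the resolved/unresolved split can be sketched with a toy population. The assumptions here are illustrative and not from the paper: amplitudes follow a Pareto-type power law, SNR is taken as amplitude divided by a fixed `noise_sigma`, and each unresolved source contributes the mean power A²/2 of a sinusoid to the background. In a real analysis the background itself feeds back into the noise level, which this sketch ignores. Function names are hypothetical.

```python
import random

def draw_amplitudes(n, alpha, a_min, seed=0):
    """Draw n amplitudes from p(A) proportional to A**(-alpha), A >= a_min, alpha > 1."""
    rng = random.Random(seed)
    # Inverse-CDF sampling; 1 - random() lies in (0, 1], avoiding division by zero.
    return [a_min * (1.0 - rng.random()) ** (-1.0 / (alpha - 1.0))
            for _ in range(n)]

def split_by_snr(amplitudes, noise_sigma, snr_threshold):
    """Return (list of resolved amplitudes, summed power of unresolved sources)."""
    resolved, background_power = [], 0.0
    for a in amplitudes:
        if a / noise_sigma >= snr_threshold:
            resolved.append(a)
        else:
            background_power += 0.5 * a * a   # mean power of a sinusoid
    return resolved, background_power

amps = draw_amplitudes(1000, alpha=2.5, a_min=1.0, seed=42)
for threshold in (2.0, 5.0):
    resolved, power = split_by_snr(amps, noise_sigma=1.0, snr_threshold=threshold)
    print(threshold, len(resolved), round(power, 2))
```

Raising the threshold can only move sources from the resolved list into the background, so the same population is described by a different mix of the two components depending on this analysis choice.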
To illustrate the impact, a toy Global Fit is constructed. The toy model assumes a simple Galactic binary population with power‑law distributions in frequency and amplitude. Two SNR thresholds (low and high) are explored. For each threshold, RJMCMC samples are generated, and the posterior distributions of hyper‑parameters (e.g., the slope of the amplitude distribution) are recovered. The results show that a high SNR threshold leaves many low‑SNR binaries in the stochastic component, biasing the inferred amplitude distribution toward lower values. Conversely, a low threshold yields many resolved sources, increasing the dimensionality of the parameter space and inflating uncertainties, but it reduces bias in the hyper‑parameter estimates. Importantly, when the joint likelihood that includes the correlation between resolved and unresolved components is used, the bias is substantially mitigated compared with the naïve product of separate likelihoods.
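One ingredient of this threshold dependence can be seen even in a resolved-only fit: the slope estimate is only unbiased if the fit knows where the cut was applied. The sketch below uses the standard maximum-likelihood estimator for a Pareto-type power law and is an assumption-laden toy, not the paper's analysis. It ignores measurement noise and the background, so it does not reproduce the trends reported above. The names `fit_slope`, `cut`, and `lower_bound` are hypothetical.

```python
import math
import random

def draw_amplitudes(n, alpha, a_min, seed=0):
    """Draw n amplitudes from p(A) proportional to A**(-alpha), A >= a_min, alpha > 1."""
    rng = random.Random(seed)
    return [a_min * (1.0 - rng.random()) ** (-1.0 / (alpha - 1.0))
            for _ in range(n)]

def fit_slope(amplitudes, cut, lower_bound):
    """Maximum-likelihood slope using sources with A >= cut.

    `lower_bound` is where the fitted power law is assumed to start.
    Setting it equal to `cut` accounts for the selection; any other
    value ignores the cut and biases the estimate.
    """
    logs = [math.log(a / lower_bound) for a in amplitudes if a >= cut]
    if not logs or sum(logs) <= 0.0:
        raise ValueError("no usable sources above the cut")
    return 1.0 + len(logs) / sum(logs), len(logs)

amps = draw_amplitudes(20000, alpha=2.5, a_min=1.0, seed=7)
alpha_ok, n_used = fit_slope(amps, cut=5.0, lower_bound=5.0)    # cut modelled
alpha_bad, _ = fit_slope(amps, cut=5.0, lower_bound=1.0)        # cut ignored
print(n_used, round(alpha_ok, 2), round(alpha_bad, 2))
```

Treating the cut as part of the model recovers a slope close to the injected value, while ignoring it pulls the estimate toward a shallower slope.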
The paper concludes that (1) a unified hierarchical Bayesian formalism is essential for LISA population inference, (2) reversible‑jump sampling provides a practical way to explore the variable‑dimension posterior, and (3) the choice of SNR threshold must be treated as part of the inference problem rather than a fixed preprocessing step. The authors suggest that future work should incorporate more realistic astrophysical models (including EMRIs and massive‑black‑hole mergers) and integrate the framework into full LISA data‑analysis pipelines, thereby enabling robust extraction of population‑level information from the unprecedentedly rich LISA data set.