SVD-based unfolding: implementation and experience
📝 Original Info
- Title: SVD-based unfolding: implementation and experience
- ArXiv ID: 1112.2226
- Date: 2011-12-13
- Authors: Kerstin Tackmann, Andreas Hoecker
📝 Abstract
With the first year of data taking at the LHC by the experiments, unfolding methods for measured spectra are reconsidered with much interest. Here, we present a novel ROOT-based implementation of the Singular Value Decomposition approach to data unfolding, and discuss concrete analysis experience with this algorithm.
📄 Full Content
The unfolding problem can be formulated as a matrix equation, $\hat{A}_{ij} x_j = b_i$, where $x$ is the true, physical distribution and $b$ is the measured distribution. $\hat{A}_{ij}$ is the probability for an event generated in bin $j$ to be reconstructed in bin $i$. As such, $\hat{A}$ describes finite resolution and inefficiencies, and it can be obtained from the simulation (or from appropriate control samples). The singular value decomposition of $\hat{A}$ both sheds light on the underlying instability of the problem and provides a solution. Small singular values, which are often present in detector response matrices, are found to greatly enhance statistical fluctuations in the measured distribution. A suitably chosen regularization procedure dampens the enhanced fluctuations. Rewriting the above equation as $A_{ij} w_j = b_i$, where $A_{ij}$ now contains numbers of events rather than probabilities and $w$ describes the ratio between the desired physical distribution and the true distribution underlying, for example, the simulation, allows for a better treatment of the statistical uncertainties in the detector matrix. At the same time, this allows for a physically motivated regularization via a discrete minimum-curvature condition on the ratio of the unfolded distribution and a simulated truth distribution. This corresponds to retaining the statistically significant contributions of $w$, which are shown to be related to the larger singular values in the decomposition of $A$.
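The amplification of fluctuations by a small singular value can be illustrated with a two-bin toy. The following is a minimal sketch, not taken from the implementation discussed here: the response matrix, the migration fraction `EPS`, and all event counts are hypothetical values chosen so that the arithmetic is easy to follow.

```python
"""Two-bin toy: a small singular value amplifies a measurement fluctuation."""


def solve_2x2(a, b):
    """Solve a x = b for a 2x2 matrix a by explicit inversion."""
    (a11, a12), (a21, a22) = a
    det = a11 * a22 - a12 * a21
    return [(a22 * b[0] - a12 * b[1]) / det,
            (a11 * b[1] - a21 * b[0]) / det]


# Hypothetical response matrix: 45% of the events migrate to the other bin.
EPS = 0.45
A_HAT = [[1 - EPS, EPS],
         [EPS, 1 - EPS]]

# For this symmetric, positive-definite matrix the singular values
# equal the eigenvalues, which are 1 and 1 - 2*EPS.
SINGULAR_VALUES = [1.0, 1 - 2 * EPS]

# Hypothetical true spectrum and the corresponding exact measurement.
x_true = [100.0, 100.0]
b_exact = [sum(A_HAT[i][j] * x_true[j] for j in range(2)) for i in range(2)]

# Shift the measurement by +5 and -5 events in the two bins.
SHIFT = 5.0
b_fluct = [b_exact[0] + SHIFT, b_exact[1] - SHIFT]

# Unfold both measurements by direct inversion (no regularization).
x_exact = solve_2x2(A_HAT, b_exact)
x_fluct = solve_2x2(A_HAT, b_fluct)

# The shift lies along the direction of the small singular value,
# so it is amplified by one over that singular value.
amplification = (x_fluct[0] - x_exact[0]) / SHIFT

print("singular values:", SINGULAR_VALUES)
print("unfolded, exact measurement:     ", x_exact)
print("unfolded, fluctuated measurement:", x_fluct)
print("amplification factor:", amplification)
```

In this toy the shift of 5 events is parallel to the singular vector of the small singular value, so the unfolded bins move by 5 divided by that singular value. A regularized unfolding would suppress this component instead of inverting it exactly.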
This note presents a C++ implementation of the SVD-based unfolding, discusses analysis experience with this algorithm, and provides a comparison to the iterative dynamically stabilized unfolding method (IDS) [2] for a concrete example.
A C++ implementation of the SVD-based unfolding is provided by TSVDUnfold, which is part of the ROOT analysis framework [3] as of version 5.28. It can also be used through the RooUnfold framework [4], which is based on ROOT and comes with additional functionality.
TSVDUnfold provides access to the singular values of the detector response matrix and to the distribution of the $|d_i|$ (see Ref. [1]), which help to properly set the regularization strength parameter in the unfolding. TSVDUnfold also allows the covariance matrix of the measured spectrum to be propagated through the unfolding using pseudo experiments. In addition, it provides the covariance matrix of the unfolded spectrum related to the finite statistics of the simulation sample (or control sample) that is used to determine the detector response matrix, again making use of pseudo experiments.
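How the $|d_i|$ and the singular values $s_i$ enter the choice of the regularization strength can be sketched as follows. In Ref. [1] the regularization parameter is set to $\tau = s_k^2$, where $k$ counts the statistically significant components, and each component is damped by $s_i^2/(s_i^2+\tau)$. The sketch below is not the TSVDUnfold code. The cut on $|d_i|$ is an assumed heuristic that stands in for a visual inspection of the $|d_i|$ distribution, and the function names and input values are hypothetical.

```python
"""Sketch: pick the regularization rank k from |d_i| and compute damping factors."""


def choose_rank(d, threshold=2.0):
    """Count the leading components whose |d_i| exceeds the threshold.

    The threshold is an assumed heuristic; components with |d_i| of
    order one are treated as compatible with statistical noise.
    """
    k = 0
    for value in d:
        if abs(value) <= threshold:
            break
        k += 1
    return k


def damping_factors(singular_values, k):
    """Return s_i^2 / (s_i^2 + tau) for each component, with tau = s_k^2."""
    if k < 1 or k > len(singular_values):
        raise ValueError("k must be between 1 and the number of singular values")
    tau = singular_values[k - 1] ** 2
    return [s * s / (s * s + tau) for s in singular_values]


# Hypothetical inputs, ordered by decreasing singular value.
d = [45.0, 12.0, 3.5, 0.8, -1.1, 0.4]
s = [10.0, 4.0, 1.0, 0.2, 0.05, 0.01]

k = choose_rank(d)
factors = damping_factors(s, k)

print("chosen k:", k)
for i, (d_i, f_i) in enumerate(zip(d, factors), start=1):
    print(f"i={i}  |d_i|={abs(d_i):5.1f}  damping={f_i:.4f}")
```

Components with large singular values are left almost untouched, the component at $i = k$ is damped by exactly one half, and the components beyond $k$ are strongly suppressed.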
More recently, TSVDUnfold has been extended to also provide the regularized covariance matrix and the inverse covariance matrix (not regularized) computed during the unfolding (see Eqs. (52,53) in Ref. [1]). In addition, the new version of TSVDUnfold implements the internal rescaling of the unfolding equations making use of the full covariance matrix of the measured spectrum (see Eq. (34) in Ref. [1]) rather than only its diagonal elements.
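The rescaling with the full covariance matrix can be read as a whitening of the measured spectrum. The following is a sketch in generic matrix notation, which may differ from the index conventions of Eq. (34) in Ref. [1]. The symbols $B$ (covariance matrix of the measured spectrum), $Q$, $R$, $\tilde{A}$ and $\tilde{b}$ are introduced here for illustration.

$$
B = Q R Q^{T}, \qquad
\tilde{b} = R^{-1/2} Q^{T} b, \qquad
\tilde{A} = R^{-1/2} Q^{T} A
\quad\Longrightarrow\quad
\tilde{A} w = \tilde{b}, \qquad
\mathrm{Cov}(\tilde{b}) = R^{-1/2} Q^{T} B\, Q R^{-1/2} = I .
$$

Here $Q$ is orthogonal and $R$ is diagonal with positive entries. If $B$ is diagonal, then $Q = I$ and the rescaling reduces to dividing each row of the equation by the uncertainty of the corresponding measured bin.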
The covariance matrices of the unfolded spectrum as computed during the unfolding and as obtained from pseudo experiments have been compared for a toy example (see Fig. 1) and have been found to be in good agreement. The uncertainties (taken from the diagonal elements of the covariance matrices) provided by the two methods agree to better than 4%, and the correlations are well reproduced. Even in the case of non-optimal regularization, the two methods provide compatible results: the uncertainties obtained with the two methods have been found to agree within 6% (11%) for a strongly under- (over-)regularized unfolding, with compatible correlation patterns.
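The idea behind such a comparison can be sketched with a small toy that is unrelated to the one in Fig. 1. A fixed linear map `U` serves as a hypothetical stand-in for the unfolding, the measured bins are assumed to be uncorrelated and Gaussian, and the covariance of the unfolded spectrum is computed once by linear error propagation and once from pseudo experiments. All names and numbers are assumptions made for illustration.

```python
"""Sketch: covariance of an unfolded spectrum, analytic versus pseudo experiments."""
import random


def mat_vec(m, v):
    """Multiply matrix m by vector v."""
    return [sum(row[j] * v[j] for j in range(len(v))) for row in m]


def propagate_analytic(u, var_b):
    """Return U V U^T for a diagonal covariance V = diag(var_b)."""
    n = len(u)
    return [[sum(u[i][k] * var_b[k] * u[j][k] for k in range(len(var_b)))
             for j in range(n)] for i in range(n)]


def propagate_toys(u, b, var_b, n_toys, seed):
    """Estimate the covariance of U b from Gaussian pseudo experiments."""
    rng = random.Random(seed)
    n = len(u)
    sums = [0.0] * n
    prods = [[0.0] * n for _ in range(n)]
    for _ in range(n_toys):
        # Fluctuate each measured bin independently around its value.
        toy_b = [rng.gauss(mu, var ** 0.5) for mu, var in zip(b, var_b)]
        x = mat_vec(u, toy_b)
        for i in range(n):
            sums[i] += x[i]
            for j in range(n):
                prods[i][j] += x[i] * x[j]
    means = [total / n_toys for total in sums]
    return [[prods[i][j] / n_toys - means[i] * means[j] for j in range(n)]
            for i in range(n)]


# Hypothetical linear unfolding operator and measured spectrum.
U = [[1.2, -0.2],
     [-0.2, 1.2]]
b = [100.0, 100.0]
var_b = [100.0, 100.0]  # Poisson-like variances, assumed uncorrelated

cov_analytic = propagate_analytic(U, var_b)
cov_toys = propagate_toys(U, b, var_b, n_toys=20000, seed=1)

print("analytic covariance:", cov_analytic)
print("toy covariance:     ", cov_toys)
```

With a fixed regularization the unfolding is linear in the measured spectrum, so the two estimates are expected to agree up to the statistical precision of the pseudo experiments.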
The SVD-based unfolding has been used in numerous data analyses over the past 15 years, among them the unfolding of the hadronic mass spectrum in inclusive, charmless, semileptonic B-meson decays, $B \to X_u \ell \nu$, at the BABAR experiment [5]. Due to the nature of the measured spectrum, its unfolding, and in particular the determination of the appropriate regularization, required careful studies. The relatively low statistics of an estimated 1027 signal events and the subtraction of the dominant $B \to X_c \ell \nu$ backgrounds result in sizable statistical and systematic uncertainties. The size of the bins has been chosen to equal the hadronic mass resolution in signal events. Due to the lar