Statistical Multiresolution Dantzig Estimation in Imaging: Fundamental Concepts and Algorithmic Framework
In this paper we are concerned with fully automatic and locally adaptive estimation of functions in a "signal + noise"-model where the regression function may additionally be blurred by a linear operator, e.g. by a convolution. To this end, we introd…
Authors: Klaus Frick, Philipp Marnitz, Axel Munk
ST A TISTICAL MUL TIRESOLUTION D ANTZIG ESTIMA TIO N IN IMA GING: FUND AMENT AL CONCE PTS AND ALGORITHMIC FRAMEW ORK KLA US FRICK Institute for Mathematic al Sto cha stics University of G¨ ottingen Goldschmid tstr aße 7, 37077 G¨ ottingen PHILIPP MARNITZ Institute for Mathematic al Sto cha stics University of G¨ ottingen Goldschmid tstr aße 7, 37077 G¨ ottingen AXEL MUNK Institute for Mathematic al Sto cha stics University of G¨ ottingen Goldschmid tstr aße 7, 37077 G¨ ottingen and Max Planck Institute for Biophysic al Chemistry Am F aßb er g 11, 37077 G¨ ottingen Abstra ct. In this pap er we are concerned with fully automatic and locally adaptive estima- tion of functions in a “signal + noise”-model where the regression fun ct ion may additionally b e b lurred by a linear op erator, e.g. by a con vol ution. T o this end, w e introd uce a general class of statistic al multir esolution estimators and devel op an algorithmic framework for com- puting those. By th is we mean estimators that are d efi ned as solutions of conv ex optimization problems with ℓ ∞ -type constraints. W e emp lo y a com bination of the alternating d irection metho d of multiplie rs with Dy k stra’s algorithm for comput in g orthogonal p ro jections onto inters ections of conv ex sets and pro ve numerical conv ergence. The capability of the prop osed metho d is illustrated by vari ous examples from imaging and signal detection. E-mail addr esses : frick@math.uni-goe ttingen.de , marnitz @math.uni-g oettingen.de , munk@math. uni-goettin gen.de . 2000 Mathematics Subje ct Classific ation. Primary: 62G05, 90C06; secondary 68U10. Key wor ds and phr ases. Alternating Direction Metho d of Multipliers (A DMM), Biophotonics, Dantzig Selector, D ykstra’s Pro jection Algorithm, Statistical Imaging, Statistical Multiscale Analysis, Statistical Reg- ularization, Local Adaption, Signal Detection. Correspondence to fric k@math.uni- goettingen.de . 1 2 1. Introduction In numerous applications, th e relation of observ able data Y and the (unknown) signal of in terest u 0 can b e mo d eled as an in v erse linear regression problem. W e shall assume that the data Y = { Y ν } is s ampled on th e equid istan t grid X = { 1 , . . . , m } d , with m, d ∈ N and that u 0 ∈ U for some linear space U , suc h as the Euclidean space or a S ob olev class of functions. Hence the mo del can b e formalized as Y ν = ( K u 0 ) ν + ε ν , ν ∈ X . (1) Here we assume th at ε = { ε ν } ν ∈ X are ind ep end en t and identic ally distribu ted r.v. with E ( ε ν ) = 0 and E ε 2 ν = σ 2 > 0 (white n oise). Moreo v er, K : U → ( R m ) d denotes a linear op erator that enco des the functional relation b etw een the quan tities that are accessible by exp eriment and th e und erlying signal. Often the op erator K do es not h a v e a con tin uous in v erse (or its inv erse is ill-conditioned in a discrete setting, where K is a matrix), th at is estimation of u 0 giv en the data Y is an il l-p ose d pr oblem . As a consequence, estimators for u 0 can n ot b e obtained by merely applying the in v erse of K to an estimator o f K u 0 , in general. Instead, m ore sophisticated statistic al r e gularization tec hniques hav e to b e emplo y ed that, lo osely sp eaking, are capable of simultaneously inv erting K and solving the regression problem. The application we pr imarily h a v e in mind is the reconstruction of low-dimensional signals (e.g. images) u 0 whic h are presum ed to exhibit a str on g neighborh o o d structure as it is c haracteristic for imaging or signal detection pr oblems. These neigh b orho o d relations are often m o deled by prior smo othn ess or structur al assumptions on u 0 (e.g. on the texture of an image). The aim of th is pap er is t w ofold. First, we will intro d uce the br oad class of statistic al multir esolution estimators (SMRE ) . W e claim that n umerous regularizatio n tec hniques, th at w ere recentl y p rop osed for different problems in v arious branc hes of app lied mathematics and statistics, can b e considered as sp ecial cases of these. Among others, th is includ es the Dantzig sele ctor (see [4, 7 , 29] and r eferences therein) that w as recen tly prop osed in the context of high dimensional s tatistics. Our prior fo cus, ho w ev er, will b e put on imaging problems and it will turn out that the aforemen tioned neigh b orho o d relations can b e mo deled within our SMRE framework in a straight forwa rd manner. This will r esult in lo c al ly adaptive and ful ly automatic image reconstruction metho ds. The high in trinsic structur e of the signals that are typica lly u nder consid eration in imaging is in contrast to the usual situation in high-dimensional statistics. Here u 0 is usually assu med to b e unstructured but to hav e a sparse rep resen tation w ith resp ect to some basis of U (cf. [7, 8 , 43 ]). C onsequent ly , the consisten t estimation of u 0 is r ealized by m inimizing a regularization functional whic h fosters sparsit y , such as the ℓ 1 -norm of th e co efficient s, sub ject to an ℓ ∞ -constrain t on the co efficien ts of the residual, i.e. inf u ∈ U k u k 1 s.t. k K ∗ ( Y − K u ) k ∞ ≤ q . (2) In order to apply this approac h for image reconstruction, t w o mo difications b ecome necessary: Often one aims to m inimize other regularization functionals s u c h as the total v ariatio n semi- norm (cf. [34, 37]) or S ob olev norms, sa y . Hence, we su ggest to replace the ℓ 1 -norm in (2) by a general con v ex functional J that mo dels the smo othness or texture inf ormation of signals or images (cf. [1, 35]). F urthermore, w e relax the ℓ ∞ -constrain t such th at neigh b orho o d relations of the image can b e tak en into account . This generalizes the Dan tzig selector to ST A TISTICAL M UL TIRESOLUTION DANTZIG ESTIMA TION IN IMAG ING 3 this task in a natural w a y and obviously increases estimation efficien tcy . As we will lay out in P aragraph 1.2.2, this requ ires new algorithms to compute efficiently the resulting large scale optimization problem. 1.1. Statistical Multiresolution Estimation. W e will n o w in tro du ce th e ann ou n ced class of estimators. T o this end, let S b e some ind ex set and W = ω S : S ∈ S b e a set of giv en w eigh t-functions on the grid X = { 1 , . . . , m } d . A statistic al multir esolution estimator (SMRE ) (or gener alize d Dantzig sele ctor ), is defin ed as a so lution of the constrained optimiza tion problem inf u ∈ U J ( u ) s.t. max S ∈S X ν ∈ X ω S ν (Λ( Y − K u )) ν ≤ q . (3) Here, J : U → R den otes a regularizatio n fun ctional that incorp orates a pr iori kno wledge on the u nknown signal u 0 (suc h as sm o othness) and Λ : ( R m ) d → ( R m ) d a p ossibly non-linear transformation. The constan t q can b e considered as a univ ersal r e gularization p ar ameter that go v e rns the trade-off b et w een regularit y and data-fit of the reconstruction. In most practical situations q is c hosen to b e the α -qu an tile q α of the multir esol ution (MR) statistic T ( ε ), where T : ( R m ) d → R en co des the inequalit y constrain t in (3), i.e. T ( v ) = max S ∈S X ν ∈ X ω S ν (Λ( v )) ν , v ∈ ( R m ) d . (4) T o this end, we assume the distribu tion of T ( ε ) to b e (appro ximately) kno wn. This can either b e obtained by sim ulations or in some cases the limiting distribu tion can even b e derive d explicitly . The regularization parameter q then adm its a sound sta tistical interpretati on: Eac h solution ˆ u α of (3) satisfies P J ( ˆ u α ) ≤ J ( u 0 ) ≥ α i.e. the estimator ˆ u α is mor e r e gular (in terms of J ) than u 0 with a p robabilit y of at least α . T o see this simply observ e that th e tru e signal u 0 satisfies the constraint in (3) with probabilit y at least α . F or a giv en estimator ˆ u of u 0 , the set W is assumed to b e r ic h en ough in order to catc h all relev an t non-random signals that are visible in the residual Y − K ˆ u . Then, the a v erage function µ S ( v ) = X ν ∈ X ω S ν (Λ( v )) ν (5) ev aluated at v = Y − K ˆ u is su pp osed to b e significan tly la rger than q for at le ast one ω ∈ W , when ever Y − K ˆ u fails to resem ble wh ite n oise. Put differen tly , the MR-statistic T ( Y − K ˆ u ) is b ounded b y q , wheneve r Y − K ˆ u is accepted as white noise according to the r esolution pro vided by W . I n fact, this is a key observ ations that r ev eals n umerous p otenti al application areas of th e estimation metho d (3). Th e examples we ha v e in m ind are mainly from statistic al sig nal dete ction and imaging , where the index set S is t ypically c hosen to b e an o v erlapping (redund ant) system of subsets of the grid X and ω S is the normalized indicator fu nction on S ∈ S . Consequently the inequ ality constraint in (3) guarante es that the residual r esem bles white noise on all sets S ∈ S . In other words, the SMRE app roac h in (3) yields a reconstruction m etho d that lo c al ly adap ts the amount of r e gularization according to the und erlying image features. W e illustrate th is in Section 3 by v a rious examples. 4 ST A TISTICAL MUL TIRESOLUTION DANTZIG ESTIMA TION IN IMAGING Summarizing, th e optimization problem in (3) amounts to c ho ose the most p arsimonious among all estimators ˆ u for whic h the residual Y − K ˆ u r esem bles white n oise according to the statistic T . If Y − K ˆ u con tains some non r andon signal, i.e. T ( Y − K ˆ u ) is lik ely to b e larger than q and u happ ens to lie outside the admissible domain of (3). T hus, the multi-reso lution constrain t preven ts to o pars im on ious r econstructions due to the minimization of J . 1.2. Algorithmic C hallenges and Related W ork. 1.2.1. Multir esolution Metho ds. SMREs and related MR statistics h a v e recen tly b een studied in v arious cont exts. W e giv e a b rief (but incomplete) o v erview. Classical MR statistic s are obtained f rom the general form in (4) b y setting U = ( R m ) d and Λ = Id . Moreo v er, one considers the s y s tem W to con tain indicator functions on cub es. T o b e more p r ecise, d efine the index set S to b e the system of all d -dimensional cub es in X and set ω S = χ S / √ # S . Then , the MR-statisti c in (4) reduces to T ( v ) = max S ∈S 1 √ # S X ν ∈ S v ν . This statistic was int ro du ced in [42] (called scanning statistic there) in order to d etect a signal against a noisy bac kground. It was sh own in [31] that lim m →∞ T ( ε ) √ 2 d log m = σ a.s. If the sys tem S is reduced to the set of all dyadic squares, then it wa s pro v ed in [28] that (after suitable trans f ormations) T also con v erges w eakly to the Gumbel distribu tion. T here, the authors also established a m etho d f or lo cally adaptive image denoising employing lin ear diffusion equations with spatially v aryin g diffusivity . SMREs (3) h a v e b een studied recen tly for the case d = 1 in [12] and [6], where total-v ariation p en alt y and the n umber of jump s in piecewise constant r egression were considered as regularization functional J , resp ectiv ely . In [21] consistency and conv ergence r ates for SMREs hav e b een studied in a general Hilb ert space setting. SMREs with s quared residuals, that is Λ( v ) ν = v 2 ν , yield another class of estimators that ha v e attracted muc h attent ion. Ab o v e all, the situ ation where S consists of th e single set X and ω X is c hosen to b e the constant 1 fun ction is of sp ecial in terest, since then (3) red u ces to the p enalize d le ast squar e estimation . In particular (3) then can b e rewritten in to inf u ∈ U J ( u ) + λ X ν ∈ X ( K u − Y ) 2 ν (6) for a suitable multiplier λ > 0. If J ( u ) = k u k 1 the LASSO estimator will result (cf. [43]). Recen tly , also non-trivial c hoices of S w ere considered. In [3] S is chosen to consist of a partition of G wh ic h is obtained b eforehand b y a Mumford -Shah segmen tat ion. In [15], a subset S ⊂ X is fixed and afterwa rds S is defin ed as the collection of all tr an s lates of S . In [17] MR-statistics are used for shap e-constrained estimation b ased on testing qualitativ e h yp othesis in nonparametric regression for d = 1. Here, th e w eigh t functions ω S incorp orate qualitativ e features su ch as monotonicit y or conca vit y . S imilarly , MR-statistics are u sed in [18] in ord er to detect lo cations of lo cal increase and decrease in density estimation. Muc h in the same spirit is the w ork in [16] w here m ultiscale sign tests are emplo y ed for computing confidence bands for isotonic med ian cur v es. ST A TISTICAL M UL TIRESOLUTION DANTZIG ESTIMA TION IN IMAG ING 5 As mentio ned previously , the Dantzig-sele ctor [7] is also co v ered by the general SMRE framew ork in (3 ). T o see this, set U = R p (with t ypically p ≫ m ), Λ = Id and define the w eigh ts ω S = K χ S , S ∈ S . Then, eac h solution of (3) can b e considered as a generalized Dantzi g selector. The matrix K ∈ R m × p in this con text is us u ally interpreted as design matrix of a high dimen s ional lin ear mo del. Th e classical Dan tzig selector as introdu ced in [7] then results in the sp ecial case where S only consists of sin gle-elemen ted su bsets of { 1 , . . . , p } and J is c hosen to b e the ℓ 1 -regularizatio n fun ctional J ( u ) = k u k 1 = p X i =1 | u i | . Hence L ASSO and Dan tzig selector are uni-scale esti mators whic h tak e in to accoun t the largest ( S = { X } ) an d smallest ( S consists of all s ingletons in { 1 , . . . , p } ) s cales, resp ectiv ely . In this sense, they constitute tw o extreme cases of SMRE. 1.2.2. Algo rithmic Chal lenges. F rom a compu tational p oint of view, compu ting an SMRE amoun ts to solv e the c ons tr aine d optimization pr oblem (3) whic h can b e rewr itten in to inf u ∈ U J ( u ) s.t. µ S ( Y − K u ) ≤ q , ∀ ( S ∈ S ) . (7) W e n ote that in practical applications the num b er of constrain ts in (7), that is the cardinalit y of the index set S , can b e qu ite large (in Section 3.2 d enoising of a 512 × 512 image results in more than 6 m illion inequalities). Moreo v er, the inequalities (even for the s implest case where Λ = Id) are mutually correlated. Both of these facts tu rn (7) in to a numerical ly c hallenging problem and standard app roac hes (suc h as interior p oin t or conjugate gradient metho ds) p erform far from satisfactorily . The authors in [3, 15, 28] appr oac h the n umerical solution of (7) by means of an analogon of (6) with spatially dep enden t multiplier λ ∈ ( R m ) d , i.e. inf u ∈ U J ( u ) + X ν ∈ X λ ν ( K u − Y ) 2 ν . Starting from a (constan t) initial parameter λ = λ 0 , the parameter λ is iterativ ely adj usted by increasing it in regions w h ic h we re p o orly reconstructed b efore according to the MR-stati stic T . Th is approac h strongly dep end s on the sp ecial structure of S that allo ws a s traigh tforw ard iden tification of eac h set S ∈ S w ith a unique p oint in the grid X . Put differen tly , it is not clear h o w to mo dify this paradigm in order to solv e (3) for highly redu ndant systems S as we ha v e it in mind . Recen tly a general algorithmic f r amew ork was in tro duced in [2 ] for the solutions of large- scale con v ex cone prob lems inf u ∈ U J ( u ) s.t. K u − Y ∈ K where K is a conv ex cone in s ome Euclidean space. The app roac h w as realized in the softw are pac k age T emplates f or First-Or der Conic Solvers (TFOCS) 1 . Th e ab o v e formulation is very 1 a v ailable at http://tfo cs.stanford .edu/ 6 ST A TISTICAL MUL TIRESOLUTION DANTZIG ESTIMA TION IN IMAGING general and in order to reco v er (7) one has to consider the cone K = ( ( v , q ) ∈ ( R m ) d × R : X ν ∈ S Λ( v ) ν ≤ q ∀ ( S ∈ S ) ) The approac h in [2] emplo ys the dual formulatio n of the problem inf ξ ∈ V J ∗ ( K ∗ v ) + h Y , v i s.t. v ∈ K ∗ whic h in v olv es the compu tation of the du al cone K ∗ ( J ∗ denotes the Legendre-F enchel dual of J ). Th is approac h is particularly app ealing for the uni-scale Dan tzig selector sin ce in this situation the cone K coincides w ith th e epi-graph of the ℓ ∞ -norm and hence its dual cone is straigh tforw ard to compute (it is the epi-graph of the ℓ 1 -norm). As it is argued in [2], this approac h is capable of computing Dan tzig selectors for large scale pr oblems in con trast to p revious app roac hes such as sta ndard linea r programming tec hniques [7] or homotop y metho ds su c h as D ASSO [29 ] or [41]. As the authors stress, their appr oac h w orks well in the case when K is the epi-graph of a norm for whic h the pro jections onto K ∗ are tractable and computationally efficient . Ho w ev er, for the applicatio ns w e ha v e in mind (suc h as lo cally adaptiv e imaging reconstru ction), the approac h in [2] is only of limited u se: In cont rast to the aforementioned epi-graphs, the large num b er of (strongly dep enden t) constraints in (7) brings ab out a cone K that on the one h an d exhibits a tremendous amount of faces compared to the d imension of the image space dim( H ) = md and that on the other hand is no longer symmetric w.r .t. to the q -axis. Both of these facts turn the computation of d ual cone K ∗ (or the pr o jections onto it) into a most chall enging problem, ev en in the simp lest case when Λ is linear. The aim of this pap er is to dev elop a general algorithmic framework that mak es solutions of (7) numerical ly accessible for man y applications. In order to do so w e prop ose to introd uce a slac k v ariable in (7) and then use the alternating dir e ction metho d of multipliers , an Uza w a- t yp e algorithm that decomp oses pr ob lem (7) into a J -p enalized least squares problem for the prim al v ariable and a orthogonal pr o jection pr ob lem on the f easible set of (7) for the slac k v ariable. Th is approac h has the app ealing effect that once an im p lemen tation f or the pro jection problem is established, different regularization functionals J can easily b e emplo y ed without c hanging th e backbone of the algorithm. O ur wo rk is m uc h in the same spirit as [33], whic h considered an alternating direction metho d for the computation of the Dan tzig selector recen tly . In this case the computation of the o ccurring orthogonal p ro jections are av ailable in closed form, wh er eas in our app lications this is not the case du e to th e aforemen tioned dep end encies. In order to tac kle the orthogonal pr o jection problem we emplo y Dykstra’s pro jection metho d [5 ] w hic h is capable of computing the pro jection on to the inte rsection of con v ex b o d ies by merely usin g the ind ividual pro jections on to the latter. The efficiency of the pro- p osed metho d h ence increases consid erably if the ind ex set S can b e decomp osed in to “few” partitions that cont ain indices of mutually indep enden t inequalities in (7). In p articular, b y this app roac h w e will b e able to compute classical SMRE (as introd uced in [12, 21]) in d = 2 space dimensions wh ic h to our kno wledge has n ev er b een d one so far. Th is p uts u s into the p osition to stud y th e p erformance of s u c h estimators compared with other b enc hmark meth- o ds in lo cally adaptive signal reco v ery (such as adaptive weights smo othing cf. [40]). As it will turn out in Section 3 it w ill outp erform these visually as well as quan titativ ely . ST A TISTICAL M UL TIRESOLUTION DANTZIG ESTIMA TION IN IMAG ING 7 1.3. Organization of the Paper. Th e p ap er is organized as follo ws: In S ection 2 we in- tro duce a general algorithmic app roac h for computing SMREs. W e will rewr ite (7) in to a linearly constrained problem and compute a s addle p oin t of the corresp onding augmen ted Lagrangian by the alternating d irection metho d of multipliers in Parag raph 2.2. Un der quite general assumption, we prov e con v ergence of the algorithm in Theorem 2.2 and giv e some qualitativ e estimates for the iterates in T heorem 2.4 . One of the o ccurr in g minimization steps amounts to th e computation of an orthogonal pro jection onto a conv ex set in Euclidean space. In P aragraph 2.3, this problem w ill b e tac kled b y means of Dykstra’s pro jection algo- rithm introd uced in [5]. Finally , w e illustrate the p erform an ce of some particular instances of SMREs in Section 3: we stu d y problems in nonparametric regression, image denoising and decon v olution of flu orescence microscop y images and compare ou r results to other metho ds b y means of sim ulations. 2. Comput a tional Meth odology In this section we will address the qu estion on ho w to solve the linearly constrained opti- mization problem (7). After discuss ing some notations and basic assump tions in Subsection 2.1, w e will r eform ulate the prob lem in P aragraph 2.2 such that the alternating direction metho d of multipliers (ADMM), a Uz a w a-t yp e algorithm, can b e emp lo y ed as a solution metho d. As an effect, the task of computing a solution of (7) is replaced b y alternating i) solving an un constrained p enalized least squares problem that is indep e ndent of the M R- statistic T and ii) computing the orthogonal pro jection on a con v ex set in Euclidean space that is indep en- dent of J . This reve als an app ealing mo du lar n ature of our app roac h: The regularization fu nctional J can easily b e r eplaced once a metho d for the pro jection problem is settled. F or the latter we will prop ose an iterativ e p ro jection algorithm in Pa ragraph 2.3 that was introd u ced by Bo yle and Dykstra in [5]. 2.1. Basic Assumptions and Notation. F rom n o w on, X will stand for the d -dimen s ional grid { 1 , . . . , m } d and agree up on H = R X ≃ ( R m ) d b eing the space of all r eal v alued functions v : X → R . Moreo v er, we assu me that S den otes some index set and that W = ω S : S ∈ S is a collectio n of element s in H . F or tw o elemen ts v , w ∈ H we will u se the standard inner pro du ct and n orm h v , w i = X ν ∈ X v ν w ν and k v k = p h v , v i resp ectiv ely . Next, we assum e that Λ : H → H is conti n uous suc h that Λ(0) = 0 and th at for all S ∈ S the mapping v 7→ ω S , Λ( v ) is con v ex. With this notation, we can r ewrite the av er age function in (5 ) in the compact form µ S ( v ) = w S , Λ( v ) . W e note, that it is not restricitv e to consid er more generaly Λ : H → R N with arb itrary N ∈ N . This could e.g. b e usefu l for augmen ting th e constraint set of (7) with further constrain ts of different t yp e. F or the signal and image detectio n p roblems as studied in this pap er, how ever, Λ is alwa ys a p oint w ise transformation of the residuals. Hence, we will restrict our considerations on the case when Λ : H → H . 8 ST A TISTICAL MUL TIRESOLUTION DANTZIG ESTIMA TION IN IMAGING F urtherm ore, we d efine U to b e a separable Hilb er t-sp ace with in ner pr o duct h· , ·i U and induced norm k·k U . Th e op erator K : U → H is assumed to b e linear and b ound ed and the functional J : U → R is conv ex and lo w er semi-con tin uous, that is { u n } n ∈ N ⊂ U and lim n →∞ u n =: u ∈ U = ⇒ J ( u ) ≤ lim inf n →∞ J ( u n ) . Recall the defin ition of the MR-statistic in (4). Throughout this pap er we w ill agree u p on the follo wing Assumption A. i) F or al l y ∈ H ther e exists u ∈ U such that T ( K u − y ) < q . ii) F or al l y ∈ H and c ∈ R the set u ∈ U : max S ∈S µ S ( K u − y ) + J ( u ) ≤ c is b ounde d. Under Assump tion A it follo ws from standard techniques in con v ex optimizatio n, th at a solution of (7) exists. As we w ill discu s s in Section 2.2 it even f ollo ws that a saddle p oin t of the corresp onding Lagrangian exists (cf. Theorem 2.1 b elo w). In this con text Assumption A i) is often referred to as Slater’s c onstr aint qualific ation and is for instance satisfied if K ( U ) is dense in H . Moreo v er, Assump tion A ii) will b e n eeded in order to guaran tee con v ergence of the algorithm for computing suc h a solution, as it is prop osed in the up coming section. This requirement is fu lfilled if J is co ercive i.e. lim k u k U →∞ J ( u ) = ∞ . In man y applications U is some function space an d J a grad ient b ased regularization metho d, suc h as the total v ariation semi-norm (cf. Section 3.2 ). Then a t ypical suffi cien t condition for Assumption A ii) is that K do es not annihilate constant f u nctions. 2.2. Alternating Direction Metho d of Multipliers. By in tro ducing a slac k v ariable v ∈ H w e rewrite (7) to the equiv ale n t p roblem inf u ∈ U,v ∈ H J ( u ) + G ( v ) sub ject to K u + v = Y . (8) Here, G denotes the c haracteristic fun ction on the feasible region C of (7), that is, C = { v ∈ H : µ S ( v ) ≤ q ∀ ( S ∈ S ) } and G ( v ) = ( 0 if v ∈ C + ∞ else . (9) Note that due to the assump tions on Λ, th e set C is closed and con v ex. The tec hnique of rewriting (7) into (8) is r eferr ed to as the de c omp osition-c o or dination appr o ach , see e.g. F ortin & Glo w inski [20, Chap. I I I]. T here, L agrangian multiplier metho ds are used for solving (8). T o this end , we recall the definition of the augmente d L agr angian of Problem (8), th at is L λ ( u, v ; p ) = 1 2 λ k K u + v − Y k 2 + J ( u ) + G ( v ) − h p, K u + v − Y i , λ > 0 . (10) The name stems from the fact that th e ordin ary Lagrangian L ( u, v ; p ) = J ( u ) + G ( v ) − h p, K u + v − Y i ST A TISTICAL M UL TIRESOLUTION DANTZIG ESTIMA TION IN IMAG ING 9 is augmen ted by the quadratic p enalt y term (2 λ ) − 1 k K u + v − Y k 2 that fosters the fu lfillmen t of the linear constraints in (8). T h e augmente d L agr a ngian metho d consists in compu ting a saddle p oint ( ˆ u, ˆ v , ˆ p ) of L λ , that is L λ ( ˆ u, ˆ v ; p ) ≤ L λ ( ˆ u, ˆ v ; ˆ p ) ≤ L λ ( u, v ; ˆ p ) , ∀ (( u, v, p ) ∈ U × H × H ) W e n ote that eac h saddle p oint ( ˆ u, ˆ v , ˆ p ) of th e augmen ted Lagrangian L λ is already a saddle p oint of L and vice v ersa and that in either case th e pair ( ˆ u, ˆ v ) is a s olution of (8 ) (and th us ˆ u is a desired s olution of (7)). This follo ws e.g. from [20, C hap 3. Thm. 2.1]. Su fficien t conditions for the existence of saddle p oin ts are usu ally hard er to come u p with. Assum ption A summarizes a standard set of such conditions. Theorem 2.1. Assume that Assumption A holds. Then, ther e exists a sadd le p oint ( ˆ u, ˆ v , ˆ p ) of L λ . Pr o of. According to [19, Chap. I I I, Prop. 3.1 and Prop. 4.2] a saddle p oin t of L exists, if there is an elemen t u 0 ∈ U suc h that G is con tin uous at K u 0 − Y and that lim k u k Q →∞ J ( u ) + G ( K u − Y ) = ∞ . (11) According to Assumption A i) and due to the cont in uit y of Λ th e first r equirement is clearly satisfied. F ur ther, the co ercivit y assum ption (11) is a consequence of Assum ption A ii). W e w ill use th e Alternating D i r etion Metho d of Multipliers (ADMM ) (cf. Algorithm 1) as prop osed in [20 , Chap. I I I Sec. 3.2] f or the computation of a saddle p oint of L λ (and hence of a solution of (7)): Successiv e minimization of the augmen ted Lagrangian L λ w.r.t. the first and second v ariable follo w ed by an explicit s tep for m aximizing w.r.t. the third v ariable is p erformed. Con v ergence of this metho d is established in T heorem 2.2 whic h is a generalization of [20, Chap. I I I Thm . 4 .1]. W e note that the pro of, as presen ted in the App end ix A allo ws for appr oximate solution of the ind ividual subp roblems. F or th e sak e of simplicit y , we presen t the Algorithm in its exact form. Theorem 2.2. Every se que nc e { ( u k , v k ) } k ≥ 1 that is g ener ate d by Algorithm 1 is b ounde d in U × H and every we ak cluster p oint is a solution of (8) . Mor e over, X k ∈ N k K u k + v k − Y k 2 + k K ( u k − u k − 1 ) k 2 < ∞ . Remark 2.3. i) Th eorem 2.2 implies, that eac h weak cluster p oin t of { u k } k ≥ 1 is a s olution of (7). In particular, if th e solution u † of (7) is u nique (e.g. if J is strictly con v ex) , then u k ⇀ u † . ii) Note in p articular that (12) is in dep end en t of the c hoice of J , wh ile (13) is indep enden t of the multiresolution statistic b eing used. T his d ecomp osition giv es the p rop osed metho d a neat mo dular app eal: once an efficient s olution metho d for the pro jection p roblem (12) is established (see e.g. Section 2.3), the regularization fun ctional J in (3) can easily b e replaced b y p ro viding an alg orithm for the p enalized least squ ares problem (1 3). F or m ost p opu lar choice s of J , p r oblem (13) is well studied and efficient computational metho ds are at hand (see [44] for a extensive collectio n of algorithms and [32] for an o v erview on MCMC metho d s ). F or a giv en tolerance τ > 0, Theorem 2.2 imp lies that Algorithm 1 terminates and ou tp uts appro ximate solution u [ τ ] and v [ τ ] of (8). Ho w ever, the breaking condition in Algorithm 1 10 ST A TISTICAL M UL TIRESOLUTION DANTZIG ESTIMA TION IN IMAG ING Algorithm 1 Alternating Direction Metho d of Mu ltipliers Require: Y ∈ H (data), λ > 0 (step size), τ ≥ 0 (tolerance). Ensure: ( u [ τ ] , v [ τ ]) is an appro ximate solution of (8) computed in k [ τ ] iteration steps. u 0 ← 0 U and v 0 = p 0 ← 0 H r ← k K u 0 + v 0 − Y k and k ← 0. while r > τ do k ← k + 1. v k ← ˜ v where ˜ v ∈ C satisfies k ˜ v − ( Y + λp k − 1 − K u k − 1 ) k 2 ≤ k v − ( Y + λp k − 1 − K u k − 1 ) k 2 ∀ ( v ∈ C ) . (12) u k ← ˜ u wh er e ˜ u satisfies 1 2 k K ˜ u − ( Y + λp k − 1 − v k ) k 2 + λJ ( ˜ u ) ≤ 1 2 k K u − ( Y + λp k − 1 − v k ) k 2 + λJ ( u ) ∀ ( u ∈ U ) . (13) p k ← p k − 1 − ( K u k + v k − Y ) /λ . r ← max( k K u k + v k − Y k , k K ( u k − u k − 1 ) k ). end while u [ τ ] ← u k and v [ τ ] ← v k and k [ τ ] ← k . merely guaran te es that the linear constrain t in (8) is appro ximated sufficiently well. Moreo v er, w e kn o w from construction that v [ τ ] ∈ C , which implies G ( v [ τ ]) = 0. So, it remains to ev aluate the v a lidit y of u [ τ ]: Theorem 2.4. L et ( ˆ u, ˆ v , ˆ p ) ∈ U × H × H b e any sadd le p oint of L λ . Mor e over, let τ > 0 and u [ τ ] ∈ U b e r etur end by Algorithm 1. Then, 0 ≤ J ( u [ τ ]) − J ( ˆ u ) − h K ∗ ˆ p, u [ τ ] − ˆ u i U ≤ τ 6 k ˆ p k + 4 k K ˆ u k + 2 τ λ ∀ ( τ > 0) . The r esult in Theorem 2.4 sh o ws ho w the accuracy of the appro ximate solution u [ τ ] d ep end s on τ . Moreo v er, it rev eals that choosing a small step size λ in Algorithm 1 p ossib ly yields a slo w deca y of the ob jectiv e fu nctional J . Ho w ev er, it f ollo ws from the defin ition of L λ in (10 ) that a small v al ue for λ fosters th e linear constraint in (8). Corollary 2.5. Let the assumtio ns of Theorem 2.4 b e satisfied. Moreo v er, assume that J is a q u adratic f u nctional, i.e. J ( u ) = 1 2 k Lu k 2 V , w h ere V is a fur ther Hilb ert-space and L : U ⊃ D → V is a linear, densely-defined and closed op erator. Th en k L ( u [ τ ] − ˆ u ) k = O ( √ τ ) ∀ ( τ > 0) . Example 2.6 (Dan tzig selector) . As already men tioned in th e int ro du ction, SMRE (i.e. finding solutions of (3)) reduces to the compu tation of Dantzig sele ctors for the particular setting d = 1, U = R p (with usually p ≫ m ) and J ( u ) = k u k 1 . When applying Algorithm 1 the su bproblem (13 ) amoun ts to compute u k ∈ argmin u ∈ R p 1 2 k K u − ( Y + λp k − 1 − v k ) k 2 + λ k u k 1 . ST A TISTICAL M UL TIRESOLUTION DANTZIG ESTIMA TION IN IMAG ING 11 This is the we ll kno wn le ast absolute shrinkage and sele ction op er ator (LASSO) estimator [43]. F or the classica l Dan tzig s elector, one chooses S = { 1 , . . . , p } and defines for S ∈ S the w eigh t ω S = K χ { S } . Hence, the subpr oblem (12) in this case consists in the orthonormal pro jection of Y k = Y + λp k − 1 − K u k − 1 on to the set C = v ∈ R m : X 1 ≤ j ≤ m ω S j v j ≤ q for 1 ≤ S ≤ p . The implications of Theorem 2.4 in the pr esen t case are in general r ather weak. If the saddle p oint ˆ u is kno wn to b e S -sparse and when K r estricted to the sup p ort of ˆ u is injectiv e, then it can b e sho wn that | u [ τ ] − ˆ u | 1 = O ( τ ). W e fin ally note that for this particular situation a slightly differen t decomp osition than prop osed in (2.2) is fa v orable. T o b e more pr ecise, defin e ˜ K = K T K and ˜ Y = K T Y and consider J ( u ) + ˜ G ( v ) sub ject to ˜ K u − v = ˜ Y . where ˜ G is the charact eristic function on the set { v ∈ H : k v k ∞ ≤ q } . Algorithm 1 applied to th is mo dified decomp osition then resu lts in the ADMM as in tro duced in [33]. In this case the pro jection in step (12) has a closed fr om. 2.3. The Pro jection Problem. Algorithm 1 resolv es the constrained con v ex optimization problem (7) in to a quadratic program (12) and an un constrained op timization pr oblem (13). The quadratic program (12) in the k -th step of Algorithm 1 can b e written as a p ro jection: inf v ∈ H k v − Y k k 2 sub ject to µ S ( v ) ≤ q ∀ ( s ∈ S ) (14) where Y k = Y + λp k − 1 − K u k − 1 . W e reformulate the side conditions to v ∈ C = \ S ∈S C S where C S = { v ∈ H : µ S ( v ) ≤ q } . (15) The sets C S are closed and conv ex and pr ob lem (14) thus amounts to compute the pr o jection P C ( Y k ) of Y k on to the intersect ion C of closed and con v ex sets. According to this int erpretation, w e use Dykstra’s pro jection algorithm as in tro duced in [5] to solve (14). This algorithm take s an element v ∈ H and con v ex sets D 1 , . . . , D M ⊂ H as argum en ts. It then creates a sequen ce con v erging to the pro jection of v onto the intersect ion of the D j b y successiv ely p erf orm ing pro jections onto ind ividual D j ’s. T o this end, let P D ( · ) denote the pr o jection on to D ⊂ H and S D = P D − Id b e the corresp onding pro jection step. Dykstra’s metho d is summarized in Algorithm 2. A natur al explanation of the algo rithm in a p rimal-dual framew ork as w ell as a pro of th at the sequence { h ( M , k ) } k ∈ N con v erges to P D ( h ) in n orm can b e foun d in [13, 23]. F or the case when D constitutes a p olyhedron even explicit error estimates are at hand (cf. [46]): Theorem 2.7. L et { h k } k ∈ N b e the se quenc e gener ate d by Algor ithm 2 and P D ( h ) b e the pr oje c tion of the input h onto D . Then ther e exist c onsta nts ρ > 0 and 0 ≤ c < 1 such that for al l k ∈ N k h k − P D ( h ) k ≤ ρc k . Remark 2.8. The constan t c in creases with the num b er M of con v ex sets wh ic h int ersection form the set D that h is to b e p ro jected on. T he conv ergence rate therefore imp r o v es with decreasing M . F or f u rther details and estimates for the constants ρ and c , w e refer to [46]. 12 ST A TISTICAL M UL TIRESOLUTION DANTZIG ESTIMA TION IN IMAG ING Algorithm 2 Dykstra’s Algorithm Require: h ∈ H (data), D 1 , . . . , D M ⊂ H (closed and con v ex sets) Ensure: A sequence { h k } k ∈ N that con v erges strongly to P D ( h ) where D = T j =1 ,...,M D j h 0 , 0 ← h for j = 1 to M do h 0 ,j ← P D j ( h 0 ,j − 1 ) and Q 0 ,j ← S D j ( h 0 ,j − 1 ) end for h 1 ← h 0 ,M and k ← 1 for k ≥ 1 do h k , 0 ← h k for j = 1 to M do h k ,j ← P D j ( h k ,j − 1 − Q k − 1 ,j ) and Q k ,j ← S D j ( h k ,j − 1 − Q k − 1 ,j ) end for h k +1 ← h k ,M and k ← k + 1 end for Note th at application of Dykstr a’s algo rithm is particularly app ealing if the pro jections P D j can b e easily computed or ev en stated explicitly , as it is the case in th e follo wing examples. Example 2.9. Assu me that Λ = Id. Then the sets C S are the rectangular cylinders C S = v ∈ H : ω S , v ≤ q . The pro jection can therefore b e explicitly computed as P C S ( v ) = v − sign ω S , v ω S k ω S k |h ω S ,v i| k ω S k − q if µ S ( v ) > q v else . The left image in Figure 1 depicts an example for C for H = R 2 . F or a d etailed geometric in terpretation of the MR-statistic we also refer to [36]. Example 2.10. Assume that Λ( v ) ν = v 2 ν . T h en, it follo ws that v 7→ ω S , Λ( v ) is con v ex if and only if ω S ν ≥ 0 for all ν ∈ X . In this case, the sets C S are elliptic cylinders C S = ( v ∈ H : X ν ∈ X ω S ν v 2 ν ≤ q ) . Moreo v er, if ω S ν ∈ { 0 , 1 } for all ν ∈ X , then the pr o jection P C S can b e exp licitly compu ted as P C S ( v ) = ( q h ω S , Λ( v ) i v χ { ω S =1 } + v χ { ω S =0 } if µ S ( v ) > q v else . The righ t image in Figure 1 depicts an example of C for H = R 2 . A firs t approac h to use Dykstra’s algorithm to solve (14) is to set M = # S and iden tify D j with C S for all j = 1 , . . . , M . In view of Remark 2.8, h o w ev er, it is clearly desirable to decrease the num b er M of con v ex sets that enter Dykstra’s algorithm. In order to d o so, w e tak e a slightl y m ore soph isticated appr oac h than the one just p resen ted. W e partition the set ST A TISTICAL M UL TIRESOLUTION DANTZIG ESTIMA TION IN IMAG ING 13 P S f r a g r e p l a c e m e n t s Y k P C ( Y k ) C C S ω S k ω S k q ω S k ω S k P S f r a g r e p l a c e m e n t s Y k P C ( Y k ) C C S ω S k ω S k q ω S k ω S k Figure 1. Admissible set C for the p ro jection problem (14) as in Example 2.9 (left) and Example 2.10 (righ t). S into S 1 , . . . , S M suc h that for all S 6 = ˜ S ∈ S j ω S ⊥ ω ˜ S and ∂ ∂ v ν Λ( · ) ˜ ν ≡ 0 , ∀ ( ν ∈ S, ˜ ν ∈ ˜ S ) (16) and regroup { C S } s ∈S in to { D 1 , . . . , D M } with D j = \ S ∈S j C S . (17) Giv en the pro jections P C S , the pro jection on to D j can b e easily computed: F or v ∈ H identify the set V j = { S ∈ S j : µ S ( v ) > q } of indices for whic h v violates th e side condition (15) and set P D j ( v ) = v − X S ∈ V j ( P C S − Id) v T o kee p M small, w e c ho ose S 1 ⊂ { 1 , . . . , N } as th e b iggest set such that (16) holds for all S , ˜ S ∈ S 1 . W e then choose S 2 ⊂ S \ S 1 with the same prop erty and contin ue in this w a y unt il all indices are utilized. While this pro cedur e do es n ot necessarily result into M b eing minimal with th e d esir ed p rop erty , it still yields a distinct redu ction of N in man y practical situations. W e will illustrate this appr oac h for SMREs in imaging in S ection 3. 3. Applica tions In this section w e will illustrate the capabilit y of Algorithm 1 for computin g S MREs in some practical situ ations: in Section 3.1 w e will study a simp ly one-dimensional regression problem as it w as also studied in [12], yet with a d ifferen t p enalt y fun ction J . In Section 3.2 w e illustrate how SMREs p erforms in image denoising. I n b oth cases w e compare our results to other metho ds. Finally , we will apply the SMRE tec hnique to th e problem of image deblurrin g in confo cal fl uorescence microscop y in Section 3.3. Before w e s tudy the aforemen tio ned examples, w e clarify some common n otation. W e will henceforth assume that U = H = ( R m ) d with d = 1 (Section 3.1 ) and d = 2 (Sections 3.2 and 3.3), resp ectiv ely . Moreo v er, we will emplo y gradient based regularization functionals of the form J ( u ) = T V p ( u ) := 1 p X ν ∈ X | D u ν | p 2 with p ∈ { 1 , 2 } (18) 14 ST A TISTICAL M UL TIRESOLUTION DANTZIG ESTIMA TION IN IMAG ING where |·| 2 is the Euclidean n orm in R d and D den otes the forwa rd difference op erator defined b y (D u ν ) i = ( u ν + e i − u ν if 1 ≤ ν i ≤ n − 1 0 else. F or the case p = 2 the minimization problem (13) amounts to solv e an imp licit time step of the d -dimen s ional diffusion equation with initial v alue ( Y + λp k − 1 − v k ) and time step size λ . This can b e solved by a simple (sparse) matrix inv ers ion. F or the case p = 1, T V 1 is b etter known as total-variatio n semi-norm . It wa s shown in [34] (see also [24] f or similar resu lts in th e con tin uous setting) that the taut-string algorithm (as introdu ced in [10]) constitutes an efficien t solution metho d for (13) in the case d = 1. In the general case d ≥ 1, we employ the fixed p oin t app r oac h for solving the Euler-Lagrange equations for (13) describ ed in [14] (see also [44, Chap. 8.2.4]). W e finally n ote that the functional T V 1 fails to b e differentiable; a fact t hat leads to serious numerical p r oblems when trying to compute the Euler-Lagrange conditions for (13). Hence, we w ill use in our sim ulations a regularized ve rsion of T V 1 defined b y T V β 1 ( u ) = X ν ∈ X q (D u ν ) 2 i + β 2 (19) for a small constan t β > 0. Evaluation. In order to ev aluate the p erf ormance of S MREs, we will emp lo y v arious distance measures b etw een a n estimator ˆ u and the true sig nal u 0 . On the one hand , w e will use standard measures suc h as me an inte gr ate d squar e d err or (MISE) and the me an inte gr ate d absolute e rr or (MIAE) whic h are giv en by MISE = E 1 m d X ν ∈ X ( ˆ u ν − u 0 ν ) 2 ! and MIAE = E 1 m d X ν ∈ X ˆ u ν − u 0 ν ! , resp ectiv ely . On the other hand, w e also in tend to measure ho w well an estimator ˆ u matc hes the “smo othness” of the tru e signal u 0 , where smo othness is c haracte rized b y the regulariza- tion functional J . T o this end , w e introd uce the symmetric Br e gman diver genc e D sym J ( ˆ u , u 0 ) = 1 m d X ν ∈ X ∇ J ( ˆ u ) ν − ∇ J ( u 0 ) ν ˆ u ν − u 0 ν , where ∇ J denotes the gradien t of the regularization f unctional J . Clearly , D sym J ( ˆ u, u 0 ) is symmetric and since J is assumed to b e conv ex, also non-negativ e. How ever, the symmetric Bregman divergence u s ually do es not satisfy th e triangle inequalit y and hence in general do es not defin e a (semi-) metric on U [9]. The follo wing examples shed some ligh t on how the Bregman div ergence incorp orates the fu nctional J in order to measur es the distance of ˆ u and u 0 . Example 3.1. Let J ( u ) = T V p as in (18). i) If p = 2 , then D sym J ( ˆ u, u 0 ) = X ν ∈ X D ˆ u ν − D u 0 ν 2 2 . In other words, the symmetric Bregman distance w.r.t. to T V 2 is th e mean squ ared distance of the derivatives of ˆ u and u 0 . ST A TISTICAL M UL TIRESOLUTION DANTZIG ESTIMA TION IN IMAG ING 15 ii) If p = 1, then D sym J ( ˆ u , u 0 ) = 1 m 2 X ν ∈ X D ˆ u ν | D ˆ u ν | − D u 0 ν | D u 0 ν | · D ˆ u ν − D u 0 ν = 1 m 2 X ν ∈ X | D ˆ u ν | + D u 0 1 − D ˆ u ν | D ˆ u ν | · D u 0 ν | D u 0 ν | = 1 m 2 X ν ∈ X | D ˆ u ν | + D u 0 (1 − cos γ ν ) , where γ ν denotes the angle b et w ee n the level lines of ˆ u and u 0 at the p oint x ν . Pu t differen tly , the symm etric Bregman diverge nce w .r.t the total v ariat ion semi-norm T V 1 is small if for su ffi cien tly many p oin ts x ν either b oth ˆ u and u 0 are constan t in a neigh- b orho o d of x ν or the lev el lines of ˆ u and u 0 through x ν are parallel. In pr actice r ather T V β 1 in (19) (for a small β > 0) instead of T V 1 is used in order to a v oid singularities. Then, the ab o v e form ulas are sligh tly more complicated. W e will use the m ean symmetric Bregman dive rgence (MSB) giv en by MSB = E D sym J ( ˆ u, u 0 ) as an additional ev aluation metho d . In all our simulations w e app ro ximate the exp ectations ab o v e by the empirical means of 500 trials. Comp arison with other metho ds. W e will compare the SMREs to other regression metho d s. Firstly , w e will consider estimators obtained by the glob al p enalized least squares metho d: ˆ u ( λ ) := argmin u ∈ H 1 2 X ν ∈ X ( u ν − Y ) 2 + λJ ( u ) , λ > 0 . (20) In particular, we fo cus on estimators ˆ u ( λ ) that are closest (in some sense) to the tr u e f u nction u 0 . W e call suc h estimators or acles . W e d efine the L 2 - and Bregman-oracl e by ˆ u L 2 = ˆ u ( λ 2 ) and ˆ u B = ˆ u ( λ B ), where λ 2 := E argmin λ> 0 u 0 − ˆ u ( λ ) and λ B := E argmin λ> 0 D sym J ( u 0 , ˆ u ( λ )) resp ectiv ely . Of course, oracles are not a v ailable in practice, since the true signal u 0 is unknown. Ho w ev er, they repr esent ideal instances within the class of estimators giv en by (20 ) that usually p erform b etter than any data-driv en parameter choice s (suc h as cross-v alidat ion) and hence ma y serv e as a reference. Secondly , w e also compare our approac h to adaptive weights smo o thing (A WS) [40] which constitutes a b enc hmark tec h nique for data-driv en, spatially adaptiv e regression. W e compute these estimators by means of the official R-pac k age 2 and denote them b y ˆ u k er a ws , where k er ∈ { Gaussian , T riangle } d eco des the shap e of the un d erlying regression k ernel. 2 a v ailable at http://cran.r- project.org /web/packages/aws/index.html 16 ST A TISTICAL M UL TIRESOLUTION DANTZIG ESTIMA TION IN IMAG ING 3.1. Non-parametric Regression. In this section we app ly the SMRE tec hnique to a non- parametric regression pr ob lem in d = 1 dimens ions, i.e. th e noise mo d el (1) b ecomes Y ν = u 0 ν + ε ν ν = 1 , . . . , m, (21) where w e assume that ε ν are indep enden tly and n ormally distribu ted r.v. with E ( ε ν ) = 0 and E ε 2 ν = σ 2 . T he u pp er left image in Figure 2 depicts the true signal u 0 (solid line) and the data Y , with m = 1024 and σ = 0 . 5. Th e application we ha v e in mind with this example arises in NMR sp ectroscop y , where th e NMR sp ectra p ro vide stru ctur al information on the n um b er and t yp e of chemical entiti es in a m olecule. In this conte xt, we suggest to choose J = T V 2 , since the tru e signal u 0 is rather smo oth (see [12] for examples where J is c hosen to b e the total v ariatio n of th e first and second deriv at iv e). Finally , w e discuss the MR-statistic T in (4). W e c ho ose Λ = Id and the index set S to consist of all discrete in terv als with s id e lengths ranging fr om 1 to 100. F or an in terv al S ∈ S w e set ω S = (# S ) − 1 / 2 χ S . T h us, eac h SMRE solv es the constrained optimization p roblem inf u ∈ U T V 2 ( u ) s.t. 1 √ # S X ν ∈ S ( Y − u ) ν ≤ q ∀ ( S ∈ S ) . (22) W e choose q to b e the α -quantil e of the MR-statistic T , that is q α = inf ( q ∈ R : P max S ∈S 1 √ # S X ν ∈ S ε ν ≤ q ! ≥ α ) α ∈ (0 , 1) . (23) W e n ote that except for few sp ecial cases (cf. [28, 30]) closed form expressions f or the dis- tribution of the MT-statistic T a re usually not at hand . In pr actice one rather considers the empirical distribution of T wh ere the v ariance σ 2 can b e estimated at a rate √ md (cf. [38]). W e will henceforth d en ote b y ˆ u α a solution of (22) with q = q α . As argued in Section 1.1, ˆ u α is smo other (i.e. has smaller v al ue T V 2 ) than the true signal u 0 with a pr obabilit y of at least α wh ile it satisfies th e constrain t that the multiresolution statistic T do es not exceed q α . This is a sound statistical interpretation of the regularization parameter α . Numeric a l r esults and simulations. In Figure 2 the oracles ˆ u L 2 and ˆ u B , the A WS-estimat ors ˆ u T riangle a ws and and ˆ u Gauss a ws as w ell as the SMRE ˆ u 0 . 9 , ˆ u 0 . 75 and ˆ u 0 . 5 are d epicted. It is evident th at the SMRE matc hes the smo othness of the tru e ob ject m uc h b etter th an the other estimators while the essen tial features of the signal (su ch as p eak location and p eak heigh t) are p r eserv ed. In particular, almost no additional lo cal extrema are generated by our app roac h whic h sta ys in ob vious con trast to th e other metho ds. Moreo ver, we p oint out that the SMRE are quite robust w.r.t. the c hoice of the confid ence lev el α . W e v erify this b eha vior by a simulatio n stud y in T able 1. F or differen t n oise leve ls ( σ = 0 . 1 , 0 . 3 and 0 , 5) we compare the MISE, MIAE and MSB. Additionally , we compu te the me an numb er of lo c al maxima (MLM) of ˆ u relativ e to the num b er of lo cal maxima in u 0 (whic h is 11). Here ˆ u is an y of the ab o v e estimators. Note that th e latter measure (similar to the MSB) tak es into account the smo othness of th e estimators where a v alue MLM ≫ 1 in dicates to o man y lo cal maxima and hence a lac k of r egularity wher eas MLM < 1 implies sev ere o v ersmoothing. As it becomes a pparent from T able 1 , the SMREs are p erforming similarly w el l when compared to the reference estimators as f ar as the s tand ard measures MISE and MIAE are concerned. F or small noise leve ls ( σ = 0 . 1) SMREs ev en p ro v e to b e sup erior. The distance measures MS B and MLM, ho w ev er, are significantly smaller f or S MREs which indicates that ST A TISTICAL M UL TIRESOLUTION DANTZIG ESTIMA TION IN IMAG ING 17 0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1 −2 0 2 4 6 8 10 0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1 −2 0 2 4 6 8 10 0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1 −2 0 2 4 6 8 10 0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1 −2 0 2 4 6 8 10 0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1 −2 0 2 4 6 8 10 0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1 −2 0 2 4 6 8 10 0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1 −2 0 2 4 6 8 10 0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1 −2 0 2 4 6 8 10 Figure 2. Row-wise from top left: tru e signal u 0 (solid) w ith data Y , oracles ˆ u L 2 and ˆ u B , A WS estimators ˆ u T riangle a ws and ˆ u Gaussian a ws , and the SMREs ˆ u 0 . 9 , ˆ u 0 . 75 and ˆ u 0 . 5 . these m eet the smo othness of the true ob ject u 0 m uc h b etter than the reference estimators (cf. Example 3.1 i)). All in all, the sim ulation results confirm our visual im p ressions ab o v e. Implementation Details. The curr ent index set S results in an o v erall num b er of constrain ts in (22) of # S = 100 X i =1 (1024 − i + 1) = 97450 . As p oin ted out in Secti on 2.3, the efficiency of Dykstra’s Algorithm can b e in creased by grouping ind ep endent side-conditions, that is side-conditions corresp onding to interv als in S with empty in tersectio n. F or example, the system S can b e grou p ed suc h that the intersecti on of the corresp ond ing sets D 1 , . . . , D M in (17) form C with M = 100 X i =1 i = 5050 . 18 ST A TISTICAL M UL TIRESOLUTION DANTZIG ESTIMA TION IN IMAG ING σ = 0 . 1 σ = 0 . 3 MISE MSB MIAE MLM MISE MSB MIAE MLM ˆ u L 2 0.009 0.008 0.071 11.881 0.046 0.027 0.156 10.915 ˆ u B 0.009 0.007 0.070 11.700 0.048 0.026 0.149 10.359 ˆ u T riangle a ws 0.007 0.007 0.048 2.551 0.040 0 .035 0.112 3.053 ˆ u Gauss a ws 0.054 0.040 0.068 1.971 0.062 0 .041 0.107 2.230 ˆ u 0 . 9 0.008 0.004 0.047 1.336 0.056 0 .019 0.127 1.273 ˆ u 0 . 75 0.007 0.004 0.044 1.342 0.050 0 .018 0.121 1.290 ˆ u 0 . 5 0.007 0.004 0.043 1.366 0.046 0 .017 0.116 1.290 σ = 0 . 5 MISE MSB MIAE MLM ˆ u L 2 0.091 0.037 0.213 9.860 ˆ u B 0.094 0.036 0.206 9.135 ˆ u T riangle a ws 0.078 0.058 0.162 3.141 ˆ u Gauss a ws 0.079 0.043 0.149 2.330 ˆ u 0 . 9 0.134 0.034 0.207 1.194 ˆ u 0 . 75 0.120 0.032 0.196 1.241 ˆ u 0 . 5 0.109 0.030 0.186 1.238 T able 1. Sim ulation stud ies for one dimensional p eak data set. In all our sim ulations we set τ = 10 − 4 and λ = 1 . 0 in Algorithm 1 wh ic h resu lts in k [ τ ] ≈ 100 iterations and an o v erall computation time of appro ximately 20 minutes for eac h SMRE. W e note, ho w ev er, that more than 95% of the computation time is needed for the pro jection step (12) and that a considerable sp eed u p for the latter could b e ac hiev ed by p arallelizat ion. 3.2. Image denoising. In this section we apply the SMRE tec hnique to th e p roblem of image denoising, th at is non-parametric regression in d = 2 dim en sions. I n other w ords, we consider the noise mo del (21) as in Section 3.1, w here the ind ex ν ranges o v er the discrete square { 1 , . . . , m } 2 . I n Figure 3 tw o typical examples for images u 0 and noisy observ atio ns Y are depicted ( m = 512 and σ = 0 . 1, where u 0 is scaled b et w een 0 (b lac k) and 1 (white)). W e will use th e total-v ariation semi-norm J = T V β 1 as r egularization functional ( β = 10 − 8 ). Moreo v er, we c hoose Λ to b e defined as Λ( v ) ν = v 2 ν , ∀ ( ν ∈ 1 , . . . , m 2 ) . (24) The index set S is defi ned to b e the collectio n of all discrete squares with side lengths r anging from 1 to 25 and we set ω S = c S χ S with y et to b e defin ed constants c S . Thus, eac h SMREs solv es the constrained optimization p r oblem inf u ∈ U T V β 1 ( u ) s.t. X ν ∈ S c S ( Y − u ) 2 ν ≤ q ∀ ( S ∈ S ) . (25) W e agree up on q = 1 and sp ecify the constan ts c S . T o this end, compute for s = 1 , . . . , 25 the quanti le v a lues q α,s = inf q ∈ R : P max S ∈S # S = s X ν ∈ S ε 2 ν ≤ q ≥ 1 − α α ∈ (0 , 1) and set c S = q − 1 α, # S . In other words, the defin ition of c S implies that the true signal u 0 satisfies the constraints in (25) for squar e s of a fixe d side length s with prob ab ility at least α . W e will henceforth d en ote b y ˆ u α a solution of (25). W e remark on this particular c hoice of the parameters ω S b elo w. ST A TISTICAL M UL TIRESOLUTION DANTZIG ESTIMA TION IN IMAG ING 19 Figure 3. Standard test images “cameraman” (to p) and “ro of ” (b ottom) and their noisy count erparts. Numeric a l r esults and simulations. In Figures 4 and 5 the oracles ˆ u L 2 and ˆ u B , the A WS- estimators ˆ u T riangle a ws and ˆ u Gauss a ws as well as the SMRE ˆ u 0 . 9 are dep icted (for the “cameraman” and “ro of ” test image resp ectiv ely) . It is rather obvio us that th e L 2 -oracles are not f a v orable: although r elev an t d etails in the image are preserve d, smooth parts (as e.g. the sky) still con tain rand om structures. In con trast, the estimator ˆ u Gauss a ws preserve s smo oth areas bu t lo oses essen tial details. T he a ws-estimator with triangular ke rnel p erforms muc h b etter, how ever, it giv es p iecewise constant reconstructions of smo othly v arying p ortions of the image, wh ic h is clearly undesirable. Th e SMRE and the Bregman-oracle visually p erform su p erior to the other metho d s. The go o d p erformance of the Bregman-oracle indicates that the symmetric Bregman d istance is a goo d measure for comparing images. In contrast to the Bregman-oracle, the SMRE adapts the amount of smo othing to the underlying image structure: constant image areas are sm o othed nicely (e.g. sky p ortions), wh ile oscillating patterns (e.g. th e grass part in the “cameraman” image or the ro of tiles in the “roof ” image) are reco v ered. W e ev aluate the p erform an ce of the S MREs b y means of a sim ulation study . T o this end , w e compute the MIS E, MIAE and MSB and compare these v al ues with the reference estimators. W e note, ho w ev er, that in particular the MISE and MIAE are not well suited in ord er to measure the d istance of images for they are inconsistent with human eye p erception. In [45] the structur al similarity index (SSIM) was int ro du ced for image qualit y assessmen t that tak es in to accoun t luminance, con trast and structur e of the images at the same time. W e use the 20 ST A TISTICAL M UL TIRESOLUTION DANTZIG ESTIMA TION IN IMAG ING Figure 4. Reconstructions “cameraman” (ro w-wise fr om top left): L 2 -oracle ˆ u L 2 , Bregman-oracle ˆ u B , A WS estimators ˆ u T riangle a ws and ˆ u Gaussian a ws , and SMRE ˆ u 0 . 9 . author’s imp lemen tation 3 whic h is normalized suc h that the SSIM lies in the interv al [ − 1 , 1] and is 1 in case of a p erfect matc h. W e denote by MSSIM the emp irical mean of th e SSIM in our sim ulations. In T able 2 the simulat ion r esu lts are listed. A fi rst striking fact is the go o d p erformance of the L 2 -oracle w.r.t. th e MISE and MIAE w h ic h is supp osed to imply reconstruction prop erties 3 a v ailable at https://ww w.ece.uwate rloo.ca/ ~ z70wang/re search/ssim / ST A TISTICAL M UL TIRESOLUTION DANTZIG ESTIMA TION IN IMAG ING 21 Figure 5. Reconstructions “ro of ” (r ow-wise fr om top left): L 2 -oracle ˆ u L 2 , Bregman-oracle ˆ u B , A WS estimators ˆ u T riangle a ws and ˆ u Gaussian a ws , and SMRE ˆ u 0 . 9 . sup erior to th e other metho ds. Keepin g in mind the visual comparison in Figures 4 and 5 , ho w ev er, this is rather questionable. On the other hand, it b ecomes evident that the L 2 -oracle has a r ather p o or p erf ormance w.r.t. the MS B which is more su ited for measuring im age distances. It is therefore remark able that the SMRE p erforms equally go o d as the Bregman- oracle wh ic h, in con trast to the SMRE, is n ot accessible (since u 0 is usually u nknown). As f ar as the s tr uctural s im ilarity m easur e MSS IM is concerned our approac h pro v es to b e sup erior to all others. Finally , th e simulatio n results indicate that aws estimation is not fa v ourable for denoising of natural images. 22 ST A TISTICAL M UL TIRESOLUTION DANTZIG ESTIMA TION IN IMAG ING “cameraman” “roof ” MISE M SB MIAE MSSIM MISE MSB MIAE MSSIM ˆ u L 2 0.0017 0.0314 0.0276 0.7739 0.0029 0.0499 0. 0383 0.6700 ˆ u B 0.0023 0.0256 0.0275 0.7995 0.0038 0.0405 0. 0391 0.6607 ˆ u T riangle a ws 0.0032 0.0482 0.0308 0.7657 0.0046 0.0702 0. 0416 0.6205 ˆ u Gauss a ws 0.0046 0.0470 0.0360 0.7284 0.0053 0.0686 0. 0457 0.5668 ˆ u 0 . 9 0.0021 0.0252 0.0297 0.8024 0.0033 0.0374 0. 0407 0.7003 T able 2. S im ulation studies for the test images “camerama n” and “ro of ”. Notes on the choic e of Λ and ω S . In general, a pr op er choice of the transformation Λ and of the we igh t-functions ω S can b e ac hiev ed by in clud ing prior stru ctural in f ormation on the tru e image to b e estimated. S ubstantia l parts of natural images, such as photographs, consists of oscillating patterns (as e.g. fabr ic, woo d , hair, grass etc.). This b ecomes obvio us in the standard test images d epicted in Figure 3. W e claim that for signals that exhibit oscillating patterns, a quadratic transform ation Λ as in (24) is fa v orable, s in ce it yields (compared to the lin ear statistic studied in Section 3.1) a larger p ow er of the lo cal test statistic on s mall scales. In order to illustrate this, we simulate noisy observ atio ns Y of the test images u in Figure 3 as in (21) with σ = 0 . 1 and compute a glob al estimator ˆ u by computing a minimizer of th e R OF-functional (20) (with λ = 0 . 1). W e intend to examine how w ell o v er-smoothed r egions in ˆ u are detected b y the MR-statistic T ( Y − ˆ u ) as in (4) w ith t w o differen t av erage fu n ctions (cf. (5)) µ 1 ,S ( v ) = X ν ∈ S v ν and µ 2 ,S ( v ) = X ν ∈ S v 2 ν resp ectiv ely . F or the sak e of simplicit y w e restrict f or the m omen t our considerations on the in dex set S of all 5 × 5 sub-squ ares in { 1 , . . . , m } 2 . In Figure 6 the lo cal means µ i,S of the residuals v = Y − ˆ u for the “ro of ”-image are depicted. T o b e more p recise, the cen ter co ordinate of eac h square S ∈ S is colored according to µ i,S , Hence, large v alues indicate lo cations wh ere the estimator ˆ u is considered ov er -sm o othed according to the statistic. I t b ecomes visually clear that the lo caliz ation of o v ersmoothed r egions is b etter for µ 2 ,S . Th is is a goo d motiv at ion for incorp orating the lo cal means of the squared residu als in the SMRE mo del (7). Figure 6. L o cal means µ 1 ,S (left) and µ 2 ,S (righ t)) of the residuals for “ro of ” image. ST A TISTICAL M UL TIRESOLUTION DANTZIG ESTIMA TION IN IMAG ING 23 W e fi nally commen t on the c hoice of c S . Since ε ν are indep enden t and normally distribu ted random v a riables, the (scaled) a v erage fu nction σ − 2 µ S ( ε ) = X ν ∈ S ε ν σ 2 is χ 2 distributed with # S d egrees of freedom. Note that the distribu tion of σ − 2 µ S ( ε ) is iden tical only for s ets S of the same scale # S . As a consequence of this, it is lik ely th at certain scales dominate the suprem um in the MR-sta tistic T whic h spoils the multiscale prop er ties of our approac h. As a wa y out, we compute normalizing constan ts for e ach sc ale sep ar ately. An alternativ e approac h would b e to searc h for transformations that tur n µ S ( ε ) into almost iden tically distrib uted r andom v ariables. Logarithmic and p -ro ot transformations are often emplo y ed for this pur p ose (see e.g. [25]). This will b e inv estig ated separately . Implementation Details. The curr ent index set S results in an o v erall num b er of constrain ts in (25) of # S = 25 X i =1 (512 − i + 1) 2 = 62513 00 . Again by groupin g indep en d en t side-conditions, the system S can b e group ed such that the in tersection of the corresp onding sets D 1 , . . . , D M in (17) form C with M = 25 X i =1 i 2 = 5525 . In all our sim ulations we set τ = 10 − 4 and λ = 0 . 25 in Algorithm 1 which results in k [ τ ] ≈ 30 iterations and a ov erall computation time of app r o ximately 2 hour s f or eac h SMRE. Hence, parallelizat ion is clearly desirable in th is case. 3.3. Decon v olution. Another interesting class of problems w hic h c an be a pproac hed b y means of SMREs are decon v olution pr oblems. T o b e more precise, we assu me that K is a con v olutio n op erator, that is ( K u ) ν = ( k ∗ u ) ν = X m ∈ R d k ν − m u m where k is a sq u are-summable k ernel on the lattice Z d and u ∈ H is extended by ze ro-padding. W e w ill fo cu s on the situation where k is a circular Gauss ian k ernel with standard deviation σ giv en by k ν = 1 ( √ 2 π σ ) d e − P d i =1 ν 2 i 2 σ 2 . (26) With Z = Y + λp k − 1 + v k , the primal step (13) in Algorithm 1 amoun ts to s olve u k ← argmin u ∈ H 1 2 X ν ∈ X (( K u ) ν − Z ν ) 2 + λJ ( u ) , where w e c hoose J to b e as in (18) and apply the tec hniques describ ed in [44] for the n umerical solution. In order to illustrate the p erformance of our app r oac h in practical applications, we giv e an example from confo cal microscop y , no w ada ys a s tandard tec hnique in fluorescence m icroscop y 24 ST A TISTICAL M UL TIRESOLUTION DANTZIG ESTIMA TION IN IMAG ING (cf. [39]). When recording images with th is kind of microscop e, the original ob ject gets blurr ed b y a Gaussian k ernel (in first order). The observ ations (photon coun ts) can b e mo deled as a P oisson pro cess, i.e. Y ν = P oiss(( K u 0 ) ν ) , ν ∈ X . (27) The image depicted in Figure 7(a) shows a recordin g of a PtK2 cell tak en from th e kidney of p otor ous tridactyl us . Before the recording, the pr otein β -tub ulin w as tagg ed w ith a fluorescen t mark er su c h that it can b e traced by the microscop e. The im age in 7(a) sho ws an area of 18 × 18 µ m 2 at a r esolution of 798 × 798 pixel. The p oint spread function of the optical system can b e mo deled as a Gaussian kernel with f ull width at half maxim um of 230nm, whic h corresp ond s to σ = 4 . 3422 in (26). Note that (27) do es not fall immediately in to the range of mo dels co v ered by (1 ). W e will adapt th e p resen t situation to the SMRE metho dology d escrib ed in S ection 1 by standard- ization and consider instead of (3) the mo d ifi ed problem inf u ∈ U J ( u ) s.t. T Y − K u √ K u ≤ 1 (28) where the division is unders to od p oint w ise. Clearly , the problem of finding a solution of (28) is muc h more inv olved than solving (3) for the constraints b eing nonc onvex : fir stly , the functional G as defin ed in (9) is n on conv ex as a consequence of which the con v ergence result in Theorem 2.2 do es not apply and secondly Dykstra’s pro jection algorithm as describ ed in Section 2 cannot b e employ ed. W e p rop ose the follo wing ansatz in order to circum v en t th is problem: instead of pro jecting on to the int ersection C of s ets C S as describ ed in (15), w e no w pr o ject in the k -th step of Algorithm 1 on to C P [ k ] = N \ n =1 C P ,S [ k ] where C P ,S [ k ] = n v ∈ H : µ S v / p K u k ≤ q o . with a p oin t wise division by the square ro ot of K u k . Pu t differently , in the k -th step of Algorithm 1 we u se the p revious estimate u k of u 0 as a lagge d standar dization in order to appro ximate the constrain ts in (28) . In fact, we use p max( K u k , ε ) w ith a s m all num b er ε > 0 for standardization, in order to a v oid ins tabilities. W e note that while with this mo dification Dykstra’s algorithm b ecomes applicable again, the pro jection problem (12) n o w c hanges in eac h iteration step of Algorithm 1. As a conse- quence, Theorem 2.2 d o es not h old an ymore after this mo dification, either. So far, we h a v e not come up with a similar conv ergence analysis. W e compute the SMRE ˆ u 0 . 9 b y emplo ying Algorithm 1 with th e mo difi cations describ ed ab o v e. As in th e denoising examples in Section 3.2 the in dex set S consists of all squares with the side-lengths { 1 , . . . , 25 } and w e c ho ose ω S = χ S and Λ = Id. W e note, that this resu lts in an o v eral l n um b er of # S = 95 436 200 inequalit y constrain ts. The constan t q are c hosen as in (23), where we assum e that ε ν are indep end ent and standard n ormally distributed r.v. In Algorithm 1 w e set λ = 0 . 05 and compu te 100 steps. W e obser ve that after a f ew iterations ( ∼ 15) the error τ falls b elo w 10 − 3 and almost s tagnates thereafter, whic h is due to the fact that we d o not increase the accuracy in the subroutines for (13) and (12). Eac h iteration step in Algorithm 1 appro ximately takes 10 minutes, where 90% of the computation time is needed for (13). Th e result is depicted in Figure 7(b). The b enefits of our metho d are tw ofo ld: ST A TISTICAL M UL TIRESOLUTION DANTZIG ESTIMA TION IN IMAG ING 25 (a) Fluorescence microscop y data of a PtK2 cell in p otor ous tridacty- lus k idney . The bright filaments indicate the location of the protein β - t ubulin. (b) SMRE ˆ u 0 . 9 : fully automated and lo cally adaptive deconvo lution of microscopy data. Figure 7. Reconstru ction of confo cal microscop y d ata. i) Th e amount of regularizatio n is c hosen in a c ompletely automatic way . T he only param- eter to b e selected is the lev el α in (23). Note that the parameter λ in Algorithm 1 has no effect on the output (though it has an effect on th e num b er of iterations needed and the numerical stabilit y). ii) Th e reconstruction has an app ealing lo cally adaptive b eha vior w hic h in the present ex- ample mainly concerns the gaps b et ween th e protein filamen ts: w hereas th e marked β -tubulin is concen trated in r egions of basically one scale, th e gaps in b et w een actually mak e up the multisca le n atur e of the image. (a) STED microscopy recording of the PtK2 cell data set. (b) Detail comparison b etw een confocal recording (left), SMRE ˆ u 0 . 9 (middle) and STED reco rding (right). Figure 8. Comparison with h igh-resolution STED microscop y data. 26 ST A TISTICAL M UL TIRESOLUTION DANTZIG ESTIMA TION IN IMAG ING In th e p resent situation w e are in the comfortable p osition to ha v e a r eference image at hand b y means of whic h we can ev aluate the result of our metho d: STED (ST im ulated Emission Depletion) m icroscopy constitutes a relativ ely new metho d , that is capable of r ecording image s at a physical ly 5-10 times higher resolution as confo cal microscop y (see [26 , 27]). Hence a STED image of this ob ject ma y serve as “gold standard” reference image. Figure 8(a) depicts a STED r ecording of the PtK2 cell data set in Figure 7(a). The compari- son of the SMRE ˆ u 0 . 9 with the ST ED recording in Figure 8(b) s ho ws that our SMRE tec hnique c ho oses a reasonable amount of regularization: no artifacts du e to un der-regularization are generated and on the other hand almost all r elev an t geometrical features th at are present in the h igh-resolution S TED r ecording b ecome visible in the r econstruction. In particular, we note that filament bifurcations (one su c h bifur cation is mark ed by a black b o x in Figure 8(b) ) b ecome apparent in our reconstru ction th at are not visible in the recorded d ata. Finally , we men tion that aside to standardization, other transform ations of the P oisson data (27) could p ossibly b e consid ered. F or example Ansc omb e’s tr ansformation is known to yield reasonable appr o ximations to norm alit y eve n for lo w Poisson-in tensitie s and hence has a particular app eal for e.g. microscop y d ata with lo w p h oton-coun ts. W e are currentl y in v estigat ing S MREs that emplo y Ans combe’s transform, w here in particular the arising pro jection problems are c hallenging. 4. Conclusion and Outlook In this w ork, we p rop ose a general estimation tec hnique f or nonparametric in v erse regres- sion problems in the white noise mo del (1) b ased on the con v ex program (7). It amoun ts to finding a minimizer of a con v ex regularization functional J ( u ) ov er a set of feasible estimators that satisfy the fi delt y condition T ( Y − K u ) ≤ q , where T is assumed to b e the maxim um o v er simple con v ex constraints and q is s ome quant ile of the statistic T ( ε ). Any suc h min i- mizer w e call statistic al multir esol ution estimator (SMRE) . Th is app roac h co v ers we ll kno wn uni-scale tec hniques, suc h as the Dan tzig selector, bu t with a v ast field of p oten tially n ew application areas, suc h as lo cally adaptiv e imaging. The particular app eal of the multi-sca le generalizat ion arises for those situations w h ere a “neighb orin g relationship” w ithin the signal can b e emplo y ed to gain additional information by “a v eraging” neighborin g residuals. W e demonstrate in v a rious examples that this impro v emen t is drastic. W e app roac h the n u merical solution of the problem by the ADMM (cf. Algorithm 1) that decomp oses th e problem in to tw o su bproblems: A J -p enalized least squares problem, indep en den t of T , and an orthogonal p r o jection pr oblem on to the f easible set of (7) that is indep en den t of J . Th e first problem is we ll studied and for m ost typical choic es of J fast and reliable n umerical app roac hes are at hand. The p ro jection pr oblem, h o w ev er, is computa- tional demandin g, in particular for image d en oising applications. W e prop ose Dykstra’s cyclic pro jection metho d f or its appro ximate solution. Finally , b y extensiv e numerical exp erimen ts, w e illustrate the p erformance of our estimation scheme (in n onparametric regression, image denoising and deblurr ing p roblems) and the applicabilit y of our algorithmic app roac h. Summarizing, this pap er is mean t to in tro duce a n o v el class of statistical estimators, to pro vide a general algorithmic ap p roac h for their n umerical computation and to ev aluate their p erform an ce by n umerical sim ulations. The inheren t questions on th e asymptotic b ehavi our of these estimators (such as consistency , con v ergence rates or oracle inequalities) r emain —to a large exten t— unans wered. This op ens an interesting area for future researc h. ST A TISTICAL M UL TIRESOLUTION DANTZIG ESTIMA TION IN IMAG ING 27 A first attempt has b een made in [21] where it is assumed that the mo del s pace U ∋ u 0 is some Hilb ert-space of real v alued fu nctions on some domain Ω and that K : U → L 2 (Ω) is linear and b ound ed. The error mo d el (1) then has to b e adapted accordingly . When Y is a Gaussian pro cess on L 2 (Ω) with mean K u 0 and v ariance σ 2 > 0, consistency and conv ergence rates for SMREs as σ → 0 + ha v e b een p r o v ed in [21] for the case w hen Λ = Id. How ever, in ord er to extend these results to the present setting, one w ould rather w ork w ith a discrete sample of K u 0 on the grid X and then consid er the case wh en the num b er of observ ations N = md tends to infi nit y . The previous analysis in [21] in dicates t w o ma jor asp ects that ha v e to b e considered in the asymp totic analysis for SMREs: (a) As N → ∞ usu ally the cardinalit y of the index set S (and hence of the set of we igh t functions W ) gets unb ound ed . Th us, the mutliresolutio n statistic T ( ε ) = T N ( ε ) in (4) is likely to degenerate u n less it is pr op erly normalized and W satisfies some entr opy c ondition . In the linear case (Λ = Id) w e utilized a result from [17] that guarant ees a.s. b oun d edness of T N ( ε ). (b) I n order to d eriv e c on v ergence rates (or risk b ou n ds) it is w ell kn o wn that the true signal u 0 has to satisfy some apr iori regularity conditions. When using general con v ex regularization functionals J , this is usually expressed by the sour c e c ondition K ∗ p 0 ∈ ∂ J ( u 0 ) , for some p 0 ∈ L 2 (Ω) . Here K ∗ denotes the adjoint of K and ∂ J the (generalized) d eriv ativ e of J . F or example, if J ( u ) = 1 2 k u k 2 , then this conditions means that u 0 ∈ ran( K ∗ ). It wo uld b e of great in terest to transf er and extend th e results in [21] to the pr esen t situation. It is to b e exp ected that (a) and (a) ab o v e are necessary assumptions for this pu rp ose. As stressed by the referees, other extensions are of in terest and will b e p ostp oned to futu re w ork. In cont rast to imaging, in m an y other app lications the design X is random, rather than fixed. In these situations an ob vious w a y to extend our algorithmic framew ork w ould b e to select suitable partitions S according to the design densit y , i.e. with finer r esolution at lo cations with a high concen tration of design p oin ts. It also remains an op en issue how to extend the SMRE metho dology to d ensit y estimation rather than regression, in p articular in a decon v olution setup. F or d = 1 a first step in this direction h as b een tak en in [11] and it will b e of great interest to exp lore whether our approac h allo w this to b e extended to d ≥ 2. A cknowledgement K.F. and A.M. are su pp orted by the DF G-SNF Research Group F OR916 Statistic al R e gu- larization and Qu alitative c onstr aints . P .M is supp orted by the BMBF p ro ject 03MUP AH6 INVERS . A.M and P .M. are s upp orted by the SFB755 Photonic Imaging on the Nanosc ale and the SFB803 F unc tionality Contr o l le d b y Or ganizat ion i n and b etwe en Membr anes . W e thank S. Hell, A. Egner and A. Schoenle (Departmen t of NanoBiophotonics, Max Planc k Ins titute for Biophysical C hemistry , G¨ ottingen) for p ro viding the m icroscop y data and L. D ¨ u m bgen (Univ ersit y of Bern) for stimulat ing discussions. Finally , we are grateful to an asso ciate editor and to an anonymous referee for their helpful commen ts. Appendix A. Proofs In this section we shall giv e th e pr o ofs of T heorems 2.2 and 2.4 as w ell as Corollary 2.5. W e note, that con v ergence of Algorithm 1 is a classical su b ject in optimization theory and a pr o of can e.g. b e f ound in [20, Chap. I I I Thm. 4.1]. Ho w ev er, in ord er to apply these results, it is 28 ST A TISTICAL M UL TIRESOLUTION DANTZIG ESTIMA TION IN IMAG ING necessary that certain regularit y conditions f or J hold, that are not r ealistic for our purp oses (as e.g. in the case of total-v ariation regularization). The assertions of Theorems 2.2 and 2.4 are mo difications of the standard r esults. Moreo v er, w e w ill allo w for appr oximate solution of the su b problems (12) and (13). T o this end, we rewrite these t w o subp roblems as v a riational in equalities, i.e. giv en ( u k − 1 , v k − 1 , p k − 1 ) w e find ( u k , v k , p k ) suc h that G ( v ) − G ( v k ) + λ − 1 h K u k − 1 + v k − Y − λp k − 1 , v − v k i ≥ − ε k , ∀ v ∈ H (29a) J ( u ) − J ( u k ) + λ − 1 h K u k + v k − Y − λp k − 1 , K u − K u k i ≥ − δ k , ∀ u ∈ U (29b) p k = p k − 1 − ( K u k + v k − Y ) /λ, (29c) where we assume that { ε 1 , ε 2 , . . . } and { δ 1 , δ 2 , . . . } are given sequen ces of p ositiv e n um b ers. Note that (29a) implies that G ( v k ) = 0 and hence v k ∈ C and that for ε k = δ k = 0 (29a) and (29b) are equiv alent to (12) and (13), resp ectiv ely . Finally , w e remind the reader of the defin ition of the sub differ ential (or generalized deriv- ativ e) ∂ F of a conv ex fun ction F : V → R on a real Hilb ert-sp ace V : ξ ∈ ∂ F ( v ) ⇔ F ( w ) ≥ F ( v ) + h ξ , w − v i V ∀ ( w ∈ V ) . If ξ ∈ ∂ F ( v ), then ξ is called sub gr adient of F at v . It follo ws fr om [19, Ch ap I I I, Prop. 3. 1 and Prop . 4.1] that the Lagrangian L (and hence also th e augmented Lagrangian L λ ) has a saddle-p oint ( ˆ u, ˆ v , ˆ p ) ∈ U × H × H if and only if K ˆ u + ˆ v = Y , K ∗ ˆ p ∈ ∂ J ( ˆ u ) and ˆ p ∈ ∂ G ( ˆ v ) . (30) W e will henceforth assume that { ( u k , v k , p k ) } k ∈ N is a sequence generated by iterativ ely re- p eating the steps (29a) - (29c) . F urther, we introdu ce the notation ¯ u k := u k − ˆ u, ¯ v k := v k − ˆ v and ¯ p k := p k − ˆ p. W e start with the follo win g Lemma A.1. F or all k ≥ 1 we h av e that k ¯ p k − 1 k 2 + λ − 2 k K ¯ u k − 1 k 2 − k ¯ p k k 2 + λ − 2 k K ¯ u k k 2 ≥ λ − 2 k K ¯ u k + ¯ v k k 2 + k K ¯ u k − 1 − K ¯ u k k 2 − 2 λ − 2 ( δ k + δ k − 1 ) − δ k − ε k (31) Pr o of. Th e assertion follo ws by r ep eating the steps (5.6)-(5.25) in the pr o of of [20, Chap. I I I Thm. 4.1] after replacing (5.9) and (5.10) b y (29a) and (29b) resp ectiv ely . W e con tin ue with the pr o of of Theorem 2.2 . More precisely , w e p ro v e the follo wing gener- alized v ersion Prop osition A.2. Assume that the sums P ∞ k =1 δ k and P ∞ k =1 ε k ar e finite. Then, the se qu enc e { ( u k , v k ) } k ≥ 1 is b ound e d i n U × H and eve ry we ak cluster p oint is a solution of (8) . Mor e over, X k ∈ N k K u k + v k − Y k 2 + k K ( u k − u k − 1 ) k 2 < ∞ . ST A TISTICAL M UL TIRESOLUTION DANTZIG ESTIMA TION IN IMAG ING 29 Pr o of. Let k ≥ 1 and d efi ne D = P ∞ k =1 δ k and E = P ∞ k =1 ε k . S umming up Inequalit y (31) o v er k and ke eping in mind th at K ¯ u k + ¯ v k = K u k + v k − Y and K ¯ u k − 1 − K ¯ u k = K u k − 1 − K u k sho ws ∞ X k =1 k K u k + v k − Y k 2 + k K u k − 1 − K u k k 2 ≤ λ 2 k ˆ p k 2 + k K ˆ u k 2 + (4 λ − 2 + 1) D + E < ∞ where w e ha v e used that ¯ u 0 = ˆ u and ¯ p 0 = ˆ p . F u rthermore, it follo ws again from (31) that k ¯ p k k 2 + λ − 2 k K ¯ u k 2 ≤ k ˆ p k 2 + λ − 2 k K ˆ u k 2 + (4 λ − 2 + 1) D + E < ∞ (32) This together with the fact that k K u k + v k − Y k → 0 shows that max( k K u k k , k v k k , k p k k ) = O (1) . T ogether with the optimalit y condition for (13) th is in turn implies that for an arb itrary u ∈ H J ( u k ) ≤ J ( u ) + λ − 1 h K u k + v k − Y − λp k − 1 , K u − K u k i + δ k = O (1) . (33) Summarizing, w e find that max S ∈S µ S ( K u k − Y ) + J ( u k ) ≤ max S ∈S ω S k Λ( K u k − Y ) k + J ( u k ) ≤ c < ∞ for a su itably c hosen constan t c ∈ R , since Λ is su pp osed to b e con tin uous. Thus, it follo ws from Assum ption A that { u k } k ∈ N is b ounded and h ence sequentiall y w eakly compact. No w, let ( ˜ u, ˜ v , ˜ p ) b e a w eak cluster p oin t of { ( u k , v k , p k ) } k ∈ N and recall that ( ˆ u, ˆ v , ˆ p ) wa s assumed to b e a saddle p oin t of th e augmen ted Lagrangian L λ . Setting u = ˆ u in (33) th us results in J ( u k ) ≤ J ( ˆ u ) + λ − 1 h K u k + v k − Y , K ˆ u − K u k i + h p k − 1 , K u k − K ˆ u i + δ k = J ( ˆ u ) + h p k − 1 , K u k − K ˆ u i + o(1) (34) Using the relation K ˆ u + ˆ v = Y we fu rther find h p k − 1 , K u k − K ˆ u i = h p k − 1 , K u k − Y + ˆ v i = h p k − 1 , K u k + v k − Y i − h p k − 1 , v k − ˆ v i = o(1) − h p k − 1 , v k − ˆ v i (35) F rom the defin ition of v k in (29a) and from the fact that ˆ v , v k ∈ C it follo ws that h Y + λp k − 1 − ( K u k − 1 + v k ) , ˆ v − v k i ≤ ε k whic h in turn implies that − h p k − 1 , v k − ˆ v i ≤ λ − 1 h Y − ( K u k − 1 + v k ) , v k − ˆ v i + ε k = λ − 1 h Y − ( K u k + v k ) , v k − ˆ v i + λ − 1 h K u k − K u k − 1 , v k − ˆ v i + ε k = o(1) (36) Com bining (34 ), (35) and (36) giv es lim sup k →∞ J ( u k ) ≤ J ( ˆ u ) . No w, choose a su bsequence u ρ ( k ) k ∈ N suc h that u ρ ( k ) ⇀ ˜ u . Since J is con v ex and lo w er semi- con tin uous it is also wea kly lo w er semi-cont in uous and hence the previous estimate yields J ( ˜ u ) ≤ lim inf k →∞ J ( u ρ ( k ) ) ≤ J ( ˆ u ) . 30 ST A TISTICAL M UL TIRESOLUTION DANTZIG ESTIMA TION IN IMAG ING Moreo v er, we ha v e that v ρ ( k ) ∈ C for all k ∈ N . Since C is closed w e conclude that ˆ v ∈ C . Since K ˜ u + ˜ v = Y this shows that ( ˜ u, ˜ v ) solve s (8) and th us J ( ˜ u ) = J ( ˆ u ). W e p ro ceed with the pro of of T heorem 2.4. Again w e present a generalized ve rsion. T o this end, let D and E b e as in Prop osition A.2. Prop osition A.3. Ther e exists a c ons tant C = C ( λ, ˆ u, ˆ v , ˆ p, E , D ) such that 0 ≤ J ( u [ τ ]) − J ( ˆ u ) − h K ∗ ˆ p, u [ τ ] − ˆ u i U ≤ C τ + δ k [ τ ] + ε k [ τ ] ∀ ( τ > 0) . Pr o of. Define B 2 = (4 λ − 1 + 1) D + E . T hen it follo ws from (32) that λ − 1 k K u k − K ˆ u k ≤ k ˆ p k + λ − 1 k K ˆ u k + B (37) k p k k ≤ 2 k ˆ p k + λ − 1 k K ˆ u k + B , (38) where ( ˆ u, ˆ v , ˆ p ) is an arbitrary saddle p oin t of L λ ( u, v , p ). Assu me that τ > 0 an d that k = k [ τ ] is suc h that max( k K u k + v k − Y k , k K u k − 1 − K u k k ) ≤ τ . Then, it follo ws from (34), (36), (37) and (38) and that J ( u k ) ≤ J ( ˆ u ) + λ − 1 h K u k + v k − Y , K ˆ u − K u k i + h p k − 1 , K u k − K ˆ u i + δ k ≤ J ( ˆ u ) + τ ( k ˆ p k + λ − 1 k K ˆ u k + B ) + k p k − 1 k τ + h p k − 1 , ˆ v − v k i + δ k ≤ J ( ˆ u ) + τ (3 k ˆ p k + 2 λ − 1 k K ˆ u k + 2 B ) + 2 λ − 1 τ k v k − ˆ v k + δ k + ε k (39) After observing that K u k − K ˆ u + v k − ˆ v = Y it follo ws that k v k − ˆ v k ≤ τ + k K u k − K ˆ u k and com bining (39) and (37) giv es J ( u k ) ≤ J ( ˆ u ) + τ 5 k ˆ p k + 4 λ − 1 k K ˆ u k + 4 B + 2 λ − 1 τ + δ k + ε k . No w, obs er ve that fr om the defi nition of the s ubgradient and (30), it follo w s that J ( u k ) ≥ J ( ˆ u ) + h K ∗ ˆ p, u k − ˆ u i and that h ˆ p, v k − ˆ v i ≤ 0. This and the fact that K ˆ u + ˆ v = Y implies that 0 ≤ J ( u k ) − J ( ˆ u ) − h K ∗ ˆ p, u k − ˆ u i = J ( u k ) − J ( ˆ u ) − h ˆ p, K u k + v k − Y i + h ˆ p, K ˆ u + ˆ v − Y i + h ˆ p, v k − ˆ v i ≤ J ( u k ) − J ( ˆ u ) + k ˆ p k τ ≤ τ 6 k ˆ p k + 4 λ − 1 k K ˆ u k + 4 B + 2 λ − 1 τ + δ k + ε k . (40) This together with (39) finally p ro v es the first part of the assertion. Remark A.4. F or the case when D = E = 0, it is seen f rom the pro of of Prop osition (A.3 ) that the constan t C tak es the simple form C = τ 6 k ˆ p k + 4 k K ˆ u k + 2 τ λ . Pr o of of Cor ol lary 2.5 . Assum e that J ( u ) = 1 2 k Lu k 2 V . Then it follo ws (see e.g. [22, Lem. 2.4]) that the sub differen tial ∂ J ( ˆ u ) consists of the single elemen t L ∗ L ˆ u . Hence th e extremalit y relations (30) imply that K ∗ ˆ p = L ∗ L ˆ u . Now it is easy to observe that J ( u k ) − J ( ˆ u ) − h K ∗ ˆ p, u k − ˆ u i = 1 2 k L ( u k − ˆ u ) k 2 V . ST A TISTICAL M UL TIRESOLUTION DANTZIG ESTIMA TION IN IMAG ING 31 Referen ces [1] J.-F. Aujol, G. Aub ert, L. Blanc-F ´ er au d , and A. Ch am b olle. Image decomp osition in to a b ounded v aria tion comp onent and an oscillating comp onent. J. Math. Imaging Vision , 22(1): 71–88, 2005. [2] S. Bec k er, E. Cands, and M. Gran t. T emplates for con v ex cone problems with app lications to sparse signal reco v ery . Math. Pr o gr am. Comput. , 3:165 –218, 2011. [3] M. Bertalmio, V. Caselles, B. Roug ´ e, and A. Sol ´ e. TV based image restoration with lo cal constrain ts. J. Sci. Comput. , 19(1-3) :95–12 2, 2003. Sp ecial issue in honor of the sixtieth birthday of Stanley Osh er . [4] P . J. Bic k el, Y. Rito v, and A. B. Tsybako v. S imultaneous analysis of lasso and Dantz ig selector. Ann. Statist. , 37(4):170 5–1732 , 2009. [5] J. P . Bo yle and R. L. Dykstra. A metho d f or fi nding pro jections on to the intersect ion of con v ex sets in Hilb ert spaces. I n A dva nc es in or der r estricte d statistic al inf e r enc e (Iowa City, Iowa, 1985) , v olume 37 of L e ctur e Notes in Statist. , pages 28–47. Spr inger, Berlin, 1986. [6] L. Bo ysen, A. Kemp e, V. Liebsc her, A. Mun k, and O. Wittic h. Consistencies and rates of con v erge nce of ju mp-p enalize d least squares estimators. Ann. Statist. , 37(1):15 7–183 , 2009. [7] E. C an d` es and T. T ao. The Dantzi g selector: statistical estimation w hen p is m uc h larger than n . Ann. Statist. , 35(6):231 3–2351 , 2007. [8] S. S. Chen, D. L. Donoho, and M. A. S aunders. A tomic decomp osition by b asis pu rsuit. SIAM R eview , 43(1):129–1 59, 2001. [9] I. Csisz´ ar. Wh y lea st squares and maxim um entrop y? An axiomatic approac h to inference for linear in v erse problems. Ann. Statist. , 19(4) :2032–2 066, 1991. [10] P . L. Da vies and A. K ov ac. Lo cal extremes, runs, strin gs and multiresolution. Ann. Statist. , 29(1):1– 65, 2001. With d iscussion and rejoinder b y the authors. [11] P . L. Da vies and A. Kov ac. Densities, sp ectral densities and mo dalit y . Ann. Statist. , 32(3): 1093–1 136, 2004. [12] P . L. Da vies, A. Ko v ac, and M. Meise. Nonparametric regression, confid ence regions and regularization. Ann. Statist. , 37(5B):259 7–262 5, 2009. [13] F. Deutsc h and H. Hund al. The rat e of con v ergence o f D ykstra’s cyclic pro jections algorithm: the p olyhedral case. Numer. F unct. Anal. Optim. , 15(5-6):537 –565, 1994. [14] D. C. Dobson and C. R. V ogel. C on v ergence of an iterativ e metho d for total v ariation denoising. SIAM J. Numer. Anal . , 34(5):1 779–1 791, 1997. [15] Y. Dong, M. Hin term ¨ uller, and M. Rincon-Camac ho. Au tomated regularization p arame- ter selection in multi- scale total v ariation mo dels f or image restoration. J. Math. Imaging Vision , 40:82–1 04(23), Ma y 2011. [16] L. D ¨ um bgen and R. B. Johns. Confi dence b an d s for isotonic median cur v es u sing sign tests. J. Comput. Gr aph. Statist , 13(2) :519–5 33, 2004. [17] L. D ¨ umbgen and V. G. Sp ok oin y . Multiscale testing of qualitativ e h yp otheses. A nn. Statist. , 29(1):12 4–152, 2001. [18] L. D ¨ um bgen and G. W alther. Multiscale inferen ce ab out a d ensit y . Ann. Statist. , 36(4): 1758–1 785, 2008. [19] I. Ek eland and R. T emam. Convex analysis and variational pr oblems , v olume 1 of Studies in Mathematics and its A pplic atio ns . North-Holland Pub lishing Co., Amsterdam-Oxford, 1976. 32 ST A TISTICAL M UL TIRESOLUTION DANTZIG ESTIMA TION IN IMAG ING [20] M. F ortin and R. Glowinski. Augmente d Lagr angian metho ds , volume 15 of Studies in Mathematics and its Applic ations . North-Holland Publishing Co., Amsterdam, 1983 . Applications to th e numerical solution of b oundary v alue p roblems, T ranslated fr om the F renc h by B. Hu n t and D. C. Spicer. [21] K. F rick, P . Marnitz, and A. Mu nk. Shap e constrained r egularisation by statistical m ul- tiresolution for inv erse problems: Asymptotic analysis. T ec hnical rep ort, 2010. Av ailable at http://a rxiv.or g/abs/1003.3323 . [22] K. F r ic k and O. Scherzer. R egularization of ill-p osed linear equ ations by the non- stationary Au gmen ted Lagrangian Metho d . J. Inte gr al Equations A ppl. , 22(2):217–2 57, 2010. [23] N. Gaffke and R. Mathar. A cyclic pr o jection algorithm via d ualit y . M etrika , 36:29– 54, 1989. [24] M. Grasmair. The equiv alence of the taut string algorithm and BV-regularization. J . Math. Imaging Vision , 27(1):5 9–66, 2007. [25] D. M. Hawkins and R. Wixley . A note on the transform ation of chi-squared v ariables to normalit y . Amer.Statist. , 40(4): 296–29 8, 1986. [26] S . W. Hell. F ar-Field Optical Nanoscop y. Scienc e , 316(5828 ):1153–11 58, 2007. [27] S . W. Hell and J. Wic hmann. Breaking the d iffr action resolution limit by stim ulated emission: stim ulated-emission-depletion fluorescence m icroscop y . Opt. L ett. , 19(11):78 0– 782, 1994. [28] T. Hotz, P . Marnitz, R. Stic h tenoth, L. Da vies, Z. Kabluc hk o, and A. Mun k. Lo cally adaptiv e image denoising by a statistica l multireso lution criterion. Comput. Stat. D ata An. , 56(3):54 3 – 558, 2012. [29] G. M. J ames, P . Radchenk o, and J. Lv. Dasso: connections b et w ee n the dantz ig selector and lasso. J. R. Stat. So c. Ser. B Stat. M etho dol. , 71:127–1 42, 2009. [30] Z . Kabluchk o. Extremes of the standardized Gauss ian n oise. Sto c hastic Pr o c ess. Appl. , 121(3 ):515–5 33, 2011. [31] Z . K abluc hk o and A. Munk. Sh ao’s theorem on the maxim um of standardized random w alk incremen ts for multidimensional arr a ys. ESAIM Pr o b ab. Stat. , 13:409 –416, 2009. [32] J. Kaipio and E. S omersalo. Statistic al and c omputational inverse pr oblems , vol ume 160 of Applie d Mathematic al Scienc es . Spr inger-V erlag, New Y ork, 2005. [33] Z . Lu, T. K. Po ng, and Y. Zh an g. An alternating direction metho d f or fi nding Dan tzi g selectors. T echnical rep ort, 2010. Av aila ble at http://arxiv.or g/abs/101 1.4604v1 . [34] E. Mammen and S . v an d e Geer. Lo cally adap tive regression splines. Ann. Statist. , 25(1): 387–41 3, 1997. [35] Y. Mey er. Osci l lating p atterns in image pr o c essing and nonline ar evolution e q u ations , v olume 22 of University L e ctur e Series . American Mathematical So ciet y , Provi dence, RI, 2001. The fifteen th Dean Jacqueline B. Lewis memorial lectures. [36] T. Milden b erger. A geomet ric inte rpretation of the multireso lution criterion in nonp ara- metric regression. J. Nonp ar ametr. Stat. , 20(7):1048 –5252, 2008. [37] G. O. Mohler, A. L. Bertozz i, T. A. Gol dstein, and S. J. Osh er. F ast TV regularizat ion for 2d maxim um p enalized likelihoo d estimation. J. Comput. Gr aph. Statist , 20(2) :479–4 91, 2011. [38] A. Munk, N. Bissantz, T. W agner, and G. F r eitag. On difference-based v ariance estima- tion in nonparametric regression wh en the cov ariate is high dimensional. J. R. Stat. So c. Ser. B Stat. Metho do l. , 67(1): 19–41, 2005. [39] J. B. Pa w ley . Handb o ok of Biolo gic al Confo c al Mic r osc opy . Sp ringer, 2006. ST A TISTICAL M UL TIRESOLUTION DANTZIG ESTIMA TION IN IMAG ING 33 [40] J. P olzehl and V. G. S p oko in y . Adaptiv e w eigh ts smo othing with applications to image restoration. J. R. Stat. So c. Ser. B Stat. Metho dol. , 62(2) :335–35 4, 2000. [41] J. K. Rom b erg. The dantzig selector and generalized thr esh olding. In CISS , pages 22–25. IEEE, 2008. [42] D. Siegm und and B. Y akir. T ail p robabilities for the null distribution of scanning statis- tics. Bernoul li , 6(2):19 1–213, 2000. [43] R. Tibsh ir ani. Regressio n shr ink age and selection via the lasso. J. R. Stat. So c. Ser. B Stat. M etho dol. , 58:26 7–288, 1994. [44] C. R. V ogel. Computa tional metho d s for inverse pr oblem s , v ol ume 23 of F r ontiers i n Applie d Mathematics . So ciet y for Ind ustrial and Applied Mathematics (S IAM), Philadel- phia, P A, 2002. With a forew ord b y H. T. Banks. [45] Z . W ang, A. C. Bo vik, H. R. Sheikh, and E. P . S imoncelli. Image qualit y assessment : F rom error visibilit y to structural similarit y . IEEE T r ans. Image Pr o c ess. , 13(4):600–6 12, 2004. [46] S . Xu . Estimation of the conv ergence rate of Dykstra’s cyclic pro jections algorithm in p olyhedr al case. A cta Math. Appl. Sinic a (English Ser.) , 16(2):217–2 20, 2000.
Original Paper
Loading high-quality paper...
Comments & Academic Discussion
Loading comments...
Leave a Comment