Ptychographic phase-retrieval by proximal algorithms

Pt yc hographic phase-retriev al b y pro ximal algorithms Hanfei Y an h yan@ bnl.go v National Sync hrotron Light Source II, Bro okha ven National Lab oratory , Upton, NY 11973, USA Septem b er 17, 2019 Abstract W e deriv e a set of pt ychograph y phase-retriev al iterative en- gines based on pro ximal algorithms originally developed in conv ex optimization theory , and discuss their connections with existing ones. The use of proximal op erator creates a simple frame work that allo ws us to incorp orate the eﬀect of noise from a maxim um- lik eliho o d principle. W e focus on three particular algorithms, namely proximal minimization, alternating direction metho d of m ultiplier and accelerated pro ximal gradien t, and b enc kmark their p erformance with numerical simulations and exp erimental x-ra y data. Among them, accelerated pro ximal gradien t sho ws sup erior p erformance in terms of b oth accuracy and conv ergence rate for a noisy dataset. 1 In tro duction Pt ychograph y is a p ow erful scanning imaging technique that utilizes ad- v anced mathematical to ols to retrieve the missing phase information of the w av e-ﬁeld from a sequence of intensit y measurements [1 – 3]. The attraction of this tec hnique comes from its capability of recov ering b oth the complex- v alued prob e and ob ject functions, a blind decon volution, as well as its abil- it y of breaking the resolution barrier set by the focusing optics. It gains 1 increasing popularity in recent years for its robustness in practice and w as used successfully for many imaging applications in diﬀerent ﬁelds [4 – 9]. The ma jor challenge of this tec hnique resides in the fact that the mathematical problem is non-conv ex and ill posed. P articularly for real-w orld problems, the exp erimental data alw ays con tains noise and other t yp es of errors, there- fore ﬁnding a solution optimized globally is extremely diﬃcult, if not com- pletely imp ossible. A great deal of eﬀorts ha ve b een dev oted to dev eloping robust pt ychographic iterative engines, either based on alternating-pro jection or gradien t-descent metho ds [10, 3, 11 – 18]. More complex algorithms that can handle mixed states [19], positioning errors [20, 21], diﬀraction eﬀect [22, 23] and instability of the prob e [24] w ere developed as w ell. F or conv ex optimization problems, a class of algorithms called proximal algorithms hav e b een studied extensively [25]. They turn out to b e w ell-suited for constrained, large-scale and distributed optimization problems, and pt y- c hography falls in to this category . In these techniques, the solving process is divided in to sub-problems in volving the ev aluation of the pro ximal op erator, whic h usually has a closed-form solution. Inspired by these developmen ts, here we prop ose to com bine pro ximal algorithms and Wirtinger deriv ativ es to create a simple frame w ork for solving ptyc hography problems, where either alternating-pro jection or gradien t-descent algorithms can b e derived straigh tforwardly . W e wan t to emphasize that proximal algorithms are originally dev el- op ed for con vex problems dealing with real-v alued n umbers, and ptyc hog- raph y problems are non-con vex and inv olv es complex-v alued n umbers. The Wirtinger deriv ativ e allo ws us to use the common rules for diﬀerentiation kno wn from real-v alued analysis [26], so the dev elop ed solving tec hniques in proximal algorithms can be readily applied. W e show some previously rep orted ptyc hograph y algorithms can b e derived in this frame work. Here w e do not attempt to pro vide a rigorous theoretical pro of of the ov erall con- v ergence, but rather oﬀer a heuristic for future work b y demonstrating the eﬀectiv eness of these algorithms with n umerical simulation and real-w orld applications. As hav e b een sho wn that man y phase-retriev al algorithms can ﬁnd their counterparts in conv ex optimization theory [27], so can pt yc hogra- ph y . In this paper, w e fo cus on pt ychographic reconstruction with noise and round-oﬀ errors, a common pr oblem encoun tered in all measuremen ts. Three maxim um-likelihoo d (ML) based algorithms, namely pro ximal minimization (PM), alternating direction method of m ultiplier (ADMM) and accelerated pro ximal gradien t (APG) are deriv ed and b enc hmarked with b oth sim ulation and exp erimen t data. Among them, APG, whic h has not b een rep orted b e- fore, exhibits a superior p erformance. Although curren t model only considers noise and round-oﬀ errors, it is not diﬃcult to extend it to a more complex 2 case. This pap er is organized as follows. In the model section, w e ﬁrst build the mathematical mo del for the optimization problem in pt ychograph y , and then discuss ho w diﬀerent solving algorithms can b e derived from proximal op erators. Connections with existing tec hniques are discussed. Tw o statis- tical mo dels for the noise, in tensit y Poission and amplitude Gaussian, are considered throughout the pap er. In the numerical sim ulation section, w e b enc hmark p erformance of the deriv ed algorithms at diﬀeren t signal-to-noise ratios, and discuss the optimal conditions for con vergence and accuracy in resp ectiv e cases. In the experimental data section, PM, ADMM and APG are tested with an exp erimental dataset tak en with x-rays. The result con- ﬁrms that APG outp erforms the other t wo and ac hieves the state-of-the-art p erformance. 2 Mo del In a pt yc hographic scan, a prob e, p , impinges on diﬀerent parts of an ob ject, o . The transmitted wa v e is assumed to b e simply a pro duct of the prob e and ob ject function, and propagates to a farﬁeld detector. The wa v eﬁeld at the ob ject plane is link ed to the wa v eﬁeld at detector plane by F ourier transform. Because a physical detector only measures in tensit y (or amplitude in other words), the phase of the wa veﬁeld is lost in an exp erimen t. A phase-retriev al algorithm in tends to reco ver this set of missing information based on the measured amplitude under certain conditions. Assuming that w e collect farﬁeld diﬀraction patterns at K diﬀerent p ositions, W e no w hav e t wo constrain ts for the measured data to satisfy . One is in real space (sample plane). The w av eﬁeld has to be written as a product of a prob e and an ob ject function with known translations. The other is in recipro cal space (detector plane). The modulus of its F ourier transform has to agree with the measuremen t. y i = | F [ p ( r ) o ( r − r i )] | , i = 1 , 2 , ...K, (1) where y i is the measured amplitude of the ith image and F is the t w o- dimensional F ourier transform op erator. In Eq. (1), another set of kno wn information is the prob ed position, r i . The goal here is to ﬁnd complex- v alued functions p and o that satisfy b oth the amplitude and translation constrain ts. W e discretize the problem and rewrite Eq.(1) in to a v ector form b y concatenating an image along its column directions, y i = | x i | , x i = FPS i o . (2) 3 Here the low er-case b old letter represents a column vector and a capital b old one corresp onds to a matrix. The absolute op erator is elemen t-wise. In the ab o ve equation, y i ∈ R N × 1 is the measured amplitude, F ∈ C N × N is the F ourier transform matrix, p ∈ C N × 1 is the prob e vector, P = diag( p ) is a diagonal matrix, S i ∈ R N × M is a sparse matrix con taining only zeros and ones that selects the illuminated elemen ts of the ob ject v ector, o ∈ C M × 1 , at the ith prob ed p osition. The resulting vector is o i ∈ C N × 1 . Problem describ ed b y Eq. (2) can b e turned into an unconstrained optimization problem, min x i f ( x i ) + g ( x i ) , (3) where f and g are indicator functions corresp onding to the t wo constraints resp ectiv ely , f ( x i ) =  0 if x i ∈ dom f ∞ otherwise , dom f = { x i ∈ C N × 1 | | x i | = y i } , g ( x i ) =  0 if x i ∈ dom g ∞ otherwise , dom g = { x i ∈ C N × 1 | x i = FPS i o } . (4) 2.1 Alternating Pro jection The classical metho d of solving the ab o v e problem is alternating pro jection algorithm (AP) that pro jects the solution into tw o domains, z t +1 i = Π f ( x t i ) , x t +1 i = Π g ( z t +1 i ) . (5) Here Π f and Π g are Euclidean pro jection op erators. By iterativ ely pro jecting in to these t wo domains, w e hop e that an initial guess can con verge to a solution fulﬁlling b oth constraints. A pro ximal op erator of a function f is deﬁned b y , pro x λf ( v ) = argmin x ( f ( x ) + 1 λ k x − v k 2 2 ) . (6) Here w e use 1 /λ instead of 1 / 2 λ for con v enience b ecause we need to deal with complex-v alued v ariables. F or the tw o indicator functions in Eq. (3) 4 their corresponding proximal op erators are simpliﬁed to, pro x f ( v i ) = argmin x i ∈ dom f K X i =1 k x i − v i k 2 2 = Π f ( v i ) , pro x g ( v i ) = argmin x i ∈ dom g K X i =1 k x i − v i k 2 2 = Π g ( v i ) . (7) Based on the deﬁnition, these proximal op erators are just Euclidean pro jec- tions. Therefore, the classical AP metho d can also b e written as, z t +1 i = pro x f ( x t i ) , x t +1 i = pro x g ( z t +1 i ) . (8) The solution to the ﬁrst pro jection is simply to replace the amplitude of v i with y i while retain its phase, Π f ( v i ) = diag( y i ) ϑ ( v i ) , ϑ ( v i ) =  v i / | v i | v i 6 = 0 0 otherwise (9) The divide is an elemen t-wise op eration. The Solution to the second pro jec- tion can b e obtained b y man y diﬀerent w ays. The prob e and ob ject func- tions can b e up dated sequen tially [16], collectively [3] or join tly [14]. Here w e choose the usual collective up date, o = ( K X i =1 S H i P H PS i ) − 1 ( K X i =1 S H i P H F H v i ) , p = ( K X i =1 O H i O i ) − 1 ( K X i =1 O H i F H v i ) , O i = diag( S i o ) , Π g ( v i ) = FPS i o , (10) where sup erscript H denotes conjugate and transpose op eration. Π g deﬁnes another pro jection satisfying the second constrain t. Sp eciﬁcally , w e bac k- propagate the w a veﬁeld to the sample plane using inv erse F ourier transform. Then we up date prob e and ob ject functions collectively based on the prob ed p ositions. One can run the iteration once or multiple times for high accuracy . Lastly , we replace the w av eﬁeld at the sample plane by the pro duct of the 5 prob e and ob ject functions and propagate it to the detector plane. This metho d is often referred as error reduction (ER) algorithm. F rom a statistical p oint of view, if we assume a Gaussian likelihoo d func- tion of the amplitude and use its negative log as our cost function, w e arrive at, L = K X i =1 k y i − | FPS i o |k 2 2 . (11) Their Wirtinger deriv ativ es with resp ect to the probe and ob ject are, ∂ L ∂ o ∗ = K X i =1 ( FPS i ) H [ x i − diag( y i ) ϑ ( x i )] , ∂ L ∂ p ∗ = K X i =1 ( F O i ) H [ x i − diag( y i ) ϑ ( x i )] . (12) A t a stationary point the deriv ativ es hav e to b e zero, K X i =1 ( S H i P H PS i ) o − K X i =1 ( S H i P H F H )diag( y i ) ϑ ( x i ) = 0 , K X i =1 ( O H i O i ) p − K X i =1 O H i F H diag( y i ) ϑ ( x i ) = 0 . (13) They can b e solved iterativ ely by a ﬁxed-p oin t algorithm that seeks a ﬁx p oin t of the equation, z = q ( z ). In this case, the unkno wn v ariables are o and p . If they are solved in sequence w e arriv e at, o t +1 = [ K X i =1 ( P t S i ) H P t S i ] − 1 [ K X i =1 ( P t S i ) H F H diag( y i ) ϑ ( x t i )] , p t +1 = ( K X i =1 O t +1 i H O t +1 i ) − 1 [ K X i =1 O t +1 i H F H diag( y i ) ϑ ( x t i )] , x t +1 i = FP t +1 S i o t +1 . (14) This is no diﬀerent from the classical ER algorithm sho wn in Eq. (10). Therefore, w e can also in terpret ER algorithm for ptyc hograph y as some 6 sort of ﬁx-p oint algorithm that seeks the stationary p oin t of the amplitude Gaussian lik eliho o d function with resp ect to p and o . Algorithm 1: ER Data: y i ∈ R N × 1 , S i ∈ R N × M , i = 1 , 2 , ...K Result: prob e function p and ob ject function o initialization: p 0 , o 0 , x 0 i , t max ; rep eat z t +1 i = diag( y i ) ϑ ( x i ); o t +1 = up date o( p t , z t +1 i ); p t +1 = up date p( o t +1 , z t +1 i ); x t +1 i = F diag( p t +1 ) S i o t +1 ; un til t > t max ; 2.2 Alternating Direction Metho d of Multiplier F or real-w orld phase-retriev al problems, ER is kno wn to suﬀer from slo w con vergence and stagnation issues. A far more robust and p opular algo- rithm is ADMM. One sp ecial v arian t of its form is Douglas-Rac hford split- ting metho d, also kno wn as diﬀerence map (DM), which is widely used for pt ychograph y reconstructions. ADMM is usually derived from argumen ted Lagrangian metho d. In the framew ork of proximal algorithms, it can be writ- ten in a very concise form. An in-depth discussion of ADMM can b e found in the monography by Bo yd et al. [28], and its application for ptyc hograph y w ere rep orted in previous publications [12, 18]. Recen tly it w as applied for join t ptyc ho-tomograph y reconstruction [29]. Th us, here we skip the deriv a- tion process. W e c hange our optimization problem [Eq. (3)] slightly , min x i , z i f ( x i ) + g ( z i ) , sub ject to x i = z i . (15) F or t wo indicator functions deﬁned in Eq.(4), the ADMM algorithm is, x t +1 i = pro x f ( z t i − u t i ) , z t +1 i = pro x g ( x t +1 i + u t i ) , u t +1 i = u t i + x t +1 i − z t +1 i . (16) If w e deﬁne a new v ariable w t i = z t i + u t i and substitute it into Eq. (16), we arriv e at, x t +1 i = pro x f (2 z t i − w t i ) , z t +1 i = pro x g ( x t +1 i + w t i − z t i ) , w t +1 i = w t i + x t +1 i − z t i . (17) 7 V ariable x t +1 i is not indep endent and can b e replaced. Rearrange terms and use Euclidean pro jection operators derived in Eqs. (9) and (10), we hav e, z t i = Π g ( w t i ) , w t +1 i = w t i + Π f (2 z t i − w t i ) − z t i . (18) This is the well-kno wn DM algorithm [3]. F or a noisy dataset, DM is kno wn to ha ve stabilit y problem b ecause it attempts to ﬁnd a solution with its amplitude exactly equal to the measured v alue. A simple remedy is to replace the indicator function, f , with a negativ e log lik eliho o d function, L . In such a case, x i up date in Eq. (16) is mo diﬁed to, x t +1 i = pro x λ L ( z t +1 i − u t i ) , pro x λ L ( v i ) = diag( ϑ ( v i )) 2(1 + λ )  2( λ y i + | v i | ) , G | v i | + ( p | v i | 2 + 4 λ (1 + λ ) y 2 i ) , P = E MAP ( λ, y i , v i ) (19) Here ‘G’ and ‘P’ refers to amplitude Gaussian and intensit y P oisson, respec- tiv ely . One may notice that the up date sc heme for x i is now parameter- dep enden t. F rom Ba y es’ theorem, the pro ximal op erator can b e interpreted as maxim um-a-p osterior (MAP) probabilit y estimate, where the prior prob- abilit y follo ws a normal distribution. The parameter, λ , con trols ho w close the new up date should b e to its prior v alue, and pla ys an imp ortant role in determining the p erformance of the algorithm. W e will ha ve a more detailed dissuasion in the following section. Algorithm 2: mADMM Data: y i ∈ R N × 1 , S i ∈ R N × M , i = 1 , 2 , ...K Result: prob e function p and ob ject function o initialization: p 0 , o 0 , z 0 i , u 0 i , λ , β , δ , t max ; rep eat x t +1 i = E MAP ( λ, y i , z t i − u t i ); o t +1 = up date o( p t , x t +1 i + u t i ); p t +1 = up date p( o t +1 , x t +1 i + u t i ); z t +1 i = F diag( p t +1 ) S i o t +1 ; u t +1 i = u t i + x t +1 i − z t +1 i ; if k u t +1 − u t k 2 / k u t +1 k 2 < δ then λ = β λ ; end un til t > t max ; 8 2.3 Pro ximal Minimization The ﬁxed p oin t of a proximal op erator is also the minimizer of the original function. This leads to a simple pro ximal iterative algorithm, x t +1 i = pro x λ L + g ( x t i ) . (20) F or the pt ychograph y problem considered in this paper, we deﬁne, pro x λ L + g ( v i ) = argmin x i ∈ dom g L ( x i ) + g ( x i ) + K X i =1 1 λ k x i − v i k 2 2 = argmin p , o L ( p , o ) + K X i =1 1 λ k FPS i o − v i k 2 2 (21) Again, we can interpret the up date as a MAP estimate. The diﬀerence from ADMM discussed abov e is that x i and z i are forced to be equal here. In ADMM, the splitted v ariables b elong to their individual domains and are not necessary the same. Similar to the deriv ation of Eq. (14), w e mak e t wo-step up date, z t +1 i = E MAP ( λ, y i , x t i ) , x t +1 i = Π g ( z t +1 i ) . (22) Compared to ER, the only diﬀerence is that the measured amplitude is replaced with a MAP-estimated v alue. This metho d is ﬁrst proposed by Katk ovnik et al. [13]. Here we show it is equiv alent to PM algorithm and pro vide an alternativ e p ersp ective. Algorithm 3: PM Data: y i ∈ R N × 1 , S i ∈ R N × M , i = 1 , 2 , ...K Result: prob e function p and ob ject function o initialization: p 0 , o 0 , x 0 i , λ , t max ; rep eat z t +1 i = E MAP ( λ, y i , x t i ); o t +1 = up date o( p t , z t +1 i ); p t +1 = up date p( o t +1 , z t +1 i ); x t +1 i = F diag( p t +1 ) S i o t +1 ; un til t > t max ; 2.4 Accelerated Pro ximal Gradien t Let’s consider the optimization problem, min x i L ( x i ) + g ( x i ) . (23) 9 Wirtinger deriv ativ e allo ws us to derive the gradien t of the real-v alued like- liho o d function with resp ect to the complex-v alued v ariable x i , ∇ x ∗ i L =  x i − diag( y i ) ϑ ( x i ) , G x i − diag( x i | x i | 2 + ε )( y 2 i + ε ) , P (24) Here ε is a small real-v alued constan t introduced to av oid the discontin uit y at zero, as suggested in Ref [18]. The negative of the gradient is also the steep est descent direction of the function. The pro ximal gradient algorithm is, x t +1 i = pro x g ( x t i − λ t ∇ x ∗ i L ( x t i )) (25) Here λ t is a p ositiv e step size that can v ary at eac h iteration. W e use a simple metho d to determine its v alue [30]. The step size remains the same unless the follo wing condition is violated z i = pro x g ( x t i − λ t ∇ x ∗ i L ( x t i )) , L ( z i ) ≤ Q λ t ( z i , x t i ) , Q λ t ( z i , x t i ) = L ( x t i ) + K X i =1 2Re( ∇ x ∗ i L ( x t i ) H ( z i − x t i )) + 1 λ t k z i − x t i k 2 2 , (26) In such a case, we reject the up date and multiply the step size by a factor β ∈ (0 , 1). The pro cess is rep eated until the abov e condition is satisﬁed and then the iteration is completed, x t +1 i = z i . The pro ximal gradien t al- gorithm can b e understo o d from a p oint of view of lo calized optimization. F or completeness, we give an explanation due to Bec k and T eboulle [30]. The function, Q λ t ( z i , x t i ), is an upp er b ound to L ( x t i ) that is tight at x t i , i.e. Q λ t ( x t i , x t i ) = L ( x t i ) and Q λ t ( z i , x t i ) ≥ L ( z i ), provided that λ t ∈ (0 , L ], where L is a Lipschitz constan t of ∇ x ∗ i L . This function can b e considered as an ﬁrst-order appro ximation to L with a regularization term. W e may rewrite it as, Q λ t ( z i , x t i ) = L ( x t i ) + K X i =1 1 λ t k z i − ( x t i − λ t ∇ x ∗ i L ) k 2 2 − λ t k∇ x ∗ i Lk 2 2 (27) In the vicinity of x t i , we replace the original optimization problem with an appro ximate one, min z i Q λ t ( z i , x t i ) + g ( z i ) (28) 10 Dropping constan t terms in Eq. (27) does not aﬀect the solution to Eq. (28). As a result we arriv e at, argmin z i K X i =1 1 λ t k z i − ( x t i − λ t ∇ x ∗ i L ) k 2 2 + g ( z i ) = pro x g ( x t i − λ t ∇ x ∗ i L ) (29) Consequen tly , w e can interpret eac h iteration as a proximal op erator of g along the steep est decent direction of L , as the name pro ximal gradien t suggests. By deﬁnition, L ( x t +1 i ) + g ( x t +1 i ) ≤ Q λ t ( x t +1 i , x t i ) + g ( x t +1 i ) ≤ Q λ t ( x t i , x t i ) + g ( x t i ) . (30) In this particular case g is an indicator function, th us, g ( x t i ) = g ( x t +1 i ) = 0 (31) W e can simplify the inequality as, L ( x t +1 i ) ≤ Q λ t ( x t +1 i , x t i ) ≤ Q λ t ( x t i , x t i ) = L ( x t i ) . (32) Therefore, eac h iteration decent the negativ e log lik eliho o d function mean- while satisfying the constraint g . F or a faster conv ergence, the accelerated v ersion of the proximal gradien t method whic h includes an additional ex- trap olation step can b e used, w t i = x t i + ω t ( x t i − x t − 1 i ) , x t +1 i = pro x g ( w t i − λ t ∇ w ∗ i L ( w t i )) , ω t = t t + 3 . (33) W e note that Xu et al. [15] recently prop osed accelerated Wirtinger ﬂo w (A WF) metho d for pt yc hography , whic h stems from the p opular Wirtinger ﬂo w (WF) algorithm for phase-retriev al problems [31]. It shares some simi- larit y with APG. Ho wev er, they diﬀer fundamentally in many asp ects. A WF can b e considered as a steepest decen t method with a constant step size, while APG here is a pro jected gradient metho d with a v arying step size. W e ma y consider APG as a hybid algorithm com bining gradien t descen t and pro- jection metho ds. W e ﬁrst decent the cost function in recipro cal space with a small step, and then pro ject it to the domain in real space which satisﬁes the translation constraint. The up date is completed only when suc h a mo ve w ould mak e the cost function smaller. F or a Gaussian amplitude likelihoo d 11 function, if λ t is equal to one, as can be seen the up dating sc heme is no dif- feren t from ER algorithm. Therefore, w e can c ho ose one as the initial v alue of the step size. Algorithm 4: APG Data: y i ∈ R N × 1 , S i ∈ R N × M , i = 1 , 2 , ...K Result: prob e function p and ob ject function o initialization: p 0 , o 0 , x 0 i , λ 0 = 1, β , t max ; rep eat ω t = t t +3 ; w t i = x t i + ω t ( x t i − x t − 1 i ); rep eat z i = w t i − λ t ∇ x ∗ i L ( w t i ); o t +1 = up date o( p t , z i ); p t +1 = up date p( o t +1 , z i ); z i = F diag( p t +1 ) S i o t +1 ; if L ( z i ) ≤ Q λ t ( z i , w t i ) then return x t +1 i = z i ; else λ t = λ t β ; end un til λ t is suﬃciently smal l ; λ t +1 = λ t ; if λ t +1 < δ then λ t +1 = λ 0 ; end un til t > t max ; 3 Numerical sim ulation In this section w e will compare the p erformance of diﬀerent metho ds using sim ulation data. The test ob ject function is sho wn in Fig. 1. The image ‘Cameraman’ is used as its amplitude and ‘Barbara’ as its phase. The pixel size of the image is assumed to b e 5 nm. A prob e of size 37 nm is produced b y a F resnel Zone plate and a sp ecial fermat scan that follows the equation (in polar co ordinates) [32], r i = c √ i, θ i = 2 . 4 i, i = 1 , 2 ...K . (34) is p erformed to illuminate the diﬀeren t parts of the sample. c is chosen to b e 20 nm (4 pixels) and a total of 1261 far-ﬁeld diﬀraction patterns are collected. 12 The maxim um detector intensit y is scaled to the range of 10 2 − 10 6 and a P oisson noise is added to eac h pixel accordingly . The ”measured” in tensity con tains t wo diﬀerent t yp es of errors. One is the round-oﬀ error since the measured intensit y are in tegers. This approximation reduces the dynamical range of the signal. P articularly in the detector region where the in tensit y drops b elow 0.5 coun t, they are all set to zero. Tw o is the P oisson noise, whic h adds bac kground ﬂuctuation to the signal. T o b e more quantitativ e, w e calculate the signal-to-noise ratio (SNR) of the intensit y as, SNR = 10log 10 ( P K i =1 k ˆ y 2 i k 2 2 P K i =1 k y 2 i − ˆ y 2 i k 2 2 ) , (35) where ˆ y i are the ground-truth v alues. W e ﬁrst study the case with SNR = 32.25 dB (max. detector pixel in- tensit y = 10 4 coun ts). The reconstruction results obtained from diﬀeren t algorithms discussed in the preceding section are sho wn in Fig. 1. All of ML- based metho ds yield a high-quality reconstruction with nearly indistinguish- able diﬀerence. If we pa y close atten tion on the reconstructed amplitude, a v ery faint cross-talk from the phase image can b e seen in the bac kground. In contrast, the phase reconstruction do esn’t sho w any visible artifacts. As a comparison, we also plot results from the non-statistical algorithm, DM. Strong cross-talk in the reconstructed amplitude image can b e observ ed. In addition, the reconstructed phase image is less sharp than the others and con- tains some visible artifacts. This suggests that even at this lev el of signals, one ma y still need to use ML-based algorithm to ac hiev e the b est result. T o quantify the error for a systematic study , we use a ro ot mean square error (RMSE) deﬁned as, RMSE = k ˆ o − a o k 2 k ˆ o k 2 , (36) where ˆ o is the ground-truth and a is a complex-v alued constan t to account for the ambiguit y in pt yc hography reconstruction. The assessmen t under diﬀeren t conditions is presented in Fig. 2. F or PM (Fig. 2a), its con vergence rate is λ -dep endent, while the resultant RMSE is not; they all con v erge to the same v alue. The Poisson mo del leads to a smaller RMSE, which is not a surprise since a P oisson noise is added. When the amplitude Gaussian is used, PM is no better than ER. As we discussed earlier, ER can also b e considered as a ML-based metho d. F or ADMM (Fig. 2b), w e observe that a larger v alue of λ leads to a faster con vergence. In the limit of inﬁnit y , ML-based ADMM b ecomes DM, whic h sho ws the fastest conv ergence rate at the b eginning. How ev er, a large v alue 13 Figure 1: Comparison of reconstruction results obtained from diﬀerent algo- rithms with a noisy dataset. The maxim um num b er of photons receiv ed at one pixel of the detector is scaled to 10 4 coun ts, and a P oisson noise is added accordingly . This corresp onds to SNR = 32.25 dB for the collected diﬀraction patterns. T op panel: amplitude image. Bottom panel: phase image. of λ can cause a stability issue, whic h can b e seen in the plot. In this case when λ > 5, they do not tend to con verge to a stable solution after initial fast con vergence, but rather ﬂuctuate as iteration go es. Also, a large λ results in a higher RMSE. Therefore the solution is less accurate. This suggests a m ulti-stage strategy for ADMM to optimize b oth the con v ergence rate and accuracy . W e can start with a large λ for fast conv ergence. When a stable solution is reached ( k u t − u t − 1 k / k u t k < threshold), we reduce the v alue of λ b y m ultiplying it with a constan t β ∈ (0 , 1), use the solution obtained from the last stage as the initial guess and contin ue the iteration. T o distinguish it from a regular ADMM algorithm, w e call it mADMM thereafter, where ‘m’ refer to multi-stage. F or APG (Fig. 2c), w e alwa ys choose the initial v alue of λ 0 as one and it is adjusted automatically in the up date. Similar to the cases in PM and ADMM, in tensit y Poisson mo del outp erforms amplitude Gaussian. It is w orth noting that when the initial guess of the prob e is bad, λ t can quic kly b ecomes very small, particularly for intensit y Poisson mo del. This will cause a stagnation problem. A remedy is to reset λ t to its initial v alue when it b ecomes to o small, but a price to pay is more computation time. T o assess p erformance across algorithms, in Fig. 2d w e plot their RMSE v ariations as a function of iteration num b er. F or a fair comparison, all start with a disk-lik e prob e and a square ob ject as the initial guess. Poisson model is used in all algorithms. Among them, APG has the b est p erformance. Not 14 Figure 2: RMSE v ariations as a function of iteration n umbers with (a) PM, (b) ADMM, (c) APG algorithms under diﬀeren t conditions. A comparisons across algorithms is shown in (d), where a multi-stage strategy is employ ed for ADMM (named mADMM) to achiev e the b est result, δ = 10 − 5 , β = 0 . 7. The sim ulated dataset used for reconstruction is the same with that in Fig. 1. only its o v erall conv ergences rate is higher, but also the resultant RMSE is smaller. With curren t SNR, though quantitativ e analysis shows the diﬀerence in the reconstruction, visually the results lo ok almost the same (Fig. 1). In a more extreme case, w e consider a maxim um detector in tensit y of 100 counts (SNR = 12.77). Compared to the previous case, the diﬀraction in tensit y is reduced b y 100-fold. Fig. 3a shows a t ypical diﬀraction pattern with limited coun ts in logarithmic scale, and a comparison with the ground-truth (Fig. 3b). The error-free data has a dynamical range o ver seven orders of magnitude, while that for the limited counts is reduced to less than t wo due to the round-oﬀ to in tegers. The added noise makes the situation ev en w orse. Such a noisy dataset p oses a signiﬁcant c hallenge to ptyc hographic 15 Figure 3: A t ypical diﬀraction pattern of the noisy dataset (SNR=12.77) (a) and the corresp onding error-free one (b). Intensit y plot in logarithmic scale. (c) reconstruction amplitude and phase images using PM ( λ = 1), mADMM ( λ = 1 , δ = 10 − 5 , β = 0 . 7) and APG ( δ = 0 . 1 , β = 0 . 5) algorithms with the noisy dataset seen in (a). reconstruction. W e presen t in Fig. 3c the reconstructed results obtained from PM, APG and mADMM algorithms. Because DM do es not con verge at all in this case, its result is not sho wn here. Again a disk-like prob e and a square ob ject as the initial guess are used. All three algorithms lead to a conv erged solution. As is clear, their results ho wev er diﬀer considerably from algorithm to algorithm. PM recov ers high-frequency features w ell, but the reconstructed images lo ok more grainy . There are also strong cross- con tamination betw een the amplitude and the phase images. In contrast, mADMM yields smo oth images, but the high-frequency details are lost. APG balances the t w o well and pro duces the b est ov erall results. The qualit y of the 16 obtained phase image is still v ery acceptable, without having visible artifacts and losing to o muc h details. Figure 4: RMSE v ariations at diﬀerent levels of signal strength using PM, mADMM and APG algorithms. Fig. 4 depicts the achiev ed RMSE of the reconstructed complex image at v arious lev els of signal. In general, the log-log plot sho ws a close-to- linear relationship, suggesting that they all follow a p ow er law approximately . In most circumstances, APG outp erforms the other t wo. Unlik e PM and mADMM algorithms for which the v alue of λ has to b e c hosen accordingly with the noise lev el to achiev e the b est p erformance, there is no need to tune an y parameter for APG when SNR c hanges. 4 Exp erimen tal Data The simulation data only tak e in to accoun t round-oﬀ errors and P oisson noise. The real-world problem can b e m uc h more complicated. In order to assess the robustness of these algorithms, we p erform reconstruction on an exp erimen tal dataset that was tak en at the hard x-ray nanoprobe b eamline of National Sync hrotron Ligh t Source I I, Bro okha ven National Lab oratory . The nanob eam with a size of ∼ 13 nm 2 at 12 keV was pro duced by tw o crossed m ultilay er Laue lenses. Details ab out the exp erimen tal setup can b e found 17 elsewhere [33]. The sample consists of cubic Au nanoparticles with a size of 50 nm dep osited on a Si substrate. They form an ordered array with gaps b et ween them as small as 10 nm. The sample w as placed at a do wnstream p osition with a distance of 25 um to the fo cal plane. A fermat scan with c = 20 nm [Eq. (34)] w as p erformed to a void p erio dic aliasing eﬀect, and a total of 792 frames w ere collected. The diﬀraction pattern collected on a far-ﬁeld pixel-arra y detector (Merlin, Quantum Detectors) has o ver 4 × 10 4 maxim um detector coun ts at one pixel. In Fig. 5 reconstructed complex-v alued images with diﬀerent algorithms are presented. F or a fair comparison, they all start with the same initial guess of the prob e function, which is obtained b y in verse F ourier transform- ing the measured far-ﬁeld amplitude and then propagating 25 um to the sample plane. In other words, the initial guess assumes a lens with no phase ab erration. Because DM does not con v erge to a stable solution, we ha ve to c ho ose an in termediate reconstruction result that lo oks the best. Nev- ertheless, the phase image is a bit noisy , and the amplitude part is barely recognizable. PM yields a smo other result, but b oth the phase and ampli- tude exhibit some ghost image around the b oundary of the arra y . mADMM pro duces a further impro v ed result, but the ghost image can still be seen, par- ticularly in the amplitude image. Not surprisingly , the b est result is ac hieved with APG. It has no apparen t artifacts seen in the reconstruction. W e can clearly resolve the shap e of individual 50-nm nanoparticles and their sharp edges in the phase image, ev en though the amoun t of the phase v ariation of one la y er is in the order of ∼ 0.05 radian. Because the sample has a very lo w absorption contrast ( ∼ 1.7%), the amplitude image usually is to o fuzzy to b e useful. How ev er in the one obtained from APG we can still recognize the particle array . F or this dataset, w e conclude that APG leads to a re- construction result with o verall image quality noticeably b etter than that of others. There are a few remarks we’d lik e to p oint out. Among all the algorithms tested in this pap er, DM tak es the least n umber of iterations to arriv e at a plausible solution, particularly when the initial guess of the prob e is far from the ground truth. How ev er, with the presence of noise DM can b ecome unstable and diverge. On the contrary , ML-based algorithms conv erge slo wer, but are stable. The reason is that their iteration pro cesses usually inv olves a MAP-based sub-optimization step which requires the up dated solution to b e close to its prior v alue. As a result, a go o d guess of the initial prob e is more imp ortan t for these ML-based algorithms. In addition, for the exp erimen tal data tested in this case w e do not see visible diﬀerence b etw een Poisson and Gaussian mo dels. This suggests that the diﬀerence in reconstruction b et ween t wo statistical mo dels diminishes as intensit y increases. 18 Figure 5: Pt yc hography reconstruction of a Au nanoparticle arra y using diﬀeren t algorithms. Scale bar is 250 nm. T op panel: amplitude. Bottom panel: phase. 5 Conclusion In summary , we presented several solving techniques for ptyc hographic imag- ing derived in the framework of proximal algorithms. The separable nature of the pro ximal operator makes it w ell-suited for dealing with large-scale pty- c hography reconstruction problems where its ev aluation can b e parallelized. The optimization problem is usually divided in to sub-optimization steps in- v olving pro ximal op erators for which often a closed-form solution can b e found. Therefore, the problem b ecomes more tractable. W e deriv ed ER, PM, ADMM, DM and APG algorithms and b enchmark ed their p erformance with noisy datasets. Among them, APG depicted the b est reconstruction result not only in n umerical simulation but also in exp eriment. In the curren t w ork, we only consider a noise model and round-oﬀ er- rors. In the same frame w ork, it is not diﬃcult to enable mo des to deal with partial coherence [19] and bluring eﬀect in ﬂy-scan [34, 35]. In such cases, con tribution from diﬀeren t modes will add up incoheren tly and the lik eliho o d function has to b e mo diﬁed accordingly . W e can also consider to incorp orate more constrain ts on prob e or ob ject function. F or example, adding a regular- ization term with denoiser has sho wn suppressed noise in the reconstructed ob ject function [13, 36]. With an additional constraint on the prob e, it was demonstrated that the p erio dic aliasing eﬀect seen in grid scans could b e 19 mitigated [18]. There is still a lot of ro om for impro vemen t, and they will b e the future work. 6 Ac kno wledgemen ts The author thanks X. Huang for a fruit discussion on ptyc hograph y algo- rithms and F. Lu for providing Au nanoparticles. This research used b eam- line 3ID of the National Sync hrotron Light Source I I, a U.S. Departmen t of Energy (DOE) Oﬃce of Science User F acility op erated for the DOE Of- ﬁce of Science b y Bro okhav en National Lab oratory under Contract No. de- sc0012704. References [1] J. M. Rodenburg and H. M. L. F aulkner. A phase retriev al algorithm for shifting illumination. Applie d Physics L etters , 85(20):4795–4797, 2004. [2] J. M. Ro denburg, A. C. Hurst, A. G. Cullis, B. R. Dobson, F. Pfeiﬀer, O. Bunk, C. Da vid, K. Jeﬁmo vs, and I. Johnson. Hard-x-ra y lensless imaging of extended ob jects. Physic al R eview L etters , 98(3):4, 2007. [3] Pierre Thibault, Martin Dierolf, Andreas Menzel, Oliver Bunk, Chris- tian David, and F ranz Pfeiﬀer. High-resolution scanning x-ra y diﬀraction microscop y . Scienc e , 321(5887):379–382, 2008. [4] S. O. Hruszkewycz, M. J. Highland, M. V. Holt, D. Kim, C. M. F olkman, C. Thompson, A. T ripathi, G. B. Stephenson, S. Hong, and P . H. F uoss. Imaging lo cal polarization in ferro electric thin ﬁlms b y coherent x-ray bragg pro jection ptyc hograph y . Physic al R eview L etters , 110(17):177601, 2013. [5] M. Holler, M. Guizar-Sicairos, E. H. R. Tsai, R. Dinapoli, E. Muller, O. Bunk, J. Raab e, and G. Aeppli. High-resolution non-destructiv e three-dimensional imaging of in tegrated circuits. Natur e , 543(7645):402, 2017. [6] J. J. Deng, D. J. Vine, S. Chen, Q. L. Jin, Y. S. G. Nashed, T. P eterk a, S. V ogt, and C. Jacobsen. X-ra y ptyc hographic and ﬂuorescence mi- croscop y of frozen-hydrated cells using con tinuous scanning. Scientiﬁc R ep orts , 7:445, 2017. 20 [7] Da vid A. Shapiro, Y oung-Sang Y u, T olek T yliszczak, Jordi Cabana, Ric h Celestre, W eilun Chao, Konstan tin Kaznatc heev, A. L. Da vid Kil- co yne, Filipe Maia, Stefano Marchesini, Y. Shirley Meng, T on y W arwick, Lee Lisheng Y ang, and Ho w ard A. P admore. Chemical composition mapping with nanometre resolution by soft x-ra y microscop y . Natur e Photonics , 8(10):765–769, 2014. [8] G. A. Zheng, R. Horstmey er, and C. H. Y ang. Wide-ﬁeld, high-resolution fourier ptyc hographic microscop y . Natur e Photonics , 7(9):739–745, 2013. [9] Yi Jiang, Zhen Chen, Yimo Han, Pratiti Deb, Hui Gao, Saien Xie, Prafull Purohit, Mark W. T ate, Jiw o ong Park, Sol M. Gruner, V eit Elser, and Da vid A. Muller. Electron pt yc hography of 2d materials to deep sub- ˚ angstr¨ om resolution. Natur e , 559(7714):343–349, 2018. [10] A. M. Maiden and J. M. Ro denburg. An impro ved pt ychographi- cal phase retriev al algorithm for diﬀractiv e imaging. Ultr amicr osc opy , 109(10):1256–1262, 2009. [11] P . Thibault and M. Guizar-Sicairos. Maximum-lik eliho o d reﬁnement for coheren t diﬀractive imaging. New Journal of Physics , 14:20, 2012. [12] Z. W. W en, C. Y ang, X. Liu, and S. Marc hesini. Alternating direc- tion metho ds for classical and ptyc hographic phase retriev al. Inverse Pr oblems , 28(11):18, 2012. [13] V. Katk ovnik and J. Astola. Sparse pt ychographical coheren t diﬀractiv e imaging from noisy measuremen ts. Journal of the Optic al So ciety of A meric a a-Optics Image Scienc e and Vision , 30(3):367–379, 2013. [14] M. Odstrcil, A. Menzel, and M. Guizar-Sicairos. Iterativ e least-squares solv er for generalized maximum-lik eliho o d pt ychograph y . Optics Ex- pr ess , 26(3):3108–3123, 2018. [15] Rui Xu, Mahdi Soltanolk otabi, Justin P . Haldar, W alter Unglaub, Josh ua Zusman, An thon y F. J. Levi, and Ric hard M. Leahy . Accelerated wirtinger ﬂow: A fast algorithm for ptyc hograph y . , 2018. [16] A. Maiden, D. Johnson, and P . Li. F urther impro vemen ts to the pt y- c hographical iterative engine. Optic a , 4(7):736–745, 2017. 21 [17] R. Hesse, D. R. Luk e, S. Sabach, and M. K. T am. Proximal hetero- geneous blo c k implicit-explicit method and application to blind pt y- c hographic diﬀraction imaging. Siam Journal on Imaging Scienc es , 8(1):426–457, 2015. [18] H. B. Chang, P . Enfedaque, and S. Marchesin. Blind pt yc hographic phase retriev al via conv ergent alternating direction metho d of m ultipli- ers. Siam Journal on Imaging Scienc es , 12(1):153–185, 2019. [19] P . Thibault and A. Menzel. Reconstructing state mixtures from diﬀrac- tion measuremen ts. Natur e , 494(7435):68–71, 2013. [20] A. M. Maiden, M. J. Humphry , M. C. Sarahan, B. Kraus, and J. M. Ro den burg. An annealing algorithm to correct positioning errors in pt ychograph y . Ultr amicr osc opy , 120:64–72, 2012. [21] F. C. Zhang, I. Peterson, J. Vila-Comamala, A. D. F. Berenguer, R. Bean, B. Chen, A. Menzel, I. K. Robinson, and J. M. Ro denburg. T ranslation p osition determination in ptyc hographic coherent diﬀraction imaging. Optics Expr ess , 21(11):13592–13606, 2013. [22] A. M. Maiden, M. J. Humphry , and J. M. Ro denburg. Pt yc hographic transmission microscopy in three dimensions using a multi-slice ap- proac h. Journal of the Optic al So ciety of Americ a A , 29(8):1606–1614, 2012. [23] E. H. R. Tsai, I. Uso v, A. Diaz, A. Menzel, and M. Guizar-Sicairos. X-ra y pt ychograph y with extended depth of ﬁeld. Optics Expr ess , 24(25):29090–29109, 2016. [24] M. Odstrcil, P . Baksh, S. A. Bo den, R. Card, J. E. Chad, J. G. F rey , and W. S. Bro c klesby . Pt yc hographic coheren t diﬀractive imaging with orthogonal probe relaxation. Optics Expr ess , 24(8):8360–8369, 2016. [25] Neal P arikh and Stephen Boyd. Pro ximal algorithms. F ound. T r ends Optim. , 1(3):127–239, 2014. [26] Raphael Hunger. An introduction to complex diﬀerentials and complex diﬀeren tiability . 2007. [27] H. H. Bausc hk e, P . L. Com b ettes, and D. R. Luke. Phase retriev al, error reduction algorithm, and ﬁen up v ariants: a view from conv ex optimiza- tion. Journal of the Optic al So ciety of Americ a a-Optics Image Scienc e and Vision , 19(7):1334–1345, 2002. 22 [28] Stephen Bo yd, Neal Parikh, Eric Chu, Borja Peleato, and Jonathan Ec kstein. Distributed optimization and statistical learning via the al- ternating direction metho d of m ultipliers. F oundations and T r ends R  in Machine L e arning , 3(1):1–122, 2011. [29] Selin Aslan, Viktor Nikitin, Daniel J. Ching, T ekin Bicer, Sven Leyf- fer, and Do˘ ga G ¨ urso y . Join t pt ycho-tomograph y reconstruction through alternating direction metho d of multipliers. Optics Expr ess , 27(6):9128– 9143, 2019. [30] Amir Bec k and Marc T eb oulle. Gradient-based algorithms with applica- tions to signal reco very problems. In D. Palomar and Y. Eldar, editors, Convex Optimization in Signal Pr o c essing and Communic ations , pages 42–88. Cam brige Univ ersity Press. [31] E. J. Candes, X. D. Li, and M. Soltanolk otabi. Phase retriev al via wirtinger ﬂo w: Theory and algorithms. Ie e e T r ansactions on Informa- tion The ory , 61(4):1985–2007, 2015. [32] Xiao jing Huang, Hanfei Y an, Ross Harder, Y eukuang Hwu, Ian K. Robinson, and Y ong S. Chu. Optimization of o v erlap uniformness for pt ychograph y . Optics Expr ess , 22(10):12634–12644, 2014. [33] Hanfei Y an, Nathalie Bouet, Juan Zhou, Xiao jing Huang, Evgen y Nazaretski, W eihe Xu, Alex Co cco, Wilson K. S. Chiu, Kyle Brinkman, and Y ong S. Chu. Multimo dal hard x-ray imaging with resolution ap- proac hing 10 nm for studies in material science. Nano F utur es , 2:011001, 2018. [34] P . M. Pelz, M. Guizar-Sicairos, P . Thibault, I. Johnson, M. Holler, and A. Menzel. On-the-ﬂy scans for x-ra y ptyc hography . Applie d Physics L etters , 105(25):251101, 2014. [35] X. J. Huang, K. Lauer, J. N. Clark, W. H. Xu, E. Nazaretski, R. Harder, I. K. Robinson, and Y. S. Ch u. Fly-scan ptyc hography . Scientiﬁc R e- p orts , 5:9074, 2015. [36] B. S. Shi, Q. S. Lian, S. Z. Chen, and X. Y. F an. Sbm3d: Sparse regularization mo del induced by bm3d for w eigh ted diﬀraction imaging. Ie e e A c c ess , 6:46266–46280, 2018. 23

Ptychographic phase-retrieval by proximal algorithms

Original Paper

Comments & Academic Discussion

Leave a Comment

Original Paper

Related Papers

Comments & Academic Discussion

Leave a Comment