Sundial: Using Sunlight to Reconstruct Global Timestamps

This paper investigates postmortem timestamp reconstruction in environmental monitoring networks. In the absence of a time-synchronization protocol, these networks use multiple pairs of (local, global) timestamps to retroactively estimate the motes' …

Authors: Jayant Gupchup, Răzvan Musăloiu-E., Alex Szalay, Andreas Terzis

Jayant Gupchup∗, Răzvan Musăloiu-E.∗, Alex Szalay‡, Andreas Terzis∗
∗ Computer Science Department, ‡ Physics and Astronomy Department, Johns Hopkins University
{gupchup,razvanm,terzis}@jhu.edu∗, szalay@jhu.edu‡

Abstract. This paper investigates postmortem timestamp reconstruction in environmental monitoring networks. In the absence of a time-synchronization protocol, these networks use multiple pairs of (local, global) timestamps to retroactively estimate the motes' clock drift and offset and thus reconstruct the measurement time series. We present Sundial, a novel offline algorithm for reconstructing global timestamps that is robust to unreliable global clock sources. Sundial reconstructs timestamps by correlating annual solar patterns with measurements provided by the motes' inexpensive light sensors. The surprising ability to accurately estimate the length of day using light intensity measurements enables Sundial to be robust to arbitrary mote clock restarts. Experimental results, based on multiple environmental network deployments spanning a period of over 2.5 years, show that Sundial achieves accuracy as high as 10 parts per million (ppm), using solar radiation readings recorded at 20 minute intervals.

1 Introduction

A number of environmental monitoring applications have demonstrated the ability to capture environmental data at scientifically-relevant spatial and temporal scales [11,12]. These applications do not need online clock synchronization and in the interest of simplicity and efficiency often do not employ one. Indeed, motes do not keep any global time information, but instead, use their local clocks to generate local timestamps for their measurements.
Then, a postmortem timestamp reconstruction algorithm retroactively uses (local, global) timestamp pairs, recorded for each mote throughout the deployment, to reconstruct global timestamps for all the recorded local timestamps. This scheme relies on the assumptions that a mote's local clock increases monotonically and the global clock source (e.g., the basestation's clock) is completely reliable. However, we have encountered multiple cases in which these assumptions are violated. Motes often reboot due to electrical shorts caused by harsh environments and their clocks restart. Furthermore, basestations' clocks can be desynchronized due to human and other errors. Finally, the basestation might fail while the network continues to collect data.

Fig. 1. An illustration of mote reboots, indicated by clock resets. Arrows indicate the segments for which anchor points are collected. (Axes: sequence number vs. local timestamp.)

We present Sundial, a robust offline time reconstruction mechanism that operates in the absence of any global clock source and tolerates random mote clock restarts. Sundial's main contribution is a novel approach to reconstruct the global timestamps using only the repeated occurrences of day, night and noon. We expect Sundial to work alongside existing postmortem timestamp reconstruction algorithms, in situations where the basestation's clock becomes inaccurate, motes disconnect from the network, or the basestation fails entirely. While these situations are infrequent, we have observed them in practice and they therefore warrant a solution. We evaluate Sundial using data from two long-term environmental monitoring deployments. Our results show that Sundial reconstructs timestamps with an accuracy of one minute for deployments that are well over a year.

2 Problem Description

The problem of reconstructing global timestamps from local timestamps applies to a wide range of sensor network applications that correlate data from different motes and external data sources. This problem is related to mote clock synchronization, in which motes' clocks are persistently synchronized to a global clock source. However, in this work, we focus on environmental monitoring applications that do not use online time synchronization, but rather employ postmortem timestamp reconstruction to recover global timestamps.

2.1 Recovering Global Timestamps

As mentioned before, each mote records measurements using its local clock, which is not synchronized to a global time source. During the lifetime of a mote, a basestation equipped with a global clock collects multiple pairs of (local, global) timestamps. We refer to these pairs as anchor points¹. Furthermore, we refer to the series of local timestamps as LTS and the series of global timestamps as GTS. The basestation maintains a list of anchor points for each mote and is responsible for reconstructing the global timestamps using the anchor points and the local timestamps.

¹ We ignore the transmission and propagation delays associated with the anchor point sampling process.

Fig. 2. Time reconstruction error due to α estimation errors as a function of the deployment lifetime. (Plot: error vs. deployment lifetime from 1 minute to 1 year, with one curve each for α errors of 1 permil, 100 ppm, 10 ppm and 1 ppm.)

The mapping between local clock and global clock can be described by the linear relation GTS = α · LTS + β, where α represents the slope and β represents the intercept (start time). The basestation computes the correct α and β for each mote using the anchor points. Note that these α and β values hold if and only if the mote does not reboot.
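
The linear model of Section 2.1 can be illustrated with a small Python sketch. It is not the basestation code used in the deployments: the function names `fit_alpha_beta` and `to_global`, and the anchor-point values, are hypothetical and chosen only to show how α and β follow from a set of anchor points by ordinary least squares.

```python
from typing import List, Tuple

# (local timestamp, global timestamp), both in seconds
AnchorPoint = Tuple[float, float]


def fit_alpha_beta(anchors: List[AnchorPoint]) -> Tuple[float, float]:
    """Ordinary least-squares fit of GTS = alpha * LTS + beta."""
    n = len(anchors)
    if n < 2:
        raise ValueError("at least two anchor points are needed per segment")
    mean_l = sum(l for l, _ in anchors) / n
    mean_g = sum(g for _, g in anchors) / n
    sxx = sum((l - mean_l) ** 2 for l, _ in anchors)
    if sxx == 0:
        raise ValueError("anchor points must have distinct local timestamps")
    sxy = sum((l - mean_l) * (g - mean_g) for l, g in anchors)
    alpha = sxy / sxx
    beta = mean_g - alpha * mean_l
    return alpha, beta


def to_global(lts: float, alpha: float, beta: float) -> float:
    """Map one local timestamp to a global timestamp with the fitted line."""
    return alpha * lts + beta


# Hypothetical segment: the mote clock runs 50 ppm slow relative to global
# time, and the segment starts at global time 1.15e9 s.
TRUE_ALPHA, TRUE_BETA = 1.00005, 1_150_000_000.0
anchors = [(l, TRUE_ALPHA * l + TRUE_BETA)
           for l in (0.0, 86_400.0, 604_800.0, 2_592_000.0)]

alpha, beta = fit_alpha_beta(anchors)
print(alpha, beta, to_global(1_000.0, alpha, beta))
```

With noise-free anchor points the fit recovers the slope and intercept used to generate them; with real anchor points the same fit is only as good as the basestation clock behind them, which is the issue the next subsections examine.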
In the subsections that follow, we describe the challenges encountered in real deployments where the estimation of α and β becomes non-trivial.

2.2 Problems in Timestamp Reconstruction

The methodology sketched in Section 2.1 reconstructs the timestamps for blocks of measurements where the local clock increases monotonically. We refer to such blocks as segments. Under ideal conditions, a single segment includes all the mote's measurements. However, software faults and electrical shorts (caused by moisture in the mote enclosures) are two common causes for unattended mote reboots. The mote's local clock resets after a reboot and when this happens we say that the mote has started a new segment.

When a new segment starts, α and β must be recomputed. This implies that the reconstruction mechanism described above must obtain at least two anchor points for each segment. However, as node reboots can happen at arbitrary times, collecting two anchor points per segment is not always possible. Figure 1 shows an example where no anchor points are taken for the biggest segment, making the reconstruction of timestamps for that segment problematic. In some cases we found that nodes rebooted repeatedly and did not come back up immediately. Having a reboot counter helps recover the segment chronology but does not provide the precise start time of the new segment.

Fig. 3. Ambient temperature data from two motes from the L deployment. The correlation of temperature readings in the left panel indicates consistent timestamps at the segment's start. After two months, the motes' readings become inconsistent due to inaccurate α estimates. (Panels: Apr 2 to Apr 4 and Jun 16 to Jun 18; y-axis: ADC values; series: Node 72, Node 76.)

Furthermore, the basestation is responsible for providing the global timestamps used in the anchor points.
Our experience shows that assuming the veracity of the basestation clock can be precarious. Inaccurate basestation clocks can corrupt anchor points and lead to bad estimates of α and β, introducing errors in timestamp reconstruction. Long deployments exacerbate these problems, as Figure 2 illustrates: an α error of 100 parts per million (ppm) can lead to a reconstruction error of 52 minutes over the course of a year.

2.3 A Test Case

Our Leakin Park deployment (referred to as "L") provides an interesting case study of the problems described above. The L deployment comprised six motes deployed in an urban forest to study the spatial and temporal heterogeneity in a typical urban soil ecosystem. The deployment spanned over a year and a half, providing us with half a million measurements from five sensing modalities. We downloaded data from the sensor nodes very infrequently using a laptop PC and collected anchor points only during these downloads. One of the soil scientists in our group discovered that the ambient temperature values did not correlate among the different motes. Furthermore, correlating the ambient temperature with an independent weather station, we found that the reconstruction of timestamps had a major error in it. Figure 3 shows data from two ambient temperature sensors that were part of the L deployment. Nodes 72 and 76 show coherence for the period in April, but data from June are completely out of sync.

We traced the problem back to the laptop acting as the global clock source. We made the mistake of not synchronizing its clock using NTP before going to the field to download the data. As a result the laptop's clock was off by 10 hours, giving rise to large errors in our α and β estimates and thereby introducing large errors in the reconstructed timestamps.
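
The drift figure quoted in Section 2.2 follows from simple arithmetic: an α error of ε ppm accumulates ε · 10⁻⁶ seconds of reconstruction error per elapsed second. The Python sketch below reproduces the 100 ppm example and, as a purely illustrative case, shows how a 10-hour basestation clock error on one of two anchor points distorts the slope. The 60-day gap between the anchor points is an assumed value, not one reported for the L deployment, and both function names are hypothetical.

```python
SECONDS_PER_DAY = 86_400.0


def reconstruction_error_seconds(alpha_error_ppm: float,
                                 elapsed_seconds: float) -> float:
    """Timestamp error accumulated after elapsed_seconds for a given slope error."""
    return alpha_error_ppm * 1e-6 * elapsed_seconds


def alpha_error_ppm_from_bad_anchor(clock_offset_seconds: float,
                                    anchor_gap_seconds: float) -> float:
    """Slope error (ppm) when one of two anchor points carries a clock offset."""
    return clock_offset_seconds / anchor_gap_seconds * 1e6


# 100 ppm over one year: about 52.6 minutes, the value read off Figure 2.
one_year = 365 * SECONDS_PER_DAY
error_minutes = reconstruction_error_seconds(100.0, one_year) / 60.0
print(f"100 ppm over a year: {error_minutes:.2f} minutes")

# Assumed scenario: two anchor points 60 days apart, the second taken with a
# basestation clock that is 10 hours off.
slope_error_ppm = alpha_error_ppm_from_bad_anchor(10 * 3_600.0,
                                                  60 * SECONDS_PER_DAY)
print(f"10 h offset over a 60-day gap: {slope_error_ppm:.0f} ppm")
```

Under these assumptions a single corrupted anchor point produces a slope error of several thousand ppm, far larger than the drift of the mote clock itself.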
To complicate matters further, we discovered that some of the motes had rebooted a few times between two consecutive downloads and we did not have any anchor points for those segments of data.

3 Solution

The test case above served as the motivation for a novel methodology that robustly reconstructs global timestamps. The Robust Global Timestamp Reconstruction (RGTR) algorithm, presented in Section 3.1, outlines a procedure to obtain robust estimates of α and β using anchor points that are potentially unreliable. We address situations in which the basestation fails to collect any anchor points for a segment through a novel method that uses solar information alone to generate anchor points. We refer to this mechanism as Sundial.

Algorithm 1 Robust Global Timestamp Reconstruction (RGTR)

    constants
        Q                         ▷ Constant used to identify anchor points for the segment
        δ_HIGH, δ_LOW, δ_DEC      ▷ Constants used in iterative fit

    procedure ClockFit(ap)
        (r, i) ← (0, 0)
        q ← HoughQuantize(ap)
        for each γ in Keys(q) do
            s ← Size(q{γ})
            if s > r then
                (r, i) ← (s, γ)
        return ComputeAlphaBeta(q{i})

    procedure HoughQuantize(ap)
        q ← {}                    ▷ Map of empty sets
        for each (lts_i, gts_i) in ap do
            for each (lts_j, gts_j) in ap and (lts_j, gts_j) ≠ (lts_i, gts_i) do
                α ← (gts_j − gts_i) / (lts_j − lts_i)
                if 0.9 ≤ α ≤ 1.1 then        ▷ Check if part of the same segment
                    β ← gts_j − α · lts_j
                    γ ← Round(β / Q)
                    Insert(q{γ}, (lts_i, gts_i))
                    Insert(q{γ}, (lts_j, gts_j))
        return q

    procedure ComputeAlphaBeta(ap)
        δ ← δ_HIGH
        bad ← {}
        while δ > δ_LOW do
            (α, β) ← Llse(ap \ bad)
            for each (lts, gts) ∈ ap and (lts, gts) ∉ bad do
                residual ← (α · lts + β) − gts
                if |residual| ≥ δ then
                    Insert(bad, (lts, gts))
            δ ← δ − δ_DEC
        return (α, β)

3.1 Robust Global Timestamp Reconstruction (RGTR)

Having a large number of anchor points ensures immunity from inaccurate ones, provided they are detected. Algorithm 1 describes the Robust Global Timestamp Reconstruction (RGTR) algorithm that achieves this goal. RGTR takes as input a set of anchor points (ap) for a given segment and identifies the anchor points that belong to that segment, while censoring the bad ones. Finally, the algorithm returns the (α, β) values for the segment. RGTR assumes the availability of two procedures: Insert and Llse. The Insert(x, y) procedure adds a new element, y, to the set x. The Linear Least Square Estimation [4] procedure, Llse, takes as input a set of anchor points belonging to the same segment and outputs the parameters (α, β) that minimize the sum of square errors.

Fig. 4. The solar (model) length of day (LOD) and noon pattern for a period of two years for the latitude of our deployments. (Axes: date, Jan 2006 to Jan 2008, vs. hours; series: length of day, solar noon.)

Fig. 5. The light time series (raw and smoothed) and its first derivative. The inflection points represent sunrise and sunset. (X-axis: local timestamps; series: Light, Smooth, Derivative.)

RGTR begins by identifying the anchor points for the segment. The procedure HoughQuantize implements a well-known feature extraction method, known as the Hough Transform [5]. The central idea of this method is that anchor points that belong to the same segment should fall on a straight line having a slope of ∼1.0. Also, if we consider pairs of anchors (two at a time) and quantize the intercepts, anchors belonging to the same segment should all collapse to the same quantized value (bin). HoughQuantize returns a map, q, which stores the anchor points that collapse to the same quantized value. The bin whose key (stored in i) holds the maximum number of elements contains the anchor points for the segment.

Next, we invoke the procedure ComputeAlphaBeta to compute robust estimates of α and β for a given segment. We begin by creating an empty set, bad. The set bad maintains a list of all anchor points that are detected as being outliers and do not participate in the parameter estimation. This procedure is iterative and begins by estimating the fit (α, β) using all the anchor points. Next, we look at the residual of all anchor points with the fit. Anchor points whose residuals exceed the current threshold, δ, are added to the bad set and are excluded from the fit in the next iteration. Initially, δ is set conservatively to δ_HIGH. At the end of every iteration, the δ threshold is lowered and the process repeats until no new entries are added to the bad set, or δ reaches δ_LOW.
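
Algorithm 1 can be sketched in Python as follows. This is an illustrative reading of the pseudocode, not the authors' implementation. Several details are assumptions made here: pairs with equal local timestamps are skipped to avoid a division by zero, the loop stops early rather than discard so many anchors that fewer than two remain, a final refit is done on the surviving anchors, and the values chosen for Q, δ_HIGH, δ_LOW and δ_DEC are hypothetical.

```python
from collections import defaultdict
from itertools import combinations
from typing import Dict, List, Set, Tuple

Anchor = Tuple[float, float]  # (local timestamp, global timestamp), in seconds


def llse(points: List[Anchor]) -> Tuple[float, float]:
    """Least-squares line gts = alpha * lts + beta through the given anchors."""
    n = len(points)
    mean_l = sum(l for l, _ in points) / n
    mean_g = sum(g for _, g in points) / n
    sxx = sum((l - mean_l) ** 2 for l, _ in points)
    sxy = sum((l - mean_l) * (g - mean_g) for l, g in points)
    alpha = sxy / sxx
    return alpha, mean_g - alpha * mean_l


def hough_quantize(ap: List[Anchor], q_width: float) -> Dict[int, Set[Anchor]]:
    """Bin anchor pairs by quantized intercept; same-segment anchors share a bin."""
    bins: Dict[int, Set[Anchor]] = defaultdict(set)
    for (lts_i, gts_i), (lts_j, gts_j) in combinations(ap, 2):
        if lts_j == lts_i:
            continue  # assumption: skip pairs that would divide by zero
        alpha = (gts_j - gts_i) / (lts_j - lts_i)
        if 0.9 <= alpha <= 1.1:  # slope near 1.0: plausibly the same segment
            beta = gts_j - alpha * lts_j
            gamma = round(beta / q_width)
            bins[gamma].update([(lts_i, gts_i), (lts_j, gts_j)])
    return bins


def compute_alpha_beta(ap: List[Anchor], delta_high: float,
                       delta_low: float, delta_dec: float) -> Tuple[float, float]:
    """Iteratively refit while censoring anchors whose residual reaches delta."""
    delta = delta_high
    bad: Set[Anchor] = set()
    while delta > delta_low:
        good = [p for p in ap if p not in bad]
        alpha, beta = llse(good)
        outliers = {p for p in good if abs(alpha * p[0] + beta - p[1]) >= delta}
        if len(good) - len(outliers) < 2:
            break  # assumption: always keep enough anchors to define a line
        bad |= outliers
        delta -= delta_dec
    # Assumption: refit once more so the returned line excludes every bad anchor.
    return llse([p for p in ap if p not in bad])


def clock_fit(ap: List[Anchor], q_width: float, delta_high: float,
              delta_low: float, delta_dec: float) -> Tuple[float, float]:
    """Pick the most populated intercept bin, then fit it robustly."""
    bins = hough_quantize(ap, q_width)
    if not bins:
        raise ValueError("no pair of anchor points has a slope near 1.0")
    largest = max(bins.values(), key=len)
    return compute_alpha_beta(sorted(largest), delta_high, delta_low, delta_dec)


def _gts(lts: float, offset: float = 0.0) -> float:
    """Hypothetical ground truth: alpha = 1.0001, beta = 1.2e9 s."""
    return 1.0001 * lts + 1.2e9 + offset


# One anchor per day; day 5 is 120 s wrong, days 11 and 12 were taken with a
# basestation clock that was 10 hours off.
anchors = [(k * 86_400.0, _gts(k * 86_400.0)) for k in range(1, 11) if k != 5]
anchors.append((5 * 86_400.0, _gts(5 * 86_400.0, 120.0)))
anchors += [(k * 86_400.0, _gts(k * 86_400.0, 36_000.0)) for k in (11, 12)]

alpha, beta = clock_fit(anchors, q_width=3_600.0,
                        delta_high=60.0, delta_low=5.0, delta_dec=20.0)
print(alpha, beta)
```

In this example the intercept quantization separates the two anchors taken with the 10-hour clock error from the rest, and the iterative fit then censors the mildly wrong anchor that shares a bin with the good ones.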

Fig. 6. The length of day pattern for two long segments belonging to different nodes. Day 0 represents the start-time for each of the segments. (Panels: Node 73, Node 76; axes: days vs. length of day [hours].)

3.2 Sundial

The parameters of the solar cycle (sunrise, sunset, noon) follow a well defined pattern for locations on Earth with a given latitude.
This pattern is evident in Figure 4, which presents the length of day (LOD) and solar noon for the period between January 2006 and June 2008 for the latitude of the L deployment. Note that the LOD signal is periodic and sinusoidal. Furthermore, the frequency of the solar noon signal is twice the frequency of the LOD signal. We refer the reader to [6] for more details on how the length of day can be computed for a given location and day of the year. The paragraphs that follow explain how information extracted from our light sensors can be correlated with known solar information to reconstruct the measurement timestamps.

Extracting light patterns: We begin by looking at the time series L_i of light sensor readings for node i. L_i is defined for a single segment in terms of the local clock. First, we create a smooth version of this series, to remove noise and sharp transients. Then, we compute the first derivative for the smoothed L_i series, generating the D_i time series. Figure 5 provides an illustration of a typical D_i series overlaid on the light sensor series (L_i). One can notice the pattern of inflection points representing sunrise and sunset. The regions where the derivative is high represent mornings, while the regions where the derivative is low represent evenings. For this method, we select sunrise to be the point at which the derivative is maximum and sunset the point at which the derivative is minimum. Then, LOD is given as the difference between sunrise and sunset, while noon is set to the midpoint between sunrise and sunset.

The method described above accurately detects noon time. However, the method introduces a constant offset in LOD detection and it underestimates LOD due to a late sunrise detection and an early sunset detection.
The noon time is unaffected due to these equal but opposite biases.

Fig. 7. An illustration of the computed LOD and noon values for the lag with maximum correlation with the solar model. (Axes: date, Apr 2006 to Apr 2007, vs. hours; series: model LOD, model noon, computed LOD, computed noon.)

In practice, we found that a simple thresholding scheme works best for finding the sunrise and sunset times. The light sensors' sensitivity to changes simplifies the process of selecting the appropriate threshold. In the end, we used a hybrid approach whereby we obtain noon times from the method that uses derivatives and LOD times from the thresholding method. The net result of this procedure is a set of noon times and LOD for each day from the segment's start in terms of the local clock. Figure 6 shows the LOD values obtained for two different node segments after extracting the light patterns.
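
The hybrid extraction described above can be sketched for a single day of samples as follows. This is a simplified illustration under stated assumptions, not the processing used in the deployments: the smoothing is a three-point moving average, the threshold value is arbitrary, the light curve is a synthetic half-sinusoid, and all function names are hypothetical.

```python
import math
from typing import List, Sequence, Tuple


def moving_average(values: Sequence[float], window: int = 3) -> List[float]:
    """Centered moving average; the window is truncated at the series edges."""
    half = window // 2
    out = []
    for k in range(len(values)):
        chunk = values[max(0, k - half): k + half + 1]
        out.append(sum(chunk) / len(chunk))
    return out


def derivative_events(times: Sequence[float], light: Sequence[float],
                      window: int = 3) -> Tuple[float, float]:
    """Sunrise and sunset as the times of maximum and minimum first derivative."""
    smooth = moving_average(light, window)
    slopes = []
    for k in range(len(smooth) - 1):
        slope = (smooth[k + 1] - smooth[k]) / (times[k + 1] - times[k])
        midpoint = (times[k] + times[k + 1]) / 2.0
        slopes.append((slope, midpoint))
    sunrise = max(slopes, key=lambda s: s[0])[1]
    sunset = min(slopes, key=lambda s: s[0])[1]
    return sunrise, sunset


def threshold_events(times: Sequence[float], light: Sequence[float],
                     threshold: float) -> Tuple[float, float]:
    """Sunrise and sunset as the first and last samples at or above threshold."""
    above = [t for t, v in zip(times, light) if v >= threshold]
    if not above:
        raise ValueError("no sample reaches the threshold")
    return above[0], above[-1]


def hybrid_lod_and_noon(times: Sequence[float], light: Sequence[float],
                        threshold: float) -> Tuple[float, float]:
    """Noon from the derivative method, LOD from the thresholding method."""
    rise_d, set_d = derivative_events(times, light)
    rise_t, set_t = threshold_events(times, light, threshold)
    noon = (rise_d + set_d) / 2.0
    lod = set_t - rise_t
    return lod, noon


# Synthetic day in local-clock seconds, sampled every 20 minutes, with light
# between 06:00 and 18:00 (true LOD 12 h, true noon at 43,200 s).
times = [1_200.0 * k for k in range(72)]
light = [1_000.0 * max(0.0, math.sin(math.pi * (t - 21_600.0) / 43_200.0))
         for t in times]

lod, noon = hybrid_lod_and_noon(times, light, threshold=50.0)
rise_d, set_d = derivative_events(times, light)
print(lod / 3_600.0, noon / 3_600.0, (set_d - rise_d) / 3_600.0)
```

On this synthetic day the derivative method places sunrise late and sunset early by the same amount, so its noon estimate stays centered while its LOD is shorter than the thresholded LOD, which mirrors the bias discussed in the text.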

Solar reconstruction of clocks: The solar model provides the LOD and noon values in terms of the global clock (LOD_GT), while the procedure described in the previous paragraph extracts the LOD and noon values from light sensor measurements in terms of the motes' local clocks (LOD_LT). In order to find the best possible day alignment, we look at the correlation between the two LOD signals (LOD_GT, LOD_LT) as a function of the lag (shift in days). The lag that gives us the maximum correlation (ρ_max) is an estimate of the day alignment. Mathematically, the day alignment estimate (lag) is obtained as

$$\arg\max_{\mathit{lag}} \; \mathrm{Cor}(LOD_{GT}, LOD_{LT}, \mathit{lag})$$

where Cor(X, Y, s) is the correlation between time series X and Y shifted by s time units. Figure 7 presents an example of the match between model and computed LOD and noon times achieved by the lag with the highest correlation. The computed LOD time series tracks the one given by the solar model. One also observes a constant shift between the two LOD patterns, which can be attributed to the horizon effect. For some days, canopy cover and weather patterns cause the extracted LOD to be underestimated. However, as the day alignment is obtained by performing a cross-correlation with the model LOD pattern, the result is robust to constant shifts. Furthermore, Figure 7 shows that the equal and opposite effect of sunrise and sunset detection ensures that the noon estimation is unaffected in the average case.

Fig. 8. The steps involved in reconstructing global timestamps using Sundial. (Schematic blocks: Light Timeseries, Length of Day Filter, Solar Model, Cross Correlation, Day Offset, Shift, Anchor Points, RGTR, Global Timestamps; the signals exchanged are length of day and noon in local and global timestamps.)

After obtaining the day alignment, we use the noon information to generate anchor points.
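
The day-alignment step described above can be sketched in Python as a search over integer lags for the maximum Pearson correlation. The sketch is illustrative only: the sinusoidal model LOD is a stand-in for the solar model of [6], the 0.5 hour offset imitates the constant shift attributed to the horizon effect, and the function names `pearson` and `best_day_lag` are hypothetical.

```python
import math
from typing import Sequence, Tuple


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation of two equal-length series (0.0 if either is constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def best_day_lag(model_lod: Sequence[float],
                 local_lod: Sequence[float]) -> Tuple[int, float]:
    """Return (lag, rho_max): the day shift that best aligns local LOD to the model."""
    n = len(local_lod)
    if n < 2 or n > len(model_lod):
        raise ValueError("local series must have 2..len(model_lod) entries")
    best_lag, rho_max = 0, -2.0
    for lag in range(len(model_lod) - n + 1):
        rho = pearson(model_lod[lag:lag + n], local_lod)
        if rho > rho_max:
            best_lag, rho_max = lag, rho
    return best_lag, rho_max


# Stand-in model: 500 days of LOD in hours, one value per day.
model_lod = [12.0 + 2.5 * math.sin(2 * math.pi * (d - 80) / 365.0)
             for d in range(500)]

# Hypothetical segment: 200 days starting on model day 100, with every LOD
# value underestimated by a constant half hour.
local_lod = [hours - 0.5 for hours in model_lod[100:300]]

lag, rho_max = best_day_lag(model_lod, local_lod)
print(lag, rho_max)
```

Because the Pearson correlation is unchanged by a constant offset, the half-hour underestimate does not move the maximizing lag, which is the robustness property noted in the text. The search window here is shorter than one year of extra model data; with a longer model signal, lags one year apart would correlate equally well and additional information would be needed to choose between them.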
Specifically, for each day of the segment we have available to us the noon time in local clock (from the light sensors) and noon time in global clock (using the model). RGTR can then be used to obtain robust values of α and β. This fit is used to reconstruct the global timestamps.

As Figure 4 suggests, the noon times change slowly over consecutive days as they oscillate around 12:00. Thus, even if the day estimate is inaccurate, due to the small difference in noon times, the α estimate remains largely unaffected. This implies that even if the day alignment is not optimal, the time reconstruction within the day will be accurate, provided that the noon times are accurately aligned. The result of an inaccurate lag estimate is that β is off by a value equal to the difference between the actual day and our estimate. In other words, β is off by an integral and constant number of days (without any skew) over the course of the whole deployment period.

Fig. 9. Node identifiers, segments and length of each segment (in days) for the two deployments used in the evaluation. (Timeline from Feb 2006 to Aug 2008. Deployment L nodes: 76, 73, 72, 74, 71, 75. Deployment J nodes: 2, 5, 6, 8, 11, 13, 14, 29, 43, 44, 45. Segment-length labels: 289, 481, 587, 567, 341, 308, 158, 141, 167 and 134 days.)

We find that this methodology is well suited to finding the correct α. To improve the β estimate, we perform an iterative procedure which works as follows. For each iteration, we obtain the best estimate fit (α, β). We convert the motes' local timestamps into global timestamps using this fit. We then look at the difference between the actual LOD (given by the model) and the current estimate for that day. If the difference between the expected LOD and the estimated LOD exceeds a threshold, we label that day as an outlier.
We remove these outliers and perform the LOD cross-correlation to obtain the day shift (lag) again. If the new lag differs from the lag in the previous iteration, a new fit is obtained by shifting the noon times by an amount proportional to the new lag. We iterate until the lag does not change from the previous iteration. Figure 8 shows a schematic of the steps involved in reconstructing global timestamps for a segment.

4 Evaluation

We evaluate the proposed methodology using data from two deployments. Deployment J was done at the Jug Bay wetlands sanctuary along the Patuxent river in Anne Arundel County, Maryland. The data it collected is used to study the nesting conditions of the Eastern Box turtle (Terrapene carolina) [10]. Each of the motes was deployed next to a turtle nest; some of them have a clear view of the sky while others are under multiple layers of tree canopy. Deployment L, from Leakin Park, is described in Section 2.3.

Figure 9 summarizes the node identifiers, segments, and segment lengths in days for each of the two deployments. Recall that a segment is defined as a block of data for which the mote's clock increases monotonically. Data obtained from the L dataset contained some segments lasting well over 500 days. The L deployment uses MicaZ motes [3], while the J deployment uses TelosB motes [8]. Motes 2, 5, and 6 from Deployment J collected samples every 10 minutes. All other motes for both deployments had a sampling interval of 20 minutes. In addition to its on-board light, temperature, and humidity sensors, each mote was connected to two soil moisture and two soil temperature sensors.

In order to evaluate Sundial's accuracy, we must compare the reconstructed global timestamps it produces with timestamps that are known to be accurate and precise. Thus we begin our evaluation by establishing the ground truth.

Fig. 10. Error in days for different motes from the L and J deployments. (Y-axis: error [days]; groups: Deployment L, Deployment J.)

Fig. 11. Root mean square error in minutes (RMSE_min). (Y-axis: root mean square error [minutes]; groups: Deployment L, Deployment J.)

4.1 Ground Truth

For each of the segments shown in Figure 9, a set of good anchor points (sampled using the basestation) was used to obtain a fit that maps the local timestamps to the global timestamps. We refer to this fit as the ground truth fit. This fit was validated in two ways. First, we correlated the ambient temperature readings among different sensors. Second, we correlated the motes' measurements with the air temperature measurements recorded by nearby weather stations. The weather station for the L deployment was located approximately 17 km away from the deployment site [1], while the one for the J deployment was located less than one km away [7]. Considering the proximity of the two weather stations, we expect that their readings are strongly correlated to the motes' measurements.

Note that even if the absolute temperature measurements differ, the diurnal temperature patterns should exhibit the same behavior, thus leading to high correlation values. Visual inspection of the temperature data confirmed this intuition. Finally, we note that due to the large length of the segments we consider, any inconsistencies in the ground truth fit would become apparent for reasons similar to the ones provided in Section 2.2.

4.2 Reconstructing Global Timestamps using Sundial

We evaluate Sundial using data from the segments shown in Figure 9.
Specifically, we evaluate the accuracy of the timestamps reconstructed by Sundial as though the start time of these segments is unknown (similar to the case of a mote reboot) and no anchor points are available. Since we make no assumptions about the segment start-time, a very large model (solar) signal needs to be considered to find the correct shift (lag) for the day alignment.

Fig. 12. Relation between ρ_max and error in days. (Axes: ρ_max vs. day error [days].)

Fig. 13. α estimates from Sundial and estimation errors in ppm. (Two panels, one per deployment; axes: α estimate and error in ppm.)

Evaluation Metrics: We divide the timestamp reconstruction error into: (a) error in days; and (b) error in minutes within the day. The error in minutes is computed as the root mean square error (RMSE_min) over all the measurements. We divide the reconstruction error into these two components because this decoupling naturally reflects the accuracy of estimating the α and β parameters. Specifically, if the α estimate were inaccurate, then, as Figure 2 suggests, the reconstruction error would grow as a function of time. In turn, this would result in a large root mean squared error in minutes within the day over all the measurements. On the other hand, a low RMSE_min corresponds to an accurate estimate for α. Likewise, inaccuracies in the estimation of β would result in a large error in days.

Results: Figures 10 and 11 summarize Sundial's accuracy results. Overall, we find that longer segments show a lower day error. Segments belonging to the L deployment span well over a year and the minimum day error is 0 while the maximum day error is 6.
In contrast, most of the segments for deployment J are less than 6 months long and the error in days for all but two of those segments is less than one week. Figure 12 presents the relationship between the maximum correlation (ρ_max) and the day error. As ρ_max measures how well we are able to match the LOD pattern for a node with the solar LOD pattern, it is not surprising that high correlation is generally associated with low reconstruction error.

The RMSE_min obtained for each of the segments in deployment L is very low (see Figure 11). Remarkably, we are able to achieve an accuracy (RMSE_min) of under a minute for the majority of the nodes of the L deployment even though we are limited by our sampling interval of 20 minutes. Moreover, the RMSE_min error is always within one sample period for all but one segment.

Fig. 14. Error in days as a function of segment size. (Axes: segment length [days] vs. day error [days].)

Fig. 15. Error in minutes (RMSE_min) as a function of segment size. (Axes: segment length [days] vs. root mean square error [minutes].)

Interestingly, we found that the α values for the two deployments were significantly different. This disparity can be attributed to differences in node types and thus clock logic. Nonetheless, Sundial accurately determined α in both cases. Figure 13 presents the α values for the two deployments. We also show the error between the α obtained using Sundial and the α value obtained by fitting the good anchor points sampled by the gateway (i.e., ground truth fit). The ppm error for both deployments is remarkably low and close to the operating error of the quartz crystal.
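
The two error components used in Section 4.2, and the ppm error between two α estimates, can be computed along the following lines. The text does not spell out how the day error is separated from the within-day error, so the split below is an assumption: the median error is rounded to whole days, and the remainder is summarized as an RMSE in minutes. The function names and sample values are hypothetical.

```python
import math
from statistics import median
from typing import Sequence, Tuple

SECONDS_PER_DAY = 86_400.0


def split_error(reconstructed: Sequence[float],
                truth: Sequence[float]) -> Tuple[int, float]:
    """Return (day_error, rmse_min) for reconstructed vs. ground-truth timestamps."""
    if not truth or len(reconstructed) != len(truth):
        raise ValueError("need two non-empty series of equal length")
    errors = [r - t for r, t in zip(reconstructed, truth)]
    # Assumption: the whole-day component is the median error rounded to days.
    day_error = round(median(errors) / SECONDS_PER_DAY)
    residual_min = [(e - day_error * SECONDS_PER_DAY) / 60.0 for e in errors]
    rmse_min = math.sqrt(sum(m * m for m in residual_min) / len(residual_min))
    return day_error, rmse_min


def alpha_error_ppm(alpha_estimate: float, alpha_reference: float) -> float:
    """Absolute difference between two slope estimates, in parts per million."""
    return abs(alpha_estimate - alpha_reference) * 1e6


# Hypothetical segment sampled every 20 minutes whose reconstruction is off
# by two whole days plus 30 seconds.
truth = [1_200.0 * k for k in range(1_000)]
reconstructed = [t + 2 * SECONDS_PER_DAY + 30.0 for t in truth]

day_error, rmse_min = split_error(reconstructed, truth)
ppm = alpha_error_ppm(1.000139, 1.000129)
print(day_error, rmse_min, ppm)
```

Separating the two components this way keeps a wrong day alignment (a β error) from inflating the within-day error, which is the quantity that reflects the quality of the α estimate.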
4.3 Impact of Segment Length

Sundial relies on matching solar patterns to the ones observed by the light sensors. The natural question to ask is: what effect does the length of the segment have on the reconstruction error? We address this question by experimenting with the length of segments and observing the reconstruction error in days and RMSE_min. We selected data from three long segments from deployment L. To eliminate bias, the start of each shortened segment was chosen from a uniform random distribution. Figure 15 shows that the RMSE_min tends to be remarkably stable even for short segments. One concludes that even for short segment lengths, Sundial estimates the clock drift (α) accurately. Figure 14 shows the effect of segment size on day error. In general, the day error decreases as the segment size increases. Moreover, for segments less than 150 days long, the error tends to vary considerably.

4.4 Day Correction

The results so far show that 88% (15 out of 17) of the motes have a day offset of less than a week. Next, we demonstrate how global events can be used to correct for the day offset.

Fig. 16. An illustration of the cosine similarity (θ_SM-PPT) values for seven different day lags between moisture and rainfall vectors. θ_SM-PPT peaks at the correct lag of five days, providing the correct day adjustment. (Panels: soil moisture and rainfall over days; cosine similarity versus lag [days].)

We looked at soil moisture data from eight motes of the J deployment after obtaining the best possible timestamp reconstruction. Specifically, we correlated the motes' soil moisture data with rainfall data to correct for the day offset. We used rainfall data from a period of 133 days, starting from December 4, 2007, during which 21 major rain events occurred.
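The day-offset search illustrated in Fig. 16 amounts to sliding one daily event vector against another and keeping the lag with the highest cosine similarity. The sketch below is a minimal illustration under stated assumptions: the inputs are already-built daily vectors of equal length, the names `shift` and `best_day_lag` are hypothetical, and a positive lag is taken to mean that the soil moisture series must be moved that many days later to line up with rainfall. The thresholding and weighting used to build the vectors are not reproduced here.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def shift(vec, lag):
    """Shift a daily vector by `lag` days, padding with zeros.

    A positive lag moves every entry to a later day; entries that
    fall outside the vector are dropped.
    """
    out = [0.0] * len(vec)
    for i, value in enumerate(vec):
        j = i + lag
        if 0 <= j < len(vec):
            out[j] = value
    return out


def best_day_lag(soil_moisture, rainfall, max_lag=7):
    """Return the lag in [-max_lag, max_lag] that maximizes cosine similarity."""
    lags = range(-max_lag, max_lag + 1)
    return max(lags, key=lambda lag: cosine_similarity(shift(soil_moisture, lag), rainfall))


# Hypothetical example: soil moisture events recorded five days too early.
rainfall = [0.0] * 20
rainfall[6], rainfall[11], rainfall[15] = 5.0, 6.0, 7.0
soil_moisture = shift(rainfall, -5)
print(best_day_lag(soil_moisture, rainfall))  # 5
```

Zero padding keeps the two vectors the same length at every lag, at the cost of discarding events near the ends of the window; with a search window of only a few days against a much longer record this loss is small.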
To calculate the correlation, we created weighted daily vectors for soil moisture measurements (SM) whose value was greater than a certain threshold and, similarly, rainfall vectors having a daily precipitation (PPT) value of greater than 4.0 cm. Next, we extracted the lag at which the cosine of the angle between the two vectors (cosine similarity, θ_SM-PPT) is maximum. This method is inspired by the well-known document clustering model used in the information retrieval community [9]. Note that we computed θ_SM-PPT for a two-week window (± seven days) of lags and found that seven out of the eight motes could be aligned perfectly. Figure 16 illustrates the soil moisture vectors, rainfall vectors and the associated θ_SM-PPT for seven lags for one of the segments. Note that θ_SM-PPT peaks at the correct lag of five, leading to the precise day correction. While we use soil moisture to illustrate how global events can be used to achieve macro-level clock adjustments, other modalities can also be used based on the application's parameters.

5 Related Work

This study proposes a solution to the problem of postmortem timestamp reconstruction for sensor measurements. To our knowledge, there is little previous work that addresses this problem for deployments that span a year or longer. Deployment length can be an issue because the reconstruction error monotonically increases as a function of time (cf. Sec. 2.2). The timestamp reconstruction problem was first introduced by Werner-Allen et al., who provided a detailed account of the challenges they faced in synchronizing mote clocks during a 19-day deployment at an active volcano [13]. Specifically, while the system employed the FTSP protocol to synchronize the network's motes, unexpected faults forced the authors to rely on an offline time rectification algorithm to reconstruct global timestamps.
While experiences such as the one reported in [13] provide motivation for an independent time reconstruction mechanism such as the one proposed in this paper, the problem addressed by Werner-Allen et al. is different from the one we aim to solve. Specifically, the volcano deployment had access to precise global timestamps (through a GPS receiver deployed at the site) and used linear regression to translate local timestamps to global time, once timestamp outliers were removed. While RGTR can also be used for outlier detection and timestamp reconstruction, Sundial aims to recover timestamps in situations where a reliable global clock source is not available.

Finally, Chang et al. [2] describe their experiences with motes rebooting and resetting of logical clocks, but do not furnish any details of how they reconstructed the global timestamps when this happened.

6 Conclusion

In this paper we present Sundial, a method that uses light sensors to reconstruct global timestamps. Specifically, Sundial uses light intensity measurements, collected by the motes' on-board sensors, to reconstruct the length of day (LOD) and noon time throughout the deployment period. It then calculates the slope and the offset by maximizing the correlation between the measurement-derived LOD series and the one provided by astronomy. Sundial operates in the absence of global clocks and allows for random node reboots. These features make Sundial very attractive for environmental monitoring networks deployed in harsh environments, where they operate disconnected over long periods of time. Furthermore, Sundial can be used as an independent verification technique along with any other time reconstruction algorithm.

Using data collected by two network deployments spanning a total of 2.5 years, we show that Sundial can achieve accuracy in the order of a few minutes.
Furthermore, we show that one can use other global events such as rain events to correct any day offsets that might exist. As expected, Sundial's accuracy is closely related to the segment size. In this study, we perform only a preliminary investigation of how the length of the segment affects accuracy. An interesting research direction we would like to pursue is to study the applicability of Sundial to different deployments. Specifically, we are interested in understanding how sampling frequency, segment length, latitude and season (time of year) collectively affect reconstruction accuracy.

Sundial exploits the correlation between the well-understood solar model and the measurements obtained from inexpensive light sensors. In principle, any modality having a well-understood model can be used as a replacement for Sundial. In the absence of a model, one can exploit correlation with a trusted data source to achieve reconstruction, e.g., correlating the ambient temperature measurements of the motes with data obtained from a nearby weather station. However, we note that many modalities (such as ambient temperature) can be highly susceptible to micro-climate effects and exhibit a high degree of spatial and temporal variation. Thus, the micro-climate invariant solar model makes light a robust modality to reconstruct timestamps in the absence of any sampled anchor points.

Finally, we would like to emphasize the observation that most environmental modalities are affected by the diurnal and annual solar cycles and not by the human-created universal time. In this regard, the time base that Sundial establishes offers a more natural reference basis for environmental measurements.

Acknowledgments

We would like to thank Yulia Savva (JHU, Department of Earth and Planetary Science) for helping us identify the timestamp reconstruction problem.
This research was supported in part by NSF grants CNS-0546648, CSR-0720730, and DBI-0754782. Any opinions, findings, conclusions or recommendations expressed in this publication are those of the authors and do not represent the policy or position of the NSF.

References

1. Baltimore-Washington International airport, weather station. Available at: http://weather.marylandweather.com/cgi-bin/findweather/getForecast?query=BWI.
2. M. Chang, C. Cornou, K. Madsen, and P. Bonett. Lessons from the Hogthrob Deployments. In Proceedings of the Second International Workshop on Wireless Sensor Network Deployments (WiDeploy08), June 2008.
3. Crossbow Corporation. MICAz Specifications. Available at http://www.xbow.com/Support/Support_pdf_files/MPR-MIB_Series_Users_Manual.pdf.
4. R. Duda, P. Hart, and D. Stork. Pattern Classification. Wiley, 2001.
5. R. O. Duda and P. E. Hart. Use of the Hough transformation to detect lines and curves in pictures. Commun. ACM, 15(1):11–15, 1972.
6. W. C. Forsythea, E. J. Rykiel Jr., R. S. Stahla, H. Wua, and R. M. Schoolfield. A model comparison for daylength as a function of latitude and day of year. ScienceDirect, 80(1), Jan. 1994.
7. National Estuarine Research Reserve. Jug Bay weather station (cbmjbwq). Available at http://cdmo.baruch.sc.edu/QueryPages/anychart.cfm.
8. J. Polastre, R. Szewczyk, and D. Culler. Telos: Enabling Ultra-Low Power Wireless Research. In Proceedings of the Fourth International Conference on Information Processing in Sensor Networks: Special track on Platform Tools and Design Methods for Network Embedded Sensors (IPSN/SPOTS), Apr. 2005.
9. G. Salton, A. Wong, and C. S. Yang. A vector space model for automatic indexing. Commun. ACM, 18(11):613–620, 1975.
10. K. Szlavecz, A. Terzis, R. Musaloiu-E., C.-J. Liang, J. Cogan, A. Szalay, J. Gupchup, J. Klofas, L. Xia, C. Swarth, and S. Matthews. Turtle Nest Monitoring with Wireless Sensor Networks. In Proceedings of the American Geophysical Union, Fall Meeting, 2007.
11. A. Terzis, R. Musaloiu-E., J. Cogan, K. Szlavecz, A. Szalay, J. Gray, S. Ozer, M. Liang, J. Gupchup, and R. Burns. Wireless Sensor Networks for Soil Science. International Journal on Sensor Networks.
12. G. Tolle, J. Polastre, R. Szewczyk, N. Turner, K. Tu, P. Buonadonna, S. Burgess, D. Gay, W. Hong, T. Dawson, and D. Culler. A Macroscope in the Redwoods. In Proceedings of the 3rd ACM SenSys Conference, Nov. 2005.
13. G. Werner-Allen, K. Lorincz, J. Johnson, J. Lees, and M. Welsh. Fidelity and Yield in a Volcano Monitoring Sensor Network. In Proceedings of the 7th USENIX Symposium on Operating Systems Design and Implementation (OSDI), Nov. 2006.
