A Static Optimality Transformation with Applications to Planar Point Location

Over the last decade, there have been several data structures that, given a planar subdivision and a probability distribution over the plane, provide a way for answering point location queries that is fine-tuned for the distribution. All these method…

Authors: John Iacono, Wolfgang Mulzer

March 2, 2022

Abstract

Over the last decade, there have been several data structures that, given a planar subdivision and a probability distribution over the plane, provide a way for answering point location queries that is fine-tuned for the distribution. All these methods suffer from the requirement that the query distribution must be known in advance. We present a new data structure for point location queries in planar triangulations. Our structure is asymptotically as fast as the optimal structures, but it requires no prior information about the queries. This is a 2-d analogue of the jump from Knuth’s optimum binary search trees (discovered in 1971) to the splay trees of Sleator and Tarjan in 1985. While the former need to know the query distribution, the latter are statically optimal. This means that we can adapt to the query sequence and achieve the same asymptotic performance as an optimum static structure, without needing any additional information.

1 Introduction

We consider the problem of finding a statically optimal data structure for planar point location in triangulations. This problem and related problems have a long history that goes back to the dawn of computer science. Thus, before giving a formal description of the problem and of our results, let us first provide some background on the history and motivation behind our work.

1.1 1-D History

Comparison-based predecessor search constitutes one of the oldest problems in computer science: given a set $S$ from a totally ordered universe $U$, we would like to construct a data structure for answering predecessor queries. In such a query, we are given an element $x \in U$, and we need to return the largest $y \in S$ with $y \le x$ (or $-\infty$, if no such $y$ exists).
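As a concrete illustration of a predecessor query, the following minimal Python sketch answers it by binary search over a sorted list. The function name `predecessor` and the use of `-math.inf` to stand for $-\infty$ are choices made here for illustration; the sketch assumes numeric keys that are already sorted.

```python
import math
from bisect import bisect_right
from typing import Sequence


def predecessor(sorted_keys: Sequence[float], x: float) -> float:
    """Return the largest y in sorted_keys with y <= x, or -inf if none exists."""
    # bisect_right gives the number of keys that are <= x.
    i = bisect_right(sorted_keys, x)
    return sorted_keys[i - 1] if i > 0 else -math.inf
```

For example, with keys `[2, 5, 9]`, the query `x = 6` returns `5`, and the query `x = 1` returns `-math.inf`.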
In the most general decision-tree model, we are allowed to evaluate in each step an arbitrary function $f : U \to \{0, 1\}$ on $x$, where the choice of $f$ may depend on the outcomes of the previous evaluations. The classic solution sorts $S$ during preprocessing and answers queries in $O(\log n)$ steps through binary search, where $n$ denotes the size of $S$. Information theoretic arguments imply that any such comparison-based algorithm requires $\Omega(\log n)$ steps in the worst case (see, e.g., Ailon et al. [2, Section 2] for more details).

However, the story does not end here. Early in the history of computer science, researchers realized that if the distribution of query outcomes is sufficiently biased, $o(\log n)$ expected-time query processing becomes possible. This insight led to the invention of optimal search trees. These are specialized data structures for the case that the query outcomes are drawn independently from a known fixed distribution, and a wide literature studying their variants and extensions has been developed [10, 22–24, 28, 33, 35–37, 41–44, 51, 52]. In this context, optimality is characterized by the entropy of the distribution: if $p_i$ denotes the probability of the $i$th outcome, the entropy $H$ is defined as $\sum_i -p_i \log_2 p_i$. Information theory [48] shows that $H$ is a lower bound for the expected number of steps that any comparison-based algorithm needs to answer a predecessor query, assuming that the searches are drawn independently from a fixed distribution (e.g., [2, Claim 2.2]).

All the above results require that the distribution, or a suitable approximation thereof, be known in advance. This situation changed in 1985, when Sleator and Tarjan [49] introduced splay trees. These trees have many amazing properties, not the least of which is called static optimality.
This means that for any sufficiently long query sequence, splay trees are asymptotically as fast as optimal static search trees. For this, splay trees require no prior information on the query distribution.

1.2 2-D History

Planar point location is a fundamental problem in computational geometry. A triangulation $S$ is a partition of the plane into (possibly infinite) triangles. Given $S$, we need to construct a data structure for point location queries: given a point $p \in \mathbb{R}^2$, return the triangle of $S$ that contains it. Again, we use a decision-tree model. This means that in each step we may evaluate an arbitrary function $f : \mathbb{R}^2 \to \{0, 1\}$ on $p$, where $f$ may depend on the previous comparisons. There are several point location structures with $O(\log n)$ query time, which is optimal in our decision-tree model. These structures are notable not only for achieving optimality, but for doing so through very different methods, such as planar separators [39, 40], Kirkpatrick’s successive refinement approach [34], persistence [46], layered DAGs [20], or randomized incremental construction [45, 47].

Once again, it makes sense to consider biased query distributions. For a known fixed distribution of point location queries, there are several data structures that achieve optimal expected query time, assuming independence. These biased structures are analogous to optimal search trees. Thus, we can use the same information theoretic arguments to characterize the optimal expected query time by the entropy $H$ of the probabilities of the queried regions [2, Claim 2.2]. A series of papers by Arya et al. [3–8] converge on two algorithms. The first one achieves query time $H + O(\sqrt{H} + 1)$ with $O(n)$ space, while the second, simpler, algorithm supports queries in time $(5 \ln 2) H + O(1)$ and $O(n \log n)$ space.$^1$ The latter algorithm is a truly simple variant of randomized incremental construction [45, 47], where the random choices are biased according to the distribution. Both structures are randomized and have superlinear construction costs. Iacono [31] presented a data structure that supports $O(H)$ time queries in $O(n)$ space. Unlike the aforementioned results, it is deterministic and can be constructed in linear time, but it has terrible constants.

$^1$ In this context, query time refers to the expected depth of the associated decision tree.

1.3 Creating a point location structure that is statically optimal

In view of the developments for binary search trees, one question presents itself: Is there a point location structure that is asymptotically as fast as the biased structures, without explicit knowledge of the query distribution? Or, put differently, can a point location structure achieve a running time similar to the static optimality bound of splay trees? This open problem, which we resolve here, explicitly appears in several previous works on point location, e.g., in Arya et al. [8, Section 6]:

"Taking this in a different direction, suppose that the query distribution is not known at all. That is, the probabilities that the query point lies within the various cells of the subdivision are unknown. In the 1-dimensional case it is known that there exist self-adjusting data structures, such as splay trees, that achieve good expected query time in the limit. Do such self-adjusting structures exist for planar point location?"

There are several possible approaches towards statically optimal point location. One, suggested above, would be to create some sort of self-adjusting point location structure and to analyze it in a way similar to splay trees.
This has not been done; we suspect that the main stumbling block is that all known efficient structures are comparison DAGs [20, 34, 45–47]: they can be represented as a directed acyclic graph with a unique source and out-degree 2, such that each node corresponds to a planar region. A point location query proceeds by starting at the source and by following in each step an edge that is determined by comparing the query point with a fixed line. The query continues until it reaches a sink, whose corresponding region constitutes the desired query outcome. In order to achieve reasonable space usage, it seems essential to use a DAG instead of a simple tree. Unfortunately, we do not know how to perform rotation-like local changes in such DAGs that would mimic the behavior of splay trees.

Another possible avenue is to use splay trees in an existing structure. Goodrich et al. [26] followed this approach, using essentially a hybrid of splay trees and the persistent line-sweep method. Unfortunately, their method does not give a result optimal with respect to the entropy of the original distribution of query outcomes, but rather to the entropy of the probabilities of querying regions of a strip decomposition of the triangulation. The latter is obtained by drawing vertical lines through every vertex of the triangulation. This strip decomposition could split a high-probability triangle into several parts and could potentially increase the entropy of the query result by $\Omega(\log n)$, the worst possible; see Figure 1 for an example.

[Figure 1: two panels, (a) and (b), each labeled "$n + 1$ vertices".]

Figure 1: A bad example for the strip decomposition of [26]: (a) We have $n + 3$ vertices and $n + 1$ triangles. The small triangles each have query probability $1/n^2$; the large, shaded triangle has query probability $1 - 1/n$. The entropy is $(1/n) \log n^2 + (1 - 1/n) \log(n/(n-1)) = O(1)$. (b) The strip decomposition partitions the large triangle into $n + 1$ parts. Suppose each part has probability $(n-1)/(n(n+1)) \approx 1/n$. The resulting entropy is larger than $(1 - 1/n) \log(n(n+1)/(n-1)) = \Omega(\log n)$.

One might also try to create a structure with the working set property. This property, originally used in the analysis of splay trees, states that the processing time for a query $q$ is logarithmic in the number of distinct queries since the last query that returned the same result as $q$. The working set property implies static optimality [30]; it has also proved useful in several other contexts [15, 16, 21, 29, 32]. Most importantly, there is a general transformation from a dynamic $O(\log n)$ time structure into one with the working set property [30]. Unfortunately, even though several dynamic data structures for predecessor searching are known (e.g., AVL trees [1] or red-black trees [27]), it remains a prominent open problem to develop a point location structure that supports insertions, deletions, and queries in $O(\log n)$ time. (Note that [50] claims to modify Kirkpatrick’s method to allow for $O(\log n)$ time insertions, deletions, and queries. The claimed result is wrong.$^2$)

Our solution to the problem of statically optimal point location is very simple: we take a biased structure that needs to be initialized with distributional information, and we rebuild it periodically using the observed frequencies for each region. We do not store all the regions in the biased structure, since this would make the rebuilding step too expensive. Instead, upon rebuilding we create a structure storing only the $n^\beta$ most frequent items observed so far, where $\beta \in (0, 1)$ is some suitable constant. We resort to a static $O(\log n)$ time structure to complete queries for the remaining regions. The rebuilding takes place after every $n^\alpha$ queries, for some constant $\alpha \in (\beta, 1 - \beta)$.
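The rebuilding scheme just outlined can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' implementation: the class name `RebuildingLocator`, the callbacks `static_locate` and `build_biased`, and the toy 1-D instantiation (regions are the unit intervals $[i, i+1)$) are all hypothetical, the biased structure is replaced by a trivial set lookup, and the most frequent regions are recomputed naively with `Counter.most_common` rather than with a structure tuned for fast reporting. For simplicity the sketch rebuilds at every multiple of roughly $n^\alpha$ queries.

```python
from collections import Counter
from typing import Callable, Dict, Hashable, Optional

Region = Hashable
Locator = Callable[[float], Optional[Region]]


class RebuildingLocator:
    """Hypothetical wrapper: periodic rebuilds of a biased structure plus a static fallback."""

    def __init__(self, n: int, static_locate: Locator,
                 build_biased: Callable[[Dict[Region, int]], Locator],
                 alpha: float = 0.5, beta: float = 1 / 3) -> None:
        assert 0 < beta < alpha < 1 - beta < 1
        self.static_locate = static_locate
        self.build_biased = build_biased
        self.period = max(1, round(n ** alpha))  # rebuild every ~n^alpha queries
        self.top_k = max(1, round(n ** beta))    # keep the ~n^beta most frequent regions
        self.counts: Counter = Counter()
        self.queries = 0
        self.rebuilds = 0
        self.biased: Optional[Locator] = None

    def query(self, p: float) -> Region:
        # Try the small biased structure first; fall back to the static one on failure.
        region = self.biased(p) if self.biased is not None else None
        if region is None:
            region = self.static_locate(p)
        self.counts[region] += 1
        self.queries += 1
        if self.queries % self.period == 0:
            # Weights are the frequencies observed so far.
            weights = dict(self.counts.most_common(self.top_k))
            self.biased = self.build_biased(weights)
            self.rebuilds += 1
        return region


# Toy instantiation (an assumption for illustration): region i is the interval [i, i+1).
def static_locate(p: float) -> int:
    return int(p)


def build_biased(weights: Dict[Region, int]) -> Locator:
    hot = set(weights)  # stand-in for a real biased structure on the hot regions

    def locate(p: float) -> Optional[Region]:
        r = int(p)
        return r if r in hot else None  # None signals failure

    return locate
```

With `n = 64` and the default constants, the wrapper rebuilds every 8 queries and keeps the 4 most frequent regions; queries to other regions fail in the small structure and are answered by `static_locate`.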
This periodic rebuilding is a simple and general method of converting biased structures into statically optimal ones, and it enables us to waive the requirement of distributional knowledge present in all previous biased point location structures, at least for triangulations. Our approach can be seen as a generalization and simplification of a method by Goodrich for dictionaries [25].

2 Notation

Let $U$ be some universal set, and let $S$ be a partition of $U$ into $n$ pieces. The elements of $U$ are called points, the subsets in $S$ are called regions. A location query takes some point $p \in U$ and returns the region $s \in S$ with $p \in s$. The result of a location query with input $p$ is denoted by $q(p)$. A data structure for location queries is called a location query structure. Let $P = \langle p_1, p_2, \dots, p_m \rangle$ be a sequence of $m$ queries, and let $Q := \langle q(p_1), q(p_2), \dots, q(p_m) \rangle$ denote the results of these queries. Let $f_t(s)$ be the number of occurrences of $s$ in the first $t$ elements of $Q$, and define $f(s) := f_m(s)$, the number of times $s$ occurs in the entire sequence. Furthermore, let $t_j(s)$ be the time of the $j$th occurrence of $s$ in $Q$; thus $f_{t_j(s)}(s) = j$. We use $\log x$ to refer to $\max(1, \log_2 x)$; this avoids clutter generated by additive terms that would otherwise be needed to handle degenerate cases of our analysis.

We next define the notion of a biased structure.

Definition 2 Let $S$ be a set of $n$ regions, and let $D$ be a location query structure for $S$. We say that $D$ is biased if the following holds: There exists a function $c_D : \mathbb{N} \to \mathbb{N}$ such that given any weight function $w : S \to \mathbb{R}^+$, $D$ executes any query sequence $P$ in total time
\[ O\Bigl( c_D(n) + \sum_{s \in S} f(s) \log \frac{\sum_{r \in S} w(r)}{w(s)} \Bigr). \]
The function $c_D$ is called the construction cost of the structure.

Suppose we choose $w(s)$ proportional to the number of queries that return the given region, e.g., $w(s) := f(s) + 1$.
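To see what this choice of weights gives, here is a short calculation, offered as a sketch under two extra assumptions that are ours and not part of the definition: $m \ge n$, and $f(s) \ge 1$ for the region under consideration (regions with $f(s) = 0$ contribute nothing to the sum). It uses the convention $\log x = \max(1, \log_2 x)$ from above.

```latex
% Assumptions (for illustration only): w(s) := f(s) + 1, m >= n, f(s) >= 1.
\begin{align*}
\sum_{r \in S} w(r) &= \sum_{r \in S} \bigl( f(r) + 1 \bigr) = m + n \le 2m, \\
\log \frac{\sum_{r \in S} w(r)}{w(s)}
  &= \log \frac{m + n}{f(s) + 1}
  \le \log \frac{2m}{f(s)}
  \le 1 + \log \frac{m}{f(s)}
  \le 2 \log \frac{m}{f(s)}.
\end{align*}
```

Summing over all regions, the bound of Definition 2 becomes $O\bigl( c_D(n) + \sum_{s \in S} f(s) \log (m / f(s)) \bigr)$ under these assumptions, which is $m$ times the empirical entropy of the query results, up to the rounding introduced by the $\max(1, \cdot)$ convention.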
In this case, a biased location query structure achieves an amortized query time that is (of the order of) the entropy $H$ of the query distribution. As we argued in the introduction, this is optimal for our decision-tree model. We now define the notion of static optimality.

$^2$ The method presented in [50] makes the assumption that given a triangle $T$ in a triangulation of size $n$ on which a Kirkpatrick hierarchy has been built, the complexity of the intersection of $T$ with any level of the hierarchy is constant; this is false, as examples where the intersection has size $\sqrt{n}$ are easy to produce.

Definition 3 Let $S$ be a set of $n$ regions, and let $D$ be a location query structure for $S$. We say that $D$ is statically optimal if there exists a function $c_D : \mathbb{N} \to \mathbb{N}$ such that $D$ executes any query sequence $P$ of length $m$ in total time$^3$
\[ O\Bigl( c_D(n) + \sum_{s \in S} f(s) \log \frac{m}{f(s)} \Bigr). \]
We call $c_D$ the construction cost of $D$.

Note that a statically optimal structure is given neither the frequency function $f$ nor any weights in advance; in particular, the structure does not need to be static. We provide a simple method for making a biased location query structure statically optimal, assuming a few technical conditions. The main such condition is that we should be able to construct the biased query structure not just on the set $S$, but on any subset $S'$ of $S$. We require that a location query structure for $S'$ performs as quickly as a biased structure for $S$ when a region in $S'$ is queried, and that it reports failure in $O(\log n)$ time if the query lies outside of $S'$. Formally:

Definition 4 Let $S$ be a set of $n$ regions, and let $D$ be a location query structure.
We call $D$ subset-biased on $S$ if the following holds: there exists a function $c'_D : \mathbb{N} \to \mathbb{N}$ such that given a subset $S' \subseteq S$ of size $n'$ and a weight function $w' : S' \to \mathbb{R}^+$, the structure $D$ executes any query sequence $P$ of length $m$ in time
\[ O\Bigl( c'_D(n') + \sum_{s' \in S'} f(s') \log \frac{\sum_{r' \in S'} w'(r')}{w'(s')} + \Bigl( m - \sum_{s' \in S'} f(s') \Bigr) \log n \Bigr). \]
For each query $p \in P$, we require that $D$ reports the region $s' \in S'$ with $p \in s'$, if it exists, and that $D$ reports a failure otherwise. The function $c'_D$ is called the construction cost of the structure.

Note that $m - \sum_{s' \in S'} f(s')$ is just the number of queries that result in failure. Given Definition 4, we may now state our main theorem:

Theorem 5 Let $S$ be a set of $n$ regions. Suppose we have an $O(\log n)$ time location query structure on $S$ with construction cost $O(n)$ and a subset-biased structure on $S$ with construction cost $O(n' \log n')$. Then we can construct a statically optimal structure on $S$ with construction cost $O(n)$.

$^3$ By convention, $f(s) \log(m/f(s)) := 0$ if $f(s) = 0$.

3 The transformation

We now describe the construction for Theorem 5. By assumption, we are given a set $S$ of $n$ regions, and we have available an $O(\log n)$ time location query structure $D$ on $S$ with construction cost $O(n)$ as well as a subset-biased structure with construction cost $O(n' \log n')$.

3.1 Description of the structure

Let $\alpha$ and $\beta$ be two constants such that $0 < \beta < \alpha < 1 - \beta < 1$ (e.g., $\alpha = 1/2$ and $\beta = 1/3$). The simple idea behind our transformation is as follows: after every $n^\alpha$ queries, we build a subset-biased structure for the $n^\beta$ most commonly accessed regions, in $O(n^\beta \log n) = o(n^\alpha)$ time. We also keep a static $O(\log n)$ time structure as a backup for failed queries in the subset-biased structure. Formally, the structure has several parts:

1. A static $O(\log n)$ query time structure.

2. A structure that keeps track of how often each region was queried and that is capable of reporting the $k$ most popular regions in $O(k)$ time. Since in each step we increment the count for a single region by 1, we can easily maintain such a structure in linear space and constant time per update. (The additional space overhead can be made sublinear at the expense of determinism through the use of a streaming algorithm for the so-called heavy hitters problem (e.g., [12]). This shows that our transformation is also useful in a context where additional space is at a premium, for example for implicit data structures or when the data resides in read-only memory [9].)

3. A subset-biased structure $D'$ that is built after $2n^\alpha$ queries and rebuilt every $n^\alpha$th query thereafter. The structure contains the at most $n^\beta$ most popular regions at the time of the rebuilding that have been queried at least $2n^\alpha$ times. In the choice of these regions, we break ties arbitrarily. The weight of a region $s$, denoted $w'(s)$, is the number of queries to $s$ at the time of the rebuilding. More precisely, if the rebuilding is at time $t$, we set $w'(s) := f_t(s)$. Computing the $n^\beta$ most popular regions and the weight function $w'$ takes time $O(n^\beta)$ with the structure from Part 2. By assumption, the construction cost of $D'$ is $O(n^\beta \log n)$.

A search is executed on the subset-biased structure first. If it fails (at amortized cost $O(\log n)$), it is executed in the static $O(\log n)$ time structure.

3.2 Initial analysis of structure

We will now analyze the properties of our structure. Our first lemma describes a key property of the rebuilding process: for any sufficiently popular region $s$, the amortized query time for $s$ is proportional to the amortized query time a biased structure would achieve if it were weighted with the frequencies observed so far.

Lemma 6 Consider the query $p_t$ at time $t$, and let $s := q(p_t)$ denote the resulting region.
Suppose that $f_t(s) \ge 2n^\alpha$. Then the amortized cost for query $p_t$ is
\[ O\Bigl( \log \frac{t}{f_t(s) - n^\alpha} \Bigr). \]

Proof: Since $f_t(s) \ge 2n^\alpha$, we have $t \ge 2n^\alpha$. Thus, we first query the subset-biased structure $D'$. Suppose that $D'$ was last rebuilt at time $t' \ge t - n^\alpha$. There are two cases. Suppose first that $s$ is contained in $D'$. Definition 4 ensures that the amortized time for the query in $D'$ is $O(\log(W'/f_{t'}(s)))$, where $W'$ denotes the total number of queries for the regions in $D'$ at time $t'$. We have $W' \le t$ (there have been $t$ queries so far) and $f_{t'}(s) \ge f_t(s) - n^\alpha$ (there have been at most $n^\alpha$ queries since rebuilding). The lemma follows.

Now suppose that $s$ is not in $D'$. In this case, the query takes $O(\log n)$ amortized time in $D'$ and $O(\log n)$ time in the static structure. We know that at time $t'$, there were $n^\beta$ regions at least as popular as $s$. Thus, $n^\beta f_{t'}(s) \le t' \le t$. It follows that
\[ \beta \log n = \log n^\beta \le \log \frac{t}{f_{t'}(s)} \le \log \frac{t}{f_t(s) - n^\alpha}, \]
and the claimed bound suffices to account for the $O(\log n)$ query time. □

Using Lemma 6, we can now bound the running time in terms of the query frequencies.

Lemma 7 Let $S$ be a set of $n$ regions. Our structure executes any query sequence $P$ on $S$ of length $m$ in time
\[ O\Biggl( \sum_{s \in S} \Biggl( \underbrace{\min(f(s), 2n^\alpha) \log n}_{\text{first } 2n^\alpha \text{ queries to } s} + \underbrace{\sum_{j=2n^\alpha}^{f(s)} \log \frac{t_j(s)}{f_{t_j(s)}(s) - n^\alpha}}_{\text{queries to } s \text{ after the } 2n^\alpha\text{th}} \Biggr) + \underbrace{\Bigl\lfloor \frac{m}{n^\alpha} \Bigr\rfloor n^\beta \log n}_{\text{rebuild biased structure}} + \underbrace{n}_{\text{static structure construction}} \Biggr). \]

Proof: The main summation is over the regions in $S$. For each region $s$, the initial $2n^\alpha$ (or fewer) queries take time $O(\log n)$ each, since during these queries $s$ is never in the subset-biased structure. The running times for the remaining queries (if any) are bounded using Lemma 6.
The first additional term comes from the $O(n^\beta \log n)$ construction cost of the subset-biased structure, incurred every $n^\alpha$ operations. The final term is the linear one-time cost to build the static structure. □

3.3 Technical Lemmas

In order to simplify the bound in Lemma 7, we need two technical lemmas to deal with the various terms. The first lemma shows how to simplify the summation for the later queries.

Lemma 8 Let $S$ be a set of $n$ regions, and let $P$ be a query sequence on $S$ of length $m$. For each region $s \in S$, we have
\[ \sum_{j=2n^\alpha}^{f(s)} \log \frac{t_j(s)}{f_{t_j(s)}(s) - n^\alpha} \le f(s) \Bigl( 3 + \log \frac{m}{f(s)} \Bigr). \]

Proof: Since $f_{t_j(s)}(s) = j \ge 2n^\alpha$, we have $f_{t_j(s)}(s) - n^\alpha \ge j/2$. Also, $t_j(s) \le m$. Thus,
\[ \sum_{j=2n^\alpha}^{f(s)} \log \frac{t_j(s)}{f_{t_j(s)}(s) - n^\alpha} \le \sum_{j=1}^{f(s)} \log \frac{m}{j/2} = \log \frac{(2m)^{f(s)}}{f(s)!} \le \log \frac{(2em)^{f(s)}}{f(s)^{f(s)}} \le f(s) \Bigl( 3 + \log \frac{m}{f(s)} \Bigr). \]
Here, we used Stirling’s formula to bound $f(s)! \ge (f(s)/e)^{f(s)}$. □

The second lemma deals with the time for the initial queries.

Lemma 9 Let $\gamma$ be a constant with $\alpha < \gamma < 1$. If $m \ge n^\gamma$, then
\[ \min(f(s), 2n^\alpha) \log n = O\Bigl( f(s) \log \frac{m}{f(s)} \Bigr). \]

Proof: Set $\delta := (\alpha + \gamma)/2$. If $f(s) \le n^\delta$, the lemma holds since
\[ f(s) \log(m/f(s)) \ge f(s) \log(m/n^\delta) = \Omega(f(s) \log n) = \Omega(\min(f(s), 2n^\alpha) \log n). \]
If $f(s) > n^\delta$, then
\[ f(s) \log(m/f(s)) > n^\delta \ge 2n^\alpha \log n, \]
for $n$ large enough, as desired (recall that we defined $\log x$ to be at least 1). □

3.4 Main theorem

We can now prove our main theorem.

Proof: [of Theorem 5] By Definition 3, we need to prove that the execution time is
\[ O\Bigl( n + \sum_{s \in S} f(s) \log \frac{m}{f(s)} \Bigr). \]
By Lemma 7, the running time is bounded by
\[ O\Biggl( \sum_{s \in S} \Biggl( \min(f(s), 2n^\alpha) \log n + \sum_{j=2n^\alpha}^{f(s)} \log \frac{t_j(s)}{f_{t_j(s)}(s) - n^\alpha} \Biggr) + \Bigl\lfloor \frac{m}{n^\alpha} \Bigr\rfloor n^\beta \log n + n \Biggr). \]
We now apply Lemma 8 and note that $\lfloor m/n^\alpha \rfloor n^\beta \log n = o(m)$ to obtain a running time bound of
\[ O\Bigl( \sum_{s \in S} \Bigl( \min(f(s), 2n^\alpha) \log n + f(s) \Bigl( 3 + \log \frac{m}{f(s)} \Bigr) \Bigr) + n + m \Bigr). \]
Since we defined $\log x$ to be at least 1, this simplifies to
\[ O\Bigl( \sum_{s \in S} \Bigl( \min(f(s), 2n^\alpha) \log n + f(s) \log \frac{m}{f(s)} \Bigr) + n \Bigr). \]
If $m \le n^{1-\beta}$, the sum over $s \in S$ is at most $n^{1-\beta} \log n = o(n)$. In this case, the bound simplifies to $O(n)$, and the theorem is proved. Otherwise, if $m > n^{1-\beta}$, Lemma 9 applies with $\gamma := 1 - \beta$ (a legal choice by our assumption on $\alpha$ and $\beta$), and the term $\min(f(s), 2n^\alpha) \log n$ collapses into $f(s) \log(m/f(s))$ to give the theorem. □

4 Point location

Theorem 10 There is a data structure for point location in a planar triangulation of size $n$ that can execute any query sequence of length $m$ in time
\[ O\Bigl( \sum_{s \in S} f(s) \log \frac{m}{f(s)} + n \Bigr). \]

Proof: It is easy to apply our general transformation to the problem of planar point location in a triangulation, as all of the required ingredients are well known. We assume that the triangulation is given in a standard representation, such as a doubly-connected edge list (e.g., [11, §2.2]). For the static structure with $O(\log n)$ query time and $O(n)$ construction time, Kirkpatrick’s algorithm [34] can be used. For the subset-biased structure, the provided subset of $n^\beta$ triangles may not be a connected triangulation and thus needs to be triangulated; this takes time $O(n^\beta \log n)$ using the classic line sweep approach [38]. This creates $O(n^\beta)$ new triangles, which are marked specially and given small weights. The resultant triangulation and weighting is given to a biased structure such as Iacono’s [31]. The marking can be used to detect whether a query to the subset-biased structure was successful. With all ingredients in hand, the claim now follows from Theorem 5. □
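The role of the marking can be illustrated with a small Python sketch. It is not the construction from the proof: a brute-force linear scan stands in for a biased structure such as [31], and the names `orient`, `contains`, `subset_query`, and the `filler` index set are hypothetical. It only shows how a query that lands in a specially marked filler triangle is reported as a failure, so that the caller can fall back to the static structure.

```python
from typing import Optional, Sequence, Set, Tuple

Point = Tuple[float, float]
Triangle = Tuple[Point, Point, Point]


def orient(a: Point, b: Point, c: Point) -> float:
    """Twice the signed area of triangle abc (positive if counterclockwise)."""
    return (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])


def contains(tri: Triangle, p: Point) -> bool:
    """True if p lies in the closed triangle tri, for either vertex orientation."""
    d = [orient(tri[i], tri[(i + 1) % 3], p) for i in range(3)]
    return all(x >= 0 for x in d) or all(x <= 0 for x in d)


def subset_query(triangles: Sequence[Triangle], filler: Set[int],
                 p: Point) -> Optional[int]:
    """Return the index of the triangle containing p, or None to signal failure.

    Triangles whose index is in `filler` are the marked ones added when
    re-triangulating the subset; hitting one of them counts as a failed query.
    """
    for idx, tri in enumerate(triangles):  # linear scan: stand-in for a biased structure
        if contains(tri, p):
            return None if idx in filler else idx
    return None
```

For instance, if the unit square is split into two triangles and the second one is marked as filler, a query in the first triangle returns index `0`, while a query in the second returns `None`.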
Our choice of structures reflects a desire for the strongest asymptotic bounds possible. Thus, we have avoided structures that are randomized or that have non-linear construction cost; such structures, however, have far better constants than the ones we use. If we took a data structure for the static $O(\log n)$ time queries with an $O(n \log n)$ construction cost instead of $O(n)$, this would simply change the linear additive term in Theorem 10 to $n \log n$.

5 Point location in polygonal subdivisions with non-constant sized cells

Our work applies to point location in triangulations. It can also be extended to polygonal subdivisions where each region has constant complexity. Indeed, suppose every region has $k + 2$ edges. We can just triangulate each region and then apply our result. As mentioned in the introduction, this operation could increase the entropy of the query outcomes. However, the log sum inequality [19, Theorem 2.7.1] implies that $\sum_{i=1}^{k} p_i \log(1/p_i) \le p \log(k/p)$ for any nonnegative $p_1, p_2, \dots, p_k$ and $p = \sum_{i=1}^{k} p_i$. Thus, if we subdivide a region with probability $p$ into $k$ triangles, the entropy increases by at most $p \log k$. It follows that the overall entropy grows by at most $\log k$, which is acceptable if $k$ is constant.

Recently, several data structures have been developed for optimal point location where the distribution is known in advance for convex connected [17], connected [18], and arbitrary polygonal [14] subdivisions of the plane, as well as the more general odds-on trees [13]. Unfortunately, these structures are not biased according to our definition, since entropy-based lower bounds are not meaningful for them: a convex $k$-gon splits the plane into two regions, so the entropy of the query outcomes is constant.
Nonetheless, some distributions require $\Omega(\log n)$ time for a point location query (in a reasonable model of computation that is more restrictive than the one described here). The entropy-sensitive structures for non-triangulations all basically work by triangulating the given subdivision as a function of the provided probability distribution, and then using one of the biased structures on the resultant triangulation. The main conceptual problem in using our framework with such a structure is that it is unclear how to triangulate during the rebuilding process, since the optimal triangulation is not known in advance. One could imagine that triangulating during each rebuild based on the observed queries so far would work well, but proving this would require a more complex and specialized analysis than what has been presented in this paper.

Acknowledgments

The second author would like to thank Pat Morin for suggesting the problem to him, for stimulating discussions on the subject, and for hosting him during a wonderful stay at the Computational Geometry Lab at Carleton University. We would also like to thank the anonymous referees for insightful comments that helped improve the presentation of the paper.

References

[1] G. M. Adel’son-Vel’skiĭ and E. M. Landis. An algorithm for organization of information. Dokl. Akad. Nauk SSSR, 146:263–266, 1962.
[2] N. Ailon, B. Chazelle, K. L. Clarkson, D. Liu, W. Mulzer, and C. Seshadhri. Self-improving algorithms. SIAM J. Comput., 40(2):350–375, 2011.
[3] S. Arya, S.-W. Cheng, D. M. Mount, and R. Hariharan. Efficient expected-case algorithms for planar point location. In Proc. 7th Scandinavian Workshop on Algorithm Theory (SWAT), volume 1851 of Lecture Notes in Computer Science, pages 353–366. Springer-Verlag, 2000.
[4] S. Arya, T. Malamatos, and D. M. Mount. Nearly optimal expected-case planar point location. In Proc. 41st Annu. IEEE Sympos. Found. Comput. Sci. (FOCS), pages 208–218, 2000.
[5] S. Arya, T. Malamatos, and D. M. Mount. Entropy-preserving cuttings and space-efficient planar point location. In Proc. 12th Annu. ACM-SIAM Sympos. Discrete Algorithms (SODA), pages 256–261, 2001.
[6] S. Arya, T. Malamatos, and D. M. Mount. A simple entropy-based algorithm for planar point location. In Proc. 12th Annu. ACM-SIAM Sympos. Discrete Algorithms (SODA), pages 262–268, 2001.
[7] S. Arya, T. Malamatos, and D. M. Mount. A simple entropy-based algorithm for planar point location. ACM Trans. Algorithms, 3(2):Art. 17, 17 pp, 2007.
[8] S. Arya, T. Malamatos, D. M. Mount, and K. C. Wong. Optimal expected-case planar point location. SIAM J. Comput., 37(2):584–610, 2007.
[9] T. Asano, W. Mulzer, G. Rote, and Y. Wang. Constant-work-space algorithms for geometric problems. JoCG, 2(1):46–68, 2011.
[10] S. W. Bent, D. D. Sleator, and R. E. Tarjan. Biased search trees. SIAM J. Comput., 14(3):545–568, 1985.
[11] M. de Berg, O. Cheong, M. van Kreveld, and M. Overmars. Computational Geometry: Algorithms and Applications. Springer-Verlag, Berlin, third edition, 2008.
[12] R. Berinde, P. Indyk, G. Cormode, and M. J. Strauss. Space-optimal heavy hitters with strong error bounds. ACM Trans. Database Syst., 35(4):Art. 26, 28 pp, 2010.
[13] P. Bose, L. Devroye, K. Douïeb, V. Dujmovic, J. King, and P. Morin. Odds-on trees. arXiv:1002.1092, 2010.
[14] P. Bose, L. Devroye, K. Douïeb, V. Dujmovic, J. King, and P. Morin. Point location in disconnected planar subdivisions. 2010.
[15] P. Bose, K. Douïeb, V. Dujmovic, and J. Howat. Layered working-set trees. In Proc. 9th Latin American Theoretical Informatics Symposium (LATIN), volume 6034 of Lecture Notes in Computer Science, pages 686–696. Springer-Verlag, 2010.
[16] P. Bose, K. Douïeb, and S. Langerman. Dynamic optimality for skip lists and B-trees. In Proc. 19th Annu. ACM-SIAM Sympos. Discrete Algorithms (SODA), pages 1106–1114, 2008.
[17] S. Collette, V. Dujmovic, J. Iacono, S. Langerman, and P. Morin. Distribution-sensitive point location in convex subdivisions. In Proc. 19th Annu. ACM-SIAM Sympos. Discrete Algorithms (SODA), pages 912–921, 2008.
[18] S. Collette, V. Dujmovic, J. Iacono, S. Langerman, and P. Morin. Entropy, triangulation, and point location in planar subdivisions. 2009.
[19] T. M. Cover and J. A. Thomas. Elements of Information Theory. Wiley-Interscience, second edition, 2006.
[20] H. Edelsbrunner, L. J. Guibas, and J. Stolfi. Optimal point location in a monotone subdivision. SIAM J. Comput., 15(2):317–340, 1986.
[21] A. Elmasry. A priority queue with the working-set property. Internat. J. Found. Comput. Sci., 17(6):1455–1465, 2006.
[22] G. N. Frederickson. Implicit data structures for weighted elements. Inform. and Control, 66(1-2):61–82, 1985.
[23] M. L. Fredman. Two applications of a probabilistic search technique: Sorting x + y and building balanced search trees. In Proc. 7th Annu. ACM Sympos. Theory Comput. (STOC), pages 240–244, 1975.
[24] A. M. Garsia and M. L. Wachs. A new algorithm for minimum cost binary trees. SIAM J. Comput., 6(4):622–642, 1977.
[25] M. T. Goodrich. Competitive tree-structured dictionaries. In Proc. 11th Annu. ACM-SIAM Sympos. Discrete Algorithms (SODA), pages 494–495, 2000.
[26] M. T. Goodrich, M. Orletsky, and K. Ramaiyer. Methods for achieving fast query times in point location data structures. In Proc. 8th Annu. ACM-SIAM Sympos. Discrete Algorithms (SODA), pages 757–766, 1997.
[27] L. J. Guibas and R. Sedgewick. A dichromatic framework for balanced trees. In Proc. 19th Annu. IEEE Sympos. Found. Comput. Sci. (FOCS), pages 8–21, 1978.
[28] T. C. Hu and A. C. Tucker. Optimal computer search trees and variable-length alphabetical codes. SIAM J. Appl. Math., 21:514–532, 1971.
[29] J. Iacono. Improved upper bounds for pairing heaps. In Proc. 7th Scandinavian Workshop on Algorithm Theory (SWAT), volume 1851 of Lecture Notes in Computer Science, pages 32–45. Springer-Verlag, 2000.
[30] J. Iacono. Alternatives to splay trees with O(log n) worst-case access times. In Proc. 12th Annu. ACM-SIAM Sympos. Discrete Algorithms (SODA), pages 516–522, 2001.
[31] J. Iacono. Expected asymptotically optimal planar point location. Comput. Geom. Theory Appl., 29(1):19–22, 2004.
[32] J. Iacono. Key-independent optimality. Algorithmica, 42(1):3–10, 2005.
[33] J. H. Kingston. A new proof of the Garsia-Wachs algorithm. J. Algorithms, 9(1):129–136, 1988.
[34] D. Kirkpatrick. Optimal search in planar subdivisions. SIAM J. Comput., 12(1):28–35, 1983.
[35] D. E. Knuth. Optimum binary search trees. Acta Inform., 1:14–25, 1971.
[36] J. F. Korsh. Greedy binary search trees are nearly optimal. Inform. Process. Lett., 13(1):16–19, 1981.
[37] H.-P. Kriegel and V. K. Vaishnavi. Weighted multidimensional B-trees used as nearly optimal dynamic dictionaries. In Proc. 10th Symposium on Mathematical Foundations of Computer Science (MFCS), volume 118 of Lecture Notes in Computer Science, pages 410–417. Springer-Verlag, 1981.
[38] D. T. Lee and F. P. Preparata. Location of a point in a planar subdivision and its applications. SIAM J. Comput., 6(3):594–606, 1977.
[39] R. J. Lipton and R. E. Tarjan. A separator theorem for planar graphs. SIAM J. Appl. Math., 36(2):177–189, 1979.
[40] R. J. Lipton and R. E. Tarjan. Applications of a planar separator theorem. SIAM J. Comput., 9(3):615–627, 1980.
[41] K. Mehlhorn. Nearly optimal binary search trees. Acta Inform., 5(4):287–295, 1975.
[42] K. Mehlhorn. A best possible bound for the weighted path length of binary search trees. SIAM J. Comput., 6(2):235–239, 1977.
[43] K. Mehlhorn. Dynamic binary search. SIAM J. Comput., 8(2):175–198, 1979.
[44] K. Mehlhorn. Arbitrary weight changes in dynamic trees. RAIRO Inform. Théor., 15(3):183–211, 1981.
[45] K. Mulmuley. A fast planar partition algorithm. I. J. Symbolic Comput., 10(3-4):253–280, 1990.
[46] N. Sarnak and R. E. Tarjan. Planar point location using persistent search trees. Commun. ACM, 29(7):669–679, 1986.
[47] R. Seidel. A simple and fast incremental randomized algorithm for computing trapezoidal decompositions and for triangulating polygons. Comput. Geom. Theory Appl., 1(1):51–64, 1991.
[48] C. E. Shannon and W. Weaver. The Mathematical Theory of Communication. The University of Illinois Press, Urbana, Ill., 1949.
[49] D. D. Sleator and R. E. Tarjan. Self-adjusting binary search trees. J. ACM, 32(3):652–686, 1985.
[50] A. Z. B. H. Talib, M. Chen, and P. Townsend. Three major extensions to Kirkpatrick’s point location algorithm. In Proc. Conference on Computer Graphics International (CGI), pages 112–121, 1996.
[51] K. Unterauer. Dynamic weighted binary search trees. Acta Inform., 11(4):341–362, 1978/79.
[52] F. F. Yao. Efficient dynamic programming using quadrangle inequalities. In Proc. 12th Annu. ACM Sympos. Theory Comput. (STOC), pages 429–435, 1980.
