Logical Queries over Views: Decidability and Expressiveness

JAMES BAILEY, The University of Melbourne
GUOZHU DONG, Wright State University
ANTHONY WIDJAJA TO, University of Edinburgh

We study the problem of deciding satisfiability of first order logic queries over views, our aim being to delimit the boundary between the decidable and the undecidable fragments of this language. Views currently occupy a central place in database research, due to their role in applications such as information integration and data warehousing. Our main result is the identification of a decidable class of first order queries over unary conjunctive views that generalises the decidability of the classical class of first order sentences over unary relations, known as the Löwenheim class. We then demonstrate how various extensions of this class lead to undecidability and also provide some expressivity results. Besides its theoretical interest, our new decidable class is potentially interesting for use in applications such as deciding implication of complex dependencies, analysis of a restricted class of active database rules, and ontology reasoning.

Categories and Subject Descriptors: F4.1 [MATHEMATICAL LOGIC AND FORMAL LANGUAGES]: Mathematical Logic; H2.3 [DATABASE MANAGEMENT]: Languages

General Terms: Theory

Additional Key Words and Phrases: Satisfiability, containment, unary view, decidability, first order logic, database query, database view, conjunctive query, Löwenheim class, monadic logic, unary logic, ontology reasoning

1. INTRODUCTION

The study of views in relational databases has attracted much attention over the years. Views are an indispensable component for activities such as data integration and data warehousing [Widom 1995; Garcia-Molina et al. 1995; Levy et al. 1996], where they can be used as "mediators" for source information that is not directly accessible to users.
This is especially helpful in modelling the integration of data from diverse sources, such as legacy systems and/or the world wide web. Much of the research related to views has addressed fundamental problems such as containment and rewriting/optimisation of queries using views (e.g. see [Ullman 1997; Halevy 2001]). In this paper, we examine the use of views in a somewhat different context, where they are used as the basic unit for writing logical expressions. We provide results on the related decision problem for a range of possible view definitions. In particular, for the case where views are monadic/unary conjunctive queries, we show that the corresponding query logic is decidable. This corresponds to an interesting new fragment of first order logic. On the application side, this decidable query language also has some interesting potential applications in areas such as implication of complex dependencies, ontology reasoning and termination results for active rules.

[Footnote 1: A preliminary version of this paper appeared in [Bailey and Dong 1999].]

1.1 Informal Statement of the Problem

Consider a relational vocabulary $R_1, \ldots, R_p$ and a set of views $V_1, \ldots, V_n$. Each view definition corresponds to a first order formula over the vocabulary. Some example views (using Horn clause style notation) are

$V_1(x_1, y_1) \leftarrow R_1(x_1, y_1), R_2(y_1, y_1, z_1), R_3(z_1, z_2, x_1), R_4(z_2, x_1)$
$V_2(z_1) \leftarrow R_1(z_1, z_1)$

Each such view can be expanded into a first order formula, e.g. $V_1(x_1, y_1) \Leftrightarrow \exists z_1, z_2 \, (R_1(x_1, y_1) \wedge R_2(y_1, y_1, z_1) \wedge R_3(z_1, z_2, x_1) \wedge R_4(z_2, x_1))$. A first order view query is a first order formula expressed solely in terms of the given views, e.g.
$q_1 = \exists x_1, y_1 \, ((V_1(x_1, y_1) \vee V_1(y_1, x_1)) \wedge \neg V_2(x_1)) \wedge \forall z_1 \, (V_2(z_1) \Rightarrow V_1(z_1, z_1))$ is an example first order view query, but $q_2 = \exists x_1, y_1 \, (V_1(x_1, y_1) \vee R(y_1, x_1))$ is not, since it mentions a base relation directly.

By expanding the view definitions, every first order view query can clearly be rewritten to eliminate the views. Hence, first order view queries can be thought of as a fragment of first order logic, with the exact nature of the fragment varying according to how expressive the views are permitted to be. From a database perspective, first order view queries are particularly suited to applications where the source data is unavailable, but summary data (in the form of views) is. Since many database and reasoning languages are based on first order logic (or extensions thereof), this makes it a useful choice for manipulating the views.

Our purpose in this paper is to determine for which types of view definitions satisfiability (over both finite and infinite models) is decidable for the language. If views can be binary, then this language is clearly as powerful as first order logic over binary base relations, and hence undecidable (see [Börger et al. 1997]). The situation becomes far more interesting when we restrict the form that views may take, in particular when their arity must be unary. Such a restriction has the effect of constraining which parts of the underlying database can be "seen" by the view formula and also constrains how such parts may be connected.

1.2 Contributions

The main contribution of this paper is the definition of a language called the first order unary conjunctive view language (UCV) and a proof of its decidability. As its name suggests, it uses unary views defined by conjunctive queries [footnote 2].
We demonstrate that it is a maximal decidable class, in the sense that increasing the expressiveness of the view definitions results in undecidability. Some interesting aspects of this decidability result are:

- It is well known that first order logic solely over monadic relations is decidable [Löwenheim 1915], but the extension to dyadic relations is undecidable [Börger et al. 1997]. The first order unary conjunctive view language can be seen as an interesting intermediate case between the two, since although only monadic predicates (views) appear in the query, they are intimately related to database relations of higher arity.
- The language is able to express some interesting properties, which might be applied to various kinds of reasoning over ontologies. It can also be thought of as a powerful generalisation of unary inclusion dependencies [Cosmadakis et al. 1990]. Furthermore, it has an interesting characterisation as a decidable class of rules (triggers) for active databases.

[Footnote 2: More generally, views may be any existential formulas with one free variable, since such a formula can be rewritten into a disjunction of conjunctive formulas with one free variable.]

To briefly give a feel for this decidable language, we next provide some example unary conjunctive views and a first order unary conjunctive view query defined over them:

$V_1(x) \leftarrow R_1(x, y), R_2(y, z), R_3(z, x'), R_4(x', x)$
$V_2(x) \leftarrow R_1(x, y), R_1(x, z), R_4(y, z)$
$V_3(x) \leftarrow R_1(x, y), R_1(x, z), R_4(y, y), R_4(z, x)$
$V_4(x) \leftarrow R_1(x, y), R_3(y, z), R_4(z, x'), R_4(x', y'), R_3(y', x)$

$\exists x \, (V_2(x) \wedge \neg V_1(x)) \wedge \neg \exists y \, (V_3(y) \wedge \neg V_4(y))$

1.3 Paper Outline

The paper is structured as follows: Section 2 defines the necessary preliminaries and background concepts. Section 3 presents the definition of the logic UCV.
Section 4 is the core section of the paper, where the decidability result for the class UCV is proved. Section 5 shows that extensions to the language, such as allowing negation, inequality or recursion in views, result in undecidability. Section 6 covers applications of the decidability results and then Section 7 provides some results on expressivity. Section 8 discusses related work and Section 9 summarises and discusses future work.

2. PRELIMINARIES

In this section, we state basic definitions and relevant results. The reader is assumed to be familiar with standard results and notations from mathematical logic (e.g. see [Enderton 2001]). In the following, formulas are always first-order. The symbol $FO$ denotes the set of first order formulas over any vocabulary $\sigma$. In addition, if $L \subseteq FO$ (i.e. $L$ is a fragment of $FO$), we denote by $L(\sigma)$ the set of formulas in $L$ over the vocabulary $\sigma$.

2.1 First-order logic

A (relational) vocabulary $\sigma$ is a tuple $\langle R_1, \ldots, R_n \rangle$ of relation symbols with each $R_i$ associated with a specified arity $r_i$. A (relational) $\sigma$-structure $\mathcal{A}$ is the tuple $\langle A; R_1^{\mathcal{A}}, \ldots, R_n^{\mathcal{A}} \rangle$ where $A$ is a non-empty set, called the universe (of $\mathcal{A}$), and $R_i^{\mathcal{A}}$ is an $r_i$-ary relation over $A$ interpreting $R_i$. We refer to the elements in the set $A$ as the elements in $\mathcal{A}$, or simply as constants [footnote 3] (of $\mathcal{A}$). In the sequel, we write $R_i$ instead of $R_i^{\mathcal{A}}$ when the meaning is clear from the context. We also use $\mathrm{STRUCT}(\sigma)$ to denote the set of all $\sigma$-structures.

We assume a countably infinite set $\mathrm{VAR}$ of variables. An instantiation (or valuation) of a structure $I$ is a function $v : \mathrm{VAR} \to I$. Extend this function to free tuples (i.e. tuples of variables) in the obvious way. We use the usual Tarskian notion of satisfaction to define $I \models \phi[v]$, i.e., whether $\phi$ is true in $I$ under $v$. If $\phi$ is a sentence, we simply write $I \models \phi$.
The image of a structure $I$ under a formula $\phi(x_1, \ldots, x_n)$ is

$$\phi(I) \stackrel{\mathrm{def}}{=} \{ v(x_1, \ldots, x_n) : v \text{ is an instantiation of } I \text{ and } I \models \phi[v] \}.$$

In particular, if $n = 0$, we have that $\phi(I) \neq \emptyset$ iff $I \models \phi$. We say that two $\sigma$-structures $\mathcal{A}$ and $\mathcal{B}$ agree on $L$ iff for all $\phi \in L(\sigma)$ we have $\mathcal{A} \models \phi \Leftrightarrow \mathcal{B} \models \phi$.

Following the convention in database theory, the (tuple) database $D(\mathcal{A})$ corresponding to the structure $\mathcal{A}$ (defined above) is the set $\{ R_i(t) : 1 \le i \le n \text{ and } t \in R_i^{\mathcal{A}} \}$. It is easy to see that such a database can be considered a structure with universe $\mathit{adom}(\mathcal{A})$, which is defined to be the set of all elements of $A$ occurring in at least one relation $R_i$, and relations built appropriately from $D(\mathcal{A})$. Abusing terminology, we refer to the elements of $D(\mathcal{A})$ as tuples (associated with $\mathcal{A}$). In addition, when the meaning is clear from the context, we shall also abuse the term free tuple to mean an atomic formula $R(u)$, where $R \in \sigma$ and $u$ is a tuple of variables.

A formula $\phi$ is said to be satisfiable if there exists a structure $\mathcal{A}$ (of either finite or infinite size) such that $\phi(\mathcal{A}) \neq \emptyset$; such a structure is said to be a model for $\phi$. We say that $\phi$ is finitely satisfiable if there exists a finite structure $I$ such that $\phi(I) \neq \emptyset$. Without loss of generality, we shall focus only on sentences when we are dealing with the satisfiability problem. In fact, if $\phi$ has some free variables, taking its existential closure preserves satisfiability. [Indeed, we shall see that the languages we consider are closed under first-order quantification.]

Given two $\sigma$-structures $\mathcal{A}, \mathcal{B}$, recall that $\mathcal{A}$ is a substructure of $\mathcal{B}$ (written $\mathcal{A} \subseteq \mathcal{B}$) if $A \subseteq B$ and $R^{\mathcal{A}} \subseteq R^{\mathcal{B}}$ for every relation symbol $R$ in $\sigma$. We say that $\mathcal{A}$ is an induced substructure of $\mathcal{B}$ (i.e. induced by $A \subseteq B$) if for every relation symbol $R$ in $\sigma$, $R^{\mathcal{A}} = R^{\mathcal{B}} \cap A^r$, where $r$ is the arity of $R$.
Now, a homomorphism from $\mathcal{A}$ to $\mathcal{B}$ is a function $h : A \to B$ such that, for every relation symbol $R$ in $\sigma$ and $a = (a_1, \ldots, a_r) \in R^{\mathcal{A}}$, it is the case that $h(a) \stackrel{\mathrm{def}}{=} (h(a_1), \ldots, h(a_r)) \in R^{\mathcal{B}}$. An isomorphism is a bijective homomorphism whose inverse is a homomorphism. The quantifier rank $\mathrm{qrank}(\phi)$ of a formula $\phi$ is the maximum nesting depth of quantifiers in $\phi$.

[Footnote 3: Although it is common in mathematical logic to use the term "constants" to mean the interpretation of constant symbols in the structure, no confusion shall arise in this article, as we assume the absence of constant symbols in the vocabulary. Our results, nevertheless, easily extend to vocabularies with constant symbols.]

2.2 Views

For our purpose, a view over $\sigma$ can be thought of as an arbitrary FO formula over $\sigma$. We say that a view $V$ is conjunctive if it can be written as a conjunctive query, i.e. of the form $\exists x_1, \ldots, x_n \, (R_1(u_1) \wedge \ldots \wedge R_k(u_k))$ where each $R_i$ is a relation symbol, and each $u_i$ is a free tuple of appropriate arity. We adopt the Horn clause style notation for writing conjunctive views. For example, if $\{y_1, \ldots, y_n\}$ is the set of free variables in the above conjunctive query, then we can rewrite it as

$V(y_1, \ldots, y_n) \leftarrow R_1(u_1), \ldots, R_k(u_k)$

where $V(y_1, \ldots, y_n)$ is called the head of $V$, and the conjunction $R_1(u_1), \ldots, R_k(u_k)$ the body of $V$. The length of the conjunctive view $V$ is defined to be the sum of the arities of the relation symbols in the multiset $\{R_1, \ldots, R_k\}$. For example, the lengths of the two views $V$ and $V'$ defined as

$V(x) \leftarrow E(x, y)$
$V'(x) \leftarrow E(x, y), E(y, z)$

are, respectively, two and four. Additionally, if $n = 1$ (i.e. the view has a head of arity 1), the view is said to be unary.
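As a minimal sketch of the length and arity definitions above, a conjunctive view can be encoded as a record of head variables and body atoms. The `View` class and its field names are illustrative assumptions, not notation from the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class View:
    """A conjunctive view (hypothetical encoding, for illustration only)."""
    head: tuple  # free variables of the head, e.g. ("x",)
    body: tuple  # atoms, each given as (relation_name, argument_tuple)


def length(view: View) -> int:
    # Sum of the arities of the relation symbols in the body,
    # counted with multiplicity (the body is a multiset of atoms).
    return sum(len(args) for _rel, args in view.body)


def is_unary(view: View) -> bool:
    # A view is unary when its head has arity 1.
    return len(view.head) == 1


# The two views V and V' from the text.
V = View(head=("x",), body=(("E", ("x", "y")),))
V_prime = View(head=("x",), body=(("E", ("x", "y")), ("E", ("y", "z"))))

print(length(V), length(V_prime), is_unary(V))  # prints: 2 4 True
```

Counting by argument tuples rather than by a separate arity table keeps the sketch self-contained; it assumes every atom is written with the correct number of arguments.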
Unless stated otherwise, we shall say "view" to mean "unary-conjunctive view with neither equality nor negation in its body".

2.3 Graphs

We use standard definitions from graph theory (e.g. see [Diestel 2005]). A graph is a structure $\mathcal{G} = (G, E)$ where $E$ is a binary relation. The girth of a graph is the length of its shortest cycle. For two vertices $x, y \in G$, we denote their distance by $d_{\mathcal{G}}(x, y)$ (or just $d(x, y)$ when $\mathcal{G}$ is clear from the context). For two sets $S_1$ and $S_2$ of vertices in $\mathcal{G}$, we define their distance to be $d_{\mathcal{G}}(S_1, S_2) := \min\{ d_{\mathcal{G}}(a, b) : a \in S_1 \text{ and } b \in S_2 \}$. In a weighted graph $\mathcal{G}$ with weight $w_{\mathcal{G}} : E \to \mathbb{N}$, the weight $w_{\mathcal{G}}(P)$ of a path $P$ in $\mathcal{G}$ is just $\sum_{e \in E(P)} w_{\mathcal{G}}(e)$. We shall write $w$ instead of $w_{\mathcal{G}}$ if the meaning is clear from the context.

In the sequel, we shall frequently mention trees and forests. We always assume that any tree has a selected node, which we call the root of the tree. Given a tree $\mathcal{T} = (T, E)$, we can partition $T$ according to the distance of the vertices from the root.

The Gaifman graph (see [Gaifman 1982]) associated with a structure $\mathcal{A}$ is the weighted undirected multi-graph $\mathcal{G}(\mathcal{A}) = (G, E)$ such that:
(1) $G = A$.
(2) The multi-set $E$ is defined as follows: for each $x, y \in G$, we put an $R(t)$-labeled edge $xy$ in $E$ with weight $r$ (the arity of $R$) iff $x$ and $y$ appear in a tuple $R(t)$ in $D(\mathcal{A})$. [Notice that the multiplicity of $xy$ in $E$ depends on the number of tuples in $D(\mathcal{A})$ that contain both $x$ and $y$ as their arguments.]

Note also that the subgraph of $\mathcal{G}(\mathcal{A})$ induced by the set of all elements of $\mathcal{A}$ in a tuple $t$ is the complete graph $K_r$, and so an $L$-labelled edge is adjacent to an edge $e \in E$ iff all $L$-labelled edges are adjacent (i.e. connected) to the edge $e$. For any $a, b \in A$, we define the distance $d_{\mathcal{A}}(a, b)$ between $a$ and $b$ to be their distance in $\mathcal{G}(\mathcal{A})$.
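The Gaifman graph construction can be illustrated with a short sketch. It makes two simplifying assumptions that are not in the text: edge labels, weights and multiplicities are dropped, and distance is measured as the number of edges on a shortest path. The database `D` and all function names are hypothetical.

```python
from collections import deque
from itertools import combinations


def gaifman_adjacency(database):
    """Adjacency sets of the (unweighted, simple) Gaifman graph.

    database: iterable of (relation_name, tuple_of_constants).
    Two constants are adjacent iff they occur together in some tuple.
    """
    adj = {}
    for _rel, args in database:
        for c in args:
            adj.setdefault(c, set())
        for x, y in combinations(set(args), 2):
            adj[x].add(y)
            adj[y].add(x)
    return adj


def distance(adj, a, b):
    """Hop distance between a and b by breadth-first search (None if unreachable)."""
    seen, queue = {a}, deque([(a, 0)])
    while queue:
        node, d = queue.popleft()
        if node == b:
            return d
        for nxt in adj[node]:
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, d + 1))
    return None


# Hypothetical database: two binary tuples and one ternary tuple.
D = [("E", (1, 2)), ("E", (2, 3)), ("R", (3, 4, 5))]
adj = gaifman_adjacency(D)
print(distance(adj, 1, 5))  # prints: 3
```

The constants 3, 4 and 5 of the ternary tuple form a triangle in the graph, matching the remark that the elements of one tuple induce a complete subgraph.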
Also, extend this distance function to tuples and sets of tuples by interpreting them as sets of elements of $A$ that appear in them. Any pair of tuples $R(t)$ and $R'(t')$ in $D(\mathcal{A})$ are said to be connected (in $\mathcal{A}$) if in $\mathcal{G}(\mathcal{A})$ some (and hence all) $R(t)$-labeled edge is adjacent to some (and hence all) $R'(t')$-labeled edge.

2.4 Unary formulas

A unary formula is an arbitrary FO formula without equality such that each of its relation symbols has arity one. Let $\sigma$ be a vocabulary whose relation symbols are of arity one. We shall use $\mathrm{UFO}(\sigma)$ to denote the set of all unary formulas without equality over $\sigma$. Also, we define $\mathrm{UFO} = \bigcup_{\sigma} \mathrm{UFO}(\sigma)$. The following lemma will be useful for proving expressiveness results in Section 7.

Lemma 2.1. For every unary sentence, there exists an equivalent one of quantifier rank 1.

Proof. By a straightforward manipulation. See the proof of Lemma 21.12 in [Boolos et al. 2002]. [Their proof actually gives more than the result they claim. In fact, their construction converts an arbitrary unary sentence into one with one variable and of quantifier rank 1.]

2.5 Ehrenfeucht-Fraïssé Games

We shall need a limited form of Ehrenfeucht-Fraïssé games; for a general account, the reader may consult [Libkin 2004]. The games are played by two players, Spoiler and Duplicator, on two $\sigma$-structures $\mathcal{A}$ and $\mathcal{B}$. The goal of Spoiler is to show that the structures are different, while Duplicator aims to show that they are the same. The game consists of a single round. Spoiler chooses a structure (say, $\mathcal{A}$) and an element $a$ in it, after which Duplicator has to respond by choosing an element $b$ in the other structure $\mathcal{B}$. Duplicator wins the game iff the substructure of $\mathcal{A}$ induced by $\{a\}$ is isomorphic to the substructure of $\mathcal{B}$ induced by $\{b\}$. Duplicator has a winning strategy iff Duplicator has a winning move, regardless of how Spoiler behaves.
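The single-round game can be decided mechanically on finite structures. In the sketch below, the substructure induced by $\{a\}$ is summarised by the set of relation symbols $R$ with $(a, \ldots, a) \in R$, and Duplicator has a winning strategy exactly when both structures realise the same set of such summaries. The encoding of a structure as a pair `(universe, relations)` and the function names are assumptions made for illustration.

```python
def one_type(structure, a):
    """Names of the relations R that contain the constant tuple (a, ..., a).

    This determines the substructure induced by {a} up to isomorphism.
    structure: (universe, {name: (arity, set_of_tuples)}).
    """
    _universe, relations = structure
    return frozenset(
        name
        for name, (arity, tuples) in relations.items()
        if (a,) * arity in tuples
    )


def duplicator_wins(A, B):
    """True iff every one-element type realised in A is realised in B and vice versa."""
    types_A = {one_type(A, a) for a in A[0]}
    types_B = {one_type(B, b) for b in B[0]}
    return types_A == types_B


# Hypothetical structures over a single binary relation E.
A = ({1, 2}, {"E": (2, {(1, 2)})})             # one arc, no loop
B = ({1, 2, 3}, {"E": (2, {(1, 2), (2, 3)})})  # a path, no loop
C = ({1}, {"E": (2, {(1, 1)})})                # a single loop

print(duplicator_wins(A, B))  # prints: True
print(duplicator_wins(A, C))  # prints: False
```

Both structures are assumed to be given over the same vocabulary, so the relation names in the two dictionaries coincide.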
Proposition 2.2 (Ehrenfeucht-Fraïssé Games). Duplicator has a winning strategy on $\mathcal{A}$ and $\mathcal{B}$ iff $\mathcal{A}$ and $\mathcal{B}$ agree on first-order formulas over $\sigma$ of quantifier rank 1.

2.6 Other Notation

Regarding other notation used throughout the rest of the paper: we shall use $a, b$ for constants, $x, y, z$ for variables, $u$ for free tuples, $U, V$ for views, $\mathcal{U}, \mathcal{V}$ for sets of views, $\sigma$ for vocabularies, $R_1, R_2, \ldots$ for relation symbols, $\mathcal{A}, \mathcal{B}, \ldots$ for structures and $A, B$ for their respective universes. If $D$ is a database (a set of tuples), we use $\mathit{adom}(D)$ to denote the set of constants in $D$. Finally, given $a \in \mathit{adom}(D)$ and a "new" constant $b \notin \mathit{adom}(D)$, we define $D[b/a]$ to be the database that is obtained from $D$ by replacing every occurrence of $a$ by $b$. The notation $D[b_1/a_1, \ldots, b_n/a_n]$ is defined in the same way.

3. DEFINITION OF FIRST ORDER UNARY-CONJUNCTIVE-VIEW LOGIC

Let $\sigma$ be an arbitrary vocabulary, and $\mathcal{V}$ be a finite set of (unary conjunctive) views over $\sigma$, which we refer to as a $\sigma$-view set. We now inductively define the set $\mathrm{UCV}(\sigma, \mathcal{V})$ of first order unary-conjunctive-view (UCV) queries/formulas over the vocabulary $\sigma$ and a $\sigma$-view set $\mathcal{V}$:
(1) if $V \in \mathcal{V}$, then $V(x) \in \mathrm{UCV}(\sigma, \mathcal{V})$; and
(2) if $\phi, \psi \in \mathrm{UCV}(\sigma, \mathcal{V})$, then the formulas $\neg\phi$, $\phi \wedge \psi$ and $\exists x \phi$ belong to $\mathrm{UCV}(\sigma, \mathcal{V})$.
The smallest set of so-constructed formulas defines the set $\mathrm{UCV}(\sigma, \mathcal{V})$. We denote the set of all UCV formulas over the vocabulary $\sigma$ by $\mathrm{UCV}(\sigma)$, i.e. $\mathrm{UCV}(\sigma) \stackrel{\mathrm{def}}{=} \bigcup_{\mathcal{V}} \mathrm{UCV}(\sigma, \mathcal{V})$ where $\mathcal{V}$ may be any $\sigma$-view set. Further, the set of all UCV queries is denoted by UCV, i.e. $\mathrm{UCV} \stackrel{\mathrm{def}}{=} \bigcup_{\sigma} \mathrm{UCV}(\sigma)$, where $\sigma$ is any vocabulary. As usual, we use the shorthands $\phi \vee \psi$, $\phi \to \psi$, $\phi \leftrightarrow \psi$, and $\forall x \phi$ for (respectively) $\neg(\neg\phi \wedge \neg\psi)$, $\neg\phi \vee \psi$, $(\phi \to \psi) \wedge (\psi \to \phi)$, and $\neg\exists x \neg\phi$. Thus, the UCV language is closed under boolean combinations and first-order quantifications.
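The inductive definition translates directly into a small abstract syntax. The sketch below is one possible encoding, not the paper's: the class names `Atom`, `Not`, `And`, `Exists`, the helper `forall`, and the evaluator `holds` are all assumptions. The evaluator takes the extent of each view as given (a set of elements per view name) rather than computing it from base relations, and the example extents are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Atom:      # V(x) for a view V
    view: str
    var: str


@dataclass(frozen=True)
class Not:       # negation
    sub: object


@dataclass(frozen=True)
class And:       # conjunction
    left: object
    right: object


@dataclass(frozen=True)
class Exists:    # existential quantification
    var: str
    sub: object


def forall(var, sub):
    # Shorthand: "for all x, phi" abbreviates "not exists x not phi".
    return Not(Exists(var, Not(sub)))


def holds(phi, universe, extents, env=None):
    """Evaluate a UCV formula, given the set of elements satisfying each view."""
    env = env or {}
    if isinstance(phi, Atom):
        return env[phi.var] in extents[phi.view]
    if isinstance(phi, Not):
        return not holds(phi.sub, universe, extents, env)
    if isinstance(phi, And):
        return (holds(phi.left, universe, extents, env)
                and holds(phi.right, universe, extents, env))
    if isinstance(phi, Exists):
        return any(holds(phi.sub, universe, extents, {**env, phi.var: a})
                   for a in universe)
    raise TypeError(f"not a UCV formula: {phi!r}")


# Hypothetical universe and view extents.
universe = {1, 2, 3, 4}
extents = {"V": {1, 2, 3}, "W": {1, 2}}

q = Exists("x", And(Atom("V", "x"), Not(Atom("W", "x"))))
print(holds(q, universe, extents))                            # prints: True
print(holds(forall("x", Atom("V", "x")), universe, extents))  # prints: False
```

Only the three constructors of clause (2) are primitive; disjunction, implication and universal quantification would be derived helpers in the same style as `forall`.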
As an example, consider the UCV formula $q_1 = \exists x \, (V(x) \wedge \neg V'(x))$ where $V$ and $V'$ are defined as

$V(x) \leftarrow E(x, y)$
$V'(x) \leftarrow E(x, y), E(y, z)$

This formula asserts that there exists a vertex from which there is an outgoing arc, but no outgoing directed walk of length 2.

Let us make a few remarks on the expressive power of the logic UCV with respect to other logics. It is easy to see that the UCV language strictly subsumes UFO (the Löwenheim class without equality [Löwenheim 1915; Börger et al. 1997]), as UCV queries can be defined over any relational vocabularies (i.e. including ones that include $k$-ary relation symbols with $k > 1$). It is also easy to see that allowing as a view any general existential positive formula (i.e. of the form $\exists x \, \phi(x)$ where $\phi$ is a quantifier-free formula with no negation) with one free variable does not increase the expressive power of the logic. Indeed, the quantifier-free subformula $\phi$ can be rewritten in disjunctive normal form without introducing negation, after which we may distribute the existential quantifier across the disjunctions and consequently transform the entire formula to a disjunction of conjunctive queries with one or zero free variables. Each such conjunctive query can then be treated as a view.

There are two ways in which we can interpret a UCV formula. The standard way is to think of a UCV query as an FO formula over the underlying vocabulary. Take the query $q_2 = \exists x \, (V'(x) \wedge \neg V(x))$ as an example. We can interpret this query as the formula $\exists x \, (\exists y, z \, (E(x, y) \wedge E(y, z)) \wedge \neg \exists y \, E(x, y))$ over the graph vocabulary. The non-standard way is to regard a UCV query $\phi$ as a unary formula over the view set. For example, we can think of $q_2$ as a unary formula over the vocabulary $\sigma' = \langle V, V' \rangle$.
Now, if $\phi \in \mathrm{UCV}(\sigma, \mathcal{V})$, then we denote by $\phi_{\mathcal{V}}$ the unary formula over $\mathcal{V}$ corresponding to $\phi$ in the non-standard interpretation of UCV queries. However, for notational convenience, we shall write $\phi$ instead of $\phi_{\mathcal{V}}$ when the meaning is clear from the context.

Given a vocabulary $\sigma$ and a $\sigma$-view set $\mathcal{V} = \{V_1, \ldots, V_n\}$, we may define the function $\Lambda : \mathrm{STRUCT}(\sigma) \to \mathrm{STRUCT}(\mathcal{V})$ such that for any $I \in \mathrm{STRUCT}(\sigma)$

$$\Lambda(I) \stackrel{\mathrm{def}}{=} \langle I; V_1^{\Lambda(I)}, \ldots, V_n^{\Lambda(I)} \rangle$$

where $V_i^{\Lambda(I)} \stackrel{\mathrm{def}}{=} V_i(I)$. For example, let $\sigma = \langle E \rangle$ and $\mathcal{V} = \{V, V'\}$ be as above, and let $I = \langle \{1, 2, 3, 4\}; E^I = \{(1, 2), (2, 3), (3, 4)\} \rangle$. Then, we have $J \stackrel{\mathrm{def}}{=} \Lambda(I) = \langle \{1, 2, 3, 4\}; V^J = \{1, 2, 3\}, V'^J = \{1, 2\} \rangle$. In the following, we shall reserve the symbol $\Lambda$ to denote this special function. In addition, if $J \in \mathrm{STRUCT}(\mathcal{V})$ and there exists a structure $I \in \mathrm{STRUCT}(\sigma)$ such that $\Lambda(I) = J$, we say that the structure $J$ is realizable with respect to the vocabulary $\sigma$ and the view set $\mathcal{V}$, or that $I$ realizes $J$. We shall omit mention of $\sigma$ and $\mathcal{V}$ if they are understood from context.

A number of remarks about the notion of realizability are in order. First, some unary structures are not realizable with respect to a given view set $\mathcal{V}$. For example, the query $q_2$ has infinitely many models if treated as a unary formula, but none of these models is realizable, since $V'(I) \subseteq V(I)$ holds in every $\sigma$-structure $I$. Second, if $\phi \in \mathrm{UCV}(\sigma, \mathcal{V})$ has a model $I$, then the structure $\Lambda(I)$ over $\mathcal{V}$ is a model for $\phi_{\mathcal{V}}$. In other words, if a UCV query is satisfiable, then it is also satisfiable if treated as a unary formula. Conversely, it is also clearly true that a UCV query is satisfiable if it is satisfiable when treated as a unary formula and at least one of its models is realizable. More precisely, if $\Lambda(I)$ is a model for $\phi_{\mathcal{V}}$, then $I$ is a model for $\phi$.
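The map $\Lambda$ and the example structure $I$ above can be reproduced with a brute-force sketch. It evaluates each unary conjunctive view by trying every assignment of universe elements to the body variables, which is only sensible for tiny finite structures. The function names `eval_unary_view` and `apply_lambda` and the dictionary encoding of views are assumptions made for illustration.

```python
from itertools import product


def eval_unary_view(head_var, body, universe, relations):
    """Elements a such that the body is satisfied by some assignment with head_var = a.

    body: list of atoms (relation_name, tuple_of_variable_names).
    relations: {relation_name: set_of_tuples}.
    """
    variables = sorted({v for _rel, args in body for v in args})
    result = set()
    for values in product(universe, repeat=len(variables)):
        env = dict(zip(variables, values))
        if all(tuple(env[v] for v in args) in relations[rel] for rel, args in body):
            result.add(env[head_var])
    return result


def apply_lambda(universe, relations, views):
    """Interpret every view name by the image of the structure under that view."""
    return {
        name: eval_unary_view(head_var, body, universe, relations)
        for name, (head_var, body) in views.items()
    }


# The structure I and the views V, V' from the text.
universe = {1, 2, 3, 4}
relations = {"E": {(1, 2), (2, 3), (3, 4)}}
views = {
    "V": ("x", [("E", ("x", "y"))]),
    "V'": ("x", [("E", ("x", "y")), ("E", ("y", "z"))]),
}

J = apply_lambda(universe, relations, views)
print(sorted(J["V"]), sorted(J["V'"]))  # prints: [1, 2, 3] [1, 2]
```

The result agrees with the structure $J$ given in the text, and it also shows the containment of $V'$ in $V$ on this instance.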
So, combining these, we have $I \models \phi$ iff $\Lambda(I) \models \phi_{\mathcal{V}}$. So, we immediately have the following lemma:

Lemma 3.1. Suppose $\mathcal{A}, \mathcal{B} \in \mathrm{STRUCT}(\sigma)$ and $\phi \in \mathrm{UCV}(\sigma, \mathcal{V})$. Then, for $\Lambda : \mathrm{STRUCT}(\sigma) \to \mathrm{STRUCT}(\mathcal{V})$ defined above, the following statements are equivalent:
(1) $\mathcal{A} \models \phi$ iff $\mathcal{B} \models \phi$,
(2) $\Lambda(\mathcal{A}) \models \phi_{\mathcal{V}}$ iff $\Lambda(\mathcal{B}) \models \phi_{\mathcal{V}}$.

This lemma is useful when combined with Ehrenfeucht-Fraïssé games. For example, suppose that we are given a model $\mathcal{A}$ for $\phi$, and we construct a "nicer" structure $\mathcal{B}$ that, we hope, satisfies $\phi$. If we can prove the second statement in the lemma (which is often easier to establish as views have arity one), we can deduce that $\mathcal{B} \models \phi$.

4. DECIDABILITY OF UCV QUERIES

In this section, we prove our main result that satisfiability is decidable for UCV formulas. Our main theorem stipulates that UCV has the bounded model property.

Theorem 4.1. Let $\phi$ be a formula in UCV. Suppose, further, that $\phi$ contains precisely the views in the view set $\mathcal{V}$, and relation symbols in the vocabulary $\sigma$, with $m$ being the maximum length of the views in $\mathcal{V}$, and $p = |\sigma|$. If $\phi$ is satisfiable, then it has a model using at most $2^{2^{q(p,m)}}$ elements, for some fixed polynomial $q$ in $p$ and $m$.

Before we prove this theorem, we first derive some corollaries. Simple algebraic manipulations yield the following corollary.

Corollary 4.2. Continuing from Theorem 4.1, if $n$ is the size of (the parse tree of) a satisfiable formula $\phi$, then $\phi$ has a model of size at most $2^{2^{g(n)}}$ for some fixed polynomial $g$ in $n$.

Corollary 4.2 immediately leads to the decidability of satisfiability for UCV. We can in fact derive a tighter bound.

Theorem 4.3. Satisfiability for the UCV class of formulas is in 2-NEXPTIME.

This theorem follows immediately from the following proposition and Corollary 4.2.

Proposition 4.4. Let $s$ be a non-decreasing function with $s(n) \ge n$.
Then, the problem of determining whether an FO sentence has a model of size at most $s(n)$, where $n$ is the size of the input formula, can be decided nondeterministically in $2^{O(n \log(s(n)))}$ steps.

Proof. We may use any reasonable encoding $\mathrm{code}(\mathcal{A})$ of a finite structure $\mathcal{A}$ in bits (e.g. see [Libkin 2004, Chapter 6]). The size of the encoding, denoted $\|\mathcal{A}\|$, is polynomial in $|A|$. We first guess a structure $\mathcal{A}$ of size at most $s(n)$. Let $s' = |A|$. Since the size $\|\mathcal{A}\|$ of the encoding of $\mathcal{A}$ is polynomial in $s'$, the guessing procedure takes $O(s(n)^k)$ time steps for some constant $k$. We then use the usual procedure for evaluating whether $\mathcal{A} \models \phi$. This can be done in $O(n \times \|\mathcal{A}\|^n)$ steps (e.g. see [Libkin 2004, Proposition 6.6]). Simple algebraic manipulations give the sought-after upper bound.

Observe that a lower bound for satisfiability of UCV formulas follows immediately from the NEXPTIME-completeness of satisfiability of UFO formulas given in [Börger et al. 1997].

Theorem 4.5. Satisfiability for the UCV class of formulas is NEXPTIME-hard.

What remains now is to prove Theorem 4.1.

Proof of Theorem 4.1. Let $\phi, m, p$ be as stated in Theorem 4.1. We begin by first enumerating all possible views over $\sigma$ of length at most $m$. As we shall see later in the proof of Subproperty 4.14, doing so will help facilitate the correctness of our construction of a finite model, since enumerating all such views effectively allows us to determine all possible ways the model may be "seen" by views, or parts of views. Let $\mathcal{U} = \{V_1, \ldots, V_N\}$ be the set of all non-equivalent views obtained. By elementary counting, one may easily verify that $N \le m(mp)^m$. Indeed, each view is composed of its head and its body, whose length is bounded by $m$. The body is a set of conjuncts that we may fix in some order. There are at most $m$ variables that the head can take.
Each position in the body is a variable ($m$ choices) that is part of a relation $R$ ($p$ choices). The upper bound is then immediate.

Let $I_0$ be a (possibly infinite) model for $\phi$. [If it is infinite, by the Löwenheim-Skolem theorem, we may assume that it is countable.] Without loss of generality, we may assume that there exists a "universe" relation $U$ in $I_0$ which contains each constant in $\mathit{adom}(I_0)$. Otherwise, if $U' \notin \sigma$ is a unary relation symbol, the $(\sigma \cup \{U'\})$-structure obtained by adding to $I_0$ the relation $U'$, which is to be interpreted as the universe of $I_0$, is also a model for $\phi$. Let us now define $2^N$ formulas $C_0, \ldots, C_{2^N - 1}$ of the form

$$C_i(x) \stackrel{\mathrm{def}}{=} (\neg) V_1(x) \wedge \ldots \wedge (\neg) V_N(x),$$

where the conjunct $V_j(x)$ is negated iff the $j$th bit of the binary representation of $i$ is 0. For each $\mathcal{A} \in \mathrm{STRUCT}(\sigma)$, these formulas induce an equivalence relation on $A$ with each set $C_i(\mathcal{A})$ being an equivalence class. When $\mathcal{A}$ is clear, we refer to the equivalence class $C_i(\mathcal{A})$ simply as $C_i$. In addition, the existence of the universe relation $U$ in $I_0$ implies that the all-negative equivalence class $C_0$ is empty.

We next describe a sequence of five satisfaction-preserving procedures for deriving a finite model from $I_0$. This sequence is best described diagrammatically:

$$I_0 \xrightarrow{\textit{makeJF}} I_1 \xrightarrow{\textit{rename1}} I_2 \xrightarrow{\textit{rename2}} I_3 \xrightarrow{\textit{copy}} I_4 \xrightarrow{\textit{prune}} I_5.$$

The $i$th procedure above ($1 \le i \le 5$) takes the structure $I_{i-1}$ as input and outputs another structure $I_i$. The structure $I_5$ is guaranteed to be finite (and indeed bounded). That each procedure preserves satisfiability immediately follows from Subproperties 4.8, 4.10, 4.12, 4.13, and 4.14.
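The partition induced by the formulas $C_i$ can be sketched as follows. Each element is labelled by a bit string whose $j$th character records membership in the $j$th view, and elements with the same label form one class. Reading the leftmost character as the bit for $V_1$ is an assumption about the bit ordering. The view extents below are hypothetical: they describe a finite path $0 \to 1 \to 2 \to 3$ with $V_1$ read as "has an outgoing arc", $V_2$ as "has a loop" and $V_3$ as "has an incoming arc".

```python
def class_label(a, view_names, extents):
    """Bit string whose j-th character is '1' iff a belongs to the j-th view."""
    return "".join("1" if a in extents[name] else "0" for name in view_names)


def equivalence_classes(universe, view_names, extents):
    """Group the elements of the universe by their class label."""
    classes = {}
    for a in universe:
        classes.setdefault(class_label(a, view_names, extents), set()).add(a)
    return classes


# Hypothetical extents on the finite path 0 -> 1 -> 2 -> 3.
universe = {0, 1, 2, 3}
view_names = ["V1", "V2", "V3"]
extents = {"V1": {0, 1, 2}, "V2": set(), "V3": {1, 2, 3}}

classes = equivalence_classes(universe, view_names, extents)
for label in sorted(classes):
    print(label, sorted(classes[label]))
# prints:
# 001 [3]
# 100 [0]
# 101 [1, 2]
```

Only non-empty classes appear in the dictionary, so at most $|A|$ of the $2^N$ possible labels are ever materialised.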
While reading the description of the procedures below, it is instructive to keep in mind that the property that $C_i(I_j) = \emptyset$ iff $C_i(I_{j+1}) = \emptyset$ is sufficient for showing that the procedure taking $I_j$ to $I_{j+1}$ preserves satisfiability (see Lemma 4.7). Roughly speaking, the procedure makeJF transforms the initially given structure $I_0$ into another structure that has a forest-like graphical representation, called a "justification forest". Each subsequent procedure works only on justification forests. In the sequel, we shall use $H_i$ to denote our graphical representation of $I_i$ ($i \in \{1, \ldots, 5\}$).

The procedure makeJF

We define the structure $I_1$ by first defining a sequence $I_1^0, I_1^1, \ldots$ of structures such that $I_1^k$ is a substructure of $I_1^{k+1}$, and then setting $I_1 = \bigcup_{k=0}^{\infty} I_1^k$. [Note: we take the normal union, not disjoint union.] We first deal with the base case of $I_1^0$. For each non-empty equivalence class $C_i(I_0)$, we choose a witnessing constant $a_i \in C_i(I_0)$. We define $I_1^0$ as the collection of all such $a_i$s. All relations in $I_1^0$ are empty. Each $a_i$ is said to be unjustified in $I_1^0$, meaning that the model is missing tuples that can witness the truth of $a_i$ being a member of some equivalence class.

We now describe how to define $I_1^{k+1}$ from $I_1^k$. For each $a \in I_1^k$, if $a \in C_i(I_0)$ for some $i$, it is the case that $a \in V_j(I_0)$ iff $\mathrm{bit}_j(i) = 1$ for $1 \le j \le N$. For such $a$, we may take a minimal witnessing substructure $S_a$ of $I_0$ such that $a \in V_j(S_a)$ iff $\mathrm{bit}_j(i) = 1$. As each constant in $\mathit{adom}(S_a)$ appears in at least one relation in $S_a$, we shall often think of these witnessing structures as databases (i.e. sets of tuples), and refer to them as justification sets. We define the structure $I_1^{k+1}$ to be the union of $I_1^k$ and all the witnessing structures $S_a$ such that $a$ is unjustified in $I_1^k$.
The elements in $I_1^k$ become justified in $I_1^{k+1}$. The elements in $I_1^{k+1} - I_1^k$ are then said to be unjustified in $I_1^{k+1}$. Observe that the structure $I_1^{k+1}$ does not unjustify any elements that were justified in $I_1^k$, since there is no negation in the view definitions. Finally, the structure $I_1$ is defined as the union of all $I_1^k$s. Observe that each element in $I_1$ appears in at least one relation in $I_1$.

The structure $I_1$ has an intuitive graphical representation, which we denote by $H_1$. The graph $H_1$ is simply a labeled forest containing exactly one tree $T_i$ (for some $0 \le i \le 2^N - 1$) for the witnessing constant $a_i$ of each non-empty $C_i$. We define $T_i$ as follows: the root of $T_i$ is labeled by $S_{a_i} \times C_i$; and for each $j = 0, 1, \ldots$, each $S_b \times C_k$-labeled node $v$ at level $j$ (for some justification set $S_b$ and equivalence class formula $C_k$), and each constant $c$ in $\mathit{adom}(S_b)$ that is distinct from $b$, we define a new $S_c \times C_{k'}$-labeled node to be a child of $v$, for the unique $k'$ such that $c \in C_{k'}(I_0)$. In the following, when the meaning is clear, we shall often refer to an $(S_a \times C_k)$-labeled node simply as an $S_a$-labeled node. Also, observe the similarity of the construction of $H_1$ and that of $I_1$. In fact, the union of all $S_a$ for which there is an $S_a$-labeled node in $H_1$ is precisely $I_1$. Observe also that each tree $T_i$ may be infinite. For obvious reasons, we shall refer to $T_i$ as a justification tree (of $a_i$), and to $H_1$ as a justification forest. In the following, for any justification tree $T$ and any justification forest $H$, their corresponding structures (or databases), denoted by $D(T)$ and $D(H)$ respectively, are defined to be the union of all $S_a$ such that there is an $S_a$-labeled node in, respectively, $T$ and $H$.
Furthermore, we shall use $\mathit{adom}(T)$ and $\mathit{adom}(H)$ to denote $\mathit{adom}(D(T))$ and $\mathit{adom}(D(H))$, respectively. The elements in the sets $\mathit{adom}(T)$ and $\mathit{adom}(H)$ are referred to as, respectively, constants in $T$ and constants in $H$.

We now illustrate this procedure by a small example. Define the UCV formula
\[ \phi = \forall x\, (V_1(x) \wedge \neg V_2(x)), \]
where the views are
\[ V_1(x) \leftarrow E(x, y) \qquad V_2(x) \leftarrow E(x, x). \]
Here, we have $\mathcal{V} = \{V_1, V_2\}$, $\sigma = \langle E \rangle$, and $m = 2$. Suppose that $I_0 = \langle \mathbb{N}, E = \{(0,1), (1,2), (2,3), (3,4), \ldots\} \rangle$ is a path extending indefinitely to the right. Then, we have $I_0 \models \phi$. Enumerating all non-equivalent views over $\sigma$ of length at most $m$, we have $U = \{V_1, V_2, V_3\}$ where
\[ V_3(x) \leftarrow E(y, x). \]
Now, there are exactly two non-empty equivalence classes:
\[ C_{100} = \{0\} \qquad C_{101} = \{1, 2, \ldots\}. \]
Then, we have $S_0 = \{E(0,1)\}$ and $S_i = \{E(i-1, i), E(i, i+1)\}$ for $i > 0$. Following the above procedure, we obtain the trees $T_{100}$ and $T_{101}$ as depicted in figure 1. Note that $H_1$ is the disjoint union of $T_{100}$ and $T_{101}$.

[Fig. 1. A depiction of the justification forest $H_1$ as an output of makeJF.]

The procedure rename1

Proviso: in subsequent procedures (including the present one), we shall not change the second entries (i.e. $C_i$) of each node label (i.e. of the form $S_a \times C_i$) and frequently omit mention of them.

The aim of this procedure is to ensure that there are no two justification trees $T$ and $T'$ with $\mathit{adom}(T) \cap \mathit{adom}(T') \neq \emptyset$. It essentially performs renaming of constants in $\mathit{adom}(T)$, for each tree $T$ in $H_1$. This step will later help us guarantee the correctness of the last step that is used to produce the final model $I_5$, which relies on a kind of “tree disjointness” property.
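The renaming idea can be sketched in a few lines. The snippet below is only an illustration under stated assumptions: each tree is represented by the set of tuples of its corresponding structure, and constants are made disjoint by tagging them with a (hypothetical) tree identifier, which is one of many possible renaming schemes.

```python
def adom(tuples):
    """Active domain of a set of (relation, constants) tuples."""
    return {c for (_, args) in tuples for c in args}

def rename_apart(trees):
    """Given a dict tree_id -> set of tuples, return a dict in which every
    constant a occurring in tree tree_id is replaced by the pair (tree_id, a),
    so that distinct trees no longer share constants."""
    return {
        tree_id: {(rel, tuple((tree_id, c) for c in args)) for (rel, args) in tuples}
        for tree_id, tuples in trees.items()
    }

# Hypothetical finite fragments of two justification trees sharing constants.
trees = {
    "T100": {("E", (0, 1)), ("E", (1, 2))},
    "T101": {("E", (0, 1)), ("E", (1, 2)), ("E", (2, 3))},
}
renamed = rename_apart(trees)
```

Because the renaming is injective within each tree, every renamed tree stays isomorphic to the original one.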
More formally, we define $I_2$ to be the disjoint union (see footnote 4) of $D(T)$ over all trees $T$ in $H_1$. The justification forest $H_2$ corresponding to $I_2$ can be obtained from $H_1$ by renaming constants of the tuples in each tree $T$ in $H_1$ accordingly.

Let us continue with our previous example of $H_1$. The graph $H_2$ in this case will be precisely identical to $H_1$, except that in $T_{101}$ we use the label, say, $S_{0'} = \{E(0', 1')\}$ (resp. $S_{i'} = \{E((i-1)', i'), E(i', (i+1)')\}$ for $i > 0$) instead of $S_0$ (resp. $S_i$ for $i > 0$).

Footnote 4. The disjoint union of two $\sigma$-structures $A$ and $B$ with $A \cap B = \emptyset$ is the structure with universe $A \cup B$ and relation $R$ interpreted as $R^A \cup R^B$. If $A \cap B \neq \emptyset$, one can simply force disjointness by renaming constants.

The procedure rename2

The aim of this procedure is to transform the model in such a way that each constant $a$ can appear only at two consecutive levels, say $j$ and $j+1$, within each tree. It appears at level $j$ as part of an $S_b$-labeled node $v$, for some constant $b \neq a$, and at level $j+1$ as part of an $S_a$-labeled node that is a child of $v$. Further, the procedure ensures that any given constant occurs in at most one node's label at each level in a tree. Again, this step will later help us guarantee the correctness of the step that is used to produce the final model $I_5$, which relies on the existence of a kind of internal “disjointness” property within trees.

Let us fix a sibling ordering for the nodes within each tree $T_i$ in $H_2$. Define a set $U$ of constants disjoint from $I_2$ as follows:
\[ U = \{ a_{j,l} : j, l \in \mathbb{N} \text{ and } a \in I_2 \}. \]
For $a, b \in I_2$, we require that $a_{j,l} \neq b_{j',l'}$ whenever either $j \neq j'$, or $l \neq l'$, or $a \neq b$. For each tree $T_i$ and for each $j = 1, 2, \ldots$, choose the $l$th node $v$ with respect to the fixed sibling ordering (say, $S_a$-labeled) at level $j$ in $T_i$.
Let $v$'s children be $v_1, \ldots, v_k$ (labeled by, respectively, $S_{b_1}, \ldots, S_{b_k}$ with $b_h \neq a$). Now do the following: change $v$ to $S_a[b^1_{j,l}, \ldots, b^k_{j,l} / b_1, \ldots, b_k]$; and change $v_h$, where $1 \le h \le k$, to $S_{b^h_{j,l}} := S_{b_h}[b^h_{j,l} / b_h]$. Here $b^h_{j,l}$ denotes the renamed copy of $b_h$. Observe that there are two stages in this procedure where each non-root node at level $j$, say $S_a$-labeled, undergoes relabeling: first when we are at level $j - 1$ (the constant $a$ is renamed by $a_{j,k}$ for some $k$), and second when we are at level $j$ (constants other than $a_{j,k}$ are renamed for what is now $S_{a_{j,k}}$). The output of this procedure on $H_2$ is denoted by $H_3$, whose corresponding structure we denote by $I_3$.

We continue with our previous example. The root node $u_1$ of $T_{100}$ in $H_2$ is $S_0 = \{E(0,1)\}$, its child $u_2$ (sibling zero at level 1) is $S_1 = \{E(0,1), E(1,2)\}$ and in turn the children of that child are $u_3 = S_0 = \{E(0,1)\}$ (sibling 0 at level 2) and $u_4 = S_2 = \{E(1,2), E(2,3)\}$ (sibling 1 at level 2). Under the rename2 procedure, node $u_1$ is unchanged, since it is at level zero. Node $u_2$ is changed to $S_1 = \{E(0_{1,0}, 1), E(1, 2_{1,0})\}$. Node $u_3$ is changed to $S_{0_{1,0}} = \{E(0_{1,0}, 1_{2,0})\}$ and $u_4$ is changed to $S_{2_{1,0}} = \{E(1_{2,1}, 2_{1,0}), E(2_{1,0}, 3_{2,1})\}$.

The procedure copy

This procedure makes a number of isomorphic copies of the model $H_3$ and then unions them together. Duplicating the model in this way facilitates the construction of a bounded model by the prune procedure, which will be described shortly. Let $\delta$ be the total number of constants that appear in some tuples from a node label at level $h := cm$ in $H_3$, for some fixed $c \in \mathbb{N}$, independent of $\phi$, whose value will later become clear in the proofs that follow.
By virtue of procedure makeJF, we are guaranteed that each node in $H_3$ can have at most $N \times m$ children, where $N \times m$ represents an upper bound on the number of constants each justification set might contain. Since there are at most $2^N$ trees in $H_3$, by elementary counting, we see that $\delta \le 2^N \times (N \times m)^h$. Now, letting $g := cm$, make $\Delta := \delta^g$ (isomorphic) copies of $H_3$, each with a disjoint set of constants. That is, the node labeling of each new copy of $H_3$ is isomorphic to that of $H_3$, except that it uses a disjoint set of constants. Let us call them the copies $\mathcal{B}_1, \ldots, \mathcal{B}_\Delta$ (the original copy of $H_3$ is included). So, we have $\mathcal{B}_i \cap \mathcal{B}_j = \emptyset$, for $i \neq j$. For each tree $T_i$ in $H_3$, we denote by $T^k_i$ the isomorphic copy of $T_i$ in $\mathcal{B}_k$. Now, let $H_4 = \mathcal{B}_1 \cup \ldots \cup \mathcal{B}_\Delta$. The structure corresponding to $H_4$ is denoted by $I_4$. In the sequel, each node at level $h$ in $\mathcal{B}_k$ is said to be a (potential) leaf of $\mathcal{B}_k$.

The procedure prune

The purpose of this procedure is to transform $H_4$ into a finite model. Intuitively, this is achieved by “pruning” all trees at level $h$ and then rejustifying the resulting unjustified constants by “linking” them to a justification being used in some other part of the model. This is the most complex step in the entire sequence of procedures, and care will be needed later to prove that satisfiability is not violated when constants are being rejustified.

We begin by describing the connections that we wish to construct between the different parts of the model. Roughly speaking, the model we intend to construct somewhat resembles a $\delta$-regular graph, whose nodes are the copies $\mathcal{B}_1, \ldots, \mathcal{B}_\Delta$ made earlier, and where edges between copies indicate that one copy is being used to make a new justification for a node at level $h$ in another copy.
Firstly though, we state a proposition from extremal graph theory (see [Bollobas 2004, Theorem 1.4' Chapter III] for proof) that can be used to guarantee the existence of the kind of $\delta$-regular graph we intend to construct.

Proposition 4.6. Fix two positive integers $\delta, g$ and take an integer $\Delta$ with
\[ \Delta \ge \frac{(\delta - 1)^{g-1} - 1}{\delta - 2}. \]
Then, there exists a $\delta$-regular graph of size $\Delta$ with girth at least $g$.

Using $\delta$, $g$ and $\Delta$ as defined in the copy procedure, this proposition implies that there exists a $\delta$-regular graph $G$ with vertices $\{\mathcal{B}_1, \ldots, \mathcal{B}_\Delta\}$ and with girth at least $g$. Let us now treat $G$ as a directed graph, where each edge in $G$ is regarded as two arcs, one in each direction. Observe that, for each vertex $\mathcal{B}_k$, there is a bijection $\mathit{out}_k$ from the set of leafs (nodes at height $h$) of $\mathcal{B}_k$ to the set of arcs going out from $\mathcal{B}_k$ in $G$. We next take each leaf of $\mathcal{B}_k$ in turn. For a leaf $v$ (say, $S_b$-labeled), suppose that $\mathit{out}_k(v) = (\mathcal{B}_k, \mathcal{B}_{k'})$. Choose $i$ such that $b \in C_i(I_4)$. If the root of $T^{k'}_i$ is $S_c$-labeled, for some $c \in I_4$, then we delete all descendants of $v$ in $T^k_i$ and change $v$ to $S_c[b/c]$. In this way, we “prune” each of the trees in $H_4$, and link each leaf node to the root node of another tree for the purpose of justification.

We denote by $H_5$ the resulting collection of interlinked models, whose corresponding structure is denoted by $I_5$. $H_5$ can be thought of as a collection of interlinked forests, where each forest corresponds to one of the copies $\{\mathcal{B}_1, \ldots, \mathcal{B}_\Delta\}$ and each forest is a collection of trees. Observe now that each “tree” in $H_5$ is of height $h$. Since there are at most $\Delta \times 2^N$ trees in $H_5$, each of which has at most $(N \times m)^{h+1}$ constants, we see that
\[ |I_5| \le (\Delta \times 2^N) \times (N \times m)^{h+1} \le \big( (2^N \times (Nm)^{cm})^{cm} \times 2^N \big) \times (N \times m)^{cm+1}. \]
It is easy to calculate now that $|I_5| \le 2^{2^{q(p,m)}}$ for some polynomial $q$ in $p$ and $m$.
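The arithmetic behind this bound can be checked mechanically. The sketch below is an illustration only: it plugs the upper bound $2^N (Nm)^h$ in for $\delta$ (the actual $\delta$ may be smaller), uses hypothetical function names, and evaluates the right-hand side of Proposition 4.6 so that one can confirm that $\Delta = \delta^g$ meets the threshold for sample parameter values.

```python
def model_size_bound(N, m, c=1):
    """Upper bound on |I_5| following the counting argument in the text,
    with h = g = c*m and delta replaced by its bound 2^N * (N*m)^h."""
    h = g = c * m
    delta = 2**N * (N * m) ** h      # bound on the number of leaf constants
    copies = delta**g                # Delta, the number of copies of H_3
    trees = copies * 2**N            # at most 2^N trees per copy
    return trees * (N * m) ** (h + 1)

def girth_graph_threshold(delta, g):
    """Right-hand side of Proposition 4.6; assumes delta >= 3.
    The quotient is exact, since it equals 1 + x + ... + x^(g-2)
    for x = delta - 1."""
    return ((delta - 1) ** (g - 1) - 1) // (delta - 2)

# Sample values: N = 3 views, view length m = 2, constant c = 1.
bound = model_size_bound(3, 2, 1)
delta = 2**3 * (3 * 2) ** 2
enough_copies = delta**2 >= girth_graph_threshold(delta, 2)
```

Even for these tiny parameters the bound is already above $10^8$, which is consistent with the doubly exponential growth stated above.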
We have thus managed to construct a bounded model $I_5$ which satisfies the original UCV formula $\phi$. We now prove the correctness of our construction for theorem 4.1. The proof is divided into a series of subproperties that assert the correctness of each procedure in our construction. First, we prove a simple lemma.

Lemma 4.7. Let $\mathcal{V}$ be a set of (unary) views over a vocabulary $\sigma$. Suppose $I, J$ are $\sigma$-structures such that $C(I)$ is non-empty iff $C(J)$ is non-empty for each equivalence class formula $C$ constructed with respect to $\mathcal{V}$. Then, $I$ and $J$ agree on $\mathrm{UCV}(\sigma, \mathcal{V})$.

Proof. By a standard Ehrenfeucht-Fraïssé argument, we see that $\Lambda(I)$ and $\Lambda(J)$ agree on $\mathrm{UFO}(\mathcal{V})$. Then, by lemma 3.1, we have that $I$ and $J$ agree on $\mathrm{UCV}(\sigma, \mathcal{V})$.

Subproperty 4.8 (Correctness of makeJF). (1) For each node $v$ (say, $(S_a \times C_i)$-labeled) of $H_1$, $a \in C_i(I_1)$ with witnessing structure $S_a$. (2) If $I_0 \models \phi$, then $I_1 \models \phi$.

Proof. First, note that $I_1 \subseteq I_0$. Since conjunctive queries are monotonic, we have $V(I_1) \subseteq V(I_0)$ for each view $V \in U$. So, we have that $a \in V(I_1)$ implies that $a \in V(I_0)$. In addition, for each constant $a \in I_1$, if $a \in V(I_0)$, then $a \in V(I_1)$, which is witnessed at some $S_a$-labeled node. In turn, this implies that for $a \in I_1$, it is the case that $a \in C(I_1)$ iff $a \in C(I_0)$. This proves the first statement. Also, by construction, if $C_i(I_0)$ is non-empty, where $i \in \{0, \ldots, 2^N - 1\}$, we know that one of its members belongs to $I_1$, witnessed at the root of $T_i$. Therefore, we also have that $C(I_0)$ is non-empty iff $C(I_1)$ is non-empty. In view of lemma 4.7, we conclude the second statement.

At this stage, it is worth noting that, once $H_1$ has been constructed, the subsequent procedures might modify the label $(S_a \times C_i)$: its name (e.g.
from $S_a \times C_i$ to $S_{a'} \times C_i$ for some new constant $a'$) as well as its contents (e.g. replacing each occurrence of a tuple $R(a, a)$ by $R(a', a')$ for some new constant $a'$). Despite this, we wish to highlight that one invariant is preserved by each of the procedures that have been described:

Invariant 4.9 (Justification Set). Suppose $H$ is a justification forest of a structure $I$, and $v$ a $(S_a \times C_i)$-labeled node of $H$. Then, we have $a \in C_i(I)$ with witnessing structure $S_a$.

Subproperty 4.8 shows that this is satisfied by $H_1$. In fact, that this invariant is preserved by the later procedures will be almost immediate from the proof of correctness of each procedure. Hence, we leave it to the reader to verify.

Subproperty 4.10 (Correctness of rename1). If $I_1 \models \phi$, then $I_2 \models \phi$.

Proof. In this procedure, we perform constant renaming for each tree $T_i$ in $H_1$. For the purpose of this proof, let us denote the tree so obtained by $T'_i$. Such a renaming induces a bijection $f_i : \mathit{adom}(T_i) \to \mathit{adom}(T'_i)$. Extend $f_i$ to tuples, structures, and trees in the obvious way. Observe that the structures corresponding to the trees $f_i(T_i)$ and $T_i$ are isomorphic. Now, in view of lemma 4.11, it is easy to check that for each tree $T_i$ in $H_1$ and a constant $a$ in the structure corresponding to $T_i$, $a \in C_j(I_1)$ iff $f_i(a) \in C_j(I_2)$. By virtue of lemma 4.7, we conclude our proof.

Lemma 4.11. Suppose that $a$ is a constant in the structure corresponding to $T_i$ of $H_1$. Then, $a \in V(I_1)$ iff $f_i(a) \in V(I_2)$.

Proof. ($\Rightarrow$) By subproperty 4.8, it is the case that $a \in V(S_a)$. Since $f_i$ is a bijection, it is also true that $f_i(a) \in V(f_i(S_a))$.

($\Leftarrow$) Let $M$ be a minimal set of tuples in $I_2$ such that $f_i(a) \in V(M)$. Observe that there is a one-to-one function mapping the set of conjuncts in $V$ to $M$.
For each tree $T_j$ in $H_2$, let $M_j$ denote the members of $M$ that can be found in $T_j$. Note that $\mathit{adom}(M_j) \cap \mathit{adom}(M_{j'}) = \emptyset$ for $j \neq j'$. Now, let $M' = \bigcup_j f_j^{-1}(M_j)$. It is not hard to see that $a \in V(M')$. Since $M' \subseteq I_1$, we have $a \in V(I_1)$.

Subproperty 4.12 (Correctness of rename2). If $I_2 \models \phi$, then $I_3 \models \phi$.

Proof. Define the function $\eta : I_3 \to I_2$ such that $\eta(a_{j,k}) = a$. Note that $\eta$ is onto. Extend $\eta$ to tuples, and sets of tuples in the obvious way. In view of lemma 4.7, it is sufficient to show that, for each $a \in I_3$ and $i \in \{0, \ldots, 2^N - 1\}$, $a \in C_i(I_3)$ iff $\eta(a) \in C_i(I_2)$. In turn, it is enough to show that $a \in V(I_3)$ iff $\eta(a) \in V(I_2)$.

($\Rightarrow$) Take a minimal set $M$ of tuples in $I_3$ such that $a \in V(M)$. Then, we have $\eta(a) \in V(\eta(M))$. Since $\eta(M) \subseteq I_2$, we have $\eta(a) \in V(I_2)$.

($\Leftarrow$) Since invariant 4.9 holds for $H_2$, the fact that $\eta(a) \in V(I_2)$ is witnessed by $S_{\eta(a)} \in H_2$. Since $S_a$ and $S_{\eta(a)}$ are isomorphic justification sets, we have that $a \in V(I_3)$ is justified by $S_a \in H_3$.

Subproperty 4.13 (Correctness of copy). (1) For each node $v$ (say, $(S_a \times C_i)$-labeled) of $H_4$, $a \in C_i(I_4)$ with witnessing structure $S_a$. (2) If $I_3 \models \phi$, then $I_4 \models \phi$.

Proof. Similar to the proof of subproperty 4.10.

Subproperty 4.14 (Correctness of prune). If $I_4 \models \phi$, then $I_5 \models \phi$.

Proof. Recall that there are $N_l := \delta \times \Delta$ leafs in $H_4$. Let us order these nodes as $v_1, \ldots, v_{N_l}$. Suppose also that $v_i$ is labeled by $S_{b_i}$ for some $b_i \in I_4$. By virtue of rename2, we see that $b_i \neq b_j$ whenever $i \neq j$. Next, we may think of the procedure prune as consisting of $N_l$ steps, where at step $i$, the node $v_i$ has all its descendants removed (pruned) and $v_i$ is changed to $S'_{b_i} := S_{c_i}[b_i/c_i]$ for some $c_i \in I_4$. Letting $K_0 := H_4$, we denote by $K_i$ ($i = 1, \ldots$
, $N_l$) the resulting model after executing $i$ steps on $K_0$. The structure corresponding to $K_i$ is denoted by $J_i$. We wish to prove by induction on $0 \le i < N_l$ that

(I) For each $a \in J_{i+1}$ and $V \in U$, $a \in V(J_{i+1})$ iff $a \in V(J_i)$.
(II) Invariant 4.9 holds for $J_{i+1}$.
(III) For each $a \in J_{i+1}$ and each equivalence class formula $C_j$, we have $a \in C_j(J_{i+1})$ iff $a \in C_j(J_i)$.

Note that $J_{i+1} \subseteq J_i$. So, by lemma 4.7 and the fact that invariant 4.9 holds for the initial case $J_0$ (from proofs of previous subproperties), statement (III) will imply what we wish to prove. It is easy to see that statement (III) is a direct consequence of statement (I). It is also easy to show that statement (I) implies statement (II). This follows since, first, at step $i + 1$ we replace the content of $S_{b_i}$ by that of $S_{c_i}$, except for substituting $b_i$ for $c_i$. Second, the elements $b_i$ and $c_i$ belong to the same equivalence class in $J_i$, and invariant 4.9 holds for $J_i$ by induction. Therefore, it remains only to prove statement (I).

Let us now fix $i < N_l$, $a \in J_{i+1}$, and $V \in U$. It is simple to prove that $a \in V(J_i)$ implies $a \in V(J_{i+1})$. This is witnessed by tuples in the $S_a$-labeled (or $S'_{b_i}$-labeled if $a = b_i$) node in $K_{i+1}$, which exists by construction.

Conversely, we take a minimal set $M$ of tuples in $J_{i+1}$ witnessing $a \in V(J_{i+1})$ (that is, $a \in V(M)$) under the valuation $\nu$. Our aim is to find a set $M'$ of tuples in $J_i$ with $a \in V(M')$. Let $M_{b_i} := M - D(J_i)$. Intuitively, $M_{b_i}$ contains the set of new tuples. These are tuples which did not exist in the structure $J_i$ and have been created specifically to justify the node whose descendants (justifications) have just been pruned. By construction, we have $M_{b_i} \subseteq S'_{b_i}$, which implies that $\mathit{adom}(M_{b_i}) \subseteq \mathit{adom}(S'_{b_i})$. Observe also that $b_i \in \mathit{adom}(t)$ for each tuple $t$ in $M_{b_i}$; otherwise, $t$ would be a tuple in $S_{c_i} \subseteq J_i$ (i.e.
it would not be a new tuple). Define
\[ L := \{ t \in M - M_{b_i} : t \text{ is connected to some } t' \in M_{b_i} \text{ in } M \}. \]
$L$ consists of tuples that are connected to new tuples. Also, let $L' := M - M_{b_i} - L$, i.e., the set of all tuples of $M$ that are not connected to any (new) tuples in $M_{b_i}$. Note that $L \cup L' \subseteq J_i$, and that the sets $M_{b_i}$, $L$, and $L'$ form a partition of $M$. Also, by definition, we have $\mathit{adom}(L') \cap \mathit{adom}(M_{b_i} \cup L) = \emptyset$. In the following, we define $M_{c_i} := M_{b_i}[c_i/b_i]$. Note that $M_{c_i} \subseteq S_{c_i} \subseteq J_i$.

Before we proceed further, it is helpful to see how we partition $M$ on a simple example. Suppose that the view $V$ is defined as
\[ V(x_0) \leftarrow E(x_0, x_1), E(x_1, x_2), R(x_3, x_4), R(x_4, x_5). \]
Furthermore, suppose that we take the valuation $\nu$ defined as $\nu(x_i) = i$. In this case, $M$ can be described diagrammatically as follows:
\[ V(0) \leftarrow E(0, 1), E(1, 2), R(3, 4), R(4, 5). \]
Assume now that the only tuple in $M$ that doesn't belong to $D(J_i)$ is $E(0, 1)$. Then, we have $M_{b_i} = \{E(0, 1)\}$. It is easy to show that $L = \{E(1, 2)\}$ and $L' = \{R(3, 4), R(4, 5)\}$.

We next state a result regarding $L$ that will shortly be needed. It clarifies the nature of a partition that exists for $L$ and the relationships which hold between the elements of the partition.

Proposition 4.15. We can find tuple-sets $A, B \subseteq L$ such that:
(1) $A \cap B = \emptyset$,
(2) $A \cup B = L$,
(3) $\mathit{adom}(A) \cap \mathit{adom}(B) = \emptyset$,
(4) $b_i \notin \mathit{adom}(B)$, and
(5) $\mathit{adom}(M_{b_i}) \cap \mathit{adom}(A) \subseteq \{b_i\}$.

The proof of this proposition can be found at the end of this section.

We now shall construct $M' \subseteq D(J_i)$ such that $a \in V(M')$. First, we put $L'$ in $M'$. This does not affect our choice of tuple-sets that replace $M_{b_i}$, $A$, and $B$ as $\mathit{adom}(L') \cap \mathit{adom}(M_{b_i} \cup L) = \emptyset$ (i.e.
the set of free tuples instantiated by $L'$ and the set of free tuples instantiated by $M_{b_i} \cup L$ share no common variables), as we have noted earlier. There are two cases to consider:

case 1. $a = b_i$. Let $F$ be the set of all free tuples in the body of $V$ such that $\{\nu(u) : u \in F\} = M_{b_i} \cup B$. Suppose $X$ is the set of all variables in $V$. Let $\{y_1, \ldots, y_r\} \subseteq X$ be the set of variables in $F$ such that $\nu(y_j) = b_i$. With $y$ as a new variable, let $F' := F[y/y_1, \ldots, y_r]$. Define the new view $V'(y)$ whose conjuncts are exactly $F'$:
\[ V'(y) \leftarrow \bigwedge F'. \]
Trivially, we have $b_i \in V'(M_{b_i} \cup B)$. Then, as $b_i \notin \mathit{adom}(B)$ by proposition 4.15, we have $c_i \in V'(M_{c_i} \cup B)$. Note that $M_{c_i} \cup B \subseteq D(J_i)$ and $V' \in U$ since $\mathit{length}(V') \le m$. So, since by induction $b_i$ and $c_i$ belong to the same equivalence class in $J_i$, there exist tuple-sets $P_{b_i}$ and $B'$ with $P_{b_i} \cup B' \subseteq J_i$ such that $b_i \in V'(P_{b_i} \cup B')$. [$P_{b_i}$ and $B'$, respectively, replace the role of $M_{c_i}$ and $B$.] Observe now that $a \notin \mathit{adom}(B)$ as $a = b_i$. Since $\mathit{adom}(M_{b_i}) \cap \mathit{adom}(A) \subseteq \{b_i\}$ and $\mathit{adom}(A) \cap \mathit{adom}(B) = \emptyset$ from proposition 4.15, it is easy to verify that
\[ a \in V(P_{b_i} \cup A \cup B' \cup L'). \]

case 2. $a \neq b_i$. This is divided into two further cases:

(a) $b_i \in \mathit{adom}(A)$. This is divided into two further cases:

(i) $a \in \mathit{adom}(A)$. In this case, note that $a \notin \mathit{adom}(M_{b_i})$ (using Proposition 4.15(5)) and $a \notin \mathit{adom}(B)$ (using Proposition 4.15(3)). We can then continue in the same fashion as in case 1.

(ii) $a \notin \mathit{adom}(A)$. Let $F$ be the set of all free tuples in the body of $V$ such that $\{\nu(u) : u \in F\} = A$. Let $\{y_1, \ldots, y_r\} \subseteq X$ be the set of variables in $F$ such that $\nu(y_j) = b_i$. Let $y$ be a new variable (i.e. $y \notin X$) and $F' := F[y/y_1, \ldots, y_r]$, i.e., we replace each occurrence of the variables $y_1, \ldots, y_r$ in $F$ by $y$.
Then, let $V'(y)$ be the view whose conjuncts are exactly $F'$:
\[ V'(y) \leftarrow \bigwedge F'. \]
Then, $V' \in U$ and $b_i \in V'(A)$. Since $A \subseteq D(J_i)$ and because $b_i$ and $c_i$ belong to the same equivalence class in $J_i$ (by the induction hypothesis), there exists a set $A' \subseteq D(J_i)$ such that $c_i \in V'(A')$. Since $\mathit{adom}(M_{b_i}) \cap \mathit{adom}(A) \subseteq \{b_i\}$ and $\mathit{adom}(A) \cap \mathit{adom}(B) = \emptyset$ from proposition 4.15, it is easy to check that $a \in V'(M_{c_i} \cup A' \cup B \cup L')$.

(b) $b_i \notin \mathit{adom}(A)$. Let $M_{c_i} := M_{b_i}[c_i/b_i]$. By construction, we see that $M_{c_i} \subseteq S_{c_i} \subseteq D(J_i)$. By proposition 4.15 (items 4 and 5), it is the case that
\[ a \in V(M_{c_i} \cup A \cup B \cup L'). \]

In any case, we have $a \in V(J_i)$. This completes the proof.

It remains to prove proposition 4.15.

Proof of proposition 4.15. The present situation is depicted in figure 2. This is a snapshot of the moment just before we apply step $i + 1$. Step $i$ of the prune procedure simply prunes the subtree rooted at the $S_{b_i}$-labeled node $v$, and links (rejustifies) $v$ using the $S_{c_i}$-labeled node $w$, where $b_i$ and $c_i$ belong to the same equivalence class in $J_i$. It is important to note that some cousin (see footnote 5) $w_3$ of $v$ might also be linked to a root $w_6$ of another tree, which in turn might be linked to a leaf node $w_2$ of another tree, which in turn might have a cousin $w_1$ that satisfies the same property as $w_3$ and so on.

Footnote 5. A cousin of a node is a node of the same tree and level.

[Fig. 2. $v$ is the $S_{b_i}$-labeled node whose contents are to be changed by $S_{c_i}[b_i/c_i]$. The node $w$ is $S_{c_i}$-labeled, and will be “linked” to node $v$ after step $i + 1$ of the prune procedure is finished, signified by the dotted line. Solid lines represent links that have been established in step $j < i + 1$ of the procedure.]
Furthermore, the node $w$ might have also been linked to some other leaf $w_4$ that has a cousin $w_5$ that is connected to a root of some other tree, and so on. Note that it is impossible for two leafs of a tree to be linked to the same root node of a tree by construction. Hence, the three trees in the middle (i.e. where $v$, $w$, and $w_6$ are located) are necessarily distinct. The leftmost and rightmost tree might be the same tree depending on the value of the girth $g$ that we defined earlier.

Let us now define
\[ A := \{ t \in L : d_{G(J_i)}(t, b_i) \le m \} \qquad B := \{ t \in L : d_{G(J_i)}(t, S_{c_i}) \le m \}. \]
Intuitively, the set $A$ contains tuples up to distance $m$ from the label $S_{b_i}$ of node $v$ in $J_i$, while the set $B$ contains tuples up to distance $m$ from the label $S_{c_i}$ of $w$ in $J_i$. Note that this is distance in the structure $J_i$, not $J_{i+1}$. It is immediate that we have property (2) $A \cup B = L$, as the length of the view $V$ is at most $m$ and $\mathit{adom}(M_{b_i}) \subseteq \{b_i\} \cup \mathit{adom}(S_{c_i})$. So, it is sufficient to show that properties 3 and 5 are satisfied, as they obviously imply properties 1 and 4. Note that our construction has ensured that:

(1) Two nodes in any given tree in $K_i$ that are at least distance two apart cannot share a constant.
(2) Two trees $T$ and $T'$ in $K_i$ cannot share a constant except on: (i) a unique leaf of $T$ and the root of $T'$, as is the case for $v$ and $w$ in Figure 2, or alternatively (ii) a unique leaf of $T$ and a unique leaf of $T'$. This case can happen when both leafs are connected to the root of a different tree $T''$, as is the situation for $w_2$ and $w_3$ in Figure 2.

Therefore, for some sufficiently large constant $c' \in \mathbb{N}$, two nodes $v'$ and $v''$ in $K_i$ of distance $c'm$ cannot have two elements of $J_i$ that are of distance $\le m$ in $G(J_i)$. [In fact, a careful analysis will show that $c' = 1$ is sufficient.] Therefore, the locations of the constants in $A$ (resp.
$B$) cannot be “very far away” from the node $v$ (resp. $w$). In fact, if we set $c \ge c'$ (recall that $g = cm$) and consider the path $P$ between a tuple $t \in A$ and the constant $b_i$ (which belongs to $v$ and its parent), it cannot connect a root and a leaf of the same tree (i.e. through the body of the tree). So, either it is completely contained in the tree of which $v$ is a leaf, or it has to alternate between leafs and roots several times, and then end in some tree. In figure 2, we may pick the following example
\[ v \to^{*} w_3 \to w_6 \to w_2 \to^{*} w_1 \to \ldots, \]
where we use the notation $\to^{*}$ to mean “path in the same tree”. The same analysis can be applied to determine the locations of the tuples of $B$. Therefore, in order to ensure that properties 3 and 5 are satisfied, we just need to ensure that the height of each tree and the girth of $K_i$ are large enough, which can be done by taking a sufficiently large $c$. When the girth (as ensured in the copy and prune procedures) is sufficiently large, we can be sure that no paths of length $\le m$ exist between $v$ and $w$ in $K_i$. [In fact, a careful but tedious analysis shows that $c = 1$ is sufficient.]

Theorem 4.1 also holds for infinite models, since even if the initial justification hierarchies are infinite, the proof method used is unchanged. We thus also obtain finite controllability (every satisfiable formula is finitely satisfiable) for UCV.

Proposition 4.16. The UCV class of formulas is finitely controllable.

5. EXTENDING THE VIEW DEFINITIONS

The previous section showed that the first order language using unary conjunctive view definitions is decidable. A natural way to increase the power of the language is to make view bodies more expressive (but retain unary arity for the views).
We saw earlier that allowing unary views to use disjunction in their definition does not actually increase expressiveness of the UCV language and hence this case is decidable. Unfortunately, as we will show, employing other ways of extending the views results in satisfiability becoming undecidable.

The first extension we consider is allowing inequality in the views, e.g.,
\[ V(x) \leftarrow R(x, y), S(x, x), x \neq y. \]
Call the first order language over such views the first order unary conjunctive$^{\neq}$ view language. In fact, this language allows us to check whether a two counter machine computation is valid and terminates, which thus leads to the following result:

Theorem 5.1. Satisfiability is undecidable for the first order unary conjunctive$^{\neq}$ view query language.

Proof. The proof is by a reduction from the halting problem of two counter machines (2CM's) starting with zero in the counters. Given any description of a 2CM and its computation, we can show how to a) encode this description in database relations and b) define queries to check this description. We construct a query which is satisfiable iff the 2CM halts. The basic idea of the simulation is similar to one in [Levy et al. 1993], but with the major difference that cycles are allowed in the successor relation, though there must be at least one good chain.

A two-counter machine is a deterministic finite state machine with two non-negative counters. The machine can test whether a particular counter is empty or non-empty. The transition function has the form
\[ \delta : S \times \{=, >\} \times \{=, >\} \to S \times \{\mathit{pop}, \mathit{push}\} \times \{\mathit{pop}, \mathit{push}\}. \]
For example, the statement $\delta(4, =, >) = (2, \mathit{push}, \mathit{pop})$ means that if we are in state 4 with counter 1 equal to 0 and counter 2 greater than 0, then go to state 2 and add one to counter 1 and subtract one from counter 2.
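A small simulator may help fix the semantics of such transition functions. This is a sketch under stated assumptions rather than anything from the reduction itself: the function name, the dictionary encoding of $\delta$, the step limit, and the toy machine are all hypothetical. Each configuration is recorded as a tuple (time, state, counter 1, counter 2).

```python
def run_2cm(transitions, halting_state, max_steps=1000):
    """Simulate a two-counter machine started in state 0 with both counters 0.
    transitions maps (state, test1, test2) to (new_state, op1, op2), where
    tests are "=" (counter is 0) or ">" (counter is positive) and ops are
    "push" (add one) or "pop" (subtract one).
    Returns the list of configurations (t, state, c1, c2) that were visited."""
    state, c1, c2 = 0, 0, 0
    trace = [(0, state, c1, c2)]
    for t in range(1, max_steps + 1):
        if state == halting_state:
            break
        key = (state, ">" if c1 > 0 else "=", ">" if c2 > 0 else "=")
        if key not in transitions:
            break  # no applicable transition: the machine is stuck
        state, op1, op2 = transitions[key]
        c1 += 1 if op1 == "push" else -1
        c2 += 1 if op2 == "push" else -1
        if c1 < 0 or c2 < 0:
            raise ValueError("pop applied to an empty counter")
        trace.append((t, state, c1, c2))
    return trace

# Hypothetical toy machine: push both counters, pop both, then halt in state 2.
toy = {
    (0, "=", "="): (1, "push", "push"),
    (1, ">", ">"): (2, "pop", "pop"),
}
trace = run_2cm(toy, halting_state=2)
```

The step limit is only a practical guard; halting of an arbitrary two-counter machine is undecidable, which is exactly what the reduction exploits.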
The computation of the machine is stored in the relation $\mathit{config}(t, s, c_1, c_2)$, where $t$ is the time, $s$ is the state and $c_1$ and $c_2$ are values of the counters. The states of the machine can be described by integers $0, 1, \ldots, h$ where 0 is the initial state and $h$ the halting (accepting) state. The first configuration of the machine is $\mathit{config}(0, 0, 0, 0)$ and thereafter, for each move, the time is increased by one and the state and counter values changed in correspondence with the transition function.

We will use some relations to encode the computation of 2CMs starting with zero in the counters. These are:

—$S_0, \ldots, S_h$: each contains a constant which represents that particular state.
—$\mathit{succ}$: the successor relation. We will make sure it contains one chain starting from $\mathit{zero}$ and ending at $\mathit{last}$ (but it may in addition contain unrelated cycles).
—$\mathit{config}$: contains the computation of the 2CM.
—$\mathit{zero}$: contains the first constant in the chain in $\mathit{succ}$. This constant is also used as the number zero.
—$\mathit{last}$: contains the last constant in the chain in $\mathit{succ}$.

Note that we sometimes blur the distinction between unary relations and unary views, since a view $V$ can simulate a unary relation $U$ if it is defined by $V(x) \leftarrow U(x)$.

The unary and nullary views (the latter can be eliminated using quantified unary views) are:

—$\mathit{halt}$: true if the machine halts.
—$\mathit{bad}$: true if the database doesn't correctly describe the computation of the 2CM.
—$\mathit{dsucc}$: contains all constants in $\mathit{succ}$.
—$\mathit{dT}$: contains all time stamps in $\mathit{config}$.
—$\mathit{dP}$: contains all constants in $\mathit{succ}$ with predecessors.
—$\mathit{dCol}_1$, $\mathit{dCol}_2$: are projections of the first and second columns of $\mathit{succ}$.

When defining the views, we also state some formulas (such as $\mathit{hasPred}$) over the views which will be used to form our first order sentence over the views.
—The “domain” views (those starting with the letter $d$) are easy to define, e.g.
    $\mathit{dP}(x) \leftarrow \mathit{succ}(z, x)$
    $\mathit{dCol}_1(x) \leftarrow \mathit{succ}(x, y)$
    $\mathit{dCol}_2(x) \leftarrow \mathit{succ}(y, x)$
—$\mathit{hasPred}$ says “each nonzero constant in $\mathit{succ}$ has a predecessor”:
    $\mathit{hasPred}: \forall x\, (\mathit{dsucc}(x) \Rightarrow (\mathit{zero}(x) \vee \mathit{dP}(x)))$
—$\mathit{sameDom}$ says “the constants used in $\mathit{succ}$ and the timestamps in $\mathit{config}$ are the same set”:
    $\mathit{sameDom}: \forall x\, (\mathit{dsucc}(x) \Rightarrow \mathit{dT}(x)) \wedge \forall y\, (\mathit{dT}(y) \Rightarrow \mathit{dsucc}(y))$
—$\mathit{goodzero}$ says “the zero occurs in $\mathit{succ}$”:
    $\mathit{goodzero}: \forall x\, (\mathit{zero}(x) \Rightarrow \mathit{dsucc}(x))$
—$\mathit{nempty}$: each of the domains and unary base relations is not empty:
    $\mathit{nempty}: \exists x\, (\mathit{dsucc}(x))$
—Check that each constant in $\mathit{succ}$ has at most one successor and at most one predecessor and that it has no cycles of length 1.
    $\mathit{bad} \leftarrow \mathit{succ}(x, y), \mathit{succ}(x, z), y \neq z$
    $\mathit{bad} \leftarrow \mathit{succ}(y, x), \mathit{succ}(z, x), y \neq z$
    $\mathit{bad} \leftarrow \mathit{succ}(x, x)$
  Note that the first two of these rules could be enforced by database style functional dependencies $x \to y$ and $y \to x$ on $\mathit{succ}$.
—Check that every constant in the chain in $\mathit{succ}$ which isn't the last one must have a successor:
    $\mathit{hassuccnext}: \forall y\, (\mathit{dCol}_2(y) \Rightarrow (\mathit{last}(y) \vee \mathit{dCol}_1(y)))$
—Check that the last constant has no successor and zero (the first constant) has no predecessor.
    $\mathit{bad} \leftarrow \mathit{last}(x), \mathit{succ}(x, y)$
    $\mathit{bad} \leftarrow \mathit{zero}(x), \mathit{succ}(y, x)$
—Check that every constant eligible to be in $\mathit{last}$ and $\mathit{zero}$ must be so.
    $\mathit{eligiblezero}: \forall y\, (\mathit{dCol}_1(y) \Rightarrow (\mathit{dCol}_2(y) \vee \mathit{zero}(y)))$
    $\mathit{eligiblelast}: \forall y\, (\mathit{dCol}_2(y) \Rightarrow (\mathit{dCol}_1(y) \vee \mathit{last}(y)))$
—Each $S_i$ and $\mathit{zero}$ and $\mathit{last}$ contain $\le 1$ element.
    bad ← S_i(x), S_i(y), x ≠ y
    bad ← zero(x), zero(y), x ≠ y
    bad ← last(x), last(y), x ≠ y
—Check that S_i, S_j, last, zero are disjoint (0 ≤ i < j ≤ h):
    bad ← zero(x), last(x)
    bad ← S_i(x), S_j(x)
    bad ← zero(x), S_i(x)
    bad ← last(x), S_i(x)
—Check that the timestamp is the key for config. There are three rules, one for the state and two for the two counters; the one for the state is:
    bad ← config(t, s, c1, c2), config(t, s′, c1′, c2′), s ≠ s′
—Check the configuration of the 2CM at time zero. config must have a tuple at (0, 0, 0, 0) and there must not be any tuples in config with a zero state and nonzero times or counters.
    V_zs(s) ← zero(t), config(t, s, x, y)
    V_zc1(c) ← zero(t), config(t, x, c, y)
    V_zc2(c) ← zero(t), config(t, x, y, c)
    V_ys(t) ← zero(s), config(t, s, x, y)
    V_yc1(c1) ← zero(s), config(t, s, c1, x)
    V_yc2(c2) ← zero(s), config(t, s, x, c2)
    goodconfigzero: ∀x ((V_zs(x) ⇒ S_0(x)) ∧ ((V_zc1(x) ∨ V_zc2(x) ∨ V_ys(x) ∨ V_yc1(x) ∨ V_yc2(x)) ⇒ zero(x)))
—For each tuple in config at time t which isn't the halt state, there must also be a tuple at time t + 1 in config.
    V_1(t) ← config(t, s, c1, c2), S_h(s)
    V_2(t) ← succ(t, t2), config(t2, s′, c1′, c2′)
    hasconfignext: ∀t ((dT(t) ∧ ¬V_1(t)) ⇒ V_2(t))
—Check that the transitions of the 2CM are followed. For each transition δ(j, >, =) = (k, pop, push), we include three rules, one for checking the state, one for checking the first counter and one for checking the second counter.
 For the transition in question we have, for checking the state,
    V_δ(t′) ← config(t, s, c1, c2), succ(t, t′), S_j(s), succ(x, c1), zero(c2)
    V_δs(s) ← V_δ(t), config(t, s, c1, c2)
    goodstate_δ: ∀s (V_δs(s) ⇔ S_k(s))
 and for the first counter, we (i) find all the times where the transition is definitely correct for the first counter
    Q1_δ(t′) ← config(t, s, c1, c2), succ(t, t′), S_j(s), succ(x, c1), zero(c2), succ(c1″, c1), config(t′, s′, c1″, c2′)
 (ii) find all the times where the transition may or may not be correct for the first counter
    Q2_δ(t′) ← config(t, s, c1, c2), succ(t, t′), S_j(s), succ(x, c1), zero(c2)
 and make sure Q1_δ and Q2_δ are the same
    goodtrans_δc1: ∀t (Q1_δ(t) ⇔ Q2_δ(t))
 Rules for the second counter are similar. For transitions δ1, δ2, ..., δk, the combination can be expressed thus:
    goodstate: goodstate_δ1 ∧ goodstate_δ2 ∧ ... ∧ goodstate_δk
    goodtransc1: goodtrans_δ1c1 ∧ goodtrans_δ2c1 ∧ ... ∧ goodtrans_δkc1
    goodtransc2: goodtrans_δ1c2 ∧ goodtrans_δ2c2 ∧ ... ∧ goodtrans_δkc2
—Check that the halting state is in config.
    hlt(t) ← config(t, s, c1, c2), S_h(s)
    halt: ∃x hlt(x)

Given these views, we claim that satisfiability is undecidable for the query
    ψ = ¬bad ∧ hasPred ∧ sameDom ∧ halt ∧ goodzero ∧ goodconfigzero ∧ nempty ∧ hassuccnext ∧ eligiblezero ∧ eligiblelast ∧ goodstate ∧ goodtransc1 ∧ goodtransc2 ∧ hasconfignext

The second extension we consider is to allow "safe" negation in the conjunctive views, e.g.
    V(x) ← R(x, y), R(y, z), ¬R(x, z)
Call the first order language over such views the first order unary conjunctive¬ view language. It is also undecidable, by a result in [Bailey et al. 1998].

Theorem 5.2. [Bailey et al.
1998] Satisfiability is undecidable for the first order unary conjunctive¬ view query language.

A third possibility for increasing the expressiveness of views would be to keep the body as a pure conjunctive query, but allow views to have binary arity, e.g.
    V(x, y) ← R(x, y)
This doesn't yield a decidable language either, since this language has the same expressiveness as first order logic over binary relations, which is known to be undecidable [Börger et al. 1997].

Proposition 5.3. Satisfiability is undecidable for the first order binary conjunctive view language.

A fourth possibility is to use unary conjunctive views, but allow recursive view definitions, e.g.
    V(x) ← edge(x, y)
    V(x) ← V(x) ∧ edge(y, x)
Call this the first order unary conjunctive rec language. This language is undecidable also.

Theorem 5.4. Satisfiability is undecidable for the first order unary conjunctive rec view language.

Proof (sketch). The proof of Theorem 5.1 can be adapted by removing inequality and instead using recursion to ensure there exists a connected chain in succ. It then becomes more complicated, but the main property needed is that zero is connected to last via the constants in succ. This can be expressed by
    connzero(x) ← zero(x)
    connzero(x) ← connzero(y), succ(y, x)
    ∃x (last(x) ∧ connzero(x))

6. APPLICATIONS

6.1 Reasoning Over Ontologies

A currently active area of research is that of reasoning over ontologies (see e.g. [Horrocks 2005]). The aim here is to use decidable query languages for accessing and reasoning about information and structure for the Semantic Web. In particular, ontologies provide vocabularies which can define relationships or associations between various concepts (classes) and also properties that link different classes together.
Description logics are a key tool for reasoning over schemas and ontologies and to this end, a considerable number of different description logics have been developed. To illustrate some reasoning over a simple ontology, we adopt an example from [Horrocks et al. 2003], describing people, countries and some relationships. This example can be encoded in a description logic such as SHIQ and also in the UCV query language. We show how to accomplish the latter.
—Define classes such as Country, Person, Student and Canadian. These are just unary views defined over unary relations, e.g. Country(x) ← country(x). Observe that we can blur the distinction between unary views and unary relations and use them interchangeably.
—State that Student is a subclass of Person.
    ∀x (Student(x) ⇒ Person(x))
—State that Canada and England are both instances of the class Country. To accomplish this in the UCV language, we could define Canada and England as unary views and ensure that they are contained in the Country relation and are disjoint with all other classes/instances.
—Declare Nationality as a property relating the classes Person (its domain) and Country (its range). In the UCV language, we could model this as a binary relation Nationality(x, y) and impose constraints on its domain and range, e.g.
    domNationality(x) ← Nationality(x, y)
    rangeNationality(y) ← Nationality(x, y)
    ∀x (domNationality(x) ⇒ Person(x))
    ∀x (rangeNationality(x) ⇒ Country(x))
—State that Country and Person are disjoint classes.
    ∀x (Country(x) ⇒ ¬Person(x))
—Assert that the class Stateless is defined precisely as those members of the class Person that have no values for the property Nationality.
    hasNationality(x) ← Nationality(x, y)
    Stateless(x) ⇔ Person(x) ∧ ¬hasNationality(x)

The above types of statements are reasonably simple to express. In order to achieve more expressiveness, property chaining and property composition have been identified as important reasoning features. To this end, integration of rule-based KR and DL-based KR is an active area of research. The UCV query language has the advantage of being able to express certain types of property chaining that are not expressible in the description logic SHIQ, which cannot accomplish chaining [Horrocks et al. 2003]. For example
—An uncle is precisely a parent's brother.
    uncle1(z) ← parent(x, y), brother(x, z)
    uncle2(z) ← parent(x, y), brother(z, x)
    uncle(z) ⇔ uncle1(z) ∨ uncle2(z)

We consequently believe the UCV query language has some intriguing potential to be used as a reasoning component for ontologies, possibly to supplement description logics for some specialized applications. We leave this as an open area for future investigation.

6.2 Containment and Equivalence

We now briefly examine the application of our results to query containment. Theorem 4.1 implies we can test, in 2-NEXPTIME, whether Q1(x) ⊆ Q2(x) under the constraints C1 ∧ C2 ∧ ... ∧ Cn, where Q1, Q2, C1, ..., Cn are all first order unary conjunctive view queries. This just amounts to testing whether the sentence ∃x (Q1(x) ∧ ¬Q2(x)) ∧ C1 ∧ ... ∧ Cn is unsatisfiable. Equivalence of Q1(x) and Q2(x) can be tested with containment tests in both directions. Of course, we can also show that testing the containment Q1 ⊆ Q2 is undecidable if Q1 and Q2 are first order unary conjunctive≠ view queries, first order unary conjunctive¬ view queries, or first order unary conjunctive rec view queries.
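The reduction from containment to unsatisfiability can be illustrated on a single finite database. The sketch below is not the decision procedure behind Theorem 4.1: it only evaluates unary conjunctive views by brute force on one instance and checks whether that instance is a model of ∃x (Q1(x) ∧ ¬Q2(x)) ∧ C1 ∧ ... ∧ Cn, that is, a counterexample to containment. The names eval_view and witnesses_non_containment, the view brother_of and the sample data are assumptions made for illustration.

```python
from itertools import product
from typing import Dict, Iterable, List, Set, Tuple

Atom = Tuple[str, Tuple[str, ...]]          # (relation name, argument variables)
Database = Dict[str, Set[Tuple[str, ...]]]  # relation name -> set of tuples


def eval_view(head: str, body: List[Atom], db: Database) -> Set[str]:
    """Evaluate the unary conjunctive view  V(head) <- body  by brute force."""
    domain = sorted({c for rows in db.values() for row in rows for c in row})
    variables = sorted({v for _, args in body for v in args})
    answers: Set[str] = set()
    for values in product(domain, repeat=len(variables)):
        env = dict(zip(variables, values))
        if all(tuple(env[v] for v in args) in db[rel] for rel, args in body):
            answers.add(env[head])
    return answers


def witnesses_non_containment(q1: Set[str], q2: Set[str],
                              constraints: Iterable[bool] = ()) -> bool:
    """True if this instance satisfies  ∃x (Q1(x) ∧ ¬Q2(x)) ∧ C1 ∧ ... ∧ Cn."""
    return bool(q1 - q2) and all(constraints)


# Hypothetical sample data: brother(x, z) is read as "z is a brother of x".
db: Database = {
    "parent": {("ann", "bob")},
    "brother": {("ann", "carl"), ("dave", "ed")},
}
uncle1 = eval_view("z", [("parent", ("x", "y")), ("brother", ("x", "z"))], db)
brother_of = eval_view("z", [("brother", ("x", "z"))], db)

print(sorted(uncle1))      # ['carl']
print(sorted(brother_of))  # ['carl', 'ed']
print(witnesses_non_containment(brother_of, uncle1))  # True: 'ed' is a witness
print(witnesses_non_containment(uncle1, brother_of))  # False on this instance
```

Finding a witness on one instance refutes a containment, but failing to find one on a particular instance proves nothing; establishing containment requires showing that no model of the sentence exists, which is what the decidability result provides.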
Containment of queries with negation was first considered in [Sagiv and Yannakakis 1980]. There it was essentially shown that the problem is decidable for queries which do not apply projection to subexpressions with difference. Such a language is disjoint from ours, since it cannot express a sentence such as ∃y V4(y) ∧ ¬∃x (V1(x) ∧ ¬V2(x)) where V1 and V2 are views defined over several variables.

6.3 Inclusion Dependencies

Unary inclusion dependencies were identified as useful in [Cosmadakis et al. 1990]. They take the form R[x] ⊆ S[y]. If we allow R and S above to be unary conjunctive view queries, we could obtain unary conjunctive view containment dependencies. Observe that the unary views are actually unary projections of the join of one or more relations. We can also define a special type of dependency called a proper first order unary conjunctive inclusion dependency, having the form Q1(x) ⊂ Q2(x), where Q1 and Q2 are first order unary conjunctive view queries with one free variable. If {d1, ..., dk} is a set of such dependencies, then it is straightforward to test whether they imply another dependency dx, by testing the satisfiability of an appropriate first order unary conjunctive view query.

Theorem 6.1. Implication for the class of unary conjunctive view containment dependencies with subset and proper subset operators is i) decidable in 2-NEXPTIME and ii) finitely controllable.

The results from [Cosmadakis et al. 1990] show that implication is decidable in polynomial time, but not finitely controllable, for either of the combinations i) functional dependencies plus unary inclusion dependencies, ii) full implication dependencies plus unary inclusion dependencies.
In contrast, the stated complexity in the above theorem is much higher, due to the increased expressiveness of the dependencies, yet interestingly the class is finitely controllable. We might also consider unary conjunctive≠ containment dependencies. The tests in the proof of Theorem 5.1 for the 2CM can be written in the form Q1(x) ⊆ Q2(x), with the exception of the non-emptiness constraints, which must use the proper subset operator. Interestingly also, we can see from the proof of Theorem 5.1 that adding the ability to express functional dependencies would also result in undecidability. We can summarise these observations in the following theorem and its corollary.

Theorem 6.2. Implication is undecidable for unary conjunctive≠ (or conjunctive¬) view containment dependencies with the subset and the proper subset operators.

Corollary 6.3. Implication is undecidable for the combination of unary conjunctive view containment dependencies plus functional dependencies.

6.4 Active Rule Termination

The languages in this paper have their origins in [Bailey et al. 1998], where active database rule languages based on views were studied. The decidability result for first order unary conjunctive views can be used to positively answer an open question raised in [Bailey et al. 1998], which essentially asked whether termination is decidable for active database rules expressed using unary conjunctive views.

7. EXPRESSIVE POWER OF THE UCV LANGUAGE

As we have seen in the previous sections, the logic UCV is quite suitable to reason about hereditary information such as "x is a grandchild of y" over family trees. This is due to the fact that UCV can express the existence of a directed walk of length k in the graph, for any fixed positive integer k. Therefore, it is natural to also ask what is inexpressible in the logic.
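As a concrete reading of the walk property just mentioned, the following sketch builds the body of a unary conjunctive view W_k(x0) ← E(x0, x1), ..., E(x_{k-1}, x_k) and computes the vertices that satisfy it on a finite graph. The names walk_view_body and starts_walk_of_length, and the toy family data with edges pointing from child to parent, are assumptions made for illustration only.

```python
from typing import List, Set, Tuple

Edge = Tuple[str, str]


def walk_view_body(k: int) -> List[Tuple[str, Tuple[str, str]]]:
    """Atoms of the view  W_k(x0) <- E(x0, x1), ..., E(x_{k-1}, x_k)."""
    return [("E", (f"x{i}", f"x{i + 1}")) for i in range(k)]


def starts_walk_of_length(edges: Set[Edge], k: int) -> Set[str]:
    """Vertices x0 satisfying W_k, i.e. starting a directed walk with k edges."""
    current = {v for edge in edges for v in edge}  # every vertex starts a walk of length 0
    for _ in range(k):
        # keep the sources of edges that lead into a vertex starting a shorter walk
        current = {u for (u, v) in edges if v in current}
    return current


# Hypothetical family tree: an edge (c, p) means "p is a parent of c".
edges = {("carol", "bob"), ("bob", "alice")}

print(walk_view_body(2))                         # [('E', ('x0', 'x1')), ('E', ('x1', 'x2'))]
print(sorted(starts_walk_of_length(edges, 1)))   # ['bob', 'carol']
print(sorted(starts_walk_of_length(edges, 2)))   # ['carol']
print(sorted(starts_walk_of_length(edges, 3)))   # []
```

With this encoding, the sentence ∃x W_2(x) holds exactly when some vertex has a grandparent. The value of k is fixed in the view definition, so each k needs its own view.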
In this section, we describe a game-theoretic technique for proving inexpressibility results for UCV. First, we show an easy adaptation of Ehrenfeucht-Fraïssé games for proving that a boolean query is inexpressible in UCV(σ, 𝒱) for a signature σ and a finite view set 𝒱 over σ. Second, we extend this result for proving that a boolean query is inexpressible in UCV(σ). An inexpressibility result of the second kind is clearly more interesting, as it is independent of our choice of the view set 𝒱 over σ. Moreover, such a result places an ultimate limit on what can be expressed by UCV queries. Although it can be adapted to any class C of structures, we shall only state our theorem for proving inexpressibility results in UCV over all finite structures. For this section only, we shall use STRUCT(σ) to denote the set of all finite σ-structures.

Our first goal is quite easy to achieve. Recall that each view set 𝒱 over σ induces a mapping Λ : STRUCT(σ) → STRUCT(𝒱) as defined in section 2.

Theorem 7.1. Let A, B ∈ STRUCT(σ). Define the function Λ : STRUCT(σ) → STRUCT(𝒱). Then, the following statements are equivalent:
(1) A and B agree on UCV(σ, 𝒱).
(2) Λ(A) ≡_1^{UFO(𝒱)} Λ(B) (i.e. they agree on UFO(𝒱) formulas of quantifier rank 1).

Proof. Immediate from Lemma 2.1 and Lemma 3.1.

So, to prove that a boolean query Q is not expressible in UCV(σ, 𝒱), it suffices to find two σ-structures A and B such that Λ(A) ≡_1^{UFO(𝒱)} Λ(B), but A and B do not agree on Q. In turn, to show that Λ(A) ≡_1^{UFO(𝒱)} Λ(B), we can use Ehrenfeucht-Fraïssé games.

We now turn to the second task. Let us begin by stating an obvious corollary of the preceding theorem.

Corollary 7.2. Let A, B ∈ STRUCT(σ). For any view set 𝒱, define the function Λ_𝒱 : STRUCT(σ) → STRUCT(𝒱).
Then, the following statements are equivalent:
(1) A and B agree on UCV(σ).
(2) For any view set 𝒱 over σ, we have Λ_𝒱(A) ≡_1^{UFO(𝒱)} Λ_𝒱(B).

This corollary is not of immediate use. Namely, checking the second statement is a daunting task, as there are infinitely many possible view sets 𝒱 over σ. Instead, we shall propose a sufficient condition for this, which employs the easy direction of the well-known homomorphism preservation theorem (see [Hodges 1997]).

Definition 7.3. A formula φ over a vocabulary σ is said to be preserved under homomorphisms, if for any A, B ∈ STRUCT(σ) the following statement holds: whenever a := (a1, ..., am) ∈ φ(A) and h is a homomorphism from A to B, it is the case that h(a) := (h(a1), ..., h(am)) ∈ φ(B).

Lemma 7.4. Conjunctive queries are preserved under homomorphisms.

Theorem 7.5. Let A, B ∈ STRUCT(σ). To prove that Λ(A) ≡_1^{UFO(𝒱)} Λ(B) for all σ-view sets 𝒱, it is sufficient to show that
(1) For every a ∈ A, there exists a homomorphism h from A to B and a homomorphism g from B to A such that g(h(a)) = a.
(2) For every b ∈ B, there exists a homomorphism h from A to B and a homomorphism g from B to A such that h(g(b)) = b.

Proof. Take an arbitrary σ-view set 𝒱. We use an Ehrenfeucht-Fraïssé game argument. Suppose Spoiler places a pebble on an element a of Λ(A), whose domain is A. Then, the first assumption tells us that there exist homomorphisms h : A → B and g : B → A such that g(h(a)) = a. Duplicator may respond by placing the other pebble from the same pair on the element h(a) of Λ(B). To show this, we need to prove that a ↦ h(a) defines an isomorphism between the substructures of Λ(A) and Λ(B) induced by, respectively, the sets {a} and {h(a)}. Let V ∈ 𝒱. It is enough to show that a ∈ V(A) iff h(a) ∈ V(B).
If a ∈ V(A), then we have h(a) ∈ V(B) by Lemma 7.4. Similarly, if h(a) ∈ V(B), Lemma 7.4 implies that a = g(h(a)) ∈ V(A). For the case where Spoiler plays an element of B, we can use the same argument with the aid of the second assumption above. In either case, we have Λ(A) ≡_1 Λ(B).

This theorem allows us to give easy inexpressibility proofs for a variety of first-order queries. We now give three easy inexpressibility proofs for first-order queries over directed graphs (i.e. structures with one binary relation E).

Example 7.1. We show that the formula SYM ≡ ∀x, y (E(x, y) ↔ E(y, x)) accepting graphs with symmetric E is not expressible in UCV(σ). To do this, consider the graphs A and B defined as follows
    [Figure: graphs A and B, each on the vertices a, b, c, d; the edges are not recoverable from the extracted text.]
Obviously, the graph E^A is symmetric, while E^B is not. Consider the functions h1, h2 : A → B and g : B → A defined as
—h1(a) = h1(c) = a and h1(b) = h1(d) = b,
—h2(a) = h2(c) = c and h2(b) = h2(d) = d, and
—for i ∈ B, g(i) = i.
It is easy to verify that h1 and h2 are homomorphisms from A to B, whereas g is a homomorphism from B to A. Now, for x ∈ {a, b}, we have g(h1(x)) = x and h1(g(x)) = x. For x ∈ {c, d}, we have g(h2(x)) = x and h2(g(x)) = x. So, by Theorem 7.5 and Corollary 7.2, we conclude that SYM is not expressible in UCV(σ) over all finite directed graphs.

Example 7.2. We now show that the transitivity query TRANS ≡ ∀x, y, z (E(x, y) ∧ E(y, z) → E(x, z)) is not expressible in UCV(σ). To do this, consider the graphs A and B defined as
    [Figure: graph A on the vertices 0, 1, 2 and graph B on the vertices 0, 1, 2, 3, 4, 5; the edges are not recoverable from the extracted text.]
It is obvious that A ⊨ TRANS, and it is not the case that B ⊨ TRANS.
Consider the homomorphisms h1, h2 from A to B, and the homomorphism g from B to A defined as
—for i ∈ A, h1(i) = i;
—for i ∈ A, h2(i) = i + 3; and
—for i ∈ B, g(i) = i mod 3.
Then, for i ∈ A, we have g(h1(i)) = i. Conversely, suppose that i ∈ B. If i = 0, 1, 2, then h1(g(i)) = i. Similarly, if i = 3, 4, 5, then h2(g(i)) = i. So, by Theorem 7.5 and Corollary 7.2, transitivity is not expressible in UCV(σ) over finite directed graphs.

Example 7.3. The query ∀x, y E(x, y) is also not expressible in UCV(σ). It is easy to apply Theorem 7.5 and Corollary 7.2 on the following graphs to verify this fact.
    [Figure: graphs A and B; not recoverable from the extracted text.]

8. RELATED WORK

Satisfiability of first order logic has been thoroughly investigated in the context of the classical decision problem [Börger et al. 1997]. The main thrust there has been determining for which quantifier prefixes first order languages are decidable. We are not aware of any result of this type which could be used to demonstrate decidability of the first order unary conjunctive view language. Instead, our result is best classified as a new decidable class generalising the traditional decidable unary first-order language (the Löwenheim class [Löwenheim 1915]). Use of the Löwenheim class itself for reasoning about schemas is described in [Theodoratos 1996], where applications towards checking intersection and disjointness of object oriented classes are given.

As observed earlier, description logics are important logics for expressing constraints on desired models. In [Calvanese et al. 1998], the query containment problem is studied in the context of the description logic DLR_reg. There are certain similarities between this and the first order (unary) view languages we have studied in this paper.
The key difference appears to be that although DLR_reg can be used to define view constraints, these constraints cannot express unary conjunctive views (since assertions do not allow arbitrary projection). Furthermore, DLR_reg can express functional dependencies on a single attribute, a feature which would make the UCV language undecidable (see proof of Theorem 5.1). There is a result in [Calvanese et al. 1998], however, showing undecidability for a fragment of DLR_reg with inequality, which could be adapted to give an alternative proof of Theorem 5.1 (although inequality is used there in a slightly more powerful way).

Another interesting family of decidable logics are guarded logics. The Guarded Fragment [Andreka et al. 1998] and the Loosely Guarded Fragment [Van Bentham 1997] are both logics that have the finite model property [Hodkinson 2002]. The philosophy of UCV is somewhat similar to these guarded logics, since the decidability of UCV also arises from certain restrictions on quantifier use. In terms of expressiveness though, guarded logics seem distinct from UCV formulas, not being able to express cyclic views, such as ∃x (V(x)), where V(x) ← R(x, y), R(y, z), R(z, z′), R(z′, x).

Another area of work that deals with complexity of views is the view consistency problem, with results given in [Abiteboul and Duschka 1998]. This involves determining whether there exists an underlying database instance that realises a specific (bounded) view instance. The problem we have focused on in this paper is slightly more complicated; testing satisfiability of a first order view query asks the question whether there exists an (unbounded) view instance that makes the query true.
This explains how satisfiability can be undecidable for first order unary conjunctive≠ view queries, but view consistency for non-recursive datalog≠ views is in NP. Monadic views have been recently examined in [Nash et al. 2007], where they were shown to exhibit nice properties in the context of answering and rewriting conjunctive queries using only a set of views. This is an interesting counterpoint to the result of this paper, which demonstrates how monadic views can form the basis of a decidable fragment of first order logic.

Table 1: Summary of Decidability Results for First Order View Languages
    Unary Conjunctive View        Decidable
    Unary Conjunctive∪ View       Decidable
    Unary Conjunctive≠ View       Undecidable
    Unary Conjunctive rec View    Undecidable
    Unary Conjunctive¬ View       Undecidable [Bailey et al. 1998]
    Binary Conjunctive View       Undecidable

9. SUMMARY AND FURTHER WORK

In this paper, we have introduced a new decidable language based on the use of unary conjunctive views embedded within first order logic. This is a powerful generalisation of the well known fragment of first order logic using only unary relations (the Löwenheim class). We also showed that our new class is maximal, in the sense that increasing the expressivity of views is not possible without undecidability resulting. Table 1 provides a summary of our decidability results. Note that the Unary Conjunctive∪ View language corresponds to the extension of UCV by allowing disjunction in the view definition. We feel that the decidable case we have identified is sufficiently natural and interesting to be of practical, as well as theoretical, interest.

An interesting open problem for future work is to investigate the decidability of an extension to the first order unary conjunctive view language, when equality is allowed to be used outside of the unary views (i.e. included in the first order part).
An example formula in this new language is
    ∀X, Y (V1(X) ∧ V2(Y) ⇒ X ≠ Y)
We conjecture this extended language is decidable, but do not currently have a proof.

For other future work, we believe it would be worthwhile to investigate relationships with description logics and also examine alternative ways of introducing negation into the UCV language. One possibility might be to allow views of arity zero to specify description logic like constraints, such as R1(x, y) ⊆ R2(x, y).

Finally, there is still an exponential gap between the upper bound complexity of 2-NEXPTIME and the lower bound complexity of NEXPTIME-hardness that we derived. The primary reason for this exponential blow-up is the enumeration of all subviews of the views that are present in the formula, which we need for the proof.

ACKNOWLEDGMENTS

We thank Sanming Zhou for pointing out useful references on extremal graph theory. We are grateful to Leonid Libkin for his comments on a draft of this paper.

REFERENCES

Abiteboul, S. and Duschka, O. 1998. Complexity of answering queries using materialized views. In Proceedings of the 17th ACM SIGMOD-SIGACT-SIGART Symposium on Principles of Database Systems. Seattle, Washington, 254–263.
Andreka, H., van Bentham, J., and Nemeti, I. 1998. Modal logics and bounded fragments of predicate logics. J. Philosophical Logic 27, 217–274.
Bailey, J. and Dong, G. 1999. Decidability of first-order logic queries over views. In Proceedings of the International Conference on Database Theory (ICDT). 83–99.
Bailey, J., Dong, G., and Ramamohanarao, K. 1998. Decidability and undecidability results for the termination problem of active database rules. In Proceedings of the 17th ACM SIGMOD-SIGACT-SIGART Symposium on Principles of Database Systems. Seattle, Washington, 264–273.
Boerger, E., Graedel, E., and Gurevich, Y. 1996.
The Classical Decision Problem. Springer-Verlag.
Bollobas, B. 2004. Extremal Graph Theory. Dover Publications.
Boolos, G. S., Burgess, J. P., and Jeffrey, R. C. 2002. Computability and Logic. Cambridge University Press.
Börger, E., Grädel, E., and Gurevich, Y. 1997. The Classical Decision Problem. Springer-Verlag.
Calvanese, D., De Giacomo, G., and Lenzerini, M. 1998. On the decidability of query containment under constraints. In Proceedings of the 17th ACM SIGMOD-SIGACT-SIGART Symposium on Principles of Database Systems. Seattle, Washington, 149–158.
Cosmadakis, S., Kanellakis, P., and Vardi, M. 1990. Polynomial time implication problems for unary inclusion dependencies. Journal of the ACM 37, 1, 15–46.
Diestel, R. 2005. Graph Theory. Springer-Verlag.
Enderton, H. B. 2001. A Mathematical Introduction To Logic. A Harcourt Science and Technology Company.
Gaifman, H. 1982. On local and nonlocal properties. In Logic Colloquium '81, J. Stern, Ed. North Holland, 105–135.
Garcia-Molina, H., Quass, D., Papakonstantinou, Y., Rajaraman, A., and Sagiv, Y. 1995. The tsimmis approach to mediation: Data models and language. In The Second International Workshop on Next Generation Information Technologies and Systems. Naharia, Israel.
Halevy, A. Y. 2001. Answering queries using views: A survey. VLDB Journal: Very Large Data Bases 10, 4, 270–294.
Hodges, W. 1997. A Shorter Model Theory. Cambridge University Press.
Hodkinson, I. M. 2002. Loosely guarded fragment of first-order logic has the finite model property. Studia Logica 70, 2, 205–240.
Horrocks, I. 2005. Applications of description logics: State of the art and research challenges. In Proc. of 13th International Conference on Conceptual Structures (ICCS). 78–90.
Horrocks, I., Patel-Schneider, P. F., and van Harmelen, F. 2003.
From shiq and rdf to owl: the making of a web ontology language. Journal of Web Semantics 1, 1, 7–26.
Levy, A., Mumick, I. S., Sagiv, Y., and Shmueli, O. 1993. Equivalence, query reachability, and satisfiability in datalog extensions. In Proceedings of the twelfth ACM SIGACT-SIGMOD-SIGART Symposium on Principles of Database Systems. Washington D.C., 109–122.
Levy, A., Rajaraman, A., and Ordille, J. 1996. Querying heterogeneous information sources using source descriptions. In Proceedings of 22th International Conference on Very Large Data Bases. Mumbai, India, 251–262.
Libkin, L. 2004. Elements of Finite Model Theory. Springer-Verlag.
Löwenheim, L. 1915. Über möglichkeiten im relativkalkul. Math. Annalen 76, 447–470.
Nash, A., Segoufin, L., and Vianu, V. 2007. Determinacy and rewriting of conjunctive queries using views: A progress report. In Proceedings of the International Conference on Database Theory. 59–73.
Sagiv, Y. and Yannakakis, M. 1980. Equivalences among relational expressions with the union and difference operators. Journal of the ACM 27, 4, 633–655.
Theodoratos, D. 1996. Deductive object oriented schemas. In Proceedings of ER'96, 15th International Conference on Conceptual Modeling. 58–72.
Ullman, J. D. 1997. Information integration using logical views. In Proceedings of the Sixth International Conference on Database Theory, LNCS 1186. Delphi, Greece, 19–40.
Van Bentham, J. 1997. Dynamic bits and pieces. Tech. Rep. ILLC Research Report LP-97-01, University of Amsterdam.
Widom, J. 1995. Research problems in data warehousing. In Proceedings of the 4th International Conference on Information and Knowledge Management. Baltimore, Maryland, 25–30.
