Formal Component-Based Semantics

One of the proposed solutions for improving the scalability of semantics of programming languages is Component-Based Semantics, introduced by Peter D. Mosses. It is expected that this framework can also be used effectively for modular meta theoretic …

Authors: Ken Madlener (Radboud University Nijmegen, The Netherlands)

M.A. Reniers, P. Sobocinski (Eds.): Workshop on Structural Operational Semantics 2011 (SOS 2011), EPTCS 62, 2011, pp. 17–29, doi:10.4204/EPTCS.62.2. © Madlener, Smetsers & van Eekelen. This work is licensed under the Creative Commons Attribution License.

Ken Madlener, Institute for Computing and Information Sciences (iCIS), Radboud University Nijmegen, The Netherlands, k.madlener@cs.ru.nl
Sjaak Smetsers, Institute for Computing and Information Sciences (iCIS), Radboud University Nijmegen, The Netherlands, s.smetsers@cs.ru.nl
Marko van Eekelen, Institute for Computing and Information Sciences (iCIS), Radboud University Nijmegen, The Netherlands; School of Computer Science, Open University of the Netherlands, m.vaneekelen@cs.ru.nl

One of the proposed solutions for improving the scalability of semantics of programming languages is Component-Based Semantics, introduced by Peter D. Mosses. It is expected that this framework can also be used effectively for modular meta theoretic reasoning. This paper presents a formalization of Component-Based Semantics in the theorem prover Coq. It is based on Modular SOS, a variant of SOS, and makes essential use of dependent types, while profiting from type classes. This formalization constitutes a contribution towards modular meta theoretic formalizations in theorem provers. As a small example, a modular proof of determinism of a mini-language is developed.

1 Introduction

Theorem prover formalization of programming language meta theory and semantics receives a lot of attention. Most notably, the POPLmark Challenge [1] calls for experiments on verifications of meta theory and semantics using proof tools. One of the main issues that programming language formalizations have to cope with is the lack of reusability of existing work.
Many programming languages have language constructs in common, but often have (slight) differences in their precise semantics (e.g. assignments in C versus assignments in Java). Component-Based Semantics, introduced by Peter D. Mosses, aims to resolve this reusability issue by constructing language descriptions from combinations of basic abstract constructs [9]. Basic constructs are supposed to have a fixed meaning and be language-independent. As an example, the basic construct of conditional expressions should not depend on whether the expressions may have side-effects or not, terminate abruptly or even interact with other processes. One could even go as far as creating a repository of constructs that may be freely combined to build new languages. This repository is therefore necessarily open-ended, enabling users to add newly discovered basic constructs.

Modular Structural Operational Semantics (MSOS) [7], a variant of SOS, provides an adequate framework for the independent description of language components [9]. MSOS was designed to address the lack of reusability of SOS rules: every auxiliary entity used in a rule, such as an environment or a store, needs to be threaded through all rules of the language. MSOS provides a way to automatically propagate unmentioned entities between the premise(s) and conclusion of a rule, enabling the reuse of rules in different languages.

SOS is very suitable for the formalization of languages and has therefore been widely adopted by the theorem prover community. MSOS has so far received less attention. This paper proposes a formalization of Component-Based Semantics based on MSOS in the theorem prover Coq [15].
Our main contribution is a way to constructively formalize programming language semantics: basic constructs can be developed in separate Coq files, which may be verified independently. The formalization has been tested by building a small repository of constructs. Moreover, it is possible to equip the constructs with small proofs that can be used to construct larger proofs of properties holding for a full language. For this reason, we shall use the term component instead of construct in this paper. Our formalization supports meta theoretic reasoning about a programming language, but does not support reasoning about the format of MSOS rules.

The formalization follows the original design of MSOS in its use of arrows of a category for the auxiliary entities (encapsulated in labels) appearing in the transition rules. A very elementary level of knowledge about category theory and a modest amount of familiarity with theorem proving are required to read this paper. Our formalization makes essential use of dependent types to formalize the labels in MSOS, and profits from Coq's support for type classes. Each component is represented by a parametrized so-called Coq section. To define a full language, it is sufficient to enumerate its components. The correct instantiation of the corresponding parameters can in principle be performed automatically by Coq's powerful type system.

2 Component-Based Semantics

We illustrate the description of programming languages in terms of basic abstract constructs by means of a while-loop example taken from [9]. Depending on what concrete language is being analyzed, a standard command such as while may have different interpretations. For example, if the language includes a break command that abruptly terminates the program throwing a particular exception, then the description of while should include the handler for that exception.
We assume that $\mathit{Cmd}[\![\ ]\!]$ and $\mathit{Exp}[\![\ ]\!]$ are functions mapping concrete expressions to abstract expressions of Cmd and Exp, respectively. Below, cond-loop is a simple while-loop that takes an expression and a command, and propagates abrupt termination. The other constructs involved can be found in Table 1. The description is then:

$$\mathit{Cmd}[\![\,\mathtt{while}\ (E)\ C\,]\!] = \mathit{catch}(\mathit{cond\text{-}loop}(\mathit{Exp}[\![E]\!],\ \mathit{Cmd}[\![C]\!]),\ \mathit{abs}(\mathit{eq}(\mathit{breaking}),\ \mathit{skip}))$$

$$\mathit{Cmd}[\![\,\mathtt{break}\,]\!] = \mathit{throw}(\mathit{breaking})$$

A simple extension is a while-loop that handles continue commands. To describe such while-loops, all that is needed is to change the above example in such a way that $\mathit{Cmd}[\![C]\!]$ is encapsulated by a catch construct. Table 1 contains some possible constructs, which are used as examples throughout the rest of this paper. An example of an open-ended repository containing more constructs can be found in e.g. [9].

An important facet of Component-Based Semantics is that the construct repositories ideally contain no redundancy. If two basic constructs with different names have the exact same semantics, then one of them should be discarded. Moreover, if a construct can be expressed purely in terms of existing basic constructs, then this construct should also be discarded. A repository therefore essentially describes a universal language that can be used to define the semantics of a concrete language in question. This universal language provides a fixed name for each basic construct, which in our formalization corresponds to the name of a Coq file.

The source of the formalization can be obtained at http://www.cs.ru.nl/~kmadlene/fcbs.html.

    Syntactic Categories
      Cmd   commands
      Exp   expressions
      Dcl   declarations
      Pcd   procedure abstractions
      Prm   parameter patterns, encapsulating declarations

    Constructs
      Cmd ::= seq(Cmd, ..., Cmd)    normal command sequencing
      Cmd ::= skip                  normal termination
      Cmd ::= cond-loop(Exp, Cmd)   a simple while-loop, propagating abrupt termination
      Cmd ::= catch(Cmd, Pcd)       tries to handle abrupt termination of Cmd by procedure abstraction Pcd
      Cmd ::= throw Exp             terminates abruptly with the value of the Exp
      Pcd ::= abs(Prm, Cmd)         a parametrized procedure abstraction (with static scoping)
      Prm ::= eq Exp                a parameter that matches only the entity computed by the Exp
      Exp ::= block(Dcl, Exp)       locally binds Dcl in the Exp

Table 1: A basic repository.

In the rest of this paper we prefer to use the term component instead of construct, to emphasize that we do not only refer to syntax when we use the term component, but also to its semantics and the properties that it may be equipped with. For the semantics of each component to be language-independent, it is necessary that it does not depend on 1) auxiliary entities that are not mentioned by the component, 2) the transition relation of the full language, and 3) the abstract syntax of the full language. In our formalization we parametrize the components on these pieces of information. However, we first review MSOS, the framework our formalization is based on.

2.1 Modular SOS

In SOS, the operational semantics of a language with effects is modeled by a labeled transition system (LTS) ⟨Γ, A, →⟩, where Γ is the set of configurations, A is the set of actions, and → ⊆ Γ × A × Γ is the transition relation (sometimes called step relation). It is possible to consider more general transition systems that include terminal states, but these are only relevant when one considers computation traces, which is outside the scope of this paper. A straightforward example of a set of configurations that we will use below is Cmd × ρ × σ. We will call ρ and σ auxiliary entities, or simply entities.
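The while and break descriptions of Section 2, together with a fragment of Table 1, can be sketched in Coq as follows. This is only an illustration under stated assumptions, not part of the formalization: the source-language types `SrcExp` and `SrcCmd`, the placeholder expression `lit_true`, the functions `trExp` and `trCmd`, and the constructor name `eq_prm` (used instead of `eq`, which would shadow Coq's equality) are all hypothetical.

```coq
(* Abstract syntax: a fragment of Table 1.
   The Exp constructors are placeholders chosen for this sketch. *)
Inductive Exp : Type :=
| lit_true      (* hypothetical constant expression *)
| breaking.     (* the value thrown by break *)

Inductive Prm : Type :=
| eq_prm (e : Exp).    (* Table 1: eq Exp *)

Inductive Cmd : Type :=
| skip
| seq (c1 c2 : Cmd)
| cond_loop (e : Exp) (c : Cmd)
| catch (c : Cmd) (p : Pcd)
| throw (e : Exp)
with Pcd : Type :=
| abs (p : Prm) (c : Cmd).

(* Concrete syntax of a hypothetical source language. *)
Inductive SrcExp : Type := src_true.
Inductive SrcCmd : Type :=
| src_skip
| src_break
| src_while (e : SrcExp) (body : SrcCmd).

(* Exp[[ ]] for the hypothetical source expressions. *)
Definition trExp (e : SrcExp) : Exp :=
  match e with src_true => lit_true end.

(* Cmd[[ ]]: while is described by catch, cond-loop and abs. *)
Fixpoint trCmd (c : SrcCmd) : Cmd :=
  match c with
  | src_skip => skip
  | src_break => throw breaking
  | src_while e body =>
      catch (cond_loop (trExp e) (trCmd body))
            (abs (eq_prm breaking) skip)
  end.

(* The description of "while (true) break" unfolds as expected. *)
Example while_break :
  trCmd (src_while src_true src_break)
  = catch (cond_loop lit_true (throw breaking))
          (abs (eq_prm breaking) skip).
Proof. reflexivity. Qed.
```

Supporting continue in this sketch would only require wrapping `trCmd body` in a second `catch`, which leaves the other cases untouched.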
A drawback of SOS is its lack of support for modularity. It is sometimes necessary to update existing rules by decorating the transitions with additional entities, e.g. a second store to model a separate part of memory. If we were to add an auxiliary entity to the configurations, then this entity needs to be threaded through all the rules that define the semantics. This prevents the rules from being reusable, and therefore plain SOS is not a suitable framework for Component-Based Semantics. One can get around this problem informally, by implicitly propagating the entities that are not mentioned, using a convention under which the first rule below is abbreviated to the second:

$$\frac{\rho \vdash \langle c_1, \sigma \rangle \to \langle c_1', \sigma' \rangle}{\rho \vdash \langle \mathit{seq}(c_1, c_2), \sigma \rangle \to \langle \mathit{seq}(c_1', c_2), \sigma' \rangle} \qquad \frac{c_1 \to c_1'}{\mathit{seq}(c_1, c_2) \to \mathit{seq}(c_1', c_2)}$$

Normal command sequencing does not manipulate any of the entities and we can therefore assume that they are propagated. This informal description style enables formulation of rules independent of the auxiliary entities that may or may not be present and thereby provides reusability of the rules.

Label := { ... }

$$\mathit{seq}(\mathit{skip}, c) \to c \qquad (1)$$

$$\frac{c_1 \xrightarrow{\{X\}} c_1'}{\mathit{seq}(c_1, c_2) \xrightarrow{\{X\}} \mathit{seq}(c_1', c_2)} \qquad (2)$$

Figure 1: Normal command sequencing

Label := { ρ : env, ... }

$$\frac{d \xrightarrow{\{X\}} d'}{\mathit{block}(d, e) \xrightarrow{\{X\}} \mathit{block}(d', e)} \qquad (3)$$

$$\frac{e \xrightarrow{\{\rho = \rho_0[\rho_1],\ X\}} e'}{\mathit{block}(\rho_1, e) \xrightarrow{\{\rho = \rho_0,\ X\}} \mathit{block}(\rho_1, e')} \qquad (4)$$

$$\mathit{block}(\rho_1, v) \to v \qquad (5)$$

Figure 2: Local bindings

MSOS is a variant of SOS that has special support for the propagation of unmentioned entities. The key distinction is that it separates phrases of the language from entities by moving the entities into a label on the transition.
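To make the threading problem of plain SOS concrete, here is a minimal Coq sketch of normal command sequencing in which an environment and a store are carried explicitly through every rule. The representations of `Env` and `Store` are assumptions chosen only for illustration. Adding a further entity to this encoding would change the type of `step` and therefore every rule.

```coq
Definition Env := nat -> option nat.   (* hypothetical representation *)
Definition Store := nat -> nat.        (* hypothetical representation *)

Inductive Cmd : Type := skip | seq (c1 c2 : Cmd).

(* rho |- <c, s> --> <c', s'> : both entities appear in every rule,
   although sequencing itself never inspects them. *)
Inductive step : Env -> Cmd * Store -> Cmd * Store -> Prop :=
| step_seq_skip : forall (rho : Env) (c : Cmd) (s : Store),
    step rho (seq skip c, s) (c, s)
| step_seq_left : forall (rho : Env) (c1 c1' c2 : Cmd) (s s' : Store),
    step rho (c1, s) (c1', s') ->
    step rho (seq c1 c2, s) (seq c1' c2, s').

(* A step under a nested seq, with rho and s threaded through. *)
Example thread_through : forall (rho : Env) (s : Store),
  step rho (seq (seq skip skip) skip, s) (seq skip skip, s).
Proof.
  intros rho s. apply step_seq_left. apply step_seq_skip.
Qed.
```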
That is, transitions are of the form $\gamma \xrightarrow{\alpha} \gamma'$, such that γ and γ′ merely consist of abstract syntax (which may include computed values), and α is a label containing the auxiliary entities. Before we discuss the associated transition systems, let us consider some examples of rules specified in MSOS. Figures 1 and 2 provide examples of normal command sequencing and local bindings. The abstract syntax is standard, and the meta-variables c, d, e, ρ and v stand for commands, declarations, expressions, environments and values, respectively.

The meta-variable X plays an important rôle in the rules. It binds the unmentioned entities, allowing us to propagate them between the premise(s) and conclusion of each rule, without specifically describing what these entities are. Different occurrences of X in the same rule stand for the same entities. Note that the rules assume neither the presence nor the absence of particular auxiliary entities: the only entities that are mentioned are the ones used by the transitions in the rule in question. The Label box specifies what entities the label should at least include. Entities in labels can be matched in rules using notation such as '{ρ = ρ0[ρ1], X}', where ρ0[ρ1] stands for updating ρ0 by ρ1. Transitions without labels on them are unobservable, meaning that they implicitly assume that the entities remain unchanged during the transition (e.g. in rule (1)). As an aside, we remark that skip too is a component: it has an empty label and an empty set of rules.

Mosses [7] recognized that the arrows of a category provide an adequate mathematical structure for labels. That is, two consecutive steps are only allowed to be made when their labels are composable, i.e., $\gamma \xrightarrow{p \to q} \gamma' \xrightarrow{r \to s} \gamma''$ is only allowed if q = r.
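The way X propagates unmentioned entities in rules (4) and (5) of Figure 2 can be sketched in Coq as follows. This is a deliberately simplified illustration and not the encoding used in the formalization: a label is modelled as a plain pair of the current environment and an abstract X, and not as an arrow of a category. The predicate `unobsX`, the representation of `Env`, the `var` construct and the rule `rule_var` are hypothetical additions that give the example something to look up.

```coq
Section Block.
  Variable X : Type.             (* the unmentioned entities, kept abstract *)
  Variable unobsX : X -> Prop.   (* hypothetical: "X did not change" *)

  Definition Env := nat -> option nat.      (* hypothetical environments *)

  (* update r0 r1 plays the role of r0[r1]. *)
  Definition update (r0 r1 : Env) : Env :=
    fun n => match r1 n with
             | Some v => Some v
             | None => r0 n
             end.

  Inductive Exp : Type :=
  | val (v : nat)
  | var (n : nat)
  | block (r : Env) (e : Exp).

  (* Labels are pairs (environment, X); the rules never look inside X. *)
  Inductive step : Exp -> Env * X -> Exp -> Prop :=
  | rule4 : forall (r0 r1 : Env) (e e' : Exp) (x : X),
      step e (update r0 r1, x) e' ->
      step (block r1 e) (r0, x) (block r1 e')
  | rule5 : forall (r0 r1 : Env) (v : nat) (x : X),
      unobsX x ->
      step (block r1 (val v)) (r0, x) (val v)
  | rule_var : forall (r : Env) (n v : nat) (x : X),   (* hypothetical rule *)
      unobsX x -> r n = Some v ->
      step (var n) (r, x) (val v).

  (* A variable bound by the block is looked up in the updated environment. *)
  Example lookup_in_block : forall (r0 r1 : Env) (x : X),
    unobsX x -> r1 0 = Some 5 ->
    step (block r1 (var 0)) (r0, x) (block r1 (val 5)).
  Proof.
    intros r0 r1 x Hx H.
    apply rule4. apply rule_var.
    - exact Hx.
    - unfold update. rewrite H. reflexivity.
  Qed.
End Block.
```

Because `X` is a section variable, closing the section turns `step` into a relation that can be instantiated with any collection of further entities without touching the rules.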
Hence, the associated transition systems are triples ⟨Γ, A, →⟩ similar to LTSs, with the difference that Γ strictly consists of abstract syntax, and the additional requirement that the elements of A are the arrows of a label category (which we also denote by A). The label category is a product of elementary categories that correspond to the entities, which we will discuss in Section 4. The values of the auxiliary entities are the objects of A. As an example, a simple step with rule (1) looks as follows, if the label contains an environment and a store:

$$\mathit{seq}(\mathit{skip}, c) \xrightarrow{\langle \rho, \sigma \rangle \to \langle \rho, \sigma \rangle} c \qquad (6)$$

Identity arrows are used to express unobservability, used in e.g. rule (1).

3 Formalization

In Component-Based MSOS, the source configuration γ of a transition $\gamma \xrightarrow{\alpha} \gamma'$ plays a special rôle. Namely, it determines to which component the rule permitting that particular transition belongs. The formalization defines for each component a so-called local transition relation, which describes the rules for source configurations that belong to that particular component. Provided with the grammar of the full language, we construct the transition relation of the full language by combining the local transition relations. Each component may optionally provide a proof of a property that it satisfies; these proofs can likewise be combined to build the proof of that property about the full language (if all components satisfy that property). This will be demonstrated in Section 5.

We make use of Coq's support for type classes [13] to automatically "fill in the details", i.e. combining the components and filling in the parameters to construct the full language. Type classes, however, are not strictly necessary for the formalization.
It is possible in our formalization to construct several full languages from the same repository, but it is not possible to create an extension of an existing full language without completely specifying the extended language's grammar.

3.1 Types for transition relations

The transition relations of labeled transition systems (see Section 2.1) can be assigned the following type:

    Step Γ A : Γ → A → Γ → Prop

In other words, they are predicates which take arguments γ, α and γ′ and return an element of Prop (the built-in sort of propositional types in Coq). Just like the labeled transition systems associated with SOS specifications, there is no apparent distinction between syntax and the auxiliary entities.

Following the principles of MSOS, we update the type of Step to feature arrows of a category as labels on the transitions. Step now becomes parametric in the full label category A of the full language (which has a collection O of objects), resulting in the following type:

    Step Γ O (A : Category O) : Γ → Arrows A → Γ → Prop

We have to remark that, to avoid confusion, we are not following the exact syntax used in our formalization at this point. Moreover, we omit the definition of Category in this paper, but we elaborate on Arrows in Section 4.

Component-Based MSOS requires both a modular way to specify the step relation and a modular way to specify the abstract syntax. The component seq of Figure 1 implicitly specifies its own signature, namely the production rule Cmd ::= seq(Cmd, Cmd), and specifies two new rules. It also assumes that a syntactic category Cmd exists, and to be able to define rule (2), it assumes that a transition relation on Cmd exists. We therefore parametrize the component (i.e. its local transition relation and lemmas) with Γ, representing the syntactic category, the full transition relation S on Γ, and the component's construct C (where P is a type that stands for its parameters, see the next section). Since the components always define the semantics for precisely one construct of the language, we restrict the input configuration to the phrases built by that construct. We call the transition relation of a component a local step, to emphasize the difference with a transition relation defined on a full syntactic category.

    LocalStep Γ O (A : Category O) (S : Step Γ O A) P (C : Construct P Γ) :
      restr C → Arrows A → Γ → Prop

To define the full language, it is sufficient to enumerate the components it is built of. This results in a transition relation of type Step for each syntactic category, which we call a global step relation. This is described later on in this section.

3.2 Grammar

As a running example, we define a language that consists of just the components skip and seq (see Figure 1). Although it is a fairly simple example, it allows us to explain the formalization without having to get ahead too much on labels, which are treated in Section 4. The grammar of our skip-seq language is straightforwardly encoded by the following inductive type:

    Inductive Cmd := skip | seq (c1 c2 : Cmd).

Recall from Section 2 that each component is parametrized on its abstract construct. The arguments are passed on as an injection-projection pair which we will call Construct. Injection corresponds to applying a constructor and projection corresponds to pattern matching. Construct consists of two properties saying that i and p are (partial) inverses of each other. This is needed to prove properties about the component.

    Class Inject P Γ := inject : P → Γ.
    Class Project P Γ := project : Γ → option P.
    Class Construct P Γ {i : Inject P Γ} {p : Project P Γ} := {
      H_i : ∀ x : P, p (i x) = Some x;
      H_p : ∀ γ : Γ, match project γ with
                     | None ⇒ True
                     | Some x ⇒ i x = γ
                     end
    }.

For constructs that take several arguments, such as Cmd ::= seq(Cmd, Cmd), the arguments are tupled.

The Class keyword declares the definitions to be type classes. The convenience of type classes is that class fields (such as inject or project) may be used without explicitly mentioning which instance of that class should be used. The curly brackets around i and p indicate that these arguments are implicit. In this case, these implicit arguments become class constraints, i.e., in order to build an instance of Construct, instances of Inject and Project need to be present. For our example language, the corresponding instances are:

    Instance: Inject unit Cmd := λ _, skip.
    Instance: Inject (Cmd ∗ Cmd) Cmd := λ p, let (c1, c2) := p in seq c1 c2.
    Instance: Project unit Cmd :=
      λ γ, match γ with | skip ⇒ Some tt | _ ⇒ None end.
    Instance: Project (Cmd ∗ Cmd) Cmd :=
      λ γ, match γ with | seq c1 c2 ⇒ Some (c1, c2) | _ ⇒ None end.
    Instance: Construct unit Cmd.
    Instance: Construct (Cmd ∗ Cmd) Cmd.

The type class mechanism can be seen at work here: we do not have to specify the arguments i and p, for they can be resolved from the signatures. In fact, the manual declaration of these type class instances is straightforward and can be omitted by an augmentation of Coq's type class resolution algorithm, but we skip the details here.

The reader may have noted that when the full language has two constructs with the same signature, the type class instance resolution algorithm may fill in the wrong Construct instance. This is solved in the formalization by adding an argument (i.e. a string) to Construct, enabling us to uniquely identify each instance.
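For readers who want to experiment, the following is a self-contained variant of the classes and instances above that should be accepted by a recent Coq without extra libraries. It is a sketch and not the authors' source: `G` stands in for Γ, the field names `proj_inj` and `inj_proj` and all instance names are assumptions, and the string argument that disambiguates constructs with equal signatures is omitted.

```coq
Inductive Cmd : Type := skip | seq (c1 c2 : Cmd).

Class Inject (P G : Type) := inject : P -> G.
Class Project (P G : Type) := project : G -> option P.

(* i and p are partial inverses of each other. *)
Class Construct (P G : Type) {i : Inject P G} {p : Project P G} := {
  proj_inj : forall x : P, p (i x) = Some x;
  inj_proj : forall g : G,
      match p g with
      | None => True
      | Some x => i x = g
      end
}.

Global Instance inject_skip : Inject unit Cmd :=
  fun _ : unit => skip.
Global Instance inject_seq : Inject (Cmd * Cmd) Cmd :=
  fun q : Cmd * Cmd => let (c1, c2) := q in seq c1 c2.
Global Instance project_skip : Project unit Cmd :=
  fun g : Cmd => match g with skip => Some tt | _ => None end.
Global Instance project_seq : Project (Cmd * Cmd) Cmd :=
  fun g : Cmd => match g with seq c1 c2 => Some (c1, c2) | _ => None end.

(* The Inject and Project arguments are found by instance resolution. *)
Global Instance construct_skip : Construct unit Cmd.
Proof.
  split.
  - intros x. destruct x. reflexivity.
  - intros g. destruct g; cbv; auto.
Qed.

Global Instance construct_seq : Construct (Cmd * Cmd) Cmd.
Proof.
  split.
  - intros x. destruct x. reflexivity.
  - intros g. destruct g; cbv; auto.
Qed.

(* Projecting with the wrong construct yields None. *)
Example seq_is_not_skip : project_skip (seq skip skip) = None.
Proof. reflexivity. Qed.
```

The two laws are what allow a component to reason about its own phrases without knowing the other constructors of the full syntax.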
Returning to the LocalStep type, the Construct argument is actually a class constraint (i.e. it is an implicit argument) in the formalization. In fact, the category and the Step relation are also class constraints.

Some components require the presence of other components. For instance, the component seq "imports" the (very basic) component skip. To this end, the Skip construct becomes an additional constraint of seq. This does not interfere with modularity: all other details about the full language remain opaque.

3.3 Semantics

A straightforward way to encode transition relations in a theorem prover is by means of an inductive predicate [2]. Making the definition inductive guarantees that the only valid transitions are the ones that can be built by its constructors, which correspond to the rules. The encoding of rules is straightforward using nested implications, where universal quantifications are added for variables that occur in the rules. As an example, we give the transition relation for seq:

    Inductive ls : restr Seq → Arrows A → Cmd → Prop :=
    | seq1 : ∀ c1 c2 c1′ ar, step c1 ar c1′ →
               ls (Seq · (c1, c2)) ar (i (c1′, c2))
    | seq2 : ∀ c2 ar, unobs ar →
               ls (Seq · (skip tt, c2)) ar c2.

The premise unobs ar expresses unobservability of the label, i.e., it has to stay unchanged. We have suppressed the class constraints here for readability. That is, ls requires suitable instances of Category, Step, Construct and Label (the latter is presented in Section 4).

The type restr C is used to restrict phrases of the full language to ones built by constructor C. By means of an inductive type with a single constructor, we can ensure that the only way to build an instance of type restr C is by providing an object of P:

    Inductive restr `(C : Construct P Γ) := restr_cons (γ : Γ).
    Notation "C · γ" := (restr_cons C γ) (at level 50, left associativity).
The backtick performs implicit generalization: the variables that the argument C depends on are automatically declared as implicit arguments of restr. Writing e.g. Seq · (c1, c2) is similar to applying the "real" constructor (e.g. seq c1 c2), but not exactly the same. One can obtain c1, c2 by straightforward pattern matching on restr_cons. In contrast, it is only possible to obtain c1, c2 from seq c1 c2 by using the elimination principle of Cmd, which is not available inside the component. The inductive predicate ls is made into a type class instance to enable resolution:

    Instance LS_Seq : LocalStep O := ls.

The semantics of the full language is essentially defined by a case distinction on the constructors of the datatypes. The full step relation is defined as an inductive predicate s with a single constructor, which combines the existing local step relations of the used components into one global step relation. The constructor assumes a localstep of any of the local transition relations of the syntactic category in question (passing along s itself), and returns an object of s (as above, in ls). The reader interested in the details is referred to the source code. This construction satisfies equations such as:

    localize Skip S_Cmd = LS_Skip
    localize Seq S_Cmd = LS_Seq

The operator localize maps the given Step instance (in this case S_Cmd) to the canonical LocalStep w.r.t. the provided construct. These equations are necessary to prove properties about the components. For example, consider the component seq, which imports the component skip. To be able to prove properties about seq, the local step relation of skip (which is empty) needs to be accessible. This is done by passing on the first equation as an argument.
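The idea of local step relations that are parametrized on the global one, and of a global relation that ties them together, can be sketched without type classes as follows. This is a simplification made for illustration and differs from the formalization in several ways that should be kept in mind: labels are an abstract type with a hypothetical `unobs` predicate and not arrows of a category, `restr` is replaced by the parameter types `unit` and `Cmd * Cmd`, and the global relation has one constructor per component, not the single constructor described above.

```coq
Section SkipSeq.
  Variable Label : Type.             (* abstract labels *)
  Variable unobs : Label -> Prop.    (* hypothetical unobservability *)

  Inductive Cmd : Type := skip | seq (c1 c2 : Cmd).

  (* Local step of seq, parametrized on a global step relation S. *)
  Inductive ls_seq (S : Cmd -> Label -> Cmd -> Prop)
    : Cmd * Cmd -> Label -> Cmd -> Prop :=
  | seq1 : forall (c1 c2 c1' : Cmd) (l : Label),
      S c1 l c1' -> ls_seq S (c1, c2) l (seq c1' c2)
  | seq2 : forall (c2 : Cmd) (l : Label),
      unobs l -> ls_seq S (skip, c2) l c2.

  (* Local step of skip: it has no rules. *)
  Inductive ls_skip (S : Cmd -> Label -> Cmd -> Prop)
    : unit -> Label -> Cmd -> Prop := .

  (* Global step: each case delegates to a local relation and
     passes the global relation itself along. *)
  Inductive step : Cmd -> Label -> Cmd -> Prop :=
  | step_skip : forall (l : Label) (c' : Cmd),
      ls_skip step tt l c' -> step skip l c'
  | step_seq : forall (c1 c2 : Cmd) (l : Label) (c' : Cmd),
      ls_seq step (c1, c2) l c' -> step (seq c1 c2) l c'.

  (* Rule (2) followed by rule (1), going through both layers. *)
  Example nested_step : forall (c : Cmd) (l : Label),
    unobs l -> step (seq (seq skip c) skip) l (seq c skip).
  Proof.
    intros c l H.
    apply step_seq. apply seq1.
    apply step_seq. apply seq2. exact H.
  Qed.

  (* The local relation of skip is empty, so skip cannot step. *)
  Lemma skip_stuck : forall (l : Label) (c' : Cmd), ~ step skip l c'.
  Proof.
    intros l c' H. inversion H.
    match goal with Hls : ls_skip _ _ _ _ |- _ => inversion Hls end.
  Qed.
End SkipSeq.
```

The occurrence of `step` as a parameter of `ls_seq` is a nested inductive occurrence, which Coq's positivity check accepts because `S` is only used strictly positively in `ls_seq`.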
The equality is overloaded with the obvious meaning that the Step instances agree on all inputs (i.e. ar, γ and γ′). In conjunction with Coq's built-in support for setoid rewriting (rewriting modulo an equivalence relation), this enables us to perform short proofs for meta theory (used in Section 5).

4 Labels

Auxiliary entities such as environments and stores in SOS are encapsulated in a label on the transitions in MSOS. In Section 2 we have explained that the labels on the transitions have the structure of arrows of a category: the labels of consecutive transitions should be composable. A subtle difference between MSOS and SOS is that the chosen label category may restrict the transition relation specified by the rules, whereas in SOS it is solely the rules that determine this relation. This can be seen by assuming the label category to be a discrete category, i.e., the category with just identity arrows.

Mosses [7] has shown that a suitable category is the product $A \mathrel{\hat{=}} \prod_{i \in I} A_i$ of elementary categories representing the auxiliary entities. The usual types of entities used in SOS rules are environments, stores and labels, which correspond to read-only, read-write or write-only permissions, respectively. In MSOS, each entity (with index i) has a corresponding set of objects $S_i$ that, together with the permissions, determines its corresponding category $A_i$:

• read-only: $A_i$ is the discrete category with $S_i$ as its objects;
• read-write: $A_i$ is the pre-order category with $S_i$ as its objects, and $S_i^2$ as its morphisms;
• write-only: $A_i$ is the category with a single object ∗, and the free monoid on $S_i$ as its morphisms.

A distinguishing feature of MSOS is its inherent support for write-only entities. For example, a transition in a system with a single write-only entity can be pictured as $\gamma \xrightarrow{* \to *} \gamma'$.
If it appears as the conclusion of a rule, then the premises of that rule cannot possibly depend on the value of that entity, because it is simply ∗. For this reason, we have adopted the use of arrows as labels in our formalization. An alternative is to consider a relation on a product of entities as the label category. This is a special case that does not provide true support for write-only entities.

Recall that the components are parametrized by a label category A on a collection of objects O. To build the product category, O is instantiated with the entity map $i \mapsto A_i$. Inside the component, the label category is entirely opaque. In other words, it is impossible to learn anything from A except that it is a product category. The Label box in the component specification expresses what entities the full label should at least include. For example, Figure 2 requires that the full label includes an environment entity. This is reflected in our formalization by providing two functors $P_M$ and $P_U$ to each component, that project full labels to their mentioned entities and unmentioned entities, respectively:

$$A \xrightarrow{P_M} \prod_{i \in M} A_i, \qquad A \xrightarrow{P_U} U.$$

The idea is that the product of mentioned entities is transparent to the component, whereas U is opaque. We use the functor $P_U$ to express unobservability, needed e.g. in rule (1). Additionally, the component requires that $(P_M, P_U)$ is an isomorphism, which is crucial to enable modular proof. Let us consider determinism as an illustration of this.

Property 1 Assume configurations γ γ′ γ″ : Γ and labels ar′ : x → y, ar″ : x → z. The step relation on Γ is deterministic when both $\gamma \xrightarrow{ar'} \gamma'$ and $\gamma \xrightarrow{ar''} \gamma''$ imply that γ′ = γ″ and ar′ = ar″.
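Property 1 can be tried out on the skip-seq language in a simplified setting. The sketch below is an approximation under explicit assumptions and not the modular proof of the paper: labels are an abstract type, the hypothetical function `src` gives the source object of a label (so that `src l' = src l''` stands for "both arrows start at x"), and the hypothesis `unobs_unique` stands in for the fact that the identity arrow on an object is unique.

```coq
Section Determinism.
  Variables (Obj Label : Type).
  Variable src : Label -> Obj.       (* hypothetical: source of a label *)
  Variable unobs : Label -> Prop.    (* hypothetical: label is an identity *)
  Hypothesis unobs_unique : forall l l' : Label,
      src l = src l' -> unobs l -> unobs l' -> l = l'.

  Inductive Cmd : Type := skip | seq (c1 c2 : Cmd).

  Inductive step : Cmd -> Label -> Cmd -> Prop :=
  | rule1 : forall (c : Cmd) (l : Label),
      unobs l -> step (seq skip c) l c
  | rule2 : forall (c1 c1' c2 : Cmd) (l : Label),
      step c1 l c1' -> step (seq c1 c2) l (seq c1' c2).

  (* Two steps from the same configuration with labels from the same
     source agree on both the target configuration and the label. *)
  Theorem step_det : forall (c : Cmd) (l' : Label) (c' : Cmd),
      step c l' c' ->
      forall (l'' : Label) (c'' : Cmd),
        src l' = src l'' -> step c l'' c'' -> c' = c'' /\ l' = l''.
  Proof.
    intros c0 la ca H.
    induction H as [c l Hu | c1 c1' c2 l H1 IH];
      intros lb cb Hsrc H2; inversion H2; subst.
    - (* rule1 against rule1 *)
      split; [congruence | apply unobs_unique; assumption].
    - (* rule1 against rule2: skip would have to step *)
      match goal with Hk : step skip _ _ |- _ => inversion Hk end.
    - (* rule2 against rule1: skip would have to step *)
      match goal with Hk : step skip _ _ |- _ => inversion Hk end.
    - (* rule2 against rule2: use the induction hypothesis *)
      match goal with
      | Hk : step _ _ _ |- _ => destruct (IH _ _ Hsrc Hk) as [Hc Hl]
      end.
      split; congruence.
  Qed.
End Determinism.
```

This monolithic proof inspects all rules of the language at once, which is exactly what a modular proof avoids; it is shown only to make the statement of the property concrete.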
The requirement that the arrows are equivalent ensures not only that the post configurations are equal, but also that the outputs through the write-only components are equal. To prove that the component seq is deterministic, one proceeds by straightforward case analysis on the structure of the input configuration. In the case that it is seq(skip, c), we have two arrows ar′, ar″ such that $P_U\, ar' = P_U\, ar'' = \mathit{id}$, and $P_M\, ar' = P_M\, ar'' = ()$ (the empty tuple). In other components that do have mentioned entities, these projections of $P_M$ have to be equivalent. Using the isomorphism we can then conclude that ar′ = ar″.

4.1 Formalization of labels

The category theory we have used in our formalization is provided by the math-classes library by van der Weegen and Spitters [14]. Their library makes extensive use of a technique called "unbundling", which boils down to separating the components of mathematical structures into separate type classes. An example of this are categories. In Section 3.1, we have treated Category as a record structure containing Arrows as a field for presentation purposes. However, in the actual formalization, Arrows is a separate type class:

    Class Arrows (O : Type) : Type := Arrow : O → O → Type.
    Infix "⟶" := Arrow (at level 90, right associativity).

To build a Category, among other components, an equivalence relation on the corresponding Arrows instance is necessary, to enable the comparison of arrows. We use this relation in our formalization to define the predicate unobs for unobservability. The following instances are used for the entity categories:

    Instance arrows_ro : Arrows O := λ x y, x = y.
    Instance arrows_rw : Arrows O := λ x y, unit.
    Instance arrows_wo : Arrows unit := λ x y, list O.

We now define the type class Label, which is used to provide the projection functors.
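The three kinds of entity arrows given above can be explored in isolation with plain definitions. The sketch below does not use the math-classes library and does not build actual Category instances; the names `arrow_ro`, `id_ro`, `comp_ro` and so on are hypothetical and only mirror the read-only, read-write and write-only cases together with their identities and compositions.

```coq
Section EntityArrows.
  Variable O : Type.     (* the values of one entity *)

  (* read-only: an arrow from x to y exists only when x = y *)
  Definition arrow_ro (x y : O) : Type := x = y.
  Definition id_ro (x : O) : arrow_ro x x := @eq_refl O x.
  Definition comp_ro (x y z : O) (f : arrow_ro x y) (g : arrow_ro y z)
    : arrow_ro x z := @eq_trans O x y z f g.

  (* read-write: exactly one arrow between any two values *)
  Definition arrow_rw (x y : O) : Type := unit.
  Definition id_rw (x : O) : arrow_rw x x := tt.
  Definition comp_rw (x y z : O) (f : arrow_rw x y) (g : arrow_rw y z)
    : arrow_rw x z := tt.

  (* write-only: a single object, arrows are lists of emitted values *)
  Definition arrow_wo (x y : unit) : Type := list O.
  Definition id_wo (x : unit) : arrow_wo x x := @nil O.
  Definition comp_wo (x y z : unit) (f : arrow_wo x y) (g : arrow_wo y z)
    : arrow_wo x z := @app O f g.

  (* A read-only entity cannot change along a transition. *)
  Lemma ro_no_change : forall (x y : O), arrow_ro x y -> x = y.
  Proof. intros x y f. exact f. Qed.

  (* Composing write-only arrows concatenates the emitted values. *)
  Example wo_accumulates : forall (a b : O),
    comp_wo tt tt tt (cons a nil) (cons b nil) = cons a (cons b nil).
  Proof. intros a b. reflexivity. Qed.
End EntityArrows.
```

In this reading, an unobservable label corresponds to `id_ro`, `id_rw` on equal endpoints, and the empty list `id_wo`, respectively.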
Label assumes the presence of the following objects:

    I M : Type
    O   : I → Type        A   : ∀ i : I, Arrows (O i)
    O_M : M → Type        A_M : ∀ i : M, Arrows (O_M i)

In other words, for both index sets I and M it is required that a collection of arrows exists.

    Class Label := {
      cover_O : ∀ i : M, O_M i = O (to_I i);
      cover_A : ∀ i : M, A_M i = ⟨⟨ λ T, Arrows T | eq_sym (cover_O i) ⟩⟩ A (to_I i)
    }.

The cover_O property says that for every index of the mentioned entities, the objects have to correspond to the objects of the full category. Likewise, the arrows of the mentioned entities have to correspond. A cast operation [6] on the objects (indicated by ⟨⟨ | ⟩⟩) is needed to be able to express the latter, but we omit the details in this paper. Given an instance of Label, we can derive the functors $P_M$ and $P_U$ together with the fact that the pair $(P_M, P_U)$ is an isomorphism. Each component has a Label type class constraint which leaves O and A parametric, but specifies O_M and A_M.

To illustrate how a rule is interpreted with help of the Label construction, we consider rule (4) of Figure 2. Let us first write it using informal notation. Assume that ar : x → y, ar′ : x′ → y′ and $\mathit{proj}_\rho = \pi_\rho \circ P_M$ is the projection of the component with index ρ.

$$\frac{\mathit{proj}_\rho\, x' = \rho_0[\rho_1] \qquad \mathit{proj}_\rho\, x = \rho_0 \qquad P_U\, ar = P_U\, ar' \qquad e \xrightarrow{ar'} e'}{\mathit{block}(\rho_1, e) \xrightarrow{ar} \mathit{block}(\rho_1, e')}$$

In Coq syntax, this rule is:

    rule4 : ∀ (ρ0 ρ1 : Env) (e e′ : Exp) `(ar′ : x′ ⟶ y′),
      proj_ρ x′ = update ρ0 ρ1 →
      proj_ρ x = ρ0 →
      fmap P_U ar = fmap P_U ar′ →
      step ar′ e e′ →
      (* ------------------------------------------------- *)
      ls (Block · (ρ1, e)) ar (i (ρ1, e′))

Note that the use of equality in the above code is highly overloaded, which is made possible by the use of type classes.
Like the MATH-CLASSES library, we represent each functor by a function that maps the objects, which carries the actual name ($P_M$ or $P_U$), together with a function that maps the arrows, which is written with the fmap prefix.

5 Example of Modular Proof

Once the full language is declared, it is possible to combine proofs of the components to prove that a particular property holds for the full language. Like the local step relations, properties are parametrized by a global step relation S. We say that a property holds for a step relation if it holds for all possible configurations, but we are a bit more general and allow the user to express that a property holds for a particular configuration. Not all properties can be proved by induction, and likewise not all properties have a modular proof. We consider a class of admissible, well-behaved properties P such that P S (I γ) does not depend on anything but the localized version of S w.r.t. C (here I γ injects γ into Γ):

    Definition admissible Γ O (P : Step Γ O A → Γ → Prop) :=
      ∀ `(C : Construct A Γ) (S : Step Γ O A) (γ : restr C),
        P (globalize (localize C S)) (I γ) → P S (I γ).

The operator globalize is the inverse of localize: it takes a local step relation ls and makes it global, behaving like ls on phrases constructed by C and not permitting any steps to be made that start from other configurations. The idea of admissible properties is that they warrant that proof by induction is possible.

Lemma 1. Determinism is admissible.

We will demonstrate how this lemma is used to show that our skip-seq language is deterministic by illustrating the seq case (skip is similar). Inside the Coq section of seq, we have proved the following lemma, which says that the component is deterministic.

    Lemma det_Seq (c1 c2 : Cmd) :
      det_global S_Cmd c1 → det_local LS_Seq (Seq · (c1, c2)).
Note that it assumes that the global step relation is deterministic on c1, which is essentially the induction hypothesis. Recall the equivalence relations on Step and LocalStep of Section 3. Both det_global and globalize respect these relations. Using Coq's built-in support for rewriting modulo equivalence relations (called setoid rewriting), it can be shown that:

      det_global S_Cmd (seq c1 c2)
    = det_global S_Cmd (I (Seq · (c1, c2)))                               (fold I)
    = det_global (globalize (localize Seq S_Cmd)) (I (Seq · (c1, c2)))    (Lemma 1)
    = det_global (globalize LS_Seq) (I (Seq · (c1, c2)))                  (rewrite eq_Seq)
    = det_local LS_Seq (Seq · (c1, c2))                                   (fold det_local)

Now, the latter holds because this is a property proved in the component seq. The proof for seq can therefore be completed by applying det_Seq, using the equation localize Skip S_Cmd = LS_Skip and the induction hypothesis. Other components follow the same prescription.

In future work, we want to automate the weaving of local proofs by generalizing the above, and by exploiting automated proof search with the help of the type class mechanism in Coq. Experiments have already demonstrated that this is feasible, but fragile.

6 Related Work

A specification language for MSOS, called the MSOS Definition Formalism (MSDF), has been developed by Mosses and Chalub (see [3]). It combines BNF notation with a textual representation of MSOS transitions. A large number of basic components have already been identified and specified in MSDF. A tool that translates ML and (a part of) Java into this repository has been developed by Chalub and Braga [3], which can be executed in the Maude tool. MSDF provides its own specification language for datatypes, which can be constructed from primitives such as sequences, lists, maps, etc. In contrast, our formalization directly uses types defined in Coq.
Implicit-MSOS is an improvement of MSOS that reduces the amount of clutter in the rules even further by implicitly propagating unmentioned entities [10]. The interpretation of Implicit-MSOS is given in terms of MSOS, and we expect that it can be built on top of our formalization by clever use of type classes.

Delaware et al. [5] have very recently investigated the possibility of modular metatheory in Coq. Their focus is on extending a programming language with new features, taking Featherweight Java as an example. In their paper, they demonstrate how to develop a modular proof of type-safety of a number of concrete extensions of Featherweight Java. The considered extensions do not have effects, i.e., there are no entities.

The formalization of the operational semantics of OCaml light in HOL by Scott Owens makes use of labels to encode mutations to the store [11]. These mutations are correlated to a reduction in the program. The labels explicitly carry mutations and therefore simplify the notation, but do not enable a high degree of reusability of the rules.

In a theorem prover (and in functional languages), abstract syntax and transition relations are typically encoded as inductive types, whose constructors correspond to the grammar production rules and to the rules of the step relation, respectively. The inductive definition ensures that those constructors are the only way to build instances of those types. This corresponds to the notions "initial algebra" and "least relation", sometimes used in this context (e.g. [10]). To facilitate Component-Based Semantics, we have to be able to build these inductive types from "partial versions" that define just the production rules and the step rules of the component in question.
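One way to picture the "partial versions" mentioned above, for the skip-seq mini-language of Section 5, is sketched below in plain Coq. This is an illustrative encoding under simplifying assumptions (no labels, no entities, no type classes), not the construction used in the formalization; all names (SkipF, SeqF, in_skip, in_seq, step_seq, from_seq, and the parameter G for the global step relation) are hypothetical. Each component contributes only its own production and its own rules, and the full language is assembled by hand.

```coq
(* Partial syntax: each component describes only its own production. *)
Inductive SkipF (X : Type) : Type := skipF : SkipF X.
Inductive SeqF (X : Type) : Type := seqF : X -> X -> SeqF X.

(* Full syntax: assembled explicitly from the partial versions. *)
Inductive Cmd : Type :=
| in_skip : SkipF Cmd -> Cmd
| in_seq : SeqF Cmd -> Cmd.

Definition skip : Cmd := in_skip (skipF Cmd).
Definition seq (c1 c2 : Cmd) : Cmd := in_seq (seqF Cmd c1 c2).

(* Local step relation of the seq component, parametrized by a global step G. *)
Inductive step_seq (G : Cmd -> Cmd -> Prop) : Cmd -> Cmd -> Prop :=
| seq_skip : forall c, step_seq G (seq skip c) c
| seq_left : forall c1 c1' c2,
    G c1 c1' -> step_seq G (seq c1 c2) (seq c1' c2).

(* Global step relation: skip has no rules, so only seq contributes. *)
Inductive step : Cmd -> Cmd -> Prop :=
| from_seq : forall c c', step_seq step c c' -> step c c'.

(* A one-step reduction, built from the local rule of the seq component. *)
Example seq_skip_steps : step (seq skip skip) skip.
Proof. apply from_seq. apply seq_skip. Qed.
```

The assembly of Cmd and step has to be written out manually for every combination of components in this sketch, which is the kind of plumbing that native support for extending inductive types would remove.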
To the best of our knowledge, there is no theorem prover (or functional language) that supports (multiple) inheritance of inductive types natively.

7 Conclusions and Future Work

In this paper we have presented a formalization of Component-Based Semantics in the theorem prover Coq. The formalization makes essential use of dependent types, and profits from Coq's support for type classes. Our formalization is based on the ideas of MSOS, and makes use of the idea of labels as arrows in categories, as proposed by Mosses [7]. Splitting the label category into a transparent part for the mentioned entities and an opaque part for the unmentioned entities enables modular proof. We have demonstrated this by crafting a proof of determinism of a mini-language from smaller local proofs provided by the components used.

In future work we plan to apply this work with the aim of scalable verification of specific programs. Another direction of research is to investigate whether the full generality of labels as arrows (which our formalization provides) can be exploited for entities of types other than read-only, read-write and write-only. We expect that by choosing a suitable category, it is possible to enforce information flow policies, which has applications to security. Our work also enables formal investigation of the appropriate definitions of bisimulation in MSOS, which as of now have an experimental status [7].

Acknowledgments. The authors wish to thank Peter D. Mosses for introducing them to the notion of Component-Based Semantics, and Bas Spitters for introducing them to type classes in Coq. The authors would also like to thank Peter D. Mosses, Julien Schmaltz and the anonymous reviewers for their comments on an earlier version of this paper.

References

[1] Brian E. Aydemir, Aaron Bohannon, Matthew Fairbairn, J. Nathan Foster, Benjamin C. Pierce, Peter Sewell, Dimitrios Vytiniotis, Geoffrey Washburn, Stephanie Weirich & Steve Zdancewic (2005): Mechanized Metatheory for the Masses: The POPLmark Challenge. In: TPHOLs, LNCS 3603, Springer, pp. 50–65, doi:10.1007/11541868_4.
[2] Yves Bertot, Gérard Huet, Jean-Jacques Lévy & Gordon Plotkin, editors (2009): Theorem proving support in programming language semantics, chapter 15, pp. 337–361. Cambridge University Press. Available at http://hal.inria.fr/inria-00160309/.
[3] Fabricio Chalub & Christiano Braga (2007): Maude MSOS Tool. ENTCS 176(4), pp. 133–146, doi:10.1016/j.entcs.2007.06.012.
[4] Adam Chlipala (2011): Certified Programming with Dependent Types. Available at http://adam.chlipala.net/cpdt/. To appear.
[5] Benjamin Delaware, William R. Cook & Don Batory (2011): Modular Mechanized Metatheory. In preparation.
[6] Chung-Kil Hur (2010): Heq: A Coq Library for Heterogeneous Equality. Available at http://www.pps.jussieu.fr/~gil/Heq/. Informal presentation at the 2nd Coq Workshop.
[7] Peter D. Mosses (2004): Modular Structural Operational Semantics. J. of Logic and Algebraic Programming 60-61, pp. 195–228, doi:10.1016/j.jlap.2004.03.008.
[8] Peter D. Mosses (2005): A Constructive Approach to Language Definition. J. of Universal Computer Science 11(7), pp. 1117–1134, doi:10.3217/jucs-011-07-1117.
[9] Peter D. Mosses (2009): Component-Based Semantics. In: Proc. of SAVCBS'09, ACM Press, pp. 3–10, doi:10.1145/1596486.1596489.
[10] Peter D. Mosses & Mark J. New (2009): Implicit Propagation in Structural Operational Semantics. ENTCS 229(4), pp. 49–66, doi:10.1016/j.entcs.2009.07.073.
[11] Scott Owens (2008): A Sound Semantics for OCaml light. In: Proc. of ESOP'08, LNCS 4960, Springer, pp. 1–15, doi:10.1007/978-3-540-78739-6_1.
[12] Gordon D. Plotkin (2004): A Structural Approach to Operational Semantics. J. of Logic and Algebraic Programming 60-61, pp. 17–139.
[13] Matthieu Sozeau & Nicolas Oury (2008): First-Class Type Classes. In: TPHOLs, LNCS 5170, Springer, pp. 278–293, doi:10.1007/978-3-540-71067-7_23.
[14] Bas Spitters & Eelis van der Weegen (2011): Type Classes for Mathematics in Type Theory. MSCS 21, pp. 1–31, doi:10.1017/S0960129511000119.
[15] The Coq Development Team (2010): The Coq Proof Assistant Reference Manual – Version V8.3. Available at http://coq.inria.fr.
