A Web-Based Tool for Analysing Normative Documents in English

Our goal is to use formal methods to analyse normative documents written in English, such as privacy policies and service-level agreements. This requires the combination of a number of different elements, including information extraction from natural language, formal languages for model representation, and an interface for property specification and verification. We have worked on a collection of components for this task: a natural language extraction tool, a suitable formalism for representing such documents, an interface for building models in this formalism, and methods for answering queries asked of a given model. In this work, each of these concerns is brought together in a web-based tool, providing a single interface for analysing normative texts in English. Through the use of a running example, we describe each component and demonstrate the workflow established by our tool.

💡 Research Summary

The paper presents “Contract Verifier”, a web‑based integrated tool for the analysis of normative documents written in English, such as privacy policies, software licences, and service‑level agreements. The authors argue that while such documents are intended for human readers, their length, cross‑references, and exception clauses make manual analysis error‑prone and time‑consuming. To address this, the tool combines four essential components: (1) natural‑language extraction, (2) a formal contract model, (3) multiple human‑readable representations of the model, and (4) query and verification facilities.

The workflow begins with the user supplying an English text. The extraction component, built on the Stanford Parser, produces dependency trees and automatically identifies the subject (agent), the verb (action), the deontic modality (obligation, permission, prohibition), and any temporal or non‑temporal conditions. The output is a tab‑separated values (TSV) table in which each row corresponds to a clause and each column to an extracted field. Because automatic extraction is not perfect, the table is presented to the user for post‑editing: cells can be edited, and rows can be added or removed.
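To make the shape of this intermediate table concrete, the following is a minimal Python sketch of reading such a TSV into clause records. The column order (agent, modality, action, condition), the `Clause` class, and the sample rows are assumptions made for illustration; the summary does not give the tool's actual column layout.

```python
import csv
import io
from dataclasses import dataclass
from typing import List

# Hypothetical column order; the real table layout is not specified in the text.
FIELDS = ["agent", "modality", "action", "condition"]
MODALITIES = {"obligation", "permission", "prohibition"}


@dataclass
class Clause:
    agent: str
    modality: str
    action: str
    condition: str


def parse_clause_table(tsv_text: str) -> List[Clause]:
    """Read one clause per row from a tab-separated table."""
    reader = csv.DictReader(
        io.StringIO(tsv_text),
        fieldnames=FIELDS,
        delimiter="\t",
        quoting=csv.QUOTE_NONE,  # treat quote characters as ordinary text
    )
    clauses = []
    for row_number, row in enumerate(reader, start=1):
        # Missing trailing cells come back as None; normalise them to "".
        values = {name: (row.get(name) or "").strip() for name in FIELDS}
        values["modality"] = values["modality"].lower()
        if values["modality"] not in MODALITIES:
            raise ValueError(
                f"row {row_number}: unknown modality {values['modality']!r}"
            )
        clauses.append(Clause(**values))
    return clauses


# Invented sample rows, standing in for a post-edited table.
SAMPLE_TABLE = (
    "customer\tobligation\tpay the invoice\twithin 30 days\n"
    "provider\tprohibition\tshare personal data\t\n"
)

if __name__ == "__main__":
    for clause in parse_clause_table(SAMPLE_TABLE):
        print(clause)
```

Rejecting rows with an unrecognised modality mirrors the role of the post-editing step: errors in the table are surfaced before any model is built from it.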

After the user finalises the table, a conversion script transforms the TSV into an XML‑based contract model format (COML). The underlying formalism is based on C‑O diagrams, which capture deontic modalities, refinements (conjunction, choice, sequence), reparations, and temporal guards. The model can be visualised in three ways simultaneously: (i) the post‑edited natural‑language text, (ii) a Controlled Natural Language (CNL) version generated via the Grammatical Framework (GF), and (iii) a compact formal notation called C‑O Diagram Shorthand (CODSH). The CNL is a reduced, unambiguous version of English that can be parsed back into the internal model, though it fails if required lexical items are missing from the 64 000‑entry dictionary.
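The conversion step can be pictured with a small Python sketch that turns table rows into an XML tree. The element and attribute names used here (`contract`, `clause`, `agent`, `action`, `condition`, `modality`) are hypothetical placeholders, not the actual COML schema, and the function is an illustration rather than the tool's conversion script.

```python
import xml.etree.ElementTree as ET
from typing import Dict, List


def table_to_xml(rows: List[Dict[str, str]]) -> str:
    """Serialise clause rows as XML, using made-up element names (not real COML)."""
    root = ET.Element("contract")
    for index, row in enumerate(rows, start=1):
        clause = ET.SubElement(
            root, "clause", id=str(index), modality=row["modality"]
        )
        ET.SubElement(clause, "agent").text = row["agent"]
        ET.SubElement(clause, "action").text = row["action"]
        # Conditions are optional, so only emit the element when one is present.
        if row.get("condition"):
            ET.SubElement(clause, "condition").text = row["condition"]
    return ET.tostring(root, encoding="unicode")


# Invented rows in the assumed table layout.
SAMPLE_ROWS = [
    {
        "agent": "customer",
        "modality": "obligation",
        "action": "pay the invoice",
        "condition": "within 30 days",
    },
    {
        "agent": "provider",
        "modality": "prohibition",
        "action": "share personal data",
        "condition": "",
    },
]

if __name__ == "__main__":
    print(table_to_xml(SAMPLE_ROWS))
```

This flat structure leaves out refinements, reparations, and temporal guards, which a C‑O diagram model would also need to represent.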

Analysis is performed through a set of query templates. Six syntactic queries are answered by traversing the contract model and filtering clauses; for example, “What are the obligations of …”
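A syntactic query of this kind can be sketched in Python as a tree traversal followed by a filter. The `Node` class, its fields, and the sample model are assumptions for illustration; the query shown is assumed to ask for the clauses of one modality that apply to one agent, and it does not reproduce the tool's own query templates.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Optional


@dataclass
class Node:
    """Hypothetical model node: either a clause or a refinement with children."""
    name: str
    agent: Optional[str] = None
    modality: Optional[str] = None
    action: Optional[str] = None
    children: List["Node"] = field(default_factory=list)


def walk(node: Node) -> Iterator[Node]:
    """Yield the node and all of its descendants, depth first."""
    yield node
    for child in node.children:
        yield from walk(child)


def find_clauses(root: Node, agent: str, modality: str) -> List[Node]:
    """Return every clause in the model with the given agent and modality."""
    return [
        node
        for node in walk(root)
        if node.agent == agent and node.modality == modality
    ]


# Invented model: a top-level refinement containing three clauses.
SAMPLE_MODEL = Node(
    name="contract",
    children=[
        Node("c1", agent="customer", modality="obligation", action="pay the invoice"),
        Node(
            "group",
            children=[
                Node("c2", agent="provider", modality="prohibition",
                     action="share personal data"),
                Node("c3", agent="customer", modality="obligation",
                     action="report faults"),
            ],
        ),
    ],
)

if __name__ == "__main__":
    for clause in find_clauses(SAMPLE_MODEL, "customer", "obligation"):
        print(clause.name, clause.action)
```

Because the filter only inspects the structure of the model, a query like this needs no reasoning about time or execution order.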
