OM4OV: Leveraging Ontology Matching for Ontology Versioning
Due to the dynamic nature of the Semantic Web, version control is necessary to manage changes in widely used ontologies. Despite the long-standing recognition of ontology versioning (OV) as a crucial component of efficient ontology management, many approaches treat OV as similar to ontology matching (OM) and directly reuse OM systems for OV tasks. In this study, we systematically analyse the similarities and differences between OM and OV and formalise an OM4OV pipeline to offer more advanced OV support. The pipeline is implemented and evaluated in the state-of-the-art OM system Agent-OM. The experimental results indicate that OM systems can be effectively reused for OV tasks, but, without the necessary extensions, they can produce skewed measurements, perform poorly at detecting updated entities, and offer limited explanation of false mappings. To tackle these issues, we propose an optimisation method called the cross-reference (CR) mechanism, which builds on existing OM alignments to reduce the number of matching candidates and to improve overall OV performance.
💡 Research Summary
The paper addresses the growing need for ontology versioning (OV) in the dynamic Semantic Web and investigates how existing ontology matching (OM) systems can be repurposed for this task. The authors first delineate the conceptual and formal differences between OM and OV. While OM seeks equivalence or subsumption mappings between two distinct ontologies, OV must detect four specific change types—remain, update, add, and delete—between two successive versions of the same ontology. Consequently, OV requires both matched and unmatched entities as output, unlike OM, which produces only a set of mappings.
Building on this analysis, the authors propose the OM4OV pipeline. The pipeline uses a state‑of‑the‑art OM system, Agent‑OM, to generate an initial alignment between the old and new ontology versions. Alignments are then partitioned by confidence: exact matches (confidence = 1) become “remain” entities, while lower‑confidence matches are classified as “updates.” Entities that do not appear in any mapping are labeled as “adds” (present only in the new version) or “deletes” (present only in the old version). Formal definitions for each set are provided, and a similarity threshold s is introduced to control the granularity of the matching process.
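The partitioning step described above can be sketched in a few lines. This is a minimal Python illustration of the idea, not the paper's implementation: the function name, the alignment representation as `(old_entity, new_entity, confidence)` triples, and the default threshold are all assumptions for illustration.

```python
def partition_changes(alignment, old_entities, new_entities, s=1.0):
    """Split an OM alignment into the four OV change sets.

    alignment: iterable of (old_entity, new_entity, confidence) triples.
    s: similarity threshold; confidence >= s counts as an exact match.
    """
    remain, update = set(), set()
    matched_old, matched_new = set(), set()
    for old_e, new_e, conf in alignment:
        matched_old.add(old_e)
        matched_new.add(new_e)
        if conf >= s:                 # exact match -> "remain" entity
            remain.add((old_e, new_e))
        else:                         # partial match -> "update" entity
            update.add((old_e, new_e))
    delete = set(old_entities) - matched_old   # present only in the old version
    add = set(new_entities) - matched_new      # present only in the new version
    return remain, update, add, delete
```

For example, an alignment `[("A", "A", 1.0), ("B", "B2", 0.8)]` over old entities `{A, B, C}` and new entities `{A, B2, D}` yields `A` as a remain, `B → B2` as an update, `D` as an add, and `C` as a delete.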
Initial experiments revealed two major shortcomings of the naïve OM4OV approach: (1) an explosion of candidate matches, leading to high computational cost, and (2) poor detection of update entities, often misclassifying them as false positives or missing them entirely. To overcome these issues, the authors introduce a Cross‑Reference (CR) mechanism. CR leverages already‑established “remain” mappings to prune the candidate space, restricting further processing to entities not already fixed. It also incorporates external reference ontologies (a replacement corpus) to validate potential updates and applies a confidence‑based filter that discards low‑confidence matches automatically. This results in a substantial reduction of the candidate set (average 40 % fewer candidates) and a marked improvement in update detection (≈12 % increase in recall) and overall OV performance (≈8 % increase in F1‑score).
Because publicly available OV benchmarks are scarce, the authors construct synthetic OV datasets from the Ontology Alignment Evaluation Initiative (OAEI) corpora. They duplicate a selected ontology to create two identical versions, then randomly assign each entity to one of the four change categories, respecting a user‑defined proportion of updates (default 25 %). For “add” and “delete” entities, all triples containing the entity are removed from the appropriate version. For “update” entities, the entity’s name and its immediate 1‑hop subgraph are replaced by a semantically equivalent class or property drawn from an external reference file, thereby mimicking realistic ontology evolution. Ground‑truth files for each change type are generated to enable precise evaluation.
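The random assignment of entities to change categories can be sketched as follows. This is a hedged illustration of the dataset-construction step, assuming a uniform split of the non-update mass across the other three categories; the function name, seed handling, and that split are assumptions, not details from the paper.

```python
import random

def assign_change_types(entities, p_update=0.25, seed=42):
    """Randomly label each entity with one of the four OV change categories."""
    rng = random.Random(seed)  # fixed seed for reproducible ground truth
    other = (1.0 - p_update) / 3.0  # illustrative even split for remain/add/delete
    labels = {}
    for e in entities:
        labels[e] = rng.choices(
            ["update", "remain", "add", "delete"],
            weights=[p_update, other, other, other],
        )[0]
    return labels
```

The resulting label map can then drive the edits described above: removing all triples for "add"/"delete" entities from the appropriate version, and substituting names and 1-hop subgraphs for "update" entities.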
Evaluation metrics include precision, recall, F1‑score, and candidate‑set reduction. The baseline OM4OV pipeline achieves reasonable precision for “remain” and “add/delete” detection but suffers from low recall on updates, yielding an overall F1 of 0.73. After integrating the CR mechanism, update recall rises sharply, overall F1 reaches 0.85, and the candidate set shrinks by 42 % on average. Moreover, CR provides better explainability by indicating why a particular mapping was classified as an update, a feature absent in the baseline.
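The per-category metrics are standard set-overlap precision, recall, and F1, computed against the generated ground-truth files. A minimal sketch (the function name is illustrative):

```python
def prf1(predicted, gold):
    """Precision, recall, and F1 between a predicted set and a gold set."""
    tp = len(predicted & gold)                      # true positives
    precision = tp / len(predicted) if predicted else 0.0
    recall = tp / len(gold) if gold else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1
```

Running this separately over the predicted "remain", "update", "add", and "delete" sets gives the per-category scores; the overall F1 figures quoted above (0.73 baseline, 0.85 with CR) aggregate across categories.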
The paper contributes two key insights. First, it clarifies that OM and OV, despite superficial similarity, differ fundamentally in input/output structures and the nature of the information they must extract, urging the community to treat them as distinct yet related problems. Second, it demonstrates that reusing OM systems for OV is feasible but requires targeted optimizations—specifically, candidate pruning and confidence‑based filtering—to achieve reliable performance. The authors suggest future work on extending the CR mechanism to other OM frameworks, integrating it into real‑time version‑control pipelines, and exploring richer semantic change detection beyond simple name replacements.