GenOM: Ontology Matching with Description Generation and Large Language Model

Notice: This research summary and analysis were automatically generated using AI. For authoritative details, please refer to the original arXiv paper.

Ontology matching (OM) plays an essential role in enabling semantic interoperability and integration across heterogeneous knowledge sources, particularly in the biomedical domain, which contains numerous complex concepts related to diseases and pharmaceuticals. This paper introduces GenOM, a large language model (LLM)-based ontology alignment framework that enriches the semantic representations of ontology concepts by generating textual definitions, retrieves alignment candidates with an embedding model, and incorporates exact-matching tools to improve precision. Extensive experiments on the OAEI Bio-ML track demonstrate that GenOM achieves competitive performance, surpassing many baselines, including traditional OM systems and recent LLM-based methods. Further ablation studies confirm the effectiveness of semantic enrichment and few-shot prompting, highlighting the framework's robustness and adaptability.


💡 Research Summary

GenOM is a novel ontology matching framework that leverages large language models (LLMs) to enrich the semantic representation of biomedical ontology concepts, retrieve candidate correspondences, and make final alignment decisions. The system proceeds in four main stages. First, it extracts lexical and structural metadata (labels, synonyms, parent relations, equivalence axioms) from both source and target ontologies. Second, an LLM‑driven definition generation module uses this metadata as a prompt to produce natural‑language definitions for every concept. The authors experiment with several instruction‑tuned models (Qwen2.5‑7B/14B/32B‑Instruct, Llama3‑8B‑Instruct), showing that larger models yield higher‑quality definitions. Third, the generated definitions are embedded using a sentence‑transformer or the LLM’s own embedding layer, and cosine similarity is employed to retrieve a high‑recall candidate pool of concept pairs. Finally, two complementary judgment components are applied: (a) an LLM‑based binary classifier that, given a pair of definitions (and optionally structural cues), decides whether the concepts are equivalent; few‑shot examples are included in the prompt to improve consistency, and (b) a lightweight exact‑matching module that captures cases where lexical strings match perfectly (e.g., from BERTMap or LogMap). The two signals are fused into a confidence score, and only pairs exceeding a threshold are output as the final alignment.
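The retrieval and fusion stages above can be sketched in a few lines. This is a minimal, illustrative mock-up, not the authors' implementation: the function names, the top-k candidate strategy, and the weighted score fusion (`w_sim`, `w_llm`, `w_exact`) are all assumptions standing in for whatever retrieval depth and fusion rule GenOM actually uses. Real definition embeddings from a sentence transformer are replaced here by random vectors.

```python
import numpy as np

def cosine_sim_matrix(src, tgt):
    # Row-normalize both embedding matrices; the dot product then
    # yields pairwise cosine similarities.
    src = src / np.linalg.norm(src, axis=1, keepdims=True)
    tgt = tgt / np.linalg.norm(tgt, axis=1, keepdims=True)
    return src @ tgt.T

def retrieve_candidates(src_emb, tgt_emb, top_k=2):
    # High-recall candidate pool: keep the top-k most similar
    # target concepts for every source concept.
    sims = cosine_sim_matrix(src_emb, tgt_emb)
    pairs = []
    for i, row in enumerate(sims):
        for j in np.argsort(row)[::-1][:top_k]:
            pairs.append((i, int(j), float(row[j])))
    return pairs

def fuse_scores(sim, llm_equivalent, exact_match,
                w_sim=0.5, w_llm=0.4, w_exact=0.1):
    # Hypothetical fusion rule: weighted combination of embedding
    # similarity, the LLM judge's verdict, and the exact-match signal.
    return w_sim * sim + w_llm * float(llm_equivalent) + w_exact * float(exact_match)

# Toy example: 3 source and 3 target "definition embeddings".
rng = np.random.default_rng(0)
src = rng.normal(size=(3, 8))
tgt = rng.normal(size=(3, 8))
candidates = retrieve_candidates(src, tgt, top_k=2)
# In the real pipeline, llm_equivalent would come from the LLM judge
# and exact_match from the lexical matcher; both are stubbed here.
aligned = [(i, j) for i, j, s in candidates
           if fuse_scores(s, llm_equivalent=True, exact_match=False) > 0.5]
```

Only pairs whose fused score exceeds a threshold survive, mirroring the paper's final filtering step.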

The authors evaluate GenOM on the OAEI Bio‑ML track, covering five alignment tasks involving major biomedical ontologies such as SNOMED‑CT, NCIT, FMA, ORDO, DOID, and OMIM. Compared with strong baselines—including traditional systems (LogMap, AML), embedding‑based approaches (BERTMap, Deep Alignment), and recent LLM‑centric methods (LLM4OM, Olala)—GenOM consistently achieves higher F1 scores. Ablation studies reveal that (i) removing the definition generation step substantially reduces recall, (ii) omitting the LLM judgment reduces precision, and (iii) discarding the exact‑match component harms both precision and recall, confirming that each component contributes meaningfully. Moreover, scaling experiments show a clear correlation between model size, definition quality, and overall alignment performance.
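The precision, recall, and F1 scores reported in these experiments are the standard set-based metrics over predicted versus reference mappings. A minimal sketch (the toy mapping pairs are invented for illustration):

```python
def prf1(predicted, reference):
    # Precision/recall/F1 over sets of (source, target) concept mappings.
    predicted, reference = set(predicted), set(reference)
    tp = len(predicted & reference)  # correctly predicted mappings
    precision = tp / len(predicted) if predicted else 0.0
    recall = tp / len(reference) if reference else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1

# Two of three predictions appear in the reference alignment.
p, r, f = prf1({("A", "X"), ("B", "Y"), ("C", "Z")},
               {("A", "X"), ("B", "Y"), ("D", "W")})
# p = 2/3, r = 2/3, f = 2/3
```

This makes the ablation findings concrete: dropping definition generation shrinks the candidate pool (fewer true pairs found, lower recall), while dropping the LLM judgment lets more spurious pairs through (lower precision).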

The paper also introduces a dedicated evaluation protocol for the generated definitions, combining lexical similarity metrics, alignment‑based indirect measures, and an “LLM‑as‑judge” semantic assessment to quantify correctness and discriminativeness.
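For the lexical-similarity component of that protocol, one common choice is word-level overlap between a generated definition and a reference description. The paper does not specify its exact metrics, so the Jaccard measure below is only an illustrative example of this class:

```python
def token_jaccard(a, b):
    # Illustrative lexical-overlap metric (not necessarily the paper's):
    # Jaccard similarity over lowercased word sets.
    ta, tb = set(a.lower().split()), set(b.lower().split())
    return len(ta & tb) / len(ta | tb) if ta | tb else 0.0

score = token_jaccard(
    "a chronic disease of the pancreas",
    "chronic inflammatory disease of the pancreas")
# 5 shared words out of 7 distinct words -> 5/7
```

The alignment-based indirect measures and the LLM-as-judge assessment then cover what lexical overlap misses: whether the definition actually helps match the right concept, and whether it is semantically correct and discriminative.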

Limitations are acknowledged: LLMs can hallucinate, large models incur significant computational cost, and the current implementation focuses solely on equivalence detection, leaving subsumption and relatedness for future work. The authors suggest integrating richer structural cues, exploring distilled or quantized LLMs to reduce inference overhead, and extending the framework to multi‑relation matching.

In summary, GenOM demonstrates that LLM‑generated textual definitions can effectively bridge the semantic gap inherent in biomedical ontologies, and that a hybrid pipeline combining LLM reasoning with traditional exact‑match heuristics yields robust, state‑of‑the‑art performance on large‑scale ontology alignment tasks.

