Ontologies are crucial for unlocking information. However, similar ontologies have often been created independently for different needs, reducing their interoperability. Critical for knowledge integration is the ability to reconcile terms from one ontology with related concepts in other ontologies. Performing ontology alignments can be complex, often requiring subject matter expertise to recognise context and nuance.
Several unsupervised and semi-supervised automated approaches have been developed to scale up mapping. These can be broadly classified into rules-based and machine learning techniques. Lexical mapping tools, including the open-source AgreementMakerLight (AML) and the Large-Scale Ontology Matching System (LSMatch), use algorithms to detect similarities between the labels and synonyms of corresponding ontology concepts. Whilst these approaches can achieve high precision, they work best when additional information about a concept is already available (synonyms, logical descriptions, existing classification, etc.), something that is often missing from flat lists or simple vocabularies.
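To make the lexical approach concrete, here is a minimal sketch of a naive label/synonym matcher in Python. It is illustrative only, not AML's or LSMatch's actual algorithm; the concept IDs, labels, and the 0.9 cut-off are all hypothetical.

```python
# Naive lexical matching sketch; not the real AML or LSMatch algorithm.
from difflib import SequenceMatcher

def lexical_score(a: str, b: str) -> float:
    """Normalised string similarity between two labels (0.0 to 1.0)."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

# Hypothetical source and target vocabularies: concept ID -> labels/synonyms
source = {"SRC:001": ["Abnormal heart morphology", "Cardiac abnormality"]}
target = {"TGT:042": ["abnormal heart morphology"],
          "TGT:107": ["abnormal kidney morphology"]}

for src_id, src_labels in source.items():
    for tgt_id, tgt_labels in target.items():
        # Take the best score over all label/synonym pairs
        best = max(lexical_score(s, t) for s in src_labels for t in tgt_labels)
        if best >= 0.9:  # hypothetical cut-off; tune for precision vs. recall
            print(f"{src_id} -> {tgt_id} (score {best:.2f})")
```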
Machine learning has also been used to identify the relatedness between concepts, which is particularly useful for lists that lack ontology structure and metadata. These techniques include Word2Vec, which uses word embeddings, and language-model approaches such as BERTMap, which fine-tunes BERT for mapping prediction and couples it with analysis of ontology structure and logic.
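As an illustration of embedding-based matching, the sketch below scores term pairs by the cosine similarity of their embeddings. It assumes the sentence-transformers package and the general-purpose all-MiniLM-L6-v2 model; this is not BERTMap itself, which additionally fine-tunes BERT and reasons over ontology structure.

```python
# Embedding-based matching sketch using sentence-transformers;
# model choice and term lists are illustrative assumptions.
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("all-MiniLM-L6-v2")

source_terms = ["myocardial infarction", "renal failure"]
target_terms = ["heart attack", "kidney failure", "liver disease"]

src_emb = model.encode(source_terms, convert_to_tensor=True)
tgt_emb = model.encode(target_terms, convert_to_tensor=True)

# Cosine similarity between every source/target pair
scores = util.cos_sim(src_emb, tgt_emb)
for i, term in enumerate(source_terms):
    j = scores[i].argmax().item()
    print(f"{term} -> {target_terms[j]} (cosine {scores[i][j].item():.2f})")
```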
The performance of classification models and mappings can be measured by their precision and recall. Precision can be seen as a measure of quality, whilst recall is a measure of quantity. Higher precision means that the algorithm returns more relevant results than irrelevant ones, and higher recall means that the algorithm returns most of the relevant results (whether or not irrelevant ones are also returned).
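A small worked example makes the two measures concrete; the predicted and reference mapping sets below are hypothetical.

```python
# Precision and recall for a set of predicted mappings (hypothetical data).
predicted = {("A", "X"), ("B", "Y"), ("C", "Z")}
reference = {("A", "X"), ("B", "Y"), ("D", "W")}

tp = len(predicted & reference)   # correct mappings that were found
precision = tp / len(predicted)   # fraction of returned mappings that are right
recall = tp / len(reference)      # fraction of true mappings that were found

print(f"precision={precision:.2f}, recall={recall:.2f}")  # 0.67 and 0.67
```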
Whilst there is a plethora of approaches available that generate large numbers of "likely" alignments, automated ontology mapping is not without its challenges. These include:

- Predictive mapping algorithms are not 100% accurate – some level of human curation is still required.
- Difficulty setting up and installing tools – getting started can be mind-boggling and cumbersome, particularly if you just want to align a short list to public standards.
- Usability – creating and viewing mappings isn't always simple.
- Poor performance over large terminologies – for larger vocabularies, for example ~100k terms and potentially 1M synonyms, it can be difficult to set up the right infrastructure to create and store mappings.
- Lack of flexibility – it isn't always easy to fine-tune mapping approaches; for example, depending on your data, you may want to trade some precision for better recall (see the sketch after this list).
- Missing versioning and maintenance – mappings produced using various techniques within the public domain are often not versioned or kept up to date.
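As a sketch of the precision/recall trade-off mentioned above, the example below sweeps a similarity threshold over a hypothetical set of scored candidate mappings: lowering the threshold admits more true mappings (higher recall) at the cost of more false ones (lower precision).

```python
# Threshold sweep over hypothetical scored candidates:
# (mapping, similarity score, is it actually correct?)
candidates = [
    (("A", "X"), 0.95, True),
    (("B", "Y"), 0.85, True),
    (("C", "Z"), 0.80, False),
    (("D", "W"), 0.60, True),
    (("E", "V"), 0.55, False),
]
total_true = sum(1 for _, _, ok in candidates if ok)

for threshold in (0.9, 0.75, 0.5):
    kept = [(m, ok) for m, s, ok in candidates if s >= threshold]
    tp = sum(1 for _, ok in kept if ok)
    precision = tp / len(kept) if kept else 0.0
    recall = tp / total_true
    print(f"threshold={threshold}: precision={precision:.2f}, recall={recall:.2f}")
# Output: precision falls (1.00 -> 0.67 -> 0.60) as recall rises (0.33 -> 0.67 -> 1.00)
```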
Thus, there is a need for a simple, user-friendly tool that allows users to map ontologies and select the target ontology they wish to map to.
This article was originally published by SciBite.