
Empusa: A Framework for the Development and Usage of Ontologies


The scientific community is increasingly adopting the Resource Description Framework (RDF) as a single unifying data model. Many datasets are now published in RDF because it improves the findability, accessibility, interoperability, and reusability of complex data. A single RDF link captures the relationship between one piece of data and another, which is the basis of linked data. The freedom of the RDF data model to link anything with everything has a downside, however: unintended links are easy to create.
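To make the downside concrete, here is a minimal sketch in plain Python. It models triples as tuples rather than using an RDF library, and every IRI and property name in it is invented for illustration. It shows that a bare triple store accepts an unintended link just as readily as an intended one.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    """One RDF-style statement: subject, predicate, object."""
    subject: str
    predicate: str
    obj: str


# Hypothetical namespace, used only for this illustration.
EX = "http://example.org/"

graph = {
    Triple(EX + "gene1", EX + "partOf", EX + "genome1"),  # intended link
    Triple(EX + "gene1", EX + "partOf", EX + "paper42"),  # unintended, yet accepted
}


def objects(graph, subject, predicate):
    """Return every object linked from `subject` via `predicate`, sorted."""
    return sorted(
        t.obj for t in graph
        if t.subject == subject and t.predicate == predicate
    )


linked = objects(graph, EX + "gene1", EX + "partOf")
print(linked)
```

Nothing in the data model itself distinguishes the two links; that distinction has to come from an ontology or a validation layer.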

The appeal of linked data lies in this simplicity and flexibility. Ontologies help by describing the intended relationships and specifying the cardinalities of data, which guides the transformation of data into structured datasets with a high degree of interoperability. In practice, implementing an ontology correctly can be complex, as an ontology may contain thousands of elements describing diverse relationships.
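As a rough illustration of what specifying cardinalities means, the sketch below checks a handful of triples against a hand-written shape. The shape format, the property names (`locusTag`, `xref`), and the function `check_cardinality` are assumptions made for this example; they are not taken from GBOL or from any ShEx implementation.

```python
from collections import Counter

# Hypothetical shape for a gene: property -> (min, max); max of None means unbounded.
GENE_SHAPE = {"locusTag": (1, 1), "xref": (0, None)}


def check_cardinality(triples, subject, shape):
    """Return one message per property whose count falls outside its bounds."""
    counts = Counter(p for s, p, _ in triples if s == subject)
    errors = []
    for prop, (lo, hi) in shape.items():
        n = counts[prop]
        if n < lo or (hi is not None and n > hi):
            bound = f"{lo}..{'*' if hi is None else hi}"
            errors.append(f"{prop}: found {n}, expected {bound}")
    return errors


triples = [
    ("gene1", "locusTag", "b0001"),
    ("gene1", "locusTag", "b0002"),  # second value violates the maximum of 1
    ("gene1", "xref", "exampledb:123"),
]

errors = check_cardinality(triples, "gene1", GENE_SHAPE)
print(errors)
```

With thousands of elements in a real ontology, checks of this kind are impractical to write by hand, which is the gap a code generator is meant to close.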

Empusa was developed to make ontologies easier to understand and to ensure that data scientists use them correctly. Empusa is a code generator for ontologies. Its first application is the Genome Biology Ontology Language (GBOL), an extendable ontology for genome annotation and the first ontology developed using Empusa. Empusa uses a combination of the W3C Web Ontology Language (OWL) and Shape Expressions (ShEx), which enables rapid ontology development, and it generates an application programming interface (API) that ensures the ontology is used as its designer intended. Empusa can also generate markdown files, which can be compiled into a website, making it easy to provide resolvable uniform resource locators (URLs) for the designed ontology.
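The idea of an API that enforces the designer's intent can be sketched as follows. This is a hand-written Python stand-in, not output from Empusa, and the class `Gene`, its methods, and the `locusTag` property are hypothetical. It only illustrates the principle: typed setters reject wrong values, and serialization refuses to emit an object that is missing a required property.

```python
class Gene:
    """Hypothetical stand-in for a class that a code generator might emit."""

    def __init__(self, iri: str):
        self.iri = iri
        self._locus_tag = None

    def set_locus_tag(self, value):
        # Type check derived from the (assumed) ontology definition.
        if not isinstance(value, str):
            raise TypeError("locusTag must be a string")
        self._locus_tag = value

    def to_triples(self):
        # Cardinality check: locusTag is assumed to be required exactly once.
        if self._locus_tag is None:
            raise ValueError("locusTag is required")
        return [
            (self.iri, "rdf:type", "Gene"),
            (self.iri, "locusTag", self._locus_tag),
        ]


gene = Gene("http://example.org/gene1")
gene.set_locus_tag("b0001")
triples = gene.to_triples()
print(triples)
```

Because users only touch the typed methods, mistakes surface as errors when the data is written rather than as silent, unintended links in the resulting dataset.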

By leveraging Empusa, the scientific community can develop high-quality ontologies and, through the generated API, enable their correct usage. GBOL, the new ontology developed with it, makes it feasible to capture complex genomic information.

The original article was published by Springer Nature.
