Semantics and linked data are becoming increasingly mainstream, yet there is still confusion about how ontologies differ from taxonomies. According to Kurt Cagle, writer, data scientist and futurist, understanding this distinction is important: it is critical for making decisions about metadata management and affects anyone who deals with enterprise-level data.
The difference between ontologies and taxonomies can be described succinctly: ontologies specify, while taxonomies classify. To elaborate, an ontology is a formal framework for describing anything, not just a taxonomy. It achieves this by establishing the classes, relationships and constraints that act on the concepts and entities within a given system. In comparison, a taxonomy provides the terms or categories that can be used to describe a given entity. Taxonomies frequently describe one or more orthogonal dimensions, each of which provides narrower or broader classification.
In effect, an ontology is the system of classes and relationships that describes the structure of data. It comprises the rules that prescribe how a new category or entity is created, how attributes are defined, and how constraints are established. In this sense, a database schema represents an ontology: it is an ontology for creating records that satisfy the constraints of that database. In this instance, the ontology is not the data itself. It is the system of columns and tables, described as classes, that each row uses, together with each primary key/foreign key relationship.
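The schema-as-ontology idea can be illustrated with a minimal sketch using Python's built-in sqlite3 module. The table and column names below are hypothetical and chosen only for illustration. The schema states which records may exist, and rows that break its constraints are rejected.

```python
import sqlite3

# Hypothetical schema: table and column names are illustrative assumptions.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves foreign keys unenforced by default
conn.executescript("""
    CREATE TABLE author (
        id   INTEGER PRIMARY KEY,
        name TEXT NOT NULL
    );
    CREATE TABLE book (
        id        INTEGER PRIMARY KEY,
        title     TEXT NOT NULL,
        author_id INTEGER NOT NULL REFERENCES author(id)
    );
""")

# Rows are the data; they must conform to the structure defined above.
conn.execute("INSERT INTO author (id, name) VALUES (1, 'Ada Example')")
conn.execute("INSERT INTO book (title, author_id) VALUES ('A Sample Title', 1)")


def try_insert(sql):
    """Return True if the row satisfies the schema, False if a constraint rejects it."""
    try:
        conn.execute(sql)
        return True
    except sqlite3.IntegrityError:
        return False


# A book that points at a non-existent author violates the foreign key.
orphan_ok = try_insert("INSERT INTO book (title, author_id) VALUES ('Orphan', 99)")
# A book without a title violates the NOT NULL constraint.
untitled_ok = try_insert("INSERT INTO book (title, author_id) VALUES (NULL, 1)")

book_count = conn.execute("SELECT COUNT(*) FROM book").fetchone()[0]
print(orphan_ok, untitled_ok, book_count)
```

Only the one valid book row is stored. The schema itself holds no data, which mirrors the point that the ontology is the set of rules rather than the records.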
Ontologies are axiomatic in nature. Therefore, one ontology can be used to build another. For instance, the Resource Description Framework (RDF) provides a minimal set of classes and constraints that are in turn used by other ontologies. One of the most notable among them is the Web Ontology Language (OWL) specification, which includes a number of logical extensions for describing a comprehensive logical framework. These frameworks can be represented in a number of ways, including XML, JSON, Turtle (short for Terse RDF Triple Language), and functional or Manchester notation. More recently, SHACL (Shapes Constraint Language) has been defined. It provides a more schematic approach to modeling and shares much of its notation with OWL.
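As a rough sketch of how one vocabulary builds on another, the snippet below uses plain Python tuples to state three facts with terms from the RDF, RDF Schema and OWL namespaces. The `http://example.org/vocab#` namespace and its `Publication` and `Book` classes are hypothetical. The statements are printed as N-Triples lines, a line-based syntax that is also valid Turtle. A real project would more likely use an RDF library than hand-built strings.

```python
# Standard W3C namespaces for RDF, RDF Schema and OWL.
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RDFS = "http://www.w3.org/2000/01/rdf-schema#"
OWL = "http://www.w3.org/2002/07/owl#"
# Hypothetical namespace for the example vocabulary.
EX = "http://example.org/vocab#"

# Each statement is a (subject, predicate, object) triple of IRIs.
triples = [
    (EX + "Publication", RDF + "type", OWL + "Class"),
    (EX + "Book", RDF + "type", OWL + "Class"),
    (EX + "Book", RDFS + "subClassOf", EX + "Publication"),
]


def to_ntriples(statements):
    """Serialize IRI-only triples, one statement per line."""
    return "\n".join(f"<{s}> <{p}> <{o}> ." for s, p, o in statements)


serialized = to_ntriples(triples)
print(serialized)
```

The example vocabulary defines nothing from scratch: it declares its classes using `owl:Class` and relates them with `rdfs:subClassOf`, both of which are themselves described in terms of RDF.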
Significantly, at a working level, it is acceptable to replace the term ontology with data model, even though the terms are not strictly synonymous. Increasingly, ontologies are accepted as machine-readable data models and are treated on par with other modeling efforts.
In the case of taxonomy, one of its principal tenets is to minimize ambiguity. This accounts for the popularity of hierarchies as a classification tool. In fact, most taxonomies use some form of hierarchy, where the hierarchical trees provide a method for sub-classing. The challenge in using taxonomies arises from the fact that a given resource can be categorized in multiple ways, and those categorizations may sit in different branches. This is a significant challenge because most things can be described by several different facets of categorization at once.
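The multiple-branch problem can be sketched with a small hierarchy held in a Python dictionary. The terms and the sample resource are hypothetical. Each term has exactly one parent, yet the resource legitimately belongs under two unrelated roots.

```python
# Hypothetical taxonomy: each term maps to its broader (parent) term.
BROADER = {
    "Vehicles": None,
    "Cars": "Vehicles",
    "Electric cars": "Cars",
    "Energy": None,
    "Batteries": "Energy",
}


def path_to_root(term):
    """Walk from a term up through its parents to the top of its branch."""
    path = []
    while term is not None:
        path.append(term)
        term = BROADER[term]
    return path


# One resource, two valid categorizations drawn from different facets.
resource_terms = ["Electric cars", "Batteries"]

for term in resource_terms:
    print(" > ".join(reversed(path_to_root(term))))

# The set of top-level branches the resource ends up in.
branches = {path_to_root(term)[-1] for term in resource_terms}
print(branches)
```

A strict single-parent tree cannot place the resource in one spot, so either the resource is listed under several branches or one of the categorizations is lost.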
This does not mean taxonomies do not have their uses. Increasingly, taxonomies are being written using ontologies and ontological principles. This has several advantages. First, it can help differentiate between entities and categorizations when pure hierarchical taxonomies are used. An ontology that indicates that something is, at its core, a term in a taxonomy can be extended to do all the things that can be done with a term. Furthermore, the relationships associated with that term can be retained. A pure taxonomic system would be unable to capture this nuance, whereas it can be added in an ontology without much effort.
In the future, ontology-enabled taxonomy tools will make it possible to make inferences and to identify each concept uniquely with a global key. This will ensure that external data systems can refer to the same concepts and potentially duplicate complex data structures. Ultimately, standalone taxonomy tools will be replaced by ontologically oriented systems.
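A minimal sketch of both ideas, global keys and inference, is shown below. The `http://example.org/concept/` identifiers, the two systems and their record names are all hypothetical. Each concept is identified by an IRI that any system can reference, and a simple walk over the "broader" links derives facts that were never stated directly.

```python
# Hypothetical namespace; each concept gets one globally unique identifier.
EX = "http://example.org/concept/"

# Directly stated links from a concept to its broader concept.
broader = {
    EX + "electric-car": EX + "car",
    EX + "car": EX + "vehicle",
}


def infer_broader(concept):
    """Return every concept reachable by following broader links upward."""
    found = []
    current = broader.get(concept)
    # The membership check guards against accidental cycles in the data.
    while current is not None and current not in found:
        found.append(current)
        current = broader.get(current)
    return found


# Two hypothetical systems tag their own records with the same global key.
catalog = {"sku-1": EX + "electric-car"}
fleet_db = {"unit-7": EX + "electric-car"}

same_concept = catalog["sku-1"] == fleet_db["unit-7"]
# Never stated directly, but inferred: an electric car is a vehicle.
is_vehicle = EX + "vehicle" in infer_broader(catalog["sku-1"])
print(same_concept, is_vehicle)
```

Because both systems share the identifier rather than a local label, they agree on the concept without any mapping table, and the inferred ancestors apply to records in either system.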
However, taxonomists will still be needed to curate information. Rather than keeping track of terms and human definitions, their skill will be used to enrich metadata and make it both machine-readable and human-readable through semantic tools. This metadata can be used to drive processes, reduce data redundancy, and surface new information through inferencing and rich querying and updates.