Science and Research Content

How to get a good ontology for enterprises


Enterprises have three options when they want an ontology. They can borrow from an open source domain, buy a license to a formal ontology set, or build their own. Each option has its own advantages and disadvantages. So, what should enterprises do? According to Kurt Cagle, principal consultant at Semantical, LLC, enterprises could solve this conundrum by combining all three options.

Until a few years ago, enterprises did not have an overarching ontology that they could borrow to describe things. Instead, there were just ontologies that described domains of information. Beyond a small number of primitive and frequently abused properties, there was a great deal of inconsistency between models.

However, things began looking up once Microsoft, Google and Yandex decided to use schema.org as the home of a new schema. The goal was to build the new schema organically, providing models for people, enterprises, addresses, literary works, and similar concepts. Despite the network effect and the resulting snowballing effort, schema.org still has holes. In addition, some of its design decisions may run counter to how an enterprise prefers to model things. Therefore, borrowing from an open source domain alone may not meet all of an enterprise's ontology requirements.

Alternatively, enterprises can buy a license to a formal ontology set. The license offers an enterprise a formal structure that standardizes how things are described within the enterprise. In addition, the ontology provides transient data, that is, how things are identified and how they change over time. This cuts down on the ETL (extract, transform, load) process dramatically, which also means there is less incentive to spend a significant amount of money on integration. It also has obvious implications for natural language processing, entity extraction from transcripts and recordings, and markup in media processing systems.

However, enterprises look to acquire an ontology either for its data models or for the data (categorization information) within the data models. Authorities such as corporate, research and governmental organizations supply taxonomic and data-centric information. Such information is very different from schematic information, as there will be multiple instances of the same information. This poses a significant challenge, as ontologies such as schema.org are schematic in nature. Therefore, for the present, enterprises that buy a license to a formal ontology set will be buying a custom model along with data. In addition, they can expect to see the models coalesce even if the data does not coalesce properly.

The question of whether it makes sense for an enterprise to build an ontology comes down to how the ontology will be used. If the goal of the enterprise is to build a data hub (a data lake rendered using triples), then attempting to force-fit an ontology will not serve the purpose. On the other hand, one benefit that emerges from the "dump everything into a triple store and see what sticks" method is that it can often help in developing a natural ontology, i.e., one that actually describes the data in its most natural form.
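As a rough illustration of how a "natural" ontology might surface from data that has simply been loaded as triples, the following minimal Python sketch groups the properties actually used by instances of each class. The triples, the `ex:` identifiers and the function name `infer_natural_schema` are all hypothetical and are not taken from the article; a real triple store would be queried rather than held in a list.

```python
from collections import defaultdict

# Hypothetical sample data: (subject, predicate, object) triples.
# All identifiers are illustrative assumptions.
TRIPLES = [
    ("ex:alice", "rdf:type", "ex:Person"),
    ("ex:alice", "ex:name", "Alice"),
    ("ex:alice", "ex:worksFor", "ex:acme"),
    ("ex:bob", "rdf:type", "ex:Person"),
    ("ex:bob", "ex:name", "Bob"),
    ("ex:acme", "rdf:type", "ex:Organization"),
    ("ex:acme", "ex:name", "Acme"),
]


def infer_natural_schema(triples):
    """Map each observed class to the sorted list of predicates its instances use."""
    # First pass: record the declared class(es) of every subject.
    types = defaultdict(set)
    for subject, predicate, obj in triples:
        if predicate == "rdf:type":
            types[subject].add(obj)

    # Second pass: attach every non-type predicate to the subject's classes.
    schema = defaultdict(set)
    for subject, predicate, obj in triples:
        if predicate == "rdf:type":
            continue
        for cls in types.get(subject, ()):
            schema[cls].add(predicate)

    return {cls: sorted(predicates) for cls, predicates in schema.items()}


schema = infer_natural_schema(TRIPLES)
for cls, predicates in sorted(schema.items()):
    print(cls, predicates)
```

The point of the sketch is only that the shape of the model is read off the data after loading, rather than imposed on it beforehand.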

At this point, it is worth noting that arbitrarily creating data models to serve as the foundation of an ontology invariably ends in failure. The key issue here is that what most people think about data is wrong. An ontology, at its core, consists of lists of things. The attribute (or atomic) properties that are attached to these things add texture, but from an ontological standpoint, such properties are essentially decoration. This is what makes ontology design powerful and at the same time complex.
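One way to picture the distinction drawn here is to separate the list of things from the attribute properties attached to them. The Python sketch below is a simplified illustration under stated assumptions: a "thing" is taken to be any identifier that appears as the subject of a triple, and any property whose value is not itself a thing is treated as an attribute. The data and the function name `split_things_and_attributes` are hypothetical.

```python
# Hypothetical triples; identifiers are illustrative only.
TRIPLES = [
    ("ex:alice", "ex:worksFor", "ex:acme"),
    ("ex:alice", "ex:name", "Alice"),
    ("ex:acme", "ex:locatedIn", "ex:seattle"),
    ("ex:acme", "ex:name", "Acme"),
    ("ex:seattle", "ex:name", "Seattle"),
]


def split_things_and_attributes(triples):
    """Return (things, links, attributes) for a list of triples.

    Assumption: a thing is any identifier used as a subject. A link points
    from one thing to another; an attribute carries a plain value.
    """
    things = sorted({subject for subject, _, _ in triples})
    thing_set = set(things)
    links = [t for t in triples if t[2] in thing_set]
    attributes = [t for t in triples if t[2] not in thing_set]
    return things, links, attributes


things, links, attributes = split_things_and_attributes(TRIPLES)
print("things:", things)
print("links between things:", links)
print("attribute properties:", attributes)
```

In this toy data, removing every attribute triple still leaves the same three things in the list, which is the sense in which such properties add texture rather than structure.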

Therefore, it follows that for a large enterprise it is realistic to employ all three options (borrow, buy and build) when creating an ontology. For an enterprise, an ontology is at its core a representation of its business. Hence, many facets will differ from those of other businesses. At the same time, many facets will be common to all businesses, particularly among those in the same industry. Accordingly, many common ontologies for the healthcare, entertainment and manufacturing sectors have been developed or are being developed. However, enterprises should also look to leverage schema.org because it has momentum and is evolving quickly to meet the varying business requirements of multiple industries.

