
How BBC Online Intends to Drive Content Discovery


The BBC’s online portfolio is over 20 years old and consists of a rich and varied collection of websites and services. As the BBC rebuilds its digital portfolio around a new content strategy, one challenge is how to make more content discoverable and personalize it for more audiences. To address this, the BBC has decided to describe all of its content with detailed metadata, using the same terminology, tools, and data model throughout.

Consequently, the BBC is focusing on descriptive content metadata: tags that describe what an asset (e.g. an article, program, or TV/audio clip) is about, or who or what it mentions. Using its Dynamic Semantic Publishing data architecture, the BBC already uses these tags on its other websites to drive thousands of subject-based aggregations, or Topic pages. In addition, the BBC uses a mixture of genres and formats to support its online program information library.

In practice, the BBC has to contend with two data silos and their limitations. Therefore, to offer its audiences pan-BBC experiences or an in-depth understanding of the content, the BBC is doing two more things. First, its Digital Publishing team is developing tools that make content description possible at all stages of production across its online portfolio. Second, it is creating new vocabularies that allow for richer descriptions of the content.

To make this possible, the BBC has developed the concept that every piece of content has a basic set of common metadata associated with it, which it carries wherever it is surfaced across the BBC’s portfolio. This set of basic common metadata is stored in what the BBC calls a ‘Passport’. To create and manage this metadata, the BBC has developed a tool called Passport Control.

Furthermore, the BBC is using the simple graph model first developed for its 2012 Olympics coverage to create subject-predicate-object triples that describe the nature of the relationship between an asset (the subject) and a tag (the object). This kind of subject-based tagging is well established at the BBC, especially in its journalism output.
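
As a rough illustration of the subject-predicate-object idea, the sketch below stores a few triples in a plain Python list and looks up the assets linked to a tag. It is not the BBC's implementation: the `Triple` class, the `assets_with` helper, and every identifier such as `asset:article-123` are hypothetical names invented for this example. Only the "about" and "mentions" relationships come from the description above.

```python
from typing import List, NamedTuple


class Triple(NamedTuple):
    """One subject-predicate-object statement about a piece of content."""
    subject: str    # the asset being described
    predicate: str  # the nature of the relationship
    obj: str        # the tag the asset is linked to


# Hypothetical identifiers, invented for this example.
triples = [
    Triple("asset:article-123", "about", "tag:london-2012-olympics"),
    Triple("asset:article-123", "mentions", "tag:usain-bolt"),
    Triple("asset:clip-456", "about", "tag:london-2012-olympics"),
]


def assets_with(predicate: str, tag: str, store: List[Triple]) -> List[str]:
    """Return the subjects linked to `tag` through `predicate`, in stored order."""
    return [t.subject for t in store if t.predicate == predicate and t.obj == tag]


print(assets_with("about", "tag:london-2012-olympics", triples))
```

A lookup like this is roughly what a subject-based aggregation page needs: every asset that is "about" a given tag. A production system would more likely use a dedicated graph or triple store than an in-memory list.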

In addition to subject tagging, the BBC has added three new predicates: “editorial tone”, “intended audience”, and “genre”. Each predicate is used with an associated controlled vocabulary of terms. In some cases these controlled vocabularies are taxonomic hierarchies (like genres), while in others they are simple lists of terms developed to describe the output in ways that make sense to the BBC and its audience.
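
The difference between a flat controlled vocabulary and a taxonomic hierarchy can be sketched as follows. The three predicate names come from the text, but every term listed here is a made-up placeholder, since the article does not give the BBC's actual term lists. The helpers `is_valid` and `genre_path` are likewise hypothetical.

```python
# Hypothetical terms; the real BBC vocabularies are not given in the article.
FLAT_VOCABULARIES = {
    "editorial tone": {"serious", "humorous", "inspiring"},
    "intended audience": {"children", "young adults", "general"},
}

# A taxonomic hierarchy: each genre maps to its parent (None marks a top-level genre).
GENRE_PARENTS = {
    "factual": None,
    "science and nature": "factual",
    "comedy": None,
    "sitcoms": "comedy",
}


def is_valid(predicate, term):
    """Check that `term` belongs to the controlled vocabulary for `predicate`."""
    if predicate == "genre":
        return term in GENRE_PARENTS
    return term in FLAT_VOCABULARIES.get(predicate, set())


def genre_path(term):
    """Walk up the hierarchy and return the path from the top-level genre to `term`."""
    path = []
    while term is not None:
        path.append(term)
        term = GENRE_PARENTS[term]
    return list(reversed(path))


print(is_valid("editorial tone", "humorous"))
print(genre_path("sitcoms"))
```

Keeping the hierarchy as a child-to-parent mapping means that an asset tagged with a narrow genre can also be found under its broader parent genre, which a flat list cannot express.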

These new types of metadata can be used to build much richer collections of content, as either manual editorial curations or algorithmically generated recommendations. With the amount and variety of material that the BBC produces, from news articles to music mixes and from live events to boxsets, it is in a good position to provide content for all kinds of different tastes.
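
One simple way shared metadata could feed an algorithmic recommendation is to rank assets by how many tags they have in common with a starting asset. The sketch below shows that idea only. It is an assumption for illustration, not the BBC's recommendation method, and the `catalogue` contents and the `recommend` function are hypothetical.

```python
# Hypothetical catalogue: each asset maps to a set of (predicate, term) pairs.
catalogue = {
    "news-article": {("about", "space"), ("editorial tone", "serious"), ("genre", "news")},
    "documentary": {("about", "space"), ("editorial tone", "serious"), ("genre", "factual")},
    "music-mix": {("about", "jazz"), ("editorial tone", "relaxed"), ("genre", "music")},
    "podcast": {("about", "space"), ("editorial tone", "humorous"), ("genre", "factual")},
}


def recommend(seed, items, limit=3):
    """Rank other assets by how many (predicate, term) pairs they share with `seed`."""
    seed_tags = items[seed]
    scored = [
        (len(seed_tags & tags), name)
        for name, tags in items.items()
        if name != seed
    ]
    # Highest overlap first; ties are broken alphabetically for a stable order.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    # Assets with no tags in common are left out entirely.
    return [name for score, name in scored if score > 0][:limit]


print(recommend("news-article", catalogue))
```

Because every asset is described with the same predicates and vocabularies, the overlap can be computed across very different content types, such as an article and a podcast.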

The original article was published on the BBC’s Technology & Creativity Blog.
