BBC interlinks with DBpedia

09 Sep 2008

The BBC is starting to embrace the Semantic Web. We were recently commissioned to create links between DBpedia and an internal BBC vocabulary, which enable the BBC to use DBpedia/Wikipedia as a controlled vocabulary. This allows them to suggest related content to their users across their multitude of content management systems (we hear there are 36 systems in use at the moment) and better integrate content from the web into their properties. This also means that third parties will gain access to BBC metadata and content in the very near future. Skeptics beware, this is reaching a tipping point!

7 Responses to “BBC interlinks with DBpedia”


  1. Yves Posted 2008-09-09 - 4.10 am

    Awesome!!
    I am so sorry to not have been able to make it last week, looks like it was two great days :-)
    Wrt. the interlinking algorithm - do you use Lucene to index only on literals? Does it work when two things have the same name?

    I suggested to Georgi some time ago that a more robust and general algorithm could perhaps index on adjacent literals (e.g. literals that are one resource away). This would sort of implement a light version of the algorithm I presented at LDOW: it would give one more clue for disambiguating things that have the same name. If it still gives poor results, you can index on three adjacent literals, etc.

  2. Christian Becker Posted 2008-09-09 - 4.30 am

    Hi Yves,
    Lucene indexes labels and article categories/templates. We search on the labels; the categories/templates are then restricted according to a manually determined list that depends on the dataset (like in your algorithm), as well as to the automatically determined class equivalents (see the “Mary” example).
    What are your criteria for saying that a resource is adjacent - does it have to share a category?
    Sorry there wasn’t much time to meet up, we were sort of jetlagging and rushing to do some sightseeing ;)
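
To make the label-plus-category idea above concrete, here is a minimal sketch in plain Python. It does not use Lucene and is not the code behind the interlinking described in this post: the resource URIs, labels, categories and function names are all invented for illustration. It searches on labels only, then keeps the hits whose categories overlap a manually chosen allow-list.

```python
import re
from collections import defaultdict

# Hypothetical records: (resource URI, label, categories). Not real data.
RESOURCES = [
    ("dbpedia:Mercury_(planet)", "Mercury", {"Planets"}),
    ("dbpedia:Mercury_(element)", "Mercury", {"Chemical elements"}),
    ("dbpedia:Venus", "Venus", {"Planets"}),
]


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return [t for t in re.split(r"\W+", text.lower()) if t]


def build_label_index(resources):
    """Map each label token to the URIs it occurs in, and keep categories per URI."""
    index = defaultdict(set)
    categories = {}
    for uri, label, cats in resources:
        categories[uri] = set(cats)
        for token in tokenize(label):
            index[token].add(uri)
    return index, categories


def find_candidates(query, index, categories, allowed_categories):
    """Search on labels, then keep only hits in an allowed category."""
    tokens = tokenize(query)
    if not tokens:
        return []
    # A hit must contain every query token in its label.
    hits = set.intersection(*(index.get(t, set()) for t in tokens))
    return sorted(u for u in hits if categories[u] & allowed_categories)


INDEX, CATEGORIES = build_label_index(RESOURCES)
PLANET_HITS = find_candidates("Mercury", INDEX, CATEGORIES, {"Planets"})
```

With the allow-list set to `{"Planets"}`, the two resources labelled "Mercury" are narrowed down to the planet; a different dataset would get a different allow-list.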

  3. Yves Posted 2008-09-09 - 4.39 am

    No, it doesn’t have to. For example, to reuse the example from the LDOW article, you may index “Both Simple Exercice” instead of “Both” and “Simple Exercice” separately. In that case, Both is an artist and Simple Exercice is one of its records. You may also index “Both artist” (the rdfs:label of Both and of mo:MusicArtist), etc.
    All of that gives more clues for finding a particular resource.
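
The adjacent-literal idea from this comment can be sketched as follows. This is only an illustration under stated assumptions, not the LDOW algorithm itself: the `ex:` URIs, the second resource named "Both" and its class are hypothetical, and links are treated as undirected. Candidates sharing a label are ranked by how many context words appear in the literals one resource away.

```python
import re

# (subject, literal) pairs. "Both", "Simple Exercice" and mo:MusicArtist come
# from the comment above; the ex: URIs and the namesake are hypothetical.
LITERALS = [
    ("ex:both", "Both"),
    ("ex:simple_exercice", "Simple Exercice"),
    ("mo:MusicArtist", "artist"),
    ("ex:both_word", "Both"),          # invented resource with the same name
    ("ex:Determiner", "determiner"),   # invented class label
]

# (subject, object) links between resources, predicates omitted for brevity.
LINKS = [
    ("ex:both", "mo:MusicArtist"),
    ("ex:both", "ex:simple_exercice"),
    ("ex:both_word", "ex:Determiner"),
]


def tokens(text):
    """Lowercase word tokens of a literal, as a set."""
    return {t for t in re.split(r"\W+", text.lower()) if t}


def adjacent_tokens(uri):
    """Tokens of all literals attached to resources one link away from uri."""
    neighbours = {o for s, o in LINKS if s == uri} | {s for s, o in LINKS if o == uri}
    found = set()
    for subject, text in LITERALS:
        if subject in neighbours:
            found |= tokens(text)
    return found


def disambiguate(label, context):
    """Rank resources carrying `label` by overlap between context and adjacent literals."""
    wanted = tokens(context)
    candidates = [s for s, text in LITERALS if text.lower() == label.lower()]
    scored = [(uri, len(wanted & adjacent_tokens(uri))) for uri in candidates]
    return sorted(scored, key=lambda pair: (-pair[1], pair[0]))


RANKED = disambiguate("Both", "Simple Exercice")
```

Here both candidates match the label "Both", but only the artist has "Simple Exercice" one resource away, so it ranks first. Indexing literals two or more resources away would extend `adjacent_tokens` in the same way.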

  4. Christian Becker Posted 2008-09-09 - 4.47 am

    Ah, missed that one :) My guess is that this works very well for episodes of series, songs from albums etc. that often don’t have individual Wikipedia articles and hence no DBpedia equivalent resources.
    But we’ll definitely try this out in the next iteration :)

Who's linking?

  1. Interlinking the BBC and DBPedia Pingback on 2008-09-16
    "[...] Data community comes from Berlin student Christian Becker, who provides us with information about BBC Interlinking with DBPedia: ..."
  2. Images for the future - Research blog » blog archive » Interesting links Pingback on 2008-10-17
    "[...] De week van het Nationaal Archief 2. The BBC is starting to embrace the Semantic Web 3. Open Content ..."
  3. Images for the future - Research blog » blog archive » Interesting links digest Pingback on 2008-10-17
    "[...] 9. Online player trekt meer kijker September 23rd 2008 1. De week van het Nationaal Archief 2. The BBC ..."

"BBC interlinks with DBpedia" is filed under BBC, DBpedia, Linked Data, RDF and Semantic Web. It was published in September 2008.


