The BBC is starting to embrace the Semantic Web. We were recently commissioned to create links between DBpedia and an internal BBC vocabulary, which enable the BBC to use DBpedia/Wikipedia as a controlled vocabulary. This allows them to suggest related content to their users across their multitude of content management systems (we hear there are 36 systems in use at the moment) and better integrate content from the web into their properties. This also means that third parties will gain access to BBC metadata and content in the very near future. Skeptics beware, this is reaching a tipping point!
BBC interlinks with DBpedia
09 Sep 08
7 Responses to “BBC interlinks with DBpedia”
Who's linking?
1. Pingback on 2008-09-16
"[...] Data community comes from Berlin student Christian Becker, who provides us with information about BBC Interlinking with DBPedia: ..."
2. Pingback on 2008-10-17
"[...] De week van het Nationaal Archief 2. The BBC is starting to embrace the Semantic Web 3. Open Content ..."
3. Pingback on 2008-10-17
"[...] 9. Online player trekt meer kijker September 23rd 2008 1. De week van het Nationaal Archief 2. The BBC ..."



Awesome!!
I am so sorry to not have been able to make it last week, looks like it was two great days
With regard to the interlinking algorithm: do you use Lucene to index only on literals? Does it work when two things have the same name?
I suggested to Georgi some time ago that a more robust and general algorithm could perhaps index on adjacent literals (e.g. literals that are one resource away). This would sort of implement a light version of the algorithm I presented at LDOW: it would give one more clue for disambiguating things that have the same name. If it still gives poor results, you can index on three adjacent literals, etc.
Hi Yves,
Lucene indexes labels and article categories/templates. We search on the labels; the categories/templates are restricted to a manually determined list that depends on the dataset (as in your algorithm), plus the automatically determined class equivalents (see the “Mary” example).
What are your criteria for saying that a resource is adjacent – does it have to share a category?
Sorry there wasn’t much time to meet up, we were sort of jet-lagged and rushing to do some sightseeing.
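To make the label search with category restriction a bit more concrete, here is a minimal sketch in plain Python. It does not use Lucene; a dictionary stands in for the index. The resource URIs, labels, category names and the `search` helper are all invented for illustration and are not taken from the actual interlinking code.

```python
from collections import defaultdict

# Hypothetical toy data: resource -> (label, categories). Not real DBpedia content.
RESOURCES = {
    "dbpedia:Mary_(album)": ("Mary", {"Albums"}),
    "dbpedia:Mary_(film)": ("Mary", {"Films"}),
    "dbpedia:Mary_I_of_England": ("Mary I of England", {"Monarchs"}),
}


def build_label_index(resources):
    """Map each lowercased label token to the resources whose label contains it."""
    index = defaultdict(set)
    for uri, (label, _categories) in resources.items():
        for token in label.lower().split():
            index[token].add(uri)
    return index


def search(index, resources, query, allowed_categories):
    """Return resources whose label matches every query token and
    that carry at least one of the allowed categories."""
    tokens = query.lower().split()
    if not tokens:
        return []
    candidates = set.intersection(*(index.get(t, set()) for t in tokens))
    return sorted(
        uri for uri in candidates
        if resources[uri][1] & allowed_categories
    )


index = build_label_index(RESOURCES)
# The allowed categories play the role of the manually determined list.
matches = search(index, RESOURCES, "Mary", {"Albums"})
```

With an allowed list of `{"Albums"}`, the query "Mary" only keeps the album entry, even though all three labels contain the token.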
No, it doesn’t have to. For example, you may index “Both Simple Exercice”, to reuse the example from the LDOW article, instead of “Both” and “Simple Exercice”. In that case, Both is an artist, and Simple Exercice one of its records. You may also index “Both artist” (rdfs:label of Both and of mo:MusicArtist), etc.
All of that gives more clues for finding a particular resource.
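A rough sketch of the adjacent-literal idea, in plain Python rather than Lucene: each resource is indexed under its own literals plus the literals of resources one link away. The triples, the `ex:` prefix, the `ex:made` predicate and the helper names are assumptions made up for this example, and only outgoing links are followed to keep it short.

```python
# Hypothetical triples (subject, predicate, object). Objects starting with
# "ex:" are resources, everything else is a literal. Two artists share a label.
TRIPLES = [
    ("ex:artist1", "rdfs:label", "Both"),
    ("ex:artist1", "ex:made", "ex:record1"),
    ("ex:record1", "rdfs:label", "Simple Exercice"),
    ("ex:artist2", "rdfs:label", "Both"),
    ("ex:artist2", "ex:made", "ex:record2"),
    ("ex:record2", "rdfs:label", "Another Record"),
]


def is_resource(term):
    return term.startswith("ex:")


def own_literals(triples, subject):
    """Literals attached directly to the subject."""
    return [o for s, _p, o in triples if s == subject and not is_resource(o)]


def index_text(triples, subject):
    """Concatenate the subject's literals with those one resource away."""
    parts = own_literals(triples, subject)
    for s, _p, o in triples:
        if s == subject and is_resource(o):
            parts.extend(own_literals(triples, o))
    return " ".join(parts)


def lookup(triples, query):
    """Return subjects whose indexed text contains every query token."""
    wanted = set(query.lower().split())
    subjects = sorted({s for s, _p, _o in triples})
    return [
        s for s in subjects
        if wanted <= set(index_text(triples, s).lower().split())
    ]


ambiguous = lookup(TRIPLES, "Both")
disambiguated = lookup(TRIPLES, "Both Simple Exercice")
```

Searching for "Both" alone matches both artists, while "Both Simple Exercice" matches only the artist linked to that record, which is the extra clue the adjacent literal provides.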
Ah, missed that one
My guess is that this works very well for episodes of series, songs from albums etc. that often don’t have individual Wikipedia articles and hence no DBpedia equivalent resources.
But we’ll definitely try this out in the next iteration