"The web must be a web, not a series of isolated data islands"

28.04.2008

Tom Heath

INTERVIEW: Tom Heath of Talis, an active member of the Linked Data community, discusses the growth and future of the Web of Data with SWC's Andreas Blumauer.

Web of Data - more than a marketing excuse

SWC: Sceptics might say that Web of Data is just another nice try to make the Semantic Web more popular. As Benjamin Nowack pointed out [1], the same old idea of a meaningful WWW comes in different flavours. So is there something really new about the 'Linked Data Web' compared to the 'Semantic Web'?

Tom Heath: The important thing to remember is that there is nothing particularly new about the idea of a meaningful Web, whatever the label one chooses to apply to the concept. In one of the original academic papers about the Web from the early nineties, Tim Berners-Lee and his co-authors mention a desire to see more machine-readable data on the Web -- what has become known as the Semantic Web is a clear development of this idea.

Far from being a cynical marketing exercise, the use of terms such as 'Linked Data' and 'Web of Data' simply represents a clarification of the intentions behind the Semantic Web vision. The label 'Semantic Web' has itself been a victim of semantics, which has not aided adoption of the underlying ideas.

How the Semantic Web develops over the next ten years remains to be seen, but in the meantime it's essential that we use labels that speak to people in clear terms and convey more of the key features that can lead to a more meaningful Web. 'Web of Data' does just that, and 'Linked Data' is the means by which we are reaching that goal.

The Future of the Linked Open Data Cloud

SWC: The growth rate of the LOD cloud is tremendous, as you have demonstrated [2]. One can expect that within just a couple of months this cloud will 'explode' due to network effects. This could mean that many new applications will be in place, but also that the cloud could fall apart, because many things will be represented by many different URIs that are no longer linked to each other. Which scenario do you think is more realistic?

Tom Heath: I'm confident that the Linked Data cloud will continue to grow at a phenomenal rate. In fact this growth in the cloud is such that representing the Web of Linked Data with the current 'cloud' visualisation maintained by Richard Cyganiak is becoming increasingly challenging. This Linked Data explosion, and any resulting 'explosion' of the cloud visualisation, should be welcomed with open arms.

I have no concerns that the Linked Data cloud will explode or dissipate through a lack of interlinking, for a number of reasons. Firstly, a prerequisite for joining this Web is the creation of links between new data sets and those that already exist -- a data set being available on the Web is not enough, it must also be *in* the Web. Secondly, I perceive an increased or renewed understanding within the Semantic Web community of the power of network effects, and the value that linking to existing hubs such as DBpedia and Geonames can bring. This value will ensure the Web remains a Web, not a series of isolated data islands.

The LOD Cloud in May 2008 (source: Richard Cyganiak)

The LOD Tag Cloud in May 2008

And one year ago:

LOD Tag Cloud 2007

Inconsistency in the global information space

SWC: Which strategies can you think of that could help make the LOD cloud a more or less consistent data set? Do you prefer a solely community-driven approach or a partly moderated one? What could such moderation look like?

Tom Heath: This question can be answered at a number of levels. In the first instance we need to bear in mind that inconsistency is an inevitable aspect of a global information space, and we should accept this property and the freedom it gives to enable massive growth in distributed publishing of data.

The one area where consistency is important is in the use of Web standards such as HTTP, URIs and RDF, and the adoption of community conventions for publishing Linked Data. When Tim Berners-Lee introduced the four principles of Linked Data, he laid the foundations for community norms that are now adopted within the community.

Some of the finer technical points are still debated, but the crucial aspect of these norms is that they have enabled people to take action over the last 18 months, the results of which are clearly visible.

Ontologies: The necessary glue in the Semantic Web?

SWC: Which role will 'good old core ontologies' like Cyc, DOLCE or SUMO play in the next evolutionary steps of the Semantic Web? Do you think they are completely out of date?

Tom Heath: I take a fairly data-centric view of the world, so these kinds of upper ontologies are not really my area of expertise. As a result I hesitate to criticise in any way. I can envisage a point in the medium to long term where these serve as necessary glue in the Semantic Web, but would need to see concrete use cases that demonstrate the value they add. For now I think it is essential that we concentrate on applications and scenarios that exploit a basic level of semantics in order to be useful.

Linked Open Data for companies

SWC: For many companies it is still a futuristic idea to expose data to the web and make money out of it. What (simple) business models can you think of which could also explain the value of a Web of Data to a media company, a publisher, a governmental organisation etc.?

Tom Heath: Each of these audiences requires a slightly different answer. A large, complex entity with a public service function and certain requirements on transparency and accountability, such as a governmental organisation, stands to gain from a Web of Data through greater fulfilment of its key role of serving the electorate.

With commercial enterprises the picture is a little more complex, but I can see great potential in the use of the Semantic Web as a means to drive traffic to channels with more conventional revenue streams. However, if we assume the existence of a Web of Data, which we can now begin to do, innovative new business models will arise that we could not previously anticipate, just as they did on the Web the first time around.

About Tom Heath

Tom Heath is a member of the Talis Platform team, which he joined in early 2008.

Tom completed his Ph.D. thesis at the Open University's Knowledge Media Institute (KMi), and is well known to the Semantic Web community as an active member of the Linked Data community, winner of last year's Semantic Web Challenge and the brains behind Revyu.com.

Tom's Ph.D. research focused on using the Semantic Web to support recommendation-seeking in social networks, and his understanding of this space will be put to good use in and around Talis.

Tom has just returned from China, where he was one of the Workshop Chairs for the WWW 2008 Workshop: Linked Data on the Web (LDOW2008) which took place in April 2008 in Beijing, China.

References

[1] Semantic Web Aliases. Results of a Twitter Discussion between Benjamin Nowack and Ian Davies

[2] The Linking Open Data Project: Bootstrapping the Web of Data. Talk presented by Tom Heath in February 2008 in Amsterdam (CATCH programme). PDF, 1.8 MB.

Talis Information Ltd. UK

LinkedData.org (Linked Data community)
