Status & Environs
During practical RDF projects, one big challenge is always how to choose good URIs for your resources. The RDF standards say very little about this topic. There are some best practices and helpful recommendations, but they are scattered all over the web. Creating “cool URIs for the semantic web” is hard.
Richard Cyganiak, Max Völkel and I have written an article about how to choose cool URIs, filled with practical knowledge and background information about the problem and solutions. We have collected what we have learned during projects such as Semantic MediaWiki, DBpedia, D2R Server, Gnowsis, and Nepomuk. We hope that this article helps you or your students get started programming Semantic Web applications.
Included below is a highly selective set of thoughts that provide scans of the technology from varied, contrasting points of view. The entries are in rough chronological order. A few early notes provide a bit of perspective. Most are from the last two years.
Of special value is the book-length treatment by Tom Heath and Christian Bizer. Linked data: Evolving the Web into a Global Data Space is the most current how-to bible for the technology.
Several hard copy alternatives exist.
More information about the book is here.
Linked data as seen from various perspectives
Chris Bizer, Richard Cyganiak, Tom Heath
How to publish linked data on the web [superseded by: Linked data: evolving the web …]
Web 3.0: Chicken Farms on the Semantic Web
What we see in Web 3.0 is the Semantic Web community moving from arguing over chickens and eggs to creating its first real chicken farms. The technology might not yet be mature, but we’ve come a long way, and the progress promises to continue for a long time to come.
What is linked data?
The recent LinkedData Planet conference in NYC marked, I think, a real transition point. The conference signaled the beginning movement of the Linked Data approach from the research lab to the enterprise.
Linked Data is a set of best practices for publishing and deploying instance and class data using the RDF data model, naming the data objects using uniform resource identifiers (URIs), and exposing the data for access via the HTTP protocol, while emphasizing data interconnections, interrelationships and context useful to both humans and machine agents.
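The definition above can be made concrete with a small sketch in Python. It models RDF statements as plain (subject, predicate, object) tuples whose names are HTTP URIs, then picks out the objects that point to a different host, which is the "interconnection" part of the definition. The example.org URIs and both helper functions are hypothetical and chosen only for illustration; the FOAF property names are real, but a production system would use an RDF library rather than bare tuples.

```python
# A minimal sketch: RDF statements as (subject, predicate, object) tuples.
# The example.org URIs and both helpers are illustrative assumptions.
from urllib.parse import urlparse

FOAF = "http://xmlns.com/foaf/0.1/"

triples = [
    ("http://example.org/people/alice", FOAF + "name", "Alice"),
    ("http://example.org/people/alice", FOAF + "based_near",
     "http://dbpedia.org/resource/Berlin"),
    ("http://example.org/people/alice", FOAF + "knows",
     "http://example.org/people/bob"),
]


def is_http_uri(value):
    """True if the value looks like an HTTP(S) URI rather than a literal."""
    parts = urlparse(value)
    return parts.scheme in ("http", "https") and bool(parts.netloc)


def external_links(statements):
    """Return object URIs hosted on a different domain than their subject."""
    links = []
    for subject, _predicate, obj in statements:
        if is_http_uri(obj) and urlparse(obj).netloc != urlparse(subject).netloc:
            links.append(obj)
    return links
```

Here `external_links(triples)` picks out only the DBpedia URI: the name is a literal, and the `knows` link stays within the same host.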
Christian Bizer, Tom Heath, Tim Berners-Lee
Linked data – the story so far
The term Linked Data refers to a set of best practices for publishing and connecting structured data on the Web. These best practices have been adopted by an increasing number of data providers over the last three years, leading to the creation of a global data space containing billions of assertions – the Web of Data. In this article we present the concept and technical principles of Linked Data, and situate these within the broader context of related technological developments. We describe progress to date in publishing Linked Data on the Web, review applications that have been developed to exploit the Web of Data, and map out a research agenda for the Linked Data community as it moves forward.
Give me an R
Since writing that post I have been thinking, on and off, about why I feel so negative about it [the JISC e-Framework]. In the main I think it comes down to a fundamental change in architectural thinking over the last few years. [snip]
Perhaps SOA is still all the rage in ‘enterprise’ thinking, which is where the e-Framework seems to be most heavily focused, I don’t know? It’s not a world in which I dabble. But in the wider Web world it seems to me that all the interesting architectural thinking (at the technical level) is happening around REST and Linked Data at the moment. In short, the architectural focus has shifted from the ‘service’ to the ‘resource.’
So, it’s just about fashion then? No, not really – it’s about the way architectural thinking has evolved over the last 10 years or so. Ultimately, it’s that it seems more useful to think in resource-centric ways than it is to think in service-centric ways. Note that I’m not arguing that service oriented approaches have no role to play in our thinking at a business level, clearly they do – even in a resource oriented world. But I would suggest that if you adopt a technical architectural perspective that the world is service oriented at the outset then it is very hard to think in resource oriented terms later on and ultimately that is harmful to our use of the Web.
This is a quick sketch from a high-level business perspective. It provides an interesting contrast with the LOD communities’ outlook.
The Semantic Web standards – RDF, RDFS, OWL, SPARQL, and SKOS – are new technologies whose proponents hope will be adopted on a wide scale. In order for many of the Semantic Web scenarios speculated by the W3C to come to pass, these technologies have to be adopted on a large scale.
Karen Coyle
Library Technology Reports, Understanding the Semantic Web and RDA Vocabularies
accompanied by Sarah Bartlett’s (Talis) commentary: part 1, part 2, part 3, part 4
The library world has to date been only subconsciously aware of the size and shape of these forces, and has intermittently adopted defensive ideas in response.
There has been a bit of a manic-depressive character on the Web waves of late with respect to linked data. On the one hand, we have seen huzzahs and celebrations from the likes of ReadWriteWeb and SemanticWeb.com and, just concluded, the Linked Data on the Web (LDOW) workshop at WWW2010. This treatment has tended to tout the coming of the linked data era and to seek ideas about possible, cool linked data apps. This rise in visibility has been accompanied by much manic and excited discussion on various mailing lists.
On the other hand, we have seen much wringing of hands and gnashing of teeth for why linked data is not being used more and why the broader issue of the semantic Web is not seeing more uptake. This depressive “call to arms” has sometimes felt like ravings, with blame being given to the poor state of apps and user interfaces, to badly linked data, to the difficulty of publishing same. Actually using linked data for anything productive (other than single sources like DBpedia) still appears to be an issue.
- Expected harvesting diameter?
- Expected inference behavior?
- Finding good sources, vocabularies?
- Establishing backlinks, crosslinks?
- Easy-to-use generalized client?
- Smooth integration with HTML web?
- Business models?
The briefing is intended to provide a starting point for those within the teaching and learning community who may have come across the concept of semantic technologies and the Semantic Web but who do not regard themselves as experts and wish to learn more. Examples and links are provided as starting points for further exploration.
The briefing paper is supplemented by the blog post When is Linked Data not Linked Data? which provides a summary of the debate surrounding the definition and characteristics of Linked Data.
Richard Wallis (Talis)
Is your data 5 star?
Many start with a large spreadsheet, or database, that they have never published to anyone before and are unsurprisingly a little concerned when confronted with feverous cries to publish everything as Linked Open Data – Now!
… Relax – make yourself a mug of your favourite hot beverage and approach this rationally.
and from Ed Summers:
While perusing the minutes of today’s w3c egov telecon I noticed mention of Tim Berners-Lee’s Bag of Chips talk at the gov2.0 expo last week in Washington, DC. I actually enjoyed the talk not so much for the bag-of-chips example (which is good), but for the examination of Linked Data as part of a continuum of web publishing activities associated with gold stars, like the ones you got in school. Here they are:
- 1 star: make your stuff available on the web (whatever format)
- 2 stars: make it available as structured data (e.g., excel instead of image scan of a table)
- 3 stars: non-proprietary format (e.g., csv instead of excel)
- 4 stars: use URLs to identify things, so that people can point at your stuff
- 5 stars: link your data to other people’s data to provide context
I think it’s helpful to think of Linked Data in this context, and not to minimize (or trivialize) the effort and the importance of getting the first 3 stars.
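The five criteria listed above lend themselves to a small scoring sketch in Python. It assumes the stars are cumulative, so a dataset earns a star only if it also meets every earlier criterion; that reading, the `Dataset` fields, and the `star_rating` function are illustrative assumptions and not part of the talk being quoted.

```python
# Sketch: rate a dataset against the five criteria, assuming they are cumulative.
# The field names and the function are hypothetical, for illustration only.
from dataclasses import dataclass


@dataclass
class Dataset:
    on_web: bool        # available on the web in any format
    structured: bool    # structured data rather than, say, a scanned image
    open_format: bool   # non-proprietary format such as CSV
    uses_urls: bool     # things are identified with URLs
    linked: bool        # links out to other people's data


def star_rating(ds):
    """Count stars in order, stopping at the first criterion that is not met."""
    criteria = [ds.on_web, ds.structured, ds.open_format, ds.uses_urls, ds.linked]
    stars = 0
    for met in criteria:
        if not met:
            break
        stars += 1
    return stars
```

Under this reading, a CSV file published on the web rates three stars, while a scanned table rates one however it is identified or linked, which is why the first three stars deserve the attention the quote asks for.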
Ian Mulvany (Mendeley)
The Future of Knowledge Organisation on the Web, a one day conference
His take on the meeting (cited here) includes substantive commentary on each of the presenters’ varied takes on linked data:
- Nigel Shadbolt on Government Linked Data, a Tipping Point for the Semantic Web
- Antoine Isaac, SKOS and Linked Data
- Richard Wallis from Talis on The Linked Data Journey
- Steve Dale, Linked Data in Local Government: The Knowledge Hub
- Martin Hepp, Linked Data in E-Commerce, The GoodRelations Ontology
- Andy Powell, Eduserv, Linked Data – the long and winding road
- John Goodwin from Ordnance Survey, Linking to Geographic Data
- Andreas Blumauer, PoolParty: SKOS Thesaurus Management utilising Linked Data
- Bernard Vatant, Porting terminologies to the Semantic Web
Gillian Byrne and Lisa Goddard (Memorial Libraries, St. John’s, Newfoundland)
The Strongest Link: Libraries and Linked Data (in D-Lib Magazine)
Since 1999 the W3C has been working on a set of Semantic Web standards that have the potential to revolutionize web search. Also known as Linked Data, the Machine-Readable Web, the Web of Data, or Web 3.0, the Semantic Web relies on highly structured metadata that allow computers to understand the relationships between objects. Semantic web standards are complex, and difficult to conceptualize, but they offer solutions to many of the issues that plague libraries, including precise web search, authority control, classification, data portability, and disambiguation. This article will outline some of the benefits that linked data could have for libraries, will discuss some of the non-technical obstacles that we face in moving forward, and will finally offer suggestions for practical ways in which libraries can participate in the development of the semantic web.
We originally designed this course for government depts. working with data.gov.uk, refined based on our experience there and went on to deliver it to many teams throughout the BBC. It’s now been delivered dozens of times to interested groups and inside companies with no previous knowledge who want to get into this technology fast. In the spirit of sharing, the materials are freely available on the web and licensed under the Creative Commons Attribution License (CC-By).
Ian Davis (Talis)
Back to basics with linked data and HTTP
In the Semantic Web, it is not the Semantic which is novel, it is the Web
That quote, attributed to Chris Welty of IBM, is the one that best captures my outlook on the Semantic Web and Linked Data. The Web has connected people to information at an unprecedented rate and scale and comparisons to the impact of Caxton’s press are justified however trite they are. For the majority of people using the Web it’s a rich place full of stories, pictures, shops and encyclopedias but for us Web technologists we see that all of those marvelous things are enabled by the use of URIs, HTTP and various machine readable data formats.
That’s it really. No messing around with special status codes and redirects based on hard-to-pin-down concepts. No special types of URI that differ in meaning depending on what software you use. Just standard HTTP. When someone enters a URI in their browser or application, they get useful related information back. Moreover, the URI in their browser’s address bar is one they can use to refer to that resource in any context. They can bookmark it, send it in an email, use it in a SPARQL query or even write some of their own RDF with it. I like that kind of simplicity.
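The "just standard HTTP" lookup described above can be sketched with Python's standard library. The sketch only prepares an ordinary GET request for a resource URI and does not send it. The example.org URI, the `build_lookup` helper, and the use of an `Accept` header asking for Turtle are assumptions made for illustration; the quoted post does not prescribe them.

```python
# Sketch: prepare a plain HTTP GET for a resource URI, with no special machinery.
# The URI, the helper name and the Accept header are illustrative assumptions.
from urllib.request import Request


def build_lookup(uri, media_type="text/turtle"):
    """Prepare (but do not send) an ordinary GET request for a resource URI."""
    return Request(uri, headers={"Accept": media_type}, method="GET")


req = build_lookup("http://example.org/things/42")
# Sending it would be an ordinary call: urllib.request.urlopen(req)
```

The same URI string can then be reused unchanged in a bookmark, an email, a SPARQL query or hand-written RDF, which is the simplicity the quote values.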
Jane Stevenson (Linked Open Copac Archives Hub … i.e., library and archival data)
Assessing linked data
It has raised one question that often applies when dealing with something quite technical: how much should a manager (in my case an archivist managing an online archive service) be expected to understand the ‘technical’ aspects of something? This is a question I have spoken about and written about before; mainly in terms of what archivists (or other information professionals) in the digital age need to know in order to understand the implications of choices around things like data structure and software systems. In the case of Linked Data, I am still not sure how much I need to know about the detail of Linked Data, the RDF model, the use of RDF/XML, the benefits of other output formats, the application of stylesheets, etc. I have been thinking about how hard it is to create Linked Data …
The New Year’s almost here, and of course that brings with it a time to reflect on what’s been and muse on what’s ahead. To that end, the Semantic Web Blog asked some industry names to share their perspectives – and concerns about some of the direction, as well.
The rest of the post provides a set of predictions for the semantic web in 2011:
Cha-ching: there’s money to be made:
- Ivan Herman, Semantic Web Activity Lead, W3C
- Nick Ducoff, CEO and co-founder, Infochimps
- Mike Petit, Co-founder and CIO, OpenAmplify
- Marco Neumann, CEO, Kona (more in his blog here)
Fulfilling Enterprise Expectations Around the Semantic Web
- Christine Connors, founder and information strategist, TriviumRLG
- Graham G. Rong, chair, MIT Sloan CIO Symposium
- Seth Grimes, founder, Alta Plana
- Mark Montgomery, CEO, Kyield
About Those Customer Experiences…
- Sid Banerjee, CEO, Clarabridge
- Michelle de Haaff, CMO, Attensity
Where Search is Heading, Where Data is Going
- Alex Iskold, founder and CEO, AdaptiveBlue
- Marco Neumann, CEO, Kona
- John Breslin, lecturer and researcher at NUI Galway, Creator of SIOC
Semantics and the Social Web
- Nova Spivack, CEO of Lucid Ventures and LiveMatrix
Online Ads: Public Pressures, Real-Time Bidding, Brand Advertiser Buy-In Change Landscape
- Andy Ellenthal, CEO, Peer39
- J. Brooke Aker, CMO, ADmantX
Governments’ Hands In It All
- Ivan Herman, Semantic Web Activity Lead, W3C
- Mark Montgomery, founder and CEO, Kyield
Challenges, Obstacles, and Things That Go Bump in the Night
- William Mougayar, CEO, Eqentia
- Shion Deysarkar, CEO, 80legs
- Ivan Herman, Semantic Web Activity Lead, W3C
- David Siegel, entrepreneur and author, The Power of Pull
Paul Miller (semanticweb.com)
The semantic link podcast (monthly)
The panels I mean [to refer to] – the panels I love – are the ones where the panelists talk to one another and their audience. The ones where different perspectives are brought to the table and shared. The ones where knowledge and prior experience are self-evident. The ones where differences of perspective or opinion inform and enrich rather than disrupt and divide. The ones where the moderator knows to keep (reasonably) quiet. Them. They’re great, but sadly all too rare.
The impact of Linked Data will be greater than the World Wide Web it builds upon.
If we look back over the last decade and a half, which has moved us from those first monochrome, imageless, early web browsers to today’s web enabled and integrated world, we cannot deny the impact and changes that the web has brought upon us. It is sometimes hard to realise what is now part of our 21st Century existence – online commerce, social networking, and the likes of Google and Wikipedia – that had not been dreamed of those few short years ago. In basic terms, all of this has stemmed from the encoding of pages of text into HTML documents. The network effect of linking those documents together has been phenomenal. Each of those documents is backed by, or constructed from, many more individual facts and pieces of data. It is the potentially enormous further network effect of the emerging Linked Data powered Web of Data, publishing and interconnecting those facts and data, that I believe will impact our lives in ways we have yet to comprehend.
Tom Heath, Chris Bizer
Linked Data: Evolving the Web into a Global Data Space
The World Wide Web has enabled the creation of a global information space comprising linked documents. As the Web becomes ever more enmeshed with our daily lives, there is a growing desire for direct access to raw data not currently available on the Web or bound up in hypertext documents. Linked Data provides a publishing paradigm in which not only documents, but also data, can be a first class citizen of the Web, thereby enabling the extension of the Web with a global data space based on open standards – the Web of Data.
In this Synthesis lecture we provide readers with a detailed technical introduction to Linked Data. We begin by outlining the basic principles of Linked Data, including coverage of relevant aspects of Web architecture.
The remainder of the text is based around two main themes – the publication and consumption of Linked Data. Drawing on a practical Linked Data scenario, we provide guidance and best practices on:
- architectural approaches to publishing Linked Data;
- choosing URIs and vocabularies to identify and describe resources;
- deciding what data to return in a description of a resource on the Web;
- methods and frameworks for automated linking of data sets; and
- testing and debugging approaches for Linked Data deployments.
We give an overview of existing Linked Data applications and then examine the architectures that are used to consume Linked Data from the Web, alongside existing tools and frameworks that enable these. Readers can expect to gain a rich technical understanding of Linked Data fundamentals as the basis for application development, research or further study.
This American Library Association sponsored webinar provides a reasonable snapshot of the state of thinking about linked data at the technology grass-roots level in US libraries:
Our descriptions no longer stand alone!
Connect our data with the rest of the WEB
Allow others to reuse more easily
- FOAF, Geonames
- New York Times, Thomson Reuters
- Government data – data.gov
- British Broadcasting Corporation
- Other library, archive and museum data
Distributed bibliographic control environment
- Linking Data
- Focus on identification over description
Chris Bizer, Anja Jentzsch, Richard Cyganiak
State of the LOD Cloud
This document provides statistics about the structure and content of the LOD cloud. It also analyzes the extend [sic] to which LOD data sources implement nine best practices that are either recommended by W3C or have emerged within the LOD community.
All statistics within this document are based on the LOD data set catalog that is maintained on CKAN. If you spot any errors in the data describing the LOD data sets, it would be great if you would correct them directly on CKAN. For information on how to describe datasets on CKAN please refer to the Guidelines for Collecting Metadata on Linked Datasets in CKAN.
John Mark Ockerbloom
Open data’s role in transforming our bibliographic framework (Updated)
I’ve seen a fair bit of buzz about the Library of Congress’s recent announcement of a new initiative for transforming the bibliographic framework of the library community. …
There’s another very important driver not explicitly mentioned in the announcement: the rise of open library data. More and more bibliographic and other library-related data is now freely available and reusable online, and it enables all kinds of improvements in resource discovery.
… there’s more big news from OCLC. At last month’s Global Council meeting, Karen Calhoun gave a presentation saying they were considering letting members release WorldCat data under an open license.
Michael Kelly, Library Journal
Library of Congress May Begin Transitioning Away from MARC
The Library of Congress has announced that it is going to undertake a major reevaluation of bibliographic control in a move that could lead to a gradual transition away from the 40-year-old MARC 21 standard in which billions of metadata records are presently encoded.
“It’s a ten,” said Sally McCallum without hesitation when asked to rank the project’s scope and importance on a scale of one to ten. McCallum is chief of the Network Development and MARC Standards Office at LOC.
The goal of the Bibliographic Framework Transition Initiative is to determine “what is needed to transform our digital framework” in the light of technological changes and budgetary constraints, said Deanna B. Marcum, the library’s Associate Librarian for Library Services, who will lead the initiative. “It’s very important that we find a way to link library resources to the whole world of information resources not focusing exclusively on bibliographic information,” she said.
The only information that I heard was that there will be a complete plan, with timelines, by the end of September. I don’t think we’ll get anything “official” before that.