21 October 2008

Frantic Semantics: DOAP, FOAF, DOAC, SIOC and DC

This is the third and final post in a series on RDF and the Semantic Web. The first post is here and the second one here.

So far, we've looked at what RDF is for and how it works. This post rounds things off with a look at some popular RDF vocabularies.

FOAF

FOAF (standing for 'Friend of a Friend') is an RDF vocabulary that allows creators of RDF to describe themselves and the people they know. There are tags for providing your contact details - should you wish people to know them - expressing your interests, showing off your online accounts, establishing your social ties, recording your projects and even detailing where you work or go to school. FOAF is far and away the best supported RDF vocabulary for describing people, with more tools built around it than any of its rivals, so if nothing else it's the most sensible place to start.
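
To make that a little more concrete, here is a rough Python sketch that reads a tiny FOAF profile using only the standard library. The people and example.com addresses are invented for illustration; the foaf:Person, foaf:name, foaf:homepage and foaf:knows terms come from the FOAF vocabulary, but check the FOAF specification before relying on them.

```python
import xml.etree.ElementTree as ET

# Prefix table used for the lookups below.
NS = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "foaf": "http://xmlns.com/foaf/0.1/",
}

# A made-up profile: Alice describes herself and one person she knows.
PROFILE = """<?xml version="1.0"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:foaf="http://xmlns.com/foaf/0.1/">
   <foaf:Person rdf:about="http://example.com/alice#me">
      <foaf:name>Alice Example</foaf:name>
      <foaf:homepage rdf:resource="http://example.com/alice/"/>
      <foaf:knows>
         <foaf:Person>
            <foaf:name>Bob Example</foaf:name>
         </foaf:Person>
      </foaf:knows>
   </foaf:Person>
</rdf:RDF>"""

root = ET.fromstring(PROFILE)
me = root.find("foaf:Person", NS)

# The profile owner's name, and the names of everyone they say they know.
name = me.findtext("foaf:name", namespaces=NS)
friends = [
    person.findtext("foaf:name", namespaces=NS)
    for person in me.findall("foaf:knows/foaf:Person", NS)
]

print(name, "knows", friends)
```

This treats the profile as plain XML, which is fine for a toy document; a proper RDF library would be the safer choice for real FOAF files, since the same data can be written in several different ways.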

DOAP

Description of a Project, or DOAP for short, is an RDF vocabulary for describing software projects. It allows you to give some informational text on the project, specify its homepage and - where applicable - its source code repository, and give credit where credit is due to those who develop and maintain it. It can also integrate seamlessly into FOAF, so you can show off all the wonderful things you work on within the comfort of your own profile document.

SIOC

The Semantically-Interlinked Online Communities project, otherwise known as SIOC (pronounced 'shock'), aims to establish a common RDF vocabulary for describing community sites such as blogs and forums, so that they can be linked together in new and exciting ways. Every Site can host several Forums, each of which can in turn contain a number of Posts written by various Users. Again, this vocabulary can be mixed with others such as FOAF to give additional information.
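
The Site, Forum, Post and User chain can be sketched as a handful of triples in Python. This is only an illustration: the example.com URIs are invented, the objects helper is hypothetical, and the host_of, container_of and has_creator property names are my reading of the SIOC vocabulary, so check them against the SIOC specification.

```python
SIOC = "http://rdfs.org/sioc/ns#"

# Hypothetical site, forum, post and user URIs, invented for illustration.
SITE = "http://example.com/"
FORUM = "http://example.com/forum/general"
POST = "http://example.com/forum/general/post/1"
USER = "http://example.com/user/alice"

# Each entry is a (subject, predicate, object) triple.
triples = [
    (SITE, SIOC + "host_of", FORUM),        # the Site hosts a Forum
    (FORUM, SIOC + "container_of", POST),   # the Forum contains a Post
    (POST, SIOC + "has_creator", USER),     # the Post was written by a User
]


def objects(subject, predicate):
    """Return every object linked to the subject by the predicate."""
    return [o for s, p, o in triples if s == subject and p == predicate]


# Walk from the site down to the author of each post.
for forum in objects(SITE, SIOC + "host_of"):
    for post in objects(forum, SIOC + "container_of"):
        print(post, "written by", objects(post, SIOC + "has_creator"))
```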

DOAC

DOAC stands for Description Of A Career, and is an RDF method of putting your CV online. It encompasses experience, education and language skills, and when combined with FOAF is compatible with Europass, an initiative by the European Commission to help EU citizens display their qualifications clearly and simply.

DC

Dublin Core, or DC, is a collection of terms for providing general metadata about a resource, such as its title, creator, subject and date. Alongside FOAF, it is one of the most popular RDF vocabularies around and covers most of what you would want to record about a web resource. Extremely useful for large organisations with a lot of paperwork to deal with!

Further Semantic Antics: XML and all that

This is the second part of a short series on RDF and the Semantic Web. The first part can be found here.

RDF is usually written as XML (in a syntax called RDF/XML), but it isn't quite the same as other XML formats such as SVG or RSS 2.0. Let's take a look at a snippet of RDF code (I will assume you have a working knowledge of XML):

<?xml version="1.0"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:example="http://example.com/vocab/">
   <example:company rdf:about="http://acmeinc.com/ourcompany#us">
      <example:companyName>Acme Inc.</example:companyName>
      <example:hasEmployee>
         <example:person>
            <example:personName>John Smith</example:personName>
            <example:employeeID>385740</example:employeeID>
         </example:person>
      </example:hasEmployee>
   </example:company>
</rdf:RDF>

Confused yet? Let's go through it step by step:

  1. The first thing to notice about this document is the abundance of colons. This is thanks to XML Namespaces, which allow tags from different vocabularies to be used in the same document. (A vocabulary is a collection of tags used for a specific purpose.) A namespace is a URI with the last portion missing, which I'll call the base URI; appending the name of a tag to it produces a full URI that identifies that tag specifically. Instead of writing out the base URI over and over again whenever you want to use a tag in a particular namespace, you can assign it a short name. Now when you use a tag, you can write its URI as nameOfBaseURI:tagName. For example, if the base URI http://exampletwo.org/vocab/ has been assigned the name exampletwo, then the tag name exampletwo:thing represents the URI http://exampletwo.org/vocab/thing. If you were to visit that URI, it would probably give you a description of what the thing tag meant. Base URIs are usually assigned names in the outermost tag of the document, in this case the rdf:RDF tag, using the attribute xmlns:assignedName, where assignedName is whatever name you want to give to the base URI in question. In RDF, all tags must have a namespace.
  2. The second thing to point out is the way in which triples are represented. In the example above, the tag example:companyName is nested inside the example:company tag, meaning that the company (the resource named by the rdf:about attribute) is the subject of that particular triple whilst companyName is the predicate. Inside the example:companyName tag is a 'string literal', in other words a plain bit of text rather than a URI, and this is the object of the triple. Therefore, this triple means: "There is a company with the name 'Acme Inc.'." The second triple is much the same, but this time the object is not a string literal but an example:person tag, so the triple means: "There is a company that employs an example:person." The example:person tag is itself the subject of a third and fourth triple, which give us the name and ID number of the employee respectively.
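
The prefix trick from step 1 boils down to a lookup followed by string concatenation. Here is a minimal Python sketch of it; the NAMESPACES table and the expand function are names I've made up for illustration and aren't part of any RDF library.

```python
# Prefix-to-base-URI table, mirroring xmlns:rdf="..." and xmlns:exampletwo="..."
# declarations on the outermost tag of a document.
NAMESPACES = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "exampletwo": "http://exampletwo.org/vocab/",
}


def expand(qname: str) -> str:
    """Turn nameOfBaseURI:tagName into a full URI by appending the tag name."""
    prefix, tag_name = qname.split(":", 1)
    return NAMESPACES[prefix] + tag_name


print(expand("exampletwo:thing"))  # http://exampletwo.org/vocab/thing
print(expand("rdf:RDF"))           # http://www.w3.org/1999/02/22-rdf-syntax-ns#RDF
```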

If you wanted to express this RDF document in English, you would get something along the lines of:

There is a company with the name 'Acme Inc.' and with an employee with the name 'John Smith' and the employee ID 385740.
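
To check that reading, here is a rough Python sketch that pulls the four triples out of the snippet using only the standard library. It is a simplification that only understands the nesting used in this example: extract_triples and full_uri are hypothetical helpers, the _:b1 label for the unnamed person is my own convention, and it skips the extra type statements that a real RDF parser would also report for the company and person tags.

```python
import xml.etree.ElementTree as ET

RDF_ABOUT = "{http://www.w3.org/1999/02/22-rdf-syntax-ns#}about"

DOC = """<?xml version="1.0"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:example="http://example.com/vocab/">
   <example:company rdf:about="http://acmeinc.com/ourcompany#us">
      <example:companyName>Acme Inc.</example:companyName>
      <example:hasEmployee>
         <example:person>
            <example:personName>John Smith</example:personName>
            <example:employeeID>385740</example:employeeID>
         </example:person>
      </example:hasEmployee>
   </example:company>
</rdf:RDF>"""


def full_uri(tag):
    # ElementTree stores names as "{base-uri}tagName"; joining the two
    # halves gives the full URI of the tag.
    base, tag_name = tag[1:].split("}", 1)
    return base + tag_name


def extract_triples(xml_text):
    triples = []
    blank_count = 0

    def walk(node):
        nonlocal blank_count
        # The subject is the rdf:about URI, or a made-up label if there is none.
        subject = node.get(RDF_ABOUT)
        if subject is None:
            blank_count += 1
            subject = f"_:b{blank_count}"
        for prop in node:
            nested = list(prop)
            # A nested tag means the object is another resource;
            # otherwise the object is the string literal inside the tag.
            obj = walk(nested[0]) if nested else prop.text
            triples.append((subject, full_uri(prop.tag), obj))
        return subject

    for top in ET.fromstring(xml_text):
        walk(top)
    return triples


triples = extract_triples(DOC)
for triple in triples:
    print(triple)
```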

And so now you know the way RDF works. In the next post I'll talk about some popular RDF vocabularies.

Antics in Semantics: What exactly is the Semantic Web?

You may remember me talking about Microformats a long time ago. Just to fill you in on the details, Microformats are little nuggets of code that can slip unobtrusively into any piece of XHTML and inform software of what everything actually means: that one piece of the document is an address whilst another is a resume, or that one piece is an event whilst another is a review. Once the software can see the data on the page in a format it recognises, it can do a number of clever things with it, such as finding an address on Google Maps or adding an event to your calendar program.

That is Microformats in a nutshell, but this post isn't about them. Instead I'm going to be delving into the secrets of the bigger, more versatile, and altogether much stranger web language that is RDF.

RDF stands for Resource Description Framework, which just about sums up its raison d'être: describing resources. Resources aren't just web documents, though; anything can be a resource, regardless of whether or not it exists on the web or even in the real world. To identify a resource, the most common practice is to give it a URI (Uniform Resource Identifier). A URI is similar to a URL, except the former need not lead to an actual web document when typed into your browser. Since an address that doesn't lead anywhere isn't very helpful, it is recommended that any resource URIs you use in your RDF do represent an actual location on the web that can be visited for information on the resource.

So that's what RDF does, but how does it do it? The answer is triples, a concept whereby all information is separated into three-part statements containing a subject, predicate and object. The subject is always a resource, the object can be a resource or a piece of data such as a string, and the predicate is a property - represented by a resource - that links the subject and object together. An example of a triple in the English language would be: "Acme Inc. has John Smith as an employee." Here, 'Acme Inc.' is the subject, 'John Smith' is the object and the relationship 'has as an employee' is the predicate. But how do we express this in RDF? Find out in the next exciting instalment, coming out as soon as I've finished writing it!
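
In the meantime, the shape of a triple can be sketched in a few lines of Python. This is not RDF itself, just an illustration: the Triple type and the example.com URIs are invented for this post and aren't part of any RDF library.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    subject: str    # always a resource, identified by a URI
    predicate: str  # a property, itself identified by a URI
    obj: str        # another resource's URI, or plain data such as a string


# Hypothetical URIs, invented purely for illustration.
ACME = "http://example.com/companies/acme"
HAS_EMPLOYEE = "http://example.com/vocab/hasEmployee"
JOHN = "http://example.com/people/john-smith"

# "Acme Inc. has John Smith as an employee."
statement = Triple(subject=ACME, predicate=HAS_EMPLOYEE, obj=JOHN)

print(statement.subject)
print(statement.predicate)
print(statement.obj)
```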