Introduction to working on LOD with Perl

Kjetil Kjernsmo <kjetil@kjernsmo.net>.

Presentation at the Linked Open Data Hackfest, Oslo, 2010-01-04.

http://www.perlrdf.org/slides/perlrdf-intro.xhtml

To install:

wget --no-check-certificate -O - http://cpanmin.us/ | perl - RDF::LinkedData

Creative Commons Attribution-Sharealike 3.0 Norway License.

Important modules

And many more of differing relevance and maturity.

Resource Description Framework (RDF)

  1. From two documents to a triple.
    • Subject - Predicate - Object
  2. Add semantics to the link.
  3. Generalize the URI to identify anything.
  4. The object can be a string (with language or datatype).
  5. A collection of triples can be named with a URI.

RDF::Trine

Node types:

Important nodes: RDF::Trine::Node::Literal, RDF::Trine::Node::Resource, e.g.:

my $subject = RDF::Trine::Node::Resource->new('http://example.org/aircraft/B787');
my $predicate = RDF::Trine::Node::Resource->new('http://purl.org/dc/terms/title');
my $object = RDF::Trine::Node::Literal->new('Boeing 787', 'en');

Other node types: RDF::Trine::Node::Literal::XML, RDF::Trine::Node::Blank, RDF::Trine::Node::Variable, RDF::Trine::Node::Nil
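As a rough sketch of some of the other node types, assuming RDF::Trine is installed: the following creates a blank node, a typed literal and a variable, then inspects them. The values ('2008-11-25', the variable name 'aircraft') are illustrative only.

```perl
use strict;
use warnings;
use RDF::Trine;

# A blank node; RDF::Trine generates an identifier when none is given.
my $blank = RDF::Trine::Node::Blank->new;

# A typed literal: the second argument (language) is undef,
# the third is the datatype URI.
my $date = RDF::Trine::Node::Literal->new(
    '2008-11-25', undef, 'http://www.w3.org/2001/XMLSchema#date');

# A variable, as used in query patterns.
my $var = RDF::Trine::Node::Variable->new('aircraft');

print "blank node\n"    if $blank->is_blank;
print "typed literal\n" if $date->is_literal and $date->has_datatype;
print $date->literal_value, ' has datatype ', $date->literal_datatype, "\n";
print 'variable named ', $var->name, "\n";
```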

RDF::Trine

Statement:

my $statement = RDF::Trine::Statement->new($subject, $predicate, $object);

There's also RDF::Trine::Statement::Quad.
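A quad is the code counterpart of "a collection of triples can be named with a URI". The sketch below assumes RDF::Trine is installed; the graph URI is hypothetical and chosen only for this example.

```perl
use strict;
use warnings;
use RDF::Trine;

my $subject   = RDF::Trine::Node::Resource->new('http://example.org/aircraft/B787');
my $predicate = RDF::Trine::Node::Resource->new('http://purl.org/dc/terms/title');
my $object    = RDF::Trine::Node::Literal->new('Boeing 787', 'en');

# Hypothetical graph name, used only for illustration.
my $graph = RDF::Trine::Node::Resource->new('http://example.org/graphs/aircraft');

# A quad is a triple plus a fourth node naming the graph it belongs to.
my $quad = RDF::Trine::Statement::Quad->new($subject, $predicate, $object, $graph);

print $quad->subject->uri_value, "\n";
print $quad->context->uri_value, "\n";
```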

RDF::Trine

Stores:

my $store = RDF::Trine::Store->new_with_config({
                                  storetype => 'DBI',
                                  name      => 'mymodel',
                                  dsn       => 'DBI:mysql:database=rdf',
                                  username  => 'dahut',
                                  password  => 'Str0ngPa55w0RD'
                                 });
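If no database is at hand, an in-memory store may be enough for trying things out. A minimal sketch, assuming RDF::Trine is installed; nothing is persisted once the program exits.

```perl
use strict;
use warnings;
use RDF::Trine;

# An in-memory store needs no database, credentials or configuration.
my $store = RDF::Trine::Store::Memory->new;

# A freshly created store holds no statements yet.
print 'Statements in the new store: ', $store->size, "\n";
```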

RDF::Trine

Models:

Various classes, for filtering, for doing unions properly, etc.

Typically, you would do:

my $model = RDF::Trine::Model->new($store);

or for experiments or temporary stuff:

my $model = RDF::Trine::Model->temporary_model;

Then, you can add your statement:

$model->add_statement($statement);
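Once a statement is in the model, it can be read back. The following is a sketch under the assumption that RDF::Trine is installed; it reuses the aircraft example from the node slides and passes undef as a wildcard.

```perl
use strict;
use warnings;
use RDF::Trine;

my $model   = RDF::Trine::Model->temporary_model;
my $subject = RDF::Trine::Node::Resource->new('http://example.org/aircraft/B787');
my $title   = RDF::Trine::Node::Resource->new('http://purl.org/dc/terms/title');

$model->add_statement(RDF::Trine::Statement->new(
    $subject, $title, RDF::Trine::Node::Literal->new('Boeing 787', 'en')));

# undef acts as a wildcard: match every statement about $subject.
my $iterator = $model->get_statements($subject, undef, undef);
while (my $statement = $iterator->next) {
    print $statement->as_string, "\n";
}

# objects() returns the matching object nodes directly.
my @titles = $model->objects($subject, $title);
print $titles[0]->literal_value, "\n";
```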

RDF::Trine

Parsers:

RDF/XML, Turtle, NQuads, NTriples, RDF/JSON, RDFa, TriG

Also, interface to Redland C parsers.

my $parser = RDF::Trine::Parser->new( 'turtle' );
$parser->parse_file_into_model( $base_uri, 'data.ttl', $model );
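A parser can also hand each statement to a callback instead of filling a model, which may be handy for large inputs. This is only a sketch, assuming RDF::Trine is installed; the Turtle snippet and base URI are made up for the example.

```perl
use strict;
use warnings;
use RDF::Trine;

my $turtle = <<'EOD';
@prefix dct: <http://purl.org/dc/terms/> .
<http://example.org/aircraft/B787> dct:title "Boeing 787"@en .
EOD

my $parser = RDF::Trine::Parser->new('turtle');

# parse() calls the handler once per statement instead of filling a model.
my @statements;
$parser->parse('http://example.org/', $turtle, sub {
    my $statement = shift;
    push @statements, $statement;
});

print scalar(@statements), " statement(s) parsed\n";
```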

RDF::Trine

Iterators:

Used to manipulate data or SPARQL results iteratively, typically:

my $iterator = RDF::Trine::Iterator::Graph->new( \@data );
while (my $statement = $iterator->next) {
       # do something with $statement
}
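To make the snippet above concrete, here is one way the @data array might be filled, assuming RDF::Trine is installed. The two aircraft URIs are hypothetical example data.

```perl
use strict;
use warnings;
use RDF::Trine;

my $title = RDF::Trine::Node::Resource->new('http://purl.org/dc/terms/title');

# Build the @data array: one statement per (hypothetical) aircraft.
my @data;
for my $name (qw(B787 A380)) {
    push @data, RDF::Trine::Statement->new(
        RDF::Trine::Node::Resource->new("http://example.org/aircraft/$name"),
        $title,
        RDF::Trine::Node::Literal->new($name),
    );
}

# Wrap the array in a graph iterator and walk through it.
my $iterator = RDF::Trine::Iterator::Graph->new(\@data);
my @seen;
while (my $statement = $iterator->next) {
    push @seen, $statement->object->literal_value;
}
print join(', ', @seen), "\n";
```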

RDF::Trine

Serializers:

RDF/XML, Turtle, NQuads, NTriples, RDF/JSON

From RDF::LinkedData:

my ($type, $s) = RDF::Trine::Serializer->negotiate(
                                         'request_headers' => $self->headers_in,
                                         base => $self->base,
                                         namespaces => $self->namespaces);
my $iterator = $model->bounded_description($node);
$output = $s->serialize_iterator_to_string ( $iterator );
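Outside a web context, where there are no request headers to negotiate on, a serializer can be picked by name. A minimal sketch, assuming RDF::Trine is installed; the dct prefix is an arbitrary choice for the example.

```perl
use strict;
use warnings;
use RDF::Trine;

my $model = RDF::Trine::Model->temporary_model;
$model->add_statement(RDF::Trine::Statement->new(
    RDF::Trine::Node::Resource->new('http://example.org/aircraft/B787'),
    RDF::Trine::Node::Resource->new('http://purl.org/dc/terms/title'),
    RDF::Trine::Node::Literal->new('Boeing 787', 'en')));

# Pick a serializer by name; the namespaces option only matters for
# syntaxes that support prefixes, such as Turtle.
my $serializer = RDF::Trine::Serializer->new('turtle',
    namespaces => { dct => 'http://purl.org/dc/terms/' });

my $output = $serializer->serialize_model_to_string($model);
print $output;
```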

RDF::Query

Leading complete SPARQL 1.1 implementation!

Has no store of its own; storage is handled entirely by RDF::Trine::Store.

Has a long list of classes:

You're not very likely to use most of this!

RDF::Query

use strict;
use warnings;
use RDF::Trine;
use RDF::Query;

my $model = RDF::Trine::Model->temporary_model;
my $parser = RDF::Trine::Parser->new( 'turtle' );

my $turtle =<<'EOD';
@prefix foaf: <http://xmlns.com/foaf/0.1/> .
@prefix rel: <http://purl.org/vocab/relationship/> .
@prefix bio: <http://purl.org/vocab/bio/0.1/> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .

<http://www.kjetil.kjernsmo.net/foaf#me> a foaf:Person ;
    foaf:name "Kjetil Kjernsmo" ;
    rel:parentOf <http://synne.kjernsmo.net/foaf#me> .

<http://synne.kjernsmo.net/foaf#me> a foaf:Person ;
    foaf:givenName "Synne" ;
    bio:event [ bio:date "2008-11-25"^^xsd:date ;
                        a bio:Birth ] .
EOD

$parser->parse_into_model( 'http://example.org/', $turtle, $model );

my $sparql = <<'EOQ';
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX rel: <http://purl.org/vocab/relationship/>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
SELECT ?childname WHERE {
        ?parent a foaf:Person ;
                rel:parentOf ?child .
          ?child  foaf:givenName ?childname .
        }
EOQ

my $query = RDF::Query->new( $sparql );
my $iterator = $query->execute( $model );
while (my $row = $iterator->next) {
    # Each row maps variable names to nodes; print one result per line.
    print $row->{ 'childname' }->as_string, "\n";
}
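A query string may fail to parse, so it can be worth checking the result of the constructor before calling execute. The sketch below assumes RDF::Query is installed and that a failed parse is reported either by new() returning undef (with the reason available from RDF::Query->error) or by an exception; the eval covers both cases. The broken query is deliberate.

```perl
use strict;
use warnings;
use RDF::Trine;
use RDF::Query;

my $model = RDF::Trine::Model->temporary_model;

# Deliberately broken query: the closing brace is missing.
my $sparql = 'SELECT ?s WHERE { ?s ?p ?o';

my $query = eval { RDF::Query->new($sparql) };
if ($query) {
    my $iterator = $query->execute($model);
    while (my $row = $iterator->next) {
        print $row->{ 's' }->as_string, "\n";
    }
} else {
    # Prefer the exception text, fall back to the parser's last error.
    my $reason = $@ || RDF::Query->error || 'unknown error';
    print "Could not parse query: $reason\n";
}
```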

Test::RDF

Always start by writing tests!

use Test::More tests => 3;
use Test::RDF;
is_valid_rdf('</foo> <http://www.w3.org/2000/01/rdf-schema#label> "This is a Another test"@en .'
, 'turtle', 'Valid turtle');
is_rdf('</foo> <http://www.w3.org/2000/01/rdf-schema#label> "This is a Another test"@en .', 'turtle', 
       '<?xml version="1.0" encoding="utf-8"?><rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"><rdf:Description rdf:about="/foo"><ns0:label xmlns:ns0="http://www.w3.org/2000/01/rdf-schema#" xml:lang="en">This is a Another test</ns0:label></rdf:Description></rdf:RDF>', 
       'rdfxml', 'Compare RDF/XML and Turtle');

isomorph_graphs($model1, $model2, 'Compare two models');
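The isomorph_graphs call above needs two models. One way to set them up, as a sketch assuming Test::RDF and RDF::Trine are installed, is to parse the same triple from two syntaxes. The label text and base URI are made up for the example.

```perl
use strict;
use warnings;
use Test::More tests => 1;
use Test::RDF;
use RDF::Trine;

my $base   = 'http://example.org/';
my $model1 = RDF::Trine::Model->temporary_model;
my $model2 = RDF::Trine::Model->temporary_model;

# The relative URI </foo> resolves against $base to http://example.org/foo.
RDF::Trine::Parser->new('turtle')->parse_into_model($base,
    '</foo> <http://www.w3.org/2000/01/rdf-schema#label> "A label"@en .',
    $model1);

# The same triple written out in full as N-Triples.
RDF::Trine::Parser->new('ntriples')->parse_into_model($base,
    '<http://example.org/foo> <http://www.w3.org/2000/01/rdf-schema#label> "A label"@en .' . "\n",
    $model2);

isomorph_graphs($model1, $model2, 'Turtle and N-Triples give the same graph');
```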

Quick install of RDF::LinkedData

On Ubuntu Maverick / Debian Squeeze, do first:

apt-get install libtest-nowarnings-perl libwww-perl libtest-www-mechanize-perl \
libtry-tiny-perl libplack-perl libtext-table-perl libtext-csv-perl \
libunicode-string-perl liblist-moreutils-perl libdbd-sqlite3-perl \
libtest-json-perl libmath-combinatorics-perl libxml-namespacesupport-perl \
libxml-libxml-perl liberror-perl libdigest-sha1-perl libset-scalar-perl \
libjson-perl libmoosex-log-log4perl-perl libnamespace-autoclean-perl \
libconfig-jfdi-perl

Then, to install the Linked Data module in your home directory, go:

wget --no-check-certificate -O - http://cpanmin.us/ | perl - RDF::LinkedData

Configuring RDF::LinkedData

Set environment:

export PERL5LIB=$HOME/perl5/lib/perl5
export PATH=$PATH:$HOME/perl5/bin

Then go:

cd ~/.cpanm/latest-build/RDF-LinkedData-0.14/
wget http://svn.kjernsmo.net/talks/matrikkel1.ttl

Edit rdf_linkeddata.json in there to read:

{
  "base"  : "http://localhost:3000/matrikkel/",
  "store" : {
       "storetype" : "Hexastore",
       "sources"   : [ {
                     "file"   : "matrikkel1.ttl",
                     "syntax" : "turtle"
                   } ]
             }
}

Running RDF::LinkedData basic server

plackup script/linked_data.psgi --host localhost --port 3000

Go to http://localhost:3000/matrikkel/6 in your browser!

You should get a file named "data" with some RDF. That's a fully working Linked Data server!
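The same resource can be fetched from Perl, asking for a specific syntax through the Accept header. This is a sketch only: it assumes libwww-perl is installed and that the server from the previous step is running on localhost:3000. Which media types the server actually offers is not stated here, so text/turtle is an assumption.

```perl
use strict;
use warnings;
use LWP::UserAgent;
use HTTP::Request;

my $url = 'http://localhost:3000/matrikkel/6';

# Ask for Turtle; the server decides what it actually returns.
my $request = HTTP::Request->new(GET => $url);
$request->header(Accept => 'text/turtle');

# LWP follows redirects for GET requests by default.
my $ua       = LWP::UserAgent->new(timeout => 10);
my $response = $ua->request($request);

if ($response->is_success) {
    print $response->header('Content-Type'), "\n";
    print $response->decoded_content;
} else {
    print 'Request failed: ', $response->status_line, "\n";
}
```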