Formal Language: SPARQL Literal Matching

2008-07-06

SPARQL Literal Matching

I use the Jena implementation of SPARQL for a personal photo album, and I have run into a use case where SPARQL just can't help me: it is impossible to split a literal into parts. Take this RDF example.

@prefix foaf: <http://xmlns.com/foaf/0.1/> .
_:a  foaf:name   "Johnny Lee Outlaw" .

Suppose I want to use SPARQL to convert this information into a different schema, where first, middle and last name are separate properties, like the one below. It can't be done.

@prefix ns2: <http://some.other/namespace/> .
_:a  ns2:firstname  "Johnny" .
_:a  ns2:middlename "Lee" .
_:a  ns2:lastname   "Outlaw" .

This could be solved by allowing variables to bind to regexp groups, for instance through a new filter function. Something like this.

PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX ns2:  <http://some.other/namespace/>
CONSTRUCT {
 ?s ns2:firstname  ?first .
 ?s ns2:middlename ?middle .
 ?s ns2:lastname   ?last .
}
WHERE {
 ?s foaf:name ?fullname .
 FILTER match(?fullname, "([^ ]*) ([^ ]*) ([^ ]*)", "?first ?middle ?last") .
}

I could of course have missed something that allows me to do what I want. You are welcome to correct me in the comments.

In this case, not being Turing-complete is a major drawback for SPARQL. If it were Turing-complete, the problem would be solvable in one way or another. The Principle of Least Power is very useful for data definition languages, but I doubt a SPARQL query is of much use to any interpreter other than a query engine.

The easiest way to create a Turing-complete language is to embed it in a Turing-complete host language. Then it is always possible to go beyond the embedded language and use features from the host language when necessary.
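
As a rough illustration of falling back on the host language, here is a minimal sketch in Python. It assumes the query engine has already returned the matching triples as plain (subject, predicate, literal) tuples; the function name split_names and the tuple layout are made up for this example and are not part of Jena or any RDF library. The regular expression is the same one used in the proposed match filter above.

```python
import re

FOAF_NAME = "http://xmlns.com/foaf/0.1/name"
NS2 = "http://some.other/namespace/"

# Same pattern as in the proposed filter: three parts separated by single spaces.
NAME_PATTERN = re.compile(r"([^ ]*) ([^ ]*) ([^ ]*)")


def split_names(triples):
    """Yield ns2 name triples for each foaf:name literal with exactly three parts.

    `triples` is an iterable of (subject, predicate, literal) tuples, a
    hypothetical stand-in for whatever the query engine returns.
    """
    for subject, predicate, literal in triples:
        if predicate != FOAF_NAME:
            continue
        m = NAME_PATTERN.fullmatch(literal)
        if m is None:
            continue  # names that do not fit the pattern are skipped
        first, middle, last = m.groups()
        yield (subject, NS2 + "firstname", first)
        yield (subject, NS2 + "middlename", middle)
        yield (subject, NS2 + "lastname", last)


# Illustrative input, mirroring the RDF example at the top of the post.
rows = [("_:a", FOAF_NAME, "Johnny Lee Outlaw")]
result = list(split_names(rows))
for triple in result:
    print(triple)
```

The query language only has to select the foaf:name triples; the splitting happens in the host language, where regexp groups are readily available. Names with fewer or more than three parts are simply dropped here, which a real conversion would have to handle more carefully.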
