Version 4 (modified by sauermann, 15 years ago)


There is a syntax extending the normal wiki syntax that lets us easily enter sentences into the wiki that will be translated to RDF triples. This syntax is a simple way to get running today; for research we will later use linguistic and statistical methods to extract triples from natural language.

Semantic Wiki Syntax

As described in the diploma thesis The Gnowsis by Leo, Page 75 (Page 89 inside the PDF).

Basic Rules:

  • (Hans Wurst) -> is the RDF resource with URI http://mywikiurl/wiki/Hans%20Wurst
  • (worksIn) -> is the RDF property with URI http://mywikiurl/wiki/worksIn
  • (Hans Wurst) (worksIn) in project (Blight). One sentence, one triple.
  • Things in double quotes are literals: "a literal".
  • In current wikis, "(Hans Wurst)" is just a wiki name.
  • Things that cannot be put into triples (because the sentence structure is odd or the property/predicate is missing) are put into a default relation to the page containing them. Example: 'And then blub says (Nosferatu).' is translated to <page> <relatedto> <Nosferatu>.
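
The name-to-URI mapping and the default relation from the rules above can be sketched in Python. This is only a minimal illustration: `WIKI_BASE`, `term_uri` and `fallback_triples` are hypothetical names, and placing `relatedto` under the same base URL as the pages is an assumption, not something the rules specify.

```python
import re
from urllib.parse import quote

WIKI_BASE = "http://mywikiurl/wiki/"     # placeholder base URL used in the rules
WIKI_TERM = re.compile(r"\(([^()]+)\)")  # text in parentheses is a wiki/RDF term


def term_uri(name, base=WIKI_BASE):
    """Map a wiki name to its URI, percent-encoding spaces and similar characters."""
    return base + quote(name.strip())


def fallback_triples(page, sentence, base=WIKI_BASE):
    """Relate the containing page to every term found in the sentence."""
    related = term_uri("relatedto", base)  # assumed URI of the default relation
    return [(term_uri(page, base), related, term_uri(name, base))
            for name in WIKI_TERM.findall(sentence)]
```

For instance, `term_uri("Hans Wurst")` gives the URI from the first rule, and `fallback_triples("SomePage", "And then blub says (Nosferatu).")` yields a single triple linking the page to Nosferatu.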

More complex rules:

  • Remove all the junk words between wiki/RDF words (all stopwords are ignored). What remains are the sentence marks: dot, comma, semicolon and paragraph break (. , ; \n).
  • Take the remaining RDF terms and sentence marks and interpret them as N3.
  • Newlines are like dots: \n -> .
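
A minimal Python sketch of these three rules follows. The names `strip_junk` and `to_n3` are hypothetical, and the sketch makes some assumptions of its own: term names contain no spaces, a leading "!" inside the parentheses is treated as a wiki link escape and dropped, and a newline that directly follows another sentence mark is ignored so that no empty statement is produced.

```python
import re

# One alternative per token kind: (term), "literal", or a sentence mark.
TOKEN = re.compile(r'\(!?([^()]+)\)|("[^"]*")|([.,;\n])')
MARKS = {".", ",", ";"}


def strip_junk(text):
    """Keep only (terms), "literals" and sentence marks; drop all other words."""
    tokens = []
    for term, literal, mark in TOKEN.findall(text):
        if term:
            tokens.append(term.strip())
        elif literal:
            tokens.append(literal)
        else:
            mark = "." if mark == "\n" else mark  # newlines are like dots
            # Skip marks that do not close anything (e.g. a newline after ';').
            if tokens and tokens[-1] not in MARKS:
                tokens.append(mark)
    return tokens


def to_n3(tokens, prefix="leowiki"):
    """Render the stripped tokens as N3 statements under a single prefix."""
    out = ""
    for tok in tokens:
        if tok in MARKS:
            out += tok                       # marks attach to the previous token
        elif tok.startswith('"'):
            out += " " + tok                 # literals are kept verbatim
        else:
            out += " " + prefix + ":" + tok  # wiki names become prefixed names
    return out.strip()
```

Calling `to_n3(strip_junk(text))` on a sentence written in this syntax returns the statements only; the `@prefix` declaration still has to be added in front before the result is handed to an N3 parser.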

Example for complex rules:

Regarding (SemDeskPraktikum2005), people have signed up who (nehmenTeil) are (FlorianMittag), (RalfBiedert);
furthermore the (datum) is "2005-11-14".

is stripped down to the most important terms:

SemDeskPraktikum2005 nehmenTeil FlorianMittag, RalfBiedert;
  datum "2005-11-14". 

and then upgraded to be N3 (note that this is real N3 that parses and validates):

@prefix leowiki:     <> .
leowiki:SemDeskPraktikum2005 leowiki:nehmenTeil leowiki:FlorianMittag, leowiki:RalfBiedert;
  leowiki:datum "2005-11-14". 

which is equivalent to these triples (N-TRIPLES):

<> <> <>. 
<> <> <>.
<> <> "2005-11-14". 
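
How the "," and ";" abbreviations unfold into one triple per object can be sketched in Python. The function `expand` is a hypothetical helper that works on a flat token list with the namespace left out; it is a lenient illustration, not an N3 parser.

```python
def expand(tokens):
    """Expand ',' (same subject and predicate) and ';' (same subject) into triples."""
    triples = []
    subject = predicate = None
    expect = "subject"
    for tok in tokens:
        if tok == ".":    # end of statement: start over with a new subject
            subject = predicate = None
            expect = "subject"
        elif tok == ";":  # keep the subject, read a new predicate next
            predicate = None
            expect = "predicate"
        elif tok == ",":  # keep subject and predicate, read another object
            expect = "object"
        elif expect == "subject":
            subject, expect = tok, "predicate"
        elif expect == "predicate":
            predicate, expect = tok, "object"
        else:
            triples.append((subject, predicate, tok))
    return triples


tokens = ["SemDeskPraktikum2005", "nehmenTeil", "FlorianMittag", ",", "RalfBiedert",
          ";", "datum", '"2005-11-14"', "."]
for triple in expand(tokens):
    print(*triple)
```

The loop prints three lines: two `nehmenTeil` statements and one `datum` statement, all with `SemDeskPraktikum2005` as the subject.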

These triples then have to be stored in the RDF database, marked as created by the wiki page on which they were entered.

A disadvantage is that users cannot write naturally. The clear advantage is that this approach can be implemented in two days during the PraktikumSdt2005. For research we would need something more sophisticated, but that comes second.