Posted: August 27, 2020

CWPK #24: Introduction to RDFLib

It’s Time to Add a New Semantic Tool to the Toolbox

In CWPK #17 of this Cooking with Python and KBpedia series, we discussed what we would need in an API to OWL. Our work so far with owlready2 continues to be positive, leading us to believe it will prove out in the end to be the right API solution for our objectives. But in that same CWPK #17 review we also indicated intrigue with the RDFLib option. We know there are some soft spots with owlready2 in areas such as format support for which RDFLib is strong. It is also the case that owlready2 lacks a SPARQL query option, another area in which RDFLib is strong. In fact, the data exchange methods we use in KBpedia rely directly on simple variants of RDF, especially in the N3 notation.

In recognition of these synergies, just as it had in embracing SQLite as a lightweight native quad store, owlready2 has put in place many direct relations to RDFLib, including in the data store. What I had feared would be a difficult challenge of integrating Python, Anaconda, Jupyter Notebook, owlready2, and RDFLib turned out in fact to be a very smooth process. We introduce the newest piece, RDFLib, in today’s installment.

RDFLib is a Python library for interacting with the Resource Description Framework (RDF) language. It has been actively maintained for over 15 years and is presently in version 5.x. RDFLib is particularly strong in the areas of RDF format support, SPARQL querying of endpoints (including local stores), and CSV file functionality. Our hope in incorporating RDFLib is to provide the most robust RDF/OWL platform available in Python.

Installing RDFLib

Enter this at the command prompt:

$ conda install rdflib

You will see the standard feedback to the terminal that the package is being downloaded and then integrated with the other packages in the system. The simple install command is possible because we had already installed conda-forge as a channel within the Anaconda distribution system for Python as described in CWPK #9.
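To confirm the install from within Python, a small check such as the following can help. It is only a sketch: the helper name package_version is made up for this illustration, and it relies on the standard importlib modules available in Python 3.8 and later.

```python
import importlib.util
from importlib import metadata


def package_version(name):
    """Return the installed version of a package, or None if it cannot be imported.

    package_version is a hypothetical helper for this illustration only.
    """
    if importlib.util.find_spec(name) is None:
        return None
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        # Importable, but without distribution metadata (e.g. a standard library module).
        return "unknown"


# Prints the version string if rdflib is installed, or None if it is not.
print("rdflib:", package_version("rdflib"))
```

If the result is None, the package is not visible to the Python environment that Jupyter is using, which usually means the install went into a different conda environment.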

We are now ready to use RDFLib.

Basic Setup

OK, so we steer ourselves to the 24th installment in the CWPK directory and we fire up the system by invoking the command window from this directory. We enter $ jupyter notebook at the prompt and then proceed through the Jupyter file manager to this cwpk-24-intro-rdflib.ipynb file. We pick it, and then enter our standard set of opening commands to KBpedia:

Which environment? The specific load routine you should choose below depends on whether you are using the online MyBinder service (the ‘raw’ version) or local files. The example below is based on using local files (though replace with your own local directory specification). If loading from MyBinder, use the commented-out ‘raw’ GitHub addresses for kbpedia_reference_concepts.owl and kko.owl instead.
main = 'C:/1-PythonProjects/kbpedia/sandbox/kbpedia_reference_concepts.owl'
# main = 'https://raw.githubusercontent.com/Cognonto/CWPK/master/sandbox/builds/ontologies/kbpedia_reference_concepts.owl'
skos_file = 'http://www.w3.org/2004/02/skos/core' 
kko_file = 'C:/1-PythonProjects/kbpedia/sandbox/kko.owl'
# kko_file = 'https://raw.githubusercontent.com/Cognonto/CWPK/master/sandbox/builds/ontologies/kko.owl'

from owlready2 import *
world = World()
kb = world.get_ontology(main).load()
rc = kb.get_namespace('http://kbpedia.org/kko/rc/')

skos = world.get_ontology(skos_file).load()
kb.imported_ontologies.append(skos)

kko = world.get_ontology(kko_file).load()
kb.imported_ontologies.append(kko)

We could have done this first, but we still need to import the RDFLib package into our active environment:

import rdflib

Depending on our use of RDFLib going forward, we could restrict this import to only certain modules in the package, but we load it all in this case.

Now, here is where the neat trick used by owlready2 in working with RDFLib comes into play. owlready2 can expose its own quad store (SQLite, in the standard case) as an RDFLib graph. So, we create the variable graph (could be any name) to hold the graph object that RDFLib expects, and we obtain it from the object (in this case, world) already recognized by owlready2:

graph = world.as_rdflib_graph()

We may now manipulate the knowledge graph in the standard way using (in this case) the world object for owlready2, and access all of the additional functionality available via RDFLib using (in this case) the graph object. This is a great example of the Python ecosystem at work.

Further, because of even greater integration, some RDFLib operations have been mapped to native commands in owlready2, making the syntax and conventions in working with both libraries easier.

Initial SPARQL Examples

Of course, the reason we brought RDFLib into the picture at this point was to continue our exploration of querying the knowledge graph that began in our last installment, CWPK #23. We devote the next installment to a discussion of SPARQL queries in some depth, but let’s first test to see if our configuration is working properly.

In our first of two examples we present a fairly simple query in SPARQL format to our internal KBpedia reference concept store under the namespace graph.

r = list(graph.query_owlready("""
  PREFIX rc: <http://kbpedia.org/kko/rc/>
  PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
  SELECT DISTINCT ?x ?label
  WHERE
  {
    ?x rdfs:subClassOf rc:Mammal.
    ?x skos:prefLabel  ?label. 
  }
"""))

print(r)
[[rc.AbominableSnowman, 'abominable snowman'], [rc.Afroinsectiphilia, 'Afroinsectiphilia'], [rc.Eutheria, 'placental mammal'], [rc.Marsupial, 'pouched mammal'], [rc.Australosphenida, 'Australosphenida'], [rc.Bigfoot, 'Sasquatch'], [rc.Monotreme, 'monotreme'], [rc.Vampire, 'vampire'], [rc.Werewolf, 'werewolf']]

The query above is passed inline, so it reads much like a standard SPARQL query. The example below is a bit different and more Python-like: the query is first assigned to a variable, which is then passed to the query method. Note as well that the three-quote convention tells Python to expect a multi-line string:

r = """
  PREFIX rc: <http://kbpedia.org/kko/rc/>
  PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
  SELECT DISTINCT ?x ?label
  WHERE
  {
    ?x rdfs:subClassOf rc:Mammal.
    ?x skos:prefLabel  ?label. 
  }
"""

results = list(graph.query_owlready(r))
print(results)
[[rc.AbominableSnowman, 'abominable snowman'], [rc.Afroinsectiphilia, 'Afroinsectiphilia'], [rc.Eutheria, 'placental mammal'], [rc.Marsupial, 'pouched mammal'], [rc.Australosphenida, 'Australosphenida'], [rc.Bigfoot, 'Sasquatch'], [rc.Monotreme, 'monotreme'], [rc.Vampire, 'vampire'], [rc.Werewolf, 'werewolf']]

Additional Documentation

In the next installment we will provide SPARQL documentation. Here, however, are a couple of useful links to learn more about RDFLib and its capabilities:

NOTE: This article is part of the Cooking with Python and KBpedia series. See the CWPK listing for other articles in the series. KBpedia has its own Web site.
NOTE: This CWPK installment is available both as an online interactive file and as a direct download to use locally. Make sure to pick the correct installment number. For the online interactive option, pick the *.ipynb file. It may take a bit of time for the interactive option to load.
I am at best an amateur with Python. There are likely more efficient methods for coding these steps than what I provide. I encourage you to experiment — which is part of the fun of Python — and to notify me should you make improvements.
