Posted:October 11, 2010

structWFSFirst of Two Semantic Component Additions to the Open Semantic Framework

Since its initial release, Structured Dynamics‘ open source Open Semantic Framework (OSF) has continued to expand its capabilities and add refinements [1]. The OSF and its various contributing open source software modules are now also fully documented and explained on the OSF TechWiki [2], from which this current article is drawn. With the kind sponsorship of one of our clients [3], we were commissioned to create “dashboards.” Dashboards are currently all the rage. A dashboard presents a composite view of data and information, involving generally multiple widgets or individual displays. This, for example, is a dashboard in the context of our client:

But the client’s request did not end there. What they wanted was a general capability to make dashboards — a dashboard-making machine, if you will — because of their desire to provide an information portal that is constantly changing and responsive to current topics and needs. The net outcome of this request was our creation of the Workbench, beautifully designed by Fred Giasson, to be our newest (and most comprehensive) semantic component. In terms of terminology:

  • The Workbench is the environment (presently expressed as a conStruct Drupal module) for creating Dashboard views
  • A Dashboard View is a combination of one of more records, attributes for those records, and the widgets that display them. A Dashboard view may be saved, which makes it persistent and callable and usable from other application locations. A Dashboard view may also be embedded into other Web pages. The figure above is an example Dashboard view with four display widgets
  • Sub-panel is an individual widget display incorporated into a given Dashboard view; practically there is a limit of about six (6) sub-panels for any given Dashboard view (though there may be as few as one sub-panel).
Note: the example screen above and those that follow are illustrative. They may be:

  • Completely different widgets and data than what is shown
  • Completely different in appearance for your own installation; they can be styled in any way you wish
  • Optionally reserved for system use only, with only the actual Dashboards viewable by standard users.

In most instances, use of the Workbench is reserved for administrators and curators, who use it to create persistent Dashboard views that are what is ultimately shared with end users. However, that is also a matter of policy and design. There is no technical reason why the Workbench could not be exposed to standard users. What follows, then, is part of the user manuals for working with the Workbench and Dashboards. It assumes you already know much of how Drupal and its conStruct OSF modules work.

Accessing the Workbench

From within a Drupal instance, you access the Workbench via either the Admin or Tools links. Then, you will see the Workbench provided as a distinct option:

The Main Workbench Screen

The Workbench is the environment (presently expressed as a conStruct Drupal module) for creating Dashboard views. As such, if used, it is one of the more complicated components in an Open Semantic Framework instance. The Workbench consists of three panels and a main menu.

Three Panels

The Workbench is comprised of three main panels: the Filter Panel (Item #1), the Record Selector Panel (Item #2) and the Dashboard Panel (Item #3): Selections in any one of the panels gets reflected and highlighted in all other panels. These three main panels can be moved or re-sized anywhere around the screen.

Filter Panel

The Filter Panel (Item #1) is for making broad “slice-and-dice” selections across the structure. It has three sub-groupings within it:

  • Datasets, which are a listing of all of the datasets to which you have access
  • Kinds, which are the facets or types (sets) by which your data is organized and characterized, and
  • Attributes, which are the specific data characteristics for your records. Attributes correspond to the columns in the Record Selector Panel and are like column headers in SQL tables.

Record Selector Panel

The Record Selector Panel (Item #2 in the main screen above), based on the filter restrictions, is for selecting the individual attributes and records to display; it works and operates like a spreadsheet (data grid).

The Dashboard Panel

Depending on the selections in the previous two panels, the Dashboard Panel (Item #3 in the main screen above) shows the specific data visualization component depending on the display profile of the attribute type (map, story, graph, explorer, etc.). It may also be used to display a similar comparisons for identified “sticky” records (say national or state- or province-level data).

Main Menu and Functionality

The Workbench main menu (Item #4 on the screen shot above) has these options:

  • Window – view in full screen or normal mode
  • View – pick a specific panel to display
  • Record Selection Mode – determines how you add records to the Dashboard Panel; see below
  • Dashboard – basic dashboard controls and options; see below.

Selecting and Filtering Data

The main purpose of the Workbench, of course, is to select and filter data for display with various widgets. Each of the three main panels participates in this function.

Filtering Datasets, Kinds and Attributes

Filtering occurs via the Filter Panel, with its possible selections of datasets, kinds or attributes: By default, if no items are selected in one of these sub-groups, then all items are deemed to be selected. However, restricting by datasets may filter out otherwise available kinds or attributes, and restricting by kind may filter out otherwise available attributes.

Selecting Records

Records AND display attributes are selected via the Record Selector Panel. First, let’s look at some records selections: If there are restrictions applied via the Filter Panel, then the number of available attributes shown in the Record Selector Panel may be reduced. Because the actual data display widgets are limited in size, there is a maximum of 50 records that can shown in the Record Selector Panel at any given time. Attribute selections are made by checking the column item’s checkbox; this causes a new display (sub-panel) to be spawned in the Dashboard Panel (see next). Record selections are made by clicking anywhere on a record row. Multiple selections can be made through the standard continuous range select (via the Shift key) or discontinuous range select of multiple, individual records (via the Ctrl key). Selections as made add records to all of the sub-panel displays in the Dashboard Panel.

Selecting Attributes (Dashboard sub-panels)

Selection of an attribute column in the Record Selector Panel causes a new display, or widget, to appear as a sub-panel within the Dashboard Panel. If a particular attribute or record type can be displayed with more than one display type, that is selected via the dropdown list at the lower left of each display sub-panel. Sub-panels are created in the order of the attributes (data) selected in the Records Selector Panel, from left-to-right, top-to-bottom. In the figure above, there are three sub-panels in a 1 x 3 configuration. But, by adding another attribute, we now add a fourth sub-panel and the overall displays shifts to a 2 x 2 configuration: Each sub-panel is auto-sized as it is added to the canvas. There is a practical limit of about six (6) sub-panels to any given Dashboard view. Each sub-panel may be drag-and-dropped to an alternate location within the panel. Once embedded in a Web page, the actual sub-panel and panel sizes for a given Dashboard view may be re-set for sizes and dimensions.

Record Selection Mode

One of the main menu options is Record Selection Mode. By default, the standard selection mode is list select. Under this mode, all records selected in the Record Selector Panel are added to all Dashboard sub-panels. This is the best initial mode, since it is fast to create similar selections across all display widgets. This option is selected when the Workbench is first accessed, as shown by this menu item: However, you may also invoke drag-and-drop mode, also selected by this same menu: Under drag-and-drop, an individual record may be selected in the Record Selector Panel and then dragged to a specific sub-panel (display widget) in the Dashboard panel. This technique is useful when, say, you want to tailor a specific sub-panel view or provide a comparative baseline to various sub-panels. Whichever selection mode is currently active is reported back in the title header of the Record Selector Panel. You may also switch back-and-forth between selection modes at any time.

Creating and Saving Dashboard Views

The Dashboard main menu option is where you use and re-use Dashboard views. This menu option allows you to:

  • Save Dashboard views
  • Load Dashboard views
  • Create tabs with different indicators or attributes
  • Rename tabs
  • Delete tabs, or
  • Generate HTML code for embedding a Dashboard view in a Web page.

Save or Load

A Dashboard view with its multiple sub-panels and tabs (see below) may have taken some thought and time to design. For this reason, you may want to re-use it and you may want to protect your work. When saving a Dashboard view, you are prompted for a name, shown existing views that you might overwrite, and are asked for a password (that is later required to do any modifications) as this popup screen shows:

Re-Using Dashboard Views

The same dialog above shows how easy it is to also re-use Dashboard views. All existing saved views are shown in the dialog box. The first obvious use is to allow existing views to be modified or updated. Another interesting possibility is to use this design for basic view “templates” that get set up, then re-used for specific records or types. In this manner a template baseline can be established that is then called up multiple times for specific tailoring. Still another advantage of re-use is to create a standard name for a Dashboard view, say, “Main Page” that then gets embedded on the main page of your application (using the “embed” procedures noted below). Because the hosting Web page is configured to accept this named view, you can actually change the specifics of the view under the Workbench — conceivably including quite different records or widget displays — and then save it for automatic re-loading on the main page.

Dashboard Tabs

Another series of menu options from the Dashboard menu relate to “tabs”. Tabs are additional sub-panels nested under a Dashboard view. As noted before, an individual panel in a Dashboard view is practically limited to six to eight sub-panels; with tabs, this can be expanded substantially. To begin the process of adding a tab you invoke the new tab option under the Dashboard menu: Once named, the tab then appears as a tab button on the Dashboard view and a blank canvas is presented for adding more sub-panels (as described above): Once saved, these tabs also get included with the persistent Dashboard view and can also be embedded in other Web pages.

Embedding Views in Web Pages

Once a Dashboard view is created, there are two ways to use or embed them: generate HTML code or treat as a Drupal node. You invoke the generate code option from the Dashboard menu using the Get Code choice: A “Get HTML Code to Embed” window will appear in the workbench. You have to provide two pieces of information before you can generate the HTML code:

  1. Base URL of the Portable Control Application (leave blank if new file is placed in PCA folder)
  2. Schema for the data used (see below)

The Base URL is the URL where the Portable Control Application is located on your Web server. However, you can leave this field empty if the HTML page you want to generate is in the same folder as the PortableControlApplication.swf file. The Schema is (one or multiple) URLs where the irXML schema that are used by the Portable Control Application are located on the Web. See further the irON specification on how to create these schema. Once these fields are completed, can click the “Generate HTML Code” button to generate the HTML code to embed in your HTML page.

Using the Generated HTML Code

The HTML code generation tool will generate code in two places within this popup up window:

  1. The “Copy then paste this <header> GENERATED CODE </header> into Header section”, and
  2. The “Copy then paste <body> GENERATED CODE </body> into Body section”

The HTML code that appears in the first section has to be copied and pasted into the <header></header> section of your HTML file. The HTML code that appears in the second section has to be copied and pasted into the <body></body> section of your HTML file. Once you have copied and pasted these codes into the two sections of your HTML page, save it, and then load the resulting Web page into your browser. If you have properly filled in all fields above, you will then see the persistent Dashboard view embedded in the page.

Some HTML Page Tweaks

The Dashboard view is displayed within an HTML <div> </div> container. This container defines the size of the actual Dashboard display within in the Web page (as well as other HTML code or styling you care to insert). We suggest that what is generated in the second text area above be added within such a <div> </div> tag. Then, you may place the <div> </div> anywhere you want in your Web page layout. It is this <div> </div> container that determines the size of the Dashboard that will be displayed to the user (plus any other instructions you care to include). Here is an example of such a <div> </div> container:

  <div style="width: 800px; height: 800px">
     <script language="JavaScript" type="text/javascript">
     </script>
     <noscript>
        <object classid="clsid:D27CDB6E-AE6D-11cf-96B8-444553540000"
          id="PortableControlApplication" width="100%" height="100%"
          codebase="http://fpdownload.macromedia.com/get/flashplayer/current/swflash.cab">
          <param name="movie" value="PortableControlApplication.swf" />
          <param name="quality" value="high" />
          <param name="bgcolor" value="#869ca7" />
          <param name="allowScriptAccess" value="sameDomain" />
          <param name="allowFulllScreen" value="true" />
          <embed src="PortableControlApplication.swf" quality="high" bgcolor="#869ca7"
            width="100%" height="100%" name="PortableControlApplication" align="middle"
            play="true"
            loop="false"
            quality="high"
            allowScriptAccess="sameDomain"
            allowFullScreen="true"
            type="application/x-shockwave-flash"
            pluginspage="http://www.adobe.com/go/getflashplayer">
          </embed>
      </object>
     </noscript>
  </div>

Optional Dashboard Page

One of the advantages of piggybacking on Drupal is the ability to leverage on native and extended capabilities. A core extension to Drupal is content types via CCK, which can be managed and invoked and themed separately. We have set up a standard Drupal content (node) type called Dashboard Views. Thus, if you follow the separate set of procedures to embed your Dashboard view in this manner, you can:

  • Assign Dashboards to standard blocks in Drupal
  • Create master listing pages of all available Dashboard views
  • Enable comments and user and community interaction and feedback
  • Provide differential access or editing by user or user group.

We have only just begun to explore the possibilities of the combined Dashboard-content type design.

A Sample Dashboard View

And, so, the result of the steps above is to create the same static Dashboard view that began this article:

Soon to Be Released

This new capability will be released as open source after the client first presents it publicly, now scheduled for the first week of November. Besides general upgrades across the entire Open Semantic Framework stack, that same release will also include a massive update to the Concept Explorer, which we will cover in a later article.


[1] These are all parts of the open semantic framework (OSF): 1) conStruct – connecting modules to enable structWSF and sComponents to be hosted/embedded in Drupal; 2) structWSF – platform-independent suite of more than 20 RESTful Web services, organized for managing structured data datasets; 3) sComponents – (mostly) Flex semantic components (widgets) for visualizing and manipulating structured data; and 4) irON – instance record Object Notation for conveying XML, JSON or spreadsheets (CSV) in RDF-ready form.
[2] The generic technical wiki (TechWiki) provides documentation for the software and related systems associated with the various OpenStructs open source software projects. It and its content is itself open source. As of the date of this writing, the TechWiki contains 233 articles under 58 categories and another 388 images.
[3] Peg is a community indicator system (CIS) that has been developed for Winnipeg by a community-wide consortium of partners spearheaded by the the United Way of Winnipeg and the International Institute for Sustainable Development (IISD). Other partners include the Province of Manitoba, the City of Winnipeg, Health in Common, and a cross section of community interests and members. Peg’s mission is to build the knowledge and capacity of Winnipeggers to work together to achieve and sustain the well-being of current and future generations.

Posted by AI3's author, Mike Bergman Posted on October 11, 2010 at 12:31 am in Open Semantic Framework, Structured Dynamics | Comments (1)
The URI link reference to this post is: https://www.mkbergman.com/919/the-osf-workbench-a-shaking-dashboard-making-machine/
The URI to trackback this post is: https://www.mkbergman.com/919/the-osf-workbench-a-shaking-dashboard-making-machine/trackback/
Posted:September 27, 2010

Resources Useful to the Understanding of Ontologies and the Semantic Web

Over the past few weeks we have been publishing a series of general background documents and tutorials useful to the understanding of ontologies. These entries have been prepared specifically with the non-expert and end user in mind.

The Ontology Tutorial Series is now complete as initially scoped. These various articles, in both originally posted form and as kept current on the OpenStructs‘ TechWiki [1], are:

 


[1] The tutorials were first published on this blog over the period of Aug. 9 to Sept. 20, 2010. They are now permanently maintained and updated on the TechWiki.

Posted by AI3's author, Mike Bergman Posted on September 27, 2010 at 1:19 am in Ontologies, Ontology Best Practices | Comments (0)
The URI link reference to this post is: https://www.mkbergman.com/916/ontology-tutorial-series/
The URI to trackback this post is: https://www.mkbergman.com/916/ontology-tutorial-series/trackback/
Posted:September 20, 2010

OWL 2 Has New Options; Useful to SKOS, Too

It is not unusual to want to treat things either as a class or an instance in an ontology, depending on context. Among other aspects, this is known as metamodeling and it can be accomplished in a number of ways. However, the newest version of the Web Ontology Language, OWL 2, provides a neat trick for doing this called “punning“. Why one would want to metamodel, how to specify it in an ontology, and why the OWL 2 approach is helpful are described in this post [1].

Why Metamodel?

Lightweight, domain ontologies have been the focus of this ontology series. Domain ontologies are the “world views” by which organizations, communities or enterprises describe the concepts in their domain, the relationships between those concepts, and the instances or individuals that are the actual things that populate that structure. Thus, domain ontologies are the basic bread-and-butter descriptive structures for real-world applications of ontologies.

These lightweight, domain ontologies often have a hierarchical structure for which SKOS (Simple Knowledge Organization System) is a recommended starting ontology [2] (see best practices recommendations). A subject concept reference ontology such as UMBEL (Upper Mapping and Binding Exchange Layer) [3], which we also recommend, also has a similar structure and a heavy reliance on SKOS in its vocabulary. Because of these structural similarities, ontologies that use SKOS or UMBEL are therefore good candidates for using metamodeling techniques.

To better understand why we should metamodel, let’s look at a couple of examples, both of which combine organizing categories of things and then describing or characterizing those things. This dual need is common to most domains [4]. For the first example, let’s take a categorization of apes as a kind of mammal, which is then a kind of animal. In these cases, ape is a class, which relates to other classes, and apes may also have members, be they particular kinds of apes or individual apes. Yet, at the same time, we want to assert some characteristics of apes, such as being hairy, two legs and two arms, no tails, capable of walking bipedally, with grasping hands, and with some being endangered species. These characteristics apply to the notion of apes as an instance.

As another example we may have the category of trucks, which may further be split into truck types, brands of trucks, type of engine, and so forth. Yet, again, we may want to characterize that a truck is designed primarily for the transport of cargo (as opposed to automobiles for people transport), or that trucks may have different drivers license requirements or different license fees than autos. These descriptive properties refer to trucks as an instance.

These mixed cases combine both the organization of concepts in relation to one another and with respect to their set members, with the description and characterization of these concepts as things unto themselves. This is a natural and common way to express most any domain of interest. The practice has been to express these mixed uses in RDFS or OWL Full, which makes them easy to write and create since most “anything goes” (a loose way of saying that the structures are not decidable) [5]. Use of sub-class relationships also enables tree-like hierarchies to be constructed and some minor inferencing (such as one concept is broader than another concept, one of the contributions of SKOS). But such mixed uses do not allow more capable OWL reasoners to be applied, nor for the full power of query or search abstraction to be applied, nor for the ontology to be checked for consistency. These limits may be fine in many circumstances, but their lack does allow structures to evolve that may become incoherent or illogical. If data interoperability is a goal, as it is in our enterprise use cases, incoherent ontologies can not contribute or participate as structures to linking datasets. At most — and this is the case for much linked data practice — all that can be done is to make explicit pairwise connections between different dataset objects. This is not efficient and defeats the whole purpose of leveraging schema.

OWL 2 has been designed to fix that (in addition to other benefits [12]). The approach taken by OWL 2 to overcome some of these metamodeling limitations is through “punning” [6]. Recall that objects are named in RDF with URIs (IRIs in OWL 2). The trick with “punning” is to evaluate the object based on how it is used contextually [7]; the IRI is shared but its referent may be viewed as either a class or instance based on context. Thus, objects used both as concepts (classes) and individuals (instances) are allowed and standard OWL 2 reasoners may be used against them. It should be noted, however, that this “punning” technique does not support the full range of possible metamodeling aspects [8]. Like any language, there is a trade-off in OWL 2 between expressivity and reasoning efficiency [9].

But, for lightweight, domain ontologies where the objective is interoperability across heterogeneous sources — that is, namely the main objectives of the semantic Web or semantic enterprise — this trade-off in OWL 2 now appears to be well balanced. Moreover, its automatic detection by tools such as Protégé 4 that use the OWL API also means it is comparatively easy to use and implement.

Relationship to Recommended Best Practices

An earlier chapter in this series presented some best practices for ontology building and maintenance. A fundamental aspect of those recommendations was the desirability of keeping instance data (ABox) separate from the conceptual structure (TBox) that provides the schema of relationships for those concepts [10]. Fortunately, this approach also integrates well with the metamodeling capabilities in OWL 2. How metamodeling and the ABox-TBox split is accommodated is shown by this diagram, using trucks as an example:

  Figure 1. Metamodeling in Domain Ontologies (click to expand)

The right-hand side of the diagram shows the two views possible via OWL 2 metamodeling in the TBox. In some cases, we may speak of trucks as a class of vehicle, to which individual members may belong; this is the class view. In other contexts, we may want to characterize or make assertions about trucks in our ontology, such as asserting cargo transport or engine type, in which case truck is now represented as an instance (individual) under the individual view. These two views in the TBox represent our structural and conceptual description (the “world view”) regarding this domain of which vehicles and trucks are a part. Then, when we begin to populate our knowledge base with specific data, we do so via the ABox. In this example, as we add data about the specific brand of Ford trucks and their attributes, we link the Ford instance to the TBox via the Truck class. (Best practice also requires that we model this new attribute structure into the TBox as well, but that is a different topic. 😉 .)

How Punning is Triggered in OWL 2

Punning is not triggered by annotation properties. Annotation properties applied to a class merely act as additional description or metadata about that class; the annotation property by definition does not participate in any inferencing or reasoning. You should also know that in OWL 2, certain predicates (properties) such as label, comment or description (among others) are reserved as annotation properties [11]. You can invoke the OWL 2 punning process directly or via context when your ontologies are processed with the OWL API. The basic rule to follow is:

Any entity declared as a class and with an asserted object or data property [15] is punned (metamodeled).

This test is done directly by the OWL API [7]. You can go ahead and test this out with an OWL 2-compliant editor, such as Protégé 4. Here is an example test (in N3 notation): First, begin with some initial declarations:

foo:Car a owl:Class .

foo:Animal a owl:Class ;
owl:disjointWith foo:Car .

Then, let’s describe an object property:

foo:isEndangered a owl:ObjectProperty ;
rdf:domain foo:Animal ;
rdf:range bar:SomeSpecies .

And define and make an assertion about Apes:

foo:Ape a owl:Class ;
foo:isEndangered bar:SomeSpecies .

Now, the system begins by testing for punning and other checks, such as:

  1. isEndangered an annotation property? no
  2. what is its domain? foo:Animal
  3. this will detect and infer:
foo:Ape a owl:Class ;
foo:Ape a foo:Animal ;
foo:isEndangered bar:SomeSpecies .
  1. punning is triggered because non-annotation property has been applied to a class
  2. non-annotation properties are now assigned to named individual (which captures individual view part of the TBox above)
  3. then, can check for inconsistencies depending on the restriction(s) applied to the foo:Animal class.

In this case, no inconsistencies were found. But, let’s now add another object (non-annotation) property:

foo:hasBrand a owl:ObjectProperty ;
rdf:domain foo:Car ;
rdf:range bar:SomeBrand .

And use it to expand our assertions about Apes:

foo:Ape a owl:Class ;
foo:isEndangered bar:SomeSpecies ;
foo:hasBrand bar:Ford .

And repeat #3:

foo:Ape a owl:Class .
foo:Ape a foo:Animal .
foo:Ape a foo:Car ;
foo:isEndangered bar:SomeSpecies ;
foo:hasBrand bar:Ford .

Now, inconsistencies are raised in the second #3: So, the consistency check fails, because Ape can not be both an Animal and a Car. While this is clearly a silly example, such checks are quite important as the number of objects and assertions grows in an ontology.

What Does Punning Look Like?

The punning technique works because the IRI for the object ends up being treated as both a concept (class) and an instance (individual). Thus, while the object shares the same IRI, depending on its context, it is evaluated by an OWL reasoner as a different thing (class or individual). The OWL API achieves this by actually writing out the object in both its class view and individual view. Here is an example (in RDF/XML serialization): Input OWL:

<owl:Class rdf:about=“http://purl.org/ontology/Ape> <isEndangered>Ape</isEndangered> </owl:Class>

Output from Protégé with punning:

<!-- http://purl.org/ontology/Ape-->

<owl:Class rdf:about="http://purl.org/ontology/Ape"/>

<!-- http://purl.org/ontology/Ape-->

<owl:NamedIndividual rdf:about="http://purl.org/ontology/Ape">
   <isEndangered>Ape</isEndangered>
</owl:NamedIndividual>

Notice the duplicate definition (in RDF/XML) to the NamedIndividual. When writing out the ontology, all punned objects are duplicated in a similar manner.

The Beginning of the Transition

OWL 2 and its other general changes [12] have arrived in the nick of time. Not only were we seeing some of the weaknesses in OWL 1 that warranted updating, but we are also now being challenged with regard to how to make linked data and the many datasets in RDF effectively interoperate. Perhaps undecidability and throwing triples to the wind worked OK in the early days of our semantic Web Wild West. But now it is time for the new sheriff to bring order to the emerging chaos. Of course only time will tell, but we believe the design decisions made by the OWL 2 working group were judicious and balanced ones to find that sweet spot between expressiveness and reasoning efficiency [9]. We also believe that, while useful in its less expressive form [2], that many new domain vocabularies based on SKOS would especially benefit from embracing the OWL 2 metamodeling techniques. But two criticisms still remain. First, tooling support for OWL 2 and the OWL API is weak, as discussed in an earlier chapter. And, as the last chapter discussed, there are not enough practitioners that have yet taken up OWL 2, which means that best practice guidance and exemplars are still limited. Lightweight domain ontologies can greatly benefit from these OWL 2 metamodeling techniques and the OWL RL alternative that also emerged as one of the OWL 2 profile enhancements [13]. Structured Dynamics thinks the growing scale and learning taking place around linked data and RDF datasets is now pointing the way to a necessary transition. And OWL 2 metamodeling should be one of the key components to making our semantic technologies more responsive and effective [14].


[1] This posting is part of a current series on ontology development and tools, jointly developed with Structured Dynamics with co-authorship by Frédérick Giasson. The series began with An Executive Intro to Ontologies, then continued with an update of the prior Ontology Tools listing, which now contains 185 tools. It progressed to a survey of ontology development methodologies. That led to a presentation of a new, Lightweight, Domain Ontologies Development Methodology. That piece was then expanded to address A New Landscape in Ontology Development Tools, which was followed up by a listing of best practices in domain ontology building and maintenance. This portion completes the series.
[2] Alistair Miles and Sean Bechhofer, eds., 2009. SKOS Simple Knowledge Organization System Reference, W3C Recommendation, 18 August 2009. See http://www.w3.org/TR/skos-reference/. Some common SKOS domain predicates include skos:definition, skos:prefLabel, skos:altLabel, skos:broaderTransitive, skos:narrowerTransitive.
According to the cited W3C recommendation:
. . . the “concepts” of a thesaurus or classification scheme are modeled [in the base SKOS form] as individuals in the SKOS data model, and the informal descriptions about and links between those “concepts” as given by the thesaurus or classification scheme are modeled as facts about those individuals, never as class or property axioms. Note that these are facts about the thesaurus or classification scheme itself, such as “concept X has preferred label ‘Y’ and is part of thesaurus Z”; these are not facts about the way the world is arranged within a particular subject domain, as might be expressed in a formal ontology.
Metamodeling and the use of OWL allows the base SKOS form to be expressed as a formal ontology, over which reasoning and inference may occur. Not all SKOS structures may be amenable to this (thesauri and lexical resources such as Wordnet perhaps fall into this category), but some other structures are logical and can be formalized. UMBEL, for example, fits into this category, as do many carefully crafted controlled vocabularies. When used as such, many of the SKOS predicates become OWL annotation properties.
[3] UMBEL (Upper Mapping and Binding Exchange Layer) is an ontology of about 20,000 subject concepts that acts as a reference structure for inter-relating disparate datasets. It is also a general vocabulary of classes and predicates designed for the creation of domain-specific ontologies.
[4] In the domain ontologies that are the focus here, we often want to treat our concepts as both classes and instances of a class. This is known as “metamodeling” or “metaclassing” and is enabled by “punning” in OWL 2. For example, here a case cited on the OWL 2 wiki entry on “punning“:
People sometimes want to have metaclasses. Imagine you want to model information about the animal kingdom. Hence, you introduce a class a:Eagle, and then you introduce instances of a:Eagle such as a:Harry.
(1) a:Eagle rdf:type owl:Class (2) a:Harry rdf:type a:Eagle
Assume now that you want to say that “eagles are an endangered species”. You could do this by treating a:Eagle as an instance of a metaconcept a:Species, and then stating additionally that a:Eagle is an instance of a:EndangeredSpecies. Hence, you would like to say this:
(3) a:Eagle rdf:type a:Species (4) a:Eagle rdf:type a:EndangeredSpecies.
This example comes from Boris Motik, 2005. “On the Properties of Metamodeling in OWL,” paper presented at ISWC 2005, Galway, Ireland, 2005. For some other examples, see Bernd Neumayr and Michael Schrefl, 2009. “Multi-Level Conceptual Modeling and OWL (Draft, 2 May – Including Full Example)”; see http://www.dke.jku.at/m-owl/most09_22_full.pdf.
[5] A good explanation of this can be found in Rinke J. Hoekstra, 2009. Ontology Representation: Design Patterns and Ontologies that Make Sense, thesis for Faculty of Law, University of Amsterdam, SIKS Dissertation Series No. 2009-15, 9/18/2009. 241 pp. See http://dare.uva.nl/document/144859. In that, Hoekstra states (pp. 49-50):
RDFS has a non-fixed meta modelling architecture; it can have an infinite number of class layers because rdfs:Resource is both an instance and a super class of rdfs:Class, which makes rdfs:Resource a member of its own subset (Nejdl et al., 2000). All classes (including rdfs:Class itself) are instances of rdfs:Class, and every class is the set of its instances. There is no restriction on defining sub classes of rdfs:Class itself, nor on defining sub classes of instances of instances of rdfs:Class and so on. This is problematic as it leaves the door open to class definitions that lead to Russell’s paradox (Pan and Horrocks, 2002). The Russell paradox follows from a comprehension principle built in early versions of set theory (Horrocks et al., 2003). This principle stated that a set can be constructed of the things that satisfy a formula with one free variable. In fact, it introduces the possibility of a set of all things that do not belong to itself . . . .
In RDFS, the reserved properties rdfs:subClassOf, rdf:type, rdfs:domain and rdfs:range are used to define both the other RDFS modelling primitives themselves and the models expressed using these primitives. In other words, there is no distinction between the meta-level and the domain.
[6] “Punning” was introduced in OWL 2 and enables the same IRI to be used as a name for both a class and an individual. However, the direct model-theoretic semantics of OWL 2 DL accommodates this by understanding the class Father and the individual Father as two different views on the same IRI, i.e., they are interpreted semantically as if they were distinct. The technique listed in the main body triggers this treatment in an OWL 2-compliant editor. See further Pascal Hitzler et al., eds., 2009. OWL 2 Web Ontology Language Primer, a W3C Recommendation, 27 October 2009; see http://www.w3.org/TR/owl2-primer/.
[7] The OWL API is a Java interface and implementation for the W3C Web Ontology Language (OWL), used to represent Semantic Web ontologies. The API provides links to inferencers, managers, annotators, and validators for the OWL2 profiles of RL, QL, EL. Two recent papers describing the updated API are: Matthew Horridge and Sean Bechhofer, 2009. “The OWL API: A Java API for Working with OWL 2 Ontologies,” presented at OWLED 2009, 6th OWL Experienced and Directions Workshop, Chantilly, Virginia, October 2009. See http://www.webont.org/owled/2009/papers/owled2009_submission_29.pdf; and, Matthew Horridge and Sean Bechhofer, 2010. “The OWL API: A Java API for OWL Ontologies,” paper submitted to the Semantic Web Journal; see http://www.semantic-web-journal.net/sites/default/files/swj107.pdf. Also see its code documentation at http://owlapi.sourceforge.net/2.x.x/documentation.html.
The main text describes how via “punning” the OWL API supports two parallel views sharing the same IRI, which can enable a concept to operate as either a class or instance depending on context.
[8] Some other metamodeling aspects not supported by “punning” include full multi-level modeling (such as in UML or OMG‘s model-driven architecture) or linkage with closed-world reasoning.
[9] OWL has historically been described as trying to find the proper tradeoff between expressive power and efficient reasoning support. See, for example, Grigoris Antoniou and Frank van Harmelen, 2003. “Web Ontology Language: OWL,” in S. Staab and R. Studer, eds., Handbook on Ontologies in Information Systems, Springer-Verlag, pp. 76-92. See http://www.few.vu.nl/~frankh/postscript/OntoHandbook03OWL.pdf.
[10] The TBox portion, or classes (concepts), is the basis of the ontologies. The ontologies establish the structure used for governing the conceptual relationships for that domain and in reference to external (Web) ontologies. The ABox portion, or instances (named entities), represents the specific, individual things that are the members of those classes. Named entities are the notable objects, persons, places, events, organizations and things of the world. Each named entity is related to one or more classes (concepts) to which it is a member. Named entities do not set the structure of the domain, but populate that structure. The ABox and TBox play different roles in the use and organization of the information and structure. These distinctions have their grounding in description logics.
[11] For a listing, see http://www.w3.org/TR/2009/REC-owl2-syntax-20091027/#Annotation_Properties. Even if your local ontology defines a sub-property of one of these items, such as foo:myLabel as a sub-property of rdfs:label, you are advised to still specifically declare it as an annotation property.
[12] See Bernardo Cuenca Grau, Ian Horrocks, Boris Motik, Bijan Parsia, Peter Patel-Schneider and Ulrike Sattler, 2008. “OWL2: The Next Step for OWL,” see http://www.comlab.ox.ac.uk/people/ian.horrocks/Publications/download/2008/CHMP+08.pdf; and also see the OWL 2 Quick Reference Guide by the W3C, which provides a brief guide to the constructs of OWL 2, noting the changes from OWL 1.
[13] OWL RL is the “rules” profile of OWL 2 and is both decidable and offers additional axiomatic support for metamodeling. As this figure drawn from Hoekstra [Fig. 3-4 in 5] shows comparing OWL 2 to OWL 1, OWL RL provides a subset of decidable description logics:
[14] Metamodeling might be a new concept to you and some of the aspects can certainly be academic. If the references above do not sufficient satisfy your curiosity, you may want to check out some of these other useful references: Birte Glimm, Sebastian Rudolph and Johanna Völker, 2009. “Integrated Metamodeling and Diagnosis in OWL 2,” see http://www.comlab.ox.ac.uk/files/3129/paper.pdf; and Nophadol Jekjantuk, Gerd Groener and Jeff. Z. Pan, 2009. “Reasoning in Metamodeling Enabled Ontologies,” in Rinke Hoekstra and Peter F. Patel-Schneider, eds., Proceedings of OWL: Experiences and Directions (OWLED 2009); see http://www.webont.org/owled/2009.
[15] In OWL 2, an object property is a predicate that defines a binary relationship between two objects (in specific respect to a triple, between a subject and an object). A data property is a predicate that defines a binary relationship between an object an a literal (string or data value). In contrast to object and data properties, annotation properties and reserved OWL and RDF vocabularies are explicitly excluded from this rule. Only declared object or data properties trigger the punning.

Posted by AI3's author, Mike Bergman Posted on September 20, 2010 at 12:23 am in Ontologies, Ontology Best Practices | Comments (4)
The URI link reference to this post is: https://www.mkbergman.com/913/metamodeling-in-domain-ontologies/
The URI to trackback this post is: https://www.mkbergman.com/913/metamodeling-in-domain-ontologies/trackback/
Posted:September 13, 2010

A Single-stop Assembly of Ontology Tips and Pointers

As we conclude this recent series on ontology tools and building [1], one item stands clear: the relative lack of guidance on how one actually builds and maintains these beasties. While there is much of a theoretic basis in the literature and on the Web, and much of methodologies and algorithms, there is surprisingly little on how one actually goes about creating an ontology.

An earlier posting pointed to the now classic Ontology Development 101 article as a good starting point [2]. Another really excellent starting point is the Protégé 4 user manual [3]. Though it is obviously geared to the Protégé tool and its interface, it also is an instructive tutorial on general ontology (OWL) topics and constructs. I highly recommend printing it out and reading it in full.

If you do nothing else, you should download, print and study in full the Protégé 4 users manual [3].

Learning by Example

Another way to learn more about ontology construction is to inspect some existing ontologies. Though one may use a variety of specialty search engines and Google to find ontologies [4], there are actually three curated services that are more useful and which I recommend.

The best, by far, is the repository created by the University of Manchester for the now-completed TONES project [5]. TONES has access to some 200+ vetted ontologies, plus a search and filtering facility that helps much in finding specific OWL constructs. It is a bit difficult to filter by OWL 2-compliant only ontologies (except for OWL 2 EL), but other than that, the access and use of the repository is very helpful. Another useful aspect is that the system is driven by the OWL API, a central feature that we recommended in the prior tools landscape posting. From a learning standpoint this site is helpful because you can filter by vocabulary.

An older, but similar, repository is OntoSelect. It is difficult to gauge how current this site is, but it nonetheless provides useful and filtered access to OWL ontologies as well.

These sources provide access to complete ontologies. Another way to learn about ontology construction is from a bottom-up perspective. In this regard, the Ontology Design Patterns (ODP) wiki is the definitive source [6]. This is certainly a more advanced resource, since its premise begins from the standpoint of modeling issues and patterns to address them, but the site is also backed by an active community and curated by leading academics. Besides ontology building patterns, ODP also has a listing of exemplary ontologies (though without the structural search and selection features of the sources above). ODP is not likely the first place to turn to and does not give “big picture” guidance, but it also should be a bookmarked reference once you begin real ontology development.

It is useful to start with fully constructed ontologies to begin to appreciate the scope involved with them. But, of course, how one gets to a full ontology is the real purpose of this post. For that purpose, let’s now turn our attention to general and then more specific best practices.

Sources of Best Practices

As noted above, there is a relative paucity of guidance or best practices regarding ontologies, their construction and their maintenance. However, that being said, there are some sources whereby guidance can be obtained.

To my knowledge, the most empirical listing of best practices comes from Simperl and Tempich [7]. In that 2006 paper they examined 34 ontology building efforts and commented on cost, effectiveness and methodology needs. It provides an organized listing of observed best practices, though much is also oriented to methodology. I think the items are still relevant, though they are now four to five years old. The paper also contains a good reference list.

Various collective ontology efforts also provide listings of principles or such, which also can be a source for general guidance. The OBO (The Open Biological and Biomedical Ontologies) effort, for example, provides a listing of principles to which its constituent ontologies should adhere [8]. As guidance to what it considers an exemplary ontology, the ODP effort also has a useful organized listing of criteria or guidance.

One common guidance is to re-use existing ontologies and vocabularies as much as possible. This is a major emphasis of the OBO effort [9]. The NeOn methodology also suggests guidelines for building individual ontologies by re-use and re-engineering of other domain ontologies or knowledge resources [10]. Brian Sletten (among a slate of emerging projects) has also pointed to the use of the Simple Knowledge Organization System (SKOS) as a staging vocabulary to represent concept schema like thesauri, taxonomies, controlled vocabularies, and subject headers [11].

The Protégé manual [3] is also a source of good tips, especially with regard to naming conventions and the use of the editor. Lastly, the major source for the best practices below comes from Structured Dynamics‘ own internal documentation, now permanently archived. We are pleased to now consolidate this information in one place and to make it public.

The best practices herein are presented as single bullet points. Not all are required and some may be changed depending on your own preferences. In all cases, however, these best practices are offered from Structured Dynamics’ perspective regarding the use and role of adaptive ontologies [12]. To our knowledge, this perspective is a unique combination of objectives and practices, though many of the individual practices are recommended by others.

General Best Practices

General best practices refer to how the ontology is scoped, designed and constructed. Note the governing perspective in this series has been on lightweight, domain ontologies.

Scope and Content

  • Provide balanced coverage of the subject domain. The breadth and depth of the coverage in the ontology should be roughly equivalent across its scope
  • Reuse structure and vocabularies as much as possible. This best practice refers to leveraging non-ontological content such as existing relational database schema, taxonomies, controlled vocabularies, MDM directories, industry specifications, and spreadsheets and informal lists. Practitioners within domains have been looking at the questions of relationships, structure, language and meaning for decades. Effort has already been expended to codify many of these understandings. Good practice therefore leverages these existing structural and vocabulary assets (of any nature), and relies on known design patterns
  • Embed the domain coverage into a proper context. A major strength of ontologies is their potential ability to interoperate with other ontologies. Re-using existing and well-accepted vocabularies and including concepts in the subject ontology that aid such connections is good practice. The ontology should also have sufficient reference structure for guiding the assignment of what content “is about”
  • Define clear predicates (also known as properties, relationships, attributes, edges or slots), including a precise definition. Then, when relating two things to one another, use care in actually assigning these properties. Initially, assignments should start with a logical taxonomic or categorization structure and expand from there into more nuanced predicates
  • Ensure the relationships in the ontology are coherent. The essence of coherence is that it is a state of logical, consistent connections, a logical framework for integrating diverse elements in an intelligent way. So while context supplies a reference structure, coherence means that the structure makes sense. Is the hip bone connected to the thigh bone, or is the skeleton askew? Testing (see below) is a major aspect for meeting this best practice
  • Map to external ontologies to increase the likelihood of sharing and interoperability. In Structured Dynamics’ case, we also attempt to map at minimum to the UMBEL subject reference structure for this purpose [13]
  • Rely upon a set of core ontologies for external re-use purposes; Structured Dynamics tends to rely on a set of primary and secondary standard ontologies [14]. The corollary to this best practice is don’t link indiscriminantly.

Structure and Design

  • Begin with a lightweight, domain ontology [15]. Ontologies built for the pragmatic purposes of setting context and aiding disparate data to interoperate tend to be lightweight with only a few predicates, such as isAbout, narrowerThan or broaderThan. But, if done properly, these lighter weight ontologies with more limited objectives can be surprisingly powerful in discovering connections and relationships. Moreover, they are a logical and doable intermediate step on the path to more demanding semantic analysis. Because we have this perspective, we also tend to rely heavily on the SKOS vocabulary for many of our ontology constructs [16]
  • Try to structurally split domain concepts from instance records. Concepts represent the nodes within the structure of the ontology (also known as classes, subject concepts or the TBox). Instances represent the data that populates that structure (also known as named entities, individuals or the ABox) [17]. Trying to keep the ABox and TBox separate enables easier maintenance, better understandability of the ontology, and better scalability and incorporation of new data repositories
  • Treat many concepts via “punning” as both classes and instances (that is, as either sets or members, depending on context). The “punning” technique enables “metamodeling,” such as treating something via its IRI as a set of members (such as Mammal being a set of all mammals) or as an instance (such as Endangered Mammal) when it is the object of a different contextual assertion. Use of “metamodeling” is often helpful to describe the overall conceptual structure of a domain. See endnote [18] for more discussion on this topic
  • Build ontologies incrementally. Because good ontologies embrace the open world approach [19], working toward these desired end states can also be incremental. Thus, in the face of common budget or deadline constraints, it is possible initially to scope domains as smaller or to provide less coverage in depth or to use a small set of predicates, all the while still achieving productive use of the ontology. Then, over time, the scope can be expanded incrementally. Much value can be realized by starting small, being simple, and emphasizing the pragmatic. It is OK to make those connections that are doable and defensible today, while delaying until later the full scope of semantic complexities associated with complete data alignment
  • Build modular ontologies that split your domain and problem space into logical clusters. Good ontology design, especially for larger projects, warrants a degree of modularity. An architecture of multiple ontologies often works together to isolate different work tasks so as to aid better ontology management. Also, try to use a core set of primitives to build up more complex parts. This is a kind of reuse within the same ontology, as opposed to reusing external ontologies and patterns. The corollary to this is: the same concepts are not created independently multiple times in different places in the ontology. Adhering to both of these practices tends to make ontology development akin to object-oriented programming
  • Assign domains and ranges to your properties. Domains apply to the subject (the left hand side of a triple); ranges to the object (the right hand side of the triple). Domains and ranges should not be understood as actual constraints, but as axioms to be used by reasoners. In general, domain for a property is the range for its inverse and the range for a property is the domain of its inverse. Use of domains and ranges will assist testing (see below) and help ensure the coherency of your ontology
  • Assign property restrictions, but do so sparingly and judiciously [20]. Use of property restrictions will assist testing (see below) and help ensure the coherency of your ontology
  • Use disjoint classes to separate classes from one another where the logic makes sense and dictates (if not explicitly stated, they are assumed to overlap)
  • Write the ontology in a machine-processable language such as OWL or RDF Schema (among others), and
  • Aggressively use annotation properties (see next) to promote the usefulness and human readability of the ontology.

Naming and Vocabulary Best Practices

  • Name all concepts as single nouns. Use CamelCase notation for these classes (that is, class names should start with a capital letter and not contain any spaces, such as MyNewConcept)
  • Name all properties as verb senses (so that triples may be actually read); e.g., hasProperty. Try to use mixedCase notation for naming these predicates (that is, begin with lower case but still capitalize thereafter and don’t use spaces)
  • Try to use common and descriptive prefixes and suffixes for related properties or classes (while they are just labels and their names have no inherent semantic meaning, it is still a useful way for humans to cluster and understand your vocabularies). For examples, properties about languages or tools might contain suffixes such as ‘Language‘ or ‘Tool‘ for all related properties
  • Provide inverse properties where it makes sense, and adjust the verb senses in the predicates to accommodate. For example, <Father> <hasChild> <Janie> would be expressed inversely as <Janie> <isChildOf> <Father>
  • Give all concepts and properties a definition. The matching and alignment of things is done on the basis of concepts (not simply labels) which means each concept must be defined [21]. Providing clear definitions (along with the coherency of its structure) gives an ontology its semantics. Remember not to confuse the label for a concept with its meaning. (This approach also aids multi-linguality). In its own ontologies, Structured Dynamics uses the property of skos:definition, though others such as rdfs:comment or dc:description are also commonly used
  • Provide a preferred label annotation property that is used for human readable purposes and in user interfaces. For this purpose, Structured Dynamics uses the property of skos:prefLabel
  • Include a class “SemSet”, which means a series of alternate labels and terms to describe the concept. These alternatives include true synonyms, but may also be more expansive and include jargon, slang, acronyms or alternative terms that usage suggests refers to the same concept. The umbel:SemSet construct enables a listing of individual members to be generated that provides the matching set for tagging and information extraction tasks. (As such, also include the prefLabel in the SemSet for proper lookup and tagging purposes.) The SemSet construct is similar to the “synsets” in Wordnet, but with a broader use understanding. This construct is an integral part of Structured Dynamics’ approach to using ontologies for information extraction and tagging of unstructured text
  • Try to assign logical and short names to namespaces used for your vocabularies, such as foaf:XXX, umbel:XXX or skos:XXX, with a maximimum of five letters preferred
  • Enable multi-lingual capabilities in all definitions and labels. This is a rather complicated best practice in its own right. For the time being, it means being attentive to the xml:lang=”en” (for English, in this case) property for all annotation properties
  • (If you disagree with these naming conventions, use your own, but in any event, be consistent!!).

Documentation Best Practices

  • Like good software programs, a properly constructed and commented ontology is the first requirement of best practice documentation
  • The entire ontology vocabulary should be documented via a dedicated system that allows finding, selecting and editing of any and all ontology terms and their properties
  • The methodologies should be documented for ontology construction and maintenance, including naming, selection, completeness and other criteria. Documents such as this one and others in this series provide examples of important supplementary documentation regarding methodology and practice
  • Provide a complete TechWiki-like documentation system for use cases, best practices, evaluation and testing metrics, tools installation and use, and all aspects of the ontology lifecycle should be provided and supported [22]
  • Develop a complete graph of the ontology and make it available via graph visualization tools to aid understanding of the ontology in its complete aspect [23], and
  • Other ample diagrams and flowcharts should also be prepared and made available for knowledge workers’ use. UML diagrams, for example, might be included here, but general workflows and concept relationships should be explicated in any case through visual means. Such diagrams are much easier to understand and follow than the actual ontology specification.

Organizational and Collaborative Best Practices

  • Collaboration is an implementation best practice [24]
  • Re-use of already agreed-up structures and vocabularies respects prior investments and needs to be emphasized
  • Improved processes for consensus making, including tools support, must be found to enable work groups to identify and decide upon terminology, definitions, alternative labels (SemSets), and relations between concepts. These processes need not be at the formal ontology level, but at the level of the concept graph underlying the ontology [24].

Testing Best Practices

  • Test new concepts, aided by proper domain, range and property restrictions; by invoking reasoners such that inconsistencies can be determined [25]
  • Test new properties, aided by invoking reasoners, which will identify inconsistencies [25]
  • Test via external class assignments, by linking to classes in external ontologies, which acts to ‘explode the domain’ [26]
  • Use external knowledge bases and ontologies, such as Cyc or UMBEL [27], to conduct coherency testing for the basic structure and relationships in the ontology
  • Evolve the ontology specification to include necessary and sufficient conditions [25] aid more complete reasoner testing for consistency and coherence.

Best Practices for Adaptive Ontologies

In the case of ontology-driven applications using adaptive ontologies [28], there are also additional instructions contained in the system (via administrative ontologies) that tell the system which types of widgets need to be invoked for different data types and attributes. This is different from the standard conceptual schema, but is nonetheless essential to how such applications are designed.

  • Use the structWSF middleware layer [29] as the abstract access point to:
    • To create, update, delete or otherwise manage data records
    • To browse or view existing records or record sets, based on simple to possible complex selection or filtering criteria, or
    • To take one of these results sets and progress it through various workflows involving specialized analysis, applications, or visualization.
  • Supplement the domain ontology with a semantic component ontology for the purposes of guiding data widget display and visualization [30], and
  • Supplement the domain ontology with the irON (instance record Object Notation) for dataset exchange and interoperability [31].

The administrative ontologies supporting these applications are managed differently than the standard domain ontologies that are the focus of most of the best practices above. Nonetheless, some of the domain ontology best practices work in tandem with them, the combination of which are called adaptive ontologies.


[1] This posting is part of a current series on ontology development and tools, now permanently archived and updated on the OpenStructs TechWiki. The series began with An Executive Intro to Ontologies, then continued with an update of the prior Ontology Tools listing, which now contains 185 tools. It progressed to a survey of ontology development methodologies. That led to a presentation of a new, Lightweight, Domain Ontologies Development Methodology. That piece was then expanded to address A New Landscape in Ontology Development Tools. This portion completes the series.
[2] Natalya F. Noy and Deborah L. McGuinness, 2001. “Ontology Development 101: A Guide to Creating Your First Ontology,” Stanford University Knowledge Systems Laboratory Technical Report KSL-01-05, March 2001. See http://protege.stanford.edu/publications/ontology_development/ontology101-noy-mcguinness.html.
[3] Matthew Horridge et al., 2009. A Practical Guide to Building OWL Ontologies Using Protégé 4 and CO-ODE Tools, manual prepared by the University of Manchester, March 13, 2009. 108 pp. See http://owl.cs.manchester.ac.uk/tutorials/protegeowltutorial/resources/ProtegeOWLTutorialP4_v1_2.pdf.
[4] Specialty search engines for ontologies include Swoogle, FalconS, Watson, Sindice and SWSE. In addition, one can use a general search engine such as Google with a search query such as <topic> owl:equivalentClass filetype:owl. Note the filetype might also include RDF or a variant such as N3, and other language-specific constructs of interest can also be substituted for the owl:equivalentClass.
[5] The TONES Ontology Repository is primarily designed to be a central location for ontologies that might be of use to tools developers for testing purposes. It has a nice browse facility, as well as filtering by OWL vocabulary. The system contains about 220 ontologies and is powered by the OWL API.
[6] OntologyDesignPatterns.org is a semantic Web portal dedicated to ontology design patterns (ODPs). The portal was started under the NeOn project, which still partly supports its development.
[7] Elena Paslaru Bontas Simperl and Christoph Tempich, 2006. “Ontology Engineering: A Reality Check,” in Proceedings of the 5th International Conference on Ontologies, Databases, and Applications of Semantics ODBASE2006, 2006. See http://ontocom.ag-nbi.de/docs/odbase2006.pdf .
[9] Barry Smith et al., 2007. “The OBO Foundry: Coordinated Evolution of Ontologies to Support Biomedical Data Integration,” in Nature Biotechnology 25: 1251 – 1255, published online 7 November 2007; see http://www.nature.com/nbt/journal/v25/n11/pdf/nbt1346.pdf.
[10] See the NeOn networked ontologies project; see http://www.neon-project.org/. The four-year project began in 2006 and its first open source toolkit was released by the end of 2007. OWL features were added in 2008-09. NeON has since completed, though its toolkit and plug-ins can still be downloaded as open source.
[11] Brian Sletten, 2008. “Applying SKOS Concept Schemes,” on the DevX Web site, July 22, 2008; see http://www.devx.com/semantic/Article/38629.
[12] M. K. Bergman, 2009. “Confronting Misconceptions with Adaptive Ontologies,” AI3:::Adaptive Information blog, Aug. 17, 2009.
[13] UMBEL (Upper Mapping and Binding Exchange Layer) is an ontology of about 20,000 subject concepts that acts as a reference structure for inter-relating disparate datasets. It is also a general vocabulary of classes and predicates designed for the creation of domain-specific ontologies.
[14] Core ontologies are Dublin Core, DC Terms, Event, FOAF, GeoNames, SKOS, Timeline, and UMBEL. The various criteria that are considered in nominating an existing ontology to “core” status is that it should be general; highly used; universal; broad committee or community support; well done and documented; and easily understood. Though less universal, there are also a number of secondary ontologies, namely BIBO, DOAP, and SIOC.
[15] See Fausto Giunchiglia, Maurizio Marchese and Ilya Zaihrayeu, 2006. “Encoding Classifications into Lightweight Ontologies,” see http://www.science.unitn.it/~marchese/pdf/encoding%20classifications%20into%20lightweight%20ontologies_JoDS8.pdf. Also, M. K. Bergman, 2010. “A New Methodology for Buidling Lightweight, Domain Ontologies,” AI3:::Adaptive Information blog, Sept. 1, 2010.
[16] Alistair Miles and Sean Bechhofer, eds., 2009. SKOS Simple Knowledge Organization System Reference, W3C Recommendation, 18 August 2009. See http://www.w3.org/TR/skos-reference/. Some of the common SKOS predicates used in our ontologies include skos:definition, skos:prefLabel, skos:altLabel, skos:broaderTransitive, skos:narrowerTransitive.
[17] The TBox portion, or classes (concepts), is the basis of the ontologies. The ontologies establish the structure used for governing the conceptual relationships for that domain and in reference to external (Web) ontologies. The ABox portion, or instances (named entities), represents the specific, individual things that are the members of those classes. Named entities are the notable objects, persons, places, events, organizations and things of the world. Each named entity is related to one or more classes (concepts) to which it is a member. Named entities do not set the structure of the domain, but populate that structure. The ABox and TBox play different roles in the use and organization of the information and structure. These distinctions have their grounding in description logics.
[18] In the domain ontologies that are the focus here, we often want to treat our concepts as both classes and instances of a class.  This is known as “metamodeling” or “metaclassing” and is enabled by “punning” in OWL 2.  For example, here a case cited on the OWL 2 wiki entry on “punning“:
People sometimes want to have metaclasses. Imagine you want to model information about the animal kingdom. Hence, you introduce a class a:Eagle, and then you introduce instances of a:Eagle such as a:Harry.

(1) a:Eagle rdf:type owl:Class
(2) a:Harry rdf:type a:Eagle

Assume now that you want to say that “eagles are an endangered species”. You could do this by treating a:Eagle as an instance of a metaconcept a:Species, and then stating additionally that a:Eagle is an instance of a:EndangeredSpecies. Hence, you would like to say this:

(3) a:Eagle rdf:type a:Species
(4) a:Eagle rdf:type a:EndangeredSpecies.

This example comes from Boris Motik, 2005. “On the Properties of Metamodeling in OWL,” paper presented at ISWC 2005, Galway, Ireland, 2005.

“Punning” was introduced in OWL 2 and enables the same IRI to be used as a name for both a class and an individual. However, the direct model-theoretic semantics of OWL 2 DL accommodates this by understanding the class Father and the individual Father as two different views on the same IRI, i.e., they are interpreted semantically as if they were distinct. The technique listed in the main body triggers this treatment in an OWL 2-compliant editor. See further Pascal Hitzler et al., eds., 2009. OWL 2 Web Ontology Language Primer, a W3C Recommendation, 27 October 2009; see http://www.w3.org/TR/owl2-primer/.

[19] There is a role and place for closed world assumption (CWA) ontologies, though Structured Dynamics does not engage in them.

CWA is the traditional perspective of relational database systems within enterprises. The premise of CWA is that which is not known to be true is presumed to be false; or, any statement not known to be true is false. Another way of saying this is that everything is prohibited until it is permitted. CWA works well in bounded systems such as known product listings or known customer rosters, and is one reason why it is favored for transaction-oriented systems where completeness and performance are essential. In an ontology sense, CWA works best for bounded engineering environments such as aeronautics or petroleum engineering. Closed world ontologies also tend to be much more complicated with many nuanced predicates, and can be quite expensive to build.

The open world assumption (OWA), on the other hand, is premised that the lack of a given assertion or fact being available does not imply whether that possible assertion is true or false: it simply is not known. In other words, lack of knowledge does not imply falsity, and everything is permitted until it is prohibited. As a result, open world works better in knowledge environments with the incorporation of external information such as business intelligence, data warehousing, data integration and federation, and knowledge management.

See further, M. K. Bergman, 2009. “The Open World Assumption: Elephant in the Room,” AI3:::Adaptive Information blog, Dec. 21, 2009.

[20] See [3] for a good description of property restrictions in Section 4 and Appendix A.
[21] As another commentary on the importance of definitions, see http://ontologyblog.blogspot.com/2010/09/physician-decries-lack-of-definitions.html.
[22] The technical wiki (TechWiki) is the central repository for all documentation related to OpenStructs projects. TechWiki is the location for users and interested parties to learn about these projects and their applications, and for developers to author and write about their use and best practices. Both the TechWiki’s content and its software and organizatonal structure may be downloaded for free for setting up similar local technical documentation.
[23] See M. K. Bergman, 2008. “Large-scale RDF Graph Visualization Tools,” AI3:::Adaptive Information blog, Jan. 28, 2008; and M. K. Bergman, 2008. “Cytoscape: Hands-down Winner for Large-scale Graph Visualization,” AI3:::Adaptive Information blog, Jan. 28, 2008.
[24] The central role of ontologies is to describe a “worldview” and in specific organizations this means a shared understanding of the concepts, relations and terminology to describe the participants’ common domain. In turn, these shared understandings establish the semantics for how to effect communication and understanding within the population of domain users. All of this means that finding ways to identify and agree upon shared vocabularies and understandings is central to the task of modeling (creating an ontology) for the domain.

Sometimes this perception of shared views is too strictly interpreted as needing to have one and only one understanding of concepts and language. Far from it. One of the strengths of ontologies and language modeling within them is that multiple terms for the same concept or slight differences in understandings about nearly similar concepts can be accommodated. It is perfectly OK to have differences in terminology and concept understandings so long as those differences are also captured and explicated within the ontology. The recommendations herein that all concepts and terminology be defined, that SemSets be used to capture alternative ways to name concepts, and that concepts often be treated as both classes and instances are some of the best practices that reflect this approach.

So, while consensus building and collaboration methods are at the heart of effective ontology building, those methods need not also strive for a imposition of language and concepts by fiat. In fact, trying to do so undercuts the ability of the collaborative process to lead to greater shared understandings.

[25] See [3] for a good description of various testing and consistency checks in Sections 4.9 to 4.14.
[26] See Frédérick Giasson, 2008. “Exploding the Domain,” from his blog, April 20, 2008. ‘Exploding the domain’ means what happens when internal ontology concepts are linked to related ones on the external Web, which helps to bring in more information and context about the concept. It is also a way to test the coherence of the original concept.
[27] Already vetted knowledge bases can be a good reference testbed for testing the coherence of concepts in a new domain ontology. If the domain ontology describes concepts quite differently than standard practice (Wikipedia, Cyc and UMBEL are good for testing this), or if relationships between concepts are greatly at variance (Cyc and UMBEL are good for this), then there are likely coherency problems. In other domains other reference knowledge bases, more specific to the domain, can be used in similar ways.
[28] Structured Dynamics’ ontology-driven apps are generic applications, the operations of which are guided by the instructions and nature of the underlying data that feeds them. For example, in the case of a standard structured data display (say, a simple table like a Wikipedia infobox), such generic design includes templates tailored to various instance types (say, locational information presenting on a map versus people information warranting a image and vital statistics). Alternatively, in the generic design for a data visualization application using Adobe Flash, the information output of the results set contains certain formats and attributes, keyed by an administrative ontology linked by data type to a domain ontology’s results sets.

These ontology-driven apps, then, are informed structured results sets that are output in a form suitable to various intended applications. This output form can include a variety of serializations, formats or metadata. This flexibility of output is tailored to and responsive to particular generic applications; it is what makes our ontologies “adaptive”. Using this structure, it is possible to either “drive” queries and results sets selections via direct HTTP request or via simple dropdown selections on HTML forms. Similarly, it is possible with a single parameter change to drive either a visualization app or a structured table template from the equivalent query request. Ontology-driven apps through this ontology and architecture design thus provide two profound benefits. First, the entire system can be driven via simple selections or interactions without the need for any programming or technical expertise. And, second, simple additions of new and minor output converters can work to power entirely new applications available to the system.

[29] The structWSF Web services framework is generally RESTful middleware that provides a bridge between existing content and structure and content management systems and available indexing engines and RDF data stores. structWSF is a platform-independent means for distributed collaboration via an innovative dataset access paradigm. It has about twenty embedded Web services. See http://openstructs.org/structwsf.
[30] A semantic component is a Flex component that takes record descriptions and schema as input, and then outputs some (possibly interactive) visualizations of that record. Depending on the logic described in the input schema and the input record descriptions, the semantic component will behave differently to optimize its presentation to the users. About a dozen semantic components are available from the Semantic Component (Flex) Library. The Semantic Component Ontology is the governing structure for these schema.
[31] irON (instance record and Object Notation) is a abstract notation and associated vocabulary for specifying RDF triples and schema in non-RDF forms. Its purpose is to allow users and tools in non-RDF formats to stage interoperable datasets using RDF. The notation supports writing RDF and schema in JSON (irJSON), XML (irXML) and comma-delimited (CSV) formats (commON). The notation specification includes guidance for creating instance records (including in bulk), linkages to existing ontologies and schema, and schema definitions. Profiles and examples and code parsers and converters are also provided for the irXML, irJSON and commON serializations.