Improved Ontology Navigation and Management in Read-only and Editable FormsThis continues our series on the new UMBEL portal. UMBEL, the Upper Mapping and Binding Exchange Layer, is an upper ontology of about 28,000 reference concepts and a vocabulary designed for domain ontologies and ontology mapping [1]. This part four discusses structOntology, the online ontology viewing and management tool that is an integral part of the open semantic framework (OSF), the framework that hosts the UMBEL portal.
Ontologies are the central governing structure or “brains” of a semantic installation. As provided by the OSF framework, ontologies are also the basis for instructing user interface labels and how the interface behaves. The Web is about global access, immediacy, flexibility and adaptability. Why can’t our use of ontologies be the same?
Unlike similar tools of the past, structOntology exists on the same installation as the ontology that drives it. It is a backoffice ontology editing and management tool that is part of the conStruct tool suite, accessible via the OSF admin panel. There is no need to go off to a separate application, make changes, re-import, and then test. structOntology allows all of that to occur locally with the instance in which it resides. Also, there are some important functionality differences — especially finding and selecting stuff and search — that sets structOntology apart from existing, conventional tools.
Yet, that being said, structOntology is also not the complete Swiss Army knife for ontology management. It is designed for local and immediate use. Its spectrum of functionality is not as complete as other ontology frameworks (for example, supporting reasoners, consistency testers or plug-ins). So, for immmediate and locally relevant use, structOntology appears to be the appropriate tool. For more detailed ontology work or testing, other frameworks are perhaps more useful. And, in recognition of these roles, structOntology also has robust import and export capabilities that enable these dual local-detailed use scenarios. For these distinctions, see further the structOntology v Protégé? document.
structOntology comes in two versions. First, there is the read-only version, which can be made publicly available, that is a great aid to ontology navigation and discovery. This is the version viewable on the UMBEL portal. Second, there is an editable version, which is only available to administrators via a back office function within an OSF instance. Some screen shots of this version, plus pointers to more documentation about it, are provided below.
What enables OSF to treat ontologies as a first-class citizen — viewable and editable from within the applications in which they operate — results from the incorporation of the OWL API as one of the major engines underlying the structWSF Web services framework, the key foundational basis to an OSF installation. As noted in Part 2 of this series, the OWL API is one of the four major engines supporting structWSF:
The OWL API is the same engine used by Protégé 4, which is why both structOntology and Protégé are fully interoperable.
Besides interoperabilty, the use of the OWL API also means that other OWL API-based tools, such as reasoners or mappers, may be linked into the system. This design is in keeping with our normative view of an ontology tooling landscape, which Structured Dynamics keeps pursuing in a steady, incremental manner [2]. Further, because of its sibling engines, the OWL API and OSF are also able to leverage the other engines supporting structWSF, such as Solr for advanced search or efficient indexing in the RDF triplestore. (The advantages go both ways, too, such as for example enabling the OWL API to feed appropriate ontology specifications to the GATE text processing area for uses such as ontology-based information extraction [OBIE]). All of this makes for a most powerful and capable foundation to an OSF instance.
Since UMBEL is a reference ontology and the UMBEL portal is an access point to those references and specifications, we really don’t want casual users making modifications to the ontology [3]. For this reason, only a read-only version of structOntology is provided on the portal.
Access to the structOntology function occurs via the Ontology link on the UMBEL portal. Upon access, you are presented with the main structOntology interface:
The organization of the structOntology application presents all currently available and active ontologies listed in the left panel; UMBEL, of course, is the one selected here. Since this is a read-only version, only the View button shows up in the right-hand panel. (For the options available in the editable version, see below.)
Upon invoking the View option, the hierarchical tree for the selected ontology appears on the left; structural and definitions on the right.
You may expand the tree and explore the structure deeper by either clicking on the tree nodes in the left-hand panel or the item links in the right-hand panel. If there are further levels in the tree, you will get the JavaScript ‘working’ icon and then see the tree expanded with the new node information shown to the right.
Also note that your interaction with the structOntology application is recounted via the “breadcrumbs” listing at the upper left of the application. The green arrow icon allows you to expand or collapse various sections in the display.
The tree labels are themselves based on the preferred labels assigned to things. However, if you want to see the actual ontology URI reference, you can do so via the tooltip when mousing over the item:

The tooltip shows the full URI path (unique identifier) of the selected item.
This example has been based on the Classes tab, which are the reference concepts in the UMBEL context. In read-only mode, the basic information presented is the tree structure, the item description and prefLabel, and super and sub class information in the right-hand panel. (More options are available in the editable version; see below.)
Properties — that is the relations or predicates between items or nodes — are presnted in a similar manner to that for Classes. The Properties tab has the same basic layout and operations as the Classes tab, including similar right-hand panels:
The editable version of structOntology shares all of the functionality of the read-only version. Besides adding editing capabilities, the editable version also has other functionality related to general ontology creation and management. There is separate documentation for the editable version; the examples below are from a different instance than UMBEL.
The editable version is accessed via the backoffice admin function within an OSF instance. When invoked, it also has more management options presented in the right-hand panel:
We’ll highlight some of the differences from the read-only version below.
The first notable addition is the ability to create ontologies (as well as to delete, or Remove, them):
The URL (such as http://purl.org/ontology/myont#) becomes the base URI for the new ontology. The new ontology is created with a basic structure, from which you only need fill in your new concepts or classes and relationships:
Basic stubbing is provided for the new ontology to help bootstrap its development (not shown). Once created, this new ontology also now appears on the available local ontologies when first invoking the structOntology application.
Most screens are quite similar to the read-only version with the obvious change of replacing labels with edit boxes. It is via these edit fields that the ontology becomes editable. This change is quite evident for the View screen:
Searching can take place on the currently active ontology or all loaded (available) ontologies. Note that selection was made above via the radiobutton under the search box.
Also, depending on settings, searching can also take place on only the preferred label, or on alternative labels or descriptions (in fact, all annotations). (This is part of the settings.)
When entering search terms, the system automatically attempts to complete the matching search phrase. A minimum of three entered characters guides this auto-completion functionality:
When search is initiated, the potential results list also auto-completes for what you have already typed into the search box. Upon selection of one of these items (or completion of the full search phrase), the structOntology system issues a search query to the remote server, which then acts to auto-populate the ontology tree on the left-hand panel. In this case, we have selected ‘communitiy facilities’:
The desired search results then automatically expand the ontology tree. This is really helpful for longer ontologies (the example one shown has about 3000 concepts and about 6000 axioms) and means quicker initial tree loading. Once completed, the (multiple) occurrences of the search item are shown in highlight throughout the tree.
Note this search is not necessarily restricted to the actual node label. Alternative labels and descriptions may also be used to find the search results. This greatly expands the findability of the search function. Here is a great example of matching the OWL API engine to Solr underneath a structWSF instance.
The editable version of structOntology offers more detail in the right-hand panel when Viewing an item. These sections include:
Each section is editable. All have auto-complete. Each section may also be expanded or collapsed.
Each panel has an expand and collapse arrow shown at the upper right of its panel. These causes the panel’s individual entries to either be exposed or hidden. At the right of each entry, new entries can be invoked with the green plus symbol; existing entries can be deleted with the red minus symbol. (See Structural Relationships below.)
In working with each panel, note that each entry also has the search and auto-complete features earlier noted. Drag-and-drop is also contextual into these panels or not, depending on the nature of the item selected in the left-hand panel (tree).
Annotations provide the descriptions about the thing at hand and its associated metadata. (These are separately defined under the Properties tab, or as part of the imported ontology specification.) The available annotations are displayed in this panel when expanded:
Entries are simply provided by entering values into the text fields and then Saving.
The structural relationships are the means to set parent and child relations between concepts, as well as to instruct disjoint or equivalent class relations. The Structural Relationships panel is the key one for setting the interconnections within the graph structure at the heart of the governing ontology.
Most of the key structural relationships in OWL are provided by this panel. (However, note there are some additional and rarely used structural specifications in OWL. These must be set via a third-party external application. Such potential interactions are made possible via the flexible import and export options with structOntology).
Another right-hand panel provides the facility to assign individuals to the classes (or concepts) established under the prior two panels. In this case, we are looking at some specific ‘community facilities’ to assign to that concept:
As with the prior panels, a new instance may be added or discarded ones deleted. Individual instances and their characteristics may also be updated or changes.
Another aspect to OSF ontologies is the ability to relate concepts to various metadata characteristics or attributes that might describe that concept’s instances. This relationship is done via the dedicated hasCharacteristic property, which is assigned via this right-hand panel:
This option has the specific behavior of allowing one or more properties (characteristics) to be asserted for a given a class (concept).
Display and widget and other options are set under the Advanced Options panel. One item to note are the widgets that may be assigned for displaying a given information item:
The relationship of widgets (or semantic components) to information items is a deserving topic in its own right. For more information about this topic, see the semantic components category.
In edit mode, it is possible to drag items from the left-hand tree panel into the specifications at the right. This is contextual. In this first example, we see an attempt to drop a “class” result (or concept) into the annotation panel, which violates the structure of the system and is therefore not allowed (as shown by the visual red X cues):
However, if we drag and drop from the tree in an allowable structural definition, we get the visual green check as a cue the move is legal:
This functionality and feedback means that only allowable assignments can be dropped into a new structural definition.
Another piece of functionality in the editable version is the export option. When invoked, Export brings up the save dialog with the ability to assign an ontology file name:
Upon saving, it stores the currently active ontology in RDF/XML format:
Export is not active in UMBEL do to the large size of the ontology. If you want to obtain it directly, you may do so from the UMBEL ontology CVS.
An Import option is available in the editable version. structOntology import supports all OWL API serializations, specifically RDF/XML, N3, Manchester Syntax and Turtle. When import is invoked, a file open dialog is presented that enables you to find the ontology on your local hard drive:
The Import feature has no file extension limitations; make care to pick and assign the proper types for importation.
Via the Import and Export buttons, it is possible to work locally with structOntology while exporting to more capable third-party tools. Then, once use of those tools is complete, Import provides the ability to re-import the updated ontology back into the local collection.
Finally, as a server-based system accessed via Web services, there are some slightly different concepts necessary to keep in mind when using the editable version of structOntology. These distinctions need to be kept in mind because you might be working with the local version or the one on the main server. These file options are:
These are all available via buttons under the main right-hand panel in structOntology and are more fully described in the edit version documentation.
Additional information on structOntology may be found in an online video:

This is the fourth of a multi-part series on the newly updated UMBEL services. Other articles in this series are:
Documenting the Emerging Ecosystem Around OWL 2We have been touting the importance of OWL 2 as the language of choice for federating and reasoning over RDF and ontologies. An absolutely essential enabler of the OWL 2 language is version 3 of the OWL API (actually, version 3.2.4 at the time of this writing), a Java-based framework for accessing and managing the language. Protégé 4, the most popular open source ontology editor and integrated development environment (IDE), for example, is built around the OWL API.
As we laid out a bit more than a year ago, now codified on our TechWiki as the Normative Landscape of Ontology Tools (especially the second figure), we see the OWL API as the essential pivot point for all forms of ontology tools moving forward.
We have attempted to assemble a definitive and comprehensive list of all known tools presently based around version 3 of the OWL API. (We have surely missed some and welcome comments to this post that identify missing ones; we promise to add them and keep tracking them.) Herein is a listing of the 30 or so known OWL API-based tools:
Ignazio Palmisano also graciously suggested these additional sources:
which also further leads to this additional listing:
It is not clear if all of these offer OWL 2 support, let along work with the current OWL API.
Visualization + Analysis Pushes Aside CytoscapeThough I never intended it, some posts of mine from a few years back dealing with 26 tools for large-scale graph visualization have been some of the most popular on this site. Indeed, my recommendation for Cytoscape for viewing large-scale graphs ranks within the top 5 posts all time on this site.
When that analysis was done in January 2008 my company was in the midst of needing to process the large UMBEL vocabulary, which now consists of 28,000 concepts. Like anything else, need drives research and demand, and after reviewing many graphing programs, we chose Cytoscape, then provided some ongoing guidelines in its use for semantic Web purposes. We have continued to use it productively in the intervening years.
Like for any tool, one reviews and picks the best at the time of need. Most recently, however, with growing customer usage of large ontologies and the development of our own structOntology editing and managing framework, we have begun to butt up against the limitations of large-scale graph and network analysis. With this post, we announce our new favorite tool for semantic Web network and graph analysis — Gephi — and explain its use and showcase a current example.
Three and one-half years ago when I first wrote about Cytoscape, it was at version 2.5. Today, it is at version 2.8, and many aspects have seen improvement (including its Web site). However, in other respects, development has slowed. For example, version 3.x was first discussed more than three years ago; it is still not available today.
Though the system is open source, Cytoscape has also largely been developed with external grant funds. Like other similarly funded projects, once and when grant funds slow, development slows as well. While there has clearly been an active community behind Cytoscape, it is beginning to feel tired and a bit long in the tooth. From a semantic Web standpoint, some of the limitations of the current Cytoscape include:
Undoubtedly, were we doing semantic technologies in the biomedical space, we might well develop our own plug-ins and contribute to the Cytoscape project to help overcome some of these limitations. But, because I am a tools geek (see my Sweet Tools listing with nearly 1000 semantic Web and -related tools), I decided to check out the current state of large-scale visualization tools and see if any had made progress on some of our outstanding objectives.
There are three classes of graph tools in the semantic technology space:
One could argue that the first two categories have received the most current development attention. But, I would also argue that the third class is one of the most critical: to understand where one is in a large knowledge space, much better larger-scale visualization and navigation tools are needed. Unfortunately, this third category is also the one that appears to be receiving the least development attention. (To be sure, large-scale graphs pose computational and performance challenges.)
In the nearly four years since my last major survey of 26 tools in this category, the new entrants appear quite limited. I’ve surely overlooked some, but the most notable are Gruff, NAViGaTOR, NetworkX and Gephi [1]. Gruff actually appears to belong most in Category #2; I could find no examples of graphs on the scale of thousands of nodes. NAViGaTOR is biomedical only. NetworkX has no direct semantic graph importing and — while apparently some RDF libraries can be used for manipulating imports — alternative workflows were too complex for me to tackle for initial evaluation. This leaves Gephi as the only potential new candidate.
From a clean Web site to well-designed intro tutorials, first impressions of Gephi are strongly positive. The real proof, of course, was getting it to perform against my real use case tests. For that, I used a “big” ontology for a current client that captures about 3000 different concepts and their relationships and more than 100 properties. What I recount here — from first installing the program and plug-ins and then setting up, analyzing, defining display parameters, and then publishing the results — took me less than a day from a totally cold start. The Gephi program and environment is surprisingly easy to learn, aided by some great tutorials and online info (see concluding section).
The critical enabler for being able to use Gephi for this source and for my purposes is the SemanticWebImport plug-in, recently developed by Fabien Gandon and his team at Inria as part of the Edelweiss project [2]. Once the plug-in is installed, you need only open up the SemanticWebImport tab, give it the URL of your source ontology, and pick the Start button (middle panel):
Note the SemanticWebImport tool also has the ability (middle panel) to issue queries to a SPARQL endpoint, the results of which return a results graph (partial) from the source ontology. (This feature is not further discussed herein.) This ontology load and display capability worked without error for the five or six OWL 2 ontologies I initially tested against the system.
Once loaded, an ontology (graph) can be manipulated with a conventional IDE-like interface of tabs and views. In the right-hand panels above we are selecting various network analysis routines to run, in this case Average Degrees. Once one or more of these analysis options is run, we can use the results to then cluster or visualize the graph; the upper left panel shows highlighting the Modularity Class, which is how I did the community (clustering) analysis of our big test ontology. (When run you can also assign different colors to the cluster families.) I also did some filtering of extraneous nodes and properties at this stage and also instructed the system via the ranking analysis to show nodes with more link connections as larger than those nodes with fewer links.
At this juncture, you can also set the scale for varying such display options as linear or some power function. You can also select different graph layout options (lower left panel). There are many layout plug-in options for Gephi. The layout plugin called OpenOrd, for instance, is reported to be able to scale to millions of nodes.
At this point I played extensively with the combination of filters, analysis, clusters, partitions and rankings (as may be separately applied to nodes and edges) to: 1) begin to understand the gross structure and characteristics of the big graph; and 2) refine the ultimate look I wanted my published graph to have.
In our example, I ultimately chose the standard Yifan Hu layout in order to get the communities (clusters) to aggregate close to one another on the graph. I then applied the Parallel Force Atlas layout to organize the nodes and make the spacings more uniform. The parallel aspect of this force-based layout allows these intense calculations to run faster. The result of these two layouts in sequence is then what was used for the results displays.
Upon completion of this analysis, I was ready to publish the graph. One of the best aspects of Gephi is its flexibility and control over outputs. Via the main Preview tab, I was able to do my final configurations for the published graph:
The graph results from the earlier-worked out filters and clusters and colors are shown in the right-hand Preview pane. On the left-hand side, many aspects of the final display are set, such as labels on or off, font sizes, colors, etc. It is worth looking at the figure above in full size to see some of the options available.
Standard output options include either SVG (vector image) or PDFs, as shown at the lower left, with output size scaling via slider bar. Also, it is possible to do standard saves under a variety of file formats or to do targeted exports.
One really excellent publication option is to create a dynamically zoomable display using the Seadragon technology via a separate Seadragon Web Export plug-in. (However, because of cross-site scripting limitations due to security concerns, I only use that option for specific sites. See next section for the Zoom It option — based on Seadragon — to workaround that limitation.)
I am very pleased with the advances in display and analysis provided by Gephi. Using the Zoom It alternative [3] to embedded Seadragon, we can see our big ontology example with:
Note: at standard resolution, if this graph were to be rendered in actual size, it would be larger than 7 feet by 7 feet square at full zoom !!!
To compare output options, you may also;
It is notable that Gephi still only versions itself as an “alpha”. There is already a robust user community with promise for much more technology to come.
As an alpha, Gephi is remarkably stable and well-developed. Though clearly useful as is, I measure the state of Gephi against my complete list of desired functionality, with these items still missing:
Ultimately, of course, as I explained in an earlier presentation on a Normative Landscape for Ontology Tools, we would like to see a full-blown graphical program tie in directly with the OWL API. Some initial attempts toward that have been made with the non-Gephi GLOW visualization approach, but it is still in very early phases with ongoing commitments unknown. Optimally, it would be great to see a Gephi plug-in that ties directly to the OWL API.
In any event, while perhaps Cytoscape development has stalled a bit for semantic technology purposes, Gephi and its SemanticWebImport plug-in have come roaring into the lead. This is a fine toolset that promises usefulness for many years to come.
To learn more about Gephi, also see the:
Also, for future developments across the graph visualization spectrum, check out the Wikipedia general visualization tools listing on a periodic basis.
Structured Dynamics is pleased to unveil structOntology — its ontology manager application within the conStruct open source semantic technology suite. We are doing so via a video, which provides a bit more action about this exciting new app.
structOntology has been on our radar for more than two years. But, it was only in embracing the OWLAPI some eight months back that we finally saw our way clear to how to implement the system.
The app, superbly developed by Fred Giasson, has many notable advantages — some of which are covered by the video — but two deserve specific attention: 1) the superior search function (if you have been using Protégé or similar, you will love the fact this search indexes everything, courtesy of Solr); and 2) the availability of its functionality directly within the applications that are driven by the ontologies. Of course, there’s other cool stuff too!:
(If you have trouble seeing this, here is the direct YouTube link or an alternate local Flash version if you can not access YouTube.)
More information on structOntology will be forthcoming over the coming weeks. We will be posting it as open source as part of the Open Semantic Framework by early summer.
Though I have alluded to it numerous times in my past writings [1], I think one of the most pervasive and important benefits from semantic technologies in the enterprise will come from the democratization of information. These benefits will arise mostly from a fundamental change in how we manage and consume information. A new “system” of semantic technologies is now largely available that can put the collection, assembly, organization, analysis and presentation of information directly in the hands of those who need it most — the consumers of information.
The idea of “democratizing information” has been around for a couple of decades, and has accelerated in incidence since the dominance of the Internet. Most commonly, the idea is associated with developments and notions in such areas as citizen journalism, crowdsourcing, the wisdom of the crowd, social bookmarking (or collaborative tagging), and the democratic (small “d”) access to publishing via new channels such as blogs, microblogs (e.g., Twitter) and wikis. To be sure, these kinds of democratic information will (and are) benefiting from the use and application of semantics.
But the trend I’m focusing on here is much different and quite new. It is the idea that enterprise knowledge workers can now take ownership and control of their knowledge management functions. In the process, prior bottlenecks due to IT can be relieved and massive new benefits can open up to the enterprise.
It is no secret that IT has not served the enterprise knowledge management function well for decades. Transaction systems and database systems geared to fast indexing and access to datum have not proved well suited to information or knowledge management. KM includes such applications as business intelligence, data warehousing, data integration and federation, enterprise information integration and management, competitive intelligence, knowledge representation, and so forth. Information management is a bit broader category, and adds such functions as document management, data management, enterprise content management, enterprise or controlled vocabularies, systems analysis, information standards and information assets management to the basic functions of KM. Since the purpose of this piece is not to get into the epistemological differences between information and knowledge, I use these terms more-or-less interchangeably herein.
Knowledge and information management is very big business. Given the breadth and differences in defining the KM and IM markets, let’s take as a proxy the business intelligence (BI) market, one of KM’s most important elements. Various estimates from IDC, Gartner and others place the current value of BI software sales somewhere in the range of $9 billion to $11 billion annually [3]. Further, BI ranked number five on the list of the top 10 technology priorities for chief information officers (CIOs) in 2011. And this pertains to the structured component of information alone.
Yet, at the same time, BI-related projects continue to have high failure rates, often cited as in the 65% to higher range [4]. These failure rates are consistent with KM projects in general [5]. These failures are merely one expression of a constant litany of issues and concerns regarding the enterprise KM function:
| Conventional KM Problem Area | Comments |
| Inflexible Reports |
|
| Inflexible Analysis |
|
| Schema Bottlenecks |
|
| ETL Bottlenecks |
|
| Reliance on Intermediaries |
|
| Specialized Expertise Required |
|
| Slow Response Time |
|
| Dependence on External Apps |
|
| Unmet Needs |
|
| High Opportunity Costs |
|
| High Failure Rates |
|
The seeming contradiction between continued growth and expenditures for information management coupled with continued high failure rates and disappointments is really an expression of the centrality of information to the modern enterprise. The funding and growth of the IT function is itself an expression of this centrality and perceived importance. These have been abiding trends in our transition to information or knowledge economies.
Bray [2] places the fault for wasted initiatives within the culture of IT. I believe there is some truth to this — variably, of course, depending on the specific enterprise. But the real culprit, I believe, has been the past need to “intermediate” a layer of software and IT expertise between knowledge workers and their source information. A progression of tasks has been necessary — conducted over decades with advances and learning — to get paper information into electronic form, get those forms to be understood and operate in some common ways, and then to develop tools, architectures and frameworks to make sense of it. Yet, as more tasks with required specialized skills have been added to this layer, the actual gulf between worker and information has increased. For example, enterprises still require the overhead and layers of IT to write SQL to get information out and then to prepare and fix reports.
On average, IT now consumes about 4% of all enterprise expenditures and employs about 6% of enterprise workers [6]. IT has become a very thick intermediary layer, indeed! Yet, because of the advances and learning that has occurred in growing and nurturing this layer, we also now have the basis to begin to “disintermediate” the IT layer. Many, if not all, of the challenges noted in the table above can be improved by doing so.
One current buzzword in business intelligence is “self service”. By this term is meant giving knowledge workers the tools and systems for creating reports or doing analysis on their own without needing to work through (or be frustrated by) the IT layer. Self-service software was first postulated in the 1990s as a way for information consumers and authors (typically subject-matter experts) to automate some of their knowledge management tasks. Today, it is most commonly applied to self-service reporting or self-service analytics within the BI realm.
As a general proposition, self-service BI has been more myth than reality [7]. Forrester surveys, for example, indicate that IT still develops most BI applications. Of survey respondents in 2009, 70% responded that IT develops the enterprise’s reports and dashboards [8]. However, that figure is not 100%, as it was just a decade earlier, and there is also notable success to some open source providers such as BIRT that address a wide range of reporting needs within a typical application, ranging from operational or enterprise reporting to multi-dimensional online analytical processing (OLAP).
James Kobelius [8] is particularly bullish on the application of Web 2.0 “mashup” applications to knowledge worker purposes. Under this approach, Web-based applications are used and accessed directly by knowledge workers for charting and mapping purposes using Ajax or Flash widgets, such as Google Maps. The conventional BI and KM vendors have begun to more more aggressively into this area. Some notable new entrants — such as Tableau, Factual or Good Data — are also showing the way to more direct access, more flexible reporting and analysis widgets, and cleaner service or platform designs.
These initiatives reside at the display or reporting level. There is another group, including James Kobelius, Neil Raden or Seth Earley, that have addressed how to get disparate information to talk together using ontologies. They refer to “semanticizing” such traditional practices such as master data management (MDM), “ontologizing” taxonomies, or adding Web 2.0 mashups to business intelligence. While these thoughts are moving in the right direction, and will bring incremental benefits, they still are far short of the potentials at hand.
So far, in the KM realm, the application of semantics has tended to be limited to information extraction (tagging) of text documents and first attempts at using ontologies. The tagging component is essential to enable the 80% of information presently in textual documents to become first-class citizens within business intelligence or knowledge management. The ontology efforts to date appear to be more like thin veneers over traditional taxonomies. Rather than hierarchical structures, we now see graph-oriented ones, but still intended to fulfill the same tasks of enterprise metadata and vocabulary lookups.
The ontology efforts especially are just nibbling around the edges of what can be done with semantic technologies. Rather than looking upon ontologies as just another dictionary (though that role is true), if we re-orient our thinking to make ontologies central to the KM function, a wealth of new opportunities and benefits arises.
A bit more than a year ago, we formulated the Seven Pillars of the Open Semantic Enterprise, which included ontologies and related as some of the central components. In that article [9], we noted the particular applicability of semantic technologies to the information and knowledge management functions within enterprises. We asserted the benefits for embracing the open semantic enterprise as providing the organization greater insights with lower risk, lower cost, faster deployment, and more agile responsiveness. Since that time we have been deploying such systems and documenting those benefits.
Integral to the seven pillars are those aspects that lead to the democratization of information for the knowledge worker, what combined might be called “self-service information management”. As the figure to the right shows, three of the seven pillars are essential building blocks to this capability, two pillars are further foundations to it, with the remaining two pillars only tangentially important.
What the combination of these pieces means is a fundamental change in how knowledge work is done. Through this approach, we can largely disintermediate IT from the knowledge function, can bring knowledge management directly into the hands of those who need it in real time, and fundamentally alter how knowledge management apps are designed and deployed. The best thing is these benefits are an incremental evolution, and retain the use and value of existing information assets.
Rather than peripheral lookup structures or thin veneers, ontologies play the central role in the design of self-service information management. We use the plural on purpose here: what is deployed is actually a library of complementary and modular ontologies that play a variety of roles. Combined, we call these libraries with their representative functions adaptive ontologies.
This library contains the expected and conventional domain ontologies. These represent the actual knowledge space for the domain at hand, and may be comprised of multiple different ontologies representing different domain or knowledge spaces. These standard semantic Web ontologies may range from the small and simple to the large and complex, and may perform the roles of defining relationships among concepts, integrating instance data, orienting to other knowledge and domains, or mapping to other schema.
From a best practices standpoint [10], we take special care in constructing these domain ontologies such that we provide labels and cues for user interfaces. Some of the user interface considerations that can be driven by adaptive ontologies include: attribute labels and tooltips; navigation and browsing structures and trees; menu structures; auto-completion of entered data; contextual dropdown list choices; spell checkers; online help systems; etc. We also include a variety of synonyms and aliases (the combination of which we call semsets) for referring to concepts and instances in multiple ways and for aiding information extraction and tagging functions. (In addition to organizing and helping to interoperate contributing information, these domain ontologies are also used for what is called ontology-based information extraction (OBIE) via our scones [11] system.)
In addition the library of adaptive ontologies includes some administrative ontologies that guide how instance data can be imported and inter-related (via the Instance Record Object Notation, or irON); what information types drive what widgets (via the Semantic Component Ontology, or SCO); data mapping vocabularies (UMBEL Vocabulary); how to characterize datasets; and other potential specialty functionality.
A forthcoming article will describe the composition and modularity typically found in a library of these adaptive ontologies.
In combination, these adaptive ontologies are, in effect, the “brains” of the self-service system. The best aspect of these ontologies is that they can be understood, created and maintained by knowledge workers. They constitute the only specification (other than theming, if desired) necessary to create self-service knowledge management environments.
The piece of the puzzle that implements the instruction sets within these adaptive ontologies are the ontology-driven apps, or ODapps. A recent article describes these structures in some detail [12].
ODapps are modular, generic software applications designed to operate in accordance with the specifications contained in the adaptive ontologies. ODapps fulfill specific generic tasks, consistent with their dedicated design to respond to adaptive ontologies. For example, current ontology-driven apps include imports and exports in various formats, dataset creation and management, data record creation and management, reporting, browsing, searching, data visualization and manipulation (through libraries of what we call semantic components), user access rights and permissions, and similar. These applications provide their specific functionality in response to the specifications in the ontologies fed to them.
ODapps are designed more similarly to widgets or API-based frameworks than to the dedicated software of the past, though the dedicated functionality (e.g., graphing, reporting, etc.) is obviously quite similar. The major change in these ontology-driven apps is to accommodate a relatively common abstraction layer that responds to the structure and conventions of the guiding ontologies. The major advantage is that single generic applications can supply shared functionality based on any properly constructed adaptive ontology.
Generic functionality included in these ODapps are things like filtering, setting value ranges, choosing the specific display view, and invoking or not various display templates (akin to the infoboxes on Wikipedia). By nature of the data and the ontologies submitted to them, the ODapp signals to the user or consumer what displays, views, filters or slices-and-dices might be available to them. Fed different data and different ontologies, the ODapp would signal the user differently.
Because of their generic design, driven by the ontologies, only a relatively small number of ODapps needs to be created. Once created with appropriate generic functionality, application development is essentially over. It is through the additions and changes to the adaptive ontologies — done by knowledge workers themselves — that new capability and structure gets exposed through these ontology-driven apps. This innovation shifts the locus from software and programming to data and knowledge structures.
This democratization of IT means that everything in the knowledge management realm can become self service. Users and consumers can create their own analyses; develop their own reports; and package and disseminate what they and their colleagues need, when they need it. Through ontology-driven apps and adaptive ontologies, we turn prior software engineering practice on its head.
Integral to this design is the embrace of the open world assumption [13]. Though not a specific artifact, as are adaptive ontologies or ODapps, the open-world approach is the logical underpinning that allows consumers or knowledge workers to add new information to the system as it is discovered or scoped. This nuance may sound esoteric, but traditional KM systems have a very different underpinning that leads to some nasty implications.
Because the predominant share of KM systems are based on relational database systems, they embody a closed-world design. This works well for transaction systems or environments where the information domain is known and bounded, but does not apply to knowledge and changing information. Moreover, the schema that govern closed-world designs are brittle and hard to change and manage. It is this fact that has put KM squarely in the bailiwick of IT and has often led to delays and frustrations. Re-architecting or adding new schema views to an existing closed-world system can be fiendishly difficult.
This difficulty is a major reason why IT resists casual or constant changes to underlying data schema. Unfortunately, this makes these brittle schema difficult to extend and therefore generally unresponsive to changing and growing knowledge. As an environment for knowledge management, the relational data system and the closed-world approach are lousy foundations.
As the self-service information management diagram above shows, RDF and Web services are two further important foundations. RDF (Resource Description Framework) is the canonical data model upon which all input information is represented. This means that the ODapp tools and the adaptive ontologies can work off a single model of knowledge representation. The Web service and architecture component is also helpful in that it allows Web 2.0 technologies to be brought to bear and allows distributed sources and users for the KM system. This provides scalability and distributed applicability, including on smartphones.
The other two pillars of the open semantic enterprise — the layered approach and linked data — are also helpful, but not necessarily integral to the KM and self-service perspectives presented herein.
The benefits and flexibilities from self-service information management extend from top to bottom; from creating data and content to publishing and deploying it. Here is a listing of available potentials for self-service, drawing comparison to the current conventional approach dependent on IT:
| Information Activity | Conventional Approach (IT) | Self-service Information Management |
| Creating |
|
|
| Annotating |
|
|
| Analyzing |
|
|
| Reporting |
|
|
| Visualizing |
|
|
| Collaborating |
|
|
| Validating |
|
|
| Publishing |
|
|
| Re-purposing |
|
|
| New Functionality |
|
|
| Developing Apps |
|
|
| Dashboarding |
|
|
The fact that any source — internal or external — or format — unstructured, semi-structured and structured — can be brought together with semantic technologies is a qualitative boost over existing KM approaches. Further, all information is exposed in simple text formats, which means it can be readily manipulated and managed with easy to understand tools and applications. Reliance on open standards and languages by semantic technologies also leads to greater use and availability of open source systems.
In short, self-service information management approaches should be cheaper, faster, more responsive and more capable than current approaches.
Given these perspectives, hearing someone tout data-driven applications or advocate ontologies merely for metadata matching sounds positively Neanderthal. The prospects we have with semantic technologies, ontology-driven apps, and self-service information management systems mean so much more. The prospect at hand is to remake the entire knowledge management function, in the process bringing all aspects from creating and distributing knowledge products into the direct hands of the user. This is truly the democratization of information!
The absolutely fantastic news is none of this is theoretical or in the future. All pieces are presently proven, working and in hand. This is a practical vision, ready today.
Granted, like any new innovation, especially one that is infrastructural and systems-oriented, there are some weak or less-developed parts. These current gaps and needs include:
Yet, in the grand scheme of things, these gaps are relatively insignificant. The path and general architecture and design for moving forward are now clear.
Self-service information management via appropriately designed semantic technologies is now a reality. It promises to fulfill a vision of information access and control that has been frustrated for decades. We think these are exciting developments for the enterprise — and for the individual knowledge hound. We welcome your inquiries and invite you to join our open OSF group to contribute your ideas.