Evolution
AI³
Adaptive Information
Adaptive Innovation
Adaptive Infrastructure
a·dap·tive adj. Showing or having a capacity to make fit for new or special situations; flexible; a successful adjustment.

Blogasbörd (cloud version):
Send Email   Get SIOC Profile   Get FOAF Profile   Syndicate full contents for this site using RSS 20
Main Links
Categories
Calendar
February 2012
S M T W T F S
« Jan    
 1234
567891011
12131415161718
19202122232425
26272829  
Archives
More . . .  
Credits
Blog software courtesy of WordPress

View Mike's profile on LinkedIn
Search
Date:   December 12, 2011

State of SemWeb Tools - 2011Number of Semantic Web Tools Passes 1000 for First Time; Many Other Changes

We have been maintaining Sweet Tools, AI3‘s listing of semantic Web and -related tools, for a bit over five years now. Though we had switched to a structWSF-based framework that allows us to update it on a more regular, incremental schedule [1], like all databases, the listing needs to be reviewed and cleaned up on a periodic basis. We have just completed the most recent cleaning and update. We are also now committing to do so on an annual basis.

Thus, this is the inaugural ‘State of Tooling for Semantic Technologies‘ report, and, boy, is it a humdinger. There have been more changes — and more important changes — in this past year than in all four previous years combined. I think it fair to say that semantic technology tooling is now reaching a mature state, the trends of which likely point to future changes as well.

In this past year more tools have been added, more tools have been dropped (or abandoned), and more tools have taken on a professional, sophisticated nature. Further, for the first time, the number of semantic technology and -related tools has passed 1000. This is remarkable, given that more tools have been abandoned or retired than ever before.

Click here to browse the Sweet Tools listing. There is also a simple listing of URL links and categories only.

We first present our key findings and then overall statistics. We conclude with a discussion of observed trends and implications for the near term.

Key Findings

Some of the key findings from the 2011 State of Tooling for Semantic Technologies are:

  • As of the date of this article, there are 1010 tools in the Sweet Tools listing, the first it has passed 1000 total tools
  • A total of 158 new tools have been added to the listing in the last six months, an increase of 17%
  • 75 tools have been abandoned or retired, the most removed at any period over the past five years
  • A further 6%, or 55 tools, have been updated since the last listing
  • Though open source has always been an important component of the listing, it now constitutes more than 80% of all listings; with dual licenses, open source availability is about 83%. Online systems contribute another 9%
  • Key application areas for growth have been in SPARQL, ontology-related areas and linked data
  • Java continues to dominate as the most important language.

Many of these points are elaborated below.

The Statistical Picture

The updated Sweet Tools listing now includes nearly 50 different tools categories. The most prevalent categories, each with over 6% of the total, are information extraction, general RDF tools, ontology tools, browser tools (RDF, OWL), and parsers or converters. The relative share by category is shown in this diagram (click to expand):

Sweet Tools Applications

Since the last listing, the fastest growing categories have been SPARQL, linked data, knowledge bases and all things related to ontologies. The relative changes by tools category are shown in this figure:

Last Year Sweet Tools Applications

Though it is true that some of this growth is the result of discovery, based on our own tool needs and investigations, we have also been monitoring this space for some time and serendipity is not a compelling explanation alone. Rather, I think that we are seeing both an increase in practical tools (such as for querying), plus the trends of linked data growth matched with greater sophistication in areas such as ontologies and the OWL language.

The languages these tools are written in have also been pretty constant over the past couple of years, with Java remaining dominant. Java has represented half of all tools in this space, which continues with the most recent tools as well (see below). More than a dozen programming or scripting languages have at least some share of the semantic tooling space (click to expand):

Sweet Tools Languages

With only 160 new tools it is hard to draw firm trends, but it does appear that some languages (Haskell, XSLT) have fallen out of favor, while popularity has grown for Flash/Flex (from a small base), Python and Prolog (with the growth of logic tools):

Last Year Change in Sweet Tools Languages

PHP will likely continue to see some emphasis because of relations to many content management systems (WordPress, Drupal, etc.), though both Python and Ruby seem to be taking some market share in that area.

New Tools

The newest tools added to the listing show somewhat similar trends. Again, Java is the dominant language, but with much increased use of JavaScript and Python and Prolog:

Sweet Tools Languages

The higher incidence of Prolog is likely due to the parallel increase in reasoners and inference engines associated with ontology (OWL) tools.

The increase in comprehensive tool suites and use of Eclipse as a development environment would appear to secure Java’s dominance for some time to come.

Trends and Observations

These dry statistics tend to mask the feel one gets when looking at most of the individual tools across the board. Older academic and government-funded project tools are finally getting cleaned out and abandoned. Those tools that remain have tended to get some version upgrades and improved Web sites to accompany them.

The general feel one gets with regard to semantic technology tooling at the close of 2011 has these noticeable trends:

  • A three-tiered environment – the tools seem to segregate into: 1) a bottom tier of tools (largely) developed by individuals or small groups, now most often found on Google Code or Github; 2) a middle-tier of (largely) government-funded projects, sometimes with multiple developers, often older, but with no apparent driving force for ongoing improvements or commercialization; and 3) a top-tier of more professional and (often) commercially-oriented tools. The latter category is the most noticeable with respect to growth and impact
  • Professionalism – the tools in the apparent top tiers feel to have more professionalism and better (and more attractive) packaging. This professionalism is especially true for the frameworks and composite applications. But, it also applies to many of the EU-funded projects from Europe, which has always been a huge source of new tool developments
  • More complete toolsets – similarly, the upper levels of tools are oriented to pragmatic problems and problem-solving, which often means they embody multiple functions and more complete tooling environments. This category actually appears to be the most visible one exhibiting growth
  • Changing nature of academic releases – yet, even the academic releases seem to be increasing in professionalism and completeness. Though in the lowest tier it is still possible to see cursory or experimental tool releases, newer academic releases (often) seem to be more strategically oriented and parts of broader programmatic emphases. Programs like AKSW from the University of Leipzig or the Freie Universität Berlin or Finland’s Semantic Computing Research Group (SeCo), among many others, tend to be exemplars of this trend
  • Rise of commercial interests and enterprise adoption – the growing maturity of semantic technologies is also drawing commercial interest, and the incubation of new start-ups by academic and research institutions acts to reinforce the above trends. Promising projects and tools are now much more likely to be spun off as potential ventures, with accompanying better packaging, documentation and business models
  • Multiple languages and applications – with this growing complexity and sophistication has also come more complicated apps, combining multiple languages and functions. In fact, for some time the Sweet Tools listing has been justifiably criticized by some as overly “simplifying” the space by classifying tools under (largely) single applications or single languages. By the 2012 survey, it will likely be necessary to better classify the tools using multiple assignments
  • Google code over SourceForge for open source (and an increase in Github, as well) – virtually all projects on SourceForge now feel abandoned or less active. The largest source of open source projects in the semantic technology space is now clearly Google Code. Though of a smaller footprint today, we are also seeing many of the newer open source projects also gravitate to Github. Open source hosting environments are clearly in flux.

I have said this before, and been wrong about it before, but it is hard to see the tooling growth curve continue at its current slope into the future. I think we will see many individual tools spring up on the open source hosting sites like Google and Github, perhaps at relatively the same steady release rate. But, old projects I think will increasingly be abandoned and older projects will not tend to remain available for as long a time. While a relatively few established open source standards, like Solr and Jena, will be the exception, I think we will see shorter shelf lives for most open source tools moving forward. This will lead to a younger tools base than was the case five or more years ago.

I also think we will continue to see the dominance of open source. Proprietary software has increasingly been challenged in the enterprise space. And, especially in semantic technologies, we tend to see many open source tools that are as capable as proprietary ones, and generally more dynamic as well. The emphasis on open data in this environment also tends to favor open source.

Yet, despite the professionalism, sophistication and complexity trends, I do not yet see massive consolidation in the semantic technology space. While we are seeing a rapid maturation of tooling, I don’t think we have yet seen a similar maturation in revenue and business models. While notable semantic technology start-ups like Powerset and Siri have been acquired and are clear successes, these wins still remain much in the minority.


[1] Please use the comments section of this post for suggesting new or overlooked tools. We will incrementally add them to the Sweet Tools listing. Also, please see the About tab of the Sweet Tools results listing for prior releases and statistics.

Posted by AI3's author, Mike Bergman

Posted on December 12, 2011 at 8:29 am in Open Source, Semantic Web Tools, Structured Web | Comments (5)
The URI link reference to this post is: http://www.mkbergman.com/991/the-state-of-tooling-for-semantic-technologies/
The URI to trackback this post is: http://www.mkbergman.com/991/the-state-of-tooling-for-semantic-technologies/trackback/
Date:   September 26, 2011

OWL - Web Ontology LanguageDocumenting the Emerging Ecosystem Around OWL 2

We have been touting the importance of OWL 2 as the language of choice for federating and reasoning over RDF and ontologies. An absolutely essential enabler of the OWL 2 language is version 3 of the OWL API (actually, version 3.2.4 at the time of this writing), a Java-based framework for accessing and managing the language. Protégé 4, the most popular open source ontology editor and integrated development environment (IDE), for example, is built around the OWL API.

As we laid out a bit more than a year ago, now codified on our TechWiki as the Normative Landscape of Ontology Tools (especially the second figure), we see the OWL API as the essential pivot point for all forms of ontology tools moving forward.

We have attempted to assemble a definitive and comprehensive list of all known tools presently based around version 3 of the OWL API. (We have surely missed some and welcome comments to this post that identify missing ones; we promise to add them and keep tracking them.) Herein is a listing of the 30 or so known OWL API-based tools:

  • Protégé 4 is a free, open source ontology editor and knowledge-base framework based on OWL 2 and centered on the OWL API
  • CEL, FaCT++, HermiT, Pellet, and Racer Pro reasoners provide OWL API wrappers and are also available as reasoner plugins to Protégé 4
  • There is also a FaCT++ port to Java that is also implementing the OWLReasoner and is available as a plugin for Protégé 4.1; it is at version 0.9 with user feedback welcomed
  • structOntology is an open source ontology editor and manager supporting Structured DynamicsconStruct implementation of the Open Semantic Framework (OSF) in Drupal; more information is provided here
  • TrOWL is a Tractable reasoning infrastructure for OWL 2. TrOWL supports both standard TBox and ABox reasoning, as well as conjunctive query answering
  • SKOSEd is a SKOS editor for Protege; just recently made compatible with Protégé 4.1
Please let us know of any missing OWL API tools that should be added to this list by submitting a comment to this post. We will keep this listing current.
  • Populus is a semantic spreadsheet framework using RightField and OPPL for creating OWL ontologies
  • Bubastis is a tool for detecting asserted logical differences between two ontologies, such as between versions. A stand alone version of the tool is also available for download from the EFO tools page. Bubastis is powered by the OWL API
  • Tab2OWL and its download is a Java tool for importing classes into an already existing OWL file. The script uses the OWL API to read in a tab delimited file of class details and create OWL classes from these rows, adding them to an existing ontology
  • S-Match is a semantic matching framework, which provides several semantic matching algorithms and facilities for developing new ones. Currently S-Match contains implementations of the original S-Match semantic matching algorithm, as well as minimal semantic matching algorithm and structure preserving semantic matching algorithm
  • The Alignment API is an API and implementation for expressing and sharing ontology alignments. It uses an RDF format for expressing alignments in a uniform way. Its four main interfaces (Alignment, Cell, Relation and Evaluator) provides these services: storing, finding, and sharing alignments; piping alignment algorithms (improving an existing alignment); manipulating (thresholding and hardening); generating processing output; and comparing alignments
  • The OWLlink API is a Java interface and implementation of the OWLlink protocol on top of the Java-based OWL API. The OWLlink API enables OWL API-based applications to access remote reasoners (so-called OWLlink servers), and it turns any OWL API aware reasoner into an OWLlink server
  • OPPL2 (ontology pre-processing language) is an abstract formalism that allows for manipulating ontologies written in OWL. It is 100% based on the Manchester OWL Syntax; a query language based on OWL (logical) axioms and variables; a scripting language that allows the addition/removal of OWL (logical) axioms. It is available as an Protégé 4.1 plug-in
  • OPPL Patterns It is available as an Protégé 4.1 plug-in
  • Posh (Prolog OWL Shell) is a command line utility that wraps the Thea OWL library to allow for advanced querying and processing of ontologies, combining the power of Prolog and OWL reasoning
  • POPL (Prolog Ontology Processing Language) allows you to write expressive ontology rewrite rules in a high-level declarative fashion using a syntax similar to Manchester syntax
  • OWLTools (aka OWL2LS – OWL2 Life Sciences) is a convenience Java API on top of the OWL API. Code is available here
  • LexOWL is a plug-in for Protégé 4. In order to add more powerful functionality (e.g., inferencing, editing) to the existing infrastructure and align LexGrid more closely with various Semantic Web technologies, the LexOWL plugin for Protégé 4 provides a way for representing the ontologies modeled within the LexGrid environment in OWL. A source for downloading this tool has not been found
  • Apero, a Protégé plug-in that is an ontology debugging tool based on the use of anti-patterns; see http://www.emcl-study.eu/fileadmin/master_theses/thesis_tahwil.pdf
  • DReW is a prototype DL reasoner over LDL+ ontologies and a prototype reasoner for dl-programs over LDL+ ontologies under well-founded semantics. It is not well developed or documented; it can be downloaded here
  • The LingInfo, LexOnto, LexInfo and LMF ontologies are available from the project website, as well as a corresponding Java API with an implementation for the commonly used OWL API
  • Thea2 is a Prolog library that provides complete support for querying and processing OWL 2 ontologies directly from within Prolog programs. Thea2 also offers additional capabilities including a bridge to the Java OWL API and translation of ontologies to Description Logic programs
  • GLOW is a visualization for OWL ontologies, based on Hierarchical Edge Bundles. Hierarchical Edge Bundles is a new visually attractive technique for displaying adjacency relations in hierarchical data, such as concept structures formed by `subclass-of’ and `type-of’ relations. The displayed adjacency relations can be selected from an ontology using a set of common configurations, allowing for intuitive discovery of information. It is a visualization library based on OWL API, as well as a plug-in for Protégé
  • ROWLKit is a simple GUI to reason and query over ontologies written in the OWL 2 QL profile of OWL
  • OBDA Plugin (Ontology-based data access) is an add-on for the Protégé ontology editor aimed at transforming Protégé into a fully fledged OBDA model editor. It provides data source and mapping editors, as well as querying facilities that, in conjunction with an OBDA-enabled reasoner, allows you to design and test every aspect of an OBDA system
  • OntoCAT provides high level abstraction for interacting with ontology resources including local ontology files in standard OWL and OBO formats (via OWL API)
  • SemaRule Navigator is an Eclipse-based toolkit of multiple semWeb tools, built around the OWL API, organized into a pipeline-like system (appears quite complicated)
  • OWLDB (alias Mnemosyne) is a storage system based on object-relational mappings utilising the OWL-API for the W3C Web Ontology Language OWL
  • Finally, for a periodically updated list of “official” extensions, see https://owlapi.svn.sourceforge.net/svnroot/owlapi/v3/branches/owlextensions/.

Addendum

Ignazio Palmisano also graciously suggested these additional sources:

which also further leads to this additional listing:

It is not clear if all of these offer OWL 2 support, let along work with the current OWL API.

Posted by AI3's author, Mike Bergman

Posted on September 26, 2011 at 2:52 am in Ontologies, Semantic Web Tools | Comments (3)
The URI link reference to this post is: http://www.mkbergman.com/977/thirty-owl-api-tools/
The URI to trackback this post is: http://www.mkbergman.com/977/thirty-owl-api-tools/trackback/
Date:   August 8, 2011

Geshi NetworkVisualization + Analysis Pushes Aside Cytoscape

Though I never intended it, some posts of mine from a few years back dealing with 26 tools for large-scale graph visualization have been some of the most popular on this site. Indeed, my recommendation for Cytoscape for viewing large-scale graphs ranks within the top 5 posts all time on this site.

When that analysis was done in January 2008 my company was in the midst of needing to process the large UMBEL vocabulary, which now consists of 28,000 concepts. Like anything else, need drives research and demand, and after reviewing many graphing programs, we chose Cytoscape, then provided some ongoing guidelines in its use for semantic Web purposes. We have continued to use it productively in the intervening years.

Like for any tool, one reviews and picks the best at the time of need. Most recently, however, with growing customer usage of large ontologies and the development of our own structOntology editing and managing framework, we have begun to butt up against the limitations of large-scale graph and network analysis. With this post, we announce our new favorite tool for semantic Web network and graph analysis — Gephi — and explain its use and showcase a current example.

The Cytoscape Baseline and Limitations

Three and one-half years ago when I first wrote about Cytoscape, it was at version 2.5. Today, it is at version 2.8, and many aspects have seen improvement (including its Web site). However, in other respects, development has slowed. For example, version 3.x was first discussed more than three years ago; it is still not available today.

Though the system is open source, Cytoscape has also largely been developed with external grant funds. Like other similarly funded projects, once and when grant funds slow, development slows as well. While there has clearly been an active community behind Cytoscape, it is beginning to feel tired and a bit long in the tooth. From a semantic Web standpoint, some of the limitations of the current Cytoscape include:

  • Difficult conversion of existing ontologies — Cytoscape requires creating a CSV input; there was an earlier RDFscape plug-in that held great promise to bridge the software into the RDF and semantic Web sphere, but it has not remained active
  • Network analysis — one of the early and valuable generalized network analysis plug-ins was NetworkAnalyzer; however, that component has not seen active development in three years, and dynamic new generalized modules suitable for social network analysis (SNA) and small-world networks have not been apparent
  • Slow performance and too-frequent crashes — Cytoscape has always had a quirky interface and frequent crashes; later versions are a bit more stable, but usability remains a challenge
  • Largely supported by the biomedical community — from the beginning, Cytoscape was a project of the biomedical community. Most plug-ins still pertain to that space. Because of support for OBO (Open Biomedical and Biological Ontologies) formats and a lack of uptake by the broader semantic Web community, RDF- and OWL-based development has been keenly lacking
  • Aside from PDFs, poor ability to output large graphs in a viewable manner
  • Limited layout support — and poor performance for many of those included with the standard package.

Undoubtedly, were we doing semantic technologies in the biomedical space, we might well develop our own plug-ins and contribute to the Cytoscape project to help overcome some of these limitations. But, because I am a tools geek (see my Sweet Tools listing with nearly 1000 semantic Web and -related tools), I decided to check out the current state of large-scale visualization tools and see if any had made progress on some of our outstanding objectives.

Choosing Geshi and Using It

There are three classes of graph tools in the semantic technology space:

  1. Ontology navigation and discovery, to which the Relation Browser and RelFinder are notable examples
  2. Ontology structure visualization (and sometimes editing), such as the GraphViz (OWLViz) or OntoGraf tools used in Protégé (or the nice FlexViz, again used by the OBO community), and
  3. Large-scale graph visualization in order to gain a complete picture and macro relationships in the ontology.

One could argue that the first two categories have received the most current development attention. But, I would also argue that the third class is one of the most critical:  to understand where one is in a large knowledge space, much better larger-scale visualization and navigation tools are needed. Unfortunately, this third category is also the one that appears to be receiving the least development attention. (To be sure, large-scale graphs pose computational and performance challenges.)

In the nearly four years since my last major survey of 26 tools in this category, the new entrants appear quite limited. I’ve surely overlooked some, but the most notable are Gruff, NAViGaTOR, NetworkX and Gephi [1]. Gruff actually appears to belong most in Category #2; I could find no examples of graphs on the scale of thousands of nodes. NAViGaTOR is biomedical only. NetworkX has no direct semantic graph importing and — while apparently some RDF libraries can be used for manipulating imports — alternative workflows were too complex for me to tackle for initial evaluation. This leaves Gephi as the only potential new candidate.

From a clean Web site to well-designed intro tutorials, first impressions of Gephi are strongly positive. The real proof, of course, was getting it to perform against my real use case tests. For that, I used a “big” ontology for a current client that captures about 3000 different concepts and their relationships and more than 100 properties. What I recount here — from first installing the program and plug-ins and then setting up, analyzing, defining display parameters, and then publishing the results — took me less than a day from a totally cold start. The Gephi program and environment is surprisingly easy to learn, aided by some great tutorials and online info (see concluding section).

The critical enabler for being able to use Gephi for this source and for my purposes is the SemanticWebImport plug-in, recently developed by Fabien Gandon and his team at Inria as part of the Edelweiss project [2]. Once the plug-in is installed, you need only open up the SemanticWebImport tab, give it the URL of your source ontology, and pick the Start button (middle panel):

SemWeb Plug-in for GephiNote the SemanticWebImport tool also has the ability (middle panel) to issue queries to a SPARQL endpoint, the results of which return a results graph (partial) from the source ontology. (This feature is not further discussed herein.) This ontology load and display capability worked without error for the five or six OWL 2 ontologies I initially tested against the system. 

Once loaded, an ontology (graph) can be manipulated with a conventional IDE-like interface of tabs and views. In the right-hand panels above we are selecting various network analysis routines to run, in this case Average Degrees. Once one or more of these analysis options is run, we can use the results to then cluster or visualize the graph; the upper left panel shows highlighting the Modularity Class, which is how I did the community (clustering) analysis of our big test ontology. (When run you can also assign different colors to the cluster families.) I also did some filtering of extraneous nodes and properties at this stage and also instructed the system via the ranking analysis to show nodes with more link connections as larger than those nodes with fewer links.

At this juncture, you can also set the scale for varying such display options as linear or some power function. You can also select different graph layout options (lower left panel). There are many layout plug-in options for Gephi. The layout plugin called OpenOrd, for instance, is reported to be able to scale to millions of nodes.

At this point I played extensively with the combination of filters, analysis, clusters, partitions and rankings (as may be separately applied to nodes and edges) to: 1) begin to understand the gross structure and characteristics of the big graph; and 2) refine the ultimate look I wanted my published graph to have.

In our example, I ultimately chose the standard Yifan Hu layout in order to get the communities (clusters) to aggregate close to one another on the graph. I then applied the Parallel Force Atlas layout to organize the nodes and make the spacings more uniform. The parallel aspect of this force-based layout allows these intense calculations to run faster. The result of these two layouts in sequence is then what was used for the results displays.

Upon completion of this analysis, I was ready to publish the graph. One of the best aspects of Gephi is its flexibility and control over outputs. Via the main Preview tab, I was able to do my final configurations for the published graph:

Publication Options for GephiThe graph results from the earlier-worked out filters and clusters and colors are shown in the right-hand Preview pane. On the left-hand side, many aspects of the final display are set, such as labels on or off, font sizes, colors, etc. It is worth looking at the figure above in full size to see some of the options available. 

Standard output options include either SVG (vector image) or PDFs, as shown at the lower left, with output size scaling via slider bar. Also, it is possible to do standard saves under a variety of file formats or to do targeted exports.

One really excellent publication option is to create a dynamically zoomable display using the Seadragon technology via a separate Seadragon Web Export plug-in. (However, because of cross-site scripting limitations due to security concerns, I only use that option for specific sites. See next section for the Zoom It option — based on Seadragon — to workaround that limitation.)

Outputs Speak for Themselves

I am very pleased with the advances in display and analysis provided by Gephi. Using the Zoom It alternative [3] to embedded Seadragon, we can see our big ontology example with:

  • All 3000 nodes labeled, with connections shown (though you must must zoom to see) and
  • When zooming (use scroll wheel or + icon) or panning (via mouse down moves), wait a couple of seconds to get the clearest image refresh:

Note: at standard resolution, if this graph were to be rendered in actual size, it would be larger than 7 feet by 7 feet square at full zoom !!!

To compare output options, you may also;

Still, Some Improvements Would be Welcomed

It is notable that Gephi still only versions itself as an “alpha”. There is already a robust user community with promise for much more technology to come.

As an alpha, Gephi is remarkably stable and well-developed. Though clearly useful as is, I measure the state of Gephi against my complete list of desired functionality, with these items still missing:

  • Real-time and interactive navigation — the ability to move through the graph interactively and to issue queries and discover relationships
  • Huge node numbers — perhaps the OpenOrd plug-in somewhat addresses this need. We will be testing Gephi against UMBEL, which is an order of magnitude larger than our test big ontology
  • More node and edge control — Cytoscape still retains the advantage in the degree to which nodes and edges can be graphically styled
  • Full round-tripping — being able to use Gephi in an edit mode would be fantastic; the edit functionality is fairly straightforward, but the ability to round-trip in appropriate formats (OWL, RDF or otherwise) may be the greater sticking point.

Ultimately, of course, as I explained in an earlier presentation on a Normative Landscape for Ontology Tools, we would like to see a full-blown graphical program tie in directly with the OWL API. Some initial attempts toward that have been made with the non-Gephi GLOW visualization approach, but it is still in very early phases with ongoing commitments unknown. Optimally, it would be great to see a Gephi plug-in that ties directly to the OWL API.

In any event, while perhaps Cytoscape development has stalled a bit for semantic technology purposes, Gephi and its SemanticWebImport plug-in have come roaring into the lead. This is a fine toolset that promises usefulness for many years to come.

Some Further Gephi Links

To learn more about Gephi, also see the:

Also, for future developments across the graph visualization spectrum, check out the Wikipedia general visualization tools listing on a periodic basis.


[1] The R open source math and statistics package is very rich with apparently some graph visualization capabilities, such as the dedicated network analysis and visualization project statnet. rrdf may also provide an interesting path for RDF imports. R and its family of tools may indeed be quite promising, but the commitment necessary to R appears quite daunting. Longer-term, R may represent a more powerful upgrade path for our general toolsets. Neo4j is also a rising star in graph databases, with its own visualization components. However, since we did not want to convert our underlying data stores, we also did not test this option.
[2] Erwan Demairy is the lead developer and committer for SemanticWebImport. The first version was released in mid-April 2011.
[3] For presentations like this blog post, the Seadragon JavaScript enforces some security restrictions against cross-site scripting. To overcome that, the option I followed was to:
  • Use Gephi’s SVG export option
  • Open the SVG in Inkscape
  • Expand the size of the diagram as needed (with locked dimensions to prevent distortion)
  • Save As a PNG
  • Go to Zoom It and submit the image file
  • Choose the embed function, and
  • Embed the link provided, which is what is shown above.
(Though Zoom.it also accepts SVG files directly, I found performance to be spotty, with many graphical elements dropped in the final rendering.)

Posted by AI3's author, Mike Bergman

Posted on August 8, 2011 at 3:27 am in Ontologies, Open Source, Semantic Web Tools, UMBEL | Comments (1)
The URI link reference to this post is: http://www.mkbergman.com/968/a-new-best-friend-gephi-for-large-scale-networks/
The URI to trackback this post is: http://www.mkbergman.com/968/a-new-best-friend-gephi-for-large-scale-networks/trackback/
Date:   February 7, 2011

Sweet Tools ListingNow Presented as a Semantic Component; Grows to 900+ Tools

Sweet Tools, AI3‘s listing of semantic Web and -related tools, has just been released with its 17th update. The listing now contains more than 900 tools, about a 10% increase over the last version. Significantly the listing is also now presented via its own semantic tool, the structSearch sComponent, which is one of the growing parts to Structured Dynamics‘ open semantic framework (OSF).

So, we invite you to go ahead and try out this new Flex/Flash version with its improved search and filtering! We’re pretty sure you’ll like it.

Summary of Major Changes Sweet Tools structSearch View

Sweet Tools now lists 907 919 tools, an increase of 72 84 (or 8.6 10.1%) over the prior version of 835 tools. The most notable trend is the continued increase in capabilities and professionalism of (some of) the new tools.

This new release of Sweet Tools — available for direct play and shown in the screenshot to the right — is the first to be presented via Structured Dynamics’ Flex-based semantic component technology. The system has greatly improved search and filtering capabilities; it also shares the superior dataset management and import/export capabilities of its structWSF brethren.

As a result, moving forward, Sweet Tools updates will now be added on a more regular basis, reducing the big burps that past releases have tended to follow. We will also see much expanded functionality over time as other pieces of the structWSF and sComponents stack get integrated and showcased using this dataset.

This release is the first in WordPress, and shows the broad capabilities of the OSF stack to be embedded in a variety of CMS or standalone systems. We have provided some updates on Structured Dynamics’ OSF TechWiki for how to modify, embed and customize these components with various Flex development frameworks (see one, two or three), such as Flash Builder or FlashDevelop.

We should mention that the OSF code group is also seeing external parties exposing these capabilities via JavaScript deployments as well. This recent release expands on the conStruct version with its capabilities described in a post about a year ago.

Retiring the Exhibit Version

However, this release does mark the retirement of the very fine Exhibit version of Sweet Tools (an archive version will be kept available until it gets too long in the tooth). I was one of the first to install a commercial Exhibit system, and the first to do so on WordPress, as I described in an article more than four years ago.

Exhibit has worked great and without a hitch, and through a couple of upgrades. It still has (I think) a superior faceting system and sorting capabiities to what we presently offer with our own sComponent alternative. However, the Exhibit version is really a display technology alone, and offers no search, access control or underlying data management capabilities (such as CRUD), all of which are integral to our current system. It is also not grounded in RDF or semantic technologies, though it does have good structural genes. And, Sweet Tools has about reached the limits of the size of datasets Exhibit can handle efficiently.

Exhibit has set a high bar for usability and lightweight design. As we move in a different direction, I’d like again to publicly thank David Huynh, Exhibit’s developer, and the MIT Simile program for when he was there, for putting forward one of the seminal structured data tools of the past five years.

Updated Statistics

The updated Sweet Tools listing now includes nearly 50 different tools categories. The most prevalent categories are browser tools (RDF, OWL), information extraction, ontology tools, parsers or converters, and general RDF tools. The relative share by category is shown in this diagram (click to expand):

Sweet Tools Applications

Since the last listing, the fastest growing categories have been utilities (general and RDF) and visualization. Linked data listings have also grown by 200%, but are still a relatively small percentage of the total.

These values should be taken with a couple of grains of salt. First, not all of these additions are organic or new releases. Some are the result of our own tools efforts and investigations, which can often surface prior overlooked tools. Also, even with this large number of application categories, many tools defy characterization, and can reside in multiple categories at once or are even pointing to new ones. So, the splits are illustrative, but not defining.

General language percentages have been keeping pretty constant over the past couple of years. Java remains the leading language with nearly half of all applications, a percentage it has kept steady for four years. PHP continues to grow in popularity, and actually increased the largest percentage amount of any language over this past census. The current language splits are shown in the next diagram (click to expand):

Sweet Tools Languages

C/C++ and C# have really not grown at all over the past year. Again, however, for the reasons noted, these trends should be interpreted with care.

Tasty Dogfood?Dogfood Never Tasted So Good

Tools development is hard and the open source nature of today’s development tends to require a certain critical mass of developer interest and commitment. There are some notable tools that have much use and focus and are clearly professional and industrial grade. Yet, unfortunately, too many of the tools on the Sweet Tools listing are either proofs-of-concept, academic demos, or largely abandoned because of lack of interest by the original developer, the community or the market as a whole.

There is a common statement within the community about how important it is for developers to “eat their own dogfood.” On the face of it, this makes some sense since it conveys a commitment to use and test applications as they are developed.

But looked at more closely, this sentiment carries with it a troublesome reflection of the state of (many) tools within the semantic Web: too much kibble that is neither attractive nor tasty. It is probably time to keep the dogfood in the closet and focus on well-cooked and attractive fare.

We at Structured Dynamics are not trying to hold ourselves up as exemplars or the best chefs of tasty food. We do, however, have a commitment to produce fare that is well prepared and professional. Let’s stop with the dogfood and get on with serving nutritious and balanced fare to the marketplace.

Posted by AI3's author, Mike Bergman

Posted on February 7, 2011 at 1:47 am in Open Source, Semantic Web Tools, Structured Web | Comments (1)
The URI link reference to this post is: http://www.mkbergman.com/942/tasty-new-sweet-tools-release/
The URI to trackback this post is: http://www.mkbergman.com/942/tasty-new-sweet-tools-release/trackback/
Date:   September 7, 2010

Shifting the Center of Gravity to the OWL API, Web Services

Previous installments in this series have listed existing ontology tools, overviewed development methodologies, and proposed a new approach to building lightweight, domain ontologies [1]. For the latter to be successful, a new generation in ontology development tools is needed. This post provides an explication of the landscape under which this new generation of tools is occurring.

Ontologies supply the structure for relating information to other information in the semantic Web or the linked data realm. Because of this structural role, ontologies are pivotal to the coherence and interoperability of interconnected data.

We are now concluding the first decade of ontology development tools, especially those geared to the semantic Web and its associated languages of RDFS and OWL. Last year we also saw the release of the major update to the OWL 2 language, with its shift to more expressiveness and a variety of profiles. The upcoming next generation of ontology tools now must also shift.

The current imperative is to shift away from ontology engineering by a priesthood to pragmatic daily use and maintenance by domain practitioners. Market growth demands simpler, task-focused tools with intuitive interfaces. For this change to occur, the general tools architecture needs to shift its center of gravity from IDEs and comprehensive toolkits to APIs and Web services. Not surprisingly, this same shift is what has been occurring across all areas of software.

Methodology Reprise: The Nature of the Landscape

In the previous installment of this series, we presented a new methodological approach to ontology development, geared to lightweight, domain ontologies. One aspect of that design was to separate the operational workflow into two pathways:

  • Instances, and their descriptive characteristics, and
  • Conceptual relationships, or ontologies.

The ontology build methodology concentrated on the upper half of this diagram (blue, with yellow lead-ins and outcomes) with the various steps overviewed in that installment [2]:

Ontology and Instance Build Methodology
Figure 1. Flowchart of Ontology Development Methodology (click to expand)

The methodology captured in this diagram embraces many different emphases from current practice:  re-use of existing structure and information assets; conscious split between instance data (ABox) and the conceptual structure (TBox) [3]; incremental design; coherency and other integrity testing; and explicit feedback for scope extension and growth. The methodology also embraces some complementary utility ontologies that also reflect the design of ontology-driven apps [4].

These are notable changes in emphasis. But they are not the most important one. The most important change is the tools landscape to implement this methodology. This landscape needs to shift to pragmatic daily use and maintenance by domain practitioners. That requires simpler and more task-oriented tools. And that change in tooling needs a still more fundamental shift in tools architecture and design.

A Legacy of Excellent First Generation Tools

In many places throughout this series I use the term “inadequate” to describe the current state of ontology development tools. This characterization is not a criticism of first-generation tools per se. Rather, it is a reflection of their inadequacy to fulfill the realities of the new tooling landscape argued in this series. The fact remains, as initial generation tools, that many of the existing tools are quite remarkable and will play central roles (mostly for the professional ontologist or developer) moving forward.

At the risk of overlooking some important players, let’s trace the (partial) legacy of some of the more pivotal tools in today’s environment.

As early as a decade ago the ontology standards languages were still in flux and the tools basis was similarly immature. Frame logic, description logics, common logic and many others were competing at that time for primacy and visibility. Most ontology tools at that time such as Protégé [5], OntoEdit [6], or OilEd [7] were based on F-logic or the predecessor to OWL, DAML+Oil. But the OWL language was under development by the W3C and in anticipation of its formal release the tools environment was also evolving to meet it. Swoop [8], for example, was one of the first dedicated OWL browsers. A Protégé plug-in for OWL was also developed by Holger Knublauch [9]. In parallel, the OWL group at the University of Manchester also introduced the OWL API [10].

With the formal release of OWL 1.0 in 2004, ontology tools continued to migrate to the language. Protégé, up through the version 3x series, became a popular open source system with many visualization and OWL-related plug-ins. Knublauch joined TopQuadrant and brought his OWL experience to TopBraid Composer, which shifted to the Eclipse IDE platform and leveraged the Jena API [9,11]. In Europe, the NeON (Networked Ontologies) project started in 2006 and by 2008 had an Eclipse-based OWL platform using the OWL API with key language processing capabilities through GATE [12].

Most recently, Protégé and NeON in open source, and TopBraid Composer on the commercial side, have likely had the largest market share of the comprehensive ontology toolkits. So far, with the release of OWL 2 in late 2009, only Protégé in version 4 and the TwoUse Toolkit have yet fully embraced all aspects of the new specification, doing so by intimately linking with the new OWL API (version 3x has full OWL 2 support) [13]. However, most leading reasoners now support OWL 2 and products such as TopBraid Composer and Ontotext’s OWLIM support OWL 2 RL as well [14].

The evolution of Protégé to version 4 (OWL 2) was led by the University of Manchester via its CO-ODE project [15], now ended, which has also been a source for most existing Protégé 4 plug-ins. (Because of the switch to OWL 2 and the OWL API most earlier plug-ins are incompatible with Protégé 4.) Manchester has also been a leading force in the development of OWL 2 and the alternative Manchester syntax.

Though only recently stable because of the formalization of OWL 2, Protégé 4 and its linkage to the new OWL API provides for a very powerful combination. With Protégé, the system has a familiar ontology editing framework and a mechanism for plug-in migration and growth. With the OWL API, there is now a common API for leading reasoners (Pellet, HermiT, FaCT++, RacerPro, etc.), a solid ontology management and annotation framework, and validators for various OWL 2 profiles (RL, EL and QL). The system is widely embraced by the biology community, probably the most active scientific field in ontologies. However, plug-in support lags the diversity of prior versions of Protégé and there does not appear to be the energy and community standing behind it as in prior years.

A Normative Tools Landscape

These leading frameworks and toolkits have opted to be “ontology engineering” environments. Via plug-ins and complicated interfaces (tabs or Eclipse-style panes) the intent has apparently been to provide “all capabilities in one box.” The tools have been IDE-centric.

Unfortunately, one must be a combination of ontologist, developer, programmer and IDE expert in order use the tools effectively. And, as incremental capabilities get added to the systems, these also inherit the same complexity and style of the host environment. It is simply not possible to make complex environments and conventions simple.

Curiously, the existence or use of APIs have also not been adequately leveraged. The usefulness of an API means that subsets of information can be extracted and worked on in very clear and simple ways. This information can then be roundtripped without loss. An API allows a tailored subset abstraction of the underlying data model. In contrast, IDEs, such as Protégé or Eclipse, when they play a similar role, force all interfaces to share their built-in complexity.

With these thoughts in mind, then, we set out to architect a tools suite and work flow that could truly take advantage of a central API. We further wanted to isolate the pieces into distributable Web services in keeping with our standard structWSF Web services framework design.

This approach also allows us to split out simpler, focused tools that domain users and practitioners can use. And, we can do all of this while also enabling the existing professional toolsets and IDEs to also interoperate in the environment.

The resulting tools landscape is shown in the diagram below. This diagram takes the same methodology flow from Figure 1 (blue and yellow boxes) and stretches them out in a more linear fashion. Then, we embed the various tools (brown) and APIs (orange) in relation to that methodology:

Ontology Tools Schematic
Figure 2. The Normative Ontology Tools Landscape (click to expand)

This diagram is worth expanding to full size and studying in some detail. Aspects of this diagram that deserve more discussion are presented in the sections below.

OWL API as Center of Gravity

As noted in the preceding methodology installment, the working ontology is the central object being managed and extended for a given deployment. Because that ontology will evolve and grow over time, it is important the complete ontology specification itself be managed by some form of version control system (green) [16]. This is the one independent tool in the landscape.

Access to and from the working ontology is mediated by the OWL API [13]. The API allows all or portions of the ontology specification to be manipulated separately, with a variety of serializations. Changes made to the ontology can also be tested for validity. Most leading reasoners can interact directly with the API. Protégé 4 also interacts directly with the API, as can various rules engines [17]. Additionally, other existing APIs, notably the Alignment API with its own mapping tools and links to other tools such as S-Match can interact with the OWL API. It is reasonable to expect more APIs to emerge over time that also interoperate [18].

The OWL API is the best current choice because of its native capabilities and because Jena does not yet support OWL 2 [11]. However, because of the basic design with structWSF (see next), it is also possible to swap out with different APIs at a later time should developments warrant.

In short, having the API play the central management role in the system means that any and all tools can be designed to interact effectively with the working ontology(ies) without any loss in information due to roundtripping.

Web Services (structWSF) as Canonical Access Layer

The same rationale that governed our development of structWSF [19] applies here: to abstract basic services and functionality through a platform-independent Web services layer. This Web services layer has canonical (standard) ways to interact with other services and is generally RESTful in design to support distributed deployments. The design conforms to proper separation of view from logic and structure. Moreover, because of the design, changes can be made on either side of the layer in terms of user interface or functionality.

Use of the structWSF layer also means that tools and functionality can be distributed anywhere on the Web. Specialized server-side functions can be supported as well as dedicated specialty hardware. Text indexing or disambiguation services can fit within this design.

The ultimate value of piggybacking on the structWSF framework is that all other extant services also become available. Thus, a wealth of converters, data managers, and semantic components (or display widgets) can be invoked depending on the needs of the specific tool.

Simpler, Task-specific Tools

The objective, of course, of this design is to promote more and simpler tools useful to domain users. Some of these are shown under the Use & Maintain box in the diagram above; others are listed by category in the table below.

The RESTful interface and parameter calls of the structWSF layer further simplify the ontology management and annotation abstractions arising from the OWL API. The number of simple tools available to users under this design is virtually limitless. These tools are also fast to develop and test.

Combining These New Thrusts and Moving Forward

This landscape is not yet a full reality. It is a vision of adaptive and simpler tools, working with a common API, and accessible via platform-independent Web services. It also preserves many of the existing tools and IDEs familiar to present ontology engineers.

However, pieces of this landscape do presently exist and more are on the way. The next section briefly overviews some of the major application areas where these tools might contribute.

Individual Tools within the Landscape

If one inspects the earlier listing of 185 ontology tools it is clear that there is a diversity of tools both in terms of scope and function across the entire ontology development stack. It is also clear that nearly all of those 185 tools listed do not communicate with one another. That is a tremendous waste.

Via shared APIs and some degree of consistent design it should be possible to migrate these capabilities into a more-or-less interoperating whole. We have thus tried to categorize some important tool types and exemplar tools from that listing to show the potential that exists. (Please note that the Example Tools are links to the tools and categories from the earlier 185 tools listing.)

This correlation of types and example tools is not meant to be exhaustive nor a recommendation of specific tools. But, this tabulation is illustrative of the potential that exists to both simplify and extend tool support across the entire ontology development workflow:

Tool Type Comments Example Tools
OWL API OWL API is a Java interface and implementation for the W3C Web Ontology Language (OWL), used to represent Semantic Web ontologies. The API provides links to inferencers, managers, annotators, and validators for the OWL2 profiles of RL, QL, EL OWL API
Web Services Layer This layer provides a common access layer and set of protocols for almost all tools. It depends critically on linkage and communication with the OWL API structWSF
Ontology Editor (IDE) There are a variety of options in this area. Generally, more complete environments (that is, IDEs) based on OWL and with links to the OWL API are preferred. Less complete editor options are listed under other categories. Note that only Protégé 4 incorporates the OWL API NeOn toolkit,
Protégé,
TopBraid Composer
Scripts In all pragmatic cases the migration of existing structure and vocabulary assets to an ontology framework requires some form of scripting. These may be off the shelf resources, but more often are specific to the use case at hand. Typical scripting languages include the standard ones (Perl, Python, PHP, Ruby, XSLT, etc.) and often involve some form of parsing or regex variety; specific to use case
Converters Converters are more-or-less pre-packaged scripts for migrating one serialization or data format to another one. As the scripts above continue to be developed, this roster of off-she-shelf starting points can increase. Today, there are perhaps close to 200 converters useful to ontology purposes irON, ReDeFer, SKOS2GenTax; also see RDFizers
Vocabulary Prompter Domain ontologies are ultimately about meaning, and for that purpose there is much need for definitions, synonyms, hyponyms, and related language assets. Vocabulary prompters take input documents or structures and help identify additional vocabulary useful for characterizing semantic meaning see the TechWiki’s vocab prompting tools; ROC
Spreadsheet Spreadsheets can be important initial development environments for users without explicit ontology engineering backgrounds. The biggest issue with spreadsheets is that what is specified in them is more general or simplistic compared to what is contained in an actual ontology. Attempts to have spreadsheets capture all of this sophistication are often less than satisfactory. One way to effective “round trip” with spreadsheets (and many related simple tools) is to adhere to an OWL API Anzo, RDF123, irON (commON), Excel, Open Office
Editor (general) Ontology editing spans from simple structures useful to non-ontologists to those (like the IDEs or toolkits) that capture all aspects of the ontology. Further, some of these editors are strictly textual or (literally) editors; others span or attempt to enable visual editing. Visual editing (see below) can ultimately extend to the ontology graph itself see the TechWiki’s ontology editing tools
Alignment API The Alignment API is an API and implementation for expressing and sharing ontology alignments. The correspondences between entities (e.g., classes, objects, properties) in ontologies is called an alignment. The API provides a format for expressing alignments in a uniform way. The goal of this format is to be able to share on the web the available alignments. The format is expressed in RDF Alignment API
Mapper A variety of tools, algorithms and techniques are available for matching or mapping concepts between two different ontologies. In general, no single method has shown itself individually superior. The better approaches use voting methods based on multiple comparisons see the TechWiki’s ontology mapping tools
Ontology Browser Ontology browsers enable the navigation or exploration of the ontology — generally in visual form — but without allowing explicit editing of the structure Relation Browser, Ontology Browser, OwlSight, FlexViz
Vocabulary Manager Vocabulary managers provide a central facility for viewing, selecting, accessing and managing all aspects of the vocabulary in an ontology (that is, to the level of all classes and properties). This tool category is poorly represented at present. Ultimately, vocabulary managers should also be one (if not the main) access point to vocabulary editing PoolParty, TermWiki, UMBEL Web service
Vocabulary Editor Vocabulary editors provide (generally simple) interfaces for the editing and updating of vocabulary terms, classes and properties in an ontology Neologism, TemaTres, ThManager, Vocab Editor
Structure Editor A structure editor is a specific form of an ontology editor, geared to the subsumption (taxonomic) organization of a largely hierarchical structure. Editors of this form tend to use tree controls or spreadsheets with indented organization to show parent and child relationships PoolParty, irON (commON)
Graph Analysis Ontologies form graph structures, which are amenable to many specific network and graph analysis algorithms, included relatedness, shortest path, grouped structures, communities and the like SNAP, igraph, Network Workbench, NetworkX, Ontology Metrics
Graph API Graph visualization with associated tools is best enabled by working from a common API. This allows for expansion and re-use of other capabilities. Preferably, this graph API would also have direct interaction with the OWL API, but none exist at the moment under investigation
Graph Visualizer Graph visualizers enable the ontology to be rendered in graph form and presentation, often with multiple layout options. The systems also enable export to PDF or graphics formats for display or printing. The better tools in this category can handle large graphs, can have their displays easily configured, and are performant see the TechWiki’s ontology visualization tools
Visual Editor An ontology visual editor enables the direct manipulation of the graph in a visual mode. This capability includes adding and moving nodes, changing linkages between nodes, and other ontology specification. Very few tools exist in this category at present COE, TwoUse Toolkit
Coherence Tester Testing for coherence involves whether the ontology structure is properly constructed and has logical interconnections. The testing either involves inference and logic testing (including entailments) based on the structure as provided; comparisons with already vetted logical structures and knowledge bases (e.g., Cyc, Wikipedia); or both Cyc, OWLim, FactForge
Gap Tester Related to coherence testing, gap testing is the identification of key missing pieces or intermediary nodes in the ontology graph. This tends to happen when external specification of the ontology is made without reference to connecting information requires use of a reference external ontology; see above
Documenter Ontology documentation is not limited to the technical specifications of the structure, but also includes best practices, how-to and use guides, and the like. Automated generation of structure documentation is also highly desirable TechWiki, SpecGen, OWLDoc
Tagger Once constructed, ontologies (and their accompanying named entity dictionaries) can be very powerful resources for aiding tagging and information extraction utilities. Like vocabulary prompting, there is a broad spectrum of potential tools and uses in the tagging category GATE (OBIE); many other options
Exporter Exports need to range from full-blown OWL representations to the simpler export of data and constructs. Multiple serialization options and the ability to support the input requirements of third-party tools is also important OWL Syntax Converter, OWL Verbalizer; many various options

The beauty of this approach is that most of the tools listed are open source and potentially amenable to the minor modifications necessary to conform with this proposed landscape.

Key Gaps in the Landscape

Contrasting the normative tools landscape above with the existing listing of ontology tools points out some key gaps or areas deserving more development attention. Some of these are:

  • Vocabulary managers — easy inspection and editing environments for concepts and predicates are lacking. Though standard editors allow direct ontology language edits (OWL or RDFS), these are not presently navigable or editable by non-ontologists. Intuitive browsing structures with more “infobox”-like editing environments could be helpful here
  • Graph API — it would be wonderful to have a graph API (including analysis options) that could communicate with the OWL API. Failing that, it would be helpful to have a graph API that communicated well with RDF and ontology structures; extant options are few
  • Large-graph visualizer — while we have earlier reviewed large-scale graph visualization software, the alternatives are neither easy to set up nor use. Being able to readily select layout options with quick zooms and scaling options are important
  • Graphical editor — some browsers or editors (e.g, FlexViz) provide nice graph-based displays of ontologies and their properties and annotations. However, there appear to be few environments where the ontology graph can be directly edited or visually used for design or expansion.

Finally, it does appear that the effort and focus behind Protégé seems to be slowing somewhat. The future has clearly shifted to OWL 2 with Protégé 4. Yet, besides the admirable CO-ODE project (now ended), tools and plug-in support seems to have slowed. Many of the admirable plug-ins for Protégé 3x do not appear to be under active development as upgrades to Protégé 4. While Protégé’s future (and similar IDEs) seems assured, its prominence possibly will (and should) be replaced by a simpler kit of tools useful to users and practitioners.

Funding and Pending Project Priorities

For the past few months we at Structured Dynamics have seen ontology design and management as the pending technical priorities within the semantic technology space. Now that the market no longer looks at “ontology” as a four-letter word, it is imperative to simplify the development and use of ontologies. The first generation of tools leading up to this point have been helpful to understand the semantic space; changes are now necessary to expand it.

In our first generation we have begun to understand the types and nature of needed tools. But our focus on IDEs and comprehensive toolsets belies a developer’s or technologist’s perspective. We need to now shift focus and look at tool needs from the standpoint of users and actual use of ontologies. Many players and many toolmakers and innovators will need to contribute to build this market for semantic technologies and approaches.

Fortunately, replacing an IDE focus with one based around APIs and Web services should be a fairly smooth and natural transition. If we truly desire to be market makers, we need to stand back and place ourselves into the shoes of the domain practitioners, the subject matter experts. We need to shield actual users from all of the silly technical details and complexity. And, then, let’s focus — task-by-task — on discrete items of management and use of ontologies. Growth of the semantic technology space depends on expanding our practitioner base.

For its part, Structured Dynamics is presently seeking new projects and sponsors with a commitment to these aims. Like our prior development of structWSF and semantic components, we will be looking to make simpler ontology tools a priority in the coming months. Please let me know if you want to partner with us toward this commitment.


[1] This posting is part of a current series on ontology development and tools, now permanently archived and updated on the OpenStructs TechWiki. The series began with an update of the prior Ontology Tools listing, which now contains 185 tools. It continued with a survey of ontology development methodologies. The last part presented a Lightweight, Domain Ontologies Development Methodology. This part is archived on the TechWiki as the Normative Landscape of Ontology Tools. The last installment in the series is planned to cover ontology best practices.
[2] The original version, now slightly modified, was first published in M. K. Bergman, 2009. “Ontology-driven Applications Using Adaptive Ontologies,” AI3:::Adaptive Information blog, Nov. 23, 2009.
[3] The TBox portion, or classes (concepts), is the basis of the ontologies. The ontologies establish the structure used for governing the conceptual relationships for that domain and in reference to external (Web) ontologies. The ABox portion, or instances (named entities), represents the specific, individual things that are the members of those classes. Named entities are the notable objects, persons, places, events, organizations and things of the world. Each named entity is related to one or more classes (concepts) to which it is a member. Named entities do not set the structure of the domain, but populate that structure. The ABox and TBox play different roles in the use and organization of the information and structure. These distinctions have their grounding in description logics.
[4] See M.K. Bergman, 2009. “Ontology-driven Applications Using Adaptive Ontologies,” AI3:::Adaptive Information blog, November 23, 2009.
[5] Natalya F. Noy, Michael Sintek, Stefan Decker, Monica Crubézy, Ray W. Fergerson and Mark A. Musen, 2001. “Creating Semantic Web Contents with Protégé-2000,” IEEE Intelligent Systems, vol. 16, no. 2, pp. 60-71, Mar/Apr. 2001. See http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.24.7177&rep=rep1&type=pdf.
[6] York Sure, Michael Erdmann, Juergen Angele, Steffen Staab, Rudi Studer and Dirk Wenke, 2002. “OntoEdit: Collaborative Ontology Development for the Semantic Web,” in Proceedings of the International Semantic Web Conference (ISWC) (2002). See http://www.aifb.uni-karlsruhe.de/WBS/Publ/2002/2002_iswc_ontoedit.pdf.
[7] Sean Bechhofer, Ian Horrocks, Carole Goble and Robert Stevens, 2001. “OilEd: a Reasonable Ontology Editor for the Semantic Web,” in Proceedings of KI2001, Joint German/Austrian conference on Artificial Intelligence.
[8] Aditya Kalyanpur and James Hendler, 2004. “Swoop: Design and Architecture of a Web Ontology Browser/Editor,” Scholarly Paper for Master’s Degree in Computer Science, University of Maryland, Fall 2004. See http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.87.1779&rep=rep1&type=pdf.
[9] Holger Knublauch was formerly the designer and developer of Protégé-OWL, the leading open-source ontology editor. TopBraid Composer leverages the experiences gained with Protégé and other tools into a professional ontology editor and knowledge-base framework. Composer is based on the Eclipse platform and uses Jena as its underlying API. See further http://www.topquadrant.com/composer/tbc-protege.html.
[10] Sean Bechhofer, Phillip Lord and Raphael Volz, 2003. “Cooking the Semantic Web with the OWL API,” in Proceedings of the 2nd International Semantic Web Conference, ISWC, Sanibel Island, Florida, October 2003. See http://homepages.cs.ncl.ac.uk/phillip.lord/download/publications/cooking03.pdf.
[11] Jena is fundamentally an RDF API. Jena’s ontology support is limited to ontology formalisms built on top of RDF. Specifically this means RDFS, the varieties of OWL, and the now-obsolete DAML+OIL. At the time of writing, no decision has yet been made about when Jena will support the new OWL 2 features. See http://jena.sourceforge.net/ontology/.
[12] The NeON Toolkitis built on OntoStudio. It is based on Eclipse with support for the OWL API. A series of its key plug-ins utilize various aspects of GATE (General Architecture for Text Engineering). The four-year project began in 2006 and its first open source toolkit was released by the end of 2007. OWL features were added in 2008-09. NeON has since completed, though its toolkit and plug-ins can still be downloaded as open source.
[13] OWL API is a Java interface and implementation for the W3C Web Ontology Language (OWL), used to represent Semantic Web ontologies. The API provides links to inferencers, managers, annotators, and validators for the OWL2 profiles of RL, QL, EL. Two recent papers describing the updated API are: Matthew Horridge and Sean Bechhofer, 2009. “The OWL API: A Java API for Working with OWL 2 Ontologies,” presented at OWLED 2009, 6th OWL Experienced and Directions Workshop, Chantilly, Virginia, October 2009. See http://www.webont.org/owled/2009/papers/owled2009_submission_29.pdf; and, Matthew Horridge and Sean Bechhofer, 2010. “The OWL API: A Java API for OWL Ontologies,” paper submitted to the Semantic Web Journal; see http://www.semantic-web-journal.net/sites/default/files/swj107.pdf.
[14] These links show the status of TopBraid Composer and Ontotext’s OWLIM with regard to OWL 2 RL. A newer effort, based on Eclipse, with broader OWL API support is the MOST Project’s TwoUse Toolkit. In all likelihood, the number of other tools with OWL 2 support is larger than our informal survey has found. Importantly, Jena still has not upgraded to OWL 2, but its open source site suggests it may.
[15] The CO-ODE project aimed to build authoring tools and infrastructure to make ontology engineering easier. It specifically supported the development and use of OWL-DL ontologies, by being heavily involved in the creation of infrastructure and plugins for the Protégé platform and OWL 2 support for the OWL API.
[16] For a great discussion on the unique aspects of version control in ontologies, see T. Redmond, M. Smith, N. Drummond and T. Tudorache, 2008. “Managing Change: An Ontology Version Control System,” paper presented at OWLED 2008, Karslruhe, Germany. See http://bmir.stanford.edu/file_asset/index.php/1435/BMIR-2008-1366.pdf.
[17] Birte Glimm, Matthew Horridge, Bijan Parsia and Peter F. Patel-Schneider, 2009. “A Syntax for Rules in OWL 2,” presented at the Sixth OWLED Workshop, 23-24 October 2009. See http://www.webont.org/owled/2009/papers/owled2009_submission_16.pdf.
[18] The Alignment API is one of the more venerable ones in this environment. A couple of other examples include a SKOS API (Simon Jupp, Sean Bechhofer and Robert Stevens, 2009. ” A Flexible API and Editor for SKOS,” presented at the 6th Annual European Semantic Web Conference (ESWC2009); see http://ftp.informatik.rwth-aachen.de/Publications/CEUR-WS/Vol-401/iswc2008pd_submission_88.pdf) and the Ontology Common API Tasks (Tomasz Adamusiak, K Joeri van der Velde, Niran Abeygunawardena, Despoina Antonakaki, Helen Parkinson and Morris A. Swertz, 2010. “OntoCAT — A Simpler Way to Access Ontology  Resources,” presented at ISMB2010, 10 July 2010. See http://www.iscb.org/uploaded/css/58/17254.pdf. OntoCAT is an open source package developed to simplify the task of querying heterogeneous ontology resources. It supports NCBO BioPortal and EBI Ontology Lookup Service (OLS), as well as local OWL and OBO files).
[19] The structWSF Web services framework is generally RESTful middleware that provides a bridge between existing content and structure and content management systems and available indexing engines and RDF data stores. structWSF is a platform-independent means for distributed collaboration via an innovative dataset access paradigm. It has about twenty embedded Web services. See http://openstructs.org/structwsf.

Posted by AI3's author, Mike Bergman

Posted on September 7, 2010 at 12:17 am in Ontologies, Ontology Best Practices, Open Source, Semantic Web Tools | Comments (5)
The URI link reference to this post is: http://www.mkbergman.com/909/a-new-landscape-in-ontology-development-tools/
The URI to trackback this post is: http://www.mkbergman.com/909/a-new-landscape-in-ontology-development-tools/trackback/
Page 1 of 141234510...Last »
Copyright © 2004–2012 Michael K. Bergman.   This work is licensed under a Creative Commons License