Posted:August 8, 2011

Geshi NetworkVisualization + Analysis Pushes Aside Cytoscape

Though I never intended it, some posts of mine from a few years back dealing with 26 tools for large-scale graph visualization have been some of the most popular on this site. Indeed, my recommendation for Cytoscape for viewing large-scale graphs ranks within the top 5 posts all time on this site.

When that analysis was done in January 2008 my company was in the midst of needing to process the large UMBEL vocabulary, which now consists of 28,000 concepts. Like anything else, need drives research and demand, and after reviewing many graphing programs, we chose Cytoscape, then provided some ongoing guidelines in its use for semantic Web purposes. We have continued to use it productively in the intervening years.

Like for any tool, one reviews and picks the best at the time of need. Most recently, however, with growing customer usage of large ontologies and the development of our own structOntology editing and managing framework, we have begun to butt up against the limitations of large-scale graph and network analysis. With this post, we announce our new favorite tool for semantic Web network and graph analysis — Gephi — and explain its use and showcase a current example.

The Cytoscape Baseline and Limitations

Three and one-half years ago when I first wrote about Cytoscape, it was at version 2.5. Today, it is at version 2.8, and many aspects have seen improvement (including its Web site). However, in other respects, development has slowed. For example, version 3.x was first discussed more than three years ago; it is still not available today.

Though the system is open source, Cytoscape has also largely been developed with external grant funds. Like other similarly funded projects, once and when grant funds slow, development slows as well. While there has clearly been an active community behind Cytoscape, it is beginning to feel tired and a bit long in the tooth. From a semantic Web standpoint, some of the limitations of the current Cytoscape include:

  • Difficult conversion of existing ontologies — Cytoscape requires creating a CSV input; there was an earlier RDFscape plug-in that held great promise to bridge the software into the RDF and semantic Web sphere, but it has not remained active
  • Network analysis — one of the early and valuable generalized network analysis plug-ins was NetworkAnalyzer; however, that component has not seen active development in three years, and dynamic new generalized modules suitable for social network analysis (SNA) and small-world networks have not been apparent
  • Slow performance and too-frequent crashes — Cytoscape has always had a quirky interface and frequent crashes; later versions are a bit more stable, but usability remains a challenge
  • Largely supported by the biomedical community — from the beginning, Cytoscape was a project of the biomedical community. Most plug-ins still pertain to that space. Because of support for OBO (Open Biomedical and Biological Ontologies) formats and a lack of uptake by the broader semantic Web community, RDF- and OWL-based development has been keenly lacking
  • Aside from PDFs, poor ability to output large graphs in a viewable manner
  • Limited layout support — and poor performance for many of those included with the standard package.

Undoubtedly, were we doing semantic technologies in the biomedical space, we might well develop our own plug-ins and contribute to the Cytoscape project to help overcome some of these limitations. But, because I am a tools geek (see my Sweet Tools listing with nearly 1000 semantic Web and -related tools), I decided to check out the current state of large-scale visualization tools and see if any had made progress on some of our outstanding objectives.

Choosing Gephi and Using It

There are three classes of graph tools in the semantic technology space:

  1. Ontology navigation and discovery, to which the Relation Browser and RelFinder are notable examples
  2. Ontology structure visualization (and sometimes editing), such as the GraphViz (OWLViz) or OntoGraf tools used in Protégé (or the nice FlexViz, again used by the OBO community), and
  3. Large-scale graph visualization in order to gain a complete picture and macro relationships in the ontology.

One could argue that the first two categories have received the most current development attention. But, I would also argue that the third class is one of the most critical:  to understand where one is in a large knowledge space, much better larger-scale visualization and navigation tools are needed. Unfortunately, this third category is also the one that appears to be receiving the least development attention. (To be sure, large-scale graphs pose computational and performance challenges.)

In the nearly four years since my last major survey of 26 tools in this category, the new entrants appear quite limited. I’ve surely overlooked some, but the most notable are Gruff, NAViGaTOR, NetworkX and Gephi [1]. Gruff actually appears to belong most in Category #2; I could find no examples of graphs on the scale of thousands of nodes. NAViGaTOR is biomedical only. NetworkX has no direct semantic graph importing and — while apparently some RDF libraries can be used for manipulating imports — alternative workflows were too complex for me to tackle for initial evaluation. This leaves Gephi as the only potential new candidate.

From a clean Web site to well-designed intro tutorials, first impressions of Gephi are strongly positive. The real proof, of course, was getting it to perform against my real use case tests. For that, I used a “big” ontology for a current client that captures about 3000 different concepts and their relationships and more than 100 properties. What I recount here — from first installing the program and plug-ins and then setting up, analyzing, defining display parameters, and then publishing the results — took me less than a day from a totally cold start. The Gephi program and environment is surprisingly easy to learn, aided by some great tutorials and online info (see concluding section).

The critical enabler for being able to use Gephi for this source and for my purposes is the SemanticWebImport plug-in, recently developed by Fabien Gandon and his team at Inria as part of the Edelweiss project [2]. Once the plug-in is installed, you need only open up the SemanticWebImport tab, give it the URL of your source ontology, and pick the Start button (middle panel):

SemWeb Plug-in for GephiNote the SemanticWebImport tool also has the ability (middle panel) to issue queries to a SPARQL endpoint, the results of which return a results graph (partial) from the source ontology. (This feature is not further discussed herein.) This ontology load and display capability worked without error for the five or six OWL 2 ontologies I initially tested against the system.

Once loaded, an ontology (graph) can be manipulated with a conventional IDE-like interface of tabs and views. In the right-hand panels above we are selecting various network analysis routines to run, in this case Average Degrees. Once one or more of these analysis options is run, we can use the results to then cluster or visualize the graph; the upper left panel shows highlighting the Modularity Class, which is how I did the community (clustering) analysis of our big test ontology. (When run you can also assign different colors to the cluster families.) I also did some filtering of extraneous nodes and properties at this stage and also instructed the system via the ranking analysis to show nodes with more link connections as larger than those nodes with fewer links.

At this juncture, you can also set the scale for varying such display options as linear or some power function. You can also select different graph layout options (lower left panel). There are many layout plug-in options for Gephi. The layout plugin called OpenOrd, for instance, is reported to be able to scale to millions of nodes.

At this point I played extensively with the combination of filters, analysis, clusters, partitions and rankings (as may be separately applied to nodes and edges) to: 1) begin to understand the gross structure and characteristics of the big graph; and 2) refine the ultimate look I wanted my published graph to have.

In our example, I ultimately chose the standard Yifan Hu layout in order to get the communities (clusters) to aggregate close to one another on the graph. I then applied the Parallel Force Atlas layout to organize the nodes and make the spacings more uniform. The parallel aspect of this force-based layout allows these intense calculations to run faster. The result of these two layouts in sequence is then what was used for the results displays.

Upon completion of this analysis, I was ready to publish the graph. One of the best aspects of Gephi is its flexibility and control over outputs. Via the main Preview tab, I was able to do my final configurations for the published graph:

Publication Options for GephiThe graph results from the earlier-worked out filters and clusters and colors are shown in the right-hand Preview pane. On the left-hand side, many aspects of the final display are set, such as labels on or off, font sizes, colors, etc. It is worth looking at the figure above in full size to see some of the options available.

Standard output options include either SVG (vector image) or PDFs, as shown at the lower left, with output size scaling via slider bar. Also, it is possible to do standard saves under a variety of file formats or to do targeted exports.

One really excellent publication option is to create a dynamically zoomable display using the Seadragon technology via a separate Seadragon Web Export plug-in. (However, because of cross-site scripting limitations due to security concerns, I only use that option for specific sites. See next section for the Zoom It option — based on Seadragon — to workaround that limitation.)

Outputs Speak for Themselves

I am very pleased with the advances in display and analysis provided by Gephi. Using the Zoom It alternative [3] to embedded Seadragon, we can see our big ontology example with:

  • All 3000 nodes labeled, with connections shown (though you must must zoom to see) and
  • When zooming (use scroll wheel or + icon) or panning (via mouse down moves), wait a couple of seconds to get the clearest image refresh:

Note: at standard resolution, if this graph were to be rendered in actual size, it would be larger than 7 feet by 7 feet square at full zoom !!!

To compare output options, you may also;

Still, Some Improvements Would be Welcomed

It is notable that Gephi still only versions itself as an “alpha”. There is already a robust user community with promise for much more technology to come.

As an alpha, Gephi is remarkably stable and well-developed. Though clearly useful as is, I measure the state of Gephi against my complete list of desired functionality, with these items still missing:

  • Real-time and interactive navigation — the ability to move through the graph interactively and to issue queries and discover relationships
  • Huge node numbers — perhaps the OpenOrd plug-in somewhat addresses this need. We will be testing Gephi against UMBEL, which is an order of magnitude larger than our test big ontology
  • More node and edge control — Cytoscape still retains the advantage in the degree to which nodes and edges can be graphically styled
  • Full round-tripping — being able to use Gephi in an edit mode would be fantastic; the edit functionality is fairly straightforward, but the ability to round-trip in appropriate formats (OWL, RDF or otherwise) may be the greater sticking point.

Ultimately, of course, as I explained in an earlier presentation on a Normative Landscape for Ontology Tools, we would like to see a full-blown graphical program tie in directly with the OWL API. Some initial attempts toward that have been made with the non-Gephi GLOW visualization approach, but it is still in very early phases with ongoing commitments unknown. Optimally, it would be great to see a Gephi plug-in that ties directly to the OWL API.

In any event, while perhaps Cytoscape development has stalled a bit for semantic technology purposes, Gephi and its SemanticWebImport plug-in have come roaring into the lead. This is a fine toolset that promises usefulness for many years to come.

Some Further Gephi Links

To learn more about Gephi, also see the:

Also, for future developments across the graph visualization spectrum, check out the Wikipedia general visualization tools listing on a periodic basis.


[1] The R open source math and statistics package is very rich with apparently some graph visualization capabilities, such as the dedicated network analysis and visualization project statnet. rrdf may also provide an interesting path for RDF imports. R and its family of tools may indeed be quite promising, but the commitment necessary to R appears quite daunting. Longer-term, R may represent a more powerful upgrade path for our general toolsets. Neo4j is also a rising star in graph databases, with its own visualization components. However, since we did not want to convert our underlying data stores, we also did not test this option.
[2] Erwan Demairy is the lead developer and committer for SemanticWebImport. The first version was released in mid-April 2011.
[3] For presentations like this blog post, the Seadragon JavaScript enforces some security restrictions against cross-site scripting. To overcome that, the option I followed was to:
  • Use Gephi’s SVG export option
  • Open the SVG in Inkscape
  • Expand the size of the diagram as needed (with locked dimensions to prevent distortion)
  • Save As a PNG
  • Go to Zoom It and submit the image file
  • Choose the embed function, and
  • Embed the link provided, which is what is shown above.
(Though Zoom.it also accepts SVG files directly, I found performance to be spotty, with many graphical elements dropped in the final rendering.)

Posted by AI3's author, Mike Bergman Posted on August 8, 2011 at 3:27 am in Ontologies, Open Source, Semantic Web Tools, UMBEL | Comments (0)
The URI link reference to this post is: http://www.mkbergman.com/968/a-new-best-friend-gephi-for-large-scale-networks/
The URI to trackback this post is: http://www.mkbergman.com/968/a-new-best-friend-gephi-for-large-scale-networks/trackback/
Posted:May 16, 2011

A Video Introduction to a New Online Ontology Editor and Manager

Structured Dynamics is pleased to unveil structOntology — its ontology manager application within the conStruct open source semantic technology suite. We are doing so via a video, which provides a bit more action about this exciting new app.

structOntology has been on our radar for more than two years. But, it was only in embracing the OWLAPI some eight months back that we finally saw our way clear to how to implement the system.

The app, superbly developed by Fred Giasson, has many notable advantages — some of which are covered by the video — but two deserve specific attention:  1) the superior search function (if you have been using Protégé or similar, you will love the fact this search indexes everything, courtesy of Solr); and 2) the availability of its functionality directly within the applications that are driven by the ontologies. Of course, there’s other cool stuff too!:

 

 

(If you have trouble seeing this, here is the direct YouTube link or an alternate local Flash version if you can not access YouTube.)

More information on structOntology will be forthcoming over the coming weeks. We will be posting it as open source as part of the Open Semantic Framework by early summer.

Posted:March 21, 2011

An Overview of Freely Available, Comprehensive Icon Sets

structWFS It is not unusual when designing up a new project that it is important to find a consistent set of icons for user interface or mapping purposes. Full libraries or icon sets can be important because mixing and matching icons from multiple sources often conveys a bit of chaos or unprofessionalism.

Structured Dynamics monitors freely available icons for these purposes and provides listings to its clients so that they may tailor and choose their own looks-and feel. The material below is the reference listing of about 20 comprehensive sets of open source icons that may be used for the open semantic framework (OSF) or sWebMap interfaces. Links to other listings are also provided. These references are kept up-to-date on the OSF TechWiki.

General Icons

Here are some consistent families of general user interface icons. While there are thousands of free icons available from many venues (check out via search engines), there are fewer that have sufficient diversity and scope to encompass most user interface needs. Since it is noticeably jarring to mix icon styles in the same interface (or, at least to do so indiscriminately), it is important to have a consistent design image.

Here are the candidate choices we have found. Some are provided in either multiple size formats or in vector (generally, SVG), formats:

  • The Silk icons from famfamfam is a set of over 700 16-by-16 pixel icons in PNG format (144 of which are also available as GIF mini-icons, see below). This is the standard open source set used as the basis for Pastel (see below), which is used in the various conStruct tools. There are also other free icons from this site
Famfamfam.png
  • Tango is an icon library that contains a basic set of icons for the most common usage. They come in 16×16 and 22×22 sizes, and some are scalable (vector). There are also a variety of extensions for specific purposes
Tango.png
  • Pastel SVG is an icon set based on the Silk icons noted above from FamFamFam.com. Pastel uses the same style, but comes in the sizes of 16, 24, 32, 48, 64, 72, 96, 128 or 256 pixels square; a sampling is shown below
Pastel_sample.png

Pastel is the standard icon set chosen for conStruct tools.

  • The Fugue icons by Yusuke Kamiyamane is the largest set available, and contains 3000 individual icons in 16×16 PNG format. Here is a sampling:
Fugue.png

Alternatively, there is a smaller set of 400 icons called Diagona also available from the same designer

  • Nuvola is a set of 600 icons in either PNG or SVG format from David Vignoni (Icon King). The PNG come in standard sizes of 16, 22, 32, 48, 64 or 128 pixels square. Here is a sampling:
Nuvola.png

Vignoni also has an alternative set of icons with a similar feel called Oxygen.

  • The Crystal set of more than 1300 icons is organized into six different sizes, and is divided into the categories of actions, apps, files systems, devices and mime types. Here is a sampling:
Crystal.png

Other Sources

According to the Open Icon Library, which has a nice gallery (but which also mixes sources), here are some other key sources of open source icons not already listed above:

See also the icon sets used within Wikipedia itself.

Lastly, and perhaps most usefully, peruse the 750+ icon sets on Icon Finder.

Map Icons

With the emergence of Web 2.0 and locational services, particularly the open API and “thumbtack” aspect of Google Maps, a new category of map markers for web mapping has emerged. This category is still new enough that an accepted terminology has not yet developed. Among other terms, here are some of the ways that these locational markers on maps have been described:

  • Places of interest (POIs)
  • Points of interest (POIs)
  • Pins
  • Pushpins
  • Placemarks
  • Thumbtacks
  • Markers
  • Location markers
  • Map pointers.

Here are some of the consolidated sources of open source markers now available:

  • This is a sampling of 120 markers or so available within the Google MyMaps API (see further this link with shadows and this full listing). All have matching shadows useful for conveying a 3D feeling:
Matt77.png

There are also about 250 standard icons provided within the Google Earth set. You can see those listed here. Also, to see the available icon libraries in Google maps (plus some others), see this link

  • Map Icons Collection is a set of more than 1000 free icons to use as placemarks for POI locations on maps (originally designed for the Google Maps API). Most of these icon markers are square in aspect with a pointer, and are organized by color-coordinated categories such as numbers, cinemas, hotels, banks, etc. Here is a sampling:
Google_map_icons.png
  • The Maki icon set consists of more than 100 black and white 15×15 map markers
  • This listing provides three different colors in the Google Map “teardrop” style for all letters and 99 numbers
  • Geosilk is an extension of the standard Silk icon set noted above. It is more applicable to UI icons relating to map functions than to map markers per se
  • Green Map contains a set of about 170 monochrome (can be colored differently) POI markers, with an orientation to nature or ecological categories. There are also local extensions
  • Map Pins provides 22 alternative map pins and flags:
Map_pins.png
  • 50 monochrome POI and map marker symbols from the US National Park Service (NPS):
Nps_markers.png
  • There is a similar (and complementary in design) set of 50 monochrome pedestrian and transportation symbols from AIGA in cooperation with the US Department of Transportation

Dynamic Markers

Some markers can be created dynamically with the Google Map API. Here are some background articles and links:

Other Listings

Various other listings, many with icons but perhaps not organized into the same uniform sets, include:

Posted by AI3's author, Mike Bergman Posted on March 21, 2011 at 2:30 am in Open Semantic Framework, Open Source, Software Development | Comments (2)
The URI link reference to this post is: http://www.mkbergman.com/952/noteworthy-icon-libraries-for-projects-and-web-mapping/
The URI to trackback this post is: http://www.mkbergman.com/952/noteworthy-icon-libraries-for-projects-and-web-mapping/trackback/
Posted:February 15, 2011

UMBEL Vocabulary and Reference Concept OntologyA Seminal Release by SD and Ontotext; Links to Wikipedia and PROTON

Structured Dynamics and Ontotext are pleased to announce — after four years of iterative refinement — the release of version 1.00 of UMBEL (Upper Mapping and Binding Exchange Layer). This version is the first production-grade release of the system. UMBEL’s current implementation is the result of much practical experience.

UMBEL is primarily a reference ontology, which contains 28,000 concepts (classes and relationships) derived from the Cyc knowledge base. The reference concepts of UMBEL are mapped to Wikipedia, DBpedia ontology classes, GeoNames and PROTON.

UMBEL is designed to facilitate the organization, linkage and presentation of heterogeneous datasets and information. It is meant to lower the time, effort and complexity of developing, maintaining and using ontologies, and aligning them to other content.

This release 1.00 builds on the prior five major changes in UMBEL v. 0.80 announced last November. It is open source, provided under the Creative Commons Attribution 3.0 license.

Profile of the Release

In broad terms, here is what is included in the new version 1.00:

  • A core structure of 27,917 reference concepts (RCs)
  • The clustering of those concepts into 33 mostly disjoint SuperTypes (STs)
  • Direct RC mapping to 444 PROTON classes
  • Direct RC mapping to 257 DBpedia ontology classes
  • An incomplete mapping to 671 GeoNames features
  • Direct mapping of 16,884 RCs to Wikipedia (categories and pages)
  • The linking of 2,130,021 unique Wikipedia pages via 3,935,148 predicate relations; all are characterized by one or more STs
    • 876,125 are assigned a specific rdf:type
  • The UMBEL RefConcepts have been re-organized, with most local, geolocational entities moved to a supplementary module. 577 prior (version 0.80) UMBEL RCs and a further 3204 new RCs have been added to this geolocational module. This module is not being released for the current version because testing is incomplete (watch for a pending version 1.0x)
  • Some vocabulary changes, including some new and some dropped predicates (see next), and
  • Added an Annex H that describes the version 1.00 changes and methods.

Vocabulary Summary

UMBEL’s basic vocabulary can also be used for constructing specific domain ontologies that can easily interoperate with other systems. This release sees a number of changes in the UMBEL vocabulary:

  • A new correspondsTo predicate has been added for nearly or approximate sameAs mappings (symmetric, transitive, reflexive)
  • A controlled vocabulary of qualifiers was developed for the hasMapping predicate
  • 31 new relatesToXXX predicates have been added to relate external entities or concepts to UMBEL SuperTypes
  • Some disjointedness assertions between SuperTypes were added or changed.

The UMBEL Vocabulary defines three classes:

The UMBEL Vocabulary defines these properties:

The UMBEL vocabulary also has a significant reliance on SKOS, among other external vocabularies.

Access and More Information

Here are links to various downloads, specifications, communities and assistance.

Specifications and Documentation

All documentation from the prior v 0.80 has been updated, and some new documentation has been added:

Major updates were made to the specifications and Annex G; Annex H is new. Minor changes were also made to Annexes A and B. All remaining Annexes only had minor header changes. All spec documents with minor or major changes were also versioned, with the earlier archives now date stamped.

Files and Downloads

All UMBEL files are listed on the Downloads and SVN page on the UMBEL Web site. The reference concept and mapping files may also be obtained from http://code.google.com/p/umbel/source/browse/#svn/trunk.

Additional Information

To learn more about UMBEL or to participate, here are some additional links:

Acknowledgements

These latest improvements to UMBEL and its mappings have been undertaken by Structured Dynamics and Ontotext. Support has also been provided by the European Union research project RENDER, which aims to develop diversity-aware methods in the ways Web information is selected, ranked, aggregated, presented and used.

Next Steps

This release continues the path to establish a gold standard between UMBEL and Wikipedia to guide other ontological, semantic Web and disambiguation needs. For example, the number of UMBEL reference concepts was expanded by some 36% from 20,512 to 27,917 in order to provide a more balanced superstructure for organizing Wikipedia content. And across all mappings, 60% of all UMBEL reference concepts (or 16,884) are now linked directly to Wikipedia via the new umbel:correspondsTo property. A later post will describe the design and importance of this gold standard in greater detail.

Next releases will expand this linkage and coverage, and bring in other important reference structures such as GeoNames and others. This version of UMBEL will also be incorporated into the next version of FactForge. We will also be re-invigorating the Web vocabulary access and Web services, and adding tagging services based on UMBEL.

We invite other players with an interest in reusable and broadly applicable vocabularies and reference concepts to join with us in these efforts.


[1] Note, for legacy reasons, you may still encounter reference to ‘subject concepts’ in earlier UMBEL documentation. Please consider that term as interchangeable with the current ‘reference concepts’.

Posted by AI3's author, Mike Bergman Posted on February 15, 2011 at 12:39 am in Ontologies, Open Source, Semantic Web, UMBEL | Comments (0)
The URI link reference to this post is: http://www.mkbergman.com/945/announcing-the-first-production-grade-umbel/
The URI to trackback this post is: http://www.mkbergman.com/945/announcing-the-first-production-grade-umbel/trackback/
Posted:February 7, 2011

Sweet Tools ListingNow Presented as a Semantic Component; Grows to 900+ Tools

Sweet Tools, AI3‘s listing of semantic Web and -related tools, has just been released with its 17th update. The listing now contains more than 900 tools, about a 10% increase over the last version. Significantly the listing is also now presented via its own semantic tool, the structSearch sComponent, which is one of the growing parts to Structured Dynamics‘ open semantic framework (OSF).

So, we invite you to go ahead and try out this new Flex/Flash version with its improved search and filtering! We’re pretty sure you’ll like it.

Summary of Major Changes Sweet Tools structSearch View

Sweet Tools now lists 907 919 tools, an increase of 72 84 (or 8.6 10.1%) over the prior version of 835 tools. The most notable trend is the continued increase in capabilities and professionalism of (some of) the new tools.

This new release of Sweet Tools — available for direct play and shown in the screenshot to the right — is the first to be presented via Structured Dynamics’ Flex-based semantic component technology. The system has greatly improved search and filtering capabilities; it also shares the superior dataset management and import/export capabilities of its structWSF brethren.

As a result, moving forward, Sweet Tools updates will now be added on a more regular basis, reducing the big burps that past releases have tended to follow. We will also see much expanded functionality over time as other pieces of the structWSF and sComponents stack get integrated and showcased using this dataset.

This release is the first in WordPress, and shows the broad capabilities of the OSF stack to be embedded in a variety of CMS or standalone systems. We have provided some updates on Structured Dynamics’ OSF TechWiki for how to modify, embed and customize these components with various Flex development frameworks (see one, two or three), such as Flash Builder or FlashDevelop.

We should mention that the OSF code group is also seeing external parties exposing these capabilities via JavaScript deployments as well. This recent release expands on the conStruct version with its capabilities described in a post about a year ago.

Retiring the Exhibit Version

However, this release does mark the retirement of the very fine Exhibit version of Sweet Tools (an archive version will be kept available until it gets too long in the tooth). I was one of the first to install a commercial Exhibit system, and the first to do so on WordPress, as I described in an article more than four years ago.

Exhibit has worked great and without a hitch, and through a couple of upgrades. It still has (I think) a superior faceting system and sorting capabiities to what we presently offer with our own sComponent alternative. However, the Exhibit version is really a display technology alone, and offers no search, access control or underlying data management capabilities (such as CRUD), all of which are integral to our current system. It is also not grounded in RDF or semantic technologies, though it does have good structural genes. And, Sweet Tools has about reached the limits of the size of datasets Exhibit can handle efficiently.

Exhibit has set a high bar for usability and lightweight design. As we move in a different direction, I’d like again to publicly thank David Huynh, Exhibit’s developer, and the MIT Simile program for when he was there, for putting forward one of the seminal structured data tools of the past five years.

Updated Statistics

The updated Sweet Tools listing now includes nearly 50 different tools categories. The most prevalent categories are browser tools (RDF, OWL), information extraction, ontology tools, parsers or converters, and general RDF tools. The relative share by category is shown in this diagram (click to expand):

Since the last listing, the fastest growing categories have been utilities (general and RDF) and visualization. Linked data listings have also grown by 200%, but are still a relatively small percentage of the total.

These values should be taken with a couple of grains of salt. First, not all of these additions are organic or new releases. Some are the result of our own tools efforts and investigations, which can often surface prior overlooked tools. Also, even with this large number of application categories, many tools defy characterization, and can reside in multiple categories at once or are even pointing to new ones. So, the splits are illustrative, but not defining.

General language percentages have been keeping pretty constant over the past couple of years. Java remains the leading language with nearly half of all applications, a percentage it has kept steady for four years. PHP continues to grow in popularity, and actually increased the largest percentage amount of any language over this past census. The current language splits are shown in the next diagram (click to expand):

C/C++ and C# have really not grown at all over the past year. Again, however, for the reasons noted, these trends should be interpreted with care.

Tasty Dogfood?Dogfood Never Tasted So Good

Tools development is hard and the open source nature of today’s development tends to require a certain critical mass of developer interest and commitment. There are some notable tools that have much use and focus and are clearly professional and industrial grade. Yet, unfortunately, too many of the tools on the Sweet Tools listing are either proofs-of-concept, academic demos, or largely abandoned because of lack of interest by the original developer, the community or the market as a whole.

There is a common statement within the community about how important it is for developers to “eat their own dogfood.” On the face of it, this makes some sense since it conveys a commitment to use and test applications as they are developed.

But looked at more closely, this sentiment carries with it a troublesome reflection of the state of (many) tools within the semantic Web: too much kibble that is neither attractive nor tasty. It is probably time to keep the dogfood in the closet and focus on well-cooked and attractive fare.

We at Structured Dynamics are not trying to hold ourselves up as exemplars or the best chefs of tasty food. We do, however, have a commitment to produce fare that is well prepared and professional. Let’s stop with the dogfood and get on with serving nutritious and balanced fare to the marketplace.

Posted by AI3's author, Mike Bergman Posted on February 7, 2011 at 1:47 am in Open Source, Semantic Web Tools, Structured Web | Comments (1)
The URI link reference to this post is: http://www.mkbergman.com/942/tasty-new-sweet-tools-release/
The URI to trackback this post is: http://www.mkbergman.com/942/tasty-new-sweet-tools-release/trackback/