Locational information — points of interest/POIs, paths/routes/polylines, or polygons/regions — is common to many physical things in our real world. Because of its pervasiveness, it is important to have flexible and powerful display widgets that can respond to geo-locational data. We have been working for some time to extend our family of semantic components within the open semantic framework (OSF) to encompass just such capabilities. Structured Dynamics is thus pleased to announce that we have now added the sWebMap component, which marries the entire suite of Google Map API capabilities to the structured data management arising from the structWSF Web services framework at the core of OSF.
The sWebMap component is fully in keeping with our design premise of ontology-driven applications, or ODapps. The sWebMap component can itself be embedded in flexible layouts — using Drupal in our examples below — and can be very flexibly themed and configured. We believe sWebMap will rapidly move to the head of the class as the newest member of Structured Dynamics’ open source semantic components.
The absolutely cool thing about sWebMap is that it just works. All one needs to do is relate it to a geo-enabled Search structWSF endpoint; all of the structured data with geo-locational attributes, along with its facets and structure, then becomes automagically available to the mapping widget. From there you can flexibly map, display, configure, filter, and select records, keep those selections persistent, and share them with others. As new structured data is added to your system, that data too becomes automatically available.
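As a rough illustration of what “relating to a geo-enabled Search endpoint” involves, here is a minimal sketch of assembling a viewport-constrained query. The parameter names (`q`, `bbox`, `datasets`) are assumptions for this example, not the actual structWSF API:

```python
# Hypothetical sketch: building a bounding-box query for a geo-enabled
# Search endpoint. Parameter names here are illustrative only.

def build_geo_query(query, north, south, east, west, datasets=None):
    """Assemble query parameters for a viewport-constrained search."""
    params = {
        "q": query,
        # Restrict results to the current map viewport.
        "bbox": f"{west},{south},{east},{north}",
    }
    if datasets:
        # Limit the search to selected source datasets.
        params["datasets"] = ";".join(datasets)
    return params

params = build_geo_query("school", 53.56, 53.51, -113.43, -113.50,
                         datasets=["neighborhoods", "schools"])
print(params["bbox"])
```

In the real component, such a request would be issued against the Search endpoint each time the viewport or filters change, with the returned structured records rendered by the widget.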
Though screenshots of this component in operation are provided below, here are some further links to learn more:
There is considerable functionality in the sWebMap widget, not all immediately obvious when you first view it.
Here is an example for sWebMap when it first comes up, using an example for the “Beaumont neighborhood”:
It is possible to set pre-selected items for any map display. That was done in this case, which shows the pre-selected items and region highlighted on the map and in the records listing (lower left below map).
The basic layout of the map has its main search options at the top, followed by the map itself and then two panels underneath:
The left-hand panel underneath the map presents the results listing. The right-hand panel presents the various filter options by which these results are generated. The filter options consist of:
As selections are made in sources or kinds, the subsequent choices narrow.
The layout below shows the key controls available on the sWebMap:
You can go directly to an affiliated page by clicking the upper-right icon; this area often shows a help button or other guide. The search box below that enables you to search for any available data in the system. If there is information that can be mapped and that occurs within the viewport of the current map size, those results will appear as one of three geographic feature types on the map:
At the map’s right is the standard map control that allows you to scroll the map area or zoom. Like regular Google maps, you can zoom (+ or – keys, or middle wheel on mouse) or navigate (arrow direction keys, or left mouse down and move) the map.
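The viewport-scoped display described above can be sketched in miniature: classify each record’s geometry as one of the three feature types, and show only those records with coordinates inside the current viewport. The record structure and field names below are assumptions for illustration:

```python
# Illustrative sketch: which records fall inside the current map viewport,
# and which of the three feature types each would be drawn as.

def feature_type(geometry):
    """Classify a geometry as a point, polyline, or polygon marker."""
    coords = geometry["coordinates"]
    if geometry.get("closed"):
        return "polygon"          # regions, e.g. neighborhoods or parcels
    return "point" if len(coords) == 1 else "polyline"

def in_viewport(geometry, west, south, east, north):
    """A record is shown if any of its coordinates lie in the viewport."""
    return any(west <= lng <= east and south <= lat <= north
               for lng, lat in geometry["coordinates"])

records = [
    {"name": "bus stop",  "geom": {"coordinates": [(-113.49, 53.53)]}},
    {"name": "bike path", "geom": {"coordinates": [(-113.48, 53.52),
                                                   (-113.46, 53.54)]}},
    {"name": "far away",  "geom": {"coordinates": [(-110.0, 50.0)]}},
]
visible = [r["name"] for r in records
           if in_viewport(r["geom"], -113.50, 53.51, -113.43, 53.56)]
print(visible)   # only the records within the viewport
```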
Current records are shown below the map. Specific records may be selected with their checkboxes; this keeps them persistent on the map and in the record listing no matter what the active filter conditions may be. (You may also see a little drawing icon, which presents an attribute report — similar to a Wikipedia ‘infobox’ — for the current record.) You can see in this case that the selected record also corresponds to a region (polygon) shape on the map.
In the map area itself, it is possible to also get different map views by selecting one of the upper right choices. In this case, we can see a satellite view (or “layer”):
Or, we can choose to see a terrain layer:
Or there may optionally be other layers or views available in this same section.
Another option that appears on the map is the ability to get a street view of the map. That is done by grabbing the person icon at the map’s left and dragging it to the location of interest within the map viewport. That also causes the street portion to be highlighted, with street view photos displayed (if they exist for that location):
By clicking the person icon again, you then shift into walking view:
Via the mouse, you can now navigate up and down these streets and change perspective to get a visual feel for the area.
Another option you may invoke is the multi-map view of the sWebMap. In this case, the map viewing area expands to include three sub-maps under the main map area. Each sub-map is color-coded and shown as a rectangle on the main map. (This particular example is displaying assessment parcels for the sample instance.) These rectangles can be moved on the main map, in which case their sub-map displays also move:
Re-sizing must be done via the sub-map itself (which then causes the rectangle’s size to change on the main map). You may also pan the sub-maps (which then causes the rectangle to move on the main map). The results list at the lower left is determined by which of the three sub-maps is selected (as indicated by the heavier bottom border).
There are two ways to get filter selection details for your current map: Show All Records or Search.
In the first case, we pick the Show All Records option at the bottom of the map view, which then brings up the detailed filter selections in the lower-right panel:
Here are some tips for using the left-hand records listing:
The records that actually appear in this listing are based on the record scope or Search (see below) conditions, as altered by the filter settings in the right-hand listing under the sWebMap. For example, if we now remove the neighborhood record as a persistent selection and Show included records, we get items across the entire map viewport:
Search works in a similar fashion, in that it invokes the filter display with the same left- and right-hand listings appearing under the sWebMap, now only for those records that meet the search conditions. (The allowable search syntax is that of Lucene.) Here is the result of a search, in this case for “school”:
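To give a flavor of the kind of matching such a search performs, here is a toy sketch handling only bare terms and a `field:term` form — the real endpoint supports the full Lucene query syntax, and the record fields here are made up for illustration:

```python
# Toy sketch of Lucene-style matching: a bare term matches any field;
# a "field:term" query matches only the named field.

def matches(record, query):
    if ":" in query:
        field, term = query.split(":", 1)
        return term.lower() in str(record.get(field, "")).lower()
    # A bare term is matched against every field.
    return any(query.lower() in str(v).lower() for v in record.values())

records = [
    {"name": "Beaumont School",      "kind": "school"},
    {"name": "Beaumont Golf Course", "kind": "golf course"},
]
print([r["name"] for r in records if matches(r, "school")])
print([r["name"] for r in records if matches(r, "kind:golf")])
```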
As shown above, the right-hand panel is split into three sections: Sources (or datasets), Kinds (that is, similar types of things, such as bus stops vs. schools vs. golf courses), and Attributes (that is, characteristics for these various types of things). All selection possibilities are supported by auto-select.
Sources and Kinds are selected via checkbox. (The default state, when none are checked, is to show all.) As more of these items are selected, the records listing in the left-hand panel gets smaller. The counts of available items [as shown by the (XX) number at the end of each item] also change as filters are added or removed via the checkboxes.
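This checkbox-and-count behavior reduces to simple faceted filtering, sketched below with made-up field names: an empty selection means no restriction, and each count reflects the currently filtered record set.

```python
# Sketch of the faceted filtering behind the Sources and Kinds panels.
from collections import Counter

def apply_filters(records, sources=None, kinds=None):
    """Checkbox semantics: an empty selection means no restriction."""
    return [r for r in records
            if (not sources or r["source"] in sources)
            and (not kinds or r["kind"] in kinds)]

def facet_counts(records, field):
    """The (XX) counts shown beside each filter item."""
    return Counter(r[field] for r in records)

records = [
    {"source": "city",  "kind": "school"},
    {"source": "city",  "kind": "bus stop"},
    {"source": "parks", "kind": "golf course"},
]
filtered = apply_filters(records, sources={"city"})
print(facet_counts(filtered, "kind"))   # counts narrow with the filter
```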
Applying filters to Attributes works a little differently. Attribute filters are selected via the magnifier-plus icon, which then brings up a filter selection at the top of the listing underneath the Attributes header.
The specific values and their counts (for the current selection population) are then shown; you may pick one or more items. Once done, you may pick another attribute to add to the filter list and continue the filtering process.
sWebMaps have a useful way to save and share their active filter selections. At any point as you work with a sWebMap, you can save all of its current settings and configurations — viewport area, filter selections, and persistent records — via some simple steps.
You initiate this functionality by choosing the save button at the upper right of the map panel:
When that option is invoked, it brings up a dialog where you are able to name the current session, and provide whatever explanatory notes you think might be helpful.
Once you have a saved session, you will then see a new control at the upper right of your map panel. This control is how you load any of your previously saved sessions:
Further, once you load a session, still further options are presented that enable you to either delete or share that session:
If you choose to share a session, a shortened URI is generated automatically for you:
If you then provide that URI link to another user, that user can then click on that link and see the map in the exact same state — viewport area, filter selections, and persistent records — as you initially saved. If the recipient then saves this session, it will now also be available persistently for his or her local use and changes.
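The essential idea — that the full map state can be captured, stored, and later restored identically — can be sketched as a round trip through a compact token. The real system presumably stores sessions server-side behind the shortened URI; the state fields below are taken from the description above:

```python
# Sketch: round-tripping a saved session (viewport, filters, persistent
# records) through a compact, URL-safe token.
import base64
import json

def save_session(state):
    raw = json.dumps(state, sort_keys=True).encode("utf-8")
    return base64.urlsafe_b64encode(raw).decode("ascii")

def load_session(token):
    return json.loads(base64.urlsafe_b64decode(token))

state = {
    "viewport": [-113.50, 53.51, -113.43, 53.56],
    "filters": {"kinds": ["school"]},
    "persistent": ["record-42"],
}
token = save_session(state)
assert load_session(token) == state   # recipient sees the same map state
```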
We have been maintaining Sweet Tools, AI3’s listing of semantic Web and -related tools, for a bit over five years now. Though we switched to a structWSF-based framework that allows us to update it on a more regular, incremental schedule, like all databases the listing needs to be reviewed and cleaned up periodically. We have just completed the most recent cleaning and update. We are also now committing to do so on an annual basis.
Thus, this is the inaugural ‘State of Tooling for Semantic Technologies‘ report, and, boy, is it a humdinger. There have been more changes — and more important changes — in this past year than in all four previous years combined. I think it fair to say that semantic technology tooling is now reaching a mature state, the trends of which likely point to future changes as well.
In this past year more tools have been added, more tools have been dropped (or abandoned), and more tools have taken on a professional, sophisticated nature. Further, for the first time, the number of semantic technology and -related tools has passed 1000. This is remarkable, given that more tools have been abandoned or retired than ever before.
We first present our key findings and then overall statistics. We conclude with a discussion of observed trends and implications for the near term.
Some of the key findings from the 2011 State of Tooling for Semantic Technologies are:
Many of these points are elaborated below.
The updated Sweet Tools listing now includes nearly 50 different tools categories. The most prevalent categories, each with over 6% of the total, are information extraction, general RDF tools, ontology tools, browser tools (RDF, OWL), and parsers or converters. The relative share by category is shown in this diagram (click to expand):
Since the last listing, the fastest growing categories have been SPARQL, linked data, knowledge bases and all things related to ontologies. The relative changes by tools category are shown in this figure:
Though it is true that some of this growth is the result of discovery, based on our own tool needs and investigations, we have also been monitoring this space for some time, and serendipity alone is not a compelling explanation. Rather, I think that we are seeing both an increase in practical tools (such as for querying), plus the trends of linked data growth matched with greater sophistication in areas such as ontologies and the OWL language.
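The relative shares behind diagrams like these reduce to a simple tally. The counts below are made up for illustration:

```python
# Sketch: computing relative category shares from raw tool counts.

def shares(counts):
    """Percentage share of each category, rounded to one decimal."""
    total = sum(counts.values())
    return {cat: round(100.0 * n / total, 1) for cat, n in counts.items()}

counts = {"information extraction": 80, "general RDF tools": 70,
          "ontology tools": 65, "everything else": 785}
print(shares(counts))   # percentages summing to ~100
```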
The languages these tools are written in have also been pretty constant over the past couple of years, with Java remaining dominant. Java has represented half of all tools in this space, which continues with the most recent tools as well (see below). More than a dozen programming or scripting languages have at least some share of the semantic tooling space (click to expand):
With only 160 new tools it is hard to draw firm trends, but it does appear that some languages (Haskell, XSLT) have fallen out of favor, while popularity has grown for Flash/Flex (from a small base), Python and Prolog (with the growth of logic tools):
PHP will likely continue to see some emphasis because of relations to many content management systems (WordPress, Drupal, etc.), though both Python and Ruby seem to be taking some market share in that area.
The higher incidence of Prolog is likely due to the parallel increase in reasoners and inference engines associated with ontology (OWL) tools.
The increase in comprehensive tool suites and use of Eclipse as a development environment would appear to secure Java’s dominance for some time to come.
These dry statistics tend to mask the feel one gets when looking at most of the individual tools across the board. Older academic and government-funded project tools are finally getting cleaned out and abandoned. Those tools that remain have tended to get some version upgrades and improved Web sites to accompany them.
The general feel one gets with regard to semantic technology tooling at the close of 2011 has these noticeable trends:
I have said this before, and been wrong about it before, but it is hard to see the tooling growth curve continue at its current slope into the future. I think we will see many individual tools spring up on open source hosting sites like Google Code and GitHub, perhaps at relatively the same steady release rate. But I think older projects will increasingly be abandoned and will not remain available for as long a time. While a relatively few established open source standards, like Solr and Jena, will be the exception, I think we will see shorter shelf lives for most open source tools moving forward. This will lead to a younger tools base than was the case five or more years ago.
I also think we will continue to see the dominance of open source. Proprietary software has increasingly been challenged in the enterprise space. And, especially in semantic technologies, we tend to see many open source tools that are as capable as proprietary ones, and generally more dynamic as well. The emphasis on open data in this environment also tends to favor open source.
Yet, despite the professionalism, sophistication and complexity trends, I do not yet see massive consolidation in the semantic technology space. While we are seeing a rapid maturation of tooling, I don’t think we have yet seen a similar maturation in revenue and business models. While notable semantic technology start-ups like Powerset and Siri have been acquired and are clear successes, these wins still remain much in the minority.
We have been touting the importance of OWL 2 as the language of choice for federating and reasoning over RDF and ontologies. An absolutely essential enabler of the OWL 2 language is version 3 of the OWL API (actually, version 3.2.4 at the time of this writing), a Java-based framework for accessing and managing the language. Protégé 4, the most popular open source ontology editor and integrated development environment (IDE), for example, is built around the OWL API.
As we laid out a bit more than a year ago, now codified on our TechWiki as the Normative Landscape of Ontology Tools (especially the second figure), we see the OWL API as the essential pivot point for all forms of ontology tools moving forward.
We have attempted to assemble a definitive and comprehensive list of all known tools presently based around version 3 of the OWL API. (We have surely missed some and welcome comments to this post that identify missing ones; we promise to add them and keep tracking them.) Herein is a listing of the 30 or so known OWL API-based tools:
Ignazio Palmisano also graciously suggested these additional sources:
which also further leads to this additional listing:
It is not clear if all of these offer OWL 2 support, let alone work with the current OWL API.
Though I never intended it, some posts of mine from a few years back dealing with 26 tools for large-scale graph visualization have been some of the most popular on this site. Indeed, my recommendation for Cytoscape for viewing large-scale graphs ranks within the top 5 posts all time on this site.
When that analysis was done in January 2008 my company was in the midst of needing to process the large UMBEL vocabulary, which now consists of 28,000 concepts. Like anything else, need drives research and demand, and after reviewing many graphing programs, we chose Cytoscape, then provided some ongoing guidelines in its use for semantic Web purposes. We have continued to use it productively in the intervening years.
Like for any tool, one reviews and picks the best at the time of need. Most recently, however, with growing customer usage of large ontologies and the development of our own structOntology editing and managing framework, we have begun to butt up against the limitations of large-scale graph and network analysis. With this post, we announce our new favorite tool for semantic Web network and graph analysis — Gephi — and explain its use and showcase a current example.
Three and one-half years ago when I first wrote about Cytoscape, it was at version 2.5. Today, it is at version 2.8, and many aspects have seen improvement (including its Web site). However, in other respects, development has slowed. For example, version 3.x was first discussed more than three years ago; it is still not available today.
Though the system is open source, Cytoscape has also largely been developed with external grant funds. Like other similarly funded projects, once and when grant funds slow, development slows as well. While there has clearly been an active community behind Cytoscape, it is beginning to feel tired and a bit long in the tooth. From a semantic Web standpoint, some of the limitations of the current Cytoscape include:
Undoubtedly, were we doing semantic technologies in the biomedical space, we might well develop our own plug-ins and contribute to the Cytoscape project to help overcome some of these limitations. But, because I am a tools geek (see my Sweet Tools listing with nearly 1000 semantic Web and -related tools), I decided to check out the current state of large-scale visualization tools and see if any had made progress on some of our outstanding objectives.
There are three classes of graph tools in the semantic technology space:
One could argue that the first two categories have received the most current development attention. But, I would also argue that the third class is one of the most critical: to understand where one is in a large knowledge space, much better larger-scale visualization and navigation tools are needed. Unfortunately, this third category is also the one that appears to be receiving the least development attention. (To be sure, large-scale graphs pose computational and performance challenges.)
In the nearly four years since my last major survey of 26 tools in this category, the new entrants appear quite limited. I’ve surely overlooked some, but the most notable are Gruff, NAViGaTOR, NetworkX and Gephi. Gruff actually appears to belong most in Category #2; I could find no examples of graphs on the scale of thousands of nodes. NAViGaTOR is biomedical only. NetworkX has no direct semantic graph importing and — while apparently some RDF libraries can be used for manipulating imports — alternative workflows were too complex for me to tackle for initial evaluation. This leaves Gephi as the only potential new candidate.
From a clean Web site to well-designed intro tutorials, first impressions of Gephi are strongly positive. The real proof, of course, was getting it to perform against my real use case tests. For that, I used a “big” ontology for a current client that captures about 3000 different concepts and their relationships and more than 100 properties. What I recount here — from first installing the program and plug-ins and then setting up, analyzing, defining display parameters, and then publishing the results — took me less than a day from a totally cold start. The Gephi program and environment is surprisingly easy to learn, aided by some great tutorials and online info (see concluding section).
The critical enabler for being able to use Gephi for this source and for my purposes is the SemanticWebImport plug-in, recently developed by Fabien Gandon and his team at Inria as part of the Edelweiss project. Once the plug-in is installed, you need only open up the SemanticWebImport tab, give it the URL of your source ontology, and pick the Start button (middle panel):
Once loaded, an ontology (graph) can be manipulated with a conventional IDE-like interface of tabs and views. In the right-hand panels above we are selecting various network analysis routines to run, in this case Average Degrees. Once one or more of these analysis options is run, we can use the results to then cluster or visualize the graph; the upper left panel shows highlighting the Modularity Class, which is how I did the community (clustering) analysis of our big test ontology. (When run you can also assign different colors to the cluster families.) I also did some filtering of extraneous nodes and properties at this stage and also instructed the system via the ranking analysis to show nodes with more link connections as larger than those nodes with fewer links.
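The two analysis steps mentioned — an average-degree run, and ranking nodes by link count so that better-connected nodes are drawn larger — can be sketched on a toy graph. The node names below are invented for illustration:

```python
# Sketch of a degree analysis: count each node's links, then use the
# counts both for an average-degree statistic and to rank node sizes.

def degrees(edges):
    """Tally how many edges touch each node."""
    deg = {}
    for a, b in edges:
        deg[a] = deg.get(a, 0) + 1
        deg[b] = deg.get(b, 0) + 1
    return deg

edges = [("Thing", "Place"), ("Thing", "Event"),
         ("Place", "City"), ("Place", "Park")]
deg = degrees(edges)
avg = sum(deg.values()) / len(deg)
print(avg)                        # average degree of the graph
largest = max(deg, key=deg.get)   # the node drawn largest
print(largest)
```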
At this juncture, you can also set the scale for varying such display options as linear or some power function. You can also select different graph layout options (lower left panel). There are many layout plug-in options for Gephi. The layout plugin called OpenOrd, for instance, is reported to be able to scale to millions of nodes.
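The linear-versus-power scaling choice amounts to how a node’s degree is mapped to a drawn size. A minimal sketch, with an arbitrary pixel range and exponent:

```python
# Sketch of display scaling: map a node's degree to a drawn size.
# exponent=1.0 is linear; a fractional exponent (e.g. 0.5) compresses
# the largest hubs and lifts small nodes.

def node_size(degree, max_degree, min_px=4, max_px=40, exponent=1.0):
    frac = (degree / max_degree) ** exponent
    return min_px + frac * (max_px - min_px)

print(node_size(10, 10))                 # the largest hub gets max size
print(node_size(1, 100, exponent=0.5))   # power scaling lifts small nodes
```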
At this point I played extensively with the combination of filters, analysis, clusters, partitions and rankings (as may be separately applied to nodes and edges) to: 1) begin to understand the gross structure and characteristics of the big graph; and 2) refine the ultimate look I wanted my published graph to have.
In our example, I ultimately chose the standard Yifan Hu layout in order to get the communities (clusters) to aggregate close to one another on the graph. I then applied the Parallel Force Atlas layout to organize the nodes and make the spacings more uniform. The parallel aspect of this force-based layout allows these intense calculations to run faster. The result of these two layouts in sequence is then what was used for the results displays.
Upon completion of this analysis, I was ready to publish the graph. One of the best aspects of Gephi is its flexibility and control over outputs. Via the main Preview tab, I was able to do my final configurations for the published graph:
Standard output options include either SVG (vector image) or PDFs, as shown at the lower left, with output size scaling via slider bar. Also, it is possible to do standard saves under a variety of file formats or to do targeted exports.
One really excellent publication option is to create a dynamically zoomable display using the Seadragon technology via a separate Seadragon Web Export plug-in. (However, because of cross-site scripting limitations due to security concerns, I only use that option for specific sites. See next section for the Zoom It option — based on Seadragon — to workaround that limitation.)
Note: at standard resolution, if this graph were to be rendered at actual size, it would be larger than 7 feet by 7 feet at full zoom!
To compare output options, you may also:
It is notable that Gephi still versions itself only as an “alpha”. There is already a robust user community, with promise for much more technology to come.
As an alpha, Gephi is remarkably stable and well-developed. Though clearly useful as is, I measure the state of Gephi against my complete list of desired functionality, with these items still missing:
Ultimately, of course, as I explained in an earlier presentation on a Normative Landscape for Ontology Tools, we would like to see a full-blown graphical program tie in directly with the OWL API. Some initial attempts toward that have been made with the non-Gephi GLOW visualization approach, but it is still in very early phases with ongoing commitments unknown. Optimally, it would be great to see a Gephi plug-in that ties directly to the OWL API.
In any event, while perhaps Cytoscape development has stalled a bit for semantic technology purposes, Gephi and its SemanticWebImport plug-in have come roaring into the lead. This is a fine toolset that promises usefulness for many years to come.
To learn more about Gephi, also see the:
Also, for future developments across the graph visualization spectrum, check out the Wikipedia general visualization tools listing on a periodic basis.
Sweet Tools, AI3’s listing of semantic Web and -related tools, has just been released with its 17th update. The listing now contains more than 900 tools, about a 10% increase over the last version. Significantly, the listing is also now presented via its own semantic tool, the structSearch sComponent, which is one of the growing parts of Structured Dynamics’ open semantic framework (OSF).
So, we invite you to go ahead and try out this new Flex/Flash version with its improved search and filtering! We’re pretty sure you’ll like it.
Sweet Tools now lists 919 tools, an increase of 84 (or 10.1%) over the prior version of 835 tools. The most notable trend is the continued increase in capabilities and professionalism of (some of) the new tools.
This new release of Sweet Tools — available for direct play and shown in the screenshot to the right — is the first to be presented via Structured Dynamics’ Flex-based semantic component technology. The system has greatly improved search and filtering capabilities; it also shares the superior dataset management and import/export capabilities of its structWSF brethren.
As a result, moving forward, Sweet Tools updates will now be added on a more regular basis, reducing the big burps that past releases have tended to follow. We will also see much expanded functionality over time as other pieces of the structWSF and sComponents stack get integrated and showcased using this dataset.
This release is the first in WordPress, and shows the broad capabilities of the OSF stack to be embedded in a variety of CMS or standalone systems. We have provided some updates on Structured Dynamics’ OSF TechWiki for how to modify, embed and customize these components with various Flex development frameworks (see one, two or three), such as Flash Builder or FlashDevelop.
However, this release does mark the retirement of the very fine Exhibit version of Sweet Tools (an archive version will be kept available until it gets too long in the tooth). I was one of the first to install a commercial Exhibit system, and the first to do so on WordPress, as I described in an article more than four years ago.
Exhibit has worked great and without a hitch, and through a couple of upgrades. It still has (I think) superior faceting and sorting capabilities compared to what we presently offer with our own sComponent alternative. However, the Exhibit version is really a display technology alone, and offers no search, access control or underlying data management capabilities (such as CRUD), all of which are integral to our current system. It is also not grounded in RDF or semantic technologies, though it does have good structural genes. And, Sweet Tools has about reached the limits of the size of datasets Exhibit can handle efficiently.
Exhibit has set a high bar for usability and lightweight design. As we move in a different direction, I’d like again to publicly thank David Huynh, Exhibit’s developer, and the MIT Simile program for when he was there, for putting forward one of the seminal structured data tools of the past five years.
The updated Sweet Tools listing now includes nearly 50 different tools categories. The most prevalent categories are browser tools (RDF, OWL), information extraction, ontology tools, parsers or converters, and general RDF tools. The relative share by category is shown in this diagram (click to expand):
Since the last listing, the fastest growing categories have been utilities (general and RDF) and visualization. Linked data listings have also grown by 200%, but are still a relatively small percentage of the total.
These values should be taken with a couple of grains of salt. First, not all of these additions are organic or new releases. Some are the result of our own tools efforts and investigations, which can often surface prior overlooked tools. Also, even with this large number of application categories, many tools defy characterization, and can reside in multiple categories at once or are even pointing to new ones. So, the splits are illustrative, but not defining.
General language percentages have been keeping pretty constant over the past couple of years. Java remains the leading language with nearly half of all applications, a percentage it has kept steady for four years. PHP continues to grow in popularity, and actually increased the largest percentage amount of any language over this past census. The current language splits are shown in the next diagram (click to expand):
C/C++ and C# have really not grown at all over the past year. Again, however, for the reasons noted, these trends should be interpreted with care.
Tools development is hard and the open source nature of today’s development tends to require a certain critical mass of developer interest and commitment. There are some notable tools that have much use and focus and are clearly professional and industrial grade. Yet, unfortunately, too many of the tools on the Sweet Tools listing are either proofs-of-concept, academic demos, or largely abandoned because of lack of interest by the original developer, the community or the market as a whole.
There is a common statement within the community about how important it is for developers to “eat their own dogfood.” On the face of it, this makes some sense since it conveys a commitment to use and test applications as they are developed.
But looked at more closely, this sentiment carries with it a troublesome reflection of the state of (many) tools within the semantic Web: too much kibble that is neither attractive nor tasty. It is probably time to keep the dogfood in the closet and focus on well-cooked and attractive fare.
We at Structured Dynamics are not trying to hold ourselves up as exemplars or the best chefs of tasty food. We do, however, have a commitment to produce fare that is well prepared and professional. Let’s stop with the dogfood and get on with serving nutritious and balanced fare to the marketplace.