We are pleased to announce the release of version 1.05 of UMBEL, which now has linkages to schema.org and GeoNames. UMBEL has also been split into ‘core’ and ‘geo’ modules. The resulting smaller size of UMBEL ‘core’, now some 26,000 reference concepts, has also enabled us to create a full visualization of UMBEL’s content graph.
The first notable change in UMBEL v. 1.05 is its mapping to schema.org. schema.org is a collection of schemas (usable as HTML tags) that webmasters can use to mark up their pages in ways recognized by major search providers. schema.org was first developed and organized by the major search engines Bing, Google and Yahoo!; Yandex later joined as a sponsor. Many groups now support schema.org and contribute vocabularies and schemas.
I was one of the first to hail schema.org, hours after its announcement. It seemed only fair that we put our money where our mouth is and map UMBEL to it as well.
The UMBEL-schema.org mapping was done manually. We first searched and inspected the current UMBEL concept base for appropriate matches. If that failed to find a reasonably direct correspondence between existing UMBEL concepts and a schema.org type, we then inspected OpenCyc, the source concept reference, in a similar manner. Failing a match from either of these two sources, we added a new concept to the ‘core’ UMBEL and placed it appropriately into the UMBEL reference concept subject structure.
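The lookup order described above amounts to a simple fallback chain. The sketch below is only an illustration of that order: the mapping itself was done by hand, and the `map_type` helper, the two dictionaries, and the `rc:`/`cyc:` identifiers are hypothetical names introduced here.

```python
from typing import Dict, Tuple


def map_type(schema_type: str,
             umbel_concepts: Dict[str, str],
             opencyc_concepts: Dict[str, str]) -> Tuple[str, str]:
    """Return (concept, source) for a schema.org type using the fallback order."""
    # 1. Prefer an existing UMBEL reference concept.
    if schema_type in umbel_concepts:
        return umbel_concepts[schema_type], "umbel"
    # 2. Otherwise fall back to the OpenCyc source concepts.
    if schema_type in opencyc_concepts:
        return opencyc_concepts[schema_type], "opencyc"
    # 3. Otherwise propose a new 'core' concept; it still has to be
    #    placed by hand into the reference concept structure.
    return schema_type, "new"


# Hypothetical lookup tables, for illustration only.
umbel = {"Person": "rc:Person"}
opencyc = {"Winery": "cyc:Winery"}

for t in ("Person", "Winery", "TattooParlor"):
    print(t, "->", map_type(t, umbel, opencyc))
```

The third branch only flags a candidate; deciding where a new concept belongs in the subject structure remains a manual judgment.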
The net result of this process was to add 298 mapped schema.org types to UMBEL. This mapping required a further three concepts from OpenCyc, and a further 78 new reference concepts, to be added to UMBEL. The Key Files section below provides links to the updated UMBEL files, the mappings, and further explanation. We are reserving the addition of schema.org properties for a later time, when we plan to re-organize the Attributes SuperType within UMBEL.
Modularization of the UMBEL Vocabulary
Even in the early development of UMBEL there was a tension about the scope and level of geographic information to include in its concept base. The initial decision was to support countries, the provinces and states of leading countries, and some leading cities. This decision was in the spirit of a general reference structure, but still felt arbitrary.
GeoNames is devoted to geographical information and concepts — both natural and human artifacts — and has become the go-to resource for geo-locational information. The decision was thus made to split out the initial geo-locational information in UMBEL and replace it with mappings to GeoNames. This decision also had the advantage of beginning a process of modularization of UMBEL.
Two sets of reference concepts were identified as useful for splitting out from the ‘core’ UMBEL in a geo-locational aspect:
- Geopolitical places and places of human activities and facilities
- Natural geographical places and features.
These removed concepts were then placed into a separate ‘geo’ module of UMBEL, including all existing annotations and relations, resulting in a module of 1,854 concepts. That left 26,046 concepts in UMBEL ‘core’. Because of some shared parent concepts, there is some minor overlap between the two modules. These are now the modular splits in UMBEL version 1.05.
Mapping to GeoNames
GeoNames has a different structure from UMBEL. It has few classes and distinguishes its geographic information on the basis of some 671 feature codes. These codes range from geopolitical divisions (such as countries, states or provinces, cities, or other administrative districts) to splits and aggregations by natural and human features. Types of physical terrain, above ground and underwater, are denoted, as well as regions and landscape features governed by human activities (such as vineyards or lighthouses). We wanted to retain this richness in our mappings.
We needed a bridge between feature codes and classes, a sort of umbrella property generally equivalent to owl:sameAs in nature, but with some possible inexactitude or degree of approximation. The appropriate choice here is umbel:correspondsTo, which was designed specifically for this purpose. This predicate is thus the basis for the mappings.
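As a rough illustration of how such an approximate bridge behaves, the sketch below stores a few mappings as plain triples and reads umbel:correspondsTo in either direction. The `rc:` concept names and `gn:` feature-code identifiers are illustrative placeholders, not copied from the mapping file, and the `correspondents` helper is hypothetical.

```python
from typing import List, Set, Tuple

Triple = Tuple[str, str, str]
CORRESPONDS_TO = "umbel:correspondsTo"

# Placeholder mappings for illustration; the real ones live in the
# UMBEL-GeoNames mapping file.
MAPPINGS: List[Triple] = [
    ("rc:Lighthouse", CORRESPONDS_TO, "gn:S.LTHSE"),
    ("rc:Vineyard", CORRESPONDS_TO, "gn:L.VIN"),
]


def correspondents(term: str, triples: List[Triple]) -> Set[str]:
    """Terms linked to `term` by umbel:correspondsTo, in either direction."""
    found: Set[str] = set()
    for subject, predicate, obj in triples:
        if predicate != CORRESPONDS_TO:
            continue
        if subject == term:
            found.add(obj)
        elif obj == term:
            found.add(subject)
    return found


print(correspondents("gn:S.LTHSE", MAPPINGS))
print(correspondents("rc:Vineyard", MAPPINGS))
```

Treating the link as symmetric lets a lookup start from either a feature code or a class, without asserting the strict identity that owl:sameAs would imply.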
The 671 GeoNames feature codes were manually mapped to corresponding classes in the UMBEL concepts, in a manner identical to that described for schema.org above. The result was to add a further three OpenCyc concepts and 88 new UMBEL reference concepts to accommodate the full set of GeoNames feature codes. We thus now have a complete representation of the full structure and scope of GeoNames in UMBEL.
There are three modes in which one can now work with UMBEL:
- UMBEL ‘core’ alone, recommended when your concept space is not concerned with geographical information
- UMBEL ‘core’ plus the UMBEL ‘geo’ module, equivalent to prior versions of UMBEL, or
- UMBEL ‘core’ plus GeoNames, recommended where geographical information is important to your concept space.
In the last case, you may use SPARQL queries with the umbel:correspondsTo predicate to achieve the desired retrievals. If more logic is required, you will likely need to look to a rules-based addition such as SWRL or RIF.
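A query of this kind might look like the following minimal sketch, which only builds the SPARQL text and does not run it. The `corresponds_to_query` helper is hypothetical, and both the `umbel:` namespace URI and the example GeoNames feature-code URI are assumptions to verify against the published files. Because the direction in which the mapping file asserts the predicate is not stated here, the query matches both directions.

```python
# Assumed namespace for the umbel: prefix; check it against the UMBEL files.
UMBEL_NS = "http://umbel.org/umbel#"


def corresponds_to_query(feature_uri: str) -> str:
    """Build a SPARQL SELECT for concepts linked to feature_uri
    through umbel:correspondsTo, matching either direction."""
    return (
        f"PREFIX umbel: <{UMBEL_NS}>\n"
        "SELECT DISTINCT ?concept WHERE {\n"
        f"  {{ ?concept umbel:correspondsTo <{feature_uri}> }}\n"
        "  UNION\n"
        f"  {{ <{feature_uri}> umbel:correspondsTo ?concept }}\n"
        "}"
    )


# Example URI is an assumption about how a feature code is identified.
print(corresponds_to_query("http://www.geonames.org/ontology#S.LTHSE"))
```

The resulting string can be submitted to any SPARQL endpoint or library that has both the UMBEL ‘core’ module and the GeoNames mapping file loaded.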
New Big Graph Visualization
Because of the UMBEL modularization, it has now become tractable to graph the main ontology in its entirety. The core UMBEL ontology contains about 26,000 reference concepts organized according to 33 super types. There are more than 60,000 relationships amongst these concepts, resulting in a graph structure of very large size.
It is difficult to grasp this graph in the abstract. Thus, using methods earlier described in our use of the Gephi visualization software, we present below a dynamic, navigable rendering of this graph of UMBEL core:
Note: at standard resolution, if this graph were rendered at actual size, it would be larger than 34 feet by 34 feet at full zoom! Hint: that is about 1,200 square feet, or half the size of a typical American house!
This UMBEL graph displays:
- All 26,000 concepts (“nodes”) with labels, and with connections shown (though you must zoom to see them), and
- The color-coded relation of these nodes to the 33 or so major SuperTypes in UMBEL, as well as the relative position of these clusters with respect to one another.
When zooming (use the scroll wheel or the + icon) or panning (via mouse-down moves), wait a couple of seconds to get the clearest image refresh.
You may also want to inspect a static version of this big graph by downloading a PDF.
Key Files
- The main UMBEL Web site: http://umbel.org
- The basic UMBEL v. 1.05 release: https://github.com/structureddynamics/UMBEL
- Full UMBEL specifications and annexes: http://techwiki.umbel.org/index.php/Category:Specification
- The UMBEL ‘core’ module: https://github.com/structureddynamics/UMBEL/blob/master/Reference%20Structure/umbel_reference_concepts.n3/
- The UMBEL-schema.org mapping file: https://github.com/structureddynamics/UMBEL/blob/master/External%20Ontologies/schema.org.n3
- schema.org mapping methodology: Annex I: schema.org Mapping
- The UMBEL ‘geo’ module: https://github.com/structureddynamics/UMBEL/blob/master/Modules/geo/umbel_geo.n3
- The UMBEL-GeoNames mapping file: https://github.com/structureddynamics/UMBEL/blob/master/External%20Ontologies/geonames.n3
- Geo methodology: Annex J: Geo Module and GeoNames Mapping
- The interactive UMBEL ‘big graph’: http://umbel.org/content/umbel-graph
Approximate relationships are discussed in M.K. Bergman, 2010. “The Nature of Connectedness on the Web,” AI3:::Adaptive Information blog, November 22, 2010; see http://www.mkbergman.com/935/the-nature-of-connectedness-on-the-web/. One option, for example, is the x:coref predicate from the UMBC Ebiquity group; see further Jennifer Sleeman and Tim Finin, 2010. “Learning Co-reference Relations for FOAF Instances,” Proceedings of the Poster and Demonstration Session at the 9th International Semantic Web Conference, November 2010; see http://ebiquity.umbc.edu/_file_directory_/papers/522.pdf. In the words of Tim Finin of the Ebiquity group:
“... owl:sameAs may lead to contradictions. However, virtually merging the descriptions in a co-reference engine is fine; both provide information that is useful in disambiguating future references as well as for many other purposes. Our property (:coref) is a transitive, symmetric property that is a super-property of owl:sameAs and is paired with another, :notCoref, that is symmetric and generalizes ...”
When we look at the analogous properties noted above, we see that they tend to share reflexivity, symmetry and transitivity. We specifically designed the umbel:correspondsTo predicate to capture these close, nearly equivalent relationships of uncertain degree.