Posted: November 20, 2018

A Knowledge Representation Practionary: Practical Guidance on How to Leverage Knowledge Graphs, Semantic Technologies, and KBpedia

As readers of this blog well know, I am passionate about topics related to semantic technologies, knowledge graphs (ontologies), data structs, and artificial intelligence. Readers also probably know that I have found Charles S. Peirce, the 19th century American logician, scientist, and philosopher, to have remarkable insights on all aspects of knowledge representation. I’m proud to now announce my new book, A Knowledge Representation Practionary: Guidelines Based on Charles Sanders Peirce (Springer), that combines these viewpoints into a comprehensive whole. The 464-page book is available for pre-order from Springer or from Amazon (and others, I’m sure). Formal release is due the second week of December.

Peirce’s practical guidelines and universal categories provide a structured approach to knowledge representation that captures differences in events, entities, relations, attributes, types, and concepts. Besides the ability to capture meaning and context, this Peircean approach is also well-suited to machine learning and knowledge-based artificial intelligence (KBAI). Peirce is a founder of pragmatism, the uniquely American philosophy. We have already used this viewpoint to produce the KBpedia knowledge base and artifact, which we just released as open source. My book combines that viewpoint with the experience that Fred Giasson and I gained over the past decade with commercial clients in semantic and AI technologies. While KBpedia and the book stand on their own and do not depend on each other, they do reference one another, and those with serious interest may find it useful to keep KBpedia open as they progress through the book’s chapters.

I use the term practionary for the book — a decidedly new term — because the Peircean scholar Kelly Parker first coined that term to capture Charles Peirce’s uniquely pragmatic way to fully explicate a particular domain of inquiry. In our case, of course, that domain is knowledge representation, which is shorthand for how to represent human symbolic information and knowledge to computers to solve complex questions. KR applications range from semantic technologies, knowledge management, and machine learning to information integration, data interoperability, and natural language understanding. Knowledge representation is an essential foundation for knowledge-based AI. The practionary approach is a soup-to-nuts way to fully apprehend a given topic. To my knowledge, the book is the first attempt to put this Peircean method and framework into action.

I structure the book into five parts, following Peirce’s own approach. The first and last parts are bookends. The first bookend sets the context and background. The concluding bookend presents practical applications from following the guidelines. In between, the three main parts mirror Peirce’s three universal categories, the meat of his approach. The first of these three addresses the terminologies and grammar of knowledge representation. The next discusses the actual components or building blocks for KR systems. And the third provides what generalities we may derive about how to design, build, test, and follow best practices in putting a system together. Throughout, the book refers to and leverages the open source KBpedia knowledge graph and its public knowledge bases, including Wikipedia and Wikidata. Actual practitioners may find KBpedia, built from the ground up on these Peircean principles, a ready baseline to build their own domain knowledge graph and applications.

Here are the parts and chapters of the book:

Preface vii
 1. Introduction 1
Structure of the Book 2
Overview of Contents 3
Key Themes 10
 2. Information, Knowledge, Representation 15
What is Information? 16
What is Knowledge? 27
What is Representation? 33
Part I: Knowledge Representation in Context
 3. The Situation 45
Information and Economic Wealth 46
Untapped Information Assets 54
Impediments to Information Sharing 61
 4. The Opportunity 65
KM and A Spectrum of Applications 66
Data Interoperability 69
Knowledge-based Artificial Intelligence 74
 5. The Precepts 85
Equal Class Data Citizens 86
Addressing Semantic Heterogeneity 91
Carving Nature at the Joints 97
Part II: A Grammar for Knowledge Representation
 6. The Universal Categories 107
A Foundational Mindset 108
Firstness, Secondness, Thirdness 112
The Lens of the Universal Categories 116
 7. A KR Terminology 129
Things of the World 131
Hierarchies in Knowledge Representation 135
A Three-Relations Model 143
 8. KR Vocabulary and Languages 151
Logical Considerations 153
Pragmatic Model and Language Choices 163
The KBpedia Vocabulary 167
Part III: Components of Knowledge Representation
 9. Keeping the Design Open 183
The Context of Openness 184
Information Management Concepts 193
Taming a Bestiary of Data Structs 200
10. Modular, Expandable Typologies 207
Types as Organizing Constructs 208
A Flexible Typology Design 215
KBpedia’s Typologies 219
11. Knowledge Graphs and Bases 227
Graphs and Connectivity 228
Upper, Domain and Administrative Ontologies 237
KBpedia’s Knowledge Bases 242
Part IV: Building KR Systems
12. Platforms and Knowledge Management 251
Uses and Work Splits 252
Platform Considerations 262
A Web-oriented Architecture 268
13. Building Out The System 273
Tailoring for Domain Uses 274
Mapping Schema and Knowledge Bases 280
‘Pay as You Benefit’ 291
14. Testing and Best Practices 295
A Primer on Knowledge Statistics 296
Builds and Testing 304
Some Best Practices 309
Part V: Practical Potentials and Outcomes
15. Potential Uses in Breadth 319
Near-term Potentials 320
Logic and Representation 327
Potential Methods and Applications 332
16. Potential Uses in Depth 343
Workflows and BPM 343
Semantic Parsing 349
Cognitive Robotics and Agents 361
17. Conclusion 371
The Sign and Information Theoretics 372
Peirce: The Philosopher of KR 373
Reasons to Question Premises 377
Appendix A: Perspectives on Peirce 381
Appendix B: The KBpedia Resource 409
Appendix C: KBpedia Feature Possibilities 421
Glossary 435
Index 451

My intent is to produce a book of enduring, practical guidelines for how to think about KR and to design knowledge management (KM) systems. I emphasize how-to guidance and ways to think about KR problems. The audience I have in mind is enterprise information and knowledge managers who are contemplating a new knowledge initiative. However, early reviewers have told me the basics are useful to students and researchers at all levels.

I am not even-handed in this book. My explicit purpose is to offer a fresh viewpoint on KR as informed by Peirce and our experience in building systems. For more balanced treatments, I recommend the excellent reference texts by van Harmelen et al. or Brachman and Levesque. Still, for those looking at the practical side of things, I hope this book may become an essential addition to theory and practice for KR and semantic technology. Peirce had a profound understanding of meaning and context that I believe is of benefit to knowledge management practitioners and AI researchers alike.

Individuals with a Springer subscription may get a softcover copy of the e-book for $24.99 under Springer’s MyCopy program. The standard e-book is available for $129 and hardcover copies are available for $169; see the standard Springer order site. Students or individuals without Springer subscriptions who cannot afford these prices should contact me directly for possible alternatives.

Posted: November 13, 2018

KBpedia: Better Mappings, More Properties

When we released KBpedia v 1.60 as open source a couple of weeks back, I noted that I would follow up the announcement with more details on the changes made in preparation for the release. This post provides that update.

KBpedia is a computable knowledge structure that combines seven major public knowledge bases — Wikipedia, Wikidata, schema.org, DBpedia, GeoNames, OpenCyc, and UMBEL. KBpedia supplements these core KBs with mappings to more than a score of additional leading vocabularies. The entire KBpedia structure is computable, meaning it can be reasoned over and logically sliced-and-diced to produce training sets and reference standards for machine learning and data interoperability. KBpedia provides a coherent overlay for retrieving and organizing Wikipedia or Wikidata content. KBpedia greatly reduces the time and effort traditionally required for knowledge-based artificial intelligence (KBAI) tasks.
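To give a flavor of what ‘computable’ means in practice, here is a minimal Python sketch (using rdflib) of one such slice: gathering every reference concept beneath a chosen node to seed a training-set vocabulary. The file name and the concept IRI are illustrative assumptions; adjust them to the actual KBpedia distribution.

```python
# A minimal sketch: slice one KBpedia subtree into a list of concepts
# that could seed a training-set vocabulary. The file name and root
# IRI are assumptions; adjust to the actual KBpedia distribution.
from rdflib import Graph, RDFS, URIRef

g = Graph()
g.parse("kbpedia_reference_concepts.n3", format="n3")  # assumed file name

# Illustrative root; substitute any KBpedia reference concept IRI
root = URIRef("http://kbpedia.org/kko/rc/Mammal")

# Walk rdfs:subClassOf transitively to gather the full subtree
subtree = set(g.transitive_subjects(RDFS.subClassOf, root))
print(f"{len(subtree)} reference concepts under the chosen root")
```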

KBpedia is a comprehensive knowledge structure for promoting data interoperability and KBAI. KBpedia’s upper structure, the KBpedia Knowledge Ontology (KKO), is based on the universal categories and knowledge representation theories of the great 19th century American logician, philosopher, polymath and scientist, Charles Sanders Peirce. This design provides a logical and coherent underpinning to the entire KBpedia structure. The design is also modular and fairly straightforward to adapt to enterprise or domain purposes. KBpedia was first released in October 2016. My initial announcement provides further details on KBpedia and how to download it.

Besides prepping the KBpedia knowledge artifact for open-source release, we also made these improvements to the base structure in comparison to the prior v 1.51, the last proprietary version:

  • The major effort was to increase the mapping to Wikidata, with most mappings represented as owl:equivalentClass. Coverage of KBpedia to Wikidata is now 50%, with 27,423 of KBpedia’s reference concepts now mapped to Wikidata. Version 1.60 has 4.5x more coverage than the previous v. 1.51
  • We also continued to increase coverage to Wikipedia, with coverage now at 77%
  • We now have essentially complete coverage of the DBpedia ontology, schema.org and GeoNames
  • We doubled the number of mapped properties to nearly 5 K and added schema.org property mappings
  • We organized the properties into attributes, indexes, and external relations.

Please note that we measure coverage as the larger of the percentage of external concepts mapped or the percentage of KBpedia mapped to the external source. The % Change figures represent the changes from v 1.51 to the new open source v 1.60.
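For those who want to verify mapping counts like these themselves, below is a sketch of one way to tally the owl:equivalentClass linkages per external source; the linkage file name is an assumption, and the actual distribution may split mappings across several files.

```python
# A sketch of tallying KBpedia's external mappings by source, using
# rdflib. The linkage file name is an assumption; KBpedia distributes
# its external mappings in N3 format.
from collections import Counter
from rdflib import Graph
from rdflib.namespace import OWL

g = Graph()
g.parse("kbpedia_links.n3", format="n3")  # assumed linkage file name

counts = Counter()
for _, _, obj in g.triples((None, OWL.equivalentClass, None)):
    iri = str(obj)
    # Bucket each mapping by the host name of the external IRI
    host = iri.split("/")[2] if iri.startswith("http") else "other"
    counts[host] += 1

for host, n in counts.most_common():
    print(f"{host}: {n:,} mappings")
```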

Besides the property organization, we made few changes in this latest v 1.60 release to the overall structure or scope of KBpedia. The emphasis was on mapping to existing sources and cleanup for public release. Here are the major statistics for v 1.60:

Structure                     Value    % Change   Coverage
No. of RCs                   54,867        2.7%
  KKO                           173       -0.6%
  Standard RCs               54,694        2.7%
No. of mapped vocabularies       23      -14.8%
  Core KBs                        7       16.7%
  Extended vocabs                16      -23.8%
No. of typologies                68        7.9%
  Core entity types              33        0.0%
  Other core types                5        0.0%
  Extended types                 30       20.0%
No. of properties             4,847       92.4%
RC Mappings                 139,311       21.1%
  Wikipedia                  42,108        4.3%        77%
  Wikidata                   27,423      446.2%        50%
  schema.org                    845       15.1%        99%
  DBpedia ontology              764        0.0%        99%
  GeoNames                      918        0.0%        99%
  OpenCyc                    33,526        0.0%        61%
  UMBEL                      33,478        0.0%        99%
  Extended vocabs               249       -4.2%
Property Mappings             4,847       92.4%
  Wikidata                    3,970       57.6%
  schema.org                      877        N/A

Through its mapped sources, KBpedia links to more than 30 million entities, the largest percentage coming from Wikidata. The mappings to these external sources are provided in the external-resources linkage file in the KBpedia downloads. (A larger inferred version is also available.) The external sources maintain their own instance records; KBpedia distributions provide the links. However, you can access these entities through the KBpedia explorer on the project’s Web site (see these entity examples for cameras, cakes, and canyons; clicking on any of the individual entity links will bring up the full instance record).
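As a minimal sketch of following one of these linkages out to its full instance record programmatically, the snippet below fetches a Wikidata entity once its Q-identifier has been read from the linkage file; the identifier shown is purely illustrative.

```python
# A sketch of following a KBpedia-to-Wikidata mapping out to the full
# instance record. The Q-identifier is a placeholder; in practice it
# would be read from the KBpedia linkage file.
import requests

qid = "Q42"  # placeholder identifier, purely for illustration
url = f"https://www.wikidata.org/wiki/Special:EntityData/{qid}.json"
record = requests.get(url, timeout=30).json()

entity = record["entities"][qid]
print(entity["labels"]["en"]["value"])        # English label
print(entity["descriptions"]["en"]["value"])  # short description
```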

Please know that KBpedia remains under active development, with new updates anticipated in the near future. We are incorporating feedback gained from the initial open source release, and are also committed to increasing the mapping coverage for the artifact and other baseline improvements. Our plan is to complete this baseline before new external sources are added to the system.

KBpedia is available under the Creative Commons Attribution 4.0 International (CC BY 4.0) license. KBpedia’s development to date has been sponsored by Cognonto Corporation.

Posted: October 23, 2018

KBpedia: A Major Milestone in Semantic Technologies and AI After a Decade of Effort

Fred Giasson and I are very (no, make that supremely!) pleased to announce the availability of KBpedia as open source. Woohoo! The complete open source KBpedia includes its upper ontology (KKO), full knowledge graph, mappings to major leading knowledge bases, and 68 logical concept groupings called typologies. We are also today announcing version 1.60 of KBpedia, with greatly expanded mappings.

For those who have been following our work, it should be clear that this release represents the culmination of more than ten years of steady development. KBpedia is the second-generation knowledge graph successor to UMBEL, which we will now begin to retire. KBpedia, when first released in 2016, only provided its upper portion, the KBpedia Knowledge Ontology (KKO), as open source. While we had some proprietary needs in the first years of the structure, we’re really pleased to return to our roots in open source semantic technologies and software. Open source brings greater contributions and greater scrutiny, both important to growth and improvements.

KBpedia is a computable knowledge structure that combines seven major public knowledge bases — Wikipedia, Wikidata, schema.org, DBpedia, GeoNames, OpenCyc, and UMBEL. KBpedia supplements these core KBs with mappings to more than a score of additional leading vocabularies. The entire KBpedia structure is computable, meaning it can be reasoned over and logically sliced-and-diced to produce training sets and reference standards for machine learning and data interoperability. KBpedia provides a coherent overlay for retrieving and organizing Wikipedia or Wikidata content. KBpedia greatly reduces the time and effort traditionally required for knowledge-based artificial intelligence (KBAI) tasks.

KBpedia is a comprehensive knowledge structure for promoting data interoperability and KBAI. KBpedia’s upper structure, KKO, is based on the universal categories and knowledge representation theories of the great 19th century American logician, polymath and scientist, Charles Sanders Peirce. This design provides a logical and coherent underpinning to the entire structure. The design is also modular and fairly straightforward to adapt to enterprise or domain purposes. KBpedia was first released in October 2016.

“We began KBpedia with machine learning and AI as the driving factors,” said Fred, also the technical lead on the project. “Those remain challenging, but we are also seeing huge demands to bring a workable structure that can leverage Wikidata and Wikipedia,” he said. “We are seeing the convergence of massive public data with open semantic technologies and the ideas of knowledge graphs to show the way,” he stated. Here are some of the leading purposes and use cases for KBpedia:

    • A coherent and computable overlay to both Wikipedia and Wikidata
    • Integrating domain data
    • Fine-grained entity identification, extraction and tagging
    • Faceted, semantic search and retrieval
    • Mapping and integration of external datasets
    • Natural language processing and computational linguistics
    • Knowledge graph creation, extension and maintenance
    • Tailored filtering, slicing-and-dicing, and extraction of domain knowledge structures
    • Data harvesting, transformation and ingest
    • Data interoperability, re-use of existing content and data assets, and knowledge discovery
    • Supervised, semi-supervised and distantly supervised machine learning for:
      • Typing, classification, extraction, and tagging of entities, attributes and relations
    • Unsupervised and deep learning.

The KBpedia Web site provides a working KBpedia explorer and a demo of how the system may be applied to local content for tagging or analysis. KBpedia splits things between entities and concepts, on the one hand, and splits predicates into attributes, external relations, and pointers or indexes, on the other, all informed by Charles Peirce’s prescient theories of knowledge representation. I will have much more to say about the project and its relation to Peirce in the coming weeks.

The new v 1.60 release of KBpedia has 55,000 reference concepts in its guiding knowledge graph, which ties into an estimated 30 million entities, mostly from Wikidata. The system is inherently multi-lingual, though the current release is in English only. We hope to see multiple language versions emerge, which should be straightforward given the dominance of links from Wikipedia and Wikidata. As it stands, the core structure of KBpedia provides direct links to millions of external reference sources. A subsequent post will document the changes in version 1.60 in detail.

With this open source release, we will next shift our attention to expanding the coverage of links to external sources. By moving to open source, we hope to see problems with the structure surface, as well as contributions come from others. When you pull back the curtain with open source, a premium gets placed on clean assignments and a structure that can stand up to inspection. Fortunately, Fred has designed a build system that starts with clean ‘triples’ input files. We make changes, re-run the structure against logic and consistency tests, fix the issues, and run again. We conducted tens of builds of the complete KBpedia structure in the transition from the prior versions to the current release. While we have a top-down design based on Peirce, we build the entire structure from the bottom up from these simple input specifications. The next phase in our KBpedia release plan is to release these build routines as open source as well.
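Our build routines are not yet public, but the flavor of the logic and consistency step can be sketched with an off-the-shelf OWL reasoner. The file name below is an assumption, and owlready2’s bundled HermiT reasoner stands in for our actual test harness.

```python
# A sketch of the kind of logic and consistency test run on each build,
# using owlready2's bundled HermiT reasoner (requires Java). The file
# name is an assumption; the actual build routines are not yet released.
from owlready2 import get_ontology, sync_reasoner, default_world

onto = get_ontology("file://./kbpedia_reference_concepts.owl").load()

with onto:
    sync_reasoner()  # classify the ontology and check satisfiability

# Classes inferred equivalent to owl:Nothing signal logical problems
problems = list(default_world.inconsistent_classes())
for cls in problems:
    print("Unsatisfiable:", cls)
print("clean build" if not problems else f"{len(problems)} classes to fix")
```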

Though tremendous strides have been made in the past decade in leveraging knowledge bases for artificial intelligence, we are butting up against two limitations. Our first problem is that we are relying on knowledge sources like Wikipedia that were never designed for AI or data integration purposes. The second problem is that we do not have repeatable building blocks that can be extended to any domain or any enterprise. AI is sexy and attractive, but way too expensive. We hope the current open source release of KBpedia moves us closer to overcoming these problems.

Downloads

Here are the various KBpedia resources that you may download or use with attribution:

    • The complete KBpedia knowledge graph (7 MB, zipped). This download is likely your most useful starting point
    • KBpedia’s upper ontology, KKO (304 KB), which is easily inspected and navigated in an editor
    • The annotated KKO (291 KB). This is NOT an active ontology, but it has the upper concepts annotated to more clearly show the Peircean categories of Firstness (1ns), Secondness (2ns), and Thirdness (3ns)
    • The 68 individual KBpedia typologies in N3 format
    • The KBpedia mappings to the seven core knowledge bases and the additional extended knowledge bases in N3 format
    • A version of the full KBpedia knowledge graph extended with linkages to the external resources (8.7 MB, zipped), and
    • A version of the full KBpedia knowledge graph extended with inferences and linkages (11.6 MB, zipped).

The last two resources require time and sufficient memory to load. We invite and welcome contributions or commentary on any of these resources.

All resources are available under the Creative Commons Attribution 4.0 International (CC BY 4.0) license. KBpedia’s development to date has been sponsored by Cognonto Corporation.

Posted: February 21, 2018

The Compleat Knowledge Graph: Nine Features Wanted for Ontologies

I think the market has spoken in preferring the term ‘knowledge graph’ over ‘ontology.’ I suppose we could argue nuances in what the terms mean. We will continue to use both terms, more or less interchangeably. But, personally, I do find the concept of a ‘knowledge graph’ easier to convey to clients.

As we see knowledge graphs proliferate in many settings — from virtual agents (Siri, Alexa, Cortana and Google Assistant, among others) to search and AI platforms (Watson) — I’d like to take stock of the state-of-the-art and make some recommendations for what I would like to see in the next generation of knowledge graphs. We are just at the beginning of tapping the potential of knowledge graphs, as my recommendations show.

Going back twenty years to Nicola Guarino in 1998 [1], and more recently to Michael Uschold in 2008 [2], there has been a sense that ontologies could be relied upon for even more central aspects of overall applications. Both Guarino and Uschold termed this potential ‘ontology-driven information systems.’ Listing some of these incipient potentials is informative of the role that ontologies may play; some of them have been contemplated or met in only one or two actual installations. Let me list nine main areas of (largely) untapped potential:

  1. Context and meaning — by this, I mean the ability to model contexts and situations, which requires specific concepts for them and an ability to express gradations of adjacency (spatial and otherwise). Determining or setting contexts is essential to disambiguate meaning. Contexts and situations have been particularly difficult ideas for ontologies to model, especially those that have a binary or dichotomous design;

  2. A relations component — true, OWL offers the distinction of annotation, object and datatype properties, and we can express property characteristics such as transitivity, domain, range, cardinality, inversion, reflexivity, disjunction and the like, but it is a rare ontology that uses any or many of these constructs. The subProperty expression is used, but only in limited instances and rarely in a systematic schema. For example, it is readily obvious that some broader predicates such as animalAction could be split into involuntaryAction and voluntaryAction, and then into specific actions such as breathing or walking, and so on, but schemas with these kinds of logical property subsumptions are not evident (a minimal example is sketched after this list). Structurally, we can use OWL to reason over actions and relations in a similar manner as we reason over entities and types, but our common ontologies have yet to do so. Creating such schemas is within grasp since we have language structures such as VerbNet and other resources we could put to the task;

  3. An attributes component — the lack of a schema and organized presentation of attributes means it is a challenge to do ABox-level integration and interoperability. As with a relations component, this gap is largely due to the primary focus on concepts and entities in the early stages of semantic technologies. Optimally, what we would like to see is a well-organized attributes schema that enables instance data characteristics from different sources to be mapped to a canonical attributes schema. Once in place, not only would mapping be aided, but we should also be able to reason over attributes and use them as intensional cues for classifying instances. At one time Google touted its Biperpedia initiative [3] to organize attributes, but that effort went totally silent a couple of years ago;

  4. A quantity units ontology —  is the next step beyond attributes, as we attempt to bring data values for quantities (as well as the units and labeling used) into alignment. Fortunately, of late, the QUDT ontologies (quantities, units and data types) have become an active project again with many external supporters. Something like this needs to accompany the other recommendations listed;

  5. A statistics and probabilities ontology —  the world is not black-and-white, but vibrantly colored with all kinds of shades. We need to be able to handle gradations as well as binary choices. Being able to add probabilistic reasoners is appropriate given the idea of continua (Thirdness) from Charles Sanders Peirce and capturing the idea of fallibility. Probabilistic reasoning is still a young field in ontology. Some early possibilities include Costa [4] and the PR-OWL ontology using Multi-Entity Bayesian Networks (MEBN) [5], a probabilistic first-order logic that goes beyond Peirce’s classic deterministic logic, as well as fuzzy logic applied to ontologies [6];

  6. Abductive reasoning and hypothesis generation —  Peirce explicated a third kind of logical reasoning, abduction, that combines hypothesis generation with an evaluation of likelihood of success and effort required. This logic method has yet to be implemented in any standard Web ontologies to my knowledge. The method could be very useful to pose desired outcome cases and then to work through what may be required to get there. Adding this to existing knowledge graphs would likely require developing a bespoke abductive reasoner;

  7. Rich feature set for KBAI —  we want a rich feature set useful for providing labeled instances to supervised machine learners. I addressed this need earlier with a rather comprehensive listing of possible features for knowledge graphs useful to learners [7]. We now need to start evaluating this feature pool to provide pragmatic guidance for which features and learners match best for various knowledge-based artificial intelligence (KBAI) tasks;

  8. Consistent, clean, correct and coherent — we want knowledge graphs that are as free from error as possible to make sure we are not feeding garbage to our machine learners and as a coherent basis for evaluating new additions and mappings; and

  9. ODapps — ‘ontology-driven applications’ go beyond the mere templating or completion of user interface components to devise generic software packages driven by ontology specifications for specific applications. We have developed and deployed ODapps to import or export datasets; create, update, delete (CRUD) or otherwise manage data records; search records with full-text and faceted search; manage access control at the interacting levels of users, datasets, tools, and CRUD rights; browse or view existing records or record sets, based on simple to possibly complex selection or filtering criteria; or process results sets through workflows of various natures, involving specialized analysis, information extraction or other functions. ODapps are designed more similarly to widgets or API-based frameworks than to the dedicated software of the past, though the dedicated functionality is quite similar. The major change in ODapps is to use a relatively common abstraction layer that responds to the structure and conventions of the guiding ontologies (sketched below). We may embed these ODapps in a layout canvas for a Web page where, as the user interacts with the system, the service generates new queries (most often SPARQL) to the various Web services endpoints, which produce new structured results sets, which in turn drive new displays and visualizations. As new user interactions occur, the cycle begins anew with a fresh round of queries and results sets.
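To make the property-subsumption schema of recommendation #2 concrete, here is a minimal sketch of such a hierarchy expressed with rdfs:subPropertyOf and traversed with rdflib; the ex: namespace and property names are illustrative only.

```python
# A minimal sketch of the property-subsumption schema from #2: a small
# rdfs:subPropertyOf hierarchy that can be traversed (or reasoned over)
# just as class hierarchies are. The ex: namespace is illustrative only.
from rdflib import Graph, Namespace, RDFS

EX = Namespace("http://example.org/")
g = Graph()

g.add((EX.involuntaryAction, RDFS.subPropertyOf, EX.animalAction))
g.add((EX.voluntaryAction, RDFS.subPropertyOf, EX.animalAction))
g.add((EX.breathing, RDFS.subPropertyOf, EX.involuntaryAction))
g.add((EX.walking, RDFS.subPropertyOf, EX.voluntaryAction))

# Everything subsumed under animalAction, computed transitively
for prop in g.transitive_subjects(RDFS.subPropertyOf, EX.animalAction):
    print(prop)
```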

Fortunately, we are actively addressing several of these recommendations (#1 – #3, #6 – #9) with our KBpedia initiative. We are also planning to add mapping to QUDT (#4) in a near-future release. We are presently evaluating probabilistic reasoners and hypothesis generators (#5 and #6).
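For recommendation #9, the ODapp pattern can be suggested in miniature: a generic component that builds its SPARQL from whatever facets the guiding ontology declares, rather than hard-coding the domain. The endpoint URL, type IRI, and facet properties below are all placeholders, and this is only a sketch of the abstraction layer, not our actual implementation.

```python
# A highly simplified sketch of the ODapp query cycle: a generic faceted
# component that knows nothing about the domain and instead assembles
# SPARQL from ontology-declared facet selections. All IRIs and the
# endpoint are placeholders supplied by the caller.
from SPARQLWrapper import SPARQLWrapper, JSON

def faceted_query(endpoint: str, type_iri: str, facets: dict) -> list:
    """Build a SPARQL query from facet selections and run it."""
    filters = "\n      ".join(
        f"?s <{prop}> <{value}> ." for prop, value in facets.items()
    )
    query = f"""
    PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
    SELECT ?s ?label WHERE {{
      ?s a <{type_iri}> .
      ?s rdfs:label ?label .
      {filters}
    }} LIMIT 25
    """
    sparql = SPARQLWrapper(endpoint)
    sparql.setQuery(query)
    sparql.setReturnFormat(JSON)
    return sparql.query().convert()["results"]["bindings"]
```

Each user interaction would simply re-invoke the same function with a new facets dict, starting the next query-and-results cycle described above.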

Realizing these potentials will enable our knowledge management (KM) efforts to shift to the description, nature, and relationships of the information environment. In other words, ontologies themselves need to become the focus of development. KM no longer needs to be abstracted to the IT department or third-party software. The actual concepts, terminology and relations that comprise coherent ontologies now become the explicit focus of KM activities, and subject to the direct control and refinement by their users, the knowledge workers, and subject matter experts.

We are still some months from satisfying our desiderata for knowledge graphs. Fortunately, we have already made good progress, and we are close at hand to check off all of the boxes. Stay tuned!


[1] N. Guarino, “Formal Ontology and Information Systems,” in Proceedings of FOIS’98, Trento, Italy, 1998, pp. 3–15.
[2] M. Uschold, “Ontology-Driven Information Systems: Past, Present and Future,” in Proceedings of the Fifth International Conference on Formal Ontology in Information Systems (FOIS 2008), Carola Eschenbach and Michael Grüninger, eds., IOS Press, Amsterdam, Netherlands, 2008, pp. 3–20.
[3] R. Gupta, A. Halevy, X. Wang, S.E. Whang, and F. Wu. “Biperpedia: An Ontology for Search Applications,” Proceedings of the VLDB Endowment 7, no. 7, 2014, pp. 505-516.
[4] P. C. Costa, “Bayesian Semantics for the Semantic Web,” Ph.D., George Mason University, 2005.
[5] K. B. Laskey, “MEBN: A Language for First-Order Bayesian Knowledge Bases,” Artificial Intelligence, vol. 172, no. 2–3, pp. 140–178, Feb. 2008.
[6] F. Bobillo and U. Straccia, “Fuzzy Ontology Representation Using OWL 2,” International Journal of Approximate Reasoning, vol. 52, no. 7, pp. 1073–1094, Oct. 2011.
[7] M.K. Bergman, “A (Partial) Taxonomy of Machine Learning Features,” AI3:::Adaptive Information blog, November 23, 2015.

Posted: January 22, 2018

OWL2 Web Ontology Language: More Active Tools than Last Census

For the last couple of years, one of the more popular articles on this blog has been my 2014 listing of 50 ontology alignment tools. When published, only 20 of those fifty were active; the rest had been abandoned. Ontology alignment, also sometimes called ontology mapping or ontology matching, is the task of making formal correspondences between concepts in two or more knowledge graphs, or ontologies. Entity matching may also be included in the mix.
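To ground the terminology, the core task can be illustrated with a toy label-similarity matcher in Python; real systems such as those listed below layer structural, semantic, and instance-based evidence on top of simple string comparison.

```python
# A toy illustration of the core alignment task: propose candidate
# correspondences between two ontologies' classes by label similarity.
# Real matchers add structural, semantic, and instance-based evidence.
from difflib import SequenceMatcher
from rdflib import Graph, RDF, RDFS
from rdflib.namespace import OWL

def class_labels(g: Graph) -> dict:
    """Collect rdfs:label strings for every owl:Class in a graph."""
    return {
        cls: str(label)
        for cls in g.subjects(RDF.type, OWL.Class)
        for label in g.objects(cls, RDFS.label)
    }

def align(g1: Graph, g2: Graph, threshold: float = 0.85):
    """Yield candidate equivalences whose labels are sufficiently similar."""
    left, right = class_labels(g1), class_labels(g2)
    for c1, l1 in left.items():
        for c2, l2 in right.items():
            score = SequenceMatcher(None, l1.lower(), l2.lower()).ratio()
            if score >= threshold:
                yield c1, c2, score
```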

I had occasion to update this listing for some recent work. Three active tools from that last listing have now been retired, but I was also able to identify nine new ones and to update quite a few others. Here is the updated listing:

  • AgreementMakerLight is an automated and efficient ontology matching system derived from AgreementMaker

  • ALCOMO is a shortcut for Applying Logical Constraints on Matching Ontologies. ALCOMO is a debugging system that allows incoherent alignments to be transformed to coherent ones by removing some correspondences from the alignment, called a diagnosis. It is complete in the sense that it detects any kind of incoherence in SHIN(D) ontologies

  • Alignment is a collaborative, system-aided, user-driven ontology/vocabulary matching application

  • The Alignment API is an API and implementation for expressing and sharing ontology alignments. A set of correspondences between entities (e.g., classes, objects, properties) in ontologies is called an alignment. The API provides a format for expressing alignments in a uniform way. The goal of this format is to be able to share available alignments on the web. The format is expressed in RDF, so it is freely extensible. The Alignment API itself is a Java description of tools for accessing the common format. It defines four main interfaces (Alignment, Cell, Relation and Evaluator)

  • ALIN is an ontology alignment system specializing in the interactive alignment of ontologies. Its main characteristic is the selection of correspondences to be shown to the expert, depending on the previous feedback given by the expert. This selection is based on semantic and structural characteristics
  • Blooms is a tool for ontology matching. It utilizes information from the Wikipedia category hierarchy and from the web to identify subclass relationships between entities. See also its Wiki page

  • CODI (Combinatorial Optimization for Data Integration) leverages terminological structure for ontology matching. The current implementation produces mappings between concepts, properties, and individuals. CODI is based on the syntax and semantics of Markov logic and transforms the alignment problem to a maximum-a-posteriori optimization problem

  • COMA++ is a schema and ontology matching tool with a comprehensive infrastructure. Its graphical interface supports a variety of interactions

  • Falcon-AO (Finding, aligning and learning ontologies) is an automatic ontology matching tool that includes the three elementary matchers of String, V-Doc and GMO. In addition, it integrates a partitioner PBM to cope with large-scale ontologies

  • hMAFRA (Harmonize Mapping Framework) is a set of tools supporting semantic mapping definition and data reconciliation between ontologies. The targeted formats are XSD, RDFS and KAON

  • GOMMA is a generic infrastructure for managing and analyzing life science ontologies and their evolution. The component-based infrastructure utilizes a generic repository to uniformly and efficiently manage many versions of ontologies and different kinds of mappings. Different functional components focus on matching life science ontologies, detecting and analyzing evolutionary changes and patterns in these ontologies

  • HerTUDA is a simple, fast ontology matching tool, based on syntactic string comparison and filtering of irrelevant mappings. Despite its simplicity, it outperforms many state-of-the-art ontology matching tools

  • Karma is an information integration tool to integrate data from databases, spreadsheets, delimited text files, XML, JSON, KML and Web APIs. Users integrate information according to an ontology of their choice using a graphical user interface that automates much of the process. Karma learns to recognize the mapping of data to ontology classes and then uses the ontology to propose a model that ties together these classes

  • KitAMO is a tool for evaluating ontology alignment strategies and their combinations. It supports the study, evaluation and comparison of alignment strategies and their combinations based on their performance and the quality of their alignments on test cases. Based on the SAMBO project

  • The linked open data enhancer (LODE) framework is a set of integrated tools that allow digital humanists, librarians, and information scientists to connect their data collections to the linked open data cloud. It can be applied to any domain with RDF datasets

  • LogMap is a highly scalable ontology matching system with ‘built-in’ reasoning and diagnosis capabilities. LogMap can deal with semantically rich ontologies containing tens (and even hundreds) of thousands of classes

  • Map-On is a collaborative ontology mapping environment that supports different users (domain experts, data owners, and ontology engineers) in integrating data using standard semantic technologies

  • MapOnto is a research project aiming at discovering semantic mappings between different data models, e.g., database schemas, conceptual schemas, and ontologies. So far, it has developed tools for discovering semantic mappings between database schemas and ontologies as well as between different database schemas. The Protege plug-in is still available, but appears to be for older versions

  • OntoM is one component of the OntoBuilder, which is a comprehensive ontology building and managing framework. OntoM provides a choice of mapping and scoring methods for matching schema

  • OntoSim is a Java API for computing similarities between ontologies. It relies on the Alignment API for ontology loading, so it is quite independent of the ontology API used (JENA or OWL API)

  • OpenII Harmony is a schema matching tool that combines multiple language-based matching algorithms and a graphical user interface

  • OxO is a service for finding mappings (or cross-references) between terms from ontologies, vocabularies and coding standards. OxO imports mappings from a variety of sources including the Ontology Lookup Service and a subset of mappings provided by the UMLS

  • PARIS is a system for the automatic alignment of RDF ontologies. PARIS aligns not only instances, but also relations and classes. Alignments at the instance level cross-fertilize with alignments at the schema level

  • S-Match takes any two tree-like structures (such as database schemas, classifications, lightweight ontologies) and returns a set of correspondences between those tree nodes which semantically correspond to one another

  • ServOMap is an ontology matching tool based on information retrieval techniques, relying on the ServO system. To run it, please follow the directions described at http://oaei.ontologymatching.org/2012/seals-eval.html

  • The Silk framework is a tool for discovering relationships between data items within different Linked Data sources. Data publishers can use Silk to set RDF links from their data sources to other data sources on the Web. While designed for mapping instance data, it can also be used for schema

  • treemerge.io is a web based tool to: 1) import category systems (tree based taxonomies/ontologies) in the form of JSON files; 2) map them using a visual user interface; 3) export a single unified ontology

  • WikiV3 is an ontology matching system that uses Wikipedia as an external knowledge base for concepts, entities, and properties, and supports multilingual alignments
  • Yam++ ((not) Yet Another Matcher) is a flexible and self-configuring ontology matching system for discovering semantic correspondences between entities (i.e., classes, object properties and data properties) of ontologies. The new YAM++ 2013 version shows significant improvement over previous versions. See also the 2013 results. Code not apparently available.
