Posted:October 8, 2014

AI3 Pulse

I have been dazzled by what Stephen Wolfram has been doing with Wolfram Alpha since Day 1. The mathematical capabilities are exceedingly impressive, the backing knowledge base is exceedingly impressive, and the visualization has also come to be exceedingly impressive. The problem: no open source or anything.

I’ve never met Wolfram and acknowledge he has seen success many times greater than most. But, putting on my VC cap, I see an old school adherence to closed and proprietary.

I would not presume to suggest where the Wolfram folks should draw these lines on open and closed, but I definitely do counsel they work hard to open up as much as they can; it is in their own self interest. The business model of today is premised on leveraging the network effect in various ways. Total proprietary is a turn off and drives the compounding basis for the network effect into the dirt.

Maybe their recent pricing initiatives with Mathematica online are a wink toward this direction. But, better still find a core of functionality and knowledge base and release it as open source. The world will beat a path to what created all of this impressive stuff in the first place.

Posted by AI3's author, Mike Bergman Posted on October 8, 2014 at 11:33 pm in Pulse | Comments (0)
The URI link reference to this post is: https://www.mkbergman.com/1805/pulse-let-my-wolfram-alpha-go/
The URI to trackback this post is: https://www.mkbergman.com/1805/pulse-let-my-wolfram-alpha-go/trackback/
Posted:September 30, 2014

AI3 Pulse

A few weeks back I reflected on more than a decade of involvement in the semantic Web. I appreciate the many nice comments and compliments received.

In part in reaction to that post and also because it is the retrospective season, Amit Sheth has just posted his own retrospective on the semWeb going back 15 years. Amit has been one of the leaders in the space and (according to his history) the first to obtain a semantic Web patent.

Amit’s wonderful history is more global and informed than my own, and I recommend you check it out. BTW, Amit is asking for comments from those involved in the early years for any corrections or additions.

Posted by AI3's author, Mike Bergman Posted on September 30, 2014 at 9:41 am in Pulse, Semantic Web | Comments (0)
The URI link reference to this post is: https://www.mkbergman.com/1807/pulse-more-semweb-retrospective/
The URI to trackback this post is: https://www.mkbergman.com/1807/pulse-more-semweb-retrospective/trackback/
Posted:September 29, 2014

AI3 Pulse

For years now I have been a huge fan of Pandora (sorry, not apparently available to all across the world due to digital rights issues). Even though Pandora’s Music Genome was not set up as a W3C-compliant ontology, in its use and application it is effectively one. What Pandora shows is that feature selection and characterization trumps language and data structure format.

Given that, I have also been a Web scientist as to how I select and promote music to meet my musical interests.

Thus, based on my own totally unscientific study, here are a few things I’ve found worth passing on. It would be bold to call them secrets; they are really more just observations:

  • When a new “seed” artist is chosen, the attributes of that artist (up to 450 different attributes from beat to genre to dominant instruments) set the pure characteristics of that channel
  • As similar songs play that meet this profile, when you vote them “up” or “down” you are effectively adding or deleting options for these 450 attributes in your profile criteria
  • Thus, continuous expressions of preference as a channel plays acts to “dilute” the purity of the initial seed; these preference expressions lead to a “mixed” seed
  • The more that choices are preferred, the more the signal of your original selection gets diluted.

The net result is that I now no longer vote any of my songs on any channel as up or down. Rather, I look to the purest “seeds” that capture my mode or genre preference. If my initial selection does not provide this purity, I delete it, and try to find a better true seed.

This approach has led to some awesome channels for me, that I can then combine together, depending on mood, into mixed shuffle play with randomized channel selection. I no longer vote any song up or down; I rather look for more telling seeds.

As a couple of examples, here is a channel of less-well known 60’s goldies and a jazz guitar channel that are pure, single-seed channels, that, at least for me, provide hours of consistent music in that genre. Roll your own!

Posted by AI3's author, Mike Bergman Posted on September 29, 2014 at 10:44 am in Pulse | Comments (0)
The URI link reference to this post is: https://www.mkbergman.com/1804/pulse-pure-v-mixed-seeds-on-pandora/
The URI to trackback this post is: https://www.mkbergman.com/1804/pulse-pure-v-mixed-seeds-on-pandora/trackback/
Posted:September 25, 2014

AI3 Pulse

We all know the slow decline and zero take-off about the semantic Web writ large, but what we probably did not know is that Big data would gobsmack the concept. Trends do not lie.

Some ten years ago the idea of the semantic Web had about a 4x advantage in Google searches compared to that for “big data”. Today, Big data searches exceed those for the semWeb by 25-fold. The cross-over point occurred in about April 2011:
SemWeb v Big Data Google Trends
Within a year, projections are that the Big data interest level will exceed the semantic Web by 35x.

These may not be entirely apples-to-apples cases, but trends don’t often lie. Play yourself with these comparisons on Google Trends.

Posted by AI3's author, Mike Bergman Posted on September 25, 2014 at 1:22 am in Pulse, Semantic Web | Comments (0)
The URI link reference to this post is: https://www.mkbergman.com/1803/pulse-big-data-smokes-semweb/
The URI to trackback this post is: https://www.mkbergman.com/1803/pulse-big-data-smokes-semweb/trackback/
Posted:September 15, 2014

Big Structure has a Foundation in Reference Structures, But Any Structure Aids Interoperability

Big Structure is built on a foundation of reference structures, with domain structures capturing the domain at hand. These represent the target foundations for mapping schema and transforming data in the wild into an operable, canonical form. Any structure, even the most lightweight of lists and metadata, can contribute to and be mapped into this model, as this wall of structure shows:

Foundations to Big Structure

Described below are some of these structures, in rough descending order of completeness and usefulness, for making data interoperable. Please note that any of these structures might be available as linked data.

Reference Structures

In both semantics and artificial intelligence — and certainly in the realm of data interoperability — there is always the problem of symbol grounding. In the conceptual realm, symbol grounding means that when we use a term or phrase we are referring to the same thing. In the data value realm, symbol grounding means that when we refer to an object or a number, we are referring to the same measure.

UMBEL is the standard reference ontology used by Structured Dynamics. It contains 28,000 concepts (classes and relationships) derived from the Cyc knowledge base. The reference concepts of UMBEL are mapped to Wikipedia, schema.org (used in Google’s knowledge graph), DBpedia ontology classes, GeoNames and PROTON. Similar reference structures are used to ground the actual data values and attributes.

Other reference structures may be used, so long as they are rather complete in scope and coherent in their relationships. Logical consistency is a key requirement for grounding.

Knowledge Bases

Knowledge bases combine schema with data in a logical manner; well-constructed ones support computations, inference and reasoning. To date, the two primary knowledge bases that we use are Wikipedia and Cyc. However, many specific domain knowledge bases also exist.

Knowledge bases are important sources for symbol grounding. It addition, because of their computability, they may be used with artificial intelligence methods to both extend the knowledge base and to refine the feature estimates used in the AI algorithms.

Domain Ontologies

Domain ontologies, constructed as graphs, are the principal working structures in data interoperability. Though best practices recommend they be grounded in the reference structures, the domain structures are the ones that specifically capture the concepts and data attributes of the target information domain. More effort should be focused at this level in the wall of structure than any other.

Domain structures provide unique benefits in discovery, flexible access, and information integration due to their inherent connectedness. Further, these domain structures can be layered on top of existing information assets, which means they are an enhancement and not a displacement for prior investments. And, these domain structures may be matured incrementally, which means their development is cost-effective.

Mappings

Data and schema in the wild need to be mapped and transformed into these canonical structures. What is known as data wrangling is an aspect of these mappings and transformations. Mappings thus become the glue that ties native data to interoperable forms.

Mapping is the critical bridging function in data interoperability. It requires tools and background intelligence to suggest possible correspondences; how well this is done is a key to making the semi-automatic mapping process as efficient as possible. Mapping structures are the result of the final correspondences. Mapping effort is a function of the scope of Big Structure, not the volume of Big Data.

Existing Structures

A broad variety of structures occur in the wild — from database schema and taxonomies to dictionaries and lists — that need to be represented in a common form and then mapped in order to support interoperability. The common representation used by Structured Dynamics is the RDF data model.

Structure Scripts

Scripting and tooling are essential to help create Big Structure efficiently.

Editor’s Note: We are pleased to share with you in advance some of the text from Structured Dynamics’ new Web site.