Method for an Integrated Knowledge Environment (MIKE2.0) is an open source framework for best practices and methods in enterprise information management. Structured Dynamics has been an active contributor to parts of MIKE2.0, particularly in semantic technologies.
MIKE2.0 has started a useful podcast series, hosted by Jim Harris of the OCDQ blog. The most recent release in that series deals with the open semantic enterprise, based on contributions from SD and my blog.
There is some good introductory material here, with the summaries toward the end of the 13-minute podcast nicely done. More detailed information on the open semantic enterprise is provided under MIKE2.0's composite offering in this area.
This decade has clearly marked a sea change in the move of enterprise software from proprietary to open source, as I have recently discussed. It is instructive that a mere six years ago I was in heated fights with my then Board about open source; today, that seems so quaint and dated.
Also during this period many have noted how open source has changed the capital required to begin a new software startup. Open source provides both the tooling and the components for cobbling together specialty apps and extensions. Six-, seven- and even eight-figure startup costs that were common just a decade ago have now dropped to four or five figures. When we see the explosion of hundreds of thousands of smartphone apps, we are seeing the glowing residue of these additional sea changes. Dropping startup costs by one to three orders of magnitude is truly democratizing innovation.
But something else has been going on that is changing the face of enterprise software (besides consolidation, another factor I also recently commented on). And that factor is “marketing”. Much less commentary is made about this change, but it, too, is greatly lowering costs and fundamentally changing market penetration strategies. That topic — and my personal experience with it — is the focus of this article.
Besides the few remaining big providers of enterprise software — like IBM, Oracle, HP, SAP — most vendors have totally remade their sales practices of just a few years ago. Large sales forces with big commissions and one- to two-year sales cycles can no longer be justified when software license fees, and the percentage maintenance annuities that flow from them, are dropping rapidly. Today's mantras are doing more with less and doing it faster, hardly consistent with the traditional enterprise software model. Sure, big enterprises, especially big government and big business, have large sunk costs in legacy systems that will continue to be milked by existing vendors. But the flow is constricting, with longer-term trends clear to see. The old enterprise software model is obsolete.
Even if it were not dying, it is hard to square huge investments in sales and marketing when product development has become inexpensive and agile. The proliferation of three-letter marketing acronyms for branding "new" product areas, and the standard formulas for product hype of just a few years ago, also feel old and dated. Cozy relationships with conventional trade press pundits and market analysts seem to be diminishing in importance, possibly because the authoritativeness of their influence is also diminishing. It is harder to justify market research subscription costs when priority budget items are being cut and new information outlets have emerged.
In response to this, many developers have forsaken the enterprise market for the consumer one. Indeed, enterprises themselves are looking more and more to the consumer sector and commodity apps for innovation and answers. But, still, problems unique to enterprises remain, and how to reach them effectively in this brave new world is today's marketing problem for enterprise software vendors.
Most commentators today, when opining about these challenges, tend to emphasize the need for "laser focus" and "rifle-shot" targeting of prospects. The advice takes the form of: 1) emphasize well-defined verticals; 2) know your market well; and 3) target and go after your likely prospects. Prospect data mining and targeted ad analysis are the proffered elixirs.
But, there is little evidence such refined methods for prospect identification and targeting are really working. Like politicians doing focus groups and opinion polling to capture the desired “message” of their potential electorates, these are all still “push” models of marketing. Yet we are swamped with pushed messages and marketing everywhere we turn. The model is failing.
Besides message overload, there are two issues with laser targeting. First, despite all that we try to know about ready buyers (for enterprise software), we really don't know whether any particular individual is truly needful, is in a position to buy, has the authority to buy, or is the right advocate to make the internal sell. Second, though the idea of "laser" carries with it the image of focus and not flailing, it is in fact expensive to identify the targets and send a focused message their way. Because of these issues, decay rates for laser prospects throughout conventional sales pipelines continue to rise.
There has always been the phenomenon of the "fish jumping into the boat"; that is, the unanticipated inbound inquiry from a previously unknown prospect leading to a surprisingly swift sale. But we have seen this phenomenon increase markedly in recent years. Structured Dynamics' current customer base — including recurring customers — comes almost exclusively from this source. As we have noted this trend in comparison with more targeted outreach, we have spent much time trying to understand why it is occurring and how we can leverage what Peter Drucker called the "unexpected success."
What we are seeing, I believe, is a shift from sales to marketing, and within marketing from direct or outbound marketing to a new paradigm of marketing. Others have likened this to inbound marketing or content marketing or permission marketing. What we are seeing at Structured Dynamics bears many resemblances to parts of what is claimed for these other approaches, but not all. And, it is also true that what we are seeing may pertain mostly to innovative IT for emerging enterprise markets, and not a generalized paradigm suitable to other products or markets.
For lack of a better term, we call what we are seeing "substantive marketing". By this we mean offering valuable content and solutions-oriented systems for free and without restriction. This shares aspects with content marketing. Then, in keeping with the trend of buyers doing their own research and analysis to fulfill their own needs, similar to the premises of inbound or permission marketing, potential consumers can make their own judgments as to the relevance and value of our offerings.
Sometimes, of course, prospects find our approaches and solutions lacking. Sometimes, they may grab what we have offered for free and use it on their own without compensation to us. But where the match is right — and we need to be honest with both ourselves and the customer when it is not — we can better spend the customer's limited time and resources to tailor our generic solutions to their specific needs. In doing so, we offer higher value (tailored services) while learning about another spectrum of consumer need that can virtuously enhance our substantive offerings for the next prospect.
So, let’s decompose these components further to see what they can tell us about this new practice of substantive marketing and how to use it as an engine for moving forward.
The premise of substantive marketing is to offer square-deal value to the marketplace in the form of solutions-based content. Like content marketing, which offers "the creation or sharing of content for the purpose of engaging current and potential consumer bases," substantive marketing goes even further. The whole basis and premise of the approach is to provide substantive content, in one or more of these areas, preferably all:
Further, this substantive content is offered without strings, restrictions or customer fill-in forms. The content is not a come-on or a teaser. We are not trying to gather leads or prospect names, because we have no intent to dun them with emails or follow-ups.
This substantive content is as complete as can be, to enable new users to adopt the information and tools in their current state without further assistance. (In some cases, the information also educates the marketplace in order to prepare future customers for adoption.) Most importantly, this substantive content is offered for free, either as open source (for code) or under Creative Commons (for documentation and other content). In return, it is fair to request — and we do — attribution when this material is used.
We have previously termed this complete panoply of substantive content a total open solution. Some might find the provision of such robust information crazy: How can we give away the store of our proprietary knowledge and systems?
But we find this kind of thinking old school. In an open source world where so much information is now available online, with a bit of effort customers can find this information anyway. Rather, our mindset is that customers do not want to pay again for what has already been done, but are willing to pay for what can be done with that knowledge for their own specific problems. Offering the complete storehouse of our knowledge in fact signals our interest in only charging the customer for new answers, new value or new formulations. The customers we like to work with feel they are getting an honest, square deal.
Consider your substantive content to be your flag, a unique banner for conveying and packaging your specific brand. It is thus important to find appropriate flagpoles — in the virtual territories that your customers visit — for raising this content high for them to see. Since the role of these flagpoles is to create awareness in potential prospects — who you do not likely know individually or even by group in advance — it makes sense to raise your offerings up on many flagpoles and on the highest flagpoles. Visibility is the object of the approach.
This approach is distinctly not leafletting or cramming links or emails into as many spaces as possible. The idea of substantive marketing is to fly valuable content high enough that desirous potential customers can discover and then inspect the information on their own, and only if they so choose. In this regard, substantive marketing resembles permission marketing.
Being visible helps ensure that the needful, questing prospect you would never have been able to target on your own can find and become aware of your offerings. And, since they are seeking information and answers, your collateral needs to be of a similar nature. Solutions and substance are what they seek; what you have run up the flagpole should respond to that.
The mindset here is to respect your prospective customers and to allow them to choose to receive and inspect your offerings, but only if they so choose. If flown in the right venues with the right visibility, customers will see your flags and inspect them if they meet their requirements.
Some of the venues at which you can raise your flags include:
The observant reader will have already concluded that each of these venues develops slowly, and that raising visibility is therefore a slow-and-steady game requiring patience. Start-up vendors backed by venture firms, or those looking for quick visibility and a cashout, will not find this approach suitable. On the other hand, customer prospects looking for answers and self-sustaining solutions are not much interested in flash-in-the-pan vendors, either.
The real drivers for this changing paradigm come from customer prospects. Sophisticated buyers of enterprise IT and instrumental change agents within organizations share most if not all of these characteristics:
More often than not we find our customers to have already installed and used our existing substantive materials for some time before they approach us about further work. They appreciate the tutorial information and have taught themselves much in advance. By the time we engage, both parties are able to cost-effectively focus on what is truly missing and needed, and to deliver those answers quickly. Re-engagements tend to occur when a next set of gaps or challenges arises.
Though it may sound trite or even unbelievable to those who have not yet experienced such a relationship, the square deal value offered by substantive marketing can really lead to true partnerships and trust between vendor and customer. We experience it daily with our customers, and vice versa. We also think this is the adaptive approach that our new environment demands.
Once prospects learn of our substantive offerings, many may decide independently that what we have is not suitable. Others may simply download and use the information on their own, which we often never learn of, let alone receive revenue from. We are completely fine with this, as the three cases below show.
First, some of these prospects need no more than what we already have. This increases our user base, increases our visibility and often results in contributions to our forums and documentation.
Then, some of these prospects come to learn they need or want more than what our current offerings provide, leading to two possible forks. In one fork, the second case, they may have sufficient skills internally, or with other suppliers, to extend the system on their own. Some of this work flows back as improvements to the code base, the installation procedures or the documentation.
In the other fork, the third case, they may decide to engage us in tailoring a solution for them. That case is the only one of the three that leads to a direct revenue path.
In all three cases we win, and the customer wins. Maybe enterprise software vendors of decades past would rue this reality of lower margins and shared benefits; we agree that the absolute profit potential of substantive marketing is much less. But we gladly accept the more enjoyable work and steady revenue relationships resulting from these changes. We are not engaged in some Pollyanna-ish altruism here, but in a steely-eyed honest brokering that best serves our own self-interest (and fairly that of the customer, as well).
Great IT product does not come from idle musings or dreamed up functionality. It comes solely and directly from solving customer problems. Only via customers can software be refined and made more broadly usable.
A slipstream of those who have previously become aware of and tested our offerings will choose to engage our services. This generally takes the form of an inbound call, where the prospect not only qualifies itself, but also establishes the terms and conditions for the sale. They have chosen to select us; they are fish that have jumped into the boat.
To again quote Peter Drucker, ". . . the aim of marketing is to make selling superfluous. The aim of marketing is to know and understand the customer so well that the product or service fits him and sells itself. Ideally, marketing should result in a customer who is ready to buy. All that should be needed then is to make the product or service available. . . ." This is precisely what I meant earlier about the shift in emphasis from sales to marketing.
Even at this point there may be mismatches between the prospect's needs and our skills and availability. If such is the case, we do not hesitate to say so, and attempt to point the prospect in another direction (from which we also gain invaluable market knowledge). If there is indeed a match, we then proceed to try to find common ground on schedule and budget.
Paradoxically, this square deal and honesty about the readiness and weaknesses of our offerings often leads to forgiveness from our customers. For example, for some time we have lacked automated installation scripts that would make it easier for prospects to install our open semantic framework. But, because of compensating value in other areas, such gaps can be overlooked and tackled later on (indeed, as a current customer is now funding). By not pretending to be everything to everyone, we can offer what we do have without embarrassment and get on with the job of solving problems.
For larger potential engagements, we typically suggest a fixed-price initial effort to develop an implementation plan. The interviews and research supporting this typical 4- to 6-week effort (generally in the $5K to $10K range, depending on scope) then result in a detailed fulfillment proposal, with firm tasks, budget and schedule specific to that customer's requirements. Just as we respect our prospects' time and budget, we expect the same and do not conduct these detailed plans without compensation. With respect to fulfillment contracts, we cap the contract amount and limit milestone payments to pre-set percentages or time expended, whichever is lower.
This approach ensures we understand the customer's needs and have budgeted and tasked accordingly. Capped contracts also put the onus on us, the contractor, to understand our own effort and tasking structures and realities, which leads to better future estimating. For the customer, this approach caps risk and potential exposure, and ensures milestones are being met no matter the time we expend as contractor. This approach extends our square-deal basis to also embrace risks and payments.
Thus, when customers engage us, they spend almost solely on new functionality specifically tailored to their needs. In doing so, we suggest they agree to release the new developments they fund as open source. We argue — and customers predominantly agree — that they are already benefitting from lower overall costs because other customers have funded sharable, open-source work before them. We point out that the new customers who follow them will also be independently creating new functionality, from which they too will later benefit.
(This argument does not apply to specific customer data or ontologies, which are naturally proprietary to the customer. Also, if the customer wants to retain intellectual ownership of extensions, we charge higher development fees.)
Once these new developments are completed, they are fed back into a new baseline of valuable content and code. From this new baseline the cycle of substantive marketing can be augmented anew and perpetuated.
All of these points can really be boiled down to three guidelines for how to make substantive marketing effective:
What we are finding — as we continue to refine our understanding of this new paradigm — is that through substantive marketing the fish are finding us and they sometimes jump into the boat. We like our enterprise customers to pre-qualify themselves and already be “sold” once they knock on the door. One never knows when that phone might ring or the email might come in. But when it does, it often results in a collaborative customer as a partner who is a joy to work with to solve exciting new problems.
Every couple of months I return to the idea of the open world assumption (OWA) and its fundamental importance to knowledge applications. What it is that makes us human — in health and in sickness — is but a further line of evidence for the importance of an open world viewpoint. I'll use three personal anecdotes to make this case.
Believe it or not, Alfred Wegener's theory of continental drift was only becoming accepted by mainstream scientists in my high school years. I experienced déjà vu regarding a scientific revolution while a botany major at Pomona College in the early 1970s. A young American biologist at that time, Lynn Margulis, was postulating the theory of endosymbiosis; that is, that certain cell organelles originated from initially free-living bacteria.
This idea of longstanding symbionts in the cell — indeed, even forming what was our overall conception of cells and their parts — was truly revolutionary. It was revolutionary because of the implications for the nature and potential degree of symbiosis. And it was revolutionary in adding an arrow to the quiver of biotic change over time beyond classical Darwinian evolution.
Today, Margulis' theory is widely accepted and is understood to embrace cell organelles from mitochondria to chloroplasts and ribosomes. The seemingly fundamental unit of all organisms — the cell — is itself an amalgam of archaic symbionts and bacteria-like lifeforms. Truly remarkable.
In the early 1990s, my oldest child, Erin, then in elementary school, had been going through a debilitating bout of periodic and severe stomach upsets. I sort of thought this might be inherited, since my paternal grandmother had suffered from ulcers for many decades (as did many at that time).
We were good friends with our pediatrician in our small town and knew him to be a thoughtful and well-informed MD. His counsel was that Erin was likely suffering from an ulcer and we began taking great care about her diet. But Erin’s symptoms did not seem to improve.
My wife, Wendy, is a biomedical researcher and began to investigate this problem on her own. She discovered some early findings implicating a gastrointestinal (gut) bacterium in similar symptoms and brought this research to our doctor's attention. He, too, was intrigued, and prescribed a rather straightforward antibiotic regimen for Erin. Her symptoms immediately ceased, and she has been clear of further symptoms in the twenty years since.
The nearly universal role of the Helicobacter bacterium in ulcers is now widely understood. The understanding of peptic ulcers that had stood for centuries no longer applies in most cases. Though ulcers may arise from many other conditions, because of these new understandings the prevalence and discussion of ulcers have nearly fallen off the radar screen.
A few years back I began to show symptoms of rosacea, a facial skin condition characterized by redness. My local dermatologist recommended a daily dose of antibiotics as the preferred course of action. I was initially reluctant to follow this advice. I knew about the growing problem of bacterial resistance, and did not think that my constant use of tetracycline would help that issue. I also knew some about the controversial use of antibiotics in animal feeds, and had hesitations for that reason as well.
Nonetheless, I took the doctor's advice. I rarely take any kind of medicine and immediately began to notice GI problems. My digestive regularity was immediately thrown out of kilter, with other adverse effects as well. I immediately stopped using the antibiotics, and soon returned to (largely) my pre-regimen condition. (I also switched doctors.)
Over the past five years, due to a revolution in DNA sequencing, we are now beginning to understand the why of my observed reactions to antibiotics. Because we can now analyze skin and fecal samples for foreign DNA, we are coming to realize that humans (as is likely true for all higher organisms) are walking, teeming ecosystems of thousands of different species, mostly bacteria.
While there are some 23,000 genes in the native human genome, more than 3 million are estimated to arise from these fellow travelers. While we are still learning much, and rapidly, we know that our ecosystem of bacteria is involved in nutrition and digestion, contributing perhaps as much as 15% of the energy value we get from food. We also know that imbalances of various sorts in our walking ecosystem can lead to diseases and other chronic conditions.
Though the degree and nature are still quite uncertain, our "microbiome" of symbiotic bacteria has been implicated in heart disease, Type II diabetes, obesity, malnutrition, multiple sclerosis, other auto-immune diseases, asthma, eczema, liver disease, bowel cancer and autism, among others. The breadth and extent of the implications for well-being are staggering, especially since all of these implications have been learned over the past five years.
There are considerable differences among human populations and cultures, too, in the composition of the microbiome. And these effects are not limited to the gut. The skin and the orifices to the outside world have their own denizens as well, likely also involved in both health and disease. Humans are not just complicated beasts, but a world of other species unique unto ourselves.
Each of these three anecdotes — and there are many others — points to phenomenal changes in our understanding of the human organism. This new knowledge has also arisen over a remarkably short period. Who knows when the pace of these insights might slow, if ever?
These anecdotes exemplify the fundamental nature of knowledge: it is constantly expanding, with new connections and heretofore unforeseen relationships constantly emerging. These anecdotes also point to the fact that most knowledge problems are systems problems, intimately involved with the connections and inter-relationships among a diversity of players and factors.
It makes sense that how we choose to organize and analyze the information that constitutes our knowledge should have a structure and underlying logic premise consistent with expansion and new relationships. This premise is the central feature of the open world assumption and semantic Web technologies.
The fixed, closed, brittle schemas of transaction systems and relational databases are a clear mismatch with knowledge problems and knowledge applications. We need systems whose schema and structure can evolve with new information and knowledge. The foundational importance of open world approaches to understanding and modeling knowledge problems continues to be the elephant in the room.
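To make the contrast concrete, here is a minimal sketch using Python's rdflib library (my own choice of tooling for illustration, not anything prescribed above; the namespace and statements are invented) of how a graph-based, open world store absorbs new facts and new relationship types without any schema migration:

```python
from rdflib import Graph, Namespace

EX = Namespace("http://example.org/")  # invented namespace for the example

g = Graph()

# What we knew at the start: cells contain mitochondria
g.add((EX.Mitochondrion, EX.partOf, EX.Cell))

# New knowledge arrives (endosymbiosis): no table to alter and no migration;
# we simply assert new statements, even with a brand-new predicate
g.add((EX.Mitochondrion, EX.descendedFrom, EX.Bacterium))
g.add((EX.Chloroplast, EX.descendedFrom, EX.Bacterium))

# Under the open world assumption, anything not asserted is unknown, not false
for s, p, o in g:
    print(s, p, o)
```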
It is perhaps not surprising that one of the fields most aggressive in embracing ontologies and semantic technologies is the life sciences. Practitioners in this field experience daily the explosion in new knowledge and understandings. Knowledge workers in other fields would be well-advised to follow the lead of the life sciences in re-thinking their own foundations for knowledge representation and management. It is good to remember that if your world is not open, then your understanding of it is closed.
Virtually everywhere one looks we are in the midst of a transition for how we organize and manage information, indeed even relationships. Social networks and online communities are changing how we live and interact. NoSQL and graph databases — married to their near cousin Big Data — are changing how we organize and store information and data. Semantic technologies, backed by their ontologies and RDF data model, are showing the way for how we can connect and interoperate disparate information in ways only dreamed about a decade ago. And all of this, of course, is being built upon the infrastructure of the Internet and the Web, a global, distributed network of devices and information that is undoubtedly one of the most important technological developments in human history.
There is a shared structure across all of these developments — the graph. Graphs are proving to be the new universal paradigm for how we organize and manage information. Graphs have an inherently expandable nature, and one which can also capture any existing structure. So, as we see all of the networks, connections, relationships and links — both physical and informational — grow around us, it is useful to step back a bit and contemplate the universal graph structure at the core of these developments.
Understanding that we now live in the Age of the Graph means we can begin studying and using the concept of the graph itself to better analyze and manage our interconnected world. Whether we are trying to understand the physical networks of supply chains and infrastructure or the information relationships within ontologies or knowledge graphs, the various concepts underlying graphs and graph theory, themselves expressed through a rich vocabulary of terms, provide the keys for unlocking still further treasures hidden in the structure of graphs.
The use of "graph" as a mathematical concept is not much more than 100 years old. The beginning explication of the various classes of problems that can be addressed by graph theory is probably no older than 300 years. The use of graphs for expressing logic structures is probably not much older than 100 years, with the intellectual roots beginning with Charles Sanders Peirce. Though trade routes, with their affiliated roads and primitive transportation or nomadic infrastructures, were perhaps the first expressions of physical networks, the emergence and prevalence of networks is a fairly recent phenomenon. The Internet and the Web are surely the catalyzing development that has brought graphs and networks to the forefront.
In mathematics, a graph is an abstract representation of a set of objects where pairs of the objects are connected. The objects are most often known as nodes or vertices; the connections between the objects are called edges. Typically, a graph is depicted in diagrammatic form as a set of dots or bubbles for the nodes, joined by lines or curves for the edges. If there is a logical relationship between connected nodes, the edge is directed, and the graph is known as a directed graph. Various structures or topologies can be expressed through this conceptual graph framework. Graphs are one of the principal focuses of study in discrete mathematics. The word "graph" was first used in this sense, as a mathematical structure, by J.J. Sylvester in 1878.
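As a minimal sketch (plain Python, with invented example data), a directed graph needs nothing more than an adjacency mapping from each node to the nodes its outgoing edges reach:

```python
# A directed graph as an adjacency mapping: each node maps to the set of
# nodes reached by one of its outgoing (directed) edges; data is invented
graph = {
    "A": {"B", "C"},  # edges A -> B and A -> C
    "B": {"C"},       # edge B -> C
    "C": set(),       # C has no outgoing edges
}

# Enumerating the edges recovers the graph's connection structure
edges = [(src, dst) for src, targets in graph.items() for dst in targets]
print(sorted(edges))  # [('A', 'B'), ('A', 'C'), ('B', 'C')]
```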
As representations of various data models, particularly the Resource Description Framework (RDF) model of special interest to our company, the nodes can represent "nouns": subjects or objects (depending on the direction of the links) or attributes. The edges or connections represent "verbs": relationships, properties or predicates. Thus, the simple "triple" of the basic statement in RDF (consisting of subject – predicate – object) is one of the constituent barbells that make up what becomes the eventual graph structure.
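As an illustration, here is a minimal sketch using the rdflib Python library (one common RDF toolkit, chosen here only as an example; the URIs are invented) showing how individual subject – predicate – object barbells accumulate into a graph through their shared nodes:

```python
from rdflib import Graph, Literal, Namespace

EX = Namespace("http://example.org/")  # invented namespace
g = Graph()

# Each add() is one barbell: subject - predicate - object
g.add((EX.StructuredDynamics, EX.contributesTo, EX.MIKE20))
g.add((EX.MIKE20, EX.addresses, EX.InformationManagement))
g.add((EX.StructuredDynamics, EX.hasFocus, Literal("semantic technologies")))

# The shared node EX.MIKE20 joins the first two barbells into a larger graph
print(len(g))  # 3 triples
```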
The manipulation and analysis of graph structures comes under the rubric of graph theory. The first recognized paper in that field is Leonhard Euler's 1736 treatment of the Seven Bridges of Königsberg. The objective of the paper was to find a walking path through the city that would cross each bridge once and only once. Euler proved that the problem has no solution.
Euler's approach represented the path problem as a graph, treating the land masses as nodes and the bridges as edges. Euler observed that if every bridge is to be traversed exactly once, then, for each land mass (except the ones chosen for the start and finish), the number of bridges touching that land mass must be even (the number of connections to a node is what we now call its "degree"). Since that is not true for this instance, there is no solution. Other researchers, including Leibniz, Cauchy and L'Huillier, applied this approach to similar problems, leading to the origin of the field of topology.
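Euler's criterion is easy to check mechanically. Here is a small Python sketch, with the bridge list encoding the historical seven bridges among the four land masses (labeled A through D, with A as the central island):

```python
from collections import Counter

# The seven bridges of Königsberg: the island A connects twice each to the
# banks B and C, and once to D; B and C each connect once to D
bridges = [("A", "B"), ("A", "B"), ("A", "C"), ("A", "C"),
           ("A", "D"), ("B", "D"), ("C", "D")]

# The degree of a land mass is the number of bridge endpoints touching it
degree = Counter()
for u, v in bridges:
    degree[u] += 1
    degree[v] += 1

odd = [node for node, d in degree.items() if d % 2 == 1]

# Euler: a walk crossing every bridge exactly once requires zero or two
# nodes of odd degree (in a connected graph); Königsberg has four
print(dict(degree))  # {'A': 5, 'B': 3, 'C': 3, 'D': 3}
print(len(odd))      # 4 -> no such walk exists
```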
Later, Cayley broadened the approach to study tree structures, which have many implications in theoretical chemistry. By the 20th century, the fusion of ideas coming from mathematics with those coming from chemistry formed the origin of much of the standard terminology of graph theory.
Graph theory forms the core of network science, the applied study of graph structures and networks. Besides graph theory, the field draws on methods including statistical mechanics from physics, data mining and information visualization from computer science, inferential modeling from statistics, and social structure from sociology. Classical problems embraced by this realm include the four color problem of maps, the traveling salesman problem, and the six degrees of Kevin Bacon.
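To give a flavor of such problems, the "six degrees" question reduces to a shortest-path computation, here via breadth-first search over an invented toy acquaintance graph (a minimal sketch, not any particular library's algorithm):

```python
from collections import deque

# A toy undirected acquaintance graph; all names and links are invented
friends = {
    "Alice": ["Bob"],
    "Bob":   ["Alice", "Carol"],
    "Carol": ["Bob", "Kevin"],
    "Kevin": ["Carol"],
}

def separation(graph, start, goal):
    """Breadth-first search: the fewest hops from start to goal."""
    queue, seen = deque([(start, 0)]), {start}
    while queue:
        node, dist = queue.popleft()
        if node == goal:
            return dist
        for neighbor in graph[node]:
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append((neighbor, dist + 1))
    return None  # the two people are not connected at all

print(separation(friends, "Alice", "Kevin"))  # 3 degrees of separation
```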
Graph theory and network science are suitable disciplines for a variety of information structures and many additional classes of problems. The table below lists many of these applicable areas, most with links to still further information from Wikipedia:
| Graph Structures | Graph Problems |
| --- | --- |
| Subgraphs, induced subgraphs, and minors | Search and navigation |
| | Subsumption and unification |
| | Route (path) problems |
| | Visibility graph problems |
Graphs are among the most ubiquitous models of both natural and human-made structures. They can be used to model many types of relations and process dynamics in physical, biological and social systems. Many problems of practical interest can be represented by graphs. This breadth of applicability makes network science and graph theory two of the most critical analytical areas for study and breakthroughs for the foreseeable future. I touch on this more in the concluding section.
Surely the first examples of graph structures were early trade and nomadic routes; consider, for example, the trade routes of the Radhanites, dating from about 870 AD.
It is not surprising that routes such as these, or other physical networks as exemplified by the bridges of Königsberg, were the stimulus for early mathematics and analysis related to efficient use of networks. Minimizing the time to complete a trade circuit or visiting multiple markets efficiently has clear benefits. These economic rationales apply to a wide variety of modern, physical networks, including:
Of course, included in the latter category is the Internet itself. It is the largest graph in existence, with an estimated 2.2 billion users and their devices all connected in one way or another in all parts of the globe.
Graphs and graph theory also have broad applicability to natural systems. For example, graph theory is used extensively to study molecular structures in chemistry and physics. A graph makes a natural model for a molecule, where vertices represent atoms and edges represent bonds. Similarly, in biology or ecology, graphs can readily express such systems as species networks, ecological relationships, migration paths, or the spread of diseases. Graphs are also proper structures for modeling biological and chemical pathways.
Some of the exemplar natural systems that lend themselves to graph structures include:
As with physical networks, a graph representation for natural systems provides real benefits in computer processing and analysis. Once expressed as a graph, all graph algorithms and perspectives from graph theory and network science can be brought to bear. Statistical methods are particularly applicable to representing connections between interacting parts of a system, as well as to representing the physical dynamics of natural systems.
Parallel with the growth of the Internet and Web has been the growth of social networks. Social network analysis (SNA) has arguably been the single most important driver for advances in graph theory and analysis algorithms in recent years. New and interesting problems and challenges — from influence to communities to conflicts — are now being elucidated through techniques pioneered for SNA.
Second only in size to the Internet has been the graph of interactions arising from Facebook. Facebook had about 900 million users as of May 2012, half of whom accessed the service via mobile devices. Facebook famously embraced the graph with its own Open Graph protocol, which makes it easy for users to access and tie into Facebook's social network. The Facebook social graph as of December 2010 was captured in a well-known visualization of its worldwide connections.
The suitability of the graph structure to capture relationships has been a real boon to better understanding of social and community dynamics. Many new concepts have been introduced as the result of SNA, including such things as influence, diversity, centrality, cliques and so forth. (The opening diagram to this article, for example, models centrality, with blue the maximum and red the minimum.)
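Such measures are directly computable. As a minimal sketch with the networkx Python library (a common graph-analysis toolkit; the toy social graph is invented for illustration):

```python
import networkx as nx

# A toy social graph (invented): a hub member connected to several others
G = nx.Graph()
G.add_edges_from([("hub", "a"), ("hub", "b"), ("hub", "c"),
                  ("a", "b"), ("c", "d")])

# Degree centrality: the fraction of other nodes each node directly touches
print(nx.degree_centrality(G))

# Betweenness centrality: how often a node lies on shortest paths between
# other nodes, a common proxy for influence or brokerage
print(nx.betweenness_centrality(G))
```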
Particular areas of social interaction that lend themselves to SNA include:
Entirely new insights have arisen from SNA, including finding terrorist leaders, analyzing prestige, and identifying keystone vendors or suppliers in business ecosystems.
Given the ubiquity of graphs as representations of real systems and networks, it is certainly not surprising to see their use in computer science as a means for information representation. We already saw in the table above the many data structures that can be represented as graphs, but the paradigm has even broader applicability.
The critical breakthroughs have come through using the graph as a basis for data models and logic models. These, in turn, provide the basis for crafting entire graph-based vocabularies and languages. Once such structures are embraced, it is a natural next step to extend the mindset to graph databases as well.
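Once data lives in a graph model, a graph query language follows naturally. As a minimal sketch, again with Python's rdflib (URIs invented for illustration), a SPARQL query matches a pattern against the graph rather than joining tables:

```python
from rdflib import Graph, Namespace

EX = Namespace("http://example.org/")  # invented namespace
g = Graph()
g.add((EX.alice, EX.knows, EX.bob))
g.add((EX.bob, EX.knows, EX.carol))

# The query is a graph pattern; every matching edge binds ?x and ?y
results = g.query("""
    PREFIX ex: <http://example.org/>
    SELECT ?x ?y WHERE { ?x ex:knows ?y . }
""")
for row in results:
    print(row.x, row.y)
```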
Some of the notable information representations that have a graph as their basis include:
A key point of graphs noted earlier was their inherent extensibility. Once graphs are understood as a great basis for representing both logic and data structures, it is a logical next step to see their applicability extend to knowledge representations and knowledge bases as well.
Graph-theoretic methods have proven particularly useful in linguistics, since natural language often lends itself well to discrete structure. So, not only can graphs represent syntactic and compositional structure, but they can also capture the interrelationships of terms and concepts within those languages. The usefulness of graph theory to linguistics is shown by the various knowledge bases such as WordNet (in various languages) and VerbNet.
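WordNet itself can be walked as a graph. A minimal sketch with NLTK's WordNet interface (assuming the wordnet corpus has been downloaded; the chosen synset is just an example):

```python
from nltk.corpus import wordnet as wn
# One-time setup assumed: import nltk; nltk.download("wordnet")

# Each synset is a node; hypernym ("is a kind of") links are directed edges
dog = wn.synset("dog.n.01")

# One root-to-node path through the hypernym graph
path = dog.hypernym_paths()[0]
print([s.name() for s in path])  # from entity.n.01 down to dog.n.01
```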
Domain ontologies are similar structures, capturing the relationships amongst concepts within a given knowledge domain. These are also known as knowledge graphs, and Google has famously just released its graph of entities to the world. Semantic networks and neural networks are similar knowledge representations.
An interactive diagram of the UMBEL knowledge graph, with about 25,000 reference concepts for helping to orient disparate datasets, shows that some of these graph structures can get quite large.
What all of these examples show is the nearly universal applicability of graphs, from the abstract to the physical, from the small to the large, and every gradation between. We also see how basic graph structures and concepts can be built upon with more structure. This breadth points to the many synergies and innovations that may be transferred from diverse fields to advance the usefulness of graph theories.
Despite the many advances that have occurred in graph theory, and the increased attention from social network analysis, many graph problems remain among the hardest in computation. Optimizations, partitioning, mapping, inferencing, traversing and graph structure comparisons remain challenging. And some of these challenges are only growing due to the growth in the size of networks and graphs.
Applying the lessons of the Internet in such areas as non-relational databases, distributed processing, and big data and MapReduce-oriented approaches will help some in this regard. We're learning how to divide and conquer big problems, and we are discovering data and processing architectures more amenable to graph-based problems.
The fact that we have now entered the Age of the Graph also suggests that further scrutiny and attention will lead to more analytic breakthroughs and innovation. We may be in an era of Big Data, but the structure underlying all of it is the graph. And that reality, I predict, will result in accelerated advances in graph theory.
There are many semantic technology terms relevant to the context of a semantic technology installation. Some of these are general terms related to language standards, as well as to ontologies or the dataset concept.
<attribute name, value>
where each element is a key-value pair. The key is the defined attribute and the value may be a reference to another object or a literal string or value. In RDF triple terms, the subject is implied in a key-value pair by nature of the instance record at hand.
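In other words, once the record's subject is made explicit, each key-value pair expands into a full triple. A minimal sketch in plain Python (the record and URIs are invented for illustration):

```python
# An instance record as key-value pairs; the subject is implied by the record
record_uri = "http://example.org/record/42"  # invented subject URI
record = {
    "name": "Example Widget",                # a literal string value
    "partOf": "http://example.org/thing/7",  # a reference to another object
}

# Expanding to RDF-style triples: the implied subject becomes explicit
triples = [(record_uri, attribute, value) for attribute, value in record.items()]
for triple in triples:
    print(triple)
```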