Evolution
AI³
Adaptive Information
Adaptive Innovation
Adaptive Infrastructure
a·dap·tive adj. Showing or having a capacity to make fit for new or special situations; flexible; a successful adjustment.

Blogasbörd (cloud version):
Send Email   Get SIOC Profile   Get FOAF Profile   Syndicate full contents for this site using RSS 20
Main Links
Categories
Calendar
February 2013
S M T W T F S
« Jan    
 12
3456789
10111213141516
17181920212223
2425262728  
Archives
More . . .  
Credits
Blog software courtesy of WordPress Site Meter View Mike's profile on LinkedIn
6395
Search
Date:   May 10, 2011

Deciphering Information Assets Exposing $4.7 Trillion Annually in Undervalued Information

Something strange began to happen with company valuations beginning twenty to thirty years ago. Book values increasingly began to diverge — go lower — from stock prices or acquisition prices. Between 1982 and 1992 the ratio of book value to market value decreased from 62% to 38% for public US companies [1]. The why of this mystery has largely been solved, but what to do about it has not. Significantly, semantic technologies and approaches offer both a rationale and an imperative for how to get the enterprises’ books back in order. In the process, semantics may also provide a basis for more productive management and increased valuations for enterprises as well.

The mystery of diverging value resides in the importance of information in an information economy. Unlike the historical and traditional ways of measuring a company’s assets — based on the tangible factors of labor, capital, land and equipment — information is an intangible asset. As such, it is harder to see, understand and evaluate than other assets. Conventionally, and still the more common accounting practice, intangible assets are divided into goodwill, legal (intellectual property and trade secrets) and competitive (know-how) intangibles. But — given that intangibles now equal or exceed the value of tangible assets in advanced economies — we will focus instead on the information component of these assets.

As used herein, information is taken to be any data that is presented in a form useful to recipients (as contrasted to the more technical definition of Shannon and Weaver [2]). While it is true that the there is always a question of whether the collection or development of information is a cost or represents an investment, that “information” is of growing importance and value to the enterprise is certain.

The importance of this information focus can be demonstrated by two telling facts, which I elaborate below. First, only five to seven percent of existing information is adequately used by most enterprises. And, second, the global value of this information amounts to perhaps a range of $2.0 trillion to $7.4 trillion annually (yes, trillions with a T)! It is frankly unbelievable that assets of such enormous magnitude are so poorly understood, exploited or managed.

Amongst all corporate resources and assets, information is surely the least understood and certainly the least managed. We value what we measure, and measure what we value. To say that we little measure information — its generation, its use (or lack thereof) or its value — means we are attempting to manage our enterprises with one eye closed and one arm tied behind our backs. Semantic approaches offer us one way, perhaps the best way, to bring understanding to this asset and then to leverage its value.

The Seven “Laws” of Information

More than a decade ago Moody and Walsh put forward a seminal paper on the seven “laws” of information [3]. Unlike other assets, information has some unique characteristics that make understanding its importance and valuing it much more difficult than other assets. Since I think it a shame that this excellent paper has received little attention and few citations, let me devote some space to covering these “laws”.

Like traditional factors of production — land, labor, capital — it is critical to understand the nature of this asset of “information”. As the laws below make clear, the nature of “information” is totally unique with respect to other factors of production. Note I have taken some liberty and done some updating on the wording and emphasis of the Moody and Walsh “laws” to accommodate recent learnings and understandings.

Law #1: Information Is (Infinitely) Shareable

Information is not friable and can not be depleted. Using or consuming it has no direct affect on others using or consuming it and only using portions of it does not undermine the whole of it. Using it does not cause a degradation or loss of function from its original state. Indeed, information is actually not “consumed” at all (in the conventional sense of the term); rather, it is “shared”. And, absent other barriers, information can be shared infinitely. The access and
use to information on the Web demonstrates this truth daily.

Thus, perhaps the most salient characteristic of information as an asset is that it can be shared between any number of people, business areas and organizations without loss of value to any party (absent the importance of confidentiality or secrecy, which is another information factor altogether). The sharability or maintenance of value irrespective of use makes information quite different to how other assets behave. There is no dilution from use. As Moody and Walsh put it, “from the firm’s perspective, value is therefore cumulative rather than apportioned across different users.”

In practice, however, this very uniqueness is also a cause of other organizational challenges. Both personal and institutional barriers get erected to limit sharing since “knowledge is power.” One perverse effect of information hoarding or lack of institutional support for sharing is to force the development anew of similar information. When not shared, existing information becomes a cost, and one that can get duplicated many times over.

Law #2: The Value of Information Increases With Use

Most resources degrade with use, such as equipment wearing out. In contrast, the per unit value of information increases with use. The major cost of information is in its capture, storage and maintenance. The actual variable costs of using the information (particularly digital information) is, in essence, zero. Thus, with greater use, the per unit cost of information drops.

There is a corollary to this that also goes to the heart of the question of information as an asset. From an accounting point of view, something can only be an asset if it provides future economic value. If information is not used, it cannot possibly result in such benefits and is therefore not an asset. Unused information is really a liability, because no value is extracted from it. In such cases the costs of capture, storage and maintenance are incurred, but with no realized
value. Without use, information is solely a cost to the enterprise.

The additional corollary is that awareness of the information’s existence is an essential requirement in order to obtain this value. As Moody and Walsh state, “information is at its highest ‘potential’ when everyone in the organization knows where it is, has access to it and knows how to use it. Information is at its lowest ‘potential’ when people don’t even know it is there.”

A still further corollary is the importance of information literacy. Awareness of information without an understanding of where it fits or how to take advantage of it also means its value is hidden to potential users. Thus, in addition to awareness, training and documentation are important factors to help ensure adequate use. Both of these factors
may seem like additional costs to the enterprise beyond capture, storage and maintenance, but — without them — no or little value will be leveraged and the information will remain a sunk cost.

Law #3: Information is Perishable

Like most other assets, the value of information tends to depreciate over time [4]. Some information has a short shelf life (such as Web visitations); other has a long shelf life (patents, contracts and many trade secrets). Proper valuation of information should take into account such differences in operational life, analysis or decision life, and statutory life. Operational shelf life tends to be the shortest.

In these regards, information is not too dissimilar from other asset types. The most important point is to be cognizant of use and shelf differences amongst different kinds of information. This consideration is also traded off against the declining costs of digital information storage.

Law #4: The Value of Information Increases With Accuracy

A standard dictum is that the value of information increases with accuracy. The caveat, however, is that some information, because it is not operationally dependent or critical to the strategic interests of the firm, actually can become a cost when capture costs exceed value. Understanding such Pareto principles is an important criterion in evaluating information approaches. Generally, information closest to the transactional or business purpose of the organization will demand higher accuracy.

Such statements may sound like platitudes — and are — in the absence of an understanding of information dependencies within the firm. But, when certain kinds of information are critical to the enterprise, it is just as important to know the accuracy of the information feeding that “engine” as it is for oil changes or maintenance schedules for physical engines. Thus an understanding of accuracy requirements in information should be a deliberate management focus for critical business functions. It is the rare firm that attends to such imperatives today.

Law #5: The Value of Information Increases in Combination

A unique contribution from semantic approaches — and perhaps the one resulting in the highest valuation benefit — arises from the increased value of connecting the information. We have come to understand this intimately as the “network effect” from interconnected nodes on a network. It also arises when existing information is connected as well.

Today’s enterprise information environment is often described by many as unconnected “silos”. Scattered databases and spreadsheets and other information repositories litter the information landscape. Not only are these sources unconnected and isolated, but they also may describe similar information in different and inconsistent ways.

As I have described previously in The Law of Linked Data [5], existing information can act as nodes that — once connected to one another — tend to produce a similar network effect to what physical networks exhibit with increasing numbers of users. Of course, the nature of the information that is being connected and its centrality to the mission of the enterprise will greatly affect the value of new connections. But, based on evidence to date, the value of information appears to go up somewhere between a quadratic and exponential function for the number of new connections. This value is especially evident in know-how and competitive areas.

Law #6: More Is Not Necessarily Better

Information overload is a well-known problem. On the other hand, lack of appropriate information is also a compelling problem. The question of information is thus one of relevancy. Too much irrelevant information is a bad thing, as is too little relevant information.

These observations lead to two use considerations. First, means to understand and focus information capture on relevant information is critical. And, second, information management systems should be purposefully designed with user interfaces for easy filtering of irrelevant information.

The latter point sounds straightforward, but, in actuality, requires a semantic underpinning to the enterprise’s information assets. This requirement is because relevancy is in the eye of the beholder, and different users have different terms, perspectives, and world views by which information evaluation occurs. In order for useful filtering, information must be presented in similar terms and perspectives relevant to those users. Since multiple studies affirm that information decision-makers seek more information beyond their overload points [3], it is thus more useful to provide relevant access and filtering methods that can be tailored by user rather than “top down” information restrictions.

Law #7: Information is Self-propagating

With access and connections, information tends to beget more information. This propagation results from summations, analysis, unique combinations and other ways that basic datum get recombined into new datum. Thus, while the first law noted that information can not be consumed (or depleted) by virtue of its use, we can also say that information tends to reproduce and expand itself via use and inspection.

Indeed, knowledge itself is the result of how information in its native state can be combined and re-organized to derive new insights. From a valuation standpoint, it is this very understanding that leads to such things as competitive intelligence or new know-how. In combination with insights from connections, this propagating factor of information is the other leading source of intangible asset valuations.

This law also points to the fact that information per se is not a scarce resource. (Though its availability may be scarce.) Once available, techniques like data mining, analysis, visualization and so forth can be rich sources for generating new information from existing holdings of data.

Information as an Asset and How to Value

These “laws” — or perspectives — help to frame the imperatives for how to judge information as an asset and its resulting value. The methodological and conceptual issues of how to explicitly account for information on a company’s books are, of course, matters best left to economists and professional accountants. With the growing share of information in relation to intangible assets, this would appear to be a matter of great importance to national policy. Accounting for R&D efforts as an asset versus a cost, for example, has been estimated to add on the order of 11 percent to US national GDP estimates [9].

The mere generation of information is not necessarily an asset, as the “laws” above indicate. Some of the information has no value and some indeed represents a net sunk cost. What we can say, however, is that valuable information that is created by the enterprise but remains unused or is duplicated means that what was potentially an asset has now been turned into a cost — sometimes a cost repeated many-fold.

Information that is used is an asset, intangible or not. Here, depending on the nature of the information and its use, it can be valued on the basis of cost (historical cost or what it cost to develop it), market value (what others will pay for it), or utility (what is its present value as benefits accrue into the future). Traditionally the historical cost method has been applied to information. Yet, since information can both be sold and still retained by the organization, it may have both market value and utility value, with its total value being the sum.

In looking at these factors, Moody and Walsh propose a number of new guidelines in keeping with the “laws” noted above [3]:

  • Operational information should be measured as the cost of collection using data entry costs
  • Management information should be valued based on what it cost to extract the data from operational systems
  • Redundant data should be considered to have zero value (Law #1)
  • Unused data should be considered to have zero value (Law #2)
  • The number of users and number of accesses to the data should be used to multiply the value of the information (Law #2). When information is used for the first time, it should be valued at its cost of collection; subsequent uses should add to this value (perhaps on a depreciated basis; see below)
  • The value of information should be depreciated based on its “shelf life” (Law #3)
  • The value of information should be discounted by its accuracy relative to what is considered to be acceptable (Law #4)
  • And, as an added factor, information that is effectively linked or combined should have its value multiplied (Law #5), though the actual multiplier may be uncertain [5].

The net result of thinking about information in this more purposeful way is to encourage more accurate valuation methods, and to provide incentives for more use and re-use, particularly in combined ways. Such methods can also help distinguish what information is of more value to the organization, and therefore worthy of more attention and investment.

The Growing Importance of Intangible Information

The emerging discrepancy between market capitalizations and book values began to get concerted academic attention in the 1990s. To be sure, perceptions by the market and of future earnings potential can always  color these differences. The simple occurrence of a discrepancy is not itself proof of erroneous or inaccurate valuations. (And, the corollary is that the degree of the discrepancy is not sufficient alone to estimate the intangible asset “gap”, a logical error made by many proponents.) But, the fact that these discrepancies had been growing and appeared to be based (in part) on structural changes linked to intangibles was creating attention.

Leonard Nakamura, an economist with the Federal Reserve Board in Philadelphia, published a working paper in 2001 entitled, “What is the U.S. Gross investment in Intangibles?  (At Least) One Trillion Dollars a Year!” [6]. This was one of the first attempts to measure intangible investments, which he defined as private expenditures on assets that are intangible and necessary to the creation and sale of new or improved products and processes, including designs, software, blueprints, ideas, artistic expressions, recipes, and the like. Nakamura acknowledged his work as being preliminary. But he did find direct and indirect empirical evidence to show that US private firms were investing at least $1 trillion annually (as of 2000, the basis year for the data) in intangible assets.  Private expenditures, labor and corporate operating margins were the three measurement methods.  The study also suggested that the capital stock of intangibles in the US has an equilibrium market value of at least $5 trillion.

Another key group — Carol Corrado, Charles Hulten, and Daniel Sichel, known as “CHS” across their many studies — also began to systematically evaluate the extent and basis for intangible assets and its discrepancy [7].  They estimated that spending on long-lasting knowledge capital — not just intangibles broadly — grew relative to other major components of aggregate demand during the 1990s. CHS was the first to show that by the turn of the millenium that fixed US investment in intangibles was at least as large as business investment in traditional, tangible capital.

By later in the decade, Nakamura was able to gather and analyze time series data that showed the steady increase in the contributions of intangibles [8]:


Trends in Intangibles Values

One can see the cross-over point late in the decade. Investment in intangibles he now estimates to be on the order of 8% to 10% of GDP annually in the US.

Roughly at the same time the National Academies in the US was commissioned to investigate the policy questions of intangible assets. The resulting major study [9] contains much relevant information. But it, too, contained an update by CHS on their slightly different approach to analyzing the growing role of intangible assets:


Trends in Intangibles Values - Version 2

This CHS analysis shows similar trends to what Nakamura found, though the degree of intangible contributions is estimated as higher (~14% of annual GDP today), with investments in intangibles exceeding tangible assets somewhat earlier.

Surveys of more than 5,000 companies in 25 companies confirmed these trends from a different perspective, and also showed that most of these assets did not get reflected in financial statements. A large portion of this value was due to “brands” and other market intangibles [10]. The total “undisclosed” portion appeared to equal or exceed total
reported assets. Figures for the US indicated there might be a cumulative basis of intangible assets of $9.2 trillion [11].

In parallel, these groups and others began to decompose the intangible asset growth by country, sector, or asset type. The specific component of “information” received a great deal of attention. Uday Apte, Uday Karmarkar and Hiranya Nath, in particular, conducted a couple of important studies during this decade [12,13]. For example, they found nearly two-thirds of recent US GDP was due to information or knowledge industry contributions, a percentage that had been growing over time. They also found that a secondary sector of information internal to firms itself constituted well over 40% of the information economy, or some 28% of the entire economy. So the information activities internal to organizations and institutions represent a very large part of the economy.

The specific components that can constitute the informational portion of intangible assets has also been looked at by many investigators, importantly including key accounting groups. FASB, for example, has specific guidance on treatment of intangible assets in SFAS 141 [14]. Two-thirds of the 90 specific intangible items listed by the American Institute of Certified Public Accountants are directly related to information (as opposed to contracts, brands or goodwill), as shown in [15]. There has also been some good analysis by CHS on breakdowns by intangible assets categories [16]. There are also considerable differences by country on various aspects of these measures (for example, [10]). For example, according to OECD figures from 2002, expenditures for knowledge (R&D, education and software) ranged from nearly 7 percent (Sweden) to below 2 percent (Greece) in OECD countries, with the average of about 4 percent and the US at over 6 percent [17].

. . . Plus Too Much Information Goes Unused

The common view is that a typical organization only uses 5 to 7 percent of the information it already has on hand [18], and 20% to 25% of a knowledge worker’s time is spent simply trying to find information [19]. To probe these issues more deeply, I began a series of analyses in 2004 looking at how much money was being spent on preparing documents within US companies, and how much of that investment was being wasted or not re-used [20]. One key finding from that study was that the information within documents in the US represent about a third of total gross domestic product, or an amount equal at the time of the study to about $3.3 trillion annually (in 2010 figures, that would be closer to $4.7 trillion). This level of investment is consistent with the results of Apte et al. and others as noted above.

However, for various reasons — mostly due to lack of awareness and re-use — some 25% of those trillions of dollar spent annually on document creation costs are wasted. If we could just find the information and re-use it, massive benefits could accrue, as these breakdowns in key areas show:

U.S. FIRMS $ Million %
Cost to Create Documents $3,261,091
Benefits to Finding Missed or Overlooked Documents $489,164 63%
Benefits to Improved Document Access $81,360 10%
Benefits of Re-finding Web Documents $32,967 4%
Benefits of Proposal Preparation and Wins $6,798 1%
Benefits of Paperwork Requirements and Compliance $119,868 15%
Benefits of Reducing Unauthorized Disclosures $51,187 7%
Total Annual Benefits $781,314 100%
PER LARGE FIRM $ Million
Cost to Create Documents $955.6
Benefits to Finding Missed or Overlooked Documents $143.3
Benefits to Improving Document Access $23.8
Benefits of Re-finding Web Documents $9.7
Benefits of Proposal Preparation and Wins $2.0
Benefits of Paperwork Requirements and Compliance $35.1
Benefits of Reducing Unauthorized Disclosures $15.0
Total Annual Benefits $229.0

Table. Mid-range Estimates for the Annual Value of Documents, U.S. Firms, 2002 [20]

The total benefit from improved document access and use to the U.S economy is on the order of 8% of GDP. For the 1,000 largest U.S. firms, benefits from these improvements can approach nearly $250 million annually per firm (2002 basis). About three-quarters of these benefits arise from not re-creating the intellectual capital already invested in prior document creation. About one-quarter of the benefits are due to reduced regulatory non-compliance or paperwork, or better competitiveness in obtaining solicited grants and contracts.

This overall value of document use and creation is quite in line with the analyses of intangible assets noted above, and which arose from totally different analytical bases and data. This triangulation brings confidence that true trends in the growing importance of information have been identified.

How Big is the Information Asset Gap?

These various estimates can now be combined to provide an assessment of just how large the “gap” is for the overlooked accounting and use of information assets:

GDP ($T) Intangible % Info Contrib % Info Assets ($T) Unused Info ($T) Total ($T)
Lo Hi Lo Hi Lo Hi Lo Hi Lo Hi
US $14.72 9% 14% 33% 67% $0.44 $1.38 $0.30 $1.21 $0.74 $2.60
European Union $15.25 8% 12% 33% 50% $0.40 $0.92 $0.31 $1.26 $0.72 $2.17
Remaining Advanced $10.17 8% 12% 33% 50% $0.27 $0.61 $0.21 $0.84 $0.48 $1.45
Rest of World $34.32 2% 6% 10% 25% $0.07 $0.51 $0.00 $0.71 $0.07 $1.22
Total $74.46 $1.18 $3.42 $0.83 $4.02 $2.00 $7.44
Notes (see endnotes) [21] [22] [23]

Depending, these estimates can either be viewed as being too optimistic about the importance of information assets [25] or too conservative [26]. The breadth of the ranges of these values is itself an expression of the uncertainty in the numbers and the analysis.

The analysis shows that, globally, the value of unused and unaccounted information assets may be on the order of  $2.0 trillion to $7.4 trillion annually, with a mid-range value of $4.7 trillion. Even considering uncertainties, these are huge, huge numbers by any account. For the US alone, this range is $750 billion to $2.6 trillion annually. The analysis from the prior studies [20] would strongly suggest the higher end of this range is more likely than the lower. Similarly large gaps likely occur within the European Union and within other advanced nations. For individual firms, depending on size, the benefits of understanding and closing these gaps can readily be measured in the millions to billions [27].

At the high end, these estimates suggest that perhaps as much as 10% of global expenditures is wasted and unaccounted for due to information-related activities. This is roughly equivalent to adding a half of the US economy to the global picture.

In the concluding section, we touch on why such huge holes may appear in the world’s financial books. Clearly, though, even with uncertain and heroic assumptions, the magnitude of this gap is huge, with compelling needs to understand and close it as soon as possible.

The Relationship to Semantic Technologies

The seven Moody and Walsh information “laws” provide the clues to the reasons why we are not properly accounting for information and why we inadequately use it:

  • We don’t know what information we have and can not find it
  • What we have we don’t connect
  • We misallocate resources for generating, capturing and storing information, because we don’t understand its value and potential
  • We don’t manage the use of information or its re-use
  • We duplicate efforts
  • We inadequately leverage what information we have and miss valuable (that is, can be “valuated”) insights that could be gained.

Fundamentally, because information is not understood in our bones as central to the well-being of our enterprises, we continue to view the generation, capture and maintenance of information as a “cost” and not an “asset”.

I have maintained for some time an interactive information timeline [28] that attempts to encompass the entire human history of information innovations. For tens of thousands of years steady — yet slow — progress in the ways to express and manage information can be seen in this timeline. But, then, beginning with electricity and then digitization, the pace of innovation explodes.

The same timeframe that sees the importance of intangible assets appear on national and firm accounts is when we see the full digitization of information and its ability to be communicated and linked over digital networks. A very insightful figure by Rama Hoetzlein for his thesis in 2007, which I have modified and enhanced, captures this evolution with some estimated dates as is shown below (click to expand) [29]:


Knowledge System Trends throughout History

The first insight this figure provides is that all forms of information are now available in digital form. This includes unstructured (images and documents), semi-structured (mark-up and “tagged” information) and structured (database and spreadsheet) information. This information can now be stored and communicated over digital networks with broadly accepted protocols.

But the most salient insight is that we now have the means through semantic technologies and approaches to interrelate all of this information together. Tagging and extraction methods enable us to generate metadata for unstructured documents and content. Data models based on predicate logic and semantic logics give us the flexible means to express the relationships and connections between information. And all of this can be stored and manipulated through graph-based datastores and languages such that we can draw inferences and gain insights. Plus, since all of this is now accessible via the Web and browsers, virtually any user can access, use and leverage this information.

This figure and its dates not only shows where we have come as a species in our use and sophistication with information, but how we need to bring it all together using semantics to complete our transition to a knowledge economy.

The very same metadata and semantic tagging capabilities that enable us to interrelate the information itself also provides the techniques by which we can monitor and track usage and provenance. It is through these additional semantic methods that we can finally begin to gain insight as to what information is of what value and to whom. Tapping this information will complete the circle for how we can also begin to properly valuate and then manage and optimize our information assets.

Conclusion

With our transition to an information economy, we now see that intangible assets exceed the value of tangible ones. We see that the information component of these intangibles represent one-third to two-thirds of these intangibles. In other words, information makes up from 17% to more than one-third of an individual firm’s value in modern economies. Further, we see that at least 25% of firm expenditures on information is wasted, keeping it as a cost and negating its value as an asset.

The “factories” of the modern information economy no longer produce pins with the fixed inputs of labor and capital as in the time of Adam Smith. They rather produce information and knowledge and know-how. Yet our management and accounting systems seem fixed in the techniques of yesteryear. The quaint idea of total factor productivity as a “residual” merely belies our ignorance about the causes of economic growth and firm value. These are issues that should rightly occupy the attention of practitioners in the disciplines of accounting and management.

Why industrial-era accounting methods have been maintained in the present information age is for students of corporate power politics to debate. It should suffice to remind us that when industrialization induced a shift from the extraction of funds from feudal land possessions to earning profits on invested capital, most of the assumptions about how to measure performance had to change. When the expenses for acquiring information capabilities cease to be an arbitrary budget allocation and become the means for gaining Knowledge Capital, much of what is presently accepted as management of information will have to shift from a largely technological view of efficiency to an asset management perspective [30].

Accounting methods grounded in the early 1800s that are premised on only capital assets as the means to increase the productivity of labor no longer work. Our engines of innovation are not physical devices, but ideas, innovation and knowledge; in short, information. Capable executives recognize these trends, but have yet to change management practices to address them [31].

As managers and executives of firms we need not await wholesale modernization of accounting practices to begin to make a difference. The first step is to understand the role, use and importance of information to our organizations. Looking clearly at the seven information “laws” and what that means about tracking and monitoring is an immediate way to take this step. The second step is to understand and evaluate seriously the prospects for semantic approaches to make a difference today.

We have now sufficiently climbed the data federation pyramid [32] to where all of our information assets are digital; we have network protocols to link it; we have natural language and extraction techniques for making documents first-class citizens along side structured data; and we have logical data models and sound semantic technologies for tying it all together.

We need to reorganize our “factory” floors around these principles, just as prime movers and unit electric drives altered our factories of the past. We need to reorganize and re-think our work processes and what we measure and value to compete in the 21st century. It is time to treat information as seriously as it has become an integral part of our enterprises. Semantic technologies and approaches provide just the path to do so.


[1] Baruch Lev and Jürgen H. Daum, 2003. “Intangible Assets and the Need for a Holistic and More Future-oriented Approach to Enterprise Management and Corporate Reporting,” prepared for the 2003 PMA Intellectual Capital Symposium, 2nd October 2003, Cranfield Management Development Centre, Cranfield University, UK; see http://www.juergendaum.de/articles/pma_ic_symp_jdaum_final.pdf.
[2] Claude E. Shannon and Warren Weaver, 1949. The Mathematical Theory of Communication. The University of Illinois Press, Urbana, Illinois, 1949. ISBN 0-252-72548-4.
[3] Daniel Moody and Peter Walsh, 1999. “Measuring The Value Of Information: An Asset Valuation Approach,” paper presented at the Seventh European Conference on Information Systems (ECIS’99), Copenhagen Business School, Frederiksberg, Denmark, 23-25 June, 1999. See http://wwwinfo.deis.unical.it/zumpano/2004-2005/PSI/lezione2/ValueOfInformation.pdf. A precursor paper that is also quite helpful and cited much in Moody and Walsh is R. Glazer, 199. “Measuring the Value of Information: The Information Intensive Organisation”, IBM Systems Journal, Vol 32, No 1, 1993.
[4] Some trade secrets could buck this trend if the value of the underlying enterprise that relies on them increases.
[5] M.K. Bergman, 2009. “The Law of Linked Data,” post in AI3:::Adaptive Information blog, October 11, 2009. See http://www.mkbergman.com/837/the-law-of-linked-data/.
[6]  Leonard Nakamura, 2001. What is the U.S. Gross Investment in Intangibles?  (At Least) One Trillion Dollars a Year!,
Working Paper No. 01-15, Federal Reserve Bank of Philadelphia, October 2001; see http://www.phil.frb.org/files/wps/2001/wp01-15.pdf.
[7] Carol A. Corrado, Charles R. Hulten, and Daniel E. Sichel, 2004. Measuring Capital and Technology: An Expanded Framework. Federal Reserve Board, August 2004. http://www.federalreserve.gov/pubs/feds/2004/200465/200465pap.pdf.
[8] Leonard I. Nakamura, 2009. Intangible Assets and National Income Accounting: Measuring a Scientific Revolution, Working Paper No. 09-11, Federal Reserve Bank of Philadelphia, May 8, 2009; see http://www.philadelphiafed.org/research-and-data/publications/working-papers/2009/wp09-11.pdf.
[9] Christopher Mackie, Rapporteur, 2009. Intangible Assets: Measuring and Enhancing Their Contribution to Corporate Value and Economic Growth: Summary of a Workshop, prepared by the Board on Science, Technology, and Economic Policy (STEP) Committee on National Statistics (CNSTAT), ISBN: 0-309-14415-9, 124 pages; see http://www.nap.edu/openbook.php?record_id=1274 (available for PDF download with sign-in).
[10] Brand Finance, 2006. Global Intangible Tracker 2006: An Annual Review of the World’s Intangible Value, paper published by Brand Finance and The Institute of Practitioners in Advertising, London, UK, December 2006. See  http://www.brandfinance.com/images/upload/9.pdf.
[11] Kenan Patrick Jarboe and Roland Furrow, 2008. Intangible Asset Monetization: The Promise and the Reality, Working Paper #03 from the Athena Alliance, April 2008. See http://www.athenaalliance.org/pdf/IntangibleAssetMonetization.pdf.
[12] Uday M. Apte and Hiranya K. Nath, 2004, Size, Structure and Growth of the US Information Economy,” UCLA Anderson School of Management on Business and Information Technologies, December 2004; see  http://www.anderson.ucla.edu/documents/areas/ctr/bit/ApteNath.pdf.pdf.
[13] Uday M. Apte, Uday S. Karmarkar and Hiranya K Nath, 2008. “Information Services in the US Economy: Value, Jobs and Management Implications,” California Management Review, Vol. 50, No.3, 12-30, 2008.
[14] See the Financial Accounting Standards Board—SFAS 141; see http://www.gasb.org/pdf/fas141r.pdf.
[15] See further, AICPA Special Committee on Financial Reporting, 1994. Improving Business Reporting—A Customer Focus: Meeting the Information Needs of Investors and Creditors. See  http://www.aicpa.org/InterestAreas/AccountingAndAuditing/Resources/EBR/DownloadableDocuments/Jenkins%20Committee%20Report.pdf

Blueprints  

Book libraries

Broadcast licenses

Buy-sell agreements

Certificates of need

Chemical formulas

Computer software

Computerized databases

Contracts

Cooperative agreements

Copyrights

Credit information files

Customer contracts

Customer and client lists

Customer relationships

Designs and drawings  

Development rights

Employment contracts

Engineering drawings

Environmental rights

Film libraries

Food flavorings and recipes

Franchise agreements

Historical documents

Heath maintenance organization enrollment lists

Know-how

Laboratory notebooks

Literary works

Management contracts

Manual databases

Manuscripts  

Medical charts and records

Musical compositions

Newspaper morgue files

Noncompete covenants

Patent applications

Patents (both product and process)

Patterns

Prescription drug files

Prizes and awards

Procedural manuals

Product designs

Proposals outstanding

Proprietary computer software

Proprietary processes

Proprietary products  

Proprietary technology

Publications

Royalty agreements

Schematics and diagrams

Shareholder agreements

Solicitation rights

Subscription lists

Supplier contracts

Technical and specialty libraries

Technical documentation

Technology-sharing agreements

Trade secrets

Trained and assembled workforce

Training manuals

[16] See, for example, Carol Corrado, Charles Hulten and Daniel Sichel, 2009. “Intangible Capital and U.S. Economic Growth,” Review of Income and Wealth Series 55, Number 3, September 2009; see http://www.conference-board.org/pdf_free/IntangibleCapital_USEconomy.pdf.
[17] As stated in Kenan Patrick Jarboe, 2007. Measuring Intangibles: A Summary of Recent Activity, Working Paper #02 from the Athena Alliance, April 2007. See http://www.athenaalliance.org/pdf/MeasuringIntangibles.pdf.
[18] The 5% estimate comes from Graham G. Rong, Chair at MIT Sloan CIO Symposium, as reported in the SemanticWeb.com on May 5, 2011. (Rong also touted the use of semantic technologies to overcome this lack of use.) A similar 7% estimate comes from Pushpak Sarkar, 2002. “Information Quality in the Knowledge-Driven Enterprise,” InfoManagement Direct, November 2002. See http://www.information-management.com/infodirect/20021115/6045-1.html.
[19] M.K. Bergman, 2005. “Search and the ’25% Solution’,” AI3:::Adaptive Innovation blog, September 14, 2005. See http://www.mkbergman.com/121/search-and-the-25-solution/.
[20] M.K. Bergman, 2005.  “Untapped Assets: the $3 Trillion Value of U.S. Documents,” BrightPlanet Corporation White Paper, July 2005, 42 pp. Also available  online and in PDF.
[21] From the CIA, 2011. The World Factbook; accessed online at  https://www.cia.gov/library/publications/the-world-factbook/index.html on May 9, 2011. The “remaining advanced” countries are Australia, Canada, Iceland, Israel, Japan, Liechtenstein, Monaco, New Zealand, Norway, Puerto Rico, Singapore. South Korea, Switzerland, Taiwan.
[22] The range of estimates is drawn from the Nakamura [8] and CHS [9] studies, with each respectively providing the lower and upper bounds. These values have been slightly decremented for non-US advanced countries, and greatly reduced for non-advanced ones.
[23] The high range is based on the categorical share of intangible asset categories (60 of 90) from the AIPCA work [15]; the lower range is from the one-third of GDP estimates from [20].These values have been slightly decremented for non-US advanced countries, and greatly reduced for non-advanced ones.
[24] For unused information assets, the high range is based on the one-third of GDP and 25% “waste” estimates from [20]; the low range halves each of those figures. These values have been slightly decremented for non-US advanced countries, and greatly reduced for non-advanced ones (and zero for the low range).
[25] Reasons for the estimates to be too optimistic are information as important as goodwill; branding; intellectual basis of cited resources is indeed real; considerable differences by country and sector (see [10] and [16]).
[26] Reasons for the estimates to be too conservative: no network effects; greatly discounted non-advanced countries; share is growing (but older estimates used); considerable differences by country and sector (see [10] and [16]).
[27] For some discussion of individual firm impacts and use cases see [10] and [20], among others.
[28] See the Timeline of Information History, and its supporting documentation at M.K. Bergman, 2008. “Announcing the ‘Innovations in Information’ Timeline,” AI3:::Adaptive Information blog, July 6, 2008; see  http://www.mkbergman.com/421/announcing-the-innovations-in-information-timeline/.
[29] This figure is a modification of the original published by Rama C. Hoetzlein, 2007. Quanta – The Organization of Human Knowedge: Systems for Interdisciplinary Research, Master’s Thesis, University of California Santa Barbara, June 2007; see http://www.rchoetzlein.com/quanta/ (p 112). I adapted this figure to add logics, data and metadata to the basic approach, with color coding also added.
[30] From Paul A. Strassmann, 1998. “The Value of Knowledge Capital,” American Programmer, March 1998. See  http://www.strassmann.com/pubs/valuekc/.
[31] For example, according to [11], in a 2003 Accenture survey of senior managers across industries, 49 percent of respondents said that intangible assets are their primary focus for delivering long-term shareholder value, but only 5 percent stated that they had an organized system to track the performance of these assets. Also, according sources cited in Gio Wiederhold, Shirley Tessler, Amar Gupta and David Branson Smith, 2009. “The Valuation of Technology-Based Intellectual Property in Offshoring Decisions,” in Communications for the Association of Information Systems (CAIS) 24, May 2009 (see http://ilpubs.stanford.edu:8090/951/2/Article_07-270.pdf): Owners and stockholders acknowledge that IP valuation of technological assets is not routine within many organizations. A 2007 study performed by Micro Focus and INSEAD highlights the current state of affairs: Of the 250 chief information officers (CIOs) and chief finance officers (CFOs) surveyed from companies in the U.S., UK, France, Germany, and Italy, less than 50 percent had attempted to value their IT assets, and more than 60 percent did not assess the value of their software.
[32] M.K. Bergman, 2006. “Climbing the Data Federation Pyramid,” AI3:::Adaptive Information blog, May 25, 2006; see  http://www.mkbergman.com/229/climbing-the-data-federation-pyramid/.

Posted by AI3's author, Mike Bergman

Posted on May 10, 2011 at 3:13 pm in Adaptive Information, Adaptive Innovation, Semantic Enterprise, Semantic Web | Comments (2)
The URI link reference to this post is: http://www.mkbergman.com/958/leveraging-intangible-assets-using-semantic-technologies/
The URI to trackback this post is: http://www.mkbergman.com/958/leveraging-intangible-assets-using-semantic-technologies/trackback/
Date:   April 25, 2011

Advances in How to Transfer Semantic Technologies to Enterprise Users structWFS

For some time, our mantra at Structured Dynamics has been, “We’re successful when we are not needed. [1]

In support of this vision, we have been key developers of an entire stack of semantic technologies useful to the enterprise, the open semantic framework (OSF); we have formulated and contributed significant open source deployment guidance to the MIKE2.0 methodology for semantic technologies in the enterprise called Open SEAS; we have developed useful structured data standards and ontologies; and we have made massive numbers of free how-to documents and images available for download on our TechWiki. Today, we add further to these contributions with our workflows guidance. All of these pieces contribute to what we call the total open solution.

Prior documentation has described the overall architecture or layered approach of the open semantic framework (OSF). Those documents are useful, but lack a practical understanding of how the pieces fit together or how an OSF instance is developed and maintained.

This new summary overviews a series of seven different workflows for various aspects of developing and maintaining an OSF (based on Drupal) [2]. In addition, each workflow section also cross-references other key documentation on the TechWiki, as well as points to possible tools that might be used for conducting each specific workflow.

Overview

Seven different workflows are described, as shown in the diagram below. Each of the workflows is color-coded and related to the other workflows. The basic interaction with an OSF instance tends to occur from left-to-right in the diagram, though the individual parts are not absolutely sequential. As each of the seven specific workflows is described below, it is keyed by the same color-coded portion of the overall workflow.

OSF Workflow

Each of the component workflows is itself described as a series of inter-relating activities or tasks.

Installation Workflow

Installation is mostly a one-time effort and proceeds in a more-or-less sequential basis. As various components of the stack are installed, they are then configured and tested for proper installation.

The installation guide is the governing document for this process, with quite detailed scripts and configuration tests to follow. The blue bubbles in the diagram represent the major open source software components of Virtuoso (RDF triple store), Solr (full-text search) and Drupal (content management system).

Install Workflow

Another portion of this workflow is to set up the tools for the backoffice access and management, such as PuTTY and WinSCP (among others).

Click here to see the tools associated with this workflow sequence, as described in the TechWiki desktop tools document.

Configure & Presentation Workflow

One of the most significant efforts in the overall OSF process is the configuration and theming of the host portal, generally based on Drupal.

The three major clusters of effort in this workflow are the design of the portal, including a determination of its intended functionality; the setting of the content structure (stubbing of the site map) for the portal; and determining user groups and access rights. Each of these, in turn, is dependent on one or more plug-in modules to the Drupal system.

Some of these modules are part of the conStruct series of OSF modules, and others are evaluated and drawn from the more than 8000 third-party plug-in modules to Drupal.

Configure Workflow

The Design aspect involves picking and then modifying a theme for the portal. These may start as one of the open source existing Drupal themes, as well as those more specifically recommended for OSF. If so, it will likely be necessary to do some minor layout modifications on the PHP code and some CSS (styling) changes. Theming (skinning) of the various semantic component widgets (see below) also occurs as part of this workflow.

The Content Structure aspect involves defining and then stubbing out placeholders for eventual content. Think of this step as creating a site map structure for the OSF site, including major Drupal definitions for blocks, Views and menus. Some of the entity types are derived from the named entity dictionaries used by a given project.

More complicated User assignments and groups are best handled through a module such as Drupal’s Organic Groups. In any event, determination of user groups (such as anonymous, admins, curators, editors, etc.) is a necessary early determination, though these may be changed or modified over time.

For site functionality, Modules must be evaluated and chosen to add to the core system. Some of these steps and their configuration settings are provided in the guidelines for setting up Drupal document.

None of the initial decisions “lock in” eventual design and functionality. These may be modified at any time moving forward.

Click here to see the tools associated with this workflow sequence, as described in the TechWiki desktop tools document.

Structured Data Workflow

Of course, a key aspect of any OSF instance is the access and management of structured data.

There are basically two paths for getting structured data into the system. The first, involving (generally) smaller datasets is the manual conversion of the source data to one of the pre-configured OSF import formats of RDF, JSON, XML or CSV. These are based on the irON notation; a good case study for using spreadsheets is also available.

The second path (bottom branch) is the conversion of internal structured data, often from a relational data store. Various converters and templates are available for these transformations. One excellent tool is FME from Safe Software (representing the example shown utilizing a spatial data infrastructure (SDI) data store), though a very large number of options exist for extract, transform and load.

In the latter case, procedures for polling for updates, triggering notice of updates, and only extracting the deltas for the specific information changed can help reduce network traffic and upload/conversion/indexing times.

Structured Data Workflow
Click here to see the tools associated with this workflow sequence, as described in the TechWiki desktop tools document.

Content Workflow

The structured data from the prior workflow process is then matched with the remaining necessary content for the site. This content may be of any form and media (since all are supported by various Drupal modules), but, in general, the major emphasis is on text content.

Existing text content may be imported to the portal or new content can be added via various WYSIWYG graphical editors for use within Drupal. (The excellent WYWIWYG Drupal module provides an access point to a variety of off-the-shelf, free WYSIWYG editors; we generally use TinyMCE but multiples can also be installed simultaneously).

The intent of this workflow component is to complete content entry for the stubs earlier created during the configuration phase. It is also the component used for ongoing content additions to the site.

Content Workflow

Content that is tagged by the scones tagger is done so based on the concepts in the domain ontology (see below) and the named entities (as contained in “dictionaries”) used by a given project. Once tagged, this information can also now be related to the other structured data in the system.

Once all of this various content is entered into the system, it is then available for access and manipulation by the various conStruct modules (see figure above) and semantic component widgets (see below).

Click here to see the tools associated with this workflow sequence, as described in the TechWiki desktop tools document.

Ontologies Workflow

Though the next flowchart below appears rather complicated, there are really only three tasks that most OSF administrators need worry about with respect to ontologies:

  1. Adding a concept to the domain ontology (a class) and setting its relationships to other concepts
  2. Adding a dataset attribute (data characteristic) for various dataset records, or
  3. Adding or changing an annotation for either of these things, such as the labels or descriptions of the thing.

In actuality, of course, editing, modifying or deleting existing information is also important, but they are easier subsets of activities and user interfaces to the basic add (“create”) functions.

The OSF interface provides three clean user interfaces to these three basic activities [3].

These basic activities may be applied to the three major governing ontologies in any OSF installation:

  • The domain ontology, which captures the conceptual description of the instance’s domain space
  • The semantic components ontology (SCO), which sets what widgets may display what kinds of data, and
  • irON for the instance record attributes and metadata (annotations).

All of the OSF ontology tools work off of the OWLAPI as the intermediary access point. The ontologies themselves are indexed as structured data (RDF with Virtuoso) or full text (Solr) for various search, retrieval and reasoning activities.

Ontologies Workflow

Because of the central use of the OWLAPI, it is also possible to use the Protégé editor/IDE environment against the ontologies, which also provides reasoners and consistency checking.

Click here to see the tools associated with this workflow sequence, as described in the TechWiki desktop tools document.

Filter & Select Workflow

The filter and select activities are driven by user interaction, with no additional admin tools required. This workflow is actually the culmination of all of the previous sequences in that it exposes the structured data to users, enables them to slice-and-dice it, and then to view it with a choice of relevant widgets (semantic components).

For example, see this animation:

Animated Filtering and Selection Workflow

Considerable more detail and explanation is available for these semantic components.

Click here to see the tools associated with this workflow sequence, as described in the TechWiki desktop tools document.

Maintenance Workflow

The ongoing maintenance of an OSF instance is mostly a standard Drupal activity. Major activities that may occur include moderating comments; rotating or adding new content; managing users; and continued documentation of the site for internal tech transfer and training. If the portal embraces other aspects of community engagement (social media), these need to be handled as part of this workflow as well.

All aspects of the site and its constituent data may be changed, or added to at any time.

Maintenance Workflow
Click here to see the tools associated with this workflow sequence, as described in the TechWiki desktop tools document.

Moving from Here

Total Open SolutionWhen first introduced in our three-part series, we noted the interlocking pieces that constituted the total open solution of the open semantic framework (OSF) (see right). We also made the point — unfortunately still true today — that the relative maturity and completeness of all of these components still does not allow us to achieve fully, “We’re successful when we are not needed.”

As a small firm that is committed to self-funding via revenues, Structured Dynamics is only able to add to its stable of open source software and to develop methodologies and provide documentation based on our client support. Yet, despite our smallness, our superb client support has enabled us to aggressively and rapidly add to all four components of this total open solution. This newest series of ongoing workflow documents (plus some very significant expansions and refinements of the OSF code base) is merely the latest example of this dynamic.

Through judicious picking of clients (and vice versa), and our insistence that new work and documentation be open sourced because it itself has benefitted from prior open source, we and our client partners have been making steady progress to this vision of enterprises being able to adopt and install semantic solutions on their own. Inch-by-inch we are getting there.

The status of our vision today is that we are still needed in most cases to help formulate the implementation plan and then guide the initial set-up and configuration of the OSF. This support typically includes ontology development, data conversion and overall component integration. While it is true that some parties have embraced the OSF code and documentation and are implementing solutions on their own, this still requires considerable commitment and knowledge and skills in semantic technologies.

The great news about today’s status is that — after initial set-up and configuration — we are now able to transfer the technology to the client and walk away. Tools, documentation, procedures and workflows are now adequate for the client to extend and maintain their OSF instance on their own. This great news includes a certification process and program for transferring the technology to client staff and assessing their proficiency in using it.

We have been completely open about our plans and our status. In our commitment to our vision of success, much work is still needed on the initial install and configure steps and on the entire area of ontology creation, extension and mapping [4]. We are working hard to bridge these gaps. We welcome additional partners that share with us the vision of complete, turnkey frameworks — including all aspects of total open solutions. Inch-by-inch we are approaching the realization of a vision that will fundamentally change how every enterprise can leverage its existing information assets to deliver competitive advantage and greater value for all stakeholders. You are welcome aboard!


[1] This has been the thematic message on Structured Dynamics‘ Web site for at least two years. The basic idea is to look at open source semantic technologies from the perspective of the enterprise customer, and then to deliver all necessary pieces to enable that customer to install, deploy and maintain the OSF stack on its own. The sentiment has infused our overall approach to technology development, documentation, technology transfer and attention to methodologies.
[2] The first version of this article appeared as Workflow Perspectives on OSF on the OpenStructs TechWiki on April 19, 2011.
[3] The current release of OSF does not yet have these components included; they will be released to the open source SVNs by early summer.
[4] The best summary of the vision for where ontology development needs to head is provided by the Normative Landscape of Ontology Tools article on the TechWiki; see especially the second figure in that document.

Posted by AI3's author, Mike Bergman

Posted on April 25, 2011 at 1:45 am in Open Semantic Framework, Semantic Enterprise, Structured Dynamics | Comments (1)
The URI link reference to this post is: http://www.mkbergman.com/956/workflow-perspectives-on-the-open-semantic-framework/
The URI to trackback this post is: http://www.mkbergman.com/956/workflow-perspectives-on-the-open-semantic-framework/trackback/
Date:   April 4, 2011

People in Crowds

Self-service Information Management for Knowledge Workers

Though I have alluded to it numerous times in my past writings [1], I think one of the most pervasive and important benefits from semantic technologies in the enterprise will come from the democratization of information. These benefits will arise mostly from a fundamental change in how we manage and consume information. A new “system” of semantic technologies is now largely available that can put the collection, assembly, organization, analysis and presentation of information directly in the hands of those who need it most — the consumers of information.

The idea of “democratizing information” has been around for a couple of decades, and has accelerated in incidence since the dominance of the Internet. Most commonly, the idea is associated with developments and notions in such areas as citizen journalism, crowdsourcing, the wisdom of the crowd, social bookmarking (or collaborative tagging), and the democratic (small “d”) access to publishing via new channels such as blogs, microblogs (e.g., Twitter) and wikis. To be sure, these kinds of democratic information will (and are) benefiting from the use and application of semantics.

But the trend I’m focusing on here is much different and quite new. It is the idea that enterprise knowledge workers can now take ownership and control of their knowledge management functions. In the process, prior bottlenecks due to IT can be relieved and massive new benefits can open up to the enterprise.

Decades-long Mismatches Between KM and IT

“Enterprise systems are doing it wrong. And not just a little bit, either. Orders of magnitude wrong. Billions and billions of dollars worth of wrong. Hang-our-heads-in-shame wrong. It’s time to stop the madness.”
– Tim Bray [2]

It is no secret that IT has not served the enterprise knowledge management function well for decades.  Transaction systems and database systems geared to fast indexing and access to datum have not proved well suited to information or knowledge management. KM includes such applications as business intelligence, data warehousing, data integration and federation, enterprise information integration and management, competitive intelligence, knowledge representation, and so forth. Information management is a bit broader category, and adds such functions as document management, data management, enterprise content management, enterprise or controlled vocabularies, systems analysis, information standards and information assets management to the basic functions of KM. Since the purpose of this piece is not to get into the epistemological differences between information and knowledge, I use these terms more-or-less interchangeably herein.

Knowledge and information management is very big business. Given the breadth and differences in defining the KM and IM markets, let’s take as a proxy the business intelligence (BI) market, one of KM’s most important elements. Various estimates from IDC, Gartner and others place the current value of BI software sales somewhere in the range of $9 billion to $11 billion annually [3]. Further, BI ranked number five on the list of the top 10 technology priorities for chief information officers (CIOs) in 2011. And this pertains to the structured component of information alone.

Yet, at the same time, BI-related projects continue to have high failure rates, often cited as in the 65% to higher range [4]. These failure rates are consistent with KM projects in general [5]. These failures are merely one expression of a constant litany of issues and concerns regarding the enterprise KM function:

Conventional KM Problem Area Comments
Inflexible Reports
  • reports are rarely “self-service”
  • new requests need to be placed in queue
  • 90% of stored report templates are never used
  • unlimited “slicing and dicing” not available
Inflexible Analysis
  • analysis is rarely “self-service”
  • new requests need to be placed in queue
  • many requests not accepted due to schema rigidities, cascading changes needed
  • analysis options are “pre-canned”, inflexible
Schema Bottlenecks
  • brittleness of relational data model and typical star schema
  • crossing across schema or databases difficult
  • load and re-indexing cycles can limit access, impose expensive back-end requirements
  • can not (often) accommodate new data, structures
ETL Bottlenecks
  • getting data into the system needs to be placed in queue
  • new external data requires extract, transform and load (ETL) routines to be written
  • schedule and update cycles can be a mismatch to access needs
Reliance on Intermediaries
  • all problems above work through intermediaries
  • disconnect between those with need and decision-makers and those who implement the solutions
  • inherent issues in communicating requirements to implementers
  • related time delays to implementation exacerbate the communication of requirements
Specialized Expertise Required
  • expertise and skill sets needed to implement solutions different from those of the knowledge consumer
  • inherent issues in communicating requirements to implementers
  • high costs for attracting necessary expertise
  • expertise is inherently an overhead function
Slow Response Time
  • all problems above lead to delays, slow response
  • timely communications, analysis, decisions suffer
  • delays mean knowledge management is not an active “contact sport”, becomes mired and unresponsive
  • some needs are just not requested because of these problems
Dependence on External Apps
  • new apps need to be identified, procured
  • design and configuration of apps requires external expertise, programming skills
  • multiple sourcing of apps leads to frequent incompatibilities, high costs for integration, poor interoperability
Unmet Needs
  • many KM needs are simply not requested
  • by the time responses are forthcoming, needs and imperatives have moved on
  • communications, analysis and decisions become hassles
  • the “contact sport” of active discovery and learning is unmet
High Opportunity Costs
  • many KM insights are simply not discovered
  • delays and frustration adds to costs, friction, inefficiencies
  • no way to know the opportunity costs of what is not learned — but, surely is high
High Failure Rates
  • the net impact of all of the problems above is to lead to high failure rates (~60% to 70%) and unacceptable costs
  • reliance on IT for KM has utterly and totally failed

The seeming contradiction between continued growth and expenditures for information management coupled with continued high failure rates and disappointments is really an expression of the centrality of information to the modern enterprise. The funding and growth of the IT function is itself an expression of this centrality and perceived importance. These have been abiding trends in our transition to information or knowledge economies.

Bray [2] places the fault for wasted initiatives within the culture of IT. I believe there is some truth to this — variably, of course, depending on the specific enterprise. But the real culprit, I believe, has been the past need to “intermediate” a layer of software and IT expertise between knowledge workers and their source information. A progression of tasks has been necessary — conducted over decades with advances and learning — to get paper information into electronic form, get those forms to be understood and operate in some common ways, and then to develop tools, architectures and frameworks to make sense of it. Yet, as more tasks with required specialized skills have been added to this layer, the actual gulf between worker and information has increased. For example, enterprises still require the overhead and layers of IT to write SQL to get information out and then to prepare and fix reports.

On average, IT now consumes about 4% of all enterprise expenditures and employs about 6% of enterprise workers [6]. IT has become a very thick intermediary layer, indeed! Yet, because of the advances and learning that has occurred in growing and nurturing this layer, we also now have the basis to begin to “disintermediate” the IT layer. Many, if not all, of the challenges noted in the table above can be improved by doing so.

Early Attempts at Self-service and Semantics

One current buzzword in business intelligence is “self service”. By this term is meant giving knowledge workers the tools and systems for creating reports or doing analysis on their own without needing to work through (or be frustrated by) the IT layer. Self-service software was first postulated in the 1990s as a way for information consumers and authors (typically subject-matter experts) to automate some of their knowledge management tasks. Today, it is most commonly applied to self-service reporting or self-service analytics within the BI realm.

As a general proposition, self-service BI has been more myth than reality [7]. Forrester surveys, for example, indicate that IT still develops most BI applications. Of survey respondents in 2009, 70% responded that IT develops the enterprise’s reports and dashboards [8]. However, that figure is not 100%, as it was just a decade earlier, and there is also notable success to some open source providers such as BIRT that address a wide range of reporting needs within a typical application, ranging from operational or enterprise reporting to multi-dimensional online analytical processing (OLAP).

James Kobelius [8] is particularly bullish on the application of Web 2.0mashup” applications to knowledge worker purposes. Under this approach, Web-based applications are used and accessed directly by knowledge workers for charting and mapping purposes using Ajax or Flash widgets, such as Google Maps. The conventional BI and KM vendors have begun to more more aggressively into this area. Some notable new entrants — such as Tableau, Factual or Good Data — are also showing the way to more direct access, more flexible reporting and analysis widgets, and cleaner service or platform designs.

These initiatives reside at the display or reporting level. There is another group, including James Kobelius, Neil Raden or Seth Earley, that have addressed how to get disparate information to talk together using ontologies. They refer to “semanticizing” such traditional practices such as master data management (MDM), “ontologizing” taxonomies, or adding Web 2.0 mashups to business intelligence. While these thoughts are moving in the right direction, and will bring incremental benefits, they still are far short of the potentials at hand.

Self-service Information Management

Self-service Information Management (click to expand)

So far, in the KM realm, the application of semantics has tended to be limited to information extraction (tagging) of text documents and first attempts at using ontologies. The tagging component is essential to enable the 80% of information presently in textual documents to become first-class citizens within business intelligence or knowledge management. The ontology efforts to date appear to be more like thin veneers over traditional taxonomies. Rather than hierarchical structures, we now see graph-oriented ones, but still intended to fulfill the same tasks of enterprise metadata and vocabulary lookups.

The ontology efforts especially are just nibbling around the edges of what can be done with semantic technologies. Rather than looking upon ontologies as just another dictionary (though that role is true), if we re-orient our thinking to make ontologies central to the KM function, a wealth of new opportunities and benefits arises.

A bit more than a year ago, we formulated the Seven Pillars of the Open Semantic Enterprise, which included ontologies and related as some of the central components. In that article [9], we noted the particular applicability of semantic technologies to the information and knowledge management functions within enterprises. We asserted the benefits for embracing the open semantic enterprise as providing the organization greater insights with lower risk, lower cost, faster deployment, and more agile responsiveness. Since that time we have been deploying such systems and documenting those benefits.

Integral to the seven pillars are those aspects that lead to the democratization of information for the knowledge worker, what combined might be called “self-service information management”. As the figure to the right shows, three of the seven pillars are essential building blocks to this capability, two pillars are further foundations to it, with the remaining two pillars only tangentially important.

What the combination of these pieces means is a fundamental change in how knowledge work is done. Through this approach, we can largely disintermediate IT from the knowledge function, can bring knowledge management directly into the hands of those who need it in real time, and fundamentally alter how knowledge management apps are designed and deployed. The best thing is these benefits are an incremental evolution, and retain the use and value of existing information assets.

Building Block #1: Adaptive Ontologies

Rather than peripheral lookup structures or thin veneers, ontologies play the central role in the design of self-service information management. We use the plural on purpose here: what is deployed is actually a library of complementary and modular ontologies that play a variety of roles. Combined, we call these libraries with their representative functions adaptive ontologies.

This library contains the expected and conventional domain ontologies. These represent the actual knowledge space for the domain at hand, and may be comprised of multiple different ontologies representing different domain or knowledge spaces.  These standard semantic Web ontologies may range from the small and simple to the large and complex, and may perform the roles of defining relationships among concepts, integrating instance data, orienting to other knowledge and domains, or mapping to other schema.

From a best practices standpoint [10], we take special care in constructing these domain ontologies such that we provide labels and cues for user interfaces. Some of the user interface considerations that can be driven by adaptive ontologies include: attribute labels and tooltips; navigation and browsing structures and trees; menu structures; auto-completion of entered data; contextual dropdown list choices; spell checkers; online help systems; etc. We also include a variety of synonyms and aliases (the combination of which we call semsets) for referring to concepts and instances in multiple ways and for aiding information extraction and tagging functions. (In addition to organizing and helping to interoperate contributing information, these domain ontologies are also used for what is called ontology-based information extraction (OBIE) via our scones [11] system.)

In addition the library of adaptive ontologies includes some administrative ontologies that guide how instance data can be imported and inter-related (via the Instance Record Object Notation, or irON); what information types drive what widgets (via the Semantic Component Ontology, or SCO); data mapping vocabularies (UMBEL Vocabulary); how to characterize datasets; and other potential specialty functionality.

A forthcoming article will describe the composition and modularity typically found in a library of these adaptive ontologies.

In combination, these adaptive ontologies are, in effect, the “brains” of the self-service system. The best aspect of these ontologies is that they can be understood, created and maintained by knowledge workers. They constitute the only specification (other than theming, if desired) necessary to create self-service knowledge management environments.

Building Block #2: Ontology-driven Apps

The piece of the puzzle that implements the instruction sets within these adaptive ontologies are the ontology-driven apps, or ODapps. A recent article describes these structures in some detail [12].

ODapps are modular, generic software applications designed to operate in accordance with the specifications contained in the adaptive ontologies. ODapps fulfill specific generic tasks, consistent with their dedicated design to respond to adaptive ontologies. For example, current ontology-driven apps include imports and exports in various formats, dataset creation and management, data record creation and management, reporting, browsing, searching, data visualization and manipulation (through libraries of what we call semantic components), user access rights and permissions, and similar. These applications provide their specific functionality in response to the specifications in the ontologies fed to them.

ODapps are designed more similarly to widgets or API-based frameworks than to the dedicated software of the past, though the dedicated functionality (e.g., graphing, reporting, etc.) is obviously quite similar. The major change in these ontology-driven apps is to accommodate a relatively common abstraction layer that responds to the structure and conventions of the guiding ontologies. The major advantage is that single generic applications can supply shared functionality based on any properly constructed adaptive ontology.

Generic functionality included in these ODapps are things like filtering, setting value ranges, choosing the specific display view, and invoking or not various display templates (akin to the infoboxes on Wikipedia). By nature of the data and the ontologies submitted to them, the ODapp signals to the user or consumer what displays, views, filters or slices-and-dices might be available to them. Fed different data and different ontologies, the ODapp would signal the user differently.

Because of their generic design, driven by the ontologies, only a relatively small number of ODapps needs to be created. Once created with appropriate generic functionality, application development is essentially over. It is through the additions and changes to the adaptive ontologies — done by knowledge workers themselves — that new capability and structure gets exposed through these ontology-driven apps. This innovation shifts the locus from software and programming to data and knowledge structures.

This democratization of IT means that everything in the knowledge management realm can become self service. Users and consumers can create their own analyses; develop their own reports; and package and disseminate what they and their colleagues need, when they need it. Through ontology-driven apps and adaptive ontologies, we turn prior software engineering practice on its head.

Building Block #3: Open World Assumption

Integral to this design is the embrace of the open world assumption [13]. Though not a specific artifact, as are adaptive ontologies or ODapps, the open-world approach is the logical underpinning that allows consumers or knowledge workers to add new information to the system as it is discovered or scoped. This nuance may sound esoteric, but traditional KM systems have a very different underpinning that leads to some nasty implications.

Because the predominant share of KM systems are based on relational database systems, they embody a closed-world design. This works well for transaction systems or environments where the information domain is known and bounded, but does not apply to knowledge and changing information. Moreover, the schema that govern closed-world designs are brittle and hard to change and manage. It is this fact that has put KM squarely in the bailiwick of IT and has often led to delays and frustrations. Re-architecting or adding new schema views to an existing closed-world system can be fiendishly difficult.

This difficulty is a major reason why IT resists casual or constant changes to underlying data schema. Unfortunately, this makes these brittle schema difficult to extend and therefore generally unresponsive to changing and growing knowledge. As an environment for knowledge management, the relational data system and the closed-world approach are lousy foundations.

Other Building Blocks

As the self-service information management diagram above shows, RDF and Web services are two further important foundations. RDF (Resource Description Framework) is the canonical data model upon which all input information is represented. This means that the ODapp tools and the adaptive ontologies can work off a single model of knowledge representation. The Web service and architecture component is also helpful in that it allows Web 2.0 technologies to be brought to bear and allows distributed sources and users for the KM system. This provides scalability and distributed applicability, including on smartphones.

The other two pillars of the open semantic enterprise — the layered approach and linked data — are also helpful, but not necessarily integral to the KM and self-service perspectives presented herein.

Benefits from Self-service Information Management

The benefits and flexibilities from self-service information management extend from top to bottom; from creating data and content to publishing and deploying it. Here is a listing of available potentials for self-service, drawing comparison to the current conventional approach dependent on IT:

Information Activity Conventional Approach (IT) Self-service Information Management
Creating
  • structured data only
  • not generally available directly to the knowledge worker
  • can create own datasets
  • can extract and transform own datasets
  • can tag and integrate non-structured (text + document) information
  • able to handle unstructured, semi-structured and structured data alike
Annotating
  • not generally provided
  • completely open, flexible
  • can define own annotation fields, annotation schema (approaches)
Analyzing
  • pre-canned functions
  • structure pre-defined
  • slow performance
  • all structural dimensions can be filtered
  • all values and ranges thereof can be filtered
  • multiple analysis display widgets selectable depending on the type of input data
  • real-time configuration
  • fast (nearly instantaneous) performance
  • provision of (nearly) real-time analytics
  • additional capabilities in inferencing and reasoning
  • modeling and understanding of complex graph and relationships structures (e.g., social networks)
Reporting
  • pre-canned templates or report writers
  • structure pre-defined
  • user-definable templates
  • templates automatically assignable by types of thing being reported
  • embeddable in Web pages, alternate presentation media
  • styling and theming flexibility
Visualizing
  • very little done through IT
  • variety of visualization widgets available (e.g., maps, charts, graphs, networks)
  • large-scale systems views possible
  • visual interactions (a la Web 2.0) possible
Collaborating
  • very little done through IT
  • collaboration, if done, is via separate social media
  • completely open
  • variable access and permission rights by user or group
  • built-in to the entire infrastructure
Validating
  • not directly done by knowledge worker
  • user input, if done, via problem tickets with delays
  • can be integrated into the business process or workflow
  • “soft” validations and ratings/rankings can also be included
  • consistency checking
  • satisfiability checking
Publishing
  • limited to pre-canned reports
  • any report or analysis is available for publishing
  • documents and images and widget displays are available for publishing
  • multiple export formats means information, slices thereof, or analysis results thereof can be embedded and integrated into multiple presentation media
Re-purposing
  • none directly by the knowledge worker
  • any report or analysis is available for re-purposing
  • documents and images and widget displays are available for re-purposing
  • canonical internal representations (RDF and XHTML) means available information can be deployed for a variety of purposes (Web pages, reports, documents, slide shows, etc.)
New Functionality
  • none known, if not already listed
  • semantic querying
  • data visualization
  • text mining and tagging
  • categorization
  • graph mining
  • logic checking
Developing Apps
  • none via the official systems by the knowledge worker
  • if done, via guerrilla apps
  • only generic apps needed
  • many fewer and more flexible apps push issue into the background
Dashboarding
  • not available to most systems
  • if available, limited number of pre-canned options
  • any report or analysis is available for dashboarding
  • any widget is available for dashboarding
  • complete structure (typing, values, sources) available for filtering, “slicing and dicing”
  • all dashboard objects on a given canvas are linked, interoperate (selections in one widget reflected in other widgets)
  • dashboards may be made persistent for re-use, springboarding new dashboards (as templates)

The fact that any source — internal or external — or format — unstructured, semi-structured and structured — can be brought together with semantic technologies is a qualitative boost over existing KM approaches. Further, all information is exposed in simple text formats, which means it can be readily manipulated and managed with easy to understand tools and applications. Reliance on open standards and languages by semantic technologies also leads to greater use and availability of open source systems.

In short, self-service information management approaches should be cheaper, faster, more responsive and more capable than current approaches.

Great Progress, with Ontology Management the Next Challenge

Ontology Tools Schematic (click to expand)

Given these perspectives, hearing someone tout data-driven applications or advocate ontologies merely for metadata matching sounds positively Neanderthal. The prospects we have with semantic technologies, ontology-driven apps, and self-service information management systems mean so much more. The prospect at hand is to remake the entire knowledge management function, in the process bringing all aspects from creating and distributing knowledge products into the direct hands of the user. This is truly the democratization of information!

The absolutely fantastic news is none of this is theoretical or in the future. All pieces are presently proven, working and in hand. This is a practical vision, ready today.

Granted, like any new innovation, especially one that is infrastructural and systems-oriented, there are some weak or less-developed parts. These current gaps and needs include:

  • Though tools exist, the state of ontology create, edit, manage, update, delete, map and validate tools could be greatly improved [14]. As the central drivers for ODapps, a simplification of tasks geared more to the knowledge worker, and not professional ontologists, is needed (see diagram to right for some of the needed functions). Some of these developments are underway, with more desired
  • A relatively complete starting set of about 20 ODapps widgets is presently available. However, more are needed and for different deployment environments. BI analysis remains one weak area, as is an Ajax-based library
  • The number of infobox templates is small, and better (WYSIWYG or graphical) create and manage utilities would be most useful, and
  • User permission and authorization protocols exist, but are IP-based at present and could be beneficially expanded for different environments and use cases.

Yet, in the grand scheme of things, these gaps are relatively insignificant. The path and general architecture and design for moving forward are now clear.

Self-service information management via appropriately designed semantic technologies is now a reality. It promises to fulfill a vision of information access and control that has been frustrated for decades. We think these are exciting developments for the enterprise — and for the individual knowledge hound. We welcome your inquiries and invite you to join our open OSF group to contribute your ideas.


[1] Including going all the way back to my description of purpose for this blog back in 2005; see the AI3 Blogasbörd where I state, “One of my central arguments [in this blog] is that an inexorable trend through history has been the ‘democratization’ of information.”
[2] Tim Bray, 2010. “Doing it Wrong,” on his blog, January 5, 2010. The extensive comments are also worth a read.
[3] According to Marketwire quoting IDC, “Preliminary market sizing suggests that the business intelligence tools software market grew 2.6% in 2009 to reach $8.1 billion. Given the current market assumptions regarding the global economy and demand drivers in the BI tools software market, IDC forecasts this market to grow at a compound annual growth rate of 6.9% through 2014 to $11.3 billion.” CBR, citing Gartner, indicates the worldwide BI software market will grow 9.7 percent, reaching US$10.8 billion in 2011. Gartner also said BI platforms would continue to be one of the fastest growing software markets. For a very good background on BI, see Rochelle Shaw, 2011. “What is Business Intelligence,” posted in Database Trends and Applications, January 7, 2011.
[4] According to this article, by Antone Gonsalves, Poor Use Of Data Integration Tools Can Waste $500,000 Annually: Gartner (April 27, 2009), which reports on a recent Gartner Report, large global 2000 companies, using several data integration tools with overlapping features, can reduce costs by more than $500,000 annually by eliminating redundant software and leveraging a shared services model. In a further report by Roman Stanek, Business Intelligence Projects are Famous for Low Success Rates, High Costs and Time Overruns (April 25, 2009), Gartner is talking about a dirty little secret in the world of data integration, the fact that the data integration technology in place is based on generations of data integration technology being layered in the enterprise over the years. Thus, technology that was purchased to solve data integration problems, and reduce costs, is actually making the data integration problem more complex and no longer cost efficient.
[5] For example, see Roger Sessions, 2009. Cost of IT Failure, September 28, 2009. This analysis suggests failure rates of 65% with a total estimated worldwide cost of $6.2 trillion in 2009. Commenters have raised questions as to what constitutes failure and have questioned some of the analysis assumptions. Nonetheless, even with over-estimates, the scale of the numbers is alarming; see Jorge Dominguez, 2009. The CHAOS Report 2009 on IT Project Failure, June 16, 2009, which indicates combined failure and challenge rates for IT projects have ranged from 65% to 84% over the period 1994 to 2009; see http://www.education.state.pa.us/portal/server.pt/gateway/PTARGS_0_2_690719_0_0_18/CHAOS%20Summary%202009.pdf. Also see Dan Galorath, 2008. Software Project Failure Costs Billions; Better Estimation & Planning Can Help, June 7, 2008. In this report, Galorath compares and combines many of the available IT failure studies and summarizes that 3 of 5 IT projects do not do what they were supposed to for the expected costs, with 49% showing budget overruns, 47% showing higher than expected maintenance costs, and 41% failing to deliver expected business value; the anecdotal failure rate for years for IT projects has been claimed as 80%, with business intelligence and data warehousing particularly failure-prone areas; in 2001, a study by Mark N. Frolick and Keith Lindsey, Critical Factors for Data Warehouse Failures, for the Data Warehousing Institute noted conventional wisdom says the failure rate of data warehousing projects is 70 to 80 percent, with a then-recent study in the insurance industry found a 90-percent failure rate. This report is useful for combining many historical studies.
[6] As taken from the Gartner IT Metrics: IT Spending and Staffing Report, 2010; see http://www.slideshare.net/dellenterprise/it-spending-and-staffing-report-2010.
[7] Wayne W. Eckerson, 2007. “The Myth of Self-Service Business Intelligence,” in TDWI Online, October 18, 2007; see http://tdwi.org/articles/2007/10/18/the-myth-of-selfservice-bi.aspx. “Business Intelligence projects are famous for low success rates, high costs and time overruns. The economics of BI are visibly broken, and have been for years. Yet BI remains the #1 technology priority according to Gartner.”
[8] See James G. Kobielus, 2009. Mighty Mashups: Do-It-Yourself Business Intelligence For The New Economy, July 23, 2009, see http://www.corda.com/pdfs/mighty-mashups-article.pdf. In this report, Kobelius, the lead author from a Forrester study (August 2008, Global BI And Data Management Online Survey) that surveyed 82 IT decision-makers, noted that just over 70% responded that IT develops their reports and dashboards. About 57% responded that power users did such development. Only 18.3% reported that BI development is done by end users with limited BI skills. .
[9] M.K. Bergman, 2010. “Seven Pillars of the Open Semantic Enterprise,” in AI3:::Adaptive Information blog, January 12, 2010; see http://www.mkbergman.com/859/seven-pillars-of-the-open-semantic-enterprise/.
[10] There are a series of ongoing ontology best practices articles; see http://www.mkbergman.com/category/ontology-best-practices/.
[11] The scones (Subject Concept Or Named EntitieS) tagger provides information extraction of domain-specific subject concepts and entities from unstructured text. It also provides disambiguation of this information based on the context of the source information. See further http://techwiki.openstructs.org/index.php/Category:Scones.
[12] M.K. Bergman, 2011. “Ontology-Driven Apps Using Generic Applications,” in AI3:::Adaptive Information blog, March 7, 2011; see http://www.mkbergman.com/948/ontology-driven-apps-using-generic-applications/.
[13] M.K. Bergman, 2009. “The Open World Assumption: Elephant in the Room,” in AI3:::Adaptive Information blog, December 21, 2009; see http://www.mkbergman.com/852/the-open-world-assumption-elephant-in-the-room/. The open world assumption (OWA) generally asserts that the lack of a given assertion or fact being available does not imply whether that possible assertion is true or false: it simply is not known. In other words, lack of knowledge does not imply falsity. Another way to say it is that everything is permitted until it is prohibited. OWA lends itself to incremental and incomplete approaches to various modeling problems.
[14] M.K. Bergman, 2010. “A New Landscape in Ontology Development Tools,” in AI3:::Adaptive Information blog, Sept. 7, 2010; see http://www.mkbergman.com/909/a-new-landscape-in-ontology-development-tools/.

Posted by AI3's author, Mike Bergman

Posted on April 4, 2011 at 3:06 am in Ontologies, Open Semantic Framework, Semantic Enterprise, Software Development | Comments (4)
The URI link reference to this post is: http://www.mkbergman.com/953/democratizing-information-with-semantics/
The URI to trackback this post is: http://www.mkbergman.com/953/democratizing-information-with-semantics/trackback/
Date:   March 7, 2011

from Wikimedia CommonsThe Time and Technology is Here to Stand Software Engineering on its Head

As an information society we have become a software society. Software is everywhere, from our phones and our desktops, to our cars, homes and every location in between. The amount of software used worldwide is unknowable; we do not even have agreed measures to quantify its extent or value [1]. We suspect there are at least 1 billion lines of code that have accumulated over time [1,2]. On the order of $875 billion was spent worldwide on software in 2010, of which about half was for packaged software and licenses and the rest for programmer services, consulting and outsourcing [3]. In the U.S. alone, about 2 million people work as programmers or related [4].

It goes without saying that software is a very big deal.

No matter what the metrics, it is expensive to develop and maintain software. This is also true for open source, which has its own costs of ownership [5]. Designing software faster with fewer mistakes and more re-use and robustness have clearly been emphases in computer science and the discipline of programming from its inception.

This attention has caused a myriad of schools and practices to develop over time. Some of the earlier efforts included computer-aided software engineering (CASE) or Grady Booch’s (already cited in [1]) object-oriented design (OOD). Fourth-generation languages (4GLs) and rapid application development (RAD) were popular in the 1980s and 1990s. Most recently, agile software development or extreme programming have grabbed mindshare.

Altogether, there are dozens of software development philosophies, each with its passionate advocates. These express themselves through a variety of software development methodologies that might be characterized or clustered into the prototyping or waterfall or spiral camps.

In all instances, of course, the drivers and motivations are the same: faster development, more re-use, greater robustness, easier maintainability, and lower development costs and total costs of ownership.

The Ontology Perspective in this Mix

For at least the past decade, ontologies and semantic Web-related approaches have also been part of this mix. A good summary of these efforts comes from Michael Uschold in an invited address at FOIS 2008 [6]. In this review, he points to these advantages for ontology-based approaches to software engineering:

  • Re-use — abstract/general notions can be used to instantiate more concrete/specific notions, allowing more reuse
  • Reduced development times — producing software artifacts that are closer to how we think, combined with reuse and automation that enables applications to be developed more quickly
  • Increased reliability — formal constructs with automation reduces human error
  • Decreased maintenance costs — increased reliability and the use of automation to convert models to executable code reduces errors. A formal link between the models and the code makes software easier to comprehend and thus maintain.

These first four items are similar to the benefits argued for other software engineering methodologies, though with some unique twists due to the semantic basis. However, Uschold also goes on to suggest benefits for ontology-based approaches not claimed by other methodologies:

  • Reduced conceptual gap — application developers can interact with the tools in a way that is closer to their thinking
  • Facilitate automation — formal structures are amenable to automated reasoning, reducing the load on the human, and
  • Agility/flexibility — ontology-driven information systems are more flexible, because you can much more easily and reliably make changes in the model than in code.

In making these arguments, Uschold picks up on the “ontology-driven information systems” moniker first put forward by Nicola Guarino in 1998 [7]. The ideas around ODIS have had substantial impact on the semantic Web community, especially in the use of formal ontologies and modeling approaches. The FOIS series of conferences, and most recently the ODiSE series, have been spawned from these ideas. There is also, for example, a fairly rich and developed community working on the integration of UML via ontologies as the drivers or specifiers of software [8].

Yet, as Uschold is careful to point out, the idea of ODIS extends beyond software engineering to encompass all of information systems. My own categorization of how ontologies may contribute to information systems is:

  1. Domain modeling — this category includes the domain knowledge representations and reasoning and inference bases that are the traditional understanding of ontologies in the semantic space. The structural aspects are akin to a database schema definition; the unique aspects of ontologies reside in their logic foundations and graph structures, which offer more power in inferencing, reasoning and graph analysis than conventional approaches
  2. Model-driven architectures (MDA) — like UML, these are platform-independent specifications that provide the functional and dataflow definitions of “models” executed by the system. These are the natural progeny of earlier CASE approaches, for example. Such systems also potentially allow graphical or visual means for building or hooking together components as a substitute to direct coding
  3. Program specifications and excecutables — though fairly experimental at present, these approaches use the languages of RDF, OWL or direct use of logic languages to create the equivalent of executable software programs. A couple of experimental systems include Fhat and Neno, for example, point to possible future directions in this area [9]
  4. Runtime or utility components — proper construction of ontologies can be a source for labels and prompts within user interfaces and other runtime uses. Because of the ontology basis, these contributions may also be contextual [10]
  5. Automated agents — based on context, user choices and the governing ontologies, new instruction sets can be generated via what some term automated agents or “robots” to instruct subsequent steps in the software, including potentially analysis or validation. Mission Critical IT [11] is apparently the most advanced in this area; we discuss their ODASE approach more below
  6. Bespoke drivers of generic applications — through using and combining a number of the aspects above, in its totality this approach is a very different paradigm, as we describe below.

When we look at this list from the standpoint of conventional software or software engineering, we see that #1 shares overlaps with conventional database roles and #2, #3 and #4 with conventional programmer or software engineering responsibilities. The other portions, however, are quite unique to ontology-based approaches.

But Is Software Engineering Even the Right Focus?

For decades, issues related to how to develop apps better and faster have been proposed and argued about. We still have the same litany of challenges and issues from expense to re-use and brittleness. And, unfortunately, despite many methodologies du jour, we still see bottlenecks in the enterprise relating to such matters as:

Software is merely an intermediary artifact to accomplish some given tasks. Rather than “engineering” software, the focus should be on how to fulfill those tasks in an optimal manner — and that demands a systems approach.
  • data access
  • queries
  • data transformations
  • data integration or federation
  • reports
  • other data presentations
  • business analysis, and
  • targeted, specialty functionality.

Promises such as self-service reporting touted at the inception of data warehousing two decades ago are still to be realized [12]. Enterprises still require the overhead and layers of IT to write SQL for us and prepare and fix reports. If we stand back a bit, perhaps we can come to see that the real opportunity resides in turning the whole paradigm of software engineering upside down.

Our objective should not be software per se. Software is merely an intermediary artifact to accomplish some given task. Rather than engineering software, the focus should be on how to fulfill those tasks in an optimal manner. How can we keep the idea of producing software from becoming this generation’s new buggy whip example [13]?

For reasons we delve into a bit more below, it perhaps has required a confluence of some new semantic technologies and ontologies to create the opening for a shift in perspective. That shift is one from software as an objective in itself to one of software as merely a generic intermediary in an information task pipeline.

Though this shift may not apply (at least with current technologies) to transactional and process-based software, I submit it may be fundamental to the broad category of knowledge management. KM includes such applications as business intelligence, data warehousing, data integration and federation, enterprise information integration and management, competitive intelligence, knowledge representation, and so forth. These are the real areas where integration and reports and queries and analysis remain frustrating bottlenecks for knowledge workers. And, interestingly, these are also the same areas most amenable to embracing an open world (OWA) mindset [14].

If we stand back and take a systems perspective to the question of fulfilling functional KM tasks, we see that the questions are both broader and narrower than software engineering alone. They are broader because this systems perspective embraces architecture, data, structures and generic designs. The questions are narrower because software — within this broader context — can be now be generalized as artifacts providing the fulfillment of classes of functions.

ODapps: The Ontology-Driven Application Approach

Open Semantic Framework (OSF) at openstructs.orgOntology-driven applications — or ODapps for short — based on adaptive ontologies are a topic we have been nibbling around and discussing for some time. In our oft-cited seven pillars of the semantic enterprise we devote two pillars specifically (#4 and #3, respectively) to these two components [15]. However, in keeping with the systems perspective relevant to a transition from software engineering to generic apps, we should also note that canonical data models (via RDF) and a Web-oriented architecture are two additional pillars in the vision.

ODapps are modular, generic software applications designed to operate in accordance with the specifications contained in one or more ontologies. The relationships and structure of the information driving these applications are based on the standard functions and roles of ontologies (namely as domain ontologies as noted under #1 above), as supplemented by the UI and instruction sets and validations and rules (as noted under #4 and #5 above). The combination of these specifications as provided by both properly constructed domain ontologies and supplementary utility ontologies is what we collectively term adaptive ontologies [16].

ODapps fulfill specific generic tasks, consistent with their bespoke design (#6 above) to respond to adaptive ontologies. Examples of current ontology-driven apps include imports and exports in various formats, dataset creation and management, data record creation and management, reporting, browsing, searching, data visualization and manipulation (through libraries of what we call semantic components), user access rights and permissions, and similar. These applications provide their specific functionality in response to the specifications in the ontologies fed to them.

ODapps are designed more similarly to widgets or API-based frameworks than to the dedicated software of the past, though the dedicated functionality (e.g., graphing, reporting, etc.) is obviously quite similar. The major change in these ontology-driven apps is to accommodate a relatively common abstraction layer that responds to the structure and conventions of the guiding ontologies. The major advantage is that single generic applications can supply shared functionality based on any properly constructed adaptive ontology.

In fact, the widget idea from Web 2.0 is a key precursor to the ODapps design. What we see in Web 2.0 are dedicated single-purpose widgets that perform a display operation (such as Google Maps) based on the properly structured data fed to them (structured geolocational information in the case of GMaps).

In Structured Dynamics‘ early work with RDF-based applications by our predecessor company, Zitgist, we demonstrated how the basic Web 2.0 widget idea could be extended by “triggering” which kind of mashup widget got invoked by virtue of the data type(s) fed to it. The Query Builder presented contextual choices for how to build a SPARQL query via UI based on what prior dropdown list choices were made. The DataViewer displayed results with different widgets (maps, profiles, etc.) depending on which part of a query’s results set was inspected (by responding to differences in data types). These two apps, in our opinion, remain some of the best developed in the semantic Web space, even though development on both ceased nearly four years ago.

This basic extension of data-driven applications — as informed by a bit more structure — naturally evolved into a full ontology-driven design. We discovered that — with some minor best practice additions to conventional ontologies — we could turn ontologies into powerhouses that informed applications through:

  • An understanding of the kind of things under consideration, including their inference chains
  • The types of data in results sets, and how that informs the nature of the widget(s) (maps, calendars, timelines, charts, tabular reports, images, stories, media, etc.) appropriate to display and manipulate that information, and
  • UI and utility functions such as interface labels, mouseovers, auto-suggests, spelling suggestions, synonym matches, etc.

Like the earlier Zitgist discoveries, basing the applications on only one or two canonical data models and serializations (RDF and a simple data exchange XML, which Fred Giasson calls structXML) provides the input uniformity to make a library of generic applications tractable. And, embedding the entire framework in a Web-oriented architecture means it can be distributed and deployed anywhere accessible by HTTP.

Booch has maintained for years that in software design abstraction is good, but not if too abstract [1]. ODapps are a balanced abstraction within the framework of canonical architectures, data models and data structures. This design thus limits software brittleness and maximizes software re-use. Moreover, it shifts the locus of effort from software development and maintenance to the creation and modification of knowledge structures. The KM emphasis can shift from programming and software to logic and terminology [16].

In the sub-sections below, we peel back some portions of this layered design to unveil how some of these major pieces interact.

Built Upon an Ontology- and Web-based Architecture

Again, to cite Booch, the most fundamental software design decision is architecture [1]. In the case of Structured Dynamics and its support for ODapps, its open semantic framework (OSF) is embedded in a Web-oriented architecture (WOA). The OSF itself is a layered design that proceeds from a kernel of existing assets (data and structures) and proceeds through conversion to Web service access, and then ontology organization and management via ODapps [17]. The major layers in the OSF stack are:

  • Existing assets — any and all existing information and data assets, ranging from unstructured to structured. Preserving and leveraging those assets is a key premise
  • scones / irON – the conversion layer, in part consisting of information extraction of subject concepts or named entities (scones) or the instance record Object Notation for conveying XML, JSON or spreadsheets (CSV) in RDF-ready form (via irON or RDFizers)
  • structWSF – a platform-independent suite of more than 20 RESTful Web services, organized for managing structured data datasets; it provides the standard, common interface by which existing information assets get represented and presented to the outside world and to other layers in the OSF stack
  • Ontologies — are the layer containing the structured assets “driving” the system; this includes the concepts and relationships of the domain at hand, and administrative ontologies that guide how the user interfaces or widgets in the system should behave
  • conStruct – connecting modules to enable structWSF and sComponents to be hosted/embedded in Drupal, and
  • sComponents – (mostly) Flex semantic components (widgets) for visualizing and manipulating structured data.

Not all of these layers or even their specifics is necessary for an ontology-driven app design [18]. However, the general foundations of generic apps, properly constructed adaptive ontologies, and canonical data models and structures should be preserved in order to operationalize ODapps in other settings.

OSF is the Basis for Domain-specific Instantiations

The power of this design is that by swapping out adaptive ontologies and relevant data, the entire OSF stack as is can be used to deploy multiple instantiations. Potential uses can be as varied as the domain coverage of the domain ontologies that drive this framework.

The OSF semantic framework is a completely open and generic one. The same set of tools and capabilities can be applied to any domain that needs to manage and understand information in its own domain. With the existing ODApps in hand, this includes from unstructured text or documents to conventional structured databases.

What changes from domain to domain are the data structures (the ontologies, schema and entity references) and their instance data (which can also be converted from existing to canonical forms). Here is an illustration of how this generic framework can be leveraged for different deployments. Note that Citizen Dan is a local government example of the OSF framework with relatively complete online demos:

The Open Semantic Framework can Spawn Many Different Domain Instances

(click for full size)

Structured Dynamics continues to wrinkle this basic design for different clients and different industries. As we round out the starting set of ODapps (see below), the major effort in adapting this generic design to different uses is to tailor the ontologies and “RDFize” existing data assets.

Lower Layers

Conversion of existing assets to RDF and canonical forms is not discussed further here. See the irON and scones documentation or the TechWiki for more information on these topics.

The structWSF Web Services Layer

The first suite of ODapps occurs at the structWSF Web services layer. structWSF provides a set of generic functions and endpoints to:

  • Import or export datasets
  • Create, update, delete (CRUD) or otherwise manage data records
  • Search records with full-text and faceted search
  • Browse or view existing records or record sets, based on simple to possible complex selection or filtering criteria, or
  • Process results sets through workflows of various natures, involving specialized analysis, information extraction or other functions.

Here is a listing of current ODapp functions within structWSF (with links to details for each):

WSF management Web services
User-oriented Web services

At this level the information access and processing is done largely on the basis of structured results sets. Other visualization and display ODapps are listed in the next subsection.

The Semantics Components Layer

The visualization and data display and manipulation ODapps are provided via the semantic components layer. Structured Dynamics’s sComponents are Flex-based widgets that conform to a standard, generic design. Other developers using the OSF framework are developing JavaScript versions [19]. Here is the current library (with links to details for each):

New Components
Components Extending Flex

These components can be used in combination with any of the structWSF ODapps, meaning the filtering, searching, browsing, import/export, etc., may be combined as an input or output option with the above.

The next animated figure shows how the basic interaction flow works with these components:

Semantic Components Workflow

(click for full size)

Using the ODapp structure it is possible to either “drive” queries and results sets selections via direct HTTP request via endpoints (not shown) or via simple dropdown selections on HTML forms or Flex widgets (shown). This design enables the entire system to be driven via simple selections or interactions without the need for any programming or technical expertise.

As the diagram shows, these various sComponents get embedded in a layout canvas for the Web page. By interacting with the various components, new queries are generated (most often as SPARQL queries) to the various structWSF Web services endpoints. The result of these requests is to generate a structured results set, which includes various types and attributes.

An internal ontology that embodies the desired behavior and display options (SCO, the Semantic Component Ontology) is matched with these types and attributes to generate the formal instructions to the sComponents. When combined with the results set data, and attribute information in the irON ontology, plus the domain understanding in the domain ontology, a synthetic schema is constructed that instructs what the interface may do next. Here is an example schema:

Calculated Schema

(click for full size)

These instructions are then presented to the sControl component, which determines which widgets (individual components, with multiples possible depending on the inputs) need to be invoked and displayed on the layout canvas.

As new user interactions occur with the resulting displays and components, the iteration cycle is generated anew, again starting a new cycle of queries and results sets. Importantly, as these pathways and associated display components get created, they can be named and made persistent for later re-use or within dashboard invocations.

Self-service Reporting

Since self-service reporting has been such a disappointment [12], it is worth noting another aspect from this ODapp design. Every “thing” that can be presented in the interface can have a specific display template associated with it. Absent another definition, for example, any given “thing” will default to its parental type (which, ultimate, is “Thing”, the generic template display for anything without a definition; this generally defaults to a presentation of all attributes for the object).

However, if more specific templates occur in the inference path, they will be preferentially used. Here is a sample of such a path:

Thing
Product
Camera
Digital Camera
SLR Digital Camera
Olympus Evolt E520

At the ultimate level of a particular model of Olympus camera, its display template might be exactly tailored to its specifications and attributes.

This design is meant to provide placeholders for any “thing” in any domain, while also providing the latitude to tailor and customize to every “thing” in the domain.

It is critical that generic apps through an ODapp approach also provide the underpinnings for self-service reporting. The ultimate metric is whether consumers of information can create the reports they need without any support or intervention by IT.

Adaptive Analysis

The Mission Critical IT reference provided earlier [11] helps point to the potentials of this paradigm in a different way. Mission Critical also shows user interfaces contextually chosen based on prior selections. But they extend that advantage with context-specific analysis and validation through the SWRL rules-base semantic language. This is an exciting extension of the base paradigm that confirms the applicability of this approach to business intelligence and general enterprise analytics.

Standing Software Engineering on its Head

All of this points to a very exciting era for enterprise and consumer apps moving into the future. We perhaps should no longer talk about “killer apps”; we can shift our focus to the information we have at hand and how we want to structure and analyze it.

Using ontologies to write or specify code or to compete as an alternative to conventional software engineering approaches seems too much like more of the same. The systems basis in which such methodologies such as MDA reside have not fixed the enterprise software challenges of decades-long standing. Rather, a shift to generic applications driven by adaptive ontologies — ODapps — looks to shift the locus from software and programming to data and knowledge structures.

This democratization of IT means that everything in the knowledge management realm can become “self service.” We can create our own analyses; develop our own reports; and package and disseminate what we and our colleagues need, when they need it. Through ontology-driven apps and adaptive ontologies, we can turn prior decades of software engineering practices on their head.

What Structured Dynamics and a handful of other vendors are showing is by no means yet complete. Our roster of ODapp widgets and templates still needs much filling out. The toolsets available for creating, maintaining, mapping and extending the ontologies underlying these systems are still woefully inadequate [20]. These are important development needs for the near term.

And, of course, none of this means the end of software development either. Process and transactions systems still likely reside outside of this new, emerging paradigm. Creating great and solid generic ODapps still requires software. Further, ODapps and their potential are completely silent on how we create that software and with what languages or methodologies. The era of software engineering is hardly at an end.

What is exceptionally powerful about the prospects in ontology-driven apps is to speed time to understanding and place information manipulation directly in the hands of the knowledge worker. This is a vision of information access and control that has been frustrated for decades. Perhaps, with ontologies and these semantic technologies, that vision is now near at hand.


[1] This estimate is from Grady Booch, 2005. “The Complexity of Programming Models,” see http://www.cs.nott.ac.uk/~nem/complexity.pdf. He comments on the weakness of software lines of code as a meaningful measure. At the time in 2005, he estimated perhaps 800 billion lines of code has accumulated, which given growth and vagaries of such guesstimates I have updated to the 1 billion number noted.
[2] For a wildly different estimate, that has been criticized somewhat, see Blackduck Software, 2009. “Estimating the Development Cost of Open Source Software,” at http://www.blackducksoftware.com/development-cost-of-open-source. According to Blackduck’s research there are over 200,000 OSS projects on the Internet representing more than 4.9 billion lines of available code from 4,000 sites that the company monitors. Blackduck estimates that reproducing this OSS would cost $387 billion for “typical” SLOC estimating bases. While Blackduck is likely in the best place of any organization to track open source given their business model, others have criticized the estimates because only a portion (fewer than 10%, consistent with my own research) of open source projects are active, and many active projects also share significant code bases. Nonetheless, there is still a huge disparity between the 1 billion SLOC estimate in [1] and this estimate of 5 billion for open source alone. This disparity is an indicator of the measurement challenges.
[3] See IMAP, 2010. Computing & Internet Software Global Report — 2010, 40 pp, see http://imap.com/imap/media/resources/HighTechReport_WEB_89B4E29C01817.pdf. The relative splits they show for software packages and licenses, IT consulting or outsourcing are 48%, 29% and 23%, respectively, of the total shown. Note however, that Gartner estimates are as high as 2x these amounts, again showing the uncertainty of measuring software; see, for example, http://www.gartner.com/it/page.jsp?id=1209913.
[4] For this and related measures, see Business Software Alliance, 2009. Software Industry Facts and Figures, see http://www.bsa.org/country/Public%20Policy/~/media/Files/Policy/Security/General/sw_factsfigures.ashx.
[5] Simply conduct a Web search on ‘”open source” “cost of ownership”‘ to see the many studies in this area. Depending on advocacy, estimates may be as high as proprietary software to a lower, but still substantial percentage. In no cases are open source understood to be fully “free” once maintenance, upgrades, modifications, and site adaptations are considered.
[6] Michael Uschold, 2008. “Ontology-Driven Information Systems: Past, Present and Future,” in Proceedings of the Fifth International Conference on Formal Ontology in Information Systems (FOIS 2008), Carola Eschenbach and Michael Grüninger, eds., IOS Press, Amsterdam, Netherlands, pp 3-20; see http://mba.eci.ufmg.br/downloads/recol/FormalOntologyinInformationSystems2008.pdf.
[7] Nicola Guarino, 1998. “Formal Ontology and Information Systems,” in Proceedings of FOIS’98, Trento, Italy, June 6-8, 1998. Amsterdam, IOS Press, pp. 3-15; see http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.29.1776&rep=rep1&type=pdf.
[8] See Phil Tetlow et al., eds., 2006. Ontology Driven Architectures and Potential Uses of the Semantic Web in Software Engineering, a W3C Editor’s Draft on Best Practices, February 11, 2006; see http://www.w3.org/2001/sw/BestPractices/SE/ODA/. UML class diagrams have close resemblance to certain ontology structures. This effort was part of a formal collaboration between W3C and the Object Management Group (OMG), which resulted among other things in the production of the Ontology Definition Metamodel (ODM). In the OMG’s model-driven architecture (MDA) initiative, models are used not only for design and maintenance purposes, but as a basis for generating executable artifacts for downstream use. The MDA approach grew out of much of the standards work conducted in the 1990s in the Unified Modeling Language (UML).
[9] Neno is a semantic network programming language and Fhat is a virtual machine that works off of it. These two projects have been largely abandoned. A related project is Ripple, a relational, stack-based dataflow language by Joshua Shinavier, which is episodically updated.
[10] Holger Knublauch of TopQuadrant has made the point that ontologies can also have runtime uses as well: “In contrast to conventional Model-Driven Architecture known from object-oriented systems, semantic applications use their data models not only at design time, but also as runtime components. The rich declarative semantics of ontological data models can be exploited to drive user interfaces and to control an application’s behavior.” See H. Knublauch, 2007. “From Ontology Design to Deployment: Semantic Application Development with TopBraid,” presented at the 2007 Semantic Technology Conference, San Jose, CA; see http://www.semantic-conference.com/2007/sessions/l5.html.
[11] Mission Critical IT describes its ODASE platform (Ontology Driven Architecture for Software Engineering) as a set of tools to facilitate the creation of working applications from a semantic business model (an ontology), using the open standards OWL, SWRL and RDF. The ODASE code generators (a.k.a “robots”) generate an API based on the business terminology defined by the OWL+SWRL+RDF business model, which the ODASE platform then uses to execute the rules and reasoning as contextual choices are made by the user. Among other links, the company has an impressive online demo that shows a consumer telecommunications purchase example; there is also a video explaining the rules basis of the ODASE framework.
[12] See Wayne W. Eckerson, 2007. “The Myth of Self-Service Business Intelligence,” in TDWI Online, October 18, 2007; see http://tdwi.org/articles/2007/10/18/the-myth-of-selfservice-bi.aspx.
[13] The buggy whip industry as a major economic entity ceased to exist with the introduction of the automobile, and is cited in economics and marketing as an example of an industry ceasing to exist because its market niche, and the need for its product, disappears. Not recognizing what industry or business purpose is being served is an oft-cited cause for obsolescence. Thus, software engineering is a practice that serves the creation of software, which itself is only a means to a functional end.
[14] See M. K. Bergman, 2009. The Open World Assumption: Elephant in the Room,” AI3:::Adaptive Information blog, December 21, 2009. The open world assumption (OWA) generally asserts that the lack of a given assertion or fact being available does not imply whether that possible assertion is true or false: it simply is not known. In other words, lack of knowledge does not imply falsity. Another way to say it is that everything is permitted until it is prohibited. OWA lends itself to incremental and incomplete approaches to various modeling problems.
[15] See M.K. Bergman, 2010. Seven Pillars of the Open Semantic Enterprise, AI3:::Adaptive Information blog, January 12, 2010.
[16] See M.K. Bergman, 2009. Ontologies as the ‘Engine’ for Data-Driven Applications, AI3:::Adaptive Information blog, June 10, 2009, for the first presentation of these topics, but the specific term adaptive ontology was not yet used. That term was first introduced in “Confronting Misconceptions with Adaptive Ontologies” (August 17, 2009). The dedicated treatment of these topics and their interplay was provided in M.K. Bergman, 2009. “Ontology-driven Applications Using Adaptive Ontologies”, AI3:::Adaptive Information blog, November 23, 2009. The relation of these topics to enterprise software was first presented in M.K. Bergman, 2009. “Fresh Perspectives on the Semantic Enterprise”, AI3:::Adaptive Information blog, September 28, 2009.
[17] Some 250 pp of complete technical documentation for these projects is provided on the Structured Dynamics’ open source OpenStructs TechWiki.
[18] For more discussion of semantic components, see F. Giasson, 2010. “Semantic Components,” in his blog, July 5, 2010. For more discussion of the layered OSF design, see M.K. Bergman, 2010. Domain-specific Instantiations based on the Open Semantic Framework, AI3:::Adaptive Information blog, June 17, 2010.
[19] To find these groups and follow the open source OSF developments, see xxx. So long as the basic design comports with the foundations herein, sComponents may be developed in any rich Internet application (RIA) environment.
[20] Ontology development, management and mapping is the emerging imperative in the semantic technology space. For some thoughts on how Structured Dynamics is approaching this question, see a Normative Landscape of Ontology Tools on the TechWiki.

Posted by AI3's author, Mike Bergman

Posted on March 7, 2011 at 2:19 am in irON, Ontologies, Ontology Best Practices, Open Semantic Framework, Semantic Enterprise, Semantic Web, Software Development, Web-oriented Architecture | Comments (4)
The URI link reference to this post is: http://www.mkbergman.com/948/ontology-driven-apps-using-generic-applications/
The URI to trackback this post is: http://www.mkbergman.com/948/ontology-driven-apps-using-generic-applications/trackback/
Date:   January 17, 2011

The Hollowing Out of Enterprise ITReasons for and Implications from Innovation Moving to Consumers

Today, the headlines and buzz for information technologies centers on smartphones, social networks, cloud computing, tablets and everything Internet. Very little is now discussed about IT in the enterprise. This declining trend began about 15 years ago, and has been accelerating over time. Letting the air out of the enterprise IT balloon has some profound reasons and implications. It also has some lessons and guidance related to semantic approaches and technologies and their adoption by enterprises.

A Brief Look at Sixty Years of Enterprise IT

One can probably clock the start of enterprise information technology (IT) to the first use of mainframe computers in the early 1950s [1], or sixty years ago. The earliest mainframes were huge and expensive machines that required their own specially air-conditioned rooms because of the heat they generated. The first use of “information technology” as a term occurred in a Harvard Business Review article from 1958 [2].

Until the late 1960s computers were usually supplied under lease, and were not purchased [3]. Service and all software were generally bundled into the lease amount without separate charge and with source code provided. Then, in 1969, IBM led an industry change by starting to charge separately for (mainframe) software and services, and ceasing to supply source code [3]. At about the same time integrated circuits enabled computer sizes to be reduced, with the minicomputers such as from DEC causing a marked expansion in number of potential customers. Enterprise apps became a huge business, with software licensing and maintenance fees achieving a peak of 70% of IT vendor total revenues by the mid-1990s [4]. However, since that peak, enterprise software as a portion of vendor revenues has been steadily eroding.

One of the earliest enterprise applications was in transaction systems and their underlying database management software. The relational database management system (RDBMS) was initially developed at IBM. Oracle, based on early work for the CIA in the late 1970s and its innovation to write in the C programming language, was able to port the RDBMS to multiple operating systems. These efforts, along with those of other notable vendors (most of which like Informix no longer exist), led to the RDBMS becoming more or less the de facto standard for data management within the enterprise by the 1980s. Today Oracle is the largest supplier of RDBMS software globally, and other earlier database system designs such as network databases or object databases fell out of favor [5].

In 1975, the Altair 8800 was introduced to electronics hobbyists as the first microcomputer, followed then by Apple II and the IBM PC in 1981, among others. Rapidly a slew of new applications became available to the individual, including spreadsheets, small databases, graphics programs and word processors. These apps were a boon to individual productivity and the IBM PC in particular brought credibility and acceptance within the enterprise (along with the growth of Microsoft). Novell and local area networks also pointed the way to a more distributed computing future. By the late 1980s virtually every knowledge worker within enterprises had some degree of computer literacy.

The apogee for enterprise software and apps occurred in the 1990s, with whole classes of new applications (most denoted by three-letter acronyms) such as enterprise resource planning (ERP), business intelligence (BI), customer relationship management (CRM), enterprise information systems (EIS) and the like coming to the fore. These systems also began as proprietary software, which resulted in the “stovepiping” or creating of information silos. In reaction and with great market acceptance, vendors such as SAP arose to provide comprehensive, enterprise-wide solutions, though often at high cost and with significant failure rates.

More significantly, the 1990s also saw the innovation of the World Wide Web with its basis in hypertext links on the Internet. Greatly facilitated by the Mosaic Web browser, the basis of the commercial Netscape browser, and the HTML markup language and HTTP transport protocol, millions began experiencing the benefit of creating Web pages and interconnecting. By the mid-1990s, enterprises were on the Web in force, bringing with them larger content volumes, dynamic databases and enterprise portals. The ability for anyone to become a publisher led to a focus and attention on the new medium that led to still further innovations in e-commerce and online advertising. New languages and uses of Web pages and applications emerged, creating a convergence of design, media, content and interactivity. Venture capital and new startups with valuations independent of revenues led to a frenzy of hype and eventually the dot com crash of 2000.

The growth companies of the past 15 years have not had the traditional focus on enterprises, but on the use and development of the Web. From search (Google) to social interactions (Facebook) to media and video (Flickr, YouTube) and to information (Wikipedia), the engines of growth have shifted away from the enterprise.

Meanwhile, the challenges of data integration and interoperability that were such a keen focus going back to initial enterprise computerization remain. Now, however, these challenges are even greater, as we see images, documents (unstructured data) and Web pages, markup and metadata (semi-structured data) become first-class information citizens. What was a challenge in integrating structured data in the 1980s and 1990s via data warehousing, has now become positively daunting for the enterprise with respect to scale and scope.

The paradox is that as these enterprise needs increased, the attractiveness of the enterprise from an IT perspective has greatly decreased. It is these factors we discuss below, with an eye to how Web architecture, design and opportunities may offer a new path through the maze of enterprise information interoperability.

The Current Landscape

Since 1995 the Gartner Group has been producing its annual Hype Cycle [6]. The clientele for this research is the enterprise, so Gartner’s presentation of what’s hot and what’s hype and what is being adopted is a good proxy for the IT state of affairs in enterprises. These graphs are reproduced below since 2006 (click to expand). Note how many of the items shown are not very specific to the enterprise:

Gartner Hype Cycles - 2006 to 2010

References to architectures and content processing and related topics were somewhat prevalent in 2006, but have disappeared most recently. In comparison to the innovations noted under the History discussion, it appears that the items on Gartner’s radar are more related to consumer applications and uses. We no longer see whole new categories of enterprise-related apps or enterprise architectures.

The kinds of innovations that are being discussed as important to enterprises in the coming year [7,8] tend to mostly leverage existing innovations in other areas or to wrinkle existing approaches. One report from Constellation Research, for example, lists the five core disruptive technologies of social, mobile, cloud, analytics and unified communications [7]. Only analytics could be described as enterprise focused or driven.

And, even in analytics, the kinds of things being promoted are self-service reporting or analysis [8]. In essence, these opportunities represent the application of Web 2.0 techniques to bring reporting or analysis directly to the analyst. Though important and long overdue, such innovations are more derivative than fundamental.

Master data management (MDM) is another touted area. But, to read analyst’s predictions in these areas, it feels like one has stepped into a time warp of technologies and options from a decade ago. When has XML felt like an innovation?

Of course, there is a whole industry of analysts that makes their living prognosticating to enterprises about what to expect from information technologies and how to adopt and embrace them. The general observations — across the board — seem to center on items such as smartphones and mobile, moving to the cloud for software or platforms (SaaS, PaaS), and collaboration and social networks. As I note below, there is nothing inherently wrong or unexciting per se about these trends. But, what does appear true is that the locus of innovation has shifted from the enterprise to consumers or the Internet.

Seven Reasons for a Shift in Innovation

The shift in innovation away from the enterprise has been structural, not cyclical. That means that very fundamental forces are at work to cause this change in innovation focus. It does not mean that innovation has permanently shifted away from the enterprise (organizations), but that some form of countervailing structural changes would need to occur to see a return to the IT focus on the enterprise from prior decades.

I think we can point to seven structural reasons for this shift, many of which interact with one another. While all of them are bringing benefits (some yet to be foreseen) to the enterprise, and therefore are to be lauded, they are not strictly geared to address specific enterprise challenges.

#1: The Internet

As pundits say, “The Internet changes everything” [9]. For the reasons noted under the history above, the most important cause for the shift in innovation away from the enterprise has been the Internet.

One aspect that is quite interesting is the use of Internet-based technologies to provide “outsourced” enterprise applications hosted on Web servers. Such “cloud computing” leverages the technologies and protocols inherent to the Internet. It shifts hosting, maintenance and upgrade responsibilities for conventional apps to remote providers. Initially, of course, this simply shifts locus and responsibility from in-house to a virtual party. But, it is also the case that such changes will also promote more subtle shifts in collaboration and interaction possibilities. There is also the fact that quick upgrades of underlying infrastructure and application software can also occur.

The implications for existing enterprise IT staff, traditional providers, and licensing and maintenance approaches are profound. The Internet and cloud computing will perhaps have a greater effect on governance, staffing and management than application functionality per se.

#2: Consumer Innovations

The captivating IT-related innovations at present are mobile (smartphones) and their apps, tablets and e-book readers, Internet TV and video, and social networks of a variety of stripes. Somewhat like the phenomenon of when personal computers first appeared, many of these consumer innovations have applicability to the enterprise, though only as a side effect.

It is perhaps instructive to look back at the adoption of PCs in the enterprise to understand the possible effect of these new consumer innovations. Central IT was never able to control and manage the proliferation of personal computers, and only began to understand years later what benefits and new governance challenges they brought. Enterprise leaders will understand how to embrace and extend today’s new consumer technologies for the enterprise’s benefits; laggards will resist to no avail.

The ubiquity of computing will be enormously impactful on the enterprise. The understanding of what makes sense to do on a mobile basis with a small screen and what belongs on the desk or in the office is merely a glimmer in the current conversation. However, in the end, like most of the other innovations noted in this analysis, the enterprise will largely be a reactive player to these innovations. Yes, the implications will be profound, but their inherent basis are not grounded in unique enterprise challenges. Nonetheless, adapting to them and changing business practice will be critical to asserting enterprise leadership.

#3: Open Source

Open Source Growth

Ten years ago open source was largely dismissed in the enterprise. About five years ago VCs and others began funding new commercial open source ventures, even while there were still rear guard arguments from enterprises resisting open source. Meanwhile, as the figure to the right shows, open source projects were growing exponentially [10].

The shift to open source in the enterprise, still ongoing, has been rapid. Within 5 years, more than 50% of enterprise software will be open source [11] . According to an article in Fortune magazine last year [12], a Forrester Research survey found that 48% of enterprise respondents were using open source operating systems, and 57% were using open source code. A similar Accenture survey of 300 large public and private companies found that half are committed to open source software, with 38% saying they would begin using open-source software for “mission-critical” applications over the next 12 months.

There are likely many reasons for this shift, including the Internet itself and its basis in open source. Many of the most successful companies of the past 15 years including Amazon, Google, Facebook, and virtually any large Web site has shown excellent performance and scalability building their IT infrastructure around open source foundations. Most of the large, existing enterprise IT vendors, notably including IBM, Oracle, Nokia, Intel, Sun (prior to Oracle), Citrix, Novell (just acquired by Attachmate) and SAP have bought open source providers or have visible support for open source initiatives. Even two of the most vocal proprietary source proponents of the past — HP and Microsoft — have begun to make moves toward open source.

The age of proprietary software based on proprietary standards is dead. The monopoly rents formerly associated with unique, proprietary platforms and large-scale enterprise apps are over. Even where software remains proprietary, it is embracing open standards for data interchange and APIs. Traditional enterprise apps such as content management, business intelligence and ETL, among all others, are being penetrated by commercial open source offerings (as examples, Alfresco, Pentaho and Talend, respectively). The shift to services and new business models appears to be an inexorable force.

Declining profit margins, matched with the relatively high cost of marketing and sales to enterprises, means attention and focus have been shifting away from the enterprise. And with these shifts in focus has come a reduction in enterprise-focused innovation.

#4: Slow Development Cycles in Enterprise

It is not unusual to find deployed systems within enterprises as old as thirty years [13]. So long as they work reasonably well, systems once installed — along with their data — tend to remain in operation until their platforms or functionality become totally obsolete. This leads to rather lengthy turnover cycles, and slow development cycles.

Slow cycles in themselves slow innovation. But slow development cycles are also a disincentive to attract the most capable developers. When development tends to focus on maintenance and scripts and more routines of the same nature, the best developers tend to migrate elsewhere (see next).

Another aspect of slow development cycles is the imperative for new enterprise IT to relate to and accommodate legacy systems — again, including legacy data. This consideration is the source of one of the negative implications of a shift away from innovation in the enterprise: the orphaning of existing information assets.

#5: What’s Hot: Developers

Arguably the emphasis on consumer and Internet technologies means that is where the best developers gravitate. Developing apps for smartphones or working at one of the cool Internet companies or joining a passionate community of open source developers is now attracting the best developers. Open source and Web-based systems also lead to faster development cycles. The very best developers are often the founders of the next generation startups and Web and software companies [14].

While, of course, huge numbers of computer programmers and IT specialists are hired by enterprises each year, the motivations tend to be higher pay, better benefits and more job security. The nature of the work and the bureaucracy and routine of many IT functions require such compensation. And, because of the other shifts noted elsewhere, even the software startups that are able to attract the most innovative developers no longer tend to develop for enterprise purposes.

Computer science students have been declining in industrialized countries for some time and that is the category of slowest growth in IT [14]. Meanwhile, existing IT personnel often have expertise in older legacy systems or have been focused on bug fixes and more prosaic tasks like report writing. Narrow job descriptions and work activities also keep many existing IT personnel from getting exposed to or learning about new trends or innovations, such as the semantic Web.

Declining numbers of new talent, plus declining interest by that talent, combined with (often) narrow and legacy expertise of existing talent, creates a disappointing storm of energy and innovation to address enterprise IT challenges. Enterprises have it within their power to create more exciting career opportunities to overcome these limitations, but unfortunately IT management often also appears challenged to get on top of these structural forces.

#6: What’s Hot: Startups

Open source and Internet-based systems have reduced the capital necessary for a new startup by an order of magnitude or so over the past decade. It is now quite possible to get a new startup up and running for tens to hundreds of thousands of dollars, as opposed to the millions of years past. This is leading to more startups, more startups per innovator, and quicker startup and abandonment cycles. Ideas can be tried quickly and more easily thrown away [15].

These dynamics are acting to accelerate overall development cycles and to cause a shift in funding structures and funding amounts by VCs and angels. The kind of market and sales development typical for many enterprise sales does not fit well within these dynamics and is a countervailing force for more capital when all trends point the other way.

In short, all of this is saying that money goes to where the returns are, and returns are not of the same basis as decades past in the enterprise sector. Again, this means a hollowing out of innovation for enterprises.

#7: Declining Software Rents and Consolidation

As an earlier reference noted [4], software revenues as a percent of IT vendor revenues peaked in about the mid-1990s. As profitability for these entities began to decline, so did the overall attractiveness of the sector.

As the next chart shows, coincident with the peak in profitability was the onset of a consolidation trend in the enterprise IT vendor sector [16]. The chart below shows that three of the largest IT vendors today — Oracle, IBM and HP — began an acquisition spree in the mid-1990s that has continued until just recently, as many of the existing major players have already been acquired:

Major !T Acquisitions, 1994 - 2010

Notable acquisitions over this period include: Oracle — PeopleSoft, Siebel Systems, MySQL, Hyperion, BEA and Sun; HP — EDS, 3Com, VeriFone, Compaq, Palm and Mercury Interactive; IBM — Lotus, Rational, Informix, Ascential, FileNet, Cognos and SPSS. Published acquisition costs exceeded $130 billion, mostly for the larger deals. But terms for 75% of the 262 transactions were not disclosed [16]. The total value of these consolidations likely approaches $200 billion to $300 billion.

Clearly, the market is now favoring large players with large service components. This consolidation trend does belie one early criticism of open source v proprietary software: proprietary software is likely to be better supported. In theory this might be true, but vanishing suppliers does not bode well for support either. Over time, we may likely see successful open source projects showing greater longevity than many IT vendors.

Positive Implications from the Decline

This discussion is not a boo-hoo because the heyday of enterprise IT innovation is past. Much of that innovation was expensive, often failed to achieve successful adoption, and promoted walled gardens and silos. As someone who ran companies directly involved in enterprise software sales, I personally do not miss the meetings, the travel, the suits and the 18-month sales cycles.

The enterprise has gained much from outside innovation in the past, from the personal computer to LANs and browsers and the Internet. To be sure, what we are now seeing with mobile phones has more computing power than the original Space Shuttle [17], and continued mashup and social engagement innovations will have unforeseen and manifest benefits for enterprises. I think this is unalloyed goodness.

We can also see innovations based on the Internet such as the semantic Web and its languages and standards to promote interoperability. Breaking these barriers is critically needed by enterprises of the future. Data models such as RDF [18] and open world mindsets that better accommodate uncertainty and breadth of information [19] can only be seen as positive. The leverage that will come from these non-enterprise innovations may in the end prove to be as important as the enterprise-specific innovations of the past.

Negative Implications from the Decline

Yet a shift to Internet and consumer IT innovation leaves some implications. These concerns have to do with the unique demands and needs of enterprises. One negative implication is that a diminishing supplier base may not lead to actual deployments that are enterprise-ready or -responsive.

The first concern relates to quality and operational integrity. There is an immense gulf between ISO 9000 or Six Sigma and, for example, the “good enough” of standard search results on the Web. Consumer apps do not impose the same thresholds for quality as demanded by paying bosses or paying customers. This is not a value judgment; simply a reality. I see it reflected in the quality of tools and code for many new innovations today on the Web.

Proofs-of-concept and “cool” demos work well for academic theses or basic intros to new concepts. The 20% that gets you 80% goes a long way to point the way to new innovation; but the 80% to get to the last 20% is where enterprises bet their money. Unfortunately, in too many instances, that gap is not being filled. The last 20% is hard work, often boring, and certainly not as exciting as the next Big Thing. And, as the trends above try to explicate, there are also diminishing rewards for living in that territory.

A similar and second concern pervades data interoperability. Data interoperability has been the central challenge of enterprise IT for at least three decades. As soon as we were able to interconnect systems and bridge differences in operating systems and data schema, the Holy Grail has been breaking information barriers and silos. The initial attempts with proprietary data warehouses or enterprise-wide ERP systems were wrongly trying to apply closed solutions to inherently open problems. But, now, finally when we have the open approaches and standards in hand for bridging these gaps, the attractiveness of doing so for the enterprise seems to have vanished.

For example, we see demos, tools and algorithms being published all over the place that show promising advances or improvements in the semantic Web or linked data (among other areas; see [20]). Some of these automated techniques sound wonderful, but real systems require the hard slog of review and manual approval. Quality matters. If Technique A, say, shows an improvement over Technique B of 5%, that is worth touting. But even at 98% percent accuracy, we will still find 20,000 errors in a population of 1 million items. Such errors will simply not work in having trains run on time, seats be available on airplanes, or inventory get to their required destinations.

What can work from the standpoint of linkage or interoperability on the Web according to consumer standards will simply not fly for many enterprises. But, where are the rewards for tackling that hard slog?

Another concern is security and differential access. Open Web systems, bless their hearts, do not impose the same access and need to know restrictions as information systems within enterprises. If we are to adopt Web-based approaches to the next-generation enterprise — a position we strongly advocate — then we are also going to need to figure out how to marry these two world views. Again, there appears to be an effort-reward mismatch here.

What Lessons Might be Drawn?

These observations are not meant to be a polemic, but a statement of more-or-less current circumstances. Since its widescale adoption, the major challenge — and opportunity — of enterprise IT has been how to leverage the value within the enterprise’s existing digital information assets. That challenge is augmented today with the availability of literally a whole world of external digital knowledge. Yet, the energy and emphasis for innovation to address these challenges has seemingly shifted to consumers and away from the enterprise.

Economics abhors a vacuum. I think two responses may be likely to this circumstance. The first is that new vendors will emerge to address these gaps, but with different cost structures and business models. I’d like to think my own firm, Structured Dynamics, is one of these entities. How we are addressing this opportunity and differences in our business model we will discuss at a later time. In any case, any such new player will need to take account of some of the structural changes noted above.

Another response can come from enterprises themselves, using and working the same forces of change noted earlier. Via collaboration and open source, enterprises can band together to contribute resources, expertise and people to develop open source infrastructures and standards to address the challenges of interoperability. We already see exemplars of such responses in somewhat related areas via initiatives such as Eclipse, Apache, W3C, OASIS and others. By leveraging the same tools of collaboration and open data and systems and the Internet, enterprises can band together and ensure their own self-interests are being addressed.

One advantage of this open, collaborative approach is that it is consistent with the current innovation trends in IT. But the real advantage is that it works and is needed. Without it, it is unclear how the enterprise IT challenge — especially in data interoperability — will be met.


[1] Though calculating machines and others extend back to Charles Babbage and more relevant efforts during World War II, the first UNIVAC was delivered to the US Census Bureau in 1951, and the first IBM to the US Defense Department in 1953. Many installations followed thereafter. See, for example, Lectures in the History of Computing: Mainframes.
[2] As provided by “information technology” (subscription required), Oxford English Dictionary (2 ed.), Oxford University Press, 1989, http://dictionary.oed.com/, retrieved 12 January 2011.
[3] See further the Wikipedia entry on proprietary software.
[4] M.K. Bergman, 2006. “Redux: Enterprise Software Licensing on Life Support,” AI3:::Adaptive Information blog, June 2, 2006. See http://www.mkbergman.com/111/the-death-of-enterprise-software-licensing/.
[5] The combination of distributed network systems and table-oriented designs such as Google’s BigTable and related open source Hadoop, plus many scripting languages, is leading to the resurgence of new database designs including NoSQL, columnar, etc.
[6] The Gartner Hype Cycle is a graphical representation of the maturity, adoption and application of technologies. It proceed through five phases beginning with a technology trigger and then, if successful, ultimately adoption. The peak of the curve represents the biggest “hype” for the innovation.The information in these charts is courtesy of Gartner. The sources for the charts are summary Gartner reports for 2010, 2009, 2008, and 2006. 2007 was skipped to provide a bit longer time horizon for comparison purposes.
[7] As summarized by Klint Finley, 2011. “How Will Technology Disrupt the Enterprise in 2011?,” ReadWriteWeb Enterprise blog, January 4, 2011.
[8] Jaikumar Vijayan, 2011. “Self-service BI, SaaS, Analytics will Dominate in 2011,” in Computerworld Online, January 3, 2011.
[9] According to Google on January 12, 2011, there were 251,000 uses of this exact phrase on the Web.
[10] Amit Deshpande and Dirk Riehle, 2008. “The Total Growth of Open Source,” in Proceedings of the Fourth Conference on Open Source Systems (OSS 2008), Springer Verlag, pp 197-209; see http://dirkriehle.com/wp-content/uploads/2008/03/oss-2008-total-growth-final-web.pdf.
[11] See http://futureofopensource.drupalgardens.com/2010-survey-results.
[12] See http://tech.fortune.cnn.com/2010/08/16/how-corporate-america-went-open-source/.
[13] For example, according to James Mullarney in 2005, “How to Deal with the Legacy of Legacy Systems,” the average age of IT systems in the insurance industry was 23 years. In that same year, according to Logical Minds, a survey by HAL Knowledge Systems showed the average age of applications running core business processes to be 15 years old, with almost 30 per cent of companies maintaining software that is 25 years old or older.
[14] For general IT employment trends, see the Bureau of Labor Statistics; for example, http://www.bls.gov/oco/ocos303.htm.
[15] See, for example, Paul Graham, 2010. “The New Funding Landscape,” Blog post, October 2010.
[16] This chart was constructed from these sources: Oracle — http://en.wikipedia.org/wiki/List_of_acquisitions_by_Oracle; IBM — http://en.wikipedia.org/wiki/List_of_mergers_and_acquisitions_by_IBM; and HP — http://en.wikipedia.org/wiki/List_of_acquisitions_by_Hewlett-Packard. Of course, other acquisitions occurred by other players over this period as well.
[17] Current smartphones may have around 2 GHz in processing power and 1 GB of RAM; see for example, this Motorola press release. By comparison to the Shuttle, see http://en.wikipedia.org/wiki/Space_Shuttle#Flight_systems.
[18] M. K. Bergman, 2009. “Advantages and Myths of RDF,” AI3:::Adaptive Information blog, April 8, 2009.
[19] M. K. Bergman, 2009. “The Open World Assumption: Elephant in the Room,” AI3:::Adaptive Information blog, Dec. 21, 2009.
[20] See, for example, the Sweet Tools listing of 900 semantic Web and -related tools on this AI3:::Adaptive Information blog.

Posted by AI3's author, Mike Bergman

Posted on January 17, 2011 at 4:31 pm in Adaptive Innovation, Open Source, Semantic Enterprise, Software and Venture Capital | Comments (4)
The URI link reference to this post is: http://www.mkbergman.com/940/declining-it-innovation-in-the-enterprise/
The URI to trackback this post is: http://www.mkbergman.com/940/declining-it-innovation-in-the-enterprise/trackback/
Page 3 of 512345
Copyright © 2004–2013 Michael K. Bergman.   This work is licensed under a Creative Commons License