Posted: October 25, 2005

From Broadcast Newsroom: BBN Technologies has just released version 2.0 of its AVOKE STX speech-to-text software. According to BBN, the new version improves the relevance of multimedia search results by transforming audio into searchable text with unprecedented accuracy. Applications include enterprise search, business and government intelligence, consumer search, audio mining, video search, broadcast monitoring, and multimedia asset management.

BBN says AVOKE STX 2.0 separates speech from non-speech, such as music or laughter, and then processes the speech to identify additional characteristics. This information is captured, tagged with metadata, and indexed in an XML format for use by standard search engines or other search technology. Because each word in the metadata is time-stamped, users can navigate easily to any point in the transcript, listen to the original audio, or watch the corresponding video.
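BBN's actual metadata schema is not described in the announcement, so the sketch below uses invented element and attribute names. It simply shows why per-word time stamps matter: a search application can jump from a matched word straight to the corresponding offset in the audio or video.

import xml.etree.ElementTree as ET

# Hypothetical time-stamped transcript metadata; the real AVOKE STX format may differ.
transcript_xml = """
<transcript source="broadcast-2005-10-25.wav">
  <word start="12.40" end="12.71">semantic</word>
  <word start="12.71" end="13.02">search</word>
  <word start="17.88" end="18.15">metadata</word>
</transcript>
"""

def find_offsets(xml_text, term):
    """Return the start times (in seconds) at which a search term was spoken."""
    root = ET.fromstring(xml_text)
    return [float(w.get("start")) for w in root.iter("word")
            if w.text and w.text.lower() == term.lower()]

print(find_offsets(transcript_xml, "metadata"))  # [17.88] -- a player could seek to 17.88 s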

BBN’s legacy includes a key role in pioneering the development of the ARPANET, the forerunner of the Internet. BBN supports both commercial and government clients. Its AVOKE speech technology translates Arabic and Chinese, with additional foreign languages planned.

Posted: October 23, 2005

Michael Wacey argues in The Semantic Organization: Knowing What You Know that corporations have a tremendous amount of stored information and are likely to be the early adoption point for semantic Web capabilities, similar to the way in which corporations proved to be the early adopters of Web services and the underlying technologies (UDDI, WSDL, SOAP) initially designed for the Web at large.

I agree with his premise that Web-wide adoption of semantic tagging is unlikely at first and individual organizations offer better and easier proving grounds.  However, my experience has been that government agencies have been the leaders in semantic and entity extraction; for reasons noted elsewhere, corporations have been slow on the uptake.

Some of the stumbling blocks appear to be a lack of understanding of the benefits by top management and the lack of automated and accurate means to "tag" content at scale and then manage it. Until these fundamental sticking points are eased, we are likely to continue to see leadership in promoting semantic Web capabilities come from government entities, where lives and national security are at stake.

Posted: October 21, 2005

This post introduces a new category area in my blog related to what I and BrightPlanet are terming the eXtensible Semi-structured Data Model (XSDM). Topics in this category cover all information related to extensible data models and engines applicable to documents, metadata, attributes, semi-structured data, or the processing, storing and indexing of XML, RDF, OWL, or SKOS data formats.

Why this category is important is introduced by Fengdong Du in the master’s thesis, Moving from XML Documents to XML Databases, submitted to the University of British Columbia in March 2004. As succinctly stated in the introduction to that thesis:

Depending on the characteristics of XML applications, the current XML storage techniques can be classified into two major categories. Most text-centric applications (e.g., newspapers) choose an existing file system for data storage. Data is usually divided into logical units, and each logical unit is physically stored as a separate file. As an example, a newspaper application may divide the entire year's newspapers into 12 collections by months, and store each collection as a document file. This type of application usually provides a keyword-based search tool and manipulates the data in application-specific processes. While this approach simplifies the storage problem, it has some major drawbacks. First, storing XML data as plain text makes it difficult to develop a generic data manipulation interface.

Second, mapping logical units of data to individual files makes it difficult to view the data from a different perspective. For this reason, this type of application only provides services with limited functionalities and therefore restricts the usage of data.

On the other hand, in data-centric applications such as e-commerce applications, data is typically highly-structured, e.g., extracted from a relational database management system (RDBMS). XML is primarily used as a tool to publish data to the Web or deliver information in a self-descriptive way in place of the conventional relative files. This type of application relies on the RDBMS for data storage. Data received in XML format is eventually put into an RDBMS when persistence is desired. Over the years, an RDBMS has been well developed to efficiently store and retrieve well-structured data. Structured Query Language (SQL) and many useful extended RDBMS utilities (e.g., Programming Language SQL, stored procedures) act as an application-independent data manipulation interface. Applications can communicate with databases through this generic interface and, on top of it, provide services with very rich functionalities.

While storing XML data into an RDBMS can take advantage of the well-developed relational database techniques and open interfaces, this approach requires an extra schema-mapping process applied to XML data, which involves schema transformation and usually decomposition. The schemas of XML data have to be mapped to strictly-defined relational schemas before data is actually stored. This process is strongly application-dependent or domain-dependent because there must be enough information available to determine many relational database design issues such as which table in the target RDBMS is a good place to store the information delivered, what new tables need to be created, which elements/attributes should be indexed, etc. No matter how this kind of information is obtained, whether delivered with XML data as schemas and processing instructions, or the application context makes it obvious, it is hard to develop an automatic and generic schema-mapping mechanism. Instead, application-specific work needs to take care of the schema-mapping problem. This involves non-trivial work of database server-side programming and database administration.

Another drawback of storing XML data in an RDBMS is that it is hard to efficiently support many types of queries that people want to ask on XML data. In RDBMS, each table has a pre-defined primary key field, and possibly a few other indexed fields. Queries not on the key field and not on the indexed fields will result in table scans (i.e., possibly a very large number of I/O's, which can be very time consuming) such as for the following path and predicate expression:

//department[@street="main mall"]/student[@nationality="Chinese"]

It is very likely that "department" is not indexed on "street" and that "student" is not indexed on "nationality". Therefore, resolving this path expression will cause table scans. Moreover, storing XML data in an RDBMS often results in schema decomposition and produces many small tables. Hence, evaluating a query often needs many expensive join operations.

For unstructured or semi-structured data, an RDBMS has greater difficulty, and query performance is usually unacceptable for relatively large amount of data. For these reasons, a native database management system is expected in the XML world. Like a traditional RDBMS, native XML databases would provide a comprehensive and generic data management interface, and therefore isolate lower level details from the database applications. Unlike an RDBMS, an ideal native XML database would make no distinction between unstructured data and strictly structured data. It treats all valid XML data in the same way and manages them equally efficiently. Its performance is only affected by the type of data manipulation. In other words, an ideal XML native database is not only access transparent but also performance transparent upon the structural difference of data.
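To make the thesis's point concrete, the invented sample below evaluates the same department/student path expression directly against XML using Python's standard ElementTree. Against a shredded relational schema, the equivalent query would typically require a join across department and student tables and, absent indexes on street and nationality, full table scans.

import xml.etree.ElementTree as ET

# Invented sample data matching the path expression quoted above.
doc = ET.fromstring("""
<university>
  <department name="CS" street="main mall">
    <student name="Wei" nationality="Chinese"/>
    <student name="Ana" nationality="Brazilian"/>
  </department>
  <department name="History" street="memorial road">
    <student name="Li" nationality="Chinese"/>
  </department>
</university>
""")

# //department[@street="main mall"]/student[@nationality="Chinese"]
matches = doc.findall(".//department[@street='main mall']/student[@nationality='Chinese']")
print([s.get("name") for s in matches])  # ['Wei']

# One possible relational decomposition would answer the same question with a join:
#   SELECT s.name FROM department d JOIN student s ON s.dept_id = d.id
#   WHERE d.street = 'main mall' AND s.nationality = 'Chinese';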

Future topics in this XSDM area will expand on these challenges and describe new standards-based solutions being developed by BrightPlanet that specifically address them.

Posted by AI3's author, Mike Bergman Posted on October 21, 2005 at 3:06 pm in Adaptive Information, Information Automation, Semantic Web | Comments (0)
The URI link reference to this post is: http://www.mkbergman.com/146/the-semantic-web-demands-different-database-models/
The URI to trackback this post is: http://www.mkbergman.com/146/the-semantic-web-demands-different-database-models/trackback/
Posted: October 6, 2005

Collaboration is important. BrightPlanet’s earlier research paper on the waste associated with enterprise document use (or lack thereof) indicated that $690 billion a year could be reclaimed by U.S. enterprises from better sharing of information alone. That represents 88% of the total $780 billion wasted annually.

The issue of poor document use within the organization is certainly not solely a technological issue; it is likely due more to cultural and people issues, not to mention process. At BrightPlanet, we have been asking for a concerted “document as you go” commitment from our developers and support people, and have worked hard to put Wiki and other collaboration tools in place to minimize friction.

But friction remains, often stubbornly so. At heart, the waste and misuse of document assets within organizations arises from a complex set of these people, process and technology issues.

Dave Pollard, the inveterate blogger on KM and other issues, provided a listing of 16 reasons for ‘Why We Don’t Share Stuff’ on September 19.[1] That thoughtful posting received a hailstorm of responses, which caused Dave to update the listing to 23 reasons on September 29 under a broader post called ‘Knowledge Sharing & Collaboration 2015’ (a later post upped that amount to 24 reasons). (BTW, my own additions below have upped this number to 40, though high listing numbers are beside the point.) This is great stuff, and nearly complete grist for laying out the reasons — some major and some minor — why collaboration is often difficult.

I have taken these reasons, plus some others I’ve added of my own or from other sources, and have attempted to cluster them into the various categories below.[2] Granted, these assignments are arbitrary, but they are also telling as the concluding sections discuss.

People, Behavior and Psychology

These are possible reasons why collaboration fails due to people, behavior or psychological factors. They represent the majority (56%) of the reasons proffered by Pollard:

  • People find it easier and more satisfying to reinvent the wheel than re-use other people’s ‘stuff’ (*)
  • People only accept and internalize information that fits with their mental models and frames (Lakoff’s rule) (*)
  • Some modest people underestimate the value of what they know so they don’t share (*)
  • We all learn differently (some by reading, some by listening, some by writing down, some by hands-on), and people won’t internalize information that isn’t in a format attuned to how they learn (one size training doesn’t fit all) (*)
  • People grasp graphic information more easily than text, and understand information conveyed through stories better than information presented analytically (we learn by analogy, and images and stories are better analogies to our real-life experiences than analyses are) (*)
  • People cannot readily differentiate useful information from useless information (* split)
  • Most people want friends and even strangers to succeed, and enemies to fail; this has a bearing on their information-sharing behaviour (office politics bites back) (*)
  • People are averse to sharing information orally, and even more averse to sharing it in written form, if they perceive any risk of it being misused or misinterpreted (the better safe than sorry principle) (*)
  • People don’t take care of shared information resources (Tragedy of the Commons again) (*)
  • People seek out like minds who entrench their own thinking (leads to groupthink) (**)
  • Introverts are more comfortable wasting time looking for information rather than just asking (sometimes it’s just more fun spending 5 hours on secondary research, or doing the graphics for your powerpoint deck by trial and error, than getting your assistant to do it for you in 5 minutes) (**)
  • People won’t (or can’t) internalize information until they need it or recognize its value (most notably, information in e-newsletters is rarely absorbed because it rarely arrives just at the moment it’s needed) (**)
  • People don’t know what others who they meet know, that they could benefit from knowing (a variant on the old “don’t know what we don’t know” — “we don’t know what we don’t know that they do”) (**)
  • If important news is withheld or sugar-coated, people will ‘fill in the blanks’ with an ‘anti-story’ worse than the truth (**)
  • Experts often speak in jargon or “expert speak.” They don’t know they aren’t communicating, and non-experts are afraid to ask (***).

Management and Organization

These are possible reasons why collaboration fails due to managerial or organizational limits. They represent about one-fifth (20%) of the reasons proffered by Pollard:

  • Bad news rarely travels upwards in organizations (shoot the messenger, and if you do tell the boss bad news, better have a plan to fix it already in motion) (*)
  • People share information generously peer-to-peer, but begrudgingly upwards (“more paperwork for the boss”), and sparingly downwards (“need to know”) in organizational hierarchy — it’s all about trust (*)
  • Managers are generally reluctant to admit they don’t know, or don’t understand, something (leads to oversimplifying, and rash decision-making) (*)
  • Internal competition can mitigate against information sharing (if you reward individuals for outperforming peers, they won’t share what they know with peers) (*)
  • The people with the most valuable knowledge have the least time to share it (**)
  • Management does not generally appreciate its role in overcoming psychology and personal behaviors that limit collaboration (***)
  • Management does not appreciate the tremendous expense, revenue, profitability and competitiveness implications from a lack of collaboration (***)
  • Management does not know training, incentive, process, technology or other techniques to overcome limits to collaboration (***)
  • Earlier organization attempts with CIOs, CKOs, etc., have not been sustained or were the wrong model for internalizing these needs within the organization (***)
  • Organizational job titles still reinforce managerial status and reward over expertise (***)
  • Hiring often inadequately stresses communication and collaboration skills, and in-house training is not provided where these skills are still lacking (***).

Technology, Process and Training

These are possible reasons why collaboration fails due to technology, process or training. They represent about one-eighth (12%) of the reasons proffered by Pollard, but recall that his original premise concerned human or psychological reasons, so it is not surprising this category is less represented:

  • People know more than they can tell (some experience you just have to show) & tell more than they can write down (composing takes a lot of time) (Snowden’s rule) (*)
  • People feel overwhelmed with content volume and complex tools (info overload, and poverty of imagination) (* split)
  • People will find ways to work around imposed tools, processes and other resources that they don’t like or want to use (and then deny it if they’re called to account for it) (**)
  • Employees lack the appreciation for the importance of collaboration to the success of their employer and their job (***)
  • Most means for “recording” the raw data and information for collaboration have too much “friction” (***)
  • There needs to be clear divisions between “capturing” knowledge and information and “packaging” it for internal or external consumption (***)
  • Single-source publication techniques suck (***)
  • Testing, screening, vetting and adopting new technology or process advances is generally lacking (***).

Cost, Rewards and Incentives

These are possible reasons why collaboration fails due to the cost and rewards structure, again about one-eighth (12%) of the reasons proffered by Pollard. Again, since his original premise concerned human or psychological reasons, it is not surprising this category is less represented:

  • The true cost of acquiring information (time wasted looking for it) and the cost of not knowing (Katrina, 9/11, Poultry Flu etc.) are both greatly underestimated in most organizations (*)
  • Rewards for sharing knowledge don’t work for long (*)
  • People value information they paid for more highly than that they get free from their own people (thus the existence of the consulting industry) (from James Governor) (**)
  • Reduced-cost document solutions have not been found (***)
  • Performance pay is not linked to collaboration goals (***).

Insights and Quibbles

There are some 25 reasons provided by Dave and his blog respondents, actually closer to 40 when my own are added, that represent a pretty complete compendium of “why collaboration fails.” Though I could pick out individual ones to praise or criticize, that would miss the point.

The objective is neither to collect the largest number of such factors nor to worry terribly about how they are organized. But there are some interesting insights.

Clearly, human behavior and psychology provide the baseline for looking at these questions. Management’s role is to provide organizational structure, incentives, training, pay and recognition to reward the collaborative behavior it desires and needs. Actually, management’s challenge is even greater than that, since in most cases upper-level managers don’t yet have a clue as to the importance of the underlying information or of collaboration around it.

As in years past, leadership for these questions needs to come from the top. The disappointments of the earlier CIO and CKO positions need to be looked at closely and given attention. The idea behind these positions was not wrong; what was wrong was the execution and leadership commitment.

Organizations of all types and natures have figured out how to train and incentivize their employees for difficult duties ranging from war to first response to discretion. Putting in place reward and training programs to encourage collaboration — despite piss-poor performance today — should not be so difficult in this light.

I think Dave brings many valuable insights into such areas as people preferring to reinvent the wheel because they like creative design, or a collaboration repository being at risk without some sense of ownership, or people being afraid to look stupid, or some people communicating better orally than in written form, etc. These are, in fact, truisms of human diversity and skill differences. I believe firmly that if organizations purposefully seek to understand these factors, they can still design reward, training and recognition regimens to shape the behavior they desire.

The real problem in the question of collaboration within the enterprise begins at the top. If the organization is not aware of these issues and not geared to address human nature with appropriate training and rewards, it will continue to see the poor performance around collaboration that has characterized this issue for decades.

NOTE: This posting is part of a series looking at why document assets are so poorly utilized within enterprises.  The magnitude of this problem was first documented in a BrightPlanet white paper by the author titled, Untapped Assets:  The $3 Trillion Value of U.S. Enterprise Documents.  An open question in that paper was why more than $800 billion per year in the U.S. alone is wasted and available for improvements, but enterprise expenditures to address this problem remain comparatively small and with flat growth in comparison to the rate of document production.  This series is investigating the various technology, people, and process reasons for the lack of attention to this problem.

[1] There have been some other interesting treatments of barriers to collaboration, including Carol Kinsey Goman’s Five reasons people don’t tell what they know and Jack Vinson’s Barriers to knowledge sharing.

[2] Pollard’s initial 16 reasons are shown with a single symbol (*); the next 8 additions with a double symbol (**). All remaining reasons added by me have three symbols (***).

Posted by AI3's author, Mike Bergman Posted on October 6, 2005 at 1:41 pm in Adaptive Information, Document Assets, Information Automation | Comments (5)
The URI link reference to this post is: http://www.mkbergman.com/135/why-are-800-billion-in-document-assets-wasted-annually-ii-barriers-to-collaboration/
The URI to trackback this post is: http://www.mkbergman.com/135/why-are-800-billion-in-document-assets-wasted-annually-ii-barriers-to-collaboration/trackback/
Posted: October 3, 2005

A recent column (Sept. 22) by David Wessel in the Wall Street Journal argues that “Better Information Isn’t Always Beneficial.” His major arguments can be summarized as follows:

  1. Having more information available is generally good
  2. Having some information available is clearly bad (to terrorists, privacy violations)
  3. However, other information is also bad because it may advance the private (profit) interest but not that of society, and
  4. Computers are worsening Argument #3 by reducing the cost of processing information.

Wessel claims that computers are removing limits to information processing that will force society to wrestle with practical issues of inequities that seemed only theoretical a generation ago. Though this article is certainly thought provoking, and therefore of value, it is wrong on epistemological, logical, and real-world grounds.

Epistemology

All of us at times confuse data or content with the concept of information when we describe current circumstances with terms such as “information overload” or “infoglut.” This confusion often extends to the economics literature in how it deals with the value of “information.” Most researchers or analysts in knowledge management acknowledge this hierarchy of value in the knowledge chain:

data (or content) » information » knowledge (actionable)

This progression also represents a narrowing flow or ‘staging’ of volume. The amount of total data always exceeds information; only a portion of available information is useful for knowledge or action.

Rather than provide “definitions” of these terms, which are not universally agreed, let’s use the example of searching on Google to illustrate these concepts:

  • Data — the literally billions of documents contained within Google’s search index
  • Information — subsets of this data appropriate to the need or topic at hand. While this sounds straightforward, depending on the user’s query and its precision, the “information” returned from a search may have a much lower or higher percentage of useful information value, as well as a great range in the total number of possible results
  • Knowledge — Google obviously does not provide knowledge per se, but, depending on how users review the information returned from more or less precise queries, and on how much of it is duplicated, knowledge may come about through inspection of and learning from this information.

The concept of staging and processing is highly useful here. For example, in the context of a purposeful document repository, initial searches against Google and other content aggregation sites — even with a query or topic basis — could act to populate that repository with data, which would then need to be mined further for useful information and then evaluated for supplying knowledge. Computers always act upon data, whether global (as with Google) or local (as with a local repository), and whether or not useful information is produced.
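As a rough sketch of that staging idea (the function and field names below are hypothetical, not any actual BrightPlanet pipeline), the flow from harvested data to filtered information might look like this, with knowledge arising only from subsequent human or further automated review:

# Hypothetical data -> information staging; knowledge requires later review.
raw_data = [  # Stage 1: everything harvested into the repository
    {"url": "a", "text": "Native XML databases for semi-structured data"},
    {"url": "b", "text": "Celebrity gossip of the day"},
    {"url": "a", "text": "Native XML databases for semi-structured data"},  # duplicate
]

def filter_information(documents, topic_terms):
    """Stage 2: keep topic-relevant documents and drop duplicates."""
    seen, keep = set(), []
    for doc in documents:
        if doc["url"] in seen:
            continue
        if any(term in doc["text"].lower() for term in topic_terms):
            seen.add(doc["url"])
            keep.append(doc)
    return keep

information = filter_information(raw_data, ["xml", "semi-structured"])
print(len(raw_data), "documents ->", len(information), "useful")  # 3 documents -> 1 useful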

Wessel, and indeed most economists, commingle all three terms in their arguments and logic. When these key distinctions are missed, fuzzy thinking can result.

A Philosophical or Political Polemic?

First, I will not take issue with Wessel’s first two arguments above. Rather, I’d like to look at the question of Argument #3 that some information is “bad” because it delivers private vs. societal value. His two economist references in the piece are to Arrow and Hirshleifer. As Wessel cites Hirshleifer:

“The contrast between the private profitability and the social uselessness of foreknowledge may seem surprising,” the late economist Jack Hirshleifer wrote in 1971. But there are instances, he argued, where “the community as a whole obtains no benefit … from either the acquisition or the dissemination (by resale or otherwise) of private foreknowledge.”

Yet Hirshleifer had a very specific meaning for “private foreknowledge,” likely not in keeping with Wessel’s arguments. The Hirshleifer[1] reference deals entirely with speculative investments and the “awareness” or not (knowledge; perfect information) of differing economic players. According to the academic reviewer Morrison[2]:

In Hirshleifer’s terms, ‘private foreknowledge’ is information used to identify pricing errors after resource allocation is fixed. Because it results in a pure wealth transfer but is costly to produce, it reduces social surplus. . . . As opposed to private foreknowledge, ‘discovery information’ is produced prior to the time resource allocation is fixed, and because it positively affects resource allocation it generally increases social surplus. But even discovery information can be overproduced because optimal expenditures on discovery information will inevitably be subject to pricing errors that can be exploited by those who gather superior information. In cases of both fixed and variable resource allocation, then, excess search has the potential to occur, and private parties will adopt institutional arrangements to avoid the associated losses.

Hmmm. What? Is this actually in keeping with the Wessel arguments?

Wessel poses a number of examples where he maintains the disconnect between private gain and societal benefit occurs. The examples he cites are:

  • Assessing judges as to how they might rule on patent infringement cases
  • Screening software for use in jury selections
  • Demographic and voting information for gerrymandering U.S. congressional districts
  • Weather insurance for crop production.

These examples are what Wessel calls “the sort of information that Nobel laureate Kenneth Arrow labeled ‘socially useless but privately valuable.’ It doesn’t help the economy produce more goods or services. It creates nothing of beauty or pleasure. It simply helps someone get a bigger slice of the pie.”

According to Oldrich Kyn, an economics professor emeritus from Boston University, Joseph Stiglitz, another Nobel laureate, took exception to Arrow’s thesis regarding information in the areas of market socialism and neoclassical economics as shown by these Stiglitz quote excerpts:

The idea of market socialism has had a strong influence over economists: it seemed to hold open the possibility that one could attain the virtues of the market system–economic efficiency (Pareto optimality)–without the seeming vices that were seen to arise from private property.

The fundamental problem with [the Arrow–Debreu model] is that it fails to take into account . . .  the absence of perfect information–and the costs of information–as well as the absence of certain key risk markets . . .

The view of economics encapsulated in the Arrow–Debreu framework . . . is what I call ‘engineering economics’ . . .  economics consisted of solving maximization problems . . . The central point is that in that model there is not a flow of new information into the economy, so that the question of the efficiency with which the new information is processed–or the incentives that individuals have for acquiring information–is never assessed. . .  the fundamental theorems of welfare economics have absolutely nothing to say about . . .  whether the expenditures on information acquisition and dissemination– is, in any sense, efficient.

Stiglitz in his own online autobiography states: “The standard competitive market equilibrium model had failed to recognize the complexity of the information problem facing the economy – just as the socialists had. Their view of decentralization was similarly oversimplified.” Grossman and Stiglitz[3] more broadly observe “that perfectly informative financial markets are impossible and . . .  the informativeness of prices is inversely related to the cost of information.”

I am no economist, but reading the original papers suggests to me a narrower and more theoretical focus than what is claimed in Wessel’s arguments. Indeed, the role of “information” is both central to and nuanced within current economic theory, the understanding of which has progressed tremendously in the thirty years since Wessel’s original citations. By framing the question of private (profit) versus societal good, Wessel invokes an argument based on political philosophy and one seemingly “endorsed” by Arrow as a Nobel laureate. Yet as Eli Rabett commented on the Knowledge Crumbs Web site, “[the Wessel thesis] is a communitarian argument which has sent Ayn Rand, Alan Greenspan, Newt Gingrich and Grover Norquist to spinning in their graves.”

Logical Fallacies

Even if these philosophical differences could be reconciled, there are other logical fallacies in the Wessel piece.

In the case of assessing the performance of patent judges by crunching information that can now be sold cost-effectively to all participants, Wessel asks, “But does it increase the chances that the judge will come to a just decision?” The logical fallacies here are manifest:

  • Is the only societal benefit having the judge come to a just decision, or is there also potential benefit in society learning about judicial prejudices, singly or collectively, or in setting new standards for evaluating or confirming judicial candidates?
  • No new information has been created by the computer. Rich litigants could have earlier gone through expensive evaluations. Doesn’t cost-effective information democratize this information?
  • Is not broad information availability an example of desired transparency as cited by Knowledge Crumbs?

Wessel raises another case of farmers now possibly being able to buy accurate weather forecasts. But he posits a resulting case where the total amount of food available is unchanged and insurance would no longer be necessary. Yet, as Mark Bahner points out, this has the logical fallacies of:

  • The amount of food available would NOT be “unchanged” if farmers knew for certain what the weather was going to be. Social and private benefits would also accrue from, for example, applying fertilizers when needed without wasteful runoffs
  • Weather knowledge would firstly never be certain and other uncertainties (pests, global factors, etc.) would also exist. Farmers understand uncertainty and would continue to hedge through futures or other forms of insurance or risk management.

The real logical fallacies relate to the assumption of perfect information and the complete reduction of uncertainty. No matter how much data or how fast the computers, these factors will never be fully resolved.

Practical Role of the Computer

Wessel concludes that by reducing the cost of information so much, computers intensify the information problem of private gain v. societal benefit. He uses Arrow again to pose the strawman that, “Thirty years ago, Mr. Arrow said the fundamental problem for companies trying to get and use information for profit was ‘the limitation on the ability of any individual to process information.’”

But as Knowledge Crumbs notes, computers may be able to process more data than an individual, but they are still limited and always will be. Moreover there will remain the Knowledge Problem and the SNAFU principle to make sure that humans are not augmented perfectly by their computers. Knowledge Crumbs concludes:

The issue with knowledge isn’t that there is too much, it is that we lack methods to process it in a timely fashion, and processing introduces defects that sometimes are harmful. When data is reduced or summarized something is lost as well as gained.

The speed of crunching data or computer processing power is not the issue. Use and misuse of information will continue to exist, as it has since mythologies were passed by verbal allegory by firelight.

Importance to Document Assets

So, why does such a flawed polemic get published in a reputable source like the Wall Street Journal? There are real concerns and anxieties underlying this Wessel piece and it is always useful to stimulate thought and dialog. But, like all “information” that the piece itself worries over, it must be subjected to scrutiny, testing and acceptance before it can become the basis for action. The failure of the Wessel piece to pass these thresholds itself negates its own central arguments.

Better that our pundits should focus on things that can be improved, such as why there is so much duplication, misuse and overlooking of available information. These problems cost the economy plenty, totally swamping any of Wessel’s putative private benefits even were his arguments correct.

Let’s focus on the real benefits available today through computers and information to improve society’s welfare. Setting up false specters of computer processing serving private greed only takes our eye off the ball.

NOTE: This posting is part of a series looking at why document assets are so poorly utilized within enterprises.  The magnitude of this problem was first documented in a BrightPlanet white paper by the author titled, Untapped Assets:  The $3 Trillion Value of U.S. Enterprise Documents.  An open question in that paper was why nearly $800 billion per year in the U.S. alone is wasted and available for improvements, but enterprise expenditures to address this problem remain comparatively small and with flat growth in comparison to the rate of document production.  This series is investigating the various technology, people, and process reasons for the lack of attention to this problem.

[1] J. Hirshleifer, “The Private and Social Value of Information and the Reward to Inventive Activity,” American Economic Review, Vol. 61, pp. 561-574, 1971.

[2] A. D. Morrison, “Competition and Information Production in Market Maker Models,” forthcoming in the Journal of Business Finance and Accounting, Blackwell Publishing Ltd., Malden, MA. See the 20 pp. online version, http://users.ox.ac.uk/~bras0541/12_jbfa5709.pdf#search=’Hirshleifer%20private%20foreknowledge

[3] S.J. Grossman and J.E. Stiglitz, “On the Impossibility of Informationally Efficient Markets,” American Economic Review, Vol. 70, No. 3, pp. 393-403, June 1980.

Posted by AI3's author, Mike Bergman Posted on October 3, 2005 at 9:14 am in Adaptive Information, Document Assets, Information Automation | Comments (4)
The URI link reference to this post is: http://www.mkbergman.com/130/why-are-800-billion-in-document-assets-wasted-annually-i-is-private-information-bad/
The URI to trackback this post is: http://www.mkbergman.com/130/why-are-800-billion-in-document-assets-wasted-annually-i-is-private-information-bad/trackback/