Posted: October 3, 2005

Why Are $800 Billion in Document Assets Wasted Annually? I. Is ‘Private’ Information Bad?

A recent column (Sept. 22) by David Wessel in the Wall Street Journal argues that “Better Information Isn’t Always Beneficial.” His major arguments can be summarized as follows:

  1. Having more information available is generally good
  2. Having some information available is clearly bad (for example, when it aids terrorists or enables privacy violations)
  3. However, other information is also bad because it may advance the private (profit) interest but not that of society, and
  4. Computers are worsening Argument #3 by reducing the cost of processing information.

Wessel claims that computers are removing limits to information processing, and that this will force society to wrestle with practical issues of inequity that seemed only theoretical a generation ago. Though the article is certainly thought-provoking, and therefore of value, it is wrong on epistemological, logical, and real-world grounds.


All of us at times confuse data or content with the concept of information when we describe current circumstances with terms such as “information overload” or “infoglut.” This confusion often extends to the economics literature in how it deals with the value of “information.” Most researchers or analysts in knowledge management acknowledge this hierarchy of value in the knowledge chain:

data (or content) » information » knowledge (actionable)

This progression also represents a narrowing flow or ‘staging’ of volume. The total amount of data always exceeds the amount of information, and only a portion of the available information is useful for knowledge or action.

Rather than provide “definitions” of these terms, which are not universally agreed upon, let’s use the example of searching on Google to illustrate these concepts:

  • Data: the literally billions of documents contained within Google’s search index
  • Information: subsets of this data appropriate to the need or topic at hand. While this sounds straightforward, depending on how precisely the user frames the query, the “information” returned from a search may contain a much lower or higher percentage of useful results, and the total number of results can vary greatly
  • Knowledge: Google obviously does not provide knowledge per se. Knowledge may come about when the user inspects and learns from the information returned, and how readily it does so depends on how precise the search queries were and how much of the returned information is duplicated.

The concept of staging and processing is highly useful here. For example, in the context of a purposeful document repository, initial searches on Google and other content aggregation sites (even when driven by a query or topic) could populate that repository with data, which would then need to be mined further for useful information and then evaluated for the knowledge it can supply. Computers always act upon data, whether global as in the Google case or local as in the repository case, and whether or not useful information is produced.
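The staging idea can be made concrete with a small sketch. The Python below is purely illustrative and is not drawn from any actual search engine or repository product: the sample documents, the `is_relevant` keyword test, and the `is_actionable` check are all hypothetical stand-ins chosen for brevity. It shows only that each stage passes along a subset of what it receives.

```python
# Illustrative sketch only: the documents and both filter rules are hypothetical.

def is_relevant(doc: str, topic: str) -> bool:
    """Crude stand-in for 'information': the document mentions the topic."""
    return topic.lower() in doc.lower()


def is_actionable(doc: str) -> bool:
    """Crude stand-in for 'knowledge': the document recommends an action."""
    return "should" in doc.lower()


def stage(documents: list[str], topic: str) -> dict[str, list[str]]:
    """Run the data -> information -> knowledge staging over a repository."""
    data = list(documents)  # everything collected, duplicates included
    relevant = [d for d in data if is_relevant(d, topic)]
    information = list(dict.fromkeys(relevant))  # drop exact duplicates, keep order
    knowledge = [d for d in information if is_actionable(d)]
    return {"data": data, "information": information, "knowledge": knowledge}


repository = [
    "Patent filings rose last year.",
    "Weather forecasts cover ten days.",
    "Farmers should apply fertilizer before the weather turns wet.",
    "Weather forecasts cover ten days.",  # duplicate of an earlier document
    "Weather insurance premiums vary by region.",
]

staged = stage(repository, "weather")
for name, docs in staged.items():
    print(name, len(docs))
```

With this toy repository the counts shrink at each stage, which is the narrowing flow described above. Real relevance and usefulness judgments are of course far harder than a keyword match.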

Wessel, and indeed most economists, commingle all three terms in their arguments and logic. Missing these key distinctions can lead to fuzzy thinking.

A Philosophical or Political Polemic?

First, I will not take issue with Wessel’s first two arguments above. Rather, I’d like to look at the question of Argument #3 that some information is “bad” because it delivers private vs. societal value. His two economist references in the piece are to Arrow and Hirshleifer. As Wessel cites Hirshleifer:

“The contrast between the private profitability and the social uselessness of foreknowledge may seem surprising,” the late economist Jack Hirshleifer wrote in 1971. But there are instances, he argued, where “the community as a whole obtains no benefit … from either the acquisition or the dissemination (by resale or otherwise) of private foreknowledge.”

Yet Hirshleifer had a very specific meaning of “private foreknowledge,” likely not in keeping with the Wessel arguments. The Hirshleifer[1] reference deals entirely with speculative investments and the “awareness” or not (knowledge; perfect information) of differing economic players. According to the academic reviewer Morrison[2]:

In Hirshleifer’s terms, ‘private foreknowledge’ is information used to identify pricing errors after resource allocation is fixed. Because it results in a pure wealth transfer but is costly to produce, it reduces social surplus. . . . As opposed to private foreknowledge, ‘discovery information’ is produced prior to the time resource allocation is fixed, and because it positively affects resource allocation it generally increases social surplus. But even discovery information can be overproduced because optimal expenditures on discovery information will inevitably be subject to pricing errors that can be exploited by those who gather superior information. In cases of both fixed and variable resource allocation, then, excess search has the potential to occur, and private parties will adopt institutional arrangements to avoid the associated losses.

Hmmm. What? Is this actually in keeping with the Wessel arguments?

Wessel poses a number of examples where he maintains the disconnect between private gain and societal benefit occurs. The examples he cites are:

  • Assessing judges as to how they might rule on patent infringement cases
  • Screening software for use in jury selections
  • Demographic and voting information for gerrymandering U.S. congressional districts
  • Weather insurance for crop production.

These examples are what Wessel calls “the sort of information that Nobel laureate Kenneth Arrow labeled ‘socially useless but privately valuable.’ It doesn’t help the economy produce more goods or services. It creates nothing of beauty or pleasure. It simply helps someone get a bigger slice of the pie.”

According to Oldrich Kyn, an economics professor emeritus from Boston University, Joseph Stiglitz, another Nobel laureate, took exception to Arrow’s thesis regarding information in the areas of market socialism and neoclassical economics as shown by these Stiglitz quote excerpts:

The idea of market socialism has had a strong influence over economists: it seemed to hold open the possibility that one could attain the virtues of the market system–economic efficiency (Pareto optimality)–without the seeming vices that were seen to arise from private property.

The fundamental problem with [the Arrow–Debreu model] is that it fails to take into account . . . the absence of perfect information–and the costs of information–as well as the absence of certain key risk markets . . .

The view of economics encapsulated in the Arrow–Debreu framework . . . is what I call ‘engineering economics’ . . .  economics consisted of solving maximization problems . . . The central point is that in that model there is not a flow of new information into the economy, so that the question of the efficiency with which the new information is processed–or the incentives that individuals have for acquiring information–is never assessed. . .  the fundamental theorems of welfare economics have absolutely nothing to say about . . .  whether the expenditures on information acquisition and dissemination– is, in any sense, efficient.

Stiglitz in his own online autobiography states: “The standard competitive market equilibrium model had failed to recognize the complexity of the information problem facing the economy – just as the socialists had. Their view of decentralization was similarly oversimplified.” Grossman and Stiglitz[3] more broadly observe “that perfectly informative financial markets are impossible and . . .  the informativeness of prices is inversely related to the cost of information.”

I am no economist, but reading the original papers suggests to me a narrower and more theoretical focus than what is claimed in Wessel’s arguments. Indeed, the role of “information” is both central to and nuanced within current economic theory, the understanding of which has progressed tremendously in the thirty years since the works Wessel cites were written. By framing the question as one of private (profit) versus societal good, Wessel invokes an argument based on political philosophy and one seemingly “endorsed” by Arrow as a Nobel laureate. Yet as Eli Rabett commented on the Knowledge Crumbs Web site, “[the Wessel thesis] is a communitarian argument which has sent Ayn Rand, Alan Greenspan, Newt Gingrich and Grover Norquist to spinning in their graves.”

Logical Fallacies

Even if these philosophical differences could be reconciled, there are other logical fallacies in the Wessel piece.

In the case of assessing the performance of patent judges by crunching information that can now be sold cost-effectively to all participants, Wessel asks, “But does it increase the chances that the judge will come to a just decision?” The logical fallacies here are manifest:

  • Is the only societal benefit that the judge comes to a just decision? Might society not also benefit from learning about judicial prejudices, singly or collectively, or from setting new standards for evaluating or confirming judicial candidates?
  • No new information has been created by the computer. Rich litigants could already have commissioned expensive evaluations. Doesn’t making the same analysis cost-effective democratize this information?
  • Is not broad information availability an example of desired transparency as cited by Knowledge Crumbs?

Wessel raises another case of farmers now possibly being able to buy accurate weather forecasts. But he posits a resulting case where the total amount of food available is unchanged and insurance would no longer be necessary. Yet, as Mark Bahner points out, this argument suffers from these logical fallacies:

  • The amount of food available would NOT be “unchanged” if farmers knew for certain what the weather was going to be. Social and private benefits would also accrue from, for example, applying fertilizers when needed without wasteful runoffs.
  • Weather knowledge would never be certain in the first place, and other uncertainties (pests, global factors, etc.) would also exist. Farmers understand uncertainty and would continue to hedge through futures or other forms of insurance or risk management.

The real logical fallacies relate to the assumption of perfect information and complete reduction of uncertainty. No matter how much data there is, or how fast computers become, these uncertainties will never be fully resolved.

Practical Role of the Computer

Wessel concludes that by reducing the cost of information so much, computers intensify the information problem of private gain versus societal benefit. He uses Arrow again to pose the strawman that, “Thirty years ago, Mr. Arrow said the fundamental problem for companies trying to get and use information for profit was ‘the limitation on the ability of any individual to process information.'”

But as Knowledge Crumbs notes, computers may be able to process more data than an individual, but they are still limited and always will be. Moreover, there will remain the Knowledge Problem and the SNAFU principle to make sure that humans are not augmented perfectly by their computers. Knowledge Crumbs concludes:

The issue with knowledge isn’t that there is too much, it is that we lack methods to process it in a timely fashion, and processing introduces defects that sometimes are harmful. When data is reduced or summarized something is lost as well as gained.

The speed of crunching data or computer processing power is not the issue. Use and misuse of information will continue to exist, as it has since mythologies were passed down as verbal allegory by firelight.

Importance to Document Assets

So, why does such a flawed polemic get published in a reputable source like the Wall Street Journal? There are real concerns and anxieties underlying this Wessel piece and it is always useful to stimulate thought and dialog. But, like all “information” that the piece itself worries over, it must be subjected to scrutiny, testing and acceptance before it can become the basis for action. The failure of the Wessel piece to pass these thresholds itself negates its own central arguments.

It would be better for our pundits to focus on things that can be improved, such as the duplication, misuse, and overlooking of available information. These cost the economy plenty, and would totally swamp Wessel’s putative private benefits even if his arguments about them were correct.

Let’s focus on the real benefits available today through computers and information to improve society’s welfare. Setting up false specters of computer processing serving private greed only takes our eye off the ball.

NOTE: This posting is part of a series looking at why document assets are so poorly utilized within enterprises. The magnitude of this problem was first documented in a BrightPlanet white paper by the author titled Untapped Assets: The $3 Trillion Value of U.S. Enterprise Documents. An open question in that paper was why nearly $800 billion per year in the U.S. alone is wasted and available for improvements, yet enterprise expenditures to address this problem remain comparatively small, with flat growth in comparison to the rate of document production. This series is investigating the various technology, people, and process reasons for the lack of attention to this problem.

[1] J. Hirshleifer, “The Private and Social Value of Information and the Reward to Inventive Activity,” American Economic Review, Vol. 61, pp. 561-574, 1971.

[2] A. D. Morrison, “Competition and Information Production in Market Maker Models,” forthcoming in the Journal of Business Finance and Accounting, Blackwell Publishing Ltd., Malden, MA. See the 20 pp. online version.

[3] S.J. Grossman and J.E. Stiglitz, “On the Impossibility of Informationally Efficient Markets,” American Economic Review, Vol. 70, No. 3, pp. 393-403, June 1980.
