<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>AI3:::Adaptive Information &#187; Document Assets</title>
	<atom:link href="http://www.mkbergman.com/category/document-assets/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.mkbergman.com</link>
	<description>Mike Bergman on the semantic Web and structured Web</description>
	<lastBuildDate>Tue, 24 Jan 2012 15:52:16 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.2.1</generator>
		<item>
		<title>Brown Bag Lunch: Untapped Assets: The $3 Trillion Value of US Documents</title>
		<link>http://www.mkbergman.com/871/brown-bag-lunch-untapped-assets-the-3-trillion-value-of-us-documents/</link>
		<comments>http://www.mkbergman.com/871/brown-bag-lunch-untapped-assets-the-3-trillion-value-of-us-documents/#comments</comments>
		<pubDate>Fri, 12 Mar 2010 18:43:17 +0000</pubDate>
		<dc:creator>Mike Bergman</dc:creator>
				<category><![CDATA[Adaptive Information]]></category>
		<category><![CDATA[Brown Bag Lunch]]></category>
		<category><![CDATA[Document Assets]]></category>
		<category><![CDATA[Information Automation]]></category>
		<category><![CDATA[documents]]></category>
		<category><![CDATA[economy]]></category>

		<guid isPermaLink="false">http://www.mkbergman.com/?p=871</guid>
		<description><![CDATA[	
	<span class="Z3988" title="ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Adc&amp;rfr_id=info%3Asid%2Focoins.info%3Agenerator&amp;rft.title=Brown Bag Lunch: Untapped Assets: The $3 Trillion Value of US Documents&amp;rft.aulast=Bergman&amp;rft.aufirst=Mike&amp;rft.subject=Adaptive Information&amp;rft.subject=Brown Bag Lunch&amp;rft.subject=Document Assets&amp;rft.subject=Information Automation&amp;rft.source=AI3:::Adaptive Information&amp;rft.date=2010-03-12&amp;rft.type=blogPost&amp;rft.format=text&amp;rft.identifier=http://www.mkbergman.com/871/brown-bag-lunch-untapped-assets-the-3-trillion-value-of-us-documents/&amp;rft.language=English"></span>
Today, in the advanced knowledge economy of the United States, the information contained within documents represents about a third of total gross domestic product, or an amount of about $3.3 trillion annually. Yet our understanding of the value of documents and the means to manage them is abysmal. These failures impact enterprises of all sizes [...]]]></description>
			<content:encoded><![CDATA[	
	<span class="Z3988" title="ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Adc&amp;rfr_id=info%3Asid%2Focoins.info%3Agenerator&amp;rft.title=Brown Bag Lunch: Untapped Assets: The $3 Trillion Value of US Documents&amp;rft.aulast=Bergman&amp;rft.aufirst=Mike&amp;rft.subject=Adaptive Information&amp;rft.subject=Brown Bag Lunch&amp;rft.subject=Document Assets&amp;rft.subject=Information Automation&amp;rft.source=AI3:::Adaptive Information&amp;rft.date=2010-03-12&amp;rft.type=blogPost&amp;rft.format=text&amp;rft.identifier=http://www.mkbergman.com/871/brown-bag-lunch-untapped-assets-the-3-trillion-value-of-us-documents/&amp;rft.language=English"></span>
<p><img style="border: 0px solid; float: left; margin-right: 10px;" title="Friday Brown Bag Lunch" src="../wp-content/themes/ai3/images/lunchbag_225.jpg" alt="Friday Brown Bag Lunch" width="158" height="179" /></p>
<p>Today, in the advanced knowledge economy of the United States, the information contained within documents represents about a third of total gross domestic product, or an amount of about <span style="text-decoration: underline;">$3.3 trillion</span> annually.</p>
<p>Yet our understanding of the value of documents and the means to manage them is abysmal. These failures impact enterprises of all sizes from the standpoints of revenues, profitability and reputation. Continued national productivity growth — and thus the wealth of all citizens — depends critically on understanding and managing these document values.</p>
<p>As this white paper describes, the lack of a compelling and demonstrable common understanding of the importance of documents is in itself a major factor limiting available productivity benefits. There is an old Chinese saying that roughly translated is “what cannot be measured, cannot be improved.” Many corporate officers may believe this to be the case for document creation and productivity, but, as this paper shows, in fact many of these document issues <span style="text-decoration: underline;">can be measured</span>.</p>
<div class="boxBrownDotted" style="min-height: 80px; max-width: 460px;"><p><img style="width: 64px; height: 73px; float: left; margin-right: 10px;" title="Friday Brown Bag Lunch" src="../wp-content/themes/ai3/images/lunchbag_64.png" alt="Friday Brown Bag Lunch" /> This <a href="../834/announcing-the-sporadic-friday-brown-bag-lunch">Friday brown bag leftover</a> was first placed into the <span style="font-weight: bold; color: #993300;">AI3</span> <a href="../chronological-listing/">refrigerator</a> on <a href="http://www.mkbergman.com/82/untapped-assets-the-3-trillion-value-of-us-enterprise-documents/">July 20, 2005</a>. No changes have been made to the original posting.</p>
<p>I&#8217;d like to thank David Siegel for recently highlighting this post from 5 years ago with nice kudos on his <a href="http://thepowerofpull.com/pull/mike-bergman-semantic-business-intelligence">PowerOfPull blog</a>. That reference is what caused me to dust off the cobwebs from this older piece.</p></div>
<p>To wit, some 25% of the trillions of dollars spent annually on document creation lend themselves to actionable improvements:</p>
<table style="text-align: left; margin-left: auto; margin-right: auto; width: 532px;" border="0" cellspacing="0" cellpadding="0">
<tbody>
<tr>
<td style="background-color: #cccccc; width: 363px;"><strong>U.S. FIRMS</strong></td>
<td style="background-color: #cccccc; width: 86px;">
<p align="center"><strong>$ Million</strong></p>
</td>
<td style="background-color: #cccccc; width: 82px;"><strong>% of Benefits</strong></td>
</tr>
<tr>
<td style="width: 363px;" valign="top">Cost to Create Documents</td>
<td style="width: 86px;">
<p align="right">$3,261,091</p>
</td>
<td style="width: 82px;"></td>
</tr>
<tr>
<td style="width: 363px;"><strong> Benefits</strong></td>
<td style="width: 86px;"></td>
<td style="width: 82px;"></td>
</tr>
<tr>
<td style="width: 363px;" valign="top">Benefits of Finding Missed or Overlooked Documents</td>
<td style="width: 86px;">
<p align="right">$489,164</p>
</td>
<td style="width: 82px;">
<p align="right">63%</p>
</td>
</tr>
<tr>
<td style="width: 363px;">Benefits of Improved Document Access</td>
<td style="width: 86px;">
<p align="right">$81,360</p>
</td>
<td style="width: 82px;">
<p align="right">10%</p>
</td>
</tr>
<tr>
<td style="width: 363px;" valign="bottom">Benefits of Re-finding Web Documents</td>
<td style="width: 86px;" valign="bottom">
<p align="right">$32,967</p>
</td>
<td style="width: 82px;" valign="bottom">
<p align="right">4%</p>
</td>
</tr>
<tr>
<td style="width: 363px;" valign="bottom"></td>
<td style="width: 86px;" valign="bottom"></td>
<td style="width: 82px;" valign="bottom"></td>
</tr>
<tr>
<td style="width: 363px;" valign="bottom">Benefits of Proposal Preparation and Wins</td>
<td style="width: 86px;" valign="bottom">
<p align="right">$6,798</p>
</td>
<td style="width: 82px;" valign="bottom">
<p align="right">1%</p>
</td>
</tr>
<tr>
<td style="width: 363px;" valign="bottom">Benefits of Paperwork Requirements and Compliance</td>
<td style="width: 86px;" valign="bottom">
<p align="right">$119,868</p>
</td>
<td style="width: 82px;" valign="bottom">
<p align="right">15%</p>
</td>
</tr>
<tr>
<td style="width: 363px;" valign="bottom">Benefits of Reducing Unauthorized Disclosures</td>
<td style="width: 86px;" valign="bottom">
<p align="right">$51,187</p>
</td>
<td style="width: 82px;" valign="bottom">
<p align="right">7%</p>
</td>
</tr>
<tr>
<td style="width: 363px;" valign="bottom"></td>
<td style="width: 86px;" valign="bottom"></td>
<td style="width: 82px;" valign="bottom"></td>
</tr>
<tr>
<td style="width: 363px;" valign="bottom"><strong>Total Annual Benefits</strong></td>
<td style="width: 86px;" valign="bottom">
<p align="right">$781,314</p>
</td>
<td style="width: 82px;" valign="bottom">
<p align="right">100%</p>
</td>
</tr>
<tr>
<td style="width: 363px;" valign="bottom"></td>
<td style="width: 86px;" valign="bottom"></td>
<td style="width: 82px;" valign="bottom"></td>
</tr>
<tr>
<td style="width: 363px;" valign="bottom"><strong>PER LARGE FIRM</strong></td>
<td style="width: 86px;" valign="bottom">
<p align="center"><strong>$ Million</strong></p>
</td>
<td style="width: 82px;" valign="bottom">
<p align="center"><strong> </strong></p>
</td>
</tr>
<tr>
<td style="width: 363px;" valign="bottom">Cost to Create Documents</td>
<td style="width: 86px;" valign="bottom">
<p align="right">$955.6</p>
</td>
<td style="width: 82px;" valign="bottom"></td>
</tr>
<tr>
<td style="width: 363px;" valign="bottom"></td>
<td style="width: 86px;" valign="bottom"></td>
<td style="width: 82px;" valign="bottom"></td>
</tr>
<tr>
<td style="width: 363px;" valign="bottom">Benefits of Finding Missed or Overlooked Documents</td>
<td style="width: 86px;" valign="bottom">
<p align="right">$143.3</p>
</td>
<td style="width: 82px;" valign="bottom"></td>
</tr>
<tr>
<td style="width: 363px;" valign="bottom">Benefits of Improved Document Access</td>
<td style="width: 86px;" valign="bottom">
<p align="right">$23.8</p>
</td>
<td style="width: 82px;" valign="bottom"></td>
</tr>
<tr>
<td style="width: 363px;" valign="bottom">Benefits of Re-finding Web Documents</td>
<td style="width: 86px;" valign="bottom">
<p align="right">$9.7</p>
</td>
<td style="width: 82px;" valign="bottom"></td>
</tr>
<tr>
<td style="width: 363px;" valign="bottom"></td>
<td style="width: 86px;" valign="bottom"></td>
<td style="width: 82px;" valign="bottom"></td>
</tr>
<tr>
<td style="width: 363px;" valign="bottom">Benefits of Proposal Preparation and Wins</td>
<td style="width: 86px;" valign="bottom">
<p align="right">$2.0</p>
</td>
<td style="width: 82px;" valign="bottom"></td>
</tr>
<tr>
<td style="width: 363px;" valign="bottom">Benefits of Paperwork Requirements and Compliance</td>
<td style="width: 86px;" valign="bottom">
<p align="right">$35.1</p>
</td>
<td style="width: 82px;" valign="bottom"></td>
</tr>
<tr>
<td style="width: 363px;" valign="bottom">Benefits of Reducing Unauthorized Disclosures</td>
<td style="width: 86px;" valign="bottom">
<p align="right">$15.0</p>
</td>
<td style="width: 82px;" valign="bottom"></td>
</tr>
<tr>
<td style="width: 363px;" valign="bottom"></td>
<td style="width: 86px;" valign="bottom"></td>
<td style="width: 82px;" valign="bottom"></td>
</tr>
<tr>
<td style="width: 363px;" valign="bottom"><strong>Total Annual Benefits</strong></td>
<td style="width: 86px;" valign="bottom">
<p align="right">$229.0</p>
</td>
<td style="width: 82px;" valign="bottom"></td>
</tr>
</tbody>
</table>
<p style="text-align: center;">Table 1. Mid-range Estimates for the Annual Value of Documents, U.S. Firms, 2002<a name="_ednref1"></a>[1]</p>
<p>The total benefit to the U.S. economy from improved document access and use is on the order of $800 billion annually, or about 8% of GDP. For the 1,000 largest U.S. firms, benefits from these improvements can approach $250 million annually per firm. About three-quarters of these benefits arise from <strong><em><span style="text-decoration: underline;">not</span></em></strong> re-creating the intellectual capital already invested in prior document creation. About one-quarter of the benefits are due to reduced regulatory non-compliance or paperwork, or better competitiveness in obtaining solicited grants and contracts.</p>
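<p>As a rough cross-check, the percentage column of Table 1 and the three-quarters and one-quarter split can be re-derived from the dollar figures. The following is only a minimal sketch in Python, not part of the original analysis: the names <code>BENEFITS</code>, <code>REUSE_ROWS</code> and <code>percent_shares</code> are hypothetical, and treating the first three benefit rows as the "not re-creating prior work" group is an assumption of the sketch.</p>
<pre>
```python
# Mid-range annual figures for U.S. firms, copied from Table 1 ($ million).
COST_TO_CREATE = 3_261_091

BENEFITS = {
    "Finding missed or overlooked documents": 489_164,
    "Improved document access": 81_360,
    "Re-finding Web documents": 32_967,
    "Proposal preparation and wins": 6_798,
    "Paperwork requirements and compliance": 119_868,
    "Reducing unauthorized disclosures": 51_187,
}

# Assumption of this sketch: the first three rows are the benefits that
# come from not re-creating work already captured in prior documents.
REUSE_ROWS = list(BENEFITS)[:3]


def percent_shares(values):
    """Return each row's share of the column sum, rounded to a whole percent."""
    total = sum(values.values())
    return {name: round(100 * amount / total) for name, amount in values.items()}


component_sum = sum(BENEFITS.values())
shares = percent_shares(BENEFITS)
reuse_share = sum(BENEFITS[name] for name in REUSE_ROWS) / component_sum
benefit_to_cost = component_sum / COST_TO_CREATE

for name, pct in shares.items():
    print(f"{pct:3d}%  {name}")
print(f"Sum of rows: ${component_sum:,} million")
print(f"Reuse-related share of benefits: {reuse_share:.0%}")
print(f"Benefits as a share of creation cost: {benefit_to_cost:.0%}")
```
</pre>
<p>The rounded shares reproduce the percentage column of Table 1. The six benefit rows add up to $781,344 million, which is $30 million more than the printed total of $781,314 million; the sketch does not attempt to say which figure is the intended one, and the difference does not change any of the rounded percentages.</p>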
<p>Indeed, even these figures likely severely underestimate the benefits to enterprises from better leverage of document assets. The best and most successful companies have always been able to take better advantage of their intellectual assets than their competitors. The competitive advantage from better document access and use alone may exceed the huge benefits in the table above.</p>
<p>Documents (that is, <em>unstructured</em> and <em>semi-structured</em> data) are now at the point where structured data was 15 years ago. At that time, companies realized that consolidating information from multiple numeric databases would be a key source of competitive advantage. That realization led to the development and growth of the data warehousing or business intelligence markets, now representing about $3.9 billion in annual software sales.</p>
<p>Search and enterprise content management software today represents only a fraction of that amount, perhaps on the order of $500 million annually. But given that intellectual content in documents represents three to four times the amount in numeric structured data, it is clear that document software capabilities are not being well utilized, reaching only a small fraction of their market potential.</p>
<p>The estimates provided in this white paper are drawn from numerous sources and are extremely fragmented, perhaps even inconsistent. One hope in preparing this document was to stimulate more research attention and data gathering around the critical issues of document value to the enterprise and the economy at large.</p>
<p style="font-weight: bold;"><span style="font-size: x-small;"><a href="../index.php#_Toc106767203">EXECUTIVE SUMMARY</a></span></p>
<p style="font-weight: bold;"><span style="font-size: x-small;"><a href="../index.php#_Toc106767204">I. INTRODUCTION</a></span></p>
<p style="font-weight: bold; margin-left: 40px;"><span style="font-size: x-small;"><a href="../index.php#_Toc106767205">Documents: The Drivers of a Knowledge Economy</a></span></p>
<p style="font-weight: bold; margin-left: 40px;"><span style="font-size: x-small;"><a href="../index.php#_Toc106767206">Documents: The Linchpin of Corporate Intellectual Assets</a></span></p>
<p style="font-weight: bold; margin-left: 40px;"><span style="font-size: x-small;"><a href="../index.php#_Toc106767207">Documents: Unknown Value, Huge Implications</a></span></p>
<p style="font-weight: bold; margin-left: 40px;"><span style="font-size: x-small;"><a href="../index.php#_Toc106767208">Documents: The Next Generation of Data Warehousing?</a></span></p>
<p style="font-weight: bold; margin-left: 40px;"><span style="font-size: x-small;"><a href="../index.php#_Toc106767209">Connecting the Dots: A Pointillistic Approach</a></span></p>
<p style="font-weight: bold;"><span style="font-size: x-small;"><a href="../index.php#_Toc106767210">II. INTERNAL DOCUMENTS</a></span></p>
<p style="font-weight: bold; margin-left: 40px;"><span style="font-size: x-small;"><a href="../index.php#_Toc106767211">Number of ‘Valuable’ Documents Produced per Firm</a></span></p>
<p style="font-weight: bold; margin-left: 40px;"><span style="font-size: x-small;"><a href="../index.php#_Toc106767212">Total Annual U.S. ‘Costs’ to Create Documents</a></span></p>
<p style="font-weight: bold; margin-left: 40px;"><span style="font-size: x-small;"><a href="../index.php#_Toc106767213">‘Cost’ of Creating a ‘Typical’ Document</a></span></p>
<p style="font-weight: bold; margin-left: 40px;"><span style="font-size: x-small;"><a href="../index.php#_Toc106767214">‘Cost’ of a Missed or Overlooked Document</a></span></p>
<p style="font-weight: bold; margin-left: 40px;"><span style="font-size: x-small;"><a href="../index.php#_Toc106767215">Other Document Total ‘Cost’ Factors and Summary</a></span></p>
<p style="font-weight: bold; margin-left: 40px;"><span style="font-size: x-small;"><a href="../index.php#_Toc106767216">Archival Lifetime of ‘Valuable’ Documents</a></span></p>
<p style="font-weight: bold;"><span style="font-size: x-small;"><a href="../index.php#_Toc106767217">III. WEB DOCUMENTS AND SEARCH</a></span></p>
<p style="font-weight: bold; margin-left: 40px;"><span style="font-size: x-small;"><a href="../index.php#_Toc106767218">Estimate of Time and Effort Devoted to Document Search</a></span></p>
<p style="font-weight: bold; margin-left: 40px;"><span style="font-size: x-small;"><a href="../index.php#_Toc106767219">Effect of Non-persistent Search Efforts</a></span></p>
<p style="font-weight: bold; margin-left: 40px;"><span style="font-size: x-small;"><a href="../index.php#_Toc106767220">‘Cost’ of Creating and Maintaining a Document Category Portal</a></span></p>
<p style="font-weight: bold; margin-left: 40px;"><span style="font-size: x-small;"><a href="../index.php#_Toc106767221">‘Cost’ of Inaccessible or Hidden Intranet Sites</a></span></p>
<p style="font-weight: bold;"><span style="font-size: x-small;"><a href="../index.php#_Toc106767222">IV. OPPORTUNITIES AND THREATS</a></span></p>
<p style="font-weight: bold; margin-left: 40px;"><span style="font-size: x-small;"><a href="../index.php#_Toc106767223">‘Costs’ and Opportunity Costs of Winning Proposals</a></span></p>
<p style="font-weight: bold; margin-left: 40px;"><span style="font-size: x-small;"><a href="../index.php#_Toc106767224">‘Costs’ of Regulation and Regulatory Non-compliance</a></span></p>
<p style="font-weight: bold; margin-left: 40px;"><span style="font-size: x-small;"><a href="../index.php#_Toc106767225">‘Cost’ of an Unauthorized Posted Document</a></span></p>
<p style="font-weight: bold;"><span style="font-size: x-small;"><a href="../index.php#_Toc106767226">V. CONCLUSIONS</a></span></p>
<h1><a name="_Toc106767204"></a> I. INTRODUCTION</h1>
<p>How many documents does your organization create each year? What effort does this represent in terms of total staffing costs? What does it cost to create a ‘typical’ document? Of documents created, how much of the value in them is readily sharable throughout your organization? How long do you need to keep valuable documents and how can you access them? How much existing document content is re-created simply because prior work cannot be found? When prior information is missed, what do these prior investments in documents represent in terms of loss of market share, revenue or reputation? Indeed, what does the term “document” represent in your organization’s context?</p>
<p>If you have difficulty answering these questions, you are not alone. Depending on the survey, from 90% to 97% of enterprises cannot answer these questions — in whole or in part. The purpose of this white paper is to provide the first comprehensive assessment ever of these document values.</p>
<p>Enterprises and the analyst community have historically overlooked the impact of <em>document creation</em> as opposed to <em>document handling</em>. Document creation is about 2-3 times more important, from an embedded cost standpoint, than document handling. Moreover, all aspects of document creation, and later access and use, assume a much greater role in the overall economics of enterprises than has been realized previously.</p>
<h2><a name="_Toc106767205"></a>Documents: The Drivers of a Knowledge Economy</h2>
<p>Put your index finger one inch from your nose. That is how close — and unfocused — document importance is to an organization. Documents are the salient reality of a knowledge economy, but like your finger, documents are often too close, ubiquitous and commonplace to appreciate.</p>
<p>How do your employees earn their livings? Writing proposals? Marketing or selling? Evaluating competitors or opportunities? Persuading? Analyzing? Communicating? Teaching? Of course, in some sectors, many make their living from growing things or making things. These are essential jobs (indeed, until the last few decades they were the predominant drivers of economies) but are now being supplanted in advanced economies by knowledge work. Perhaps up to 35% of all company employees in the U.S. can be classified as knowledge workers.</p>
<p>And knowledge work means documents. The fact is that knowledge is produced and communicated through the written word. When we search, when we write, when we persuade, we may often do so verbally but make it persistent through the written word.</p>
<h2><a name="_Toc106767206"></a>Documents: The Linchpin of Corporate Intellectual Assets</h2>
<p>IBM estimates that corporate data doubles every six to eight months, 85% of which are documents.<a name="_ednref2"></a>[2] At least 10% of an enterprise’s information changes on a monthly basis.<a name="_ednref3"></a>[3] Year-on-year office document growth rates are on the order of 22%.<a name="_ednref4"></a>[4] As later analysis indicates, there are perhaps on the order of 10 billion documents created annually in the U.S. with a mid-range “asset” value of $3.3 trillion per year. Documents are a huge contributor to the United States’ gross domestic product of $10.5 trillion (2002).</p>
<p>According to a Coopers &amp; Lybrand study in 1993:<a name="_ednref5"></a>[5]</p>
<ul>
<li>Ninety percent of corporate memory exists on paper</li>
<li>Ninety percent of the papers handled each day are merely shuffled</li>
<li>Professionals spend 5-15 percent of their time reading information, but up to 50 percent looking for it</li>
<li>On average, 19 copies are made of each paper document.</li>
</ul>
<p>A Xerox Corporation study commissioned in 2003 and conducted by IDC surveyed 1000 of the largest European companies and had similar findings:<a name="_ednref6"></a>[6],<a name="_ednref7"></a>[7]</p>
<ul>
<li>On average 45% of an executive’s time was spent dealing with documents</li>
<li>82% believed that documents were crucial to the successful operation of their organizations</li>
<li>A further 70% claimed that poor document processes could impact the operational agility of their organizations</li>
<li>While 83%, 78% and 76% considered faxes, email and electronic files to be documents, respectively, only 48% and 46% categorized web pages and multimedia content as such.</li>
</ul>
<h2><a name="_Toc106767207"></a>Documents: Unknown Value, Huge Implications</h2>
<p>But, if defining what constitutes a document is hard, identifying the costs associated with all the document activities is almost impossible for many organizations. Ninety and 97 percent of the corporate respondents to the Coopers &amp; Lybrand and Xerox studies, respectively, could not estimate how much they spent on producing documents each year. Almost three quarters of them admitted that the information is unavailable or unknown to them.</p>
<p>An A.T. Kearney study sponsored by Adobe, EDS, Hewlett-Packard, Mayfield and Nokia, published in 2001, estimated that workforce inefficiencies related to content publishing cost organizations globally about $750 billion. The study further estimated that knowledge workers waste between 15% and 25% of their time in non-productive document activities.<a name="_ednref8"></a>[8]</p>
<p><img class="center_ok" style="width: 664px; height: 402px;" src="../wp-content/themes/ai3/images/DocValue/Figure1.gif" alt="Enterprise document use (SPIN)" width="664" height="402" /></p>
<p style="text-align: center;">Figure 1. The Situation of Poor Enterprise Document Use Leads to Real Implications</p>
<p>But the situation is much broader and results in part from the inability to quantify the importance of both <em>internal</em> and <em>external</em> document assets to all aspects of the enterprise’s bottom line. To cite examples drawn from the main body of this white paper: early adopters of enterprise content software typically capture less than 1% of valuable internal documents available; large enterprises are witnessing the proliferation of internal and external Web sites, sometimes numbering in the thousands; use of external content is presently limited to Internet search engines, producing non-persistent results and no capture of the investment in discovery or results; and “deep” content in searchable databases, which is common to large organizations and represents 90% of external Internet content, is completely untapped.</p>
<p>A USC study reported that typically only 32% of employees in knowledge organizations have access to good information about technical developments relevant to their work, and 79% claim they have inadequate information about what their competitors are doing.<a name="_ednref9"></a>[9]</p>
<p>The enterprise content integration software market is fragmented and confused, with only a few established companies providing partial solutions. Content integration is still a small market with annual revenues of less than $50 million worldwide.<a name="_ednref10"></a>[10] Vendor offerings fail to satisfy customer needs because of a lack of functionality and a lack of scalability to enterprise volumes. Sales in the market remain distinctly lower than those projected by industry analysts, even as the magnitude of “information overload” continues to grow at a dramatic rate.</p>
<h2><a name="_Toc106767208"></a>Documents: The Next Generation of Data Warehousing?</h2>
<p>Documents (that is, <em>unstructured</em> and <em>semi-structured</em> data) are now at the point where structured data was 15 years ago. At that time, companies realized that consolidating information from multiple numeric databases would be a key source of competitive advantage. That realization led to the development and growth of the data warehousing or business intelligence markets, now representing about $3.9 billion in annual software sales.<a name="_ednref11"></a>[11]</p>
<p>Certain categories of businesses have been leaders in content integration, especially those that have recently had mergers and acquisitions activity, those that need to integrate business applications with content, and those for which the reuse of marketing assets across the organization is critical.[10]</p>
<p>Stonebraker and Hellerstein have provided an insightful roadmap for how enterprise data integration or “federation” has trended over time: Data warehousing → Enterprise application integration → Enterprise content integration → Enterprise information integration.<a name="_ednref12"></a>[12] There are two threads to this trend. First, there has been a growing recognition of the importance of document (unstructured) content in contributing to actionable information. Second, increasingly unified and integrated means are being applied to all data sources to allow single-access retrievals.</p>
<h2><a name="_Toc106767209"></a>Connecting the Dots: A Pointillistic Approach</h2>
<p>The state of information regarding the value and cost of documents is extremely poor. Lack of defensible and vetted estimates for this information undercuts the ability to properly estimate the intellectual assets tied up in documents or the impacts of overlooked or misused documents.</p>
<p>Only three large document studies — the Coopers &amp; Lybrand, Xerox and A.T. Kearney studies noted above — have been conducted in the past ten years regarding the use and importance of documents within enterprises, and then solely from the standpoint of executive perceptions.</p>
<p>The quantified picture presented in this white paper regarding the costs and benefits of document creation, access and use is a paint-by-the-numbers assemblage of disparate data. The paper draws upon about 80 different data sources, many fragmented. The analysis has by necessity had to conjoin assumptions and data from many diverse sources.</p>
<p>This approach leads to both uncertainty regarding “true” values and likely inaccuracies or mis-estimates in some areas. To make the assessment as consistent as possible, a base year of 2002 was used, the common year reference for most of the available data sources. To bracket uncertainties, most estimates are provided as low, medium and high values.</p>
<p>Thus, this study should be viewed as preliminary, but strongly indicative of the value of documents. Further research and data collection will surely refine these estimates. Clearly, though, by any measure, the value of documents to the enterprise is huge, and should not continue to be overlooked.</p>
<h1><a name="_Toc106767210"></a>II. INTERNAL DOCUMENTS</h1>
<p>Though valuable content resides everywhere, the first challenge to enterprises is getting a handle on their own internal document content.</p>
<h2><a name="_Toc106767211"></a>Number of ‘Valuable’ Documents Produced per Firm</h2>
<p>A recent UC Berkeley study on “How Much Information?” estimated that more than 4 billion pages of <em>internal</em> office documents with <span style="text-decoration: underline;">archival</span> value are generated annually in the U.S. (Note: this is not the amount created, only those documents deemed worthy of retaining for more than one year).</p>
<table style="text-align: left; margin-left: auto; margin-right: auto; width: 100%;" border="0" cellspacing="0" cellpadding="0">
<tbody>
<tr>
<td style="background-color: #cccccc; width: 20%;" valign="bottom">
<p align="center"><strong>Firm Size (employees)</strong></p>
</td>
<td style="background-color: #cccccc; width: 9%;" valign="bottom">
<p align="center"><strong><span style="text-decoration: underline;">1-9</span></strong></p>
</td>
<td style="background-color: #cccccc; width: 9%;" valign="bottom">
<p align="center"><strong><span style="text-decoration: underline;">10-19</span></strong></p>
</td>
<td style="background-color: #cccccc; width: 10%;" valign="bottom">
<p align="center"><strong><span style="text-decoration: underline;">20-99</span></strong></p>
</td>
<td style="background-color: #cccccc; width: 9%;" valign="bottom">
<p align="center"><strong><span style="text-decoration: underline;">100-499</span></strong></p>
</td>
<td style="background-color: #cccccc; width: 9%;" valign="bottom">
<p align="center"><strong><span style="text-decoration: underline;">500-999</span></strong></p>
</td>
<td style="background-color: #cccccc; width: 9%;" valign="bottom">
<p align="center"><strong><span style="text-decoration: underline;">1000-2500</span></strong></p>
</td>
<td style="background-color: #cccccc; width: 9%;" valign="bottom">
<p align="center"><strong><span style="text-decoration: underline;">2500-9999</span></strong></p>
</td>
<td style="background-color: #cccccc; width: 10%;" valign="bottom">
<p align="center"><strong><span style="text-decoration: underline;">&gt;10,000</span></strong></p>
</td>
</tr>
<tr>
<td style="width: 20%;" valign="bottom">Firms</td>
<td style="width: 9%;" valign="bottom">
<p align="right">3,716,944</p>
</td>
<td style="width: 9%;" valign="bottom">
<p align="right">616,064</p>
</td>
<td style="width: 10%;" valign="bottom">
<p align="right">518,258</p>
</td>
<td style="width: 9%;" valign="bottom">
<p align="right">85,304</p>
</td>
<td style="width: 9%;" valign="bottom">
<p align="right">8,572</p>
</td>
<td style="width: 9%;" valign="bottom">
<p align="right">5,161</p>
</td>
<td style="width: 9%;" valign="bottom">
<p align="right">2,704</p>
</td>
<td style="width: 10%;" valign="bottom">
<p align="right">930</p>
</td>
</tr>
<tr>
<td style="width: 20%;" valign="bottom">Employees</td>
<td style="width: 9%;" valign="bottom">
<p align="right">12,328,094</p>
</td>
<td style="width: 9%;" valign="bottom">
<p align="right">8,274,541</p>
</td>
<td style="width: 10%;" valign="bottom">
<p align="right">20,370,447</p>
</td>
<td style="width: 9%;" valign="bottom">
<p align="right">16,410,367</p>
</td>
<td style="width: 9%;" valign="bottom">
<p align="right">5,906,266</p>
</td>
<td style="width: 9%;" valign="bottom">
<p align="right">7,894,226</p>
</td>
<td style="width: 9%;" valign="bottom">
<p align="right">12,519,664</p>
</td>
<td style="width: 10%;" valign="bottom">
<p align="right">31,357,579</p>
</td>
</tr>
<tr>
<td style="width: 20%;" valign="bottom">Knowledge Workers</td>
<td style="width: 9%;" valign="bottom">
<p align="right">2,217,093</p>
</td>
<td style="width: 9%;" valign="bottom">
<p align="right">1,488,099</p>
</td>
<td style="width: 10%;" valign="bottom">
<p align="right">3,663,435</p>
</td>
<td style="width: 9%;" valign="bottom">
<p align="right">2,951,251</p>
</td>
<td style="width: 9%;" valign="bottom">
<p align="right">1,062,187</p>
</td>
<td style="width: 9%;" valign="bottom">
<p align="right">1,419,703</p>
</td>
<td style="width: 9%;" valign="bottom">
<p align="right">2,251,545</p>
</td>
<td style="width: 10%;" valign="bottom">
<p align="right">5,639,368</p>
</td>
</tr>
<tr>
<td style="width: 20%;" valign="bottom">Number of Pages  –  Low</td>
<td style="width: 9%;" valign="bottom">
<p align="right">465,842,666</p>
</td>
<td style="width: 9%;" valign="bottom">
<p align="right">312,670,737</p>
</td>
<td style="width: 10%;" valign="bottom">
<p align="right">769,739,697</p>
</td>
<td style="width: 9%;" valign="bottom">
<p align="right">620,099,840</p>
</td>
<td style="width: 9%;" valign="bottom">
<p align="right">223,180,542</p>
</td>
<td style="width: 9%;" valign="bottom">
<p align="right">298,299,744</p>
</td>
<td style="width: 9%;" valign="bottom">
<p align="right">473,081,537</p>
</td>
<td style="width: 10%;" valign="bottom">
<p align="right">1,184,911,325</p>
</td>
</tr>
<tr>
<td style="width: 20%;" valign="bottom">Number of Pages  –  High</td>
<td style="width: 9%;" valign="bottom">
<p align="right">1,164,606,665</p>
</td>
<td style="width: 9%;" valign="bottom">
<p align="right">781,676,843</p>
</td>
<td style="width: 10%;" valign="bottom">
<p align="right">1,924,349,242</p>
</td>
<td style="width: 9%;" valign="bottom">
<p align="right">1,550,249,599</p>
</td>
<td style="width: 9%;" valign="bottom">
<p align="right">557,951,355</p>
</td>
<td style="width: 9%;" valign="bottom">
<p align="right">745,749,360</p>
</td>
<td style="width: 9%;" valign="bottom">
<p align="right">1,182,703,842</p>
</td>
<td style="width: 10%;" valign="bottom">
<p align="right">2,962,278,313</p>
</td>
</tr>
<tr>
<td style="width: 20%;" valign="bottom">Number of Docs  –  Low</td>
<td style="width: 9%;" valign="bottom">
<p align="right">46,584,267</p>
</td>
<td style="width: 9%;" valign="bottom">
<p align="right">31,267,074</p>
</td>
<td style="width: 10%;" valign="bottom">
<p align="right">76,973,970</p>
</td>
<td style="width: 9%;" valign="bottom">
<p align="right">62,009,984</p>
</td>
<td style="width: 9%;" valign="bottom">
<p align="right">22,318,054</p>
</td>
<td style="width: 9%;" valign="bottom">
<p align="right">29,829,974</p>
</td>
<td style="width: 9%;" valign="bottom">
<p align="right">47,308,154</p>
</td>
<td style="width: 10%;" valign="bottom">
<p align="right">118,491,133</p>
</td>
</tr>
<tr>
<td style="width: 20%;" valign="bottom">Number of Docs  –  High</td>
<td style="width: 9%;" valign="bottom">
<p align="right">116,460,666</p>
</td>
<td style="width: 9%;" valign="bottom">
<p align="right">78,167,684</p>
</td>
<td style="width: 10%;" valign="bottom">
<p align="right">192,434,924</p>
</td>
<td style="width: 9%;" valign="bottom">
<p align="right">155,024,960</p>
</td>
<td style="width: 9%;" valign="bottom">
<p align="right">55,795,135</p>
</td>
<td style="width: 9%;" valign="bottom">
<p align="right">74,574,936</p>
</td>
<td style="width: 9%;" valign="bottom">
<p align="right">118,270,384</p>
</td>
<td style="width: 10%;" valign="bottom">
<p align="right">296,227,831</p>
</td>
</tr>
<tr>
<td style="width: 20%;" valign="bottom">Docs/Firm  –  Low</td>
<td style="width: 9%;" valign="bottom">
<p align="right">13</p>
</td>
<td style="width: 9%;" valign="bottom">
<p align="right">51</p>
</td>
<td style="width: 10%;" valign="bottom">
<p align="right">149</p>
</td>
<td style="width: 9%;" valign="bottom">
<p align="right">727</p>
</td>
<td style="width: 9%;" valign="bottom">
<p align="right">2,604</p>
</td>
<td style="width: 9%;" valign="bottom">
<p align="right">5,780</p>
</td>
<td style="width: 9%;" valign="bottom">
<p align="right">17,496</p>
</td>
<td style="width: 10%;" valign="bottom">
<p align="right">127,410</p>
</td>
</tr>
<tr>
<td style="width: 20%;" valign="bottom">Docs/Firm  –  High</td>
<td style="width: 9%;" valign="bottom">
<p align="right">31</p>
</td>
<td style="width: 9%;" valign="bottom">
<p align="right">127</p>
</td>
<td style="width: 10%;" valign="bottom">
<p align="right">371</p>
</td>
<td style="width: 9%;" valign="bottom">
<p align="right">1,817</p>
</td>
<td style="width: 9%;" valign="bottom">
<p align="right">6,509</p>
</td>
<td style="width: 9%;" valign="bottom">
<p align="right">14,450</p>
</td>
<td style="width: 9%;" valign="bottom">
<p align="right">43,739</p>
</td>
<td style="width: 10%;" valign="bottom">
<p align="right">318,525</p>
</td>
</tr>
<tr>
<td style="width: 20%;" valign="bottom">Docs/Firm – 3 yr Low</td>
<td style="width: 9%;" valign="bottom">
<p align="right">38</p>
</td>
<td style="width: 9%;" valign="bottom">
<p align="right">152</p>
</td>
<td style="width: 10%;" valign="bottom">
<p align="right">446</p>
</td>
<td style="width: 9%;" valign="bottom">
<p align="right">2,181</p>
</td>
<td style="width: 9%;" valign="bottom">
<p align="right">7,811</p>
</td>
<td style="width: 9%;" valign="bottom">
<p align="right">17,340</p>
</td>
<td style="width: 9%;" valign="bottom">
<p align="right">52,487</p>
</td>
<td style="width: 10%;" valign="bottom">
<p align="right">382,229</p>
</td>
</tr>
<tr>
<td style="width: 20%;" valign="bottom">Docs/Firm – 5 yr High</td>
<td style="width: 9%;" valign="bottom">
<p align="right">157</p>
</td>
<td style="width: 9%;" valign="bottom">
<p align="right">634</p>
</td>
<td style="width: 10%;" valign="bottom">
<p align="right">1,857</p>
</td>
<td style="width: 9%;" valign="bottom">
<p align="right">9,087</p>
</td>
<td style="width: 9%;" valign="bottom">
<p align="right">32,545</p>
</td>
<td style="width: 9%;" valign="bottom">
<p align="right">72,249</p>
</td>
<td style="width: 9%;" valign="bottom">
<p align="right">218,695</p>
</td>
<td style="width: 10%;" valign="bottom">
<p align="right">1,592,623</p>
</td>
</tr>
<tr>
<td style="width: 20%;" valign="bottom"></td>
<td style="width: 9%;" valign="bottom"></td>
<td style="width: 9%;" valign="bottom"></td>
<td style="width: 10%;" valign="bottom"></td>
<td style="width: 9%;" valign="bottom"></td>
<td style="width: 9%;" valign="bottom"></td>
<td style="width: 9%;" valign="bottom"></td>
<td style="width: 9%;" valign="bottom"></td>
<td style="width: 10%;" valign="bottom"></td>
</tr>
<tr>
<td style="width: 20%;" valign="bottom">Content Management Workers</td>
<td style="width: 9%;" valign="bottom">
<p align="right">105,709</p>
</td>
<td style="width: 9%;" valign="bottom">
<p align="right">70,951</p>
</td>
<td style="width: 10%;" valign="bottom">
<p align="right">174,670</p>
</td>
<td style="width: 9%;" valign="bottom">
<p align="right">140,713</p>
</td>
<td style="width: 9%;" valign="bottom">
<p align="right">50,644</p>
</td>
<td style="width: 9%;" valign="bottom">
<p align="right">67,690</p>
</td>
<td style="width: 9%;" valign="bottom">
<p align="right">107,352</p>
</td>
<td style="width: 10%;" valign="bottom">
<p align="right">268,881</p>
</td>
</tr>
<tr>
<td style="width: 20%;" valign="bottom">CMWs/Firm</td>
<td style="width: 9%;" valign="bottom">
<p align="right">0</p>
</td>
<td style="width: 9%;" valign="bottom">
<p align="right">0</p>
</td>
<td style="width: 10%;" valign="bottom">
<p align="right">0</p>
</td>
<td style="width: 9%;" valign="bottom">
<p align="right">2</p>
</td>
<td style="width: 9%;" valign="bottom">
<p align="right">6</p>
</td>
<td style="width: 9%;" valign="bottom">
<p align="right">13</p>
</td>
<td style="width: 9%;" valign="bottom">
<p align="right">40</p>
</td>
<td style="width: 10%;" valign="bottom">
<p align="right">289</p>
</td>
</tr>
</tbody>
</table>
<p style="text-align: center;">Table 2. Document Projections for U.S. Firms by Size, 2002 Basis</p>
<p align="center"><small>Sources: UC Berkeley<a name="_ednref13"></a>[13], U.S. Commerce Department<a name="_ednref14"></a>[14], U.S. Bureau of Labor Statistics<a name="_ednref15"></a>[15], U.S. Census Bureau<a name="_ednref16"></a>[16]</small></p>
<p>Table 2 and Table 3 summarize the scale of this challenge for U.S. firms (for internal enterprise documents <em>only</em>). (See note<a name="_ednref17"></a>[17] for the methodology behind the document scales, note<a name="_ednref18"></a>[18] for the estimate of enterprise knowledge workers, and note<a name="_ednref19"></a>[19] for the estimate of content workers. A rough multiplier of 3x to 4x can be applied to extrapolate globally.<a name="_ednref20"></a>[20]) Table 2 provides breakouts by size of firm; both tables include estimates of the number of knowledge and content workers within U.S. firms.</p>
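<p>As a rough check on how the per-firm rows in Table 2 relate to the totals, the Python sketch below recomputes them for the largest size class (over 10,000 employees). The relationships it uses (about 10 pages per document, and multi-year rows as simple multiples of the annual figure) are inferred from the table's own numbers rather than taken from the methodology notes, so treat them as assumptions.</p>

```python
# Figures copied from the largest size class (over 10,000 employees) in Table 2.
firms = 930
pages_low = 1_184_911_325
docs_low = 118_491_133
docs_high = 296_227_831

# Assumption inferred from the table: roughly 10 pages per document.
pages_per_doc = pages_low / docs_low

# Annual documents per firm, low and high cases.
docs_per_firm_low = docs_low / firms
docs_per_firm_high = docs_high / firms

# Assumption inferred from the table: the multi-year rows are simple
# multiples of the annual per-firm figure (3 years low, 5 years high).
docs_per_firm_3yr_low = 3 * docs_per_firm_low
docs_per_firm_5yr_high = 5 * docs_per_firm_high

print(round(pages_per_doc, 1))
print(round(docs_per_firm_low), round(docs_per_firm_high))
print(round(docs_per_firm_3yr_low), round(docs_per_firm_5yr_high))
```

<p>Rounded to whole documents, these results match the Table 2 entries for that column (127,410; 318,525; 382,229; 1,592,623).</p>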
<table style="text-align: left; margin-left: auto; margin-right: auto; width: 323px;" border="0" cellspacing="0" cellpadding="0">
<tbody>
<tr>
<td style="background-color: #cccccc; width: 201px;" valign="bottom">
<p align="center"><strong>Category</strong></p>
</td>
<td style="background-color: #cccccc; width: 122px;" valign="bottom">
<p align="center"><strong>Value</strong></p>
</td>
</tr>
<tr>
<td style="width: 201px;" valign="bottom">Firms</td>
<td style="width: 122px;" valign="bottom">
<p align="right">4,953,937</p>
</td>
</tr>
<tr>
<td style="width: 201px;" valign="bottom">Employees</td>
<td style="width: 122px;" valign="bottom">
<p align="right">127,273,960</p>
</td>
</tr>
<tr>
<td style="width: 201px;" valign="bottom">Knowledge Workers</td>
<td style="width: 122px;" valign="bottom">
<p align="right">20,692,680</p>
</td>
</tr>
<tr>
<td style="width: 201px;" valign="bottom">Annual Number of Docs – Low</td>
<td style="width: 122px;" valign="bottom">
<p align="right">9,291,013,320</p>
</td>
</tr>
<tr>
<td style="width: 201px;" valign="bottom">Annual Number of Docs – High</td>
<td style="width: 122px;" valign="bottom">
<p align="right">21,739,130,435</p>
</td>
</tr>
<tr>
<td style="width: 201px;" valign="bottom">Annual Docs/Firm – Low</td>
<td style="width: 122px;" valign="bottom">
<p align="right">1,875</p>
</td>
</tr>
<tr>
<td style="width: 201px;" valign="bottom">Annual Docs/Firm – High</td>
<td style="width: 122px;" valign="bottom">
<p align="right">4,388</p>
</td>
</tr>
<tr>
<td style="width: 201px;" valign="bottom">Total Docs/Firm – 3 yr Low</td>
<td style="width: 122px;" valign="bottom">
<p align="right">1,990</p>
</td>
</tr>
<tr>
<td style="width: 201px;" valign="bottom">Total Docs/Firm – 5 yr High</td>
<td style="width: 122px;" valign="bottom">
<p align="right">5,601</p>
</td>
</tr>
<tr>
<td style="width: 201px;" valign="bottom"></td>
<td style="width: 122px;" valign="bottom"></td>
</tr>
<tr>
<td style="width: 201px;" valign="bottom">Content Management Workers</td>
<td style="width: 122px;" valign="bottom">
<p align="right">986,610</p>
</td>
</tr>
<tr>
<td style="width: 201px;" valign="bottom">CMWs/Firm</td>
<td style="width: 122px;" valign="bottom">
<p align="right">0.2</p>
</td>
</tr>
</tbody>
</table>
<p style="text-align: center;">Table 3. Total Annual Document Projections for U.S. Firms, 2002 Basis</p>
<p>Table 4 takes this information and breaks out the distribution of document production for a ‘typical’ knowledge worker by major document type. The data in this table are based on an analysis of dozens of BrightPlanet customers, averaged across about 10 million documents in various repositories.</p>
<table style="text-align: left; margin-left: auto; margin-right: auto; width: 97%;" border="0" cellspacing="0" cellpadding="0">
<tbody>
<tr>
<td style="background-color: #cccccc; width: 12%;" valign="bottom"></td>
<td style="background-color: #cccccc; width: 12%;" valign="bottom"></td>
<td style="background-color: #cccccc; width: 8%;" valign="bottom"></td>
<td style="background-color: #cccccc; width: 11%;" valign="bottom"></td>
<td style="background-color: #cccccc; width: 8%;" valign="bottom"></td>
<td style="background-color: #cccccc; width: 8%;" valign="bottom"></td>
<td style="background-color: #cccccc; width: 8%;" valign="bottom"></td>
<td style="background-color: #cccccc; width: 1%;" valign="bottom"></td>
<td style="background-color: #cccccc; width: 26%;" colspan="3" valign="bottom">
<p align="center"><strong>% Based On</strong></p>
</td>
</tr>
<tr>
<td style="background-color: #cccccc; width: 12%;" valign="bottom"></td>
<td style="background-color: #cccccc; width: 12%;" valign="bottom">
<p align="center"><strong>All</strong></p>
</td>
<td style="background-color: #cccccc; width: 8%;" valign="bottom">
<p align="center"><strong>Unique</strong></p>
</td>
<td style="background-color: #cccccc; width: 11%;" valign="bottom">
<p align="center"><strong>MBs</strong></p>
</td>
<td style="background-color: #cccccc; width: 8%;" valign="bottom">
<p align="center"><strong>KB/Page</strong></p>
</td>
<td style="background-color: #cccccc; width: 8%;" valign="bottom">
<p align="center"><strong>Pg/Doc</strong></p>
</td>
<td style="background-color: #cccccc; width: 8%;" valign="bottom">
<p align="center"><strong>Pages</strong></p>
</td>
<td style="background-color: #cccccc; width: 1%;" valign="bottom">
<p align="center"><strong> </strong></p>
</td>
<td style="background-color: #cccccc; width: 8%;" valign="bottom">
<p align="center"><strong>Docs</strong></p>
</td>
<td style="background-color: #cccccc; width: 8%;" valign="bottom">
<p align="center"><strong>MBs</strong></p>
</td>
<td style="background-color: #cccccc; width: 9%;" valign="bottom">
<p align="center"><strong>Pages</strong></p>
</td>
</tr>
<tr>
<td style="background-color: #cccccc; width: 24%;" colspan="2" valign="bottom"><strong>Archival Documents (3 yrs)</strong></td>
<td style="background-color: #cccccc; width: 8%;" valign="bottom"></td>
<td style="background-color: #cccccc; width: 11%;" valign="bottom"></td>
<td style="background-color: #cccccc; width: 8%;" valign="bottom"></td>
<td style="background-color: #cccccc; width: 8%;" valign="bottom"></td>
<td style="background-color: #cccccc; width: 8%;" valign="bottom"></td>
<td style="background-color: #cccccc; width: 1%;" valign="bottom"></td>
<td style="background-color: #cccccc; width: 8%;" valign="bottom"></td>
<td style="background-color: #cccccc; width: 8%;" valign="bottom"></td>
<td style="background-color: #cccccc; width: 9%;" valign="bottom"></td>
</tr>
<tr>
<td style="width: 12%;" valign="bottom">DOC</td>
<td style="width: 12%;" valign="bottom"></td>
<td style="width: 8%;" valign="bottom">
<p align="right">281</p>
</td>
<td style="width: 11%;" valign="bottom">
<p align="right">59</p>
</td>
<td style="width: 8%;" valign="bottom">
<p align="right">20</p>
</td>
<td style="width: 8%;" valign="bottom">
<p align="right">10.5</p>
</td>
<td style="width: 8%;" valign="bottom">
<p align="right">2,938</p>
</td>
<td style="width: 1%;" valign="bottom"></td>
<td style="width: 8%;" valign="bottom">
<p align="right">52%</p>
</td>
<td style="width: 8%;" valign="bottom">
<p align="right">36%</p>
</td>
<td style="width: 9%;" valign="bottom">
<p align="right">50%</p>
</td>
</tr>
<tr>
<td style="width: 12%;" valign="bottom">PDF</td>
<td style="width: 12%;" valign="bottom"></td>
<td style="width: 8%;" valign="bottom">
<p align="right">46</p>
</td>
<td style="width: 11%;" valign="bottom">
<p align="right">28</p>
</td>
<td style="width: 8%;" valign="bottom">
<p align="right">14</p>
</td>
<td style="width: 8%;" valign="bottom">
<p align="right">43.6</p>
</td>
<td style="width: 8%;" valign="bottom">
<p align="right">2,017</p>
</td>
<td style="width: 1%;" valign="bottom"></td>
<td style="width: 8%;" valign="bottom">
<p align="right">9%</p>
</td>
<td style="width: 8%;" valign="bottom">
<p align="right">17%</p>
</td>
<td style="width: 9%;" valign="bottom">
<p align="right">34%</p>
</td>
</tr>
<tr>
<td style="width: 12%;" valign="bottom">PPT</td>
<td style="width: 12%;" valign="bottom"></td>
<td style="width: 8%;" valign="bottom">
<p align="right">32</p>
</td>
<td style="width: 11%;" valign="bottom">
<p align="right">26</p>
</td>
<td style="width: 8%;" valign="bottom">
<p align="right">55</p>
</td>
<td style="width: 8%;" valign="bottom">
<p align="right">14.6</p>
</td>
<td style="width: 8%;" valign="bottom">
<p align="right">474</p>
</td>
<td style="width: 1%;" valign="bottom"></td>
<td style="width: 8%;" valign="bottom">
<p align="right">6%</p>
</td>
<td style="width: 8%;" valign="bottom">
<p align="right">16%</p>
</td>
<td style="width: 9%;" valign="bottom">
<p align="right">8%</p>
</td>
</tr>
<tr>
<td style="width: 12%;" valign="bottom">XLS</td>
<td style="width: 12%;" valign="bottom"></td>
<td style="width: 8%;" valign="bottom">
<p align="right">178</p>
</td>
<td style="width: 11%;" valign="bottom">
<p align="right">51</p>
</td>
<td style="width: 8%;" valign="bottom">
<p align="right">100</p>
</td>
<td style="width: 8%;" valign="bottom">
<p align="right">2.7</p>
</td>
<td style="width: 8%;" valign="bottom">
<p align="right">484</p>
</td>
<td style="width: 1%;" valign="bottom"></td>
<td style="width: 8%;" valign="bottom">
<p align="right">33%</p>
</td>
<td style="width: 8%;" valign="bottom">
<p align="right">31%</p>
</td>
<td style="width: 9%;" valign="bottom">
<p align="right">8%</p>
</td>
</tr>
<tr>
<td style="width: 12%;" valign="bottom"><strong> Weighted</strong></td>
<td style="width: 12%;" valign="bottom"><strong> </strong></td>
<td style="width: 8%;" valign="bottom">
<p align="right">537</p>
</td>
<td style="width: 11%;" valign="bottom">
<p align="right">164</p>
</td>
<td style="width: 8%;" valign="bottom">
<p align="right">28</p>
</td>
<td style="width: 8%;" valign="bottom">
<p align="right">11.0</p>
</td>
<td style="width: 8%;" valign="bottom">
<p align="right">5,912</p>
</td>
<td style="width: 1%;" valign="bottom"></td>
<td style="width: 8%;" valign="bottom">
<p align="right">100%</p>
</td>
<td style="width: 8%;" valign="bottom">
<p align="right">100%</p>
</td>
<td style="width: 9%;" valign="bottom">
<p align="right">100%</p>
</td>
</tr>
<tr>
<td style="width: 24%;" colspan="2" valign="bottom"><strong>Current Documents (1 yr)</strong></td>
<td style="width: 8%;" valign="bottom"></td>
<td style="width: 11%;" valign="bottom"></td>
<td style="width: 8%;" valign="bottom"></td>
<td style="width: 8%;" valign="bottom"></td>
<td style="width: 8%;" valign="bottom"></td>
<td style="width: 1%;" valign="bottom"></td>
<td style="width: 8%;" valign="bottom"></td>
<td style="width: 8%;" valign="bottom"></td>
<td style="width: 9%;" valign="bottom"></td>
</tr>
<tr>
<td style="width: 12%;" valign="bottom">DOC</td>
<td style="width: 12%;" valign="bottom">
<p align="right">221</p>
</td>
<td style="width: 8%;" valign="bottom"></td>
<td style="width: 11%;" valign="bottom">
<p align="right">71</p>
</td>
<td style="width: 8%;" valign="bottom">
<p align="right">20</p>
</td>
<td style="width: 8%;" valign="bottom">
<p align="right">5.1</p>
</td>
<td style="width: 8%;" valign="bottom">
<p align="right">1,127</p>
</td>
<td style="width: 1%;" valign="bottom"></td>
<td style="width: 8%;" valign="bottom">
<p align="right">49%</p>
</td>
<td style="width: 8%;" valign="bottom">
<p align="right">35%</p>
</td>
<td style="width: 9%;" valign="bottom">
<p align="right">32%</p>
</td>
</tr>
<tr>
<td style="width: 12%;" valign="bottom">PDF</td>
<td style="width: 12%;" valign="bottom">
<p align="right">66</p>
</td>
<td style="width: 8%;" valign="bottom"></td>
<td style="width: 11%;" valign="bottom">
<p align="right">36</p>
</td>
<td style="width: 8%;" valign="bottom">
<p align="right">14</p>
</td>
<td style="width: 8%;" valign="bottom">
<p align="right">24.7</p>
</td>
<td style="width: 8%;" valign="bottom">
<p align="right">1,634</p>
</td>
<td style="width: 1%;" valign="bottom"></td>
<td style="width: 8%;" valign="bottom">
<p align="right">15%</p>
</td>
<td style="width: 8%;" valign="bottom">
<p align="right">18%</p>
</td>
<td style="width: 9%;" valign="bottom">
<p align="right">46%</p>
</td>
</tr>
<tr>
<td style="width: 12%;" valign="bottom">PPT</td>
<td style="width: 12%;" valign="bottom">
<p align="right">53</p>
</td>
<td style="width: 8%;" valign="bottom"></td>
<td style="width: 11%;" valign="bottom">
<p align="right">76</p>
</td>
<td style="width: 8%;" valign="bottom">
<p align="right">55</p>
</td>
<td style="width: 8%;" valign="bottom">
<p align="right">12.9</p>
</td>
<td style="width: 8%;" valign="bottom">
<p align="right">687</p>
</td>
<td style="width: 1%;" valign="bottom"></td>
<td style="width: 8%;" valign="bottom">
<p align="right">12%</p>
</td>
<td style="width: 8%;" valign="bottom">
<p align="right">38%</p>
</td>
<td style="width: 9%;" valign="bottom">
<p align="right">20%</p>
</td>
</tr>
<tr>
<td style="width: 12%;" valign="bottom">XLS</td>
<td style="width: 12%;" valign="bottom">
<p align="right">108</p>
</td>
<td style="width: 8%;" valign="bottom"></td>
<td style="width: 11%;" valign="bottom">
<p align="right">17</p>
</td>
<td style="width: 8%;" valign="bottom">
<p align="right">100</p>
</td>
<td style="width: 8%;" valign="bottom">
<p align="right">0.6</p>
</td>
<td style="width: 8%;" valign="bottom">
<p align="right">70</p>
</td>
<td style="width: 1%;" valign="bottom"></td>
<td style="width: 8%;" valign="bottom">
<p align="right">24%</p>
</td>
<td style="width: 8%;" valign="bottom">
<p align="right">8%</p>
</td>
<td style="width: 9%;" valign="bottom">
<p align="right">2%</p>
</td>
</tr>
<tr>
<td style="width: 12%;" valign="bottom"><strong> Weighted</strong></td>
<td style="width: 12%;" valign="bottom">
<p align="right">449</p>
</td>
<td style="width: 8%;" valign="bottom"></td>
<td style="width: 11%;" valign="bottom">
<p align="right">199</p>
</td>
<td style="width: 8%;" valign="bottom">
<p align="right">57</p>
</td>
<td style="width: 8%;" valign="bottom">
<p align="right">7.8</p>
</td>
<td style="width: 8%;" valign="bottom">
<p align="right">3,517</p>
</td>
<td style="width: 1%;" valign="bottom"></td>
<td style="width: 8%;" valign="bottom">
<p align="right">100%</p>
</td>
<td style="width: 8%;" valign="bottom">
<p align="right">100%</p>
</td>
<td style="width: 9%;" valign="bottom">
<p align="right">100%</p>
</td>
</tr>
<tr>
<td style="width: 24%;" colspan="2" valign="bottom"><strong>Total per Employee</strong></td>
<td style="width: 8%;" valign="bottom"></td>
<td style="width: 11%;" valign="bottom"></td>
<td style="width: 8%;" valign="bottom"></td>
<td style="width: 8%;" valign="bottom"></td>
<td style="width: 8%;" valign="bottom"></td>
<td style="width: 1%;" valign="bottom"></td>
<td style="width: 8%;" valign="bottom"></td>
<td style="width: 8%;" valign="bottom"></td>
<td style="width: 9%;" valign="bottom"></td>
</tr>
<tr>
<td style="width: 12%;" valign="bottom">DOC</td>
<td style="width: 21%;" colspan="2" valign="bottom">
<p align="center">502</p>
</td>
<td style="width: 11%;" valign="bottom">
<p align="right">129</p>
</td>
<td style="width: 8%;" valign="bottom">
<p align="right">20</p>
</td>
<td style="width: 8%;" valign="bottom">
<p align="right">8.1</p>
</td>
<td style="width: 8%;" valign="bottom">
<p align="right">4,065</p>
</td>
<td style="width: 1%;" valign="bottom"></td>
<td style="width: 8%;" valign="bottom">
<p align="right">51%</p>
</td>
<td style="width: 8%;" valign="bottom">
<p align="right">36%</p>
</td>
<td style="width: 9%;" valign="bottom">
<p align="right">43%</p>
</td>
</tr>
<tr>
<td style="width: 12%;" valign="bottom">PDF</td>
<td style="width: 21%;" colspan="2" valign="bottom">
<p align="center">112</p>
</td>
<td style="width: 11%;" valign="bottom">
<p align="right">64</p>
</td>
<td style="width: 8%;" valign="bottom">
<p align="right">14</p>
</td>
<td style="width: 8%;" valign="bottom">
<p align="right">32.5</p>
</td>
<td style="width: 8%;" valign="bottom">
<p align="right">3,650</p>
</td>
<td style="width: 1%;" valign="bottom"></td>
<td style="width: 8%;" valign="bottom">
<p align="right">11%</p>
</td>
<td style="width: 8%;" valign="bottom">
<p align="right">18%</p>
</td>
<td style="width: 9%;" valign="bottom">
<p align="right">39%</p>
</td>
</tr>
<tr>
<td style="width: 12%;" valign="bottom">PPT</td>
<td style="width: 21%;" colspan="2" valign="bottom">
<p align="center">86</p>
</td>
<td style="width: 11%;" valign="bottom">
<p align="right">102</p>
</td>
<td style="width: 8%;" valign="bottom">
<p align="right">55</p>
</td>
<td style="width: 8%;" valign="bottom">
<p align="right">13.5</p>
</td>
<td style="width: 8%;" valign="bottom">
<p align="right">1,161</p>
</td>
<td style="width: 1%;" valign="bottom"></td>
<td style="width: 8%;" valign="bottom">
<p align="right">9%</p>
</td>
<td style="width: 8%;" valign="bottom">
<p align="right">28%</p>
</td>
<td style="width: 9%;" valign="bottom">
<p align="right">12%</p>
</td>
</tr>
<tr>
<td style="width: 12%;" valign="bottom">XLS</td>
<td style="width: 21%;" colspan="2" valign="bottom">
<p align="center">285</p>
</td>
<td style="width: 11%;" valign="bottom">
<p align="right">68</p>
</td>
<td style="width: 8%;" valign="bottom">
<p align="right">100</p>
</td>
<td style="width: 8%;" valign="bottom">
<p align="right">1.9</p>
</td>
<td style="width: 8%;" valign="bottom">
<p align="right">554</p>
</td>
<td style="width: 1%;" valign="bottom"></td>
<td style="width: 8%;" valign="bottom">
<p align="right">29%</p>
</td>
<td style="width: 8%;" valign="bottom">
<p align="right">19%</p>
</td>
<td style="width: 9%;" valign="bottom">
<p align="right">6%</p>
</td>
</tr>
<tr>
<td style="width: 12%;" valign="bottom"><strong> Weighted</strong></td>
<td style="width: 21%;" colspan="2" valign="bottom">
<p align="center">986</p>
</td>
<td style="width: 11%;" valign="bottom">
<p align="right">363</p>
</td>
<td style="width: 8%;" valign="bottom">
<p align="right">39</p>
</td>
<td style="width: 8%;" valign="bottom">
<p align="right">9.6</p>
</td>
<td style="width: 8%;" valign="bottom">
<p align="right">9,430</p>
</td>
<td style="width: 1%;" valign="bottom"></td>
<td style="width: 8%;" valign="bottom">
<p align="right">100%</p>
</td>
<td style="width: 8%;" valign="bottom">
<p align="right">100%</p>
</td>
<td style="width: 9%;" valign="bottom">
<p align="right">100%</p>
</td>
</tr>
</tbody>
</table>
<p style="text-align: center;">Table 4. Document Production for a ‘Typical’ Knowledge Worker</p>
<p>Note that word-processed documents account for about 50% of typical production and storage demands. Note also that documents of the highest archival value, those converted to PDF for sharing and deployment, represent about a third to two-fifths of stored documents when measured by pages.</p>
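<p>The percentage columns in Table 4 can be reproduced from the counts. The minimal Python sketch below does this for the archival (3 yr) rows; the helper name <code>shares</code> is only illustrative and is not part of the original analysis.</p>

```python
# Archival (3 yr) rows of Table 4: (unique docs, MBs, pages) per file type.
archival = {
    "DOC": (281, 59, 2938),
    "PDF": (46, 28, 2017),
    "PPT": (32, 26, 474),
    "XLS": (178, 51, 484),
}


def shares(rows, column):
    """Return each type's share of one column, rounded to a whole percent."""
    total = sum(values[column] for values in rows.values())
    return {name: round(100 * values[column] / total)
            for name, values in rows.items()}


doc_share = shares(archival, 0)   # share of document counts
mb_share = shares(archival, 1)    # share of storage in MBs
page_share = shares(archival, 2)  # share of pages

for label, result in (("Docs", doc_share), ("MBs", mb_share), ("Pages", page_share)):
    print(label, result)
```

<p>On these inputs, PDFs come out at roughly 9% of archival documents by count but about 34% by pages, which is where the "about a third" figure appears.</p>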
<h2><a name="_Toc106767212"></a>Total Annual U.S. ‘Costs’ to Create Documents</h2>
<p>Based on the information in Tables 2 through 4 above, all updated to a common year 2002 basis, we can now estimate the total annual costs in the U.S. for creating all internal enterprise documents. The analysis is based on the UC Berkeley information and the Coopers &amp; Lybrand studies. The “bottom up” case is based on the number of annual U.S. documents estimated in Table 2. These results are shown in the table below:</p>
<table style="text-align: left; margin-left: auto; margin-right: auto; width: 450px;" border="0" cellspacing="0" cellpadding="0">
<tbody>
<tr>
<td style="background-color: #cccccc; width: 144px;" valign="bottom"></td>
<td style="background-color: #cccccc; width: 306px;" colspan="3" valign="bottom">
<p align="center"><strong>Annual U.S. Office Documents</strong></p>
</td>
</tr>
<tr>
<td style="background-color: #cccccc; width: 144px;" valign="bottom"></td>
<td style="background-color: #cccccc; width: 102px;" valign="bottom">
<p align="center"><strong>Number (M)</strong></p>
</td>
<td style="background-color: #cccccc; width: 108px;" valign="bottom">
<p align="center"><strong>$/Document</strong></p>
</td>
<td style="background-color: #cccccc; width: 96px;" valign="bottom">
<p align="center"><strong>Total $ (B)</strong></p>
</td>
</tr>
<tr>
<td style="width: 144px;" valign="bottom">“Bottom Up” – Low</td>
<td style="width: 102px;" valign="bottom">
<p align="right">1,387</p>
</td>
<td style="width: 108px;" valign="bottom">
<p align="right">$738.58</p>
</td>
<td style="width: 96px;" valign="bottom">
<p align="right">$1,024</p>
</td>
</tr>
<tr>
<td style="width: 144px;" valign="bottom">“Bottom Up” – High</td>
<td style="width: 102px;" valign="bottom">
<p align="right">7,242</p>
</td>
<td style="width: 108px;" valign="bottom">
<p align="right">$141.43</p>
</td>
<td style="width: 96px;" valign="bottom">
<p align="right">$1,024</p>
</td>
</tr>
<tr>
<td style="width: 144px;" valign="bottom">Coopers &amp; Lybrand</td>
<td style="width: 102px;" valign="bottom">
<p align="right">11,975</p>
</td>
<td style="width: 108px;" valign="bottom">
<p align="right">$272.33</p>
</td>
<td style="width: 96px;" valign="bottom">
<p align="right">$3,261</p>
</td>
</tr>
<tr>
<td style="width: 144px;" valign="bottom">C&amp;L – UCB</td>
<td style="width: 102px;" valign="bottom">
<p align="right">27,737</p>
</td>
<td style="width: 108px;" valign="bottom">
<p align="right">$272.33</p>
</td>
<td style="width: 96px;" valign="bottom">
<p align="right">$7,554</p>
</td>
</tr>
<tr>
<td style="width: 144px;" valign="bottom">C&amp;L – “Bottom Up”</td>
<td style="width: 102px;" valign="bottom">
<p align="right">4,315</p>
</td>
<td style="width: 108px;" valign="bottom">
<p align="right">$272.33</p>
</td>
<td style="width: 96px;" valign="bottom">
<p align="right">$1,175</p>
</td>
</tr>
<tr>
<td style="width: 144px;" valign="bottom"></td>
<td style="width: 102px;" valign="bottom"></td>
<td style="width: 108px;" valign="bottom"></td>
<td style="width: 96px;" valign="bottom"></td>
</tr>
<tr>
<td style="width: 144px;" valign="bottom">Average</td>
<td style="width: 102px;" valign="bottom">
<p align="right">10,531</p>
</td>
<td style="width: 108px;" valign="bottom">
<p align="right">$384.11</p>
</td>
<td style="width: 96px;" valign="bottom">
<p align="right">$3,253</p>
</td>
</tr>
</tbody>
</table>
<p style="text-align: center;">Table 5. Annual U.S. Office Document Cost Estimates<a name="_ednref21"></a>[21]</p>
<p>Each figure in the Average row is the mean of the unique values in its column. The Table 5 analysis suggests there may be on the order of 10 billion documents created annually in the U.S. with a total “asset” value on the order of $3.3 trillion per year.</p>
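<p>The arithmetic behind Table 5 can be sketched in a few lines of Python. The figures are copied from the table; the variable and function names are illustrative only. The sketch checks that each row's total is the document count times the per-document cost, and then takes the mean of the unique values in each column.</p>

```python
from statistics import mean

# Rows of Table 5: (documents in millions, dollars per document, total in $ billions).
rows = [
    (1387, 738.58, 1024),    # "Bottom Up" - Low
    (7242, 141.43, 1024),    # "Bottom Up" - High
    (11975, 272.33, 3261),   # Coopers and Lybrand
    (27737, 272.33, 7554),   # C and L - UCB
    (4315, 272.33, 1175),    # C and L - "Bottom Up"
]


def average_of_unique(values):
    """Mean of the distinct values, so repeated entries count only once."""
    return mean(set(values))


# Millions of documents times dollars per document gives millions of dollars;
# dividing by 1,000 converts to billions.
recomputed_totals = [round(number * cost / 1000) for number, cost, _ in rows]

avg_number = average_of_unique(row[0] for row in rows)
avg_cost = average_of_unique(row[1] for row in rows)
avg_total = average_of_unique(row[2] for row in rows)

print(recomputed_totals)
print(avg_number, round(avg_cost, 2), avg_total)
```

<p>The recomputed totals agree with the table's Total column, and the three means come out at about 10,531 million documents, $384.11 per document and $3,253.5 billion, in line with the Average row.</p>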
<h2><a name="_Toc106767213"></a>‘Cost’ of Creating a ‘Typical’ Document</h2>
<p>Based on the averages in the table above, a ‘typical’ document may cost on the order of $380 to create.<a name="_ednref22"></a>[22] Of course, a “document” can vary widely in size, complexity and time to create, and therefore its individual cost and value will vary widely. An invoice generated from an automated accounting system could be a single page and produced automatically in the thousands; proposals for very large contracts can take tens of thousands to millions of dollars to create. For example, here are some other ‘typical’ costs for a variety of documents:</p>
<table style="text-align: left; margin-left: auto; margin-right: auto; width: 276px;" border="0" cellspacing="0" cellpadding="0">
<tbody>
<tr>
<td style="background-color: #cccccc; width: 150px;" valign="bottom"></td>
<td style="background-color: #cccccc; width: 126px;" colspan="2" valign="bottom">
<p align="center"><strong>Ave. Cost</strong></p>
</td>
</tr>
<tr>
<td style="width: 150px;" valign="bottom">‘Typical’ Document</td>
<td style="width: 87px;" valign="bottom">
<p align="right">$384.11</p>
</td>
<td style="width: 39px;" valign="bottom"></td>
</tr>
<tr>
<td style="width: 150px;" valign="bottom"></td>
<td style="width: 87px;" valign="bottom"></td>
<td style="width: 39px;" valign="bottom"></td>
</tr>
<tr>
<td style="width: 150px;" valign="bottom">Invoice</td>
<td style="width: 87px;" valign="bottom">
<p align="right">$4.43</p>
</td>
<td style="width: 39px;" valign="bottom"><a name="_ednref23"></a>[23]</td>
</tr>
<tr>
<td style="width: 150px;" valign="bottom">Mortgage Application</td>
<td style="width: 87px;" valign="bottom">
<p align="right">$210.00</p>
</td>
<td style="width: 39px;" valign="bottom"><a name="_ednref24"></a>[24]</td>
</tr>
<tr>
<td style="width: 150px;" valign="bottom">‘Typical’ Proposal</td>
<td style="width: 87px;" valign="bottom">
<p align="right">$17,500.00</p>
</td>
<td style="width: 39px;" valign="bottom"><a name="_ednref25"></a>[25]</td>
</tr>
</tbody>
</table>
<p style="text-align: center;">Table 6. ‘Typical’ per Document Creation Costs</p>
<p>Depending on document mix and activities, individual enterprises may want to vary the average document creation costs used in their cost-benefit estimates.</p>
<h2><a name="_Toc106767214"></a>‘Cost’ of a Missed or Overlooked Document</h2>
<p>The Coopers &amp; Lybrand study suggests that 7.5 percent of all documents are lost forever, and that it costs $120 in labor ($150 updated to 2002) to find a misfiled document;<a name="_ednref26"></a>[26] other studies suggest that 5% to 6% of documents are routinely misplaced or misfiled.</p>
<p>In fact, the true extent of this problem is unknown, as the Xerox results affirm:<a name="_ednref27"></a>[27]</p>
<ul>
<li>Almost three quarters of corporate respondents admit that the information is unavailable or unknown to them</li>
<li>95% of the companies are not able to estimate the cost of wasted or unused documents</li>
<li>On average, 19% of printed documents were wasted</li>
</ul>
<h2><a name="_Toc106767215"></a>Other Document Total ‘Cost’ Factors and Summary</h2>
<p>Five independent studies suggest that, on average, organizations spend from 5% to 15% of total company revenue on handling documents.<sup>27,<a name="_ednref28"></a>[28],<a name="_ednref29"></a>[29],<a name="_ednref30"></a>[30],<a name="_ednref31"></a>[31]</sup> These seemingly innocuous percentages can translate into huge bottom-line impacts for U.S. enterprises. For example, the total GDP of the United States was on the order of $10.5 <em>trillion</em> at the end of 2002.<a name="_ednref32"></a>[32] Combining this value with the results of Table 5 and the information in previous sections indicates the importance of document creation and handling for U.S. enterprises:</p>
<table style="text-align: left; margin-left: auto; margin-right: auto; width: 472px;" border="0" cellspacing="0" cellpadding="0">
<tbody>
<tr>
<td style="background-color: #cccccc; width: 247px;" valign="bottom"></td>
<td style="background-color: #cccccc; width: 81px;" valign="bottom">
<p align="center"><strong>Low</strong></p>
</td>
<td style="background-color: #cccccc; width: 72px;" valign="bottom">
<p align="center"><strong>Medium</strong></p>
</td>
<td style="background-color: #cccccc; width: 73px;" valign="bottom">
<p align="center"><strong>High</strong></p>
</td>
</tr>
<tr>
<td style="width: 247px;" valign="bottom">Total U.S. Gross Domestic Product ($B)</td>
<td style="width: 81px;" valign="bottom">
<p align="right">$10,487</p>
</td>
<td style="width: 72px;" valign="bottom">
<p align="right">$10,487</p>
</td>
<td style="width: 73px;" valign="bottom">
<p align="right">$10,487</p>
</td>
</tr>
<tr>
<td style="width: 247px;" valign="bottom"></td>
<td style="width: 81px;" valign="bottom"></td>
<td style="width: 72px;" valign="bottom"></td>
<td style="width: 73px;" valign="bottom"></td>
</tr>
<tr>
<td style="width: 247px;" valign="bottom">Total Document Handling ($B)</td>
<td style="width: 81px;" valign="bottom">
<p align="right">$524</p>
</td>
<td style="width: 72px;" valign="bottom">
<p align="right">$1,049</p>
</td>
<td style="width: 73px;" valign="bottom">
<p align="right">$1,573</p>
</td>
</tr>
<tr>
<td style="width: 247px;" valign="bottom">
<p align="right">% of total GDP:</p>
</td>
<td style="width: 81px;" valign="bottom">
<p align="right">5.0%</p>
</td>
<td style="width: 72px;" valign="bottom">
<p align="right">10.0%</p>
</td>
<td style="width: 73px;" valign="bottom">
<p align="right">15.0%</p>
</td>
</tr>
<tr>
<td style="width: 247px;" valign="bottom"></td>
<td style="width: 81px;" valign="bottom"></td>
<td style="width: 72px;" valign="bottom"></td>
<td style="width: 73px;" valign="bottom"></td>
</tr>
<tr>
<td style="width: 247px;" valign="bottom">Total Document Creation ($B)</td>
<td style="width: 81px;" valign="bottom">
<p align="right">$1,100</p>
</td>
<td style="width: 72px;" valign="bottom">
<p align="right">$3,261</p>
</td>
<td style="width: 73px;" valign="bottom">
<p align="right">$7,554</p>
</td>
</tr>
<tr>
<td style="width: 247px;" valign="bottom">
<p align="right">% of total GDP:</p>
</td>
<td style="width: 81px;" valign="bottom">
<p align="right">10.5%</p>
</td>
<td style="width: 72px;" valign="bottom">
<p align="right">31.1%</p>
</td>
<td style="width: 73px;" valign="bottom">
<p align="right">72.0%</p>
</td>
</tr>
<tr>
<td style="width: 247px;" valign="bottom"></td>
<td style="width: 81px;" valign="bottom"></td>
<td style="width: 72px;" valign="bottom"></td>
<td style="width: 73px;" valign="bottom"></td>
</tr>
<tr>
<td style="width: 247px;" valign="bottom">Total Document Misfiled ($B)</td>
<td style="width: 81px;" valign="bottom">
<p align="right">$32</p>
</td>
<td style="width: 72px;" valign="bottom">
<p align="right">$81</p>
</td>
<td style="width: 73px;" valign="bottom">
<p align="right">$160</p>
</td>
</tr>
<tr>
<td style="width: 247px;" valign="bottom">
<p align="right">% of total GDP:</p>
</td>
<td style="width: 81px;" valign="bottom">
<p align="right">0.3%</p>
</td>
<td style="width: 72px;" valign="bottom">
<p align="right">0.8%</p>
</td>
<td style="width: 73px;" valign="bottom">
<p align="right">1.5%</p>
</td>
</tr>
<tr>
<td style="width: 247px;" valign="bottom"></td>
<td style="width: 81px;" valign="bottom"></td>
<td style="width: 72px;" valign="bottom"></td>
<td style="width: 73px;" valign="bottom"></td>
</tr>
<tr>
<td style="width: 247px;" valign="bottom">ALL U.S. Document Burdens ($B)</td>
<td style="width: 81px;" valign="bottom">
<p align="right">$1,656</p>
</td>
<td style="width: 72px;" valign="bottom">
<p align="right">$4,390</p>
</td>
<td style="width: 73px;" valign="bottom">
<p align="right">$9,287</p>
</td>
</tr>
<tr>
<td style="width: 247px;" valign="bottom">
<p align="right">% of total GDP:</p>
</td>
<td style="width: 81px;" valign="bottom">
<p align="right">15.8%</p>
</td>
<td style="width: 72px;" valign="bottom">
<p align="right">41.9%</p>
</td>
<td style="width: 73px;" valign="bottom">
<p align="right">88.6%</p>
</td>
</tr>
</tbody>
</table>
<p style="text-align: center;">Table 7. Range Estimates for Total U.S. Document Burdens in Enterprises, 2002<a name="_ednref33"></a>[33]</p>
<p>A few observations relate to this table. First, enterprises and the analyst community have greatly overlooked the impact of <em>document creation</em> as opposed to <em>document handling</em>. From an embedded cost standpoint, document creation is about 2-3 times more important than document handling. Second, all aspects of document creation assume a much greater role in the overall economics of enterprises than has been realized previously.</p>
<p><strong>The fact that documents have received so little management attention, awareness, measurement and direct effort to improve performance is shocking.</strong></p>
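<p>The arithmetic behind Table 7 can be retraced with a short sketch. It assumes, as the table appears to, that the 5% to 15% handling share is applied directly to GDP; the creation and misfiling rows are copied from the table rather than derived, and the names <code>burden</code>, <code>HANDLING_SHARE</code>, <code>CREATION</code> and <code>MISFILED</code> are illustrative only.</p>

```python
# Figures in $ billions, copied from Table 7 (2002).
GDP = 10_487

# Share spent on document handling: the 5% to 15% range cited in the text,
# applied here to GDP (an assumption about how the table was built).
HANDLING_SHARE = {"low": 0.05, "medium": 0.10, "high": 0.15}

# Creation and misfiling rows are taken from Table 7 as given, not derived.
CREATION = {"low": 1_100, "medium": 3_261, "high": 7_554}
MISFILED = {"low": 32, "medium": 81, "high": 160}


def burden(scenario):
    """Return handling cost, total burden and key ratios for one scenario."""
    handling = GDP * HANDLING_SHARE[scenario]
    total = handling + CREATION[scenario] + MISFILED[scenario]
    return {
        "handling": round(handling),
        "total": round(total),
        "total_pct_gdp": round(100 * total / GDP, 1),
        "creation_vs_handling": round(CREATION[scenario] / handling, 1),
    }


for name in ("low", "medium", "high"):
    print(name, burden(name))
```

<p>Because the sketch sums unrounded handling values, a recomputed total may differ from the published row by one unit of rounding. The creation-to-handling ratio it reports grows from the low to the high scenario, which is where the "2-3 times" observation for the low and medium cases comes from.</p>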
<h2><a name="_Toc106767216"></a>Archival Lifetime of ‘Valuable’ Documents</h2>
<p>The ‘low’ and ‘high’ estimates for documents in Table 2 and Table 3 assume that 2% and 5%, respectively, of internal documents have archival value. Were these percentages to be higher, the volume of documents requiring integration and access would likewise increase. The 2% value is derived from the UC Berkeley study,<a name="_ednref34"></a>[34] which also refers to an unpublished European study that places archival amounts at 10%. Unfortunately, there is little empirical information to support the degree to which documents deserve to be kept for archival purposes.</p>
<p>Assuming that documents may retain value for three to five years, the largest firms perhaps have as many as 4 million <em>internal</em> documents on average with enterprise-wide value. Firms with fewer employees generally have lower document counts. Archival percentages, however, are a tricky matter, since apparently 85% of all archived documents are never accessed.<a name="_ednref35"></a>[35]</p>
<h1><a name="_Toc106767217"></a>III. WEB DOCUMENTS AND SEARCH</h1>
<p>Various estimates by Cowles/Simba,<a name="_ednref36"></a>[36] Veronis, Suhler &amp; Associates,<a name="_ednref37"></a>[37] and Outsell<a name="_ednref38"></a>[38] place the current market for online business information in the $30 billion to $140 billion range, with significant projected growth. Outsell also indicates that marketing, sales, and product development professionals rely most heavily on information from the Internet for their daily decision making, based on a comparative study of Fortune 500 business professionals’ use of the open Web and fee-based desktop information content services.<a name="_ednref39"></a>[39] Clearly, relevant and targeted content, much of which resides online, is extremely valuable to enterprises.</p>
<p>UC Berkeley estimates that about 500 petabytes of new information was published on the Web in 2002,<sup>34</sup> based on original analysis conducted by BrightPlanet.<a name="_ednref40"></a>[40] The compound growth rate in Web documents has been on the order of 200% or more annually.<a name="_ednref41"></a>[41] Estimates for deep Web content range from about 6-8 times larger<a name="_ednref42"></a>[42] to 500 times larger<a name="_ednref43"></a><sup>40</sup> than standard “surface web” content. Internet content is overwhelming in size, highly variable in quality, growing at a rapid pace, and in large part ephemeral.</p>
<h2><a name="_Toc106767218"></a>Estimate of Time and Effort Devoted to Document Search</h2>
<p>According to a recent study by iProspect, about 56 percent of users use search engines every day, based on a population of which more than 70 percent use the Internet more than 10 hours per week. Professionals abandon a current search 38% of the time after inspecting only one results page (the listing of document result URLs), and overall 82% of users attempt another search if relevant results are not found within the first three results pages. Just 13 percent of users said that they use different search engines for different types of searches.<a name="_ednref44"></a>[43] Only 7.5 percent of Internet users said they refined their search with additional keywords in cases where they were unable to achieve satisfactory results.<a name="_ednref45"></a>[44]</p>
<p>The average knowledge worker spends 2.3 hours per day, or about 25% of work time, searching for critical job information.<a name="_ednref46"></a>[45] IDC estimates that an enterprise employing 1,000 knowledge workers wastes well over $6 million per year searching for information that does not exist, failing to find information that does, or recreating information that could have been found but was not.<a name="_ednref47"></a>[46] As that report stated, “It is simply impossible to create knowledge from information that cannot be found or retrieved.”</p>
<p>Vendors and customers often cite time savings by knowledge workers as a key rationale for a document or content initiative. Many studies over the years have noted that white collar employees spend a consistent 20% to 25% of their time seeking information; the premise is that more effective search will save time and reduce these percentages. As a sample calculation, each reduction of one percentage point in the share of work time devoted to search produces:</p>
<p>$50,000 (base salary) * 1.8 (burden rate) * 1.0% = $900 per employee</p>
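<p>The same sample calculation can be written as a small function so that other salaries, burden rates or time reductions can be tried. This is a minimal sketch: the function name <code>search_time_savings</code> and the 1,000-worker headcount are illustrative assumptions, not figures from the studies cited.</p>

```python
def search_time_savings(base_salary, burden_rate, time_reduction):
    """Annual value per employee of reducing time spent on search.

    time_reduction is the fraction of total work time saved (0.01 for 1%).
    """
    return base_salary * burden_rate * time_reduction


# Inputs from the sample calculation in the text.
per_employee = search_time_savings(50_000, 1.8, 0.01)
print(f"${per_employee:,.0f} per employee")

# Hypothetical headcount, used only to show how the figure scales.
knowledge_workers = 1_000
scaled = per_employee * knowledge_workers
print(f"${scaled:,.0f} per year for {knowledge_workers:,} knowledge workers")
```

<p>The result scales linearly in each input, so every additional percentage point of time saved adds the same amount per employee.</p>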
<p>The stable percentage of effort devoted to search over time suggests that it is the “satisficing” allocation. (In other words, knowledge workers are willing to devote about a quarter of their time to finding relevant information.) Thus, better discovery tools may lead to better information and to better, more productive decisions, which is a far more important justification in itself, but more efficient search may not yield a strict time or labor savings.<a name="_ednref48"></a>[47]</p>
<h2><a name="_Toc106767219"></a>Effect of Non-persistent Search Efforts</h2>
<p>The percentage of Web page visits that are re-visits is estimated at between 58%<a name="_ednref49"></a>[48] and 80%.<a name="_ednref50"></a>[49] While many of these re-visitations occur shortly after the first visit (<em>e.g</em>., during the same session using the back button), a significant number occur after a considerable amount of time has elapsed. Thus, it is not surprising that a survey of problems using the Web found “Not being able to find a page I know is out there,” and “Not being able to return to a page I once visited,” accounted for 17% of the problems reported, and that the most common problem using bookmarks was, “Changed content.”<a name="_ednref51"></a>[50] Depending on the content type, users use either “direct” or “indirect” approaches to re-find previously discovered information:</p>
<table style="text-align: left; margin-left: auto; margin-right: auto; width: 335px;" border="0" cellspacing="0" cellpadding="0">
<tbody>
<tr>
<td style="background-color: #cccccc; width: 205px;" valign="bottom"></td>
<td style="background-color: #cccccc; width: 65px;" valign="bottom">
<p align="center"><strong>Direct</strong></p>
</td>
<td style="background-color: #cccccc; width: 65px;" valign="bottom">
<p align="center"><strong>Indirect</strong></p>
</td>
</tr>
<tr>
<td style="width: 205px;" valign="bottom">Specific Information</td>
<td style="width: 65px;" valign="bottom">
<p align="right">42%</p>
</td>
<td style="width: 65px;" valign="bottom">
<p align="right">58%</p>
</td>
</tr>
<tr>
<td style="width: 205px;" valign="bottom">General Information</td>
<td style="width: 65px;" valign="bottom">
<p align="right">58%</p>
</td>
<td style="width: 65px;" valign="bottom">
<p align="right">43%</p>
</td>
</tr>
<tr>
<td style="width: 205px;" valign="bottom">Specific Documents</td>
<td style="width: 65px;" valign="bottom">
<p align="right">29%</p>
</td>
<td style="width: 65px;" valign="bottom">
<p align="right">71%</p>
</td>
</tr>
<tr>
<td style="width: 205px;" valign="bottom">Web Documents</td>
<td style="width: 65px;" valign="bottom">
<p align="right">77%</p>
</td>
<td style="width: 65px;" valign="bottom">
<p align="right">23%</p>
</td>
</tr>
<tr>
<td style="width: 205px;" valign="bottom">Emails</td>
<td style="width: 65px;" valign="bottom">
<p align="right">9%</p>
</td>
<td style="width: 65px;" valign="bottom">
<p align="right">91%</p>
</td>
</tr>
</tbody>
</table>
<p style="text-align: center;">Table 8. General Approaches to Re-finding Previously Discovered Information <a name="_ednref52"></a>[51]</p>
<p>Direct approaches require remembering or explicitly noting the location of the information. Direct approaches include: direct entry; emailing to self; emailing to others; printing out; saving as file; pasting the URL into a document; and posting to a personal Web site.</p>
<p>Indirect approaches include: searching; looking through bookmarks; and recalling from a history file. All of these indirect approaches are supported by modern browsers. Note that re-finding Web pages or documents relies heavily on having a record of a previously visited URL.</p>
<p>As a University of Washington study supported by Microsoft discovered, each of the direct and indirect techniques used to re-discover information has significant drawbacks with respect to the functions desired for the recall process:<a name="_ednref53"></a>[52]</p>
<table style="text-align: left; margin-left: auto; margin-right: auto; width: 624px;" border="0" cellspacing="0" cellpadding="0">
<tbody>
<tr>
<td style="background-color: #cccccc; width: 132px;" valign="bottom"></td>
<td style="background-color: #cccccc; width: 48px;"><strong>Portability</strong></td>
<td style="background-color: #cccccc; width: 47px;"><strong>No. of Access Points</strong></td>
<td style="background-color: #cccccc; width: 47px;"><strong>Persistence</strong></td>
<td style="background-color: #cccccc; width: 47px;"><strong>Preservation</strong></td>
<td style="background-color: #cccccc; width: 47px;"><strong>Currency</strong></td>
<td style="background-color: #cccccc; width: 47px;"><strong>Context</strong></td>
<td style="background-color: #cccccc; width: 55px;"><strong>Reminding</strong></td>
<td style="background-color: #cccccc; width: 54px;"><strong>Ease of Integration</strong></td>
<td style="background-color: #cccccc; width: 48px;"><strong>Communication</strong></td>
<td style="background-color: #cccccc; width: 54px;"><strong>Ease of Maintenance</strong></td>
</tr>
<tr>
<td style="background-color: #cccccc; width: 180px;" colspan="2" valign="bottom">
<p align="center"><strong><span style="text-decoration: underline;">DIRECT APPROACHES</span></strong></p>
</td>
<td style="background-color: #cccccc; width: 47px;" valign="bottom"></td>
<td style="background-color: #cccccc; width: 47px;" valign="bottom"></td>
<td style="background-color: #cccccc; width: 47px;" valign="bottom"></td>
<td style="background-color: #cccccc; width: 47px;" valign="bottom"></td>
<td style="background-color: #cccccc; width: 47px;" valign="bottom"></td>
<td style="background-color: #cccccc; width: 55px;" valign="bottom"></td>
<td style="background-color: #cccccc; width: 54px;" valign="bottom"></td>
<td style="background-color: #cccccc; width: 48px;" valign="bottom"></td>
<td style="background-color: #cccccc; width: 54px;" valign="bottom"></td>
</tr>
<tr>
<td style="width: 132px;" valign="bottom">Direct Entry</td>
<td style="width: 48px;" valign="bottom">
<p align="center">Low</p>
</td>
<td style="width: 47px;" valign="bottom">
<p align="center">High</p>
</td>
<td style="width: 47px;" valign="bottom">
<p align="center">Low</p>
</td>
<td style="width: 47px;" valign="bottom">
<p align="center">Med</p>
</td>
<td style="width: 47px;" valign="bottom">
<p align="center">High</p>
</td>
<td style="width: 47px;" valign="bottom">
<p align="center">Low</p>
</td>
<td style="width: 55px;" valign="bottom">
<p align="center">Low</p>
</td>
<td style="width: 54px;" valign="bottom">
<p align="center">?</p>
</td>
<td style="width: 48px;" valign="bottom">
<p align="center">Low</p>
</td>
<td style="width: 54px;" valign="bottom">
<p align="center">High</p>
</td>
</tr>
<tr>
<td style="width: 132px;" valign="bottom">Email to Self</td>
<td style="width: 48px;" valign="bottom">
<p align="center">Low</p>
</td>
<td style="width: 47px;" valign="bottom">
<p align="center">High</p>
</td>
<td style="width: 47px;" valign="bottom">
<p align="center">Low</p>
</td>
<td style="width: 47px;" valign="bottom">
<p align="center">Med</p>
</td>
<td style="width: 47px;" valign="bottom">
<p align="center">High</p>
</td>
<td style="width: 47px;" valign="bottom">
<p align="center">High</p>
</td>
<td style="width: 55px;" valign="bottom">
<p align="center">High</p>
</td>
<td style="width: 54px;" valign="bottom">
<p align="center">Med</p>
</td>
<td style="width: 48px;" valign="bottom">
<p align="center">Low</p>
</td>
<td style="width: 54px;" valign="bottom">
<p align="center">Med</p>
</td>
</tr>
<tr>
<td style="width: 132px;" valign="bottom">Email to Others</td>
<td style="width: 48px;" valign="bottom">
<p align="center">Low</p>
</td>
<td style="width: 47px;" valign="bottom">
<p align="center">High</p>
</td>
<td style="width: 47px;" valign="bottom">
<p align="center">Low</p>
</td>
<td style="width: 47px;" valign="bottom">
<p align="center">Med</p>
</td>
<td style="width: 47px;" valign="bottom">
<p align="center">High</p>
</td>
<td style="width: 47px;" valign="bottom">
<p align="center">High</p>
</td>
<td style="width: 55px;" valign="bottom">
<p align="center">Low</p>
</td>
<td style="width: 54px;" valign="bottom">
<p align="center">Low?</p>
</td>
<td style="width: 48px;" valign="bottom">
<p align="center">High</p>
</td>
<td style="width: 54px;" valign="bottom">
<p align="center">High</p>
</td>
</tr>
<tr>
<td style="width: 132px;" valign="bottom">Print-out</td>
<td style="width: 48px;" valign="bottom">
<p align="center">High</p>
</td>
<td style="width: 47px;" valign="bottom">
<p align="center">High</p>
</td>
<td style="width: 47px;" valign="bottom">
<p align="center">High</p>
</td>
<td style="width: 47px;" valign="bottom">
<p align="center">Low</p>
</td>
<td style="width: 47px;" valign="bottom">
<p align="center">Low</p>
</td>
<td style="width: 47px;" valign="bottom">
<p align="center">Low</p>
</td>
<td style="width: 55px;" valign="bottom">
<p align="center">High</p>
</td>
<td style="width: 54px;" valign="bottom">
<p align="center">Med</p>
</td>
<td style="width: 48px;" valign="bottom">
<p align="center">High</p>
</td>
<td style="width: 54px;" valign="bottom">
<p align="center">Med</p>
</td>
</tr>
<tr>
<td style="width: 132px;" valign="bottom">Save as File</td>
<td style="width: 48px;" valign="bottom">
<p align="center">Med?</p>
</td>
<td style="width: 47px;" valign="bottom">
<p align="center">Low?</p>
</td>
<td style="width: 47px;" valign="bottom">
<p align="center">High</p>
</td>
<td style="width: 47px;" valign="bottom">
<p align="center">High</p>
</td>
<td style="width: 47px;" valign="bottom">
<p align="center">Low</p>
</td>
<td style="width: 47px;" valign="bottom">
<p align="center">Low</p>
</td>
<td style="width: 55px;" valign="bottom">
<p align="center">Low</p>
</td>
<td style="width: 54px;" valign="bottom">
<p align="center">Med?</p>
</td>
<td style="width: 48px;" valign="bottom">
<p align="center">Low</p>
</td>
<td style="width: 54px;" valign="bottom">
<p align="center">Med</p>
</td>
</tr>
<tr>
<td style="width: 132px;" valign="bottom">Paste URL in Doc</td>
<td style="width: 48px;" valign="bottom">
<p align="center">Low</p>
</td>
<td style="width: 47px;" valign="bottom">
<p align="center">Low?</p>
</td>
<td style="width: 47px;" valign="bottom">
<p align="center">Low</p>
</td>
<td style="width: 47px;" valign="bottom">
<p align="center">Med</p>
</td>
<td style="width: 47px;" valign="bottom">
<p align="center">High</p>
</td>
<td style="width: 47px;" valign="bottom">
<p align="center">High</p>
</td>
<td style="width: 55px;" valign="bottom">
<p align="center">High?</p>
</td>
<td style="width: 54px;" valign="bottom">
<p align="center">High?</p>
</td>
<td style="width: 48px;" valign="bottom">
<p align="center">Low</p>
</td>
<td style="width: 54px;" valign="bottom">
<p align="center">High</p>
</td>
</tr>
<tr>
<td style="width: 132px;" valign="bottom">Personal Web Site</td>
<td style="width: 48px;" valign="bottom">
<p align="center">Low</p>
</td>
<td style="width: 47px;" valign="bottom">
<p align="center">High</p>
</td>
<td style="width: 47px;" valign="bottom">
<p align="center">Low</p>
</td>
<td style="width: 47px;" valign="bottom">
<p align="center">Med</p>
</td>
<td style="width: 47px;" valign="bottom">
<p align="center">High</p>
</td>
<td style="width: 47px;" valign="bottom">
<p align="center">High</p>
</td>
<td style="width: 55px;" valign="bottom">
<p align="center">High?</p>
</td>
<td style="width: 54px;" valign="bottom">
<p align="center">High</p>
</td>
<td style="width: 48px;" valign="bottom">
<p align="center">Med</p>
</td>
<td style="width: 54px;" valign="bottom">
<p align="center">High?</p>
</td>
</tr>
<tr>
<td style="background-color: #cccccc; width: 180px;" colspan="2" valign="bottom">
<p align="center"><strong><span style="text-decoration: underline;">INDIRECT APPROACHES</span></strong></p>
</td>
<td style="background-color: #cccccc; width: 47px;" valign="bottom"></td>
<td style="background-color: #cccccc; width: 47px;" valign="bottom"></td>
<td style="background-color: #cccccc; width: 47px;" valign="bottom"></td>
<td style="background-color: #cccccc; width: 47px;" valign="bottom"></td>
<td style="background-color: #cccccc; width: 47px;" valign="bottom"></td>
<td style="background-color: #cccccc; width: 55px;" valign="bottom"></td>
<td style="background-color: #cccccc; width: 54px;" valign="bottom"></td>
<td style="background-color: #cccccc; width: 48px;" valign="bottom"></td>
<td style="background-color: #cccccc; width: 54px;" valign="bottom"></td>
</tr>
<tr>
<td style="width: 132px;" valign="bottom">Search</td>
<td style="width: 48px;" valign="bottom">
<p align="center">Low</p>
</td>
<td style="width: 47px;" valign="bottom">
<p align="center">High</p>
</td>
<td style="width: 47px;" valign="bottom">
<p align="center">Low</p>
</td>
<td style="width: 47px;" valign="bottom">
<p align="center">Med</p>
</td>
<td style="width: 47px;" valign="bottom">
<p align="center">High</p>
</td>
<td style="width: 47px;" valign="bottom">
<p align="center">Low</p>
</td>
<td style="width: 55px;" valign="bottom">
<p align="center">Low</p>
</td>
<td style="width: 54px;" valign="bottom">
<p align="center">?</p>
</td>
<td style="width: 48px;" valign="bottom">
<p align="center">Low</p>
</td>
<td style="width: 54px;" valign="bottom">
<p align="center">High</p>
</td>
</tr>
<tr>
<td style="width: 132px;" valign="bottom">Bookmark</td>
<td style="width: 48px;" valign="bottom">
<p align="center">Low</p>
</td>
<td style="width: 47px;" valign="bottom">
<p align="center">Low</p>
</td>
<td style="width: 47px;" valign="bottom">
<p align="center">Low</p>
</td>
<td style="width: 47px;" valign="bottom">
<p align="center">Med</p>
</td>
<td style="width: 47px;" valign="bottom">
<p align="center">High</p>
</td>
<td style="width: 47px;" valign="bottom">
<p align="center">Low</p>
</td>
<td style="width: 55px;" valign="bottom">
<p align="center">Low</p>
</td>
<td style="width: 54px;" valign="bottom">
<p align="center">Low</p>
</td>
<td style="width: 48px;" valign="bottom">
<p align="center">Low</p>
</td>
<td style="width: 54px;" valign="bottom">
<p align="center">Low</p>
</td>
</tr>
<tr>
<td style="width: 132px;" valign="bottom">History</td>
<td style="width: 48px;" valign="bottom">
<p align="center">Low</p>
</td>
<td style="width: 47px;" valign="bottom">
<p align="center">Low</p>
</td>
<td style="width: 47px;" valign="bottom">
<p align="center">Low</p>
</td>
<td style="width: 47px;" valign="bottom">
<p align="center">Med</p>
</td>
<td style="width: 47px;" valign="bottom">
<p align="center">High</p>
</td>
<td style="width: 47px;" valign="bottom">
<p align="center">Low</p>
</td>
<td style="width: 55px;" valign="bottom">
<p align="center">Low</p>
</td>
<td style="width: 54px;" valign="bottom">
<p align="center">Low?</p>
</td>
<td style="width: 48px;" valign="bottom">
<p align="center">Low</p>
</td>
<td style="width: 54px;" valign="bottom">
<p align="center">?</p>
</td>
</tr>
</tbody>
</table>
<p style="text-align: center;">Table 9. Strengths and Weaknesses of Existing Techniques to Re-use Web Information</p>
<p>The general observation is that no present technique alone is able to keep search results persistent and current while maintaining context. These combined inadequacies mean that previously found information is not easily found again, or re-discovered, as the following table shows:</p>
<table style="text-align: left; margin-left: auto; margin-right: auto; width: 303px;" border="0" cellspacing="0" cellpadding="0">
<tbody>
<tr>
<td style="background-color: #cccccc; width: 238px;" valign="bottom"></td>
<td style="background-color: #cccccc; width: 65px;" valign="bottom">
<p align="center"><strong>Percent</strong></p>
</td>
</tr>
<tr>
<td style="width: 238px;" valign="bottom">Information No Longer Available</td>
<td style="width: 65px;" valign="bottom">
<p align="right">37%</p>
</td>
</tr>
<tr>
<td style="width: 238px;" valign="bottom">Re-tracing Path Fails</td>
<td style="width: 65px;" valign="bottom">
<p align="right">14%</p>
</td>
</tr>
<tr>
<td style="width: 238px;" valign="bottom">Time Length Since Last Find</td>
<td style="width: 65px;" valign="bottom">
<p align="right">9%</p>
</td>
</tr>
<tr>
<td style="width: 238px;" valign="bottom">Other Failure Reasons</td>
<td style="width: 65px;" valign="bottom">
<p align="right">9%</p>
</td>
</tr>
<tr>
<td style="width: 238px;" valign="bottom">
<p align="center"><strong>Total Information Lost</strong></p>
</td>
<td style="width: 65px;" valign="bottom">
<p align="right">68%</p>
</td>
</tr>
<tr>
<td style="width: 238px;" valign="bottom"></td>
<td style="width: 65px;" valign="bottom"></td>
</tr>
<tr>
<td style="width: 238px;" valign="bottom">Success Finding Lost Information</td>
<td style="width: 65px;" valign="bottom">
<p align="right">32%</p>
</td>
</tr>
</tbody>
</table>
<p style="text-align: center;">Table 10. Success in Finding Important Earlier Found Web Information <a name="_ednref54"></a>[53]</p>
<p>This table supports a number of important observations. First, some 37% of previously found information disappears from the Web, consistent with other findings that estimate that about 40% of all Web content disappears annually, some of which has historical or archival value.<a name="_ednref55"></a>[54]</p>
<p>Second, and most importantly, nearly 70% of previously found valuable information cannot be rediscovered. More than half of this problem arises because the information is no longer available on the Web, but other reasons relate to the inadequacies of recall techniques for finding previously discovered information.</p>
<p>These observations can translate into substantial costs on a per employee and per enterprise basis, as the table below shows:</p>
<table style="text-align: left; margin-left: auto; margin-right: auto; width: 615px;" border="0" cellspacing="0" cellpadding="0">
<tbody>
<tr>
<td style="background-color: #cccccc; width: 173px;" valign="bottom"></td>
<td style="background-color: #cccccc; width: 181px;" colspan="2" valign="bottom">
<p align="center"><strong><span style="text-decoration: underline;">Per Knowledge Worker</span></strong></p>
</td>
<td style="background-color: #cccccc; width: 136px;" valign="bottom">
<p align="center"><strong>Per ‘Large’</strong></p>
</td>
<td style="background-color: #cccccc; width: 125px;" valign="bottom">
<p align="center"><strong>All</strong></p>
</td>
</tr>
<tr>
<td style="background-color: #cccccc; width: 173px;" valign="bottom"></td>
<td style="background-color: #cccccc; width: 97px;" valign="bottom">
<p align="center"><strong>Per Doc</strong></p>
</td>
<td style="background-color: #cccccc; width: 84px;" valign="bottom">
<p align="center"><strong>All Docs</strong></p>
</td>
<td style="background-color: #cccccc; width: 136px;" valign="bottom">
<p align="center"><strong>Enterprise ($000)</strong></p>
</td>
<td style="background-color: #cccccc; width: 125px;" valign="bottom">
<p align="center"><strong>Enterprises ($M)</strong></p>
</td>
</tr>
<tr>
<td style="width: 173px;" valign="bottom">Re-finding Documents</td>
<td style="width: 97px;" valign="bottom">
<p align="right">$148.54</p>
</td>
<td style="width: 84px;" valign="bottom">
<p align="right">$585</p>
</td>
<td style="width: 136px;" valign="bottom">
<p align="right">$3,547</p>
</td>
<td style="width: 125px;" valign="bottom">
<p align="right">$12,103</p>
</td>
</tr>
<tr>
<td style="width: 173px;" valign="bottom"></td>
<td style="width: 97px;" valign="bottom"></td>
<td style="width: 84px;" valign="bottom"></td>
<td style="width: 136px;" valign="bottom"></td>
<td style="width: 125px;" valign="bottom"></td>
</tr>
<tr>
<td style="width: 173px;" valign="bottom">Re-creating Documents</td>
<td style="width: 97px;" valign="bottom">
<p align="right">$384.11</p>
</td>
<td style="width: 84px;" valign="bottom">
<p align="right">$1,008</p>
</td>
<td style="width: 136px;" valign="bottom">
<p align="right">$6,114</p>
</td>
<td style="width: 125px;" valign="bottom">
<p align="right">$20,864</p>
</td>
</tr>
<tr>
<td style="width: 173px;" valign="bottom"></td>
<td style="width: 97px;" valign="bottom"></td>
<td style="width: 84px;" valign="bottom"></td>
<td style="width: 136px;" valign="bottom"></td>
<td style="width: 125px;" valign="bottom"></td>
</tr>
<tr>
<td style="width: 173px;" valign="bottom">TOTAL</td>
<td style="width: 97px;" valign="bottom"></td>
<td style="width: 84px;" valign="bottom">
<p align="right">$1,593</p>
</td>
<td style="width: 136px;" valign="bottom">
<p align="right">$9,661</p>
</td>
<td style="width: 125px;" valign="bottom">
<p align="right">$32,967</p>
</td>
</tr>
</tbody>
</table>
<p style="text-align: center;">Table 11. ‘Cost’ of Not Readily Re-finding Valuable Web Information</p>
<p>This analysis assumes that some previously found information of value is again re-found (60%), but some is also not re-found and must be re-created (40%).<a name="_ednref56"></a>[55] The ‘large’ enterprise is identical to the definition in Table 2 (which is also nearly equivalent to a Fortune 1000 company).<a name="_ednref57"></a>[56]</p>
<p>The analysis indicates that poor methods to recall previously found and valuable Web documents may cost $1,600 per knowledge worker per year. This translates into nearly a $10 million productivity loss for the largest enterprises, or nearly $33 billion across all U.S. industries.</p>
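<p>The arithmetic behind Table 11 can be checked with a short sketch. The Python below is illustrative only: the variable and function names are assumptions, the inputs are the rounded figures printed in the table, and the implied document counts are back-calculated from those figures rather than taken from the underlying analysis.</p>

```python
# Rounded inputs copied from Table 11 (per knowledge worker, per year).
PER_DOC_COST = {"re-finding": 148.54, "re-creating": 384.11}
ALL_DOCS_COST = {"re-finding": 585.0, "re-creating": 1008.0}

def implied_docs(per_doc, all_docs):
    """Back out the number of documents implied by the two cost columns."""
    return {k: all_docs[k] / per_doc[k] for k in per_doc}

def refound_share(docs):
    """Share of previously found documents that are re-found, not re-created."""
    return docs["re-finding"] / sum(docs.values())

docs = implied_docs(PER_DOC_COST, ALL_DOCS_COST)
total_per_worker = sum(ALL_DOCS_COST.values())
share = refound_share(docs)

print(f"Implied documents per worker: {sum(docs.values()):.1f}")
print(f"Re-found share: {share:.0%}")
print(f"Total cost per worker: ${total_per_worker:,.0f}")
```

<p>Run against the printed figures, the re-found share comes out at roughly 60%, in line with the stated 60/40 assumption, and the two "All Docs" entries sum to the $1,593 total.</p>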
<p>In relation to the total document costs noted in Table 7 above, these may seem to be comparatively small numbers. However, when viewed in the context of unproductive standard Web search, they indicate important failings in the ability to recall previously found valuable results from searches and their attendant productivity losses.</p>
<h2><a name="_Toc106767220"></a>‘Cost’ of Creating and Maintaining a Document Category Portal</h2>
<p>Users, administrators and industry analysts alike recognize the importance of placing content into logical, intuitive and hierarchically organized categories. About 60% of knowledge workers note that search is a difficult process, made all the more difficult without a logical organization to content.<a name="_ednref58"></a>[57] While technical distinctions exist, these logical structures organized into a hierarchical presentation are most often referred to as “taxonomies,” though other terms such as ontology, subject directory, subject tree, directory structure or classification schema may be used.</p>
<p>Delphi Group’s research with corporate Web sites points to the lack of organized information as the number one problem in the opinion of business professionals. More than three-quarters of the surveyed corporations indicated that a taxonomy or classification system for documents is imperative or somewhat important to their business strategy; more than one-third of firms that classify documents still use manual techniques.<sup>57</sup> Hierarchical arrangements of categorized subjects trigger associations and relationships that are not obvious when simply searching keywords. Other advantages cited for the taxonomic presentation of documents are the greater likelihood of discovery, ease-of-use, overcoming the difficulty of formulating effective search queries, being able to search only within related documents, discovery of relationships among similar terminology and concepts, and user satisfaction.<a name="_ednref59"></a>[58],<a name="_ednref60"></a>[59]</p>
<p>From the user standpoint, knowledge workers want to impose taxonomic order on document chaos, but only if the taxonomy models their domain accurately. They also want software to assist with categorizing, as long as it respects the taxonomy they created. Finally, the results of these category placements should be presented via a portal. Thus, as the common concern across all requirements, the taxonomy takes on tremendous importance for an application’s success.<a name="_ednref61"></a>[60]</p>
<p><img class="center_ok" src="../wp-content/themes/ai3/images/DocValue/Figure2.gif" alt="Large firm documents" width="447" height="295" /></p>
<p style="text-align: center;">Figure 2. Typical Large Firm Documents, Thousands</p>
<p>Enterprises that have adopted directory structures for content management are not yet achieving enterprise-wide relevance, presenting on average 1% of all relevant documents in an organized portal view. These limitations appear to be driven by weaknesses in the technology and high costs associated with conventional approaches:</p>
<ul>
<li><em>Comprehensiveness and Scale </em> –  according to a market report published by Plumtree in 2003, the average document portal contains about 37,000 documents.<a name="_ednref62"></a>[61] This was an increase from a 2002 Plumtree survey that indicated average document counts of 18,000.<a name="_ednref63"></a>[62] However, about 60% of respondents to a Delphi Group survey said they had more than 50,000 internal documents in their portal environment (generally the department level),<sup> 3</sup> and as Table 2 indicates above, most of the largest firms likely have millions or more<em> internal</em> documents deserving of common access and archiving.</li>
<li>The left-hand bar in Figure 2 indicates current averages for documents in existing content portals. The right-hand (yellow and orange) bar indicates potential based on high and low estimates. The ‘Archive’ case (middle bar) shows the same values as provided in Table 2, and represents a conservative view of “archival-likely” documents. The right bar is a more representative view of actual current <em>internal </em>content that enterprises may want to make available to their employees.<a name="_ednref64"></a>[63] Two observations have merit: 1) under current practice, enterprises are making at most 10% of their useful documents available, and more likely slightly over 1%; 2) the documents that are being made available are solely internal, and neglect potentially important external sources that would increase document counts considerably.</li>
<li><em>Implementation Times </em> –  though the average time to stand up a new content installation is about 6 months, there is also a 22% risk that deployment time exceeds that and an 8% risk that it takes longer than one year. Furthermore, the internal staff necessary for initial stand-up averages nearly 14 people (6 of whom are strictly devoted to content development), with the potential for much larger head counts.<a name="_ednref65"></a>[64]</li>
<li><em>Ongoing Maintenance and Staffing Costs </em> – ongoing maintenance and staffing costs typically exceed the initial deployment effort. This trend is perhaps not surprising in that once a valuable content portal has been created there will be demands to expand its scope and coverage. Based on these various factors, Table 12 summarizes set-up, ongoing maintenance and key metrics for today’s conventional approaches versus what BrightPlanet can do (the BrightPlanet document count is based on a ‘typical’ installation; there are no practical scale limits)</li>
</ul>
<table style="text-align: left; margin-left: auto; margin-right: auto; width: 568px;" border="0" cellspacing="0" cellpadding="0">
<tbody>
<tr>
<td style="background-color: #cccccc; width: 120px;" valign="bottom"></td>
<td style="background-color: #cccccc; width: 98px;" valign="bottom">
<p align="center"><strong>DOCUMENT</strong></p>
</td>
<td style="background-color: #cccccc; width: 187px;" colspan="3" valign="bottom">
<p align="center"><strong>INITIAL SET-UP</strong></p>
</td>
<td style="background-color: #cccccc; width: 163px;" colspan="2" valign="bottom">
<p align="center"><strong>MAINTENANCE</strong></p>
</td>
</tr>
<tr>
<td style="background-color: #cccccc; width: 120px;"></td>
<td style="background-color: #cccccc; width: 98px;">
<p align="center"><strong>BASIS</strong></p>
</td>
<td style="background-color: #cccccc; width: 64px;">
<p align="center"><strong>Staff</strong></p>
</td>
<td style="background-color: #cccccc; width: 49px;">
<p align="center"><strong>Mos</strong></p>
</td>
<td style="background-color: #cccccc; width: 73px;">
<p align="center"><strong>$/Doc</strong></p>
</td>
<td style="background-color: #cccccc; width: 73px;">
<p align="center"><strong>Staff</strong></p>
</td>
<td style="background-color: #cccccc; width: 91px;">
<p align="center"><strong>$/Doc</strong></p>
</td>
</tr>
<tr>
<td style="width: 120px;" valign="bottom">Current Practice</td>
<td style="width: 98px;" valign="bottom">
<p align="right">37,000</p>
</td>
<td style="width: 64px;" valign="bottom">
<p align="right">6.2</p>
</td>
<td style="width: 49px;" valign="bottom">
<p align="right">5.4</p>
</td>
<td style="width: 73px;" valign="bottom">
<p align="right">$4.861</p>
</td>
<td style="width: 73px;" valign="bottom">
<p align="right">6.4</p>
</td>
<td style="width: 91px;" valign="bottom">
<p align="right">$11.278</p>
</td>
</tr>
<tr>
<td style="width: 120px;" valign="bottom">BrightPlanet</td>
<td style="width: 98px;" valign="bottom">
<p align="right">250,000</p>
</td>
<td style="width: 64px;" valign="bottom">
<p align="right">1.0</p>
</td>
<td style="width: 49px;" valign="bottom">
<p align="right">0.8</p>
</td>
<td style="width: 73px;" valign="bottom">
<p align="right">$0.017</p>
</td>
<td style="width: 73px;" valign="bottom">
<p align="right">0.3</p>
</td>
<td style="width: 91px;" valign="bottom">
<p align="right">$0.078</p>
</td>
</tr>
<tr>
<td style="width: 120px;" valign="bottom"></td>
<td style="width: 98px;" valign="bottom"></td>
<td style="width: 64px;" valign="bottom"></td>
<td style="width: 49px;" valign="bottom"></td>
<td style="width: 73px;" valign="bottom"></td>
<td style="width: 73px;" valign="bottom"></td>
<td style="width: 91px;" valign="bottom"></td>
</tr>
<tr>
<td style="width: 120px;" valign="bottom">BP Advantage</td>
<td style="width: 98px;" valign="bottom">
<p align="center"><strong>6.8 x + up</strong></p>
</td>
<td style="width: 64px;" valign="bottom">
<p align="center"><strong>6.2 x</strong></p>
</td>
<td style="width: 49px;" valign="bottom">
<p align="center"><strong>6.7 x</strong></p>
</td>
<td style="width: 73px;" valign="bottom">
<p align="center"><strong>280.4 x</strong></p>
</td>
<td style="width: 73px;" valign="bottom">
<p align="center"><strong>21.4 x</strong></p>
</td>
<td style="width: 91px;" valign="bottom">
<p align="center"><strong>144.6 x</strong></p>
</td>
</tr>
</tbody>
</table>
<p style="text-align: center;">Table 12. Staff, Time and per Document Costs for Categorized Document Portals</p>
<ul>
<li>The content staff level estimates in the table are consistent with anecdotal information and with a survey of 40 installations that found there were on average 14 content development staff managing each enterprise’s content portal.<a name="_ednref66"></a>[65]</li>
</ul>
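<p>The "BP Advantage" row of Table 12 is a set of simple ratios between the two rows above it. The sketch below recomputes them from the printed entries; the column names and the function are hypothetical labels chosen for illustration. Because the printed inputs are rounded, a few recomputed ratios differ slightly from the published row (for example, the set-up $/Doc ratio), presumably because the original used unrounded values.</p>

```python
# Rounded rows copied from Table 12: (documents, setup staff, setup months,
# setup $/doc, maintenance staff, maintenance $/doc).
COLUMNS = ("documents", "setup_staff", "setup_months",
           "setup_cost_per_doc", "maint_staff", "maint_cost_per_doc")
CURRENT_PRACTICE = (37_000, 6.2, 5.4, 4.861, 6.4, 11.278)
BRIGHTPLANET = (250_000, 1.0, 0.8, 0.017, 0.3, 0.078)

def advantage_ratios(current, alternative):
    """Return a 'times better' factor for each column.

    More documents is better, so that column divides alternative by current;
    every other column is a cost, so it divides current by alternative.
    """
    ratios = {}
    for name, cur, alt in zip(COLUMNS, current, alternative):
        ratios[name] = alt / cur if name == "documents" else cur / alt
    return ratios

ratios = advantage_ratios(CURRENT_PRACTICE, BRIGHTPLANET)
for name, value in ratios.items():
    print(f"{name}: {value:.1f} x")
```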
<p>Though conventional approaches to content integration seem to lead to high per document set-up and maintenance costs, these should be contrasted with standard practice that suggests it may cost on average $25 to $40 per document simply for filing.<sup>29</sup> Indeed, labor costs can account for up to 30% of total document handling costs.<sup>28</sup> Nonetheless, at $5 to $11 per document for content management alone, this could result in no actual cost savings if electronic access does not displace current filing practices. When multiplied across all enterprise documents, these uncertainties can translate into huge swings in costs or benefits for a content portal initiative.</p>
<ul>
<li><em>Software License v</em>.<em> Full Project Costs</em> – according to Charles Phillips of Morgan Stanley, only 30% of the money spent on major software projects goes to the actual purchase of commercially packaged software. Another third goes to internal software development by companies. The remaining 37% goes to third-party consultants.<a name="_ednref67"></a>[66] In evaluating a commitment, internal staff and consulting time should be carefully scrutinized. Efficiencies in initial deployment and ongoing support are the biggest cost drivers</li>
<li><em>Internal PLUS External Sources</em> – weaknesses in scalability and high implementation costs often lead to a dismissal of the importance of integrating internal plus external content. Few installations address relevant content external to the enterprise that is essential to achieving its missions. Granted, the increase in scale associated with external content is large, but for some businesses integration with external content may be essential.</li>
</ul>
<p>While other vendors claim fast categorization times, what they fail to mention is the lengthy pre-processing times necessary for generating their categorization metatags. According to Forrester Research, some of these metatagging systems can only process five to 15 documents per hour!<a name="_ednref68"></a>[67]</p>
<h2><a name="_Toc106767221"></a>‘Cost’ of Inaccessible or Hidden Intranet Sites</h2>
<p>In 2003, the portal vendor Plumtree noticed a new trend that it called “Web sprawl,” by which it meant the costly proliferation of Web applications, intranets and extranets.<a name="_ednref69"></a>[68] BEA has taken up this trend as a major thrust to its Web service offerings through an approach it calls “enterprise portal rationalization” (EPR).<a name="_ednref70"></a>[69] According to BEA, its architectural offerings are meant to control the “metastasizing” of corporate Web sites.</p>
<p>How common and to what scale is the proliferation of enterprise Web sites? I have not been able to find any comprehensive studies on this topic, but have been able to find many anecdotal examples. The proliferation, in fact, began as soon as the Internet became popular:</p>
<ul>
<li>As reported in 2000, Intel had more than 1 million URLs on its intranet with more than 100 new Web sites being introduced each month<a name="_ednref71"></a>[70]</li>
<li>In 2002, IBM consolidated over 8,000 intranet sites, 680 ‘major’ sites, 11 million Web pages and 5,600 domain names into what it calls the IBM Dynamic Workplaces, or W3 to employees<a name="_ednref72"></a>[71]</li>
<li>Silicon Graphics’ ‘Silicon Junction’ company-wide portal serves 7,200 employees with 144,000 Web pages consolidated from more than 800 internal Web sites<a name="_ednref73"></a>[72]</li>
<li>Hewlett-Packard Co. has sliced the number of internal Web sites it runs from 4,700 (1,000 for employee training, 3,000 for HR) to 2,600, and it makes them all accessible from one home, @HP <a name="_ednref74"></a>[73]<sup>,<a name="_ednref75"></a>[74]</sup></li>
<li>Avaya Corporation is now consolidating more than 800 internal Web sites globally<a name="_ednref76"></a>[75]</li>
<li>The <em>Wall Street Journal</em> recently reported that AT&amp;T has 10 information architects on staff to maintain its 3,600 intranet sites that contain 1.5 million public Web pages<a name="_ednref77"></a>[76]</li>
<li>The new Department of Homeland Security is faced with the challenge of consolidating more than 3,000 databases inherited from its various constituent agencies.<a name="_ednref78"></a>[77]</li>
</ul>
<p>BrightPlanet’s customers confirm these trends, with indicators of hundreds if not thousands of internal Web sites common in the largest companies. Indeed, it is surprising how many instances there are where corporate IT does not even know the full extent of Web site proliferation. The problem is likely much greater than realized:</p>
<table style="text-align: left; margin-left: auto; margin-right: auto; width: 586px;" border="0" cellspacing="0" cellpadding="0">
<tbody>
<tr>
<td style="background-color: #cccccc; width: 306px;" valign="bottom"></td>
<td style="background-color: #cccccc; width: 84px;" valign="bottom">
<p align="center"><strong>Low</strong></p>
</td>
<td style="background-color: #cccccc; width: 92px;" valign="bottom">
<p align="center"><strong>Med</strong></p>
</td>
<td style="background-color: #cccccc; width: 103px;" valign="bottom">
<p align="center"><strong>High</strong></p>
</td>
</tr>
<tr>
<td style="width: 306px;" valign="bottom">Number of Large Firms</td>
<td style="width: 84px;" valign="bottom">
<p align="right">930</p>
</td>
<td style="width: 92px;" valign="bottom">
<p align="right">1,500</p>
</td>
<td style="width: 103px;" valign="bottom">
<p align="right">3,000</p>
</td>
</tr>
<tr>
<td style="width: 306px;" valign="bottom">Ave Number of Web Sites per Firm</td>
<td style="width: 84px;" valign="bottom">
<p align="right">100</p>
</td>
<td style="width: 92px;" valign="bottom">
<p align="right">500</p>
</td>
<td style="width: 103px;" valign="bottom">
<p align="right">900</p>
</td>
</tr>
<tr>
<td style="width: 306px;" valign="bottom">Ave. Number of Documents per Web Site</td>
<td style="width: 84px;" valign="bottom">
<p align="right">100</p>
</td>
<td style="width: 92px;" valign="bottom">
<p align="right">350</p>
</td>
<td style="width: 103px;" valign="bottom">
<p align="right">1,500</p>
</td>
</tr>
<tr>
<td style="width: 306px;" valign="bottom">Total Large Firm Web Sites</td>
<td style="width: 84px;" valign="bottom">
<p align="right">93,000</p>
</td>
<td style="width: 92px;" valign="bottom">
<p align="right">750,000</p>
</td>
<td style="width: 103px;" valign="bottom">
<p align="right">2,700,000</p>
</td>
</tr>
<tr>
<td style="width: 306px;" valign="bottom">Percentage of Known Web Sites</td>
<td style="width: 84px;" valign="bottom">
<p align="right">85%</p>
</td>
<td style="width: 92px;" valign="bottom">
<p align="right">60%</p>
</td>
<td style="width: 103px;" valign="bottom">
<p align="right">40%</p>
</td>
</tr>
<tr>
<td style="width: 306px;" valign="bottom">Percentage of Doc Federation for Known Sites</td>
<td style="width: 84px;" valign="bottom">
<p align="right">50%</p>
</td>
<td style="width: 92px;" valign="bottom">
<p align="right">10%</p>
</td>
<td style="width: 103px;" valign="bottom">
<p align="right">2%</p>
</td>
</tr>
<tr>
<td style="width: 306px;" valign="bottom"></td>
<td style="width: 84px;" valign="bottom"></td>
<td style="width: 92px;" valign="bottom"></td>
<td style="width: 103px;" valign="bottom"></td>
</tr>
<tr>
<td style="width: 306px;" valign="bottom"><strong><span style="text-decoration: underline;">Site Development &amp; Maintenance</span></strong></td>
<td style="width: 84px;" valign="bottom"></td>
<td style="width: 92px;" valign="bottom"></td>
<td style="width: 103px;" valign="bottom"></td>
</tr>
<tr>
<td style="width: 306px;" valign="bottom">Development Cost per Web Site</td>
<td style="width: 84px;" valign="bottom">
<p align="right">$300</p>
</td>
<td style="width: 92px;" valign="bottom">
<p align="right">$1,701</p>
</td>
<td style="width: 103px;" valign="bottom">
<p align="right">$9,000</p>
</td>
</tr>
<tr>
<td style="width: 306px;" valign="bottom">Annual Maintenance Cost per Site</td>
<td style="width: 84px;" valign="bottom">
<p align="right">$800</p>
</td>
<td style="width: 92px;" valign="bottom">
<p align="right">$3,947</p>
</td>
<td style="width: 103px;" valign="bottom">
<p align="right">$21,000</p>
</td>
</tr>
<tr>
<td style="width: 306px;" valign="bottom">Total Yr 1 Cost per Site</td>
<td style="width: 84px;" valign="bottom">
<p align="right">$1,100</p>
</td>
<td style="width: 92px;" valign="bottom">
<p align="right">$5,649</p>
</td>
<td style="width: 103px;" valign="bottom">
<p align="right">$30,000</p>
</td>
</tr>
<tr>
<td style="width: 306px;" valign="bottom"></td>
<td style="width: 84px;" valign="bottom"></td>
<td style="width: 92px;" valign="bottom"></td>
<td style="width: 103px;" valign="bottom"></td>
</tr>
<tr>
<td style="width: 306px;" valign="bottom">Total Yr 1 per Large Firm Costs ($000)</td>
<td style="width: 84px;" valign="bottom">
<p align="right">$110</p>
</td>
<td style="width: 92px;" valign="bottom">
<p align="right">$2,824</p>
</td>
<td style="width: 103px;" valign="bottom">
<p align="right">$27,000</p>
</td>
</tr>
<tr>
<td style="width: 306px;" valign="bottom">Total Yr 1 Large Firm Costs ($M)</td>
<td style="width: 84px;" valign="bottom">
<p align="right">$102</p>
</td>
<td style="width: 92px;" valign="bottom">
<p align="right">$4,237</p>
</td>
<td style="width: 103px;" valign="bottom">
<p align="right">$81,000</p>
</td>
</tr>
<tr>
<td style="width: 306px;" valign="bottom"></td>
<td style="width: 84px;" valign="bottom"></td>
<td style="width: 92px;" valign="bottom"></td>
<td style="width: 103px;" valign="bottom"></td>
</tr>
<tr>
<td style="width: 306px;" valign="bottom"><strong><span style="text-decoration: underline;">‘Cost’ of Unfound Documents</span></strong></td>
<td style="width: 84px;" valign="bottom"></td>
<td style="width: 92px;" valign="bottom"></td>
<td style="width: 103px;" valign="bottom"></td>
</tr>
<tr>
<td style="width: 306px;" valign="bottom">No. of Unknown Documents per Firm</td>
<td style="width: 84px;" valign="bottom">
<p align="right">5,750</p>
</td>
<td style="width: 92px;" valign="bottom">
<p align="right">80,500</p>
</td>
<td style="width: 103px;" valign="bottom">
<p align="right">820,800</p>
</td>
</tr>
<tr>
<td style="width: 306px;" valign="bottom">Total Number of Large Firm Unknown Docs</td>
<td style="width: 84px;" valign="bottom">
<p align="right">5,347,500</p>
</td>
<td style="width: 92px;" valign="bottom">
<p align="right">120,750,000</p>
</td>
<td style="width: 103px;" valign="bottom">
<p align="right">2,462,400,000</p>
</td>
</tr>
<tr>
<td style="width: 306px;" valign="bottom"></td>
<td style="width: 84px;" valign="bottom"></td>
<td style="width: 92px;" valign="bottom"></td>
<td style="width: 103px;" valign="bottom"></td>
</tr>
<tr>
<td style="width: 306px;" valign="bottom">Total Cost per Web Site</td>
<td style="width: 84px;" valign="bottom">
<p align="right">$6,900</p>
</td>
<td style="width: 92px;" valign="bottom">
<p align="right">$23,915</p>
</td>
<td style="width: 103px;" valign="bottom">
<p align="right">$350,310</p>
</td>
</tr>
<tr>
<td style="width: 306px;" valign="bottom">Cost of Unknown Docs per Firm ($000)</td>
<td style="width: 84px;" valign="bottom">
<p align="right">$690</p>
</td>
<td style="width: 92px;" valign="bottom">
<p align="right">$11,958</p>
</td>
<td style="width: 103px;" valign="bottom">
<p align="right">$315,279</p>
</td>
</tr>
<tr>
<td style="width: 306px;" valign="bottom">Total Cost of Large Firm Unknown Docs ($M)</td>
<td style="width: 84px;" valign="bottom">
<p align="right">$642</p>
</td>
<td style="width: 92px;" valign="bottom">
<p align="right">$17,937</p>
</td>
<td style="width: 103px;" valign="bottom">
<p align="right">$945,837</p>
</td>
</tr>
<tr>
<td style="width: 306px;" valign="bottom"></td>
<td style="width: 84px;" valign="bottom"></td>
<td style="width: 92px;" valign="bottom"></td>
<td style="width: 103px;" valign="bottom"></td>
</tr>
<tr>
<td style="width: 306px;" valign="bottom"><strong><span style="text-decoration: underline;">Summary</span></strong></td>
<td style="width: 84px;" valign="bottom"></td>
<td style="width: 92px;" valign="bottom"></td>
<td style="width: 103px;" valign="bottom"></td>
</tr>
<tr>
<td style="width: 306px;" valign="bottom">Total Cost per Firm ($000)</td>
<td style="width: 84px;" valign="bottom">
<p align="right">$800</p>
</td>
<td style="width: 92px;" valign="bottom">
<p align="right">$14,782</p>
</td>
<td style="width: 103px;" valign="bottom">
<p align="right">$342,279</p>
</td>
</tr>
<tr>
<td style="width: 306px;" valign="bottom">Total Cost all Large Firms ($M)</td>
<td style="width: 84px;" valign="bottom">
<p align="right">$744</p>
</td>
<td style="width: 92px;" valign="bottom">
<p align="right">$22,173</p>
</td>
<td style="width: 103px;" valign="bottom">
<p align="right">$1,026,837</p>
</td>
</tr>
<tr>
<td style="width: 306px;" valign="bottom"></td>
<td style="width: 84px;" valign="bottom"></td>
<td style="width: 92px;" valign="bottom"></td>
<td style="width: 103px;" valign="bottom"></td>
</tr>
<tr>
<td style="width: 306px;" valign="bottom">Development as % of Total Costs</td>
<td style="width: 84px;" valign="bottom">
<p align="right">14%</p>
</td>
<td style="width: 92px;" valign="bottom">
<p align="right">19%</p>
</td>
<td style="width: 103px;" valign="bottom">
<p align="right">8%</p>
</td>
</tr>
<tr>
<td style="width: 306px;" valign="bottom">Unfound Documents as % of Total Costs</td>
<td style="width: 84px;" valign="bottom">
<p align="right">86%</p>
</td>
<td style="width: 92px;" valign="bottom">
<p align="right">81%</p>
</td>
<td style="width: 103px;" valign="bottom">
<p align="right">92%</p>
</td>
</tr>
</tbody>
</table>
<p style="text-align: center;">Table 13. Development and Unfound Document ‘Costs’ for Large Firms due to Web Sprawl</p>
<p>Table 13 consolidates previous information to estimate what the ‘costs’ of Web sprawl might be to larger firms (analogous to the Fortune 1000). The table presents Low, Medium and High estimates for number of Web sites per firm, known and unknown documents in each, and associated costs for initial site development and first-year maintenance plus the value of unfound information. The Medium category uses the average values from previous tables. The Low and High values bracket these amounts based on distribution of known values and expert judgment.</p>
<p>The table indicates, as a mid-range estimate, that an individual Web site for a large enterprise may cost about $6,000 to set up and maintain in the first year and represents $24,000 in opportunity costs due to unknown or unfound documents. For the average large enterprise across all Web sites, these costs may be $2.8 million and $12.0 million, respectively. Across all large firms, total costs due to Web sprawl may be on the order of $22 billion.</p>
<p>While site development and maintenance costs are not trivial, exceeding $4 billion for all large firms (which can also be significantly reduced  – see previous section), the major cost impact comes from the inability to find or federate the information that is available. Unfound documents represent <strong><em><span style="text-decoration: underline;">well in excess of 80%</span></em></strong> of the costs associated with Web sprawl.</p>
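<p>The roll-up in Table 13 can be reproduced from its per-site rows. The sketch below is a minimal illustration, not the original spreadsheet: the function and key names are assumptions, and the inputs are the printed (rounded) table values, so totals can differ from the table by a unit or so.</p>

```python
# Inputs copied from Table 13 for each estimate:
# (large firms, sites per firm, development $/site, maintenance $/site,
#  unfound-document 'cost' $/site).
ESTIMATES = {
    "low": (930, 100, 300, 800, 6_900),
    "med": (1_500, 500, 1_701, 3_947, 23_915),
    "high": (3_000, 900, 9_000, 21_000, 350_310),
}

def sprawl_costs(firms, sites, dev, maint, unfound):
    """Roll per-site costs up to per-firm ($000) and all-firm ($M) totals."""
    site_costs_k = sites * (dev + maint) / 1_000   # year-1 development + maintenance
    unfound_k = sites * unfound / 1_000            # opportunity cost of unfound docs
    total_k = site_costs_k + unfound_k
    return {
        "site_costs_per_firm_k": site_costs_k,
        "unfound_per_firm_k": unfound_k,
        "total_per_firm_k": total_k,
        "total_all_firms_m": firms * total_k / 1_000,
        "unfound_share": unfound_k / total_k,
    }

results = {name: sprawl_costs(*row) for name, row in ESTIMATES.items()}
for name, r in results.items():
    print(f"{name}: ${r['total_per_firm_k']:,.0f}K per firm, "
          f"${r['total_all_firms_m']:,.0f}M overall, "
          f"{r['unfound_share']:.0%} from unfound documents")
```

<p>Under these inputs the unfound-document share is above 80% in all three cases, which is the pattern the summary rows of the table report.</p>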
<p>The Web sprawl situation is analogous to other major technology shifts. For example, in the early 1980s, IT grappled mightily with the proliferation of personal computers. Centralized control was impossible in that circumstance because individuals and departments recognized the productivity benefits to be gained by PCs. Only when enterprise-capable vendors of networking technology, such as Novell, were able to offer integration solutions was the corporation able to control and fully exploit the PC’s technology potential.</p>
<p>The proliferation of internal enterprise Web sites is responding to similar drivers: innovation, customer service, or superior methods of product or solutions delivery. Ambitious mid-level managers will continue to exploit these advantages by “cowboy” additions of more corporate Web sites, and that is likely to the good for most enterprises. Gaining control and fully realizing the value of this Web site proliferation  – while not stymieing innovation  – will likely require enabling technology analogous to the networking of PCs.</p>
<h1><a name="_Toc106767222"></a>IV. OPPORTUNITIES AND THREATS</h1>
<p>The previous analysis has focused on more-or-less direct costs and drivers. These impacts are huge and deserve proper consideration. But there are other implications from the inability to access and manage relevant document information. These implications fall into the categories of lost opportunities, liabilities, or non-compliance. These implications often far outweigh the direct costs in their bottom-line impacts. This section presents only a few of these many opportunities.</p>
<h2><a name="_Toc106767223"></a>‘Costs’ and Opportunity Costs of Winning Proposals</h2>
<p>Competitive proposals are an important revenue factor to hundreds of thousands of businesses. Indeed, contracts and grants from federal, state and local governments accounted for 12.1% of GDP in 2002; the amount competitively awarded equaled about 5.6% of GDP.<a name="_ednref79"></a>[78] Reducing the fully-burdened costs of producing responses to competitive procurements and improving the rate of successfully obtaining them can be a huge competitive advantage to business.</p>
<p>Significant proportions of commercial projects and programs are likewise awarded through competitive proposals and bids. However, literature references to these are limited, and the remainder of this section relies on federal sector statistics as a proxy for the overall category.</p>
<p>Though the federal government is making strides in providing central clearinghouses to opportunities  – and is also doing much in moving to uniform application standards and electronic application submissions  – these efforts are still in their nascent stages and similar efforts at the state and local level are severely lagging. As a result, the magnitude of the proposal opportunity is perhaps largely unknown to many businesses. This lack of appreciation and attention to the cost- and success-drivers behind winning proposals is a real gap in the competitiveness of many individual businesses.</p>
<p>Table 14 consolidates information from many government sources to quantify the magnitude of this competitively-awarded grant and contract opportunity with governments.</p>
<table style="text-align: left; margin-left: auto; margin-right: auto; width: 527px;" border="0" cellspacing="0" cellpadding="0">
<tbody>
<tr>
<td style="background-color: #cccccc; width: 271px;" valign="bottom"></td>
<td style="background-color: #cccccc; width: 102px;" valign="bottom">
<p align="center"><strong>Number of Awards</strong></p>
</td>
<td style="background-color: #cccccc; width: 108px;" valign="bottom">
<p align="center"><strong>Amount ($000)</strong></p>
</td>
<td style="background-color: #cccccc; width: 46px;" valign="bottom"></td>
</tr>
<tr>
<td style="background-color: #cccccc; width: 271px;" valign="bottom"><strong><span style="text-decoration: underline;">Federal Government</span></strong></td>
<td style="background-color: #cccccc; width: 102px;" valign="bottom"></td>
<td style="background-color: #cccccc; width: 108px;" valign="bottom"></td>
<td style="background-color: #cccccc; width: 46px;" valign="bottom"></td>
</tr>
<tr>
<td style="width: 271px;" valign="bottom">Total Grants</td>
<td style="width: 102px;" valign="bottom">
<p align="right">1,335,813</p>
</td>
<td style="width: 108px;" valign="bottom">
<p align="right">$441,037,633</p>
</td>
<td style="width: 46px;" valign="bottom"><a name="_ednref80"></a>[79] <a name="_ednref81"></a>[80]</td>
</tr>
<tr>
<td style="width: 271px;" valign="bottom">Total Contract Procurements</td>
<td style="width: 102px;" valign="bottom">
<p align="right">1,155,096</p>
</td>
<td style="width: 108px;" valign="bottom">
<p align="right">$327,413,076</p>
</td>
<td style="width: 46px;" valign="bottom"></td>
</tr>
<tr>
<td style="width: 271px;" valign="bottom"></td>
<td style="width: 102px;" valign="bottom"></td>
<td style="width: 108px;" valign="bottom"></td>
<td style="width: 46px;" valign="bottom"></td>
</tr>
<tr>
<td style="width: 271px;" valign="bottom">Competitively-awarded Grants</td>
<td style="width: 102px;" valign="bottom">
<p align="right">336,091</p>
</td>
<td style="width: 108px;" valign="bottom">
<p align="right">$99,234,657</p>
</td>
<td style="width: 46px;" valign="bottom"><a name="_ednref82"></a>[81]</td>
</tr>
<tr>
<td style="width: 271px;" valign="bottom">Competitively-awarded Procurements</td>
<td style="width: 102px;" valign="bottom">
<p align="right">909,087</p>
</td>
<td style="width: 108px;" valign="bottom">
<p align="right">$231,878,136</p>
</td>
<td style="width: 46px;" valign="bottom"><a name="_ednref83"></a>[82]</td>
</tr>
<tr>
<td style="width: 271px;" valign="bottom"></td>
<td style="width: 102px;" valign="bottom"></td>
<td style="width: 108px;" valign="bottom"></td>
<td style="width: 46px;" valign="bottom"></td>
</tr>
<tr>
<td style="width: 271px;" valign="bottom">Total Competitive Opportunities</td>
<td style="width: 102px;" valign="bottom">
<p align="right">1,245,179</p>
</td>
<td style="width: 108px;" valign="bottom">
<p align="right">$331,112,793</p>
</td>
<td style="width: 46px;" valign="bottom"></td>
</tr>
<tr>
<td style="width: 271px;" valign="bottom"></td>
<td style="width: 102px;" valign="bottom"></td>
<td style="width: 108px;" valign="bottom"></td>
<td style="width: 46px;" valign="bottom"></td>
</tr>
<tr>
<td style="width: 271px;" valign="bottom">Ave Competitive Opportunity</td>
<td style="width: 102px;" valign="bottom"></td>
<td style="width: 108px;" valign="bottom">
<p align="right">$266</p>
</td>
<td style="width: 46px;" valign="bottom"><a name="_ednref84"></a>[83]</td>
</tr>
<tr>
<td style="width: 271px;" valign="bottom"></td>
<td style="width: 102px;" valign="bottom"></td>
<td style="width: 108px;" valign="bottom"></td>
<td style="width: 46px;" valign="bottom"></td>
</tr>
<tr>
<td style="background-color: #cccccc; width: 271px;" valign="bottom"><strong><span style="text-decoration: underline;">State &amp; Local Government</span></strong></td>
<td style="background-color: #cccccc; width: 102px;" valign="bottom"></td>
<td style="background-color: #cccccc; width: 108px;" valign="bottom"></td>
<td style="background-color: #cccccc; width: 46px;" valign="bottom"><a name="_ednref85"></a>[84] <a name="_ednref86"></a>[85]</td>
</tr>
<tr>
<td style="width: 271px;" valign="bottom">Total Grants</td>
<td style="width: 102px;" valign="bottom">
<p align="right">757,199</p>
</td>
<td style="width: 108px;" valign="bottom">
<p align="right">$190,000,000</p>
</td>
<td style="width: 46px;" valign="bottom"></td>
</tr>
<tr>
<td style="width: 271px;" valign="bottom">Total Contract Procurements</td>
<td style="width: 102px;" valign="bottom">
<p align="right">1,439,031</p>
</td>
<td style="width: 108px;" valign="bottom">
<p align="right">$310,000,000</p>
</td>
<td style="width: 46px;" valign="bottom"></td>
</tr>
<tr>
<td style="width: 271px;" valign="bottom"></td>
<td style="width: 102px;" valign="bottom"></td>
<td style="width: 108px;" valign="bottom"></td>
<td style="width: 46px;" valign="bottom"></td>
</tr>
<tr>
<td style="width: 271px;" valign="bottom">Competitively-awarded Grants</td>
<td style="width: 102px;" valign="bottom">
<p align="right">190,512</p>
</td>
<td style="width: 108px;" valign="bottom">
<p align="right">$42,750,512</p>
</td>
<td style="width: 46px;" valign="bottom"><a name="_ednref87"></a>[86]</td>
</tr>
<tr>
<td style="width: 271px;" valign="bottom">Competitively-awarded Procurements</td>
<td style="width: 102px;" valign="bottom">
<p align="right">1,132,551</p>
</td>
<td style="width: 108px;" valign="bottom">
<p align="right">$219,545,972</p>
</td>
<td style="width: 46px;" valign="bottom"></td>
</tr>
<tr>
<td style="width: 271px;" valign="bottom"></td>
<td style="width: 102px;" valign="bottom"></td>
<td style="width: 108px;" valign="bottom"></td>
<td style="width: 46px;" valign="bottom"></td>
</tr>
<tr>
<td style="width: 271px;" valign="bottom">Total Competitive Opportunities</td>
<td style="width: 102px;" valign="bottom">
<p align="right">1,323,063</p>
</td>
<td style="width: 108px;" valign="bottom">
<p align="right">$262,296,485</p>
</td>
<td style="width: 46px;" valign="bottom"></td>
</tr>
<tr>
<td style="width: 271px;" valign="bottom"></td>
<td style="width: 102px;" valign="bottom"></td>
<td style="width: 108px;" valign="bottom"></td>
<td style="width: 46px;" valign="bottom"></td>
</tr>
<tr>
<td style="width: 271px;" valign="bottom">Ave Competitive Opportunity</td>
<td style="width: 102px;" valign="bottom"></td>
<td style="width: 108px;" valign="bottom">
<p align="right">$198</p>
</td>
<td style="width: 46px;" valign="bottom"></td>
</tr>
<tr>
<td style="width: 271px;" valign="bottom"></td>
<td style="width: 102px;" valign="bottom"></td>
<td style="width: 108px;" valign="bottom"></td>
<td style="width: 46px;" valign="bottom"></td>
</tr>
<tr>
<td style="background-color: #cccccc; width: 271px;" valign="bottom"><strong><span style="text-decoration: underline;">Total (no B-to-B)</span></strong></td>
<td style="background-color: #cccccc; width: 102px;" valign="bottom"></td>
<td style="background-color: #cccccc; width: 108px;" valign="bottom"></td>
<td style="background-color: #cccccc; width: 46px;" valign="bottom"></td>
</tr>
<tr>
<td style="width: 271px;" valign="bottom">Competitively-awarded Grants</td>
<td style="width: 102px;" valign="bottom">
<p align="right">526,603</p>
</td>
<td style="width: 108px;" valign="bottom">
<p align="right">$141,985,169</p>
</td>
<td style="width: 46px;" valign="bottom"></td>
</tr>
<tr>
<td style="width: 271px;" valign="bottom">Competitively-awarded Procurements</td>
<td style="width: 102px;" valign="bottom">
<p align="right">2,041,638</p>
</td>
<td style="width: 108px;" valign="bottom">
<p align="right">$451,424,108</p>
</td>
<td style="width: 46px;" valign="bottom"></td>
</tr>
<tr>
<td style="width: 271px;" valign="bottom"></td>
<td style="width: 102px;" valign="bottom"></td>
<td style="width: 108px;" valign="bottom"></td>
<td style="width: 46px;" valign="bottom"></td>
</tr>
<tr>
<td style="width: 271px;" valign="bottom">Total Competitive Opportunities</td>
<td style="width: 102px;" valign="bottom">
<p align="right">2,568,241</p>
</td>
<td style="width: 108px;" valign="bottom">
<p align="right">$593,409,277</p>
</td>
<td style="width: 46px;" valign="bottom"></td>
</tr>
<tr>
<td style="width: 271px;" valign="bottom"></td>
<td style="width: 102px;" valign="bottom"></td>
<td style="width: 108px;" valign="bottom"></td>
<td style="width: 46px;" valign="bottom"></td>
</tr>
<tr>
<td style="width: 271px;" valign="bottom">Ave Competitive Opportunity</td>
<td style="width: 102px;" valign="bottom"></td>
<td style="width: 108px;" valign="bottom">
<p align="right">$231</p>
</td>
<td style="width: 46px;" valign="bottom"></td>
</tr>
</tbody>
</table>
<p style="text-align: center;">Table 14. Federal, State &amp; Local Contract and Grant Opportunities, 2002</p>
<p>This analysis suggests that nearly $600 billion is available each year through competitively awarded grants and procurements from all levels of government within the U.S., about 56% of it from the federal sector. The average competitive award is about $270 K for grants and about $220 K for contract procurements.</p>
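<p>The averages and the federal share quoted above can be re-derived from the "Total (no B-to-B)" rows of Table 14. The following is a minimal sketch rather than the original calculation: the variable and function names are illustrative assumptions, and the figures are simply transcribed from the table, with amounts in thousands of dollars.</p>

```python
# Figures transcribed from the "Total (no B-to-B)" rows of Table 14.
# Counts are numbers of awards; amounts are in thousands of dollars ($000).
GRANTS = {"count": 526_603, "amount_k": 141_985_169}
PROCUREMENTS = {"count": 2_041_638, "amount_k": 451_424_108}
FEDERAL_TOTAL_K = 331_112_793   # federal "Total Competitive Opportunities"
ALL_GOV_TOTAL_K = 593_409_277   # federal plus state & local


def average_award_k(row):
    """Average award size in $000: total amount divided by number of awards."""
    return row["amount_k"] / row["count"]


avg_grant_k = average_award_k(GRANTS)
avg_procurement_k = average_award_k(PROCUREMENTS)
federal_share = FEDERAL_TOTAL_K / ALL_GOV_TOTAL_K

print(f"Average competitive grant:       ${avg_grant_k:,.0f} K")
print(f"Average competitive procurement: ${avg_procurement_k:,.0f} K")
print(f"Federal share of total:          {federal_share:.1%}")
```

<p>On these inputs the average grant comes to roughly $270 K, the average procurement to roughly $221 K, and the federal share to roughly 56% of the combined total.</p>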
<p>Aside from construction firms (which are excluded in this and prior analyses), there are on the order of 92,500 federal contract-seeking firms today.<a name="_ednref88"></a>[87] In 2003, the top 200 federal contracting firms accounted for nearly $190 billion in contract outlays.<a name="_ednref89"></a>[88] While it is unclear what proportion of these commitments were competitive (81% of total federal commitments) or based on all contract procurements (57% of total federal commitments), it is clear that more than 90,000 firms are competing via a classic power curve for a minor portion of available federal revenues. This power curve is shown in Figure 3 below for the 200 largest federal contractors, which obtain a proportionately high percentage of all contract dollars.</p>
<p><img class="center_ok" src="../wp-content/themes/ai3/images/DocValue/Figure3.gif" alt="Power curve distribution of Federal contractors" width="623" height="331" /></p>
<p style="text-align: center;">Figure 3. Power Curve Distribution of Top 200 Federal Contractors by Revenue, 2002</p>
<p>The combination of these factors enables an estimate of the bottom-line proposal impacts by firm. This information is shown in the table below:</p>
<table style="text-align: left; margin-left: auto; margin-right: auto; width: 648px;" border="0" cellspacing="0" cellpadding="0">
<tbody>
<tr>
<td style="background-color: #cccccc; width: 324px;" valign="bottom"></td>
<td style="background-color: #cccccc; width: 108px;" valign="bottom">
<p align="center"><strong>Number</strong></p>
</td>
<td style="background-color: #cccccc; width: 180px;" colspan="3" valign="bottom">
<p align="center"><strong>Amount ($000)</strong></p>
</td>
<td style="background-color: #cccccc; width: 36px;" valign="bottom"></td>
</tr>
<tr>
<td style="width: 324px;" valign="bottom">Total Competitive Awards</td>
<td style="width: 108px;" valign="bottom"></td>
<td style="width: 64px;" valign="bottom"></td>
<td style="width: 116px;" colspan="2" valign="bottom"></td>
<td style="width: 36px;" valign="bottom"></td>
</tr>
<tr>
<td style="width: 324px;" valign="bottom">Federal</td>
<td style="width: 108px;" valign="bottom">
<p align="right">1,245,179</p>
</td>
<td style="width: 180px;" colspan="3" valign="bottom">
<p align="center">$331,112,793</p>
</td>
<td style="width: 36px;" valign="bottom"><a name="_ednref90"></a>[89]</td>
</tr>
<tr>
<td style="width: 324px;" valign="bottom">State &amp; Local</td>
<td style="width: 108px;" valign="bottom">
<p align="right">1,323,063</p>
</td>
<td style="width: 180px;" colspan="3" valign="bottom">
<p align="center">$262,296,485</p>
</td>
<td style="width: 36px;" valign="bottom"></td>
</tr>
<tr>
<td style="width: 324px;" valign="bottom">Number of Competing Firms</td>
<td style="width: 108px;" valign="bottom">
<p align="right">120,250</p>
</td>
<td style="width: 64px;" valign="bottom"></td>
<td style="width: 116px;" colspan="2" valign="bottom"></td>
<td style="width: 36px;" valign="bottom"><a name="_ednref91"></a>[90]</td>
</tr>
<tr>
<td style="width: 324px;" valign="bottom">Number of Winning Firms</td>
<td style="width: 108px;" valign="bottom">
<p align="right">90,805</p>
</td>
<td style="width: 64px;" valign="bottom"></td>
<td style="width: 116px;" colspan="2" valign="bottom"></td>
<td style="width: 36px;" valign="bottom"></td>
</tr>
<tr>
<td style="width: 324px;" valign="bottom">Number of Winning Proposals</td>
<td style="width: 108px;" valign="bottom">
<p align="right">2,326,485</p>
</td>
<td style="width: 64px;" valign="bottom"></td>
<td style="width: 116px;" colspan="2" valign="bottom"></td>
<td style="width: 36px;" valign="bottom"></td>
</tr>
<tr>
<td style="width: 324px;" valign="bottom">Number of Submitted Proposals</td>
<td style="width: 108px;" valign="bottom">
<p align="right">11,211,974</p>
</td>
<td style="width: 64px;" valign="bottom"></td>
<td style="width: 116px;" colspan="2" valign="bottom"></td>
<td style="width: 36px;" valign="bottom"></td>
</tr>
<tr>
<td style="width: 324px;" valign="bottom"></td>
<td style="width: 108px;" valign="bottom"></td>
<td style="width: 64px;" valign="bottom"></td>
<td style="width: 116px;" colspan="2" valign="bottom"></td>
<td style="width: 36px;" valign="bottom"></td>
</tr>
<tr>
<td style="background-color: #cccccc; width: 324px;" valign="bottom"><strong><span style="text-decoration: underline;">Direct Proposal Preparation Costs</span></strong></td>
<td style="background-color: #cccccc; width: 108px;" valign="bottom"></td>
<td style="background-color: #cccccc; width: 64px;" valign="bottom"></td>
<td style="background-color: #cccccc; width: 116px;" colspan="2" valign="bottom"></td>
<td style="background-color: #cccccc; width: 36px;" valign="bottom"></td>
</tr>
<tr>
<td style="width: 324px;" valign="bottom">Winning Proposal Preparation</td>
<td style="width: 108px;" valign="bottom"></td>
<td style="width: 180px;" colspan="3" valign="bottom">
<p align="center">$5,021,357</p>
</td>
<td style="width: 36px;" valign="bottom"><a name="_ednref92"></a>[91]</td>
</tr>
<tr>
<td style="width: 324px;" valign="bottom">Losing Proposals Preparation</td>
<td style="width: 108px;" valign="bottom"></td>
<td style="width: 180px;" colspan="3" valign="bottom">
<p align="center">$16,939,516</p>
</td>
<td style="width: 36px;" valign="bottom"></td>
</tr>
<tr>
<td style="width: 324px;" valign="bottom">TOTAL Proposal Preparation</td>
<td style="width: 108px;" valign="bottom"></td>
<td style="width: 180px;" colspan="3" valign="bottom">
<p align="center">$21,960,873</p>
</td>
<td style="width: 36px;" valign="bottom"></td>
</tr>
<tr>
<td style="width: 324px;" valign="bottom"></td>
<td style="width: 108px;" valign="bottom"></td>
<td style="width: 90px;" colspan="2" valign="bottom"></td>
<td style="width: 90px;" valign="bottom"></td>
<td style="width: 36px;" valign="bottom"></td>
</tr>
<tr>
<td style="width: 324px;" valign="bottom"></td>
<td style="width: 108px;" valign="bottom">
<p align="center"><strong>Low</strong></p>
</td>
<td style="width: 90px;" colspan="2" valign="bottom">
<p align="center"><strong>Med</strong></p>
</td>
<td style="width: 90px;" valign="bottom">
<p align="center"><strong>High</strong></p>
</td>
<td style="width: 36px;" valign="bottom"></td>
</tr>
<tr>
<td style="width: 324px;" valign="bottom">Improvement in RFP Development</td>
<td style="width: 108px;" valign="bottom">
<p align="right">7.5%</p>
</td>
<td style="width: 90px;" colspan="2" valign="bottom">
<p align="right">15.0%</p>
</td>
<td style="width: 90px;" valign="bottom">
<p align="right">35.0%</p>
</td>
<td style="width: 36px;" valign="bottom"><a name="_ednref93"></a>[92]</td>
</tr>
<tr>
<td style="width: 324px;" valign="bottom"></td>
<td style="width: 108px;" valign="bottom"></td>
<td style="width: 90px;" colspan="2" valign="bottom"></td>
<td style="width: 90px;" valign="bottom"></td>
<td style="width: 36px;" valign="bottom"></td>
</tr>
<tr>
<td style="background-color: #cccccc; width: 324px;" valign="bottom"><strong><span style="text-decoration: underline;">Proposal Preparation</span></strong></td>
<td style="background-color: #cccccc; width: 108px;" valign="bottom"></td>
<td style="background-color: #cccccc; width: 90px;" colspan="2" valign="bottom"></td>
<td style="background-color: #cccccc; width: 90px;" valign="bottom"></td>
<td style="background-color: #cccccc; width: 36px;" valign="bottom"></td>
</tr>
<tr>
<td style="width: 324px;" valign="bottom">Benefits – Individual Submitters ($000)</td>
<td style="width: 108px;" valign="bottom">
<p align="right">$14</p>
</td>
<td style="width: 90px;" colspan="2" valign="bottom">
<p align="right">$27</p>
</td>
<td style="width: 90px;" valign="bottom">
<p align="right">$64</p>
</td>
<td style="width: 36px;" valign="bottom"></td>
</tr>
<tr>
<td style="width: 324px;" valign="bottom">Benefits – All Submitters ($000)</td>
<td style="width: 108px;" valign="bottom">
<p align="right">$1,647,065</p>
</td>
<td style="width: 90px;" colspan="2" valign="bottom">
<p align="right">$3,294,131</p>
</td>
<td style="width: 90px;" valign="bottom">
<p align="right">$7,686,305</p>
</td>
<td style="width: 36px;" valign="bottom"></td>
</tr>
<tr>
<td style="width: 324px;" valign="bottom"></td>
<td style="width: 108px;" valign="bottom"></td>
<td style="width: 90px;" colspan="2" valign="bottom"></td>
<td style="width: 90px;" valign="bottom"></td>
<td style="width: 36px;" valign="bottom"></td>
</tr>
<tr>
<td style="background-color: #cccccc; width: 324px;" valign="bottom"><strong><span style="text-decoration: underline;">Proposal Success Benefits</span></strong></td>
<td style="background-color: #cccccc; width: 108px;" valign="bottom"></td>
<td style="background-color: #cccccc; width: 90px;" colspan="2" valign="bottom"></td>
<td style="background-color: #cccccc; width: 90px;" valign="bottom"></td>
<td style="background-color: #cccccc; width: 36px;" valign="bottom"></td>
</tr>
<tr>
<td style="width: 324px;" valign="bottom">Increase in Number of Winning Submissions</td>
<td style="width: 108px;" valign="bottom">
<p align="right">6,810</p>
</td>
<td style="width: 90px;" colspan="2" valign="bottom">
<p align="right">13,621</p>
</td>
<td style="width: 90px;" valign="bottom">
<p align="right">31,782</p>
</td>
<td style="width: 36px;" valign="bottom"><a name="_Ref90884783"></a><a name="_ednref94"></a>[93]</td>
</tr>
<tr>
<td style="width: 324px;" valign="bottom">Increase in Number of Winning Firms</td>
<td style="width: 108px;" valign="bottom">
<p align="right">1,406</p>
</td>
<td style="width: 90px;" colspan="2" valign="bottom">
<p align="right">2,812</p>
</td>
<td style="width: 90px;" valign="bottom">
<p align="right">6,562</p>
</td>
<td style="width: 36px;" valign="bottom"><a name="_ednref95"></a>[94]</td>
</tr>
<tr>
<td style="width: 324px;" valign="bottom">Benefits – Individual Submitters ($000)</td>
<td style="width: 108px;" valign="bottom">
<p align="right">$1,235</p>
</td>
<td style="width: 90px;" colspan="2" valign="bottom">
<p align="right">$1,235</p>
</td>
<td style="width: 90px;" valign="bottom">
<p align="right">$1,235</p>
</td>
<td style="width: 36px;" valign="bottom"></td>
</tr>
<tr>
<td style="width: 324px;" valign="bottom">Benefits – All Submitters ($000)</td>
<td style="width: 108px;" valign="bottom">
<p align="right">$1,737,101</p>
</td>
<td style="width: 90px;" colspan="2" valign="bottom">
<p align="right">$3,474,203</p>
</td>
<td style="width: 90px;" valign="bottom">
<p align="right">$8,106,473</p>
</td>
<td style="width: 36px;" valign="bottom"></td>
</tr>
<tr>
<td style="width: 324px;" valign="bottom"></td>
<td style="width: 108px;" valign="bottom"></td>
<td style="width: 90px;" colspan="2" valign="bottom"></td>
<td style="width: 90px;" valign="bottom"></td>
<td style="width: 36px;" valign="bottom"></td>
</tr>
<tr>
<td style="width: 324px;" valign="bottom"><strong><span style="text-decoration: underline;">Benefits – All Submitters/All Aspects</span></strong></td>
<td style="width: 108px;" valign="bottom">
<p align="right">$3,384,167</p>
</td>
<td style="width: 90px;" colspan="2" valign="bottom">
<p align="right">$6,768,334</p>
</td>
<td style="width: 90px;" valign="bottom">
<p align="right">$15,792,778</p>
</td>
<td style="width: 36px;" valign="bottom"></td>
</tr>
</tbody>
</table>
<p style="text-align: center;">Table 15. Combined Preparation Costs and Opportunity Costs for Proposals</p>
<p>Across all entities, the annual cost of preparing proposals in response to competitive solicitations from government agencies at all levels is on the order of $22 billion: $5 billion for winning proposals and $17 billion for losing proposals. With better access to missing information and better information overall, and assuming no change in the underlying ideas or proposal-writing skills, proposal response costs could be reduced by more than $3 billion annually (the medium case in Table 15). Another $3 billion annually is available from winning more competitive proposals. The individual benefit to firms that respond to competitive solicitations averages $1.25 million per competing firm.<a name="_ednref96"></a>[95]</p>
<p>The more significant benefit to individual firms from improved access to “missing” information and better information is the increased likelihood of winning a competitive award. Firms that embrace these practices are estimated to obtain a $1.2 million annual benefit. Because many firms that have previously been losing awards have relatively low annual revenues, the percentage impact of better proposal preparation information on their bottom line can be quite striking.</p>
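<p>The proposal preparation savings in Table 15 follow from applying the low, medium and high improvement rates to the total preparation cost. The sketch below reproduces that arithmetic. The names are illustrative, and dividing by the number of competing firms to get the per-submitter figure is an assumption inferred from the table's values rather than a method stated in the text.</p>

```python
# Figures transcribed from Table 15; dollar amounts are in thousands ($000).
WINNING_PREP_K = 5_021_357
LOSING_PREP_K = 16_939_516
COMPETING_FIRMS = 120_250
# Improvement scenarios from the "Improvement in RFP Development" row.
SCENARIOS = {"low": 0.075, "med": 0.15, "high": 0.35}


def preparation_savings(total_prep_k, firms, rate):
    """Return (savings across all submitters, savings per firm), both in $000.

    Assumption: per-firm savings are the all-submitter savings spread evenly
    over the number of competing firms.
    """
    all_firms_k = total_prep_k * rate
    return all_firms_k, all_firms_k / firms


total_prep_k = WINNING_PREP_K + LOSING_PREP_K
savings = {
    name: preparation_savings(total_prep_k, COMPETING_FIRMS, rate)
    for name, rate in SCENARIOS.items()
}

for name, (all_k, per_firm_k) in savings.items():
    print(f"{name:>4}: all submitters ${all_k:,.0f} K, per firm ${per_firm_k:,.0f} K")
```

<p>Under that assumption the medium case gives about $3.3 billion across all submitters, which corresponds to the "more than $3 billion annually" cited for reduced proposal response costs.</p>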
<h2><a name="_Toc106767224"></a>‘Costs’ of Regulation and Regulatory Non-compliance</h2>
<p>A December 2001 small business poll by the National Federation of Independent Business (NFIB) gauged the impacts of the regulatory workload on firms. When asked “is government regulation a very serious, somewhat serious, not too serious, or not at all serious problem for your business,” nearly half, or 43.6 percent, answered “very serious” or “somewhat serious.” The respondents indicated the most serious regulatory problems were at the federal level (49%), state level (35%) or local level (13%) of government. The biggest single regulatory problem cited was extra paperwork, followed by difficulty understanding how to comply with regulations and dollars spent doing so.<a name="_ednref97"></a>[96] A later December 2003 NFIB survey indicates that the average cost per hour of complying with paperwork requirements was $48.72.<a name="_ednref98"></a>[97]</p>
<table style="text-align: left; margin-left: auto; margin-right: auto; width: 624px;" border="0" cellspacing="0" cellpadding="0">
<tbody>
<tr>
<td style="background-color: #cccccc; width: 156px;" valign="bottom">
<p align="center"><strong>Type of Regulation</strong></p>
</td>
<td style="background-color: #cccccc; width: 96px;" valign="bottom">
<p align="center"><strong>All Firms</strong></p>
</td>
<td style="background-color: #cccccc; width: 120px;" valign="bottom">
<p align="center"><strong>&lt;20 Employees</strong></p>
</td>
<td style="background-color: #cccccc; width: 132px;" valign="bottom">
<p align="center"><strong>20-499 Employees</strong></p>
</td>
<td style="background-color: #cccccc; width: 120px;" valign="bottom">
<p align="center"><strong>500+ Employees</strong></p>
</td>
</tr>
<tr>
<td style="width: 156px;" valign="bottom">All Federal Regulations</td>
<td style="width: 96px;" valign="bottom">
<p align="right">$5,107</p>
</td>
<td style="width: 120px;" valign="bottom">
<p align="right">$7,544</p>
</td>
<td style="width: 132px;" valign="bottom">
<p align="right">$4,671</p>
</td>
<td style="width: 120px;" valign="bottom">
<p align="right">$4,827</p>
</td>
</tr>
<tr>
<td style="width: 156px;" valign="bottom"></td>
<td style="width: 96px;" valign="bottom"></td>
<td style="width: 120px;" valign="bottom"></td>
<td style="width: 132px;" valign="bottom"></td>
<td style="width: 120px;" valign="bottom"></td>
</tr>
<tr>
<td style="width: 156px;" valign="bottom">Environmental</td>
<td style="width: 96px;" valign="bottom">
<p align="right">$1,312</p>
</td>
<td style="width: 120px;" valign="bottom">
<p align="right">$3,600</p>
</td>
<td style="width: 132px;" valign="bottom">
<p align="right">$1,269</p>
</td>
<td style="width: 120px;" valign="bottom">
<p align="right">$776</p>
</td>
</tr>
<tr>
<td style="width: 156px;" valign="bottom">Economic</td>
<td style="width: 96px;" valign="bottom">
<p align="right">$2,234</p>
</td>
<td style="width: 120px;" valign="bottom">
<p align="right">$1,748</p>
</td>
<td style="width: 132px;" valign="bottom">
<p align="right">$1,782</p>
</td>
<td style="width: 120px;" valign="bottom">
<p align="right">$2,688</p>
</td>
</tr>
<tr>
<td style="width: 156px;" valign="bottom">Workplace</td>
<td style="width: 96px;" valign="bottom">
<p align="right">$843</p>
</td>
<td style="width: 120px;" valign="bottom">
<p align="right">$897</p>
</td>
<td style="width: 132px;" valign="bottom">
<p align="right">$944</p>
</td>
<td style="width: 120px;" valign="bottom">
<p align="right">$755</p>
</td>
</tr>
<tr>
<td style="width: 156px;" valign="bottom">Tax Compliance</td>
<td style="width: 96px;" valign="bottom">
<p align="right">$719</p>
</td>
<td style="width: 120px;" valign="bottom">
<p align="right">$1,300</p>
</td>
<td style="width: 132px;" valign="bottom">
<p align="right">$676</p>
</td>
<td style="width: 120px;" valign="bottom">
<p align="right">$608</p>
</td>
</tr>
</tbody>
</table>
<p style="text-align: center;">Table 16. Per Employee Costs of Federal Regulation by Firm Size, 2002</p>
<p>According to a 2001 report, “The Impact of Regulatory Costs on Small Firms” by W. Mark Crain and Thomas D. Hopkins, the total costs of federal regulations were estimated to be $843 billion in 2000, or 8 percent of the U.S. Gross Domestic Product. Of these costs, $497 billion fell on business and $346 billion fell on consumers or other governments. Table 16 above shows how those impacts are estimated on a per-employee basis across a range of firm sizes.<a name="_ednref99"></a>[98]</p>
<p>As of September 30, 2002, federal agencies estimated there were about 8.2 billion “burden hours” of paperwork government-wide. Almost 95 percent of those 8.2 billion hours were associated with information collected primarily for the purpose of regulatory compliance.<a name="_ednref100"></a>[99]</p>
<table style="text-align: left; margin-left: auto; margin-right: auto; width: 492px;" border="0" cellspacing="0" cellpadding="0">
<tbody>
<tr>
<td style="background-color: #cccccc; width: 192px;" valign="bottom"></td>
<td style="background-color: #cccccc; width: 156px;" valign="bottom">
<p align="center"><strong>Burden Hrs (million)</strong></p>
</td>
<td style="background-color: #cccccc; width: 144px;" valign="bottom">
<p align="center"><strong>Labor Costs ($M)</strong></p>
</td>
</tr>
<tr>
<td style="width: 192px;" valign="bottom"><strong>Total Government</strong></td>
<td style="width: 156px;" valign="bottom">
<p align="right">8,223.17</p>
</td>
<td style="width: 144px;" valign="bottom">
<p align="right">$318,237</p>
</td>
</tr>
<tr>
<td style="width: 192px;" valign="bottom"><strong>Total Gov (excl. Treasury)</strong></td>
<td style="width: 156px;" valign="bottom">
<p align="right">1,472.74</p>
</td>
<td style="width: 144px;" valign="bottom">
<p align="right">$56,995</p>
</td>
</tr>
<tr>
<td style="width: 192px;" valign="bottom"></td>
<td style="width: 156px;" valign="bottom"></td>
<td style="width: 144px;" valign="bottom"></td>
</tr>
<tr>
<td style="width: 192px;" valign="bottom">Treasury</td>
<td style="width: 156px;" valign="bottom">
<p align="right">6,750.43</p>
</td>
<td style="width: 144px;" valign="bottom">
<p align="right">$261,242</p>
</td>
</tr>
<tr>
<td style="width: 192px;" valign="bottom">Transportation</td>
<td style="width: 156px;" valign="bottom">
<p align="right">244.73</p>
</td>
<td style="width: 144px;" valign="bottom">
<p align="right">$9,471</p>
</td>
</tr>
<tr>
<td style="width: 192px;" valign="bottom">HHS</td>
<td style="width: 156px;" valign="bottom">
<p align="right">224.83</p>
</td>
<td style="width: 144px;" valign="bottom">
<p align="right">$8,701</p>
</td>
</tr>
<tr>
<td style="width: 192px;" valign="bottom">Labor</td>
<td style="width: 156px;" valign="bottom">
<p align="right">189.22</p>
</td>
<td style="width: 144px;" valign="bottom">
<p align="right">$7,323</p>
</td>
</tr>
<tr>
<td style="width: 192px;" valign="bottom">EPA</td>
<td style="width: 156px;" valign="bottom">
<p align="right">140.47</p>
</td>
<td style="width: 144px;" valign="bottom">
<p align="right">$5,436</p>
</td>
</tr>
<tr>
<td style="width: 192px;" valign="bottom">Defense</td>
<td style="width: 156px;" valign="bottom">
<p align="right">92.36</p>
</td>
<td style="width: 144px;" valign="bottom">
<p align="right">$3,574</p>
</td>
</tr>
<tr>
<td style="width: 192px;" valign="bottom">Agriculture</td>
<td style="width: 156px;" valign="bottom">
<p align="right">88.59</p>
</td>
<td style="width: 144px;" valign="bottom">
<p align="right">$3,428</p>
</td>
</tr>
<tr>
<td style="width: 192px;" valign="bottom">Justice</td>
<td style="width: 156px;" valign="bottom">
<p align="right">46.60</p>
</td>
<td style="width: 144px;" valign="bottom">
<p align="right">$1,803</p>
</td>
</tr>
<tr>
<td style="width: 192px;" valign="bottom">Education</td>
<td style="width: 156px;" valign="bottom">
<p align="right">38.44</p>
</td>
<td style="width: 144px;" valign="bottom">
<p align="right">$1,488</p>
</td>
</tr>
<tr>
<td style="width: 192px;" valign="bottom">State</td>
<td style="width: 156px;" valign="bottom">
<p align="right">29.23</p>
</td>
<td style="width: 144px;" valign="bottom">
<p align="right">$1,131</p>
</td>
</tr>
<tr>
<td style="width: 192px;" valign="bottom">HUD</td>
<td style="width: 156px;" valign="bottom">
<p align="right">21.93</p>
</td>
<td style="width: 144px;" valign="bottom">
<p align="right">$849</p>
</td>
</tr>
<tr>
<td style="width: 192px;" valign="bottom">Commerce</td>
<td style="width: 156px;" valign="bottom">
<p align="right">11.65</p>
</td>
<td style="width: 144px;" valign="bottom">
<p align="right">$451</p>
</td>
</tr>
<tr>
<td style="width: 192px;" valign="bottom">Interior</td>
<td style="width: 156px;" valign="bottom">
<p align="right">7.66</p>
</td>
<td style="width: 144px;" valign="bottom">
<p align="right">$296</p>
</td>
</tr>
<tr>
<td style="width: 192px;" valign="bottom">Energy</td>
<td style="width: 156px;" valign="bottom">
<p align="right">3.76</p>
</td>
<td style="width: 144px;" valign="bottom">
<p align="right">$146</p>
</td>
</tr>
<tr>
<td style="width: 192px;" valign="bottom"></td>
<td style="width: 156px;" valign="bottom"></td>
<td style="width: 144px;" valign="bottom"></td>
</tr>
<tr>
<td style="width: 192px;" valign="bottom">SEC</td>
<td style="width: 156px;" valign="bottom">
<p align="right">136.58</p>
</td>
<td style="width: 144px;" valign="bottom">
<p align="right">$5,286</p>
</td>
</tr>
<tr>
<td style="width: 192px;" valign="bottom">FTC</td>
<td style="width: 156px;" valign="bottom">
<p align="right">69.66</p>
</td>
<td style="width: 144px;" valign="bottom">
<p align="right">$2,696</p>
</td>
</tr>
<tr>
<td style="width: 192px;" valign="bottom">FCC</td>
<td style="width: 156px;" valign="bottom">
<p align="right">26.80</p>
</td>
<td style="width: 144px;" valign="bottom">
<p align="right">$1,037</p>
</td>
</tr>
<tr>
<td style="width: 192px;" valign="bottom">SSA</td>
<td style="width: 156px;" valign="bottom">
<p align="right">24.89</p>
</td>
<td style="width: 144px;" valign="bottom">
<p align="right">$963</p>
</td>
</tr>
<tr>
<td style="width: 192px;" valign="bottom">FAR (contracts)</td>
<td style="width: 156px;" valign="bottom">
<p align="right">24.49</p>
</td>
<td style="width: 144px;" valign="bottom">
<p align="right">$948</p>
</td>
</tr>
<tr>
<td style="width: 192px;" valign="bottom">FCIC</td>
<td style="width: 156px;" valign="bottom">
<p align="right">9.87</p>
</td>
<td style="width: 144px;" valign="bottom">
<p align="right">$382</p>
</td>
</tr>
<tr>
<td style="width: 192px;" valign="bottom">NRC</td>
<td style="width: 156px;" valign="bottom">
<p align="right">8.34</p>
</td>
<td style="width: 144px;" valign="bottom">
<p align="right">$323</p>
</td>
</tr>
<tr>
<td style="width: 192px;" valign="bottom">FEMA</td>
<td style="width: 156px;" valign="bottom">
<p align="right">7.77</p>
</td>
<td style="width: 144px;" valign="bottom">
<p align="right">$301</p>
</td>
</tr>
<tr>
<td style="width: 192px;" valign="bottom">Veterans Administration</td>
<td style="width: 156px;" valign="bottom">
<p align="right">7.31</p>
</td>
<td style="width: 144px;" valign="bottom">
<p align="right">$283</p>
</td>
</tr>
<tr>
<td style="width: 192px;" valign="bottom">NASA</td>
<td style="width: 156px;" valign="bottom">
<p align="right">5.95</p>
</td>
<td style="width: 144px;" valign="bottom">
<p align="right">$230</p>
</td>
</tr>
<tr>
<td style="width: 192px;" valign="bottom">NSF</td>
<td style="width: 156px;" valign="bottom">
<p align="right">4.46</p>
</td>
<td style="width: 144px;" valign="bottom">
<p align="right">$173</p>
</td>
</tr>
<tr>
<td style="width: 192px;" valign="bottom">FERC</td>
<td style="width: 156px;" valign="bottom">
<p align="right">4.38</p>
</td>
<td style="width: 144px;" valign="bottom">
<p align="right">$170</p>
</td>
</tr>
<tr>
<td style="width: 192px;" valign="bottom">SBA</td>
<td style="width: 156px;" valign="bottom">
<p align="right">2.77</p>
</td>
<td style="width: 144px;" valign="bottom">
<p align="right">$107</p>
</td>
</tr>
</tbody>
</table>
<p style="text-align: center;">Table 17. Federal Government Paperwork Burdens, 2002<a name="_ednref101"></a>[100]</p>
<p>A December 2003 NFIB survey indicates that the average cost per hour of complying with paperwork requirements was $48.72.<a name="_ednref102"></a>[101] If this hourly cost is substituted for the labor rate underlying the table above, the total cost burden would be about $400 billion, of which about $71 billion arises outside Treasury and the IRS.</p>
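<p>The substitution described above is a simple rate-times-hours calculation on the Table 17 totals. The sketch below works it through. The names are illustrative assumptions, and the "implied table rate" is only a back-calculation from the table's labor cost and burden hours, not a figure stated in the source.</p>

```python
# Burden hours (millions) transcribed from Table 17.
TOTAL_HOURS_M = 8_223.17
TREASURY_HOURS_M = 6_750.43
# Hourly compliance cost from the December 2003 NFIB survey cited in the text.
NFIB_RATE = 48.72
# Labor cost ($ millions) reported in Table 17 for the whole government.
TABLE_TOTAL_COST_M = 318_237


def burden_cost_billions(hours_millions, hourly_rate):
    """Convert burden hours (millions) into dollars (billions) at a given rate."""
    return hours_millions * hourly_rate / 1_000


# Back-calculated hourly rate behind the table's labor cost column.
implied_table_rate = TABLE_TOTAL_COST_M / TOTAL_HOURS_M

total_cost_b = burden_cost_billions(TOTAL_HOURS_M, NFIB_RATE)
non_treasury_cost_b = burden_cost_billions(
    TOTAL_HOURS_M - TREASURY_HOURS_M, NFIB_RATE
)

print(f"Implied hourly rate in Table 17: ${implied_table_rate:.2f}")
print(f"Total burden at NFIB rate:       ${total_cost_b:,.1f} billion")
print(f"Excluding Treasury:              ${non_treasury_cost_b:,.1f} billion")
```

<p>On these inputs the table's own labor costs imply a rate of about $38.70 per hour. At the NFIB rate the government-wide total is a little over $400 billion and the non-Treasury portion is just under $72 billion.</p>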
<p>Despite legislation requiring federal paperwork reduction and the embrace of e-government initiatives, paperwork burdens continue to increase. Total burden hours in 2002, for example, increased 600 million hours, or about 4 percent, from the previous year. The Code of Federal Regulations (CFR) continues to expand despite efforts to curtail further growth. The CFR grew from 71,000 pages in 1975 to 135,000 pages in 1998. Annually, there are more than 4,000 regulatory changes introduced by the federal government. The federal government now has over 8,000 separate information collection requests authorized by OMB.<a name="_ednref103"></a>[102]</p>
<table style="text-align: left; margin-left: auto; margin-right: auto; width: 546px;" border="0" cellspacing="0" cellpadding="0">
<tbody>
<tr>
<td style="background-color: #cccccc; width: 415px;">
<p align="center"><strong>Federal Source</strong></p>
</td>
<td style="background-color: #cccccc; width: 102px;">
<p align="center"><strong>Fines ($ 000)</strong></p>
</td>
<td style="background-color: #cccccc; width: 29px;"></td>
</tr>
<tr>
<td style="width: 415px;">Internal Revenue Service</td>
<td style="width: 102px;">
<p align="right">$4,119,622</p>
</td>
<td style="width: 29px;">
<p align="center"><a name="_ednref104"></a>[103]</p>
</td>
</tr>
<tr>
<td style="width: 415px;">Corporate Income</td>
<td style="width: 102px;">
<p align="right">$1,120,531</p>
</td>
<td style="width: 29px;"></td>
</tr>
<tr>
<td style="width: 415px;">Employment Taxes</td>
<td style="width: 102px;">
<p align="right">$2,691,021</p>
</td>
<td style="width: 29px;"></td>
</tr>
<tr>
<td style="width: 415px;">Excise Taxes</td>
<td style="width: 102px;">
<p align="right">$200,585</p>
</td>
<td style="width: 29px;"></td>
</tr>
<tr>
<td style="width: 415px;">Other Taxes</td>
<td style="width: 102px;">
<p align="right">$107,486</p>
</td>
<td style="width: 29px;"></td>
</tr>
<tr>
<td style="width: 415px;"></td>
<td style="width: 102px;"></td>
<td style="width: 29px;" valign="bottom"><a name="_ednref105"></a>[104]</td>
</tr>
<tr>
<td style="width: 415px;" valign="bottom">Agriculture</td>
<td style="width: 102px;" valign="bottom">
<p align="right">$2,000</p>
</td>
<td style="width: 29px;" valign="bottom"></td>
</tr>
<tr>
<td style="width: 415px;" valign="bottom">Economic Stabilization</td>
<td style="width: 102px;" valign="bottom">
<p align="right">$9,000</p>
</td>
<td style="width: 29px;" valign="bottom"></td>
</tr>
<tr>
<td style="width: 415px;" valign="bottom">Labor &amp; Immigration</td>
<td style="width: 102px;" valign="bottom">
<p align="right">$72,000</p>
</td>
<td style="width: 29px;" valign="bottom"></td>
</tr>
<tr>
<td style="width: 415px;" valign="bottom">Commerce &amp; Customs (excl SEC)</td>
<td style="width: 102px;" valign="bottom">
<p align="right">$22,000</p>
</td>
<td style="width: 29px;" valign="bottom"></td>
</tr>
<tr>
<td style="width: 415px;" valign="bottom">SEC</td>
<td style="width: 102px;" valign="bottom">
<p align="right">$101,000</p>
</td>
<td style="width: 29px;" valign="bottom">
<p align="right"><a name="_ednref106"></a>[105]</p>
</td>
</tr>
<tr>
<td style="width: 415px;" valign="bottom">Narcotics &amp; Alcohol</td>
<td style="width: 102px;" valign="bottom">
<p align="right">$2,000</p>
</td>
<td style="width: 29px;" valign="bottom"></td>
</tr>
<tr>
<td style="width: 415px;" valign="bottom">Mine Safety</td>
<td style="width: 102px;" valign="bottom">
<p align="right">$18,000</p>
</td>
<td style="width: 29px;" valign="bottom"></td>
</tr>
<tr>
<td style="width: 415px;" valign="bottom">Environmental Protection</td>
<td style="width: 102px;" valign="bottom">
<p align="right">$212,000</p>
</td>
<td style="width: 29px;" valign="bottom">
<p align="right"><a name="_ednref107"></a>[106]</p>
</td>
</tr>
<tr>
<td style="width: 415px;" valign="bottom">Miscellaneous</td>
<td style="width: 102px;" valign="bottom">
<p align="right">$1,000</p>
</td>
<td style="width: 29px;" valign="bottom"></td>
</tr>
<tr>
<td style="width: 415px;" valign="bottom">Other</td>
<td style="width: 102px;" valign="bottom">
<p align="right">$448,000</p>
</td>
<td style="width: 29px;" valign="bottom"></td>
</tr>
<tr>
<td style="width: 415px;" valign="bottom"></td>
<td style="width: 102px;" valign="bottom"></td>
<td style="width: 29px;" valign="bottom"></td>
</tr>
<tr>
<td style="width: 415px;" valign="bottom">TOTAL</td>
<td style="width: 102px;" valign="bottom">
<p align="right">$5,006,622</p>
</td>
<td style="width: 29px;" valign="bottom"></td>
</tr>
</tbody>
</table>
<p style="text-align: center;">Table 18. Federal Fines and Penalties to Corporations, 2002</p>
<p>Another source of costs to enterprises is civil penalties and fines for non-compliance with existing regulations, as shown by agency in the table above for 2002. U.S. businesses pay a total of about $5 billion annually in civil penalties for non-compliance with federal regulation, of which about $0.9 billion stems from non-tax violations.</p>
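<p>The totals quoted here can be checked directly against Table 18. The short sketch below sums the non-tax lines and adds the IRS figure; the variable names are illustrative, and all amounts are in $ thousands as in the table.</p>

```python
# Tally of Table 18 (amounts in $ thousands, as in the table).
IRS_TOTAL = 4_119_622  # Internal Revenue Service line

NON_TAX = {
    "Agriculture": 2_000,
    "Economic Stabilization": 9_000,
    "Labor & Immigration": 72_000,
    "Commerce & Customs (excl SEC)": 22_000,
    "SEC": 101_000,
    "Narcotics & Alcohol": 2_000,
    "Mine Safety": 18_000,
    "Environmental Protection": 212_000,
    "Miscellaneous": 1_000,
    "Other": 448_000,
}

non_tax_total = sum(NON_TAX.values())
grand_total = IRS_TOTAL + non_tax_total
non_tax_share = non_tax_total / grand_total

print(f"Non-tax fines: ${non_tax_total:,} thousand")
print(f"All fines:     ${grand_total:,} thousand")
print(f"Non-tax share: {non_tax_share:.1%}")
```

<p>The non-tax lines sum to $887 million, which together with the IRS line reproduces the table's $5,006,622 thousand total.</p>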
<p>However, these estimates may undercount actual fines and penalties levied by the federal government because of the accounting basis of the OMB source. For example, the Department of Labor (DOL) collected fines and penalties totaling $175 million from employers in fiscal year 2002 for Fair Labor Standards Act (FLSA) violations.<a name="_ednref108"></a>[107] According to a 2002 report, 43 of the government’s top contractors have paid approximately $3.4 billion in fines, penalties, restitution, and settlements since 1990.<a name="_ednref109"></a>[108] And, according to another report, the corporations liable in the top 100 False Claims Act cases have paid more than $12 billion since 1986.<a name="_ednref110"></a>[109] Since there is no central clearinghouse for this information, with both individual agency general counsels and the Department of Justice responsible for actual collections, the figures in Table 18 should be interpreted as estimates.</p>
<p>Table 19 below consolidates the information in Tables 16 through 18 to estimate the overall regulatory and paperwork burdens on U.S. businesses, plus estimates of the benefits to be gained from better document access and use.</p>
<h2><a name="_Toc106767225"></a>‘Cost’ of an Unauthorized Posted Document</h2>
<p>Unauthorized information disclosures derive mainly from within an organization. The ease of electronic record duplication and dissemination  – particularly through postings on enterprise Web sites  – increases a firm’s vulnerability to this problem. Records mutate and propagate in poorly controlled environments. On average, unauthorized disclosure of confidential information costs Fortune 1000 companies about $15 million per company per year.<a name="_ednref111"></a>[110]</p>
<p>A few privacy laws demonstrate the potential liabilities associated with disclosure of confidential information due to inadvertent mistakes or disgruntled employees. As one example, the Health Insurance Portability and Accountability Act (HIPAA) of 1996 sets security standards protecting the confidentiality and integrity of “individually identifiable health information,” past, present or future. Failure to comply with any of the electronic data, security, or privacy standards can result in civil monetary penalties up to $25,000 per standard per year. Violation of the privacy regulations for commercial or malicious purposes can result in criminal penalties of $50,000 to $250,000 in fines and one to ten years of imprisonment.<a name="_ednref112"></a>[111]</p>
<table style="text-align: left; margin-left: auto; margin-right: auto; width: 641px;" border="0" cellspacing="0" cellpadding="0">
<tbody>
<tr>
<td style="background-color: #cccccc; width: 318px;" valign="bottom"></td>
<td style="background-color: #cccccc; width: 92px;" valign="bottom">
<p align="center"><strong> </strong></p>
</td>
<td style="background-color: #cccccc; width: 202px;" colspan="3" valign="bottom">
<p align="center"><strong>Amount ($000)</strong></p>
</td>
<td style="background-color: #cccccc; width: 29px;" valign="bottom"></td>
</tr>
<tr>
<td style="width: 318px;" valign="bottom">Total Federal Paperwork Burden (non-tax)</td>
<td style="width: 294px;" colspan="4" valign="bottom">
<p align="center">$56,995,038</p>
</td>
<td style="width: 29px;" valign="bottom"><a name="_ednref113"></a>[112]</td>
</tr>
<tr>
<td style="width: 318px;" valign="bottom">Total Federal Other Regulatory Burden</td>
<td style="width: 294px;" colspan="4" valign="bottom">
<p align="center">$331,791,551</p>
</td>
<td style="width: 29px;" valign="bottom"><a name="_ednref114"></a>[113]</td>
</tr>
<tr>
<td style="width: 318px;" valign="bottom">Total Federal Fines and Penalties</td>
<td style="width: 294px;" colspan="4" valign="bottom">
<p align="center">$5,006,622</p>
</td>
<td style="width: 29px;" valign="bottom"><a name="_ednref115"></a>[114]</td>
</tr>
<tr>
<td style="width: 318px;" valign="bottom"></td>
<td style="width: 92px;" valign="bottom"></td>
<td style="width: 92px;" valign="bottom"></td>
<td style="width: 110px;" colspan="2" valign="bottom"></td>
<td style="width: 29px;" valign="bottom"></td>
</tr>
<tr>
<td style="width: 318px;" valign="bottom">Total State and Local Paperwork Burden (non-tax)</td>
<td style="width: 294px;" colspan="4" valign="bottom">
<p align="center">$32,059,709</p>
</td>
<td style="width: 29px;" valign="bottom"><a name="_ednref116"></a>[115]</td>
</tr>
<tr>
<td style="width: 318px;" valign="bottom">Total State and Local Other Regulatory Burden</td>
<td style="width: 294px;" colspan="4" valign="bottom">
<p align="center">$186,632,748</p>
</td>
<td style="width: 29px;" valign="bottom"></td>
</tr>
<tr>
<td style="width: 318px;" valign="bottom">Total State and Local Fines and Penalties</td>
<td style="width: 294px;" colspan="4" valign="bottom">
<p align="center">$2,816,225</p>
</td>
<td style="width: 29px;" valign="bottom"></td>
</tr>
<tr>
<td style="width: 318px;" valign="bottom"></td>
<td style="width: 92px;" valign="bottom"></td>
<td style="width: 100px;" colspan="2" valign="bottom"></td>
<td style="width: 102px;" valign="bottom"></td>
<td style="width: 29px;" valign="bottom"></td>
</tr>
<tr>
<td style="width: 318px;" valign="bottom"></td>
<td style="width: 92px;" valign="bottom">
<p align="center"><strong>Low</strong></p>
</td>
<td style="width: 100px;" colspan="2" valign="bottom">
<p align="center"><strong>Med</strong></p>
</td>
<td style="width: 102px;" valign="bottom">
<p align="center"><strong>High</strong></p>
</td>
<td style="width: 29px;" valign="bottom"></td>
</tr>
<tr>
<td style="width: 318px;" valign="bottom">Improvements Due to Better Information</td>
<td style="width: 92px;" valign="bottom">
<p align="right">7.5%</p>
</td>
<td style="width: 100px;" colspan="2" valign="bottom">
<p align="right">15.0%</p>
</td>
<td style="width: 102px;" valign="bottom">
<p align="right">35.0%</p>
</td>
<td style="width: 29px;" valign="bottom"></td>
</tr>
<tr>
<td style="width: 318px;" valign="bottom"></td>
<td style="width: 92px;" valign="bottom"></td>
<td style="width: 100px;" colspan="2" valign="bottom"></td>
<td style="width: 102px;" valign="bottom"></td>
<td style="width: 29px;" valign="bottom"></td>
</tr>
<tr>
<td style="width: 318px;" valign="bottom"><strong><span style="text-decoration: underline;">Paperwork Burdens </span></strong><span style="text-decoration: underline;">(non-tax)</span></td>
<td style="width: 92px;" valign="bottom"></td>
<td style="width: 100px;" colspan="2" valign="bottom"></td>
<td style="width: 102px;" valign="bottom"></td>
<td style="width: 29px;" valign="bottom"></td>
</tr>
<tr>
<td style="width: 318px;" valign="bottom">Benefits per Large Firm</td>
<td style="width: 92px;" valign="bottom">
<p align="right">$1,957</p>
</td>
<td style="width: 100px;" colspan="2" valign="bottom">
<p align="right">$3,915</p>
</td>
<td style="width: 102px;" valign="bottom">
<p align="right">$9,134</p>
</td>
<td style="width: 29px;" valign="bottom"><a name="_ednref117"></a>[116]</td>
</tr>
<tr>
<td style="width: 318px;" valign="bottom">Benefits – All Firms</td>
<td style="width: 92px;" valign="bottom">
<p align="right">$6,679,106</p>
</td>
<td style="width: 100px;" colspan="2" valign="bottom">
<p align="right">$13,358,212</p>
</td>
<td style="width: 102px;" valign="bottom">
<p align="right">$31,169,161</p>
</td>
<td style="width: 29px;" valign="bottom"></td>
</tr>
<tr>
<td style="width: 318px;" valign="bottom"></td>
<td style="width: 92px;" valign="bottom"></td>
<td style="width: 100px;" colspan="2" valign="bottom"></td>
<td style="width: 102px;" valign="bottom"></td>
<td style="width: 29px;" valign="bottom"></td>
</tr>
<tr>
<td style="width: 318px;" valign="bottom"><strong><span style="text-decoration: underline;">Other Regulatory Burdens</span></strong></td>
<td style="width: 92px;" valign="bottom"></td>
<td style="width: 100px;" colspan="2" valign="bottom"></td>
<td style="width: 102px;" valign="bottom"></td>
<td style="width: 29px;" valign="bottom"></td>
</tr>
<tr>
<td style="width: 318px;" valign="bottom">Benefits per Large Firm</td>
<td style="width: 92px;" valign="bottom">
<p align="right">$11,394</p>
</td>
<td style="width: 100px;" colspan="2" valign="bottom">
<p align="right">$22,788</p>
</td>
<td style="width: 102px;" valign="bottom">
<p align="right">$53,172</p>
</td>
<td style="width: 29px;" valign="bottom"></td>
</tr>
<tr>
<td style="width: 318px;" valign="bottom">Benefits – All Firms</td>
<td style="width: 92px;" valign="bottom">
<p align="right">$38,881,822</p>
</td>
<td style="width: 100px;" colspan="2" valign="bottom">
<p align="right">$77,763,645</p>
</td>
<td style="width: 102px;" valign="bottom">
<p align="right">$181,448,505</p>
</td>
<td style="width: 29px;" valign="bottom"></td>
</tr>
<tr>
<td style="width: 318px;" valign="bottom"></td>
<td style="width: 92px;" valign="bottom"></td>
<td style="width: 100px;" colspan="2" valign="bottom"></td>
<td style="width: 102px;" valign="bottom"></td>
<td style="width: 29px;" valign="bottom"></td>
</tr>
<tr>
<td style="width: 318px;" valign="bottom"><strong><span style="text-decoration: underline;">Reductions in Fines and Penalties</span></strong></td>
<td style="width: 92px;" valign="bottom"></td>
<td style="width: 100px;" colspan="2" valign="bottom"></td>
<td style="width: 102px;" valign="bottom"></td>
<td style="width: 29px;" valign="bottom"></td>
</tr>
<tr>
<td style="width: 318px;" valign="bottom">Benefits per Large Firm</td>
<td style="width: 92px;" valign="bottom">
<p align="right">$4,212</p>
</td>
<td style="width: 100px;" colspan="2" valign="bottom">
<p align="right">$8,424</p>
</td>
<td style="width: 102px;" valign="bottom">
<p align="right">$19,655</p>
</td>
<td style="width: 29px;" valign="bottom"></td>
</tr>
<tr>
<td style="width: 318px;" valign="bottom">Benefits – All Firms</td>
<td style="width: 92px;" valign="bottom">
<p align="right">$14,372,953</p>
</td>
<td style="width: 100px;" colspan="2" valign="bottom">
<p align="right">$28,745,905</p>
</td>
<td style="width: 102px;" valign="bottom">
<p align="right">$67,073,779</p>
</td>
<td style="width: 29px;" valign="bottom"></td>
</tr>
<tr>
<td style="width: 318px;" valign="bottom"></td>
<td style="width: 92px;" valign="bottom"></td>
<td style="width: 100px;" colspan="2" valign="bottom"></td>
<td style="width: 102px;" valign="bottom"></td>
<td style="width: 29px;" valign="bottom"></td>
</tr>
<tr>
<td style="width: 318px;" valign="bottom"><strong><span style="text-decoration: underline;">TOTAL – All Regulatory Burdens</span></strong></td>
<td style="width: 92px;" valign="bottom"></td>
<td style="width: 100px;" colspan="2" valign="bottom"></td>
<td style="width: 102px;" valign="bottom"></td>
<td style="width: 29px;" valign="bottom"></td>
</tr>
<tr>
<td style="width: 318px;" valign="bottom">Benefits per Large Firm</td>
<td style="width: 92px;" valign="bottom">
<p align="right">$17,563</p>
</td>
<td style="width: 100px;" colspan="2" valign="bottom">
<p align="right">$35,126</p>
</td>
<td style="width: 102px;" valign="bottom">
<p align="right">$81,962</p>
</td>
<td style="width: 29px;" valign="bottom"></td>
</tr>
<tr>
<td style="width: 318px;" valign="bottom">Benefits – All Firms</td>
<td style="width: 92px;" valign="bottom">
<p align="right">$59,933,881</p>
</td>
<td style="width: 100px;" colspan="2" valign="bottom">
<p align="right">$119,867,762</p>
</td>
<td style="width: 102px;" valign="bottom">
<p align="right">$279,691,445</p>
</td>
<td style="width: 29px;" valign="bottom"></td>
</tr>
</tbody>
</table>
<p style="text-align: center;">Table 19. Regulatory Burden and Benefits to Firms from Improved Information</p>
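<p>The ‘Benefits – All Firms’ rows for paperwork and other regulatory burdens in Table 19 follow from multiplying the combined federal plus state and local burden by each improvement percentage. The sketch below reproduces that arithmetic; the names are illustrative. The fines and per-large-firm rows appear to rest on additional assumptions documented in the endnotes and are not reconstructed here.</p>

```python
# Sketch of the Table 19 arithmetic (amounts in $ thousands).
IMPROVEMENT = {"low": 0.075, "med": 0.15, "high": 0.35}

# Combined burden = federal + state and local, taken from the top of Table 19.
BURDENS = {
    "paperwork (non-tax)": 56_995_038 + 32_059_709,
    "other regulatory": 331_791_551 + 186_632_748,
}


def benefits(burden):
    """Return the low/med/high benefit for one combined burden figure."""
    return {case: round(burden * rate) for case, rate in IMPROVEMENT.items()}


for name, burden in BURDENS.items():
    row = benefits(burden)
    print(f"{name}: " + ", ".join(f"{c} ${v:,}" for c, v in row.items()))
```

<p>Under these inputs the computed values match the table's ‘Benefits – All Firms’ entries for both categories.</p>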
<p>As another example, the Gramm-Leach-Bliley Act (GLBA) of 1999 requires the financial industry to create guidelines for the safeguarding of customer information. GLBA includes severe civil and criminal penalties for non-compliance: civil penalties run up to $100,000 for each violation, and key officers may be fined up to $10,000 per violation. Violation of the GLBA can also carry hefty sanctions, including termination of FDIC insurance and fines of up to $1,000,000 for an individual or one percent of the total assets of the financial institution.<a name="_ednref118"></a>[117]</p>
<p>Other major areas of unauthorized disclosure liability occur in national security, identity theft, and commerce, tax and Social Security information. Indeed, virtually every state and federal agency related to a company’s business has policies and fines regarding unauthorized disclosures. Monitoring these requirements is thus an imperative for enterprise management to prevent exposure to fines and loss of reputation.</p>
<p>Less quantifiably, there are also risks to the clarity of the enterprise message to customers, suppliers and partners. Unmanaged Web sprawl is a critical gap that enterprises must close to ensure compliance with privacy and confidentiality regulations, and to promote a clear and accurate message to stakeholders.</p>
<h1><a name="_Toc106767226"></a>V. CONCLUSIONS</h1>
<p>Prior to the analysis in this white paper, the state of understanding about the value of document assets had been abysmal. While still preliminary and subject to much improvement, this study has nonetheless found:</p>
<ul>
<li>The value of documents  –  in their creation, access and use  –  can indeed be measured</li>
<li>The information contained within U.S. enterprise documents represents about a third of gross domestic product, or an amount of about <em><span style="text-decoration: underline;">$3.3 trillion</span></em> annually</li>
<li>Some 25% of all of these expenditures lend themselves to actionable improvements</li>
<li>There are perhaps on the order of 10 billion documents created annually in the U.S.</li>
<li>Corporate data doubles every six to eight months; 85% of this data is contained in documents</li>
<li>Ninety to 97 percent of enterprises cannot estimate how much they spend on producing documents each year</li>
<li>Document creation is about 2-3 times more important  –  from an embedded cost standpoint  –  than document handling</li>
<li>It costs, on average, $350 to create a ‘typical’ document</li>
<li>The total potential benefit from practical improvements in document access and use to the U.S. economy is on the order of $800 billion annually, or about 8% of GDP</li>
<li>For the 1,000 largest U.S. firms, benefits from these improvements can approach $250 million annually per firm</li>
<li>About three-quarters of these benefits arise from <strong><em><span style="text-decoration: underline;">not</span></em></strong> re-creating the intellectual capital already invested in prior document creation</li>
<li>Another 25% of the benefits are due to reduced regulatory non-compliance or paperwork, or better competitiveness in obtaining solicited contracts and grants</li>
<li>$33 billion is wasted each year in re-finding previously found Web documents</li>
<li>Paperwork and regulatory improvements due to documents can save U.S. enterprises $120 billion each year</li>
<li>Lack of document access due to Web sprawl costs U.S. enterprises $22 billion each year</li>
<li>$8 billion in annual benefits is available due to document improvements for competitive governmental grant and contract solicitations</li>
<li>These figures likely severely underestimate the benefits to enterprises from improved competitiveness, a factor not analyzed in this study</li>
<li>Documents are now where structured data was 15 years ago, at the nascent emergence of the data warehousing market.</li>
</ul>
<p style="text-align: left;">As noted throughout, there is a considerable need for additional research and data on document creation, use, costs and benefits. Additional technical endnotes are provided in the PDF version of the full paper.</p>
<p style="text-align: left;">
<hr style="border: 1px solid #cccccc; height: 1px; width: 33%; color: #ffffff;" size="1" noshade="noshade" />
<p style="text-align: left;"><a name="_edn1"></a><span style="font-size: x-small;">[1] All sources and assumptions are fully documented in footnotes in the main body of this white paper; general assumptions used in multiple tables are provided in the Technical Endnotes.</span></p>
<p><span style="font-size: x-small;"><a name="_edn2"></a>[2] As quoted by Armando Garcia, vice president of content management at IBM; see <a href="http://www.contentworld.com/conference/conthur.html">http://www.contentworld.com/conference/conthur.html</a></span></p>
<p><span style="font-size: x-small;"><a name="_edn3"></a> [3] Delphi Group, “Taxonomy &amp; Content Classification Market Milestone Report,” <em>Delphi Group White Paper</em>, 2002. See <a href="http://delphigroup.com/">http://delphigroup.com</a>.</span></p>
<p><span style="font-size: x-small;"><a name="_edn4"></a> [4] Based on the 1999 to 2001 estimate changes in reference 34, Table 2-6.</span></p>
<p><span style="font-size: x-small;"><a name="_edn5"></a>[5] As initially published in Inc Magazine in 1993. Reference to this document may be found at: <a href="http://www.contingencyplanning.com/PastIssues/marapr2001/6.asp">http://www.contingencyplanning.com/PastIssues/marapr2001/6.asp</a></span></p>
<p><span style="font-size: x-small;"><a name="_edn6"></a>[6] J. Snowdon, <em>Documents </em> – <em> The Lifeblood of Your Business?</em>, October 2003, 12 pp. The white paper may be found at: <a href="http://www.mdy.com/News&amp;Events/Newsletter/IDCDocMgmt.pdf">http://www.mdy.com/News&amp;Events/Newsletter/IDCDocMgmt.pdf</a></span></p>
<p><span style="font-size: x-small;"><a name="_edn7"></a>[7] Xerox Global Services, <em>Documents – An Opportunity for Cost Control and Business Transformation</em>, 28 pp., 2003. The findings may be found at: <a href="http://www.sap.com/solutions/srm/pdf/CCS_Xerox.pdf">http://www.sap.com/solutions/srm/pdf/CCS_Xerox.pdf</a></span></p>
<p><span style="font-size: x-small;"><a name="_edn8"></a>[8] A.T. Kearney, <em>Network Publishing: Creating Value Through Digital Content</em>, A.T. Kearney White Paper, April 2001, 32 pp. See <a href="http://www.adobe.com/aboutadobe/pressroom/pressmaterials/networkpublishing/pdfs/netpubwh.pdf">http://www.adobe.com/aboutadobe/pressroom/pressmaterials/networkpublishing/pdfs/netpubwh.pdf</a>.</span></p>
<p><span style="font-size: x-small;"><a name="_edn9"></a>[9] S.A. Mohrman and D.L. Finegold, <em>Strategies for the Knowledge Economy: From Rhetoric to Reality</em>, University of Southern California study as supported by Korn/Ferry International, January 2000, 43 pp. See <a href="http://www.marshall.usc.edu/ceo/Books/pdf/knowledge_economy.pdf">http://www.marshall.usc.edu/ceo/Books/pdf/knowledge_economy.pdf</a>.</span></p>
<p><span style="font-size: x-small;"><a name="_edn10"></a>[10] C. Moore, <em>The Content Integration Imperative</em>, Forrester Research Trends Report, March 26, 2004, 14 pp.</span></p>
<p><span style="font-size: x-small;"><a name="_edn11"></a>[11] D. Vesset, <em>Worldwide Business Intelligence Forecast and Analysis, 2003-2007</em>, International Data Corporation, June 2003, 18 pp. See <a href="http://www.dwway.com/file/20030708085453_IDC_WW-BIFORECASTANDANALYSIS2003-07_JUN03.pdf">http://www.dwway.com/file/20030708085453_IDC_WW-BIFORECASTANDANALYSIS2003-07_JUN03.pdf</a>.</span></p>
<p><span style="font-size: x-small;"><a name="_edn12"></a>[12] M. Stonebraker and J. Hellerstein, “Content Integration for E-Business,” in <em>ACM SIGMOD Proceedings</em>, Santa Barbara, CA, pp. 552-560, May 2001.</span></p>
<p><span style="font-size: x-small;"><a name="_edn13"></a>[13] P. Lyman and H. Varian, “How Much Information, 2003,” retrieved from <a href="http://www.sims.berkeley.edu/how-much-info-2003">http://www.sims.berkeley.edu/how-much-info-2003</a> on December 1, 2003.</span></p>
<p><span style="font-size: x-small;"><a name="_edn14"></a>[14] U.S. Department of Commerce, Digital Economy 2003, Economic Statistics Administration, U.S. Dept. of Commerce, Washington, D.C., April 2004, 155 pp. See <a href="http://www.esa.doc.gov/DigitalEconomy2003.cfm">http://www.esa.doc.gov/DigitalEconomy2003.cfm</a>.</span></p>
<p><span style="font-size: x-small;"><a name="_edn15"></a>[15] U.S. Department of Labor, “Occupation Employment and Wages, 2002,” Bureau of Labor Statistics. See <a href="http://www.bls.gov/news.release/archives/ocwage_11192003.pdf">http://www.bls.gov/news.release/archives/ocwage_11192003.pdf</a>.</span></p>
<p><span style="font-size: x-small;"><a name="_edn16"></a>[16] U.S. Census Bureau, “Statistics of U.S. Businesses 2001.” See <a href="http://www.census.gov/epcd/susb/2001/us/US--.htm">http://www.census.gov/epcd/susb/2001/us/US--.htm</a>.</span></p>
<p><span style="font-size: x-small;"><a name="_edn17"></a>[17] Total office document counts were obtained on a page basis from reference 13, which used a value of 2% for the share of documents that deserve to be archived. This formed the ‘low’ case, with the high case using a 5% estimate (lower still than the ENST 10% estimate cited in reference 13). Total pages were converted to numbers of documents on an average basis of 8 pp per document; see Technical Endnotes for further discussion.</span></p>
<p><span style="font-size: x-small;"><a name="_edn18"></a>[18] See Technical Endnotes for the derivation of knowledge worker estimates.</span></p>
<p><span style="font-size: x-small;"><a name="_edn19"></a>[19] See Technical Endnotes for the derivation of content worker estimates.</span></p>
<p><span style="font-size: x-small;"><a name="_edn20"></a>[20] Citation sources and assumptions for this analysis are presented in the BrightPlanet white paper, “A Cure to IT Indigestion: Deep Content Federation,” BrightPlanet Corporation White Paper, June 2004, 31 pp.</span></p>
<p><span style="font-size: x-small;"><a name="_edn21"></a>[21] The “bottom up” cases are built from the number of assumed knowledge workers in Table 3. The “low” and “high” variants are based on a 5% archival value or 350 annual documents created per worker, respectively, applied to worker staff costs associated with document creation. The “Coopers &amp; Lybrand” case is a strict updating of that study to 2002. The other two “C&amp;L” cases use the updated per document costs from the C&amp;L study; the first variant uses the annual documents created from the UC Berkeley study without archiving; the second variant uses the average of the “low” and “high” document numbers. See further Technical Endnotes for other key assumptions.</span></p>
<p><span style="font-size: x-small;"><a name="_edn22"></a>[22] The individual values in Table 5 range from about $140 to $740 per document, with the update of the Coopers &amp; Lybrand study being about $270. Separate Delphi analysis by BrightPlanet has shown median values of about $550 per document.</span></p>
<p><span style="font-size: x-small;"><a name="_edn23"></a>[23] See <a href="http://www.eds.com/services_offerings/ibill_openbill_b2b.shtml">http://www.eds.com/services_offerings/ibill_openbill_b2b.shtml</a>.</span></p>
<p><span style="font-size: x-small;"><a name="_edn24"></a>[24] See <a href="http://www.hsh.com/cfee-sample.html">http://www.hsh.com/cfee-sample.html</a>.</span></p>
<p><span style="font-size: x-small;"><a name="_edn25"></a>[25] See <a href="http://www.atp.nist.gov/eao/applicants/section9.htm">http://www.atp.nist.gov/eao/applicants/section9.htm</a>.</span></p>
<p><span style="font-size: x-small;"><a name="_edn26"></a>[26] As initially published in Inc Magazine in 1993. Reference to this document may be found at: <a href="http://www.contingencyplanning.com/PastIssues/marapr2001/6.asp">http://www.contingencyplanning.com/PastIssues/marapr2001/6.asp</a></span></p>
<p><span style="font-size: x-small;"><a name="_edn27"></a>[27] Xerox Global Services, Documents – An Opportunity for Cost Control and Business Transformation, 28 pp., 2003. The findings may be found at: <a href="http://www.sap.com/solutions/srm/pdf/CCS_Xerox.pdf">http://www.sap.com/solutions/srm/pdf/CCS_Xerox.pdf</a> and J. Snowdon, Documents  –  The Lifeblood of Your Business?, October 2003, 12 pp. The white paper may be found at: <a href="http://www.mdy.com/News&amp;Events/Newsletter/IDCDocMgmt.pdf">http://www.mdy.com/News&amp;Events/Newsletter/IDCDocMgmt.pdf</a></span></p>
<p><span style="font-size: x-small;"><a name="_edn28"></a> [28] Optika Corporation. See <a href="http://www.optika.com/ROI/calculator/ROI_roiresults.cfm">http://www.optika.com/ROI/calculator/ROI_roiresults.cfm</a>.</span></p>
<p><span style="font-size: x-small;"><a name="_edn29"></a>[29] Cap Ventures information, as cited in ZyLAB Technologies B.V., “Know the Cost of Filing Your Paper Documents,” Zylab White Paper, 2001. See <a href="http://www.zylab.com/downloads/whitepapers/PDF/21%20-%20Know%20the%20cost%20of%20filing%20your%20paper%20documents.pdf">http://www.zylab.com/downloads/whitepapers/PDF/21%20-%20Know%20the%20cost%20of%20filing%20your%20paper%20documents.pdf</a>.</span></p>
<p><span style="font-size: x-small;"><a name="_edn30"></a>[30] ALL Associates Group, Inc., EDAM Sector Summary, April 2003, 2 pp.</span></p>
<p><span style="font-size: x-small;"><a name="_edn31"></a>[31] ALL Associates Group, 2002 EDAM Metrics for Major U.S. Companies.</span></p>
<p><span style="font-size: x-small;"><a name="_edn32"></a>[32] By the second quarter of 2004, this amount was $11.6 trillion. U.S. Federal Reserve Board, Flow of Funds Accounts for the United States, Sept. 16, 2004. See <a href="http://www.federalreserve.gov/releases/Z1/current/accessible/f6.htm">http://www.federalreserve.gov/releases/Z1/current/accessible/f6.htm</a>.</span></p>
<p><span style="font-size: x-small;"><a name="_edn33"></a>[33] The bases for this table have the following assumptions: 1) the three cases for document handling are based on 5%, 10% and 15% of total enterprise revenues, per the earlier section; 2) the three cases for document creation are based on the ‘C&amp;L Bottom-Up’, ‘Bottom-up  – High,’ and ‘Coopers &amp; Lybrand’ items for the Low, Medium, and High columns, respectively, in Table 5; and 3) the document misfiling case draws on the same basis but using the total document estimates and misfiled percentages of 5%, 7.5% and 9% consistent with the previous discussion section. See further the Technical Endnotes.</span></p>
<p><span style="font-size: x-small;"><a name="_edn34"></a>[34] P. Lyman and H. Varian, “How Much Information, 2003,” retrieved from <a href="http://www.sims.berkeley.edu/how-much-info-2003">http://www.sims.berkeley.edu/how-much-info-2003</a> on December 1, 2003.</span></p>
<p><span style="font-size: x-small;"><a name="_edn35"></a>[35] Cap Ventures information, as cited in ZyLAB Technologies B.V., “Know the Cost of Filing Your Paper Documents,” Zylab White Paper, 2001. See <a href="http://www.zylab.com/downloads/whitepapers/PDF/21%20-%20Know%20the%20cost%20of%20filing%20your%20paper%20documents.pdf">http://www.zylab.com/downloads/whitepapers/PDF/21%20-%20Know%20the%20cost%20of%20filing%20your%20paper%20documents.pdf</a>.</span></p>
<p><span style="font-size: x-small;"><a name="_edn36"></a>[36] As reported in <a href="http://www.hoovers.com/company/archive/detail/0,2049,7_2322,00.html">http://www.hoovers.com/company/archive/detail/0,2049,7_2322,00.html</a>.</span></p>
<p><span style="font-size: x-small;"><a name="_edn37"></a>[37] See <a href="http://www.veronissuhler.com/businfo/segment.html">http://www.veronissuhler.com/businfo/segment.html</a>, August 2, 2000.</span></p>
<p><span style="font-size: x-small;"><a name="_edn38"></a>[38] See <a href="http://www.outsellinc.com/docs/pr_release/pr20000602_01.htm">http://www.outsellinc.com/docs/pr_release/pr20000602_01.htm</a>, June 2, 2000.</span></p>
<p><span style="font-size: x-small;"><a name="_edn39"></a>[39] See <a href="http://www.outsellinc.com/docs/pr_release/pr20000629_01.htm">http://www.outsellinc.com/docs/pr_release/pr20000629_01.htm</a>.</span></p>
<p><span style="font-size: x-small;"><a name="_edn40"></a>[40] M.K. Bergman, “The Deep Web: Surfacing Hidden Value,” BrightPlanet Corporation White Paper, June 2000. The most recent version of the study was published by the University of Michigan’s Journal of Electronic Publishing in July 2001. See <a href="http://www.press.umich.edu/jep/07-01/bergman.html">http://www.press.umich.edu/jep/07-01/bergman.html</a>.</span></p>
<p><span style="font-size: x-small;"><a name="_edn41"></a>[41] This analysis assumes there were 1 million documents on the Web as of mid-1994.</span></p>
<p><span style="font-size: x-small;"><a name="_edn42"></a>[42] See, for example, C. Sherman and G. Price, The Invisible Web, Information Today, Inc., Medford, NJ, 2001, 439 pp., and P. Pedley, The Invisible Web: Searching the Hidden Parts of the Internet, Aslib-IMI, London, 2001, 138pp.</span></p>
<p><span style="font-size: x-small;">[43] iProspect Corporation, iProspect Search Engine User Attitudes, April/May 2004, 28 pp. See <a href="http://www.iprospect.com/premiumPDFs/iProspectSurveyComplete.pdf">http://www.iprospect.com/premiumPDFs/iProspectSurveyComplete.pdf</a>.</span></p>
<p><span style="font-size: x-small;"><a name="_edn45"></a>[44] As reported at <a href="http://www.nua.ie/surveys/index.cgi?f=VS&amp;art_id=905358569&amp;rel=true">http://www.nua.ie/surveys/index.cgi?f=VS&amp;art_id=905358569&amp;rel=true</a>.</span></p>
<p><span style="font-size: x-small;"><a name="_edn46"></a>[45] Delphi Group, “Taxonomy &amp; Content Classification Market Milestone Report,” Delphi Group White Paper, 2002. See <a href="http://delphigroup.com/">http://delphigroup.com</a>.</span></p>
<p><span style="font-size: x-small;"><a name="_edn47"></a>[46] C. Sherman and S. Feldman, “The High Cost of Not Finding Information,” International Data Corporation Report #29127, 11 pp., April 2003.</span></p>
<p><span style="font-size: x-small;"><a name="_edn48"></a>[47] M.E.D. Koenig, “Time Saved  –  a Misleading Justification for KM,” KMWorld Magazine, Vol 11, Issue 5, May 2002. See <a href="http://www.kmworld.com/publications/magazine/index.cfm">http://www.kmworld.com/publications/magazine/index.cfm</a>.</span></p>
<p><span style="font-size: x-small;"><a name="_edn49"></a>[48] G. Xu, A. Cockburn and B. McKenzie, Lost on the Web: An Introduction to Web Navigation Research, <a href="http://www.cosc.canterbury.ac.nzq/ACMchapterq/NZCSPGq/papers">http://www.cosc.canterbury.ac.nzq/ACMchapterq/NZCSPGq/papers</a>.</span></p>
<p><span style="font-size: x-small;"><a name="_edn50"></a>[49] A. Cockburn and B. McKenzie, What Do Web Users Do? An Empirical Analysis of Web Use, 2000. See <a href="http://citeseer.ist.psu.edu/cockburn00what.html">http://citeseer.ist.psu.edu/cockburn00what.html</a>.</span></p>
<p><span style="font-size: x-small;"><a name="_edn51"></a>[50] Tenth edition of GVU’s (graphics, visualization and usability) WWW User Survey, May 14, 1999. See <a href="http://www.gvu.gatech.edu/user_surveys/survey-1998-10/tenthreport.html">http://www.gvu.gatech.edu/user_surveys/survey-1998-10/tenthreport.html</a>.</span></p>
<p><span style="font-size: x-small;"><a name="_edn52"></a>[51] C. Alvarado, J. Teevan, M. S. Ackerman and D. Karger, “Surviving the Information Explosion: How People Find Their Electronic Information,” AI Memo 2003-06, April 2003, 11 pp., Massachusetts Institute of Technology, Computer Science and Artificial Intelligence Laboratory. See <a href="ftp://publications.ai.mit.edu/ai-publications/2003/AIM-2003-006.pdf">ftp://publications.ai.mit.edu/ai-publications/2003/AIM-2003-006.pdf</a>.</span></p>
<p><span style="font-size: x-small;"><a name="_edn53"></a>[52] W. Jones, H. Bruce and S. Dumais, “Keeping Found Things Found on the Web,” See <a href="http://washington.edu/KFTF_Web.pdf">http://washington.edu/KFTF_Web.pdf</a>.</span></p>
<p><span style="font-size: x-small;"><a name="_edn54"></a>[53] J. Teevan, “How People Re-find Information When the Web Changes,” AI Memo 2004-014, June 2004, 10 pp., Massachusetts Institute of Technology, Computer Science and Artificial Intelligence Laboratory. See <a href="ftp://publications.ai.mit.edu/ai-publications/2004/AIM-2004-012.pdf">ftp://publications.ai.mit.edu/ai-publications/2004/AIM-2004-012.pdf</a>.</span></p>
<p><span style="font-size: x-small;"><a name="_edn55"></a>[54] Library of Congress, “Preserving Our Digital Heritage: Plan for the National Digital Information Infrastructure and Preservation Program”, a Report to Congress by the U.S. Library of Congress, 2002, 66 pp. See <a href="http://www.digitalpreservation.gov/ndiipp/">http://www.digitalpreservation.gov/ndiipp/</a>.</span></p>
<p><span style="font-size: x-small;"><a name="_edn56"></a>[55] Consistent with Table 8; this analysis also assumes the 25% search time commitment by employee and previous values from earlier tables.</span></p>
<p><span style="font-size: x-small;"><a name="_edn57"></a>[56] All subsequent references to ‘Large’ firms are based on the last column in Table 2, namely the 930 U.S. firms with more than 10,000 employees.</span></p>
<p><span style="font-size: x-small;"><a name="_edn58"></a>[57] Delphi Group, “Taxonomy &amp; Content Classification Market Milestone Report,” Delphi Group White Paper, 2002. See <a href="http://delphigroup.com/">http://delphigroup.com</a>.</span></p>
<p><span style="font-size: x-small;"><a name="_edn59"></a>[58] S. Stearns, “Realize the Value Locked in Your Content Silos Without Breaking the Bank: Automated Classification Tools to Improve Information Discovery,” Inmagic White Paper, version 1.0, 2004. 10 pp. See <a href="http://www.inmagic.com/">http://www.inmagic.com</a>.</span></p>
<p><span style="font-size: x-small;"><a name="_edn60"></a>[59] P. Sonderegger, “Weave Search into the Browsing Experience,” ForresterQuick Take, Forrester Research, Inc., Feb. 18, 2004. 2 pp.</span></p>
<p><span style="font-size: x-small;"><a name="_edn61"></a> [60] P. Russom, “An Eye for the Needle,” Intelligent Enterprise, January 14, 2002. See <a href="http://www.iemagazine.com/020114/502feat2_1">http://www.iemagazine.com/020114/502feat2_1</a>.</span></p>
<p><span style="font-size: x-small;"><a name="_edn62"></a>[61] This average was estimated by interpolating figures shown on Figure 8 in reference 68.</span></p>
<p><span style="font-size: x-small;"><a name="_edn63"></a>[62] This average was estimated by interpolating figures shown on the p.14 figure in Plumtree Corporation, “The Corporate Portal Market in 2002,” Plumtree Corp. White Paper, 27 pp. See <a href="http://www.plumtree.com/pdf/Corporate_Portal_Survey_White_Paper_February2002.pdf">http://www.plumtree.com/pdf/Corporate_Portal_Survey_White_Paper_February2002.pdf</a>.</span></p>
<p><span style="font-size: x-small;"><a name="_edn64"></a>[63] The ‘low’ case represents the archival value in the middle bars with the addition that 30% of internal documents generated in the current year have a value to be shared for one year; the ‘high’ case represents the related archival value in the middle bars but with 40% of documents generated in that year having a value to be shared for one year.</span></p>
<p><span style="font-size: x-small;"><a name="_edn65"></a>[64] Analysis based on reference 68, with interpolations from Figure 16.</span></p>
<p><span style="font-size: x-small;"><a name="_edn66"></a>[65] M. Corcoran, “When Worlds Collide: Who Really Owns the Content,” AIIM Conference, New York, NY, March 10, 2004. See <a href="http://show.aiimexpo.com/convdata/aiim2003/brochures/64CorcoranMary.pdf">http://show.aiimexpo.com/convdata/aiim2003/brochures/64CorcoranMary.pdf</a>.</span></p>
<p><span style="font-size: x-small;"><a name="_edn67"></a>[66] C. Phillips, “Stemming the Software Spending Spree,” Optimize Magazine, April 2002, Issue 6. See <a href="http://www.optimizemag.com/article/showArticle.jhtml?articleId=17700698&amp;pgno=1">http://www.optimizemag.com/article/showArticle.jhtml?articleId=17700698&amp;pgno=1</a>.</span></p>
<p><span style="font-size: x-small;"><a name="_edn68"></a>[67] C. Moore, “The Content Integration Imperative,” Forrester Research, Inc., March 26, 2004, 14 pp.</span></p>
<p><span style="font-size: x-small;"><a name="_edn69"></a> [68] Plumtree Corporation, “The Corporate Portal Market in 2003,” Plumtree Corp. White Paper, 30 pp. See <a href="http://www.plumtree.com/portalmarket2003/default.asp">http://www.plumtree.com/portalmarket2003/default.asp</a>.</span></p>
<p><span style="font-size: x-small;"><a name="_edn70"></a> [69] BEA Corporation, “Enterprise Portal Rationalization,” BEA Technical White Paper, 23 pp., 2004. See <a href="http://www.bea.com/content/news_events/white_papers/BEA_epr_wp.pdf">http://www.bea.com/content/news_events/white_papers/BEA_epr_wp.pdf</a>.</span></p>
<p><span style="font-size: x-small;"><a name="_edn71"></a>[70] A. Aneja, C. Rowan and B. Brooksby, “Corporate Portal Framework for Transforming Content Chaos on Intranets,” Intel Technology Journal Q1, 2000. See <a href="http://developer.intel.com/technology/itj/q12000/pdf/portal.pdf">http://developer.intel.com/technology/itj/q12000/pdf/portal.pdf</a>.</span></p>
<p><span style="font-size: x-small;"><a name="_edn72"></a> [71] J. Smeaton, “IBM’s Own Intranet: Saving Big Blue Millions,” Intranet Journal, Sept. 25, 2002. See <a href="http://www.intranetjournal.com/articles/200209/ij_09_25_02a.html">http://www.intranetjournal.com/articles/200209/ij_09_25_02a.html</a>.</span></p>
<p><span style="font-size: x-small;"><a name="_edn73"></a> [72] See <a href="http://www.wookieweb.com/Intranet/">http://www.wookieweb.com/Intranet/</a>.</span></p>
<p><span style="font-size: x-small;"><a name="_edn74"></a> [73] D. Voth, “Why Enterprise Portals are the Next Big Thing,” LTI Magazine, October 1, 2002. See <a href="http://www.ltimagazine.com/ltimagazine/article/articleDetail.jsp?id=36877">http://www.ltimagazine.com/ltimagazine/article/articleDetail.jsp?id=36877</a>.</span></p>
<p><span style="font-size: x-small;"><a name="_edn75"></a> [74] A. Nyberg, “Is Everybody Happy?” CFO Magazine, November 01, 2002. See <a href="http://www.cfo.com/article/1%2C5309%2C8062%2C00.html">http://www.cfo.com/article/1%2C5309%2C8062%2C00.html</a>.</span></p>
<p><span style="font-size: x-small;"><a name="_edn76"></a> [75] See <a href="http://www.proudfoot-plc.com/pdf_20004-USPR1002Avayaweb.asp">http://www.proudfoot-plc.com/pdf_20004-USPR1002Avayaweb.asp</a>.</span></p>
<p><span style="font-size: x-small;"><a name="_edn77"></a> [76] Wall Street Journal, May 4, 2004, p. B1.</span></p>
<p><span style="font-size: x-small;"><a name="_edn78"></a> [77] pers. comm., Jonathon Houk, Director of DHS IIAP Program, November 2003.</span></p>
<p><span style="font-size: x-small;"><a name="_edn79"></a>[78] These figures are based on Table 12 and the GDP figures from reference 32. Note, the analysis in this section also ignores business-to-business opportunities, which are also likely significant.</span></p>
<p><span style="font-size: x-small;"><a name="_edn80"></a>[79] Total grant and procurement amounts are derived from the U.S. Census Bureau, Consolidated Federal Funds Report (CFFR). See <a href="http://harvester.census.gov/cffr/asp/Reports.asp">http://harvester.census.gov/cffr/asp/Reports.asp</a>.</span></p>
<p><span style="font-size: x-small;"><a name="_edn81"></a>[80] The number of awards and an analysis of which line items are competitively awarded were derived from the U.S. Census Bureau, Federal Assistance Award Data System (FAADS). See <a href="http://www.census.gov/govs/faads/021sumus.htm">http://www.census.gov/govs/faads/021sumus.htm</a>.</span></p>
<p><span style="font-size: x-small;"><a name="_edn82"></a>[81] Specific categories of grants were analyzed based on the U.S. General Services Administration’s Catalog of Federal Domestic Assistance (CFDA) definitions to determine degree of competitiveness; see <a href="http://12.46.245.173/cfda/cfda.html">http://12.46.245.173/cfda/cfda.html</a>. Figures from the U.S. Department of Health and Human Services, Grants.gov Clearinghouse (see <a href="http://www.grants.gov/">http://www.grants.gov/</a>) suggest that $350 billion in federal grants is available, but many of the specific grant opportunities are geared to state governments or individuals. That is why the figures shown indicate only $100 billion in competitive opportunities available directly to enterprises.</span></p>
<p><span style="font-size: x-small;"><a name="_edn83"></a>[82] U.S. General Services Administration, Federal Procurement Data System  –  NG (FY 2003 data); see <a href="http://www.fpdc.gov/fpdc/FPR2003a.pdf">http://www.fpdc.gov/fpdc/FPR2003a.pdf</a> and <a href="http://www.fpdc.gov/fpdc/FPR2003c.pdf">http://www.fpdc.gov/fpdc/FPR2003c.pdf</a>. These sources are also the reference for the number of actions or successful awards. Due to discrepancies, these amounts were adjusted to conform with the totals in reference 79.</span></p>
<p><span style="font-size: x-small;"><a name="_edn84"></a>[83] Average competitive opportunities are derived by dividing the total award amount by category by the number of awards for that category.</span></p>
<p><span style="font-size: x-small;"><a name="_edn85"></a>[84] See <a href="http://www.gcswin.com/opportunities/opp2.htm">http://www.gcswin.com/opportunities/opp2.htm</a>. This is the only summary reference for state and local information found. Splits between grants and contract procurements were adjusted based on the assumption that contract amounts differed at the non-federal level. Thus, while the split for grant-contract procurements is about 58%-42% in the federal sector, it is assumed to be 38%-62% at the state and local level.</span></p>
<p><span style="font-size: x-small;"><a name="_edn86"></a>[85] There may also be some double counting of state amounts due to transfers from the federal government. For example, in 2002, $360,534 million in direct transfers was made to states and localities from the federal government. U.S. Census Bureau, State and Local Government Finances by Level of Government and by State: 2001  – 02. See <a href="http://www.census.gov/govs/estimate/0200ussl_1.html">http://www.census.gov/govs/estimate/0200ussl_1.html</a>.</span></p>
<p><span style="font-size: x-small;"><a name="_edn87"></a>[86] This analysis assumes that individual grant and contract awards are 80% of the amount shown at the federal level.</span></p>
<p><span style="font-size: x-small;"><a name="_edn88"></a>[87] To be listed requires a minimum of $10,000 in federal contracts; see <a href="http://clinton2.nara.gov/WH/EOP/OP/html/aa/aa06.html">http://clinton2.nara.gov/WH/EOP/OP/html/aa/aa06.html</a>.</span></p>
<p><span style="font-size: x-small;"><a name="_edn89"></a>[88] See <a href="http://www.govexec.com/features/0804-15/0804-15s1s1.htm">http://www.govexec.com/features/0804-15/0804-15s1s1.htm</a>.</span></p>
<p><span style="font-size: x-small;"><a name="_edn90"></a>[89] This header information is drawn from Table 12.</span></p>
<p><span style="font-size: x-small;"><a name="_edn91"></a>[90] Number of competing firms is increased from the federal contractor baseline by a factor of 1.30 to account for new state and local government contractors.</span></p>
<p><span style="font-size: x-small;"><a name="_edn92"></a>[91] Winning and losing proposal preparation costs are based on the empirical percentages from NIST (see reference 93), namely 0.85% and 0.59%, respectively, as a percent of total award amounts.</span></p>
<p><span style="font-size: x-small;"><a name="_edn93"></a>[92] The ‘Low’ basis for improvements is based on the finding of missing information discussed in a previous section; the ‘High’ basis reflects the difference between lowest quartile and highest quartile efforts spent on successful proposal preparation (see reference 93). The ‘Med’ basis is an intermediate value between these two.</span></p>
<p><span style="font-size: x-small;"><a name="_edn94"></a>[93] The increase in winning submissions is calculated based on numbers of winning proposals times the RFP improvement factor. In fact, because all things being equal the pool of contract dollars does not change, this amount merely represents a shift of winning awards from existing winners to new winners. In other words, total contract amounts are a zero-sum game with proposal improvements by previous losers taken from the pool of previous winners.</span></p>
<p><span style="font-size: x-small;"><a name="_edn95"></a>[94] The analysis in Figure 2 indicates there is a power curve distribution of awards. The number of new winning proposals was applied to this curve to estimate the actual number of new firms winning awards; see Figure 2 for the power-curve fitting equation.</span></p>
<p><span style="font-size: x-small;"><a name="_edn96"></a>[95] Of course, better probabilities of winning competitive solicitations are a zero-sum game. New winners displace old winners. The real advantage in this arena is to individual firms that better succeed at securing the existing pool of competitive funds. The benefits to individual companies can make the difference in profitability, indeed in survival.</span></p>
<p><span style="font-size: x-small;"><a name="_edn97"></a>[96] NFIB, Coping with Regulation, NFIB National Small Business Poll, Vol. 1, Issue 5. See <a href="http://www.nfib.com/object/3105105.html">http://www.nfib.com/object/3105105.html</a>.</span></p>
<p><span style="font-size: x-small;"><a name="_edn98"></a>[97] NFIB, Paperwork and Record-keeping, NFIB National Small Business Poll, Vol. 3, Issue 5. See <a href="http://www.nfib.com/object/4131277.html">http://www.nfib.com/object/4131277.html</a>.</span></p>
<p><span style="font-size: x-small;"><a name="_edn99"></a>[98] W. M. Crain &amp; T. D. Hopkins, “The Impact of Regulatory Costs on Small Firms”, Report to the Small Business Administration, RFP No. SBAHQ-00-R-0027 (2001). The report’s 2000 year basis was updated to 2002 based on a 4% annual inflation factor.</span></p>
<p><span style="font-size: x-small;"><a name="_edn100"></a>[99] U.S. General Accounting Office, Paperwork Reduction Act: Record Increase in Agencies’ Burden Estimates, testimony of V. S. Rezendes, before the Subcommittee on Energy, Policy, Natural Resources and Regulatory Affairs, Committee on Government Reform, House of Representatives, April 11, 2003. See <a href="http://www.reform.house.gov/UploadedFiles/Testimony_GAO_Revised.pdf">http://www.reform.house.gov/UploadedFiles/Testimony_GAO_Revised.pdf</a>.</span></p>
<p><span style="font-size: x-small;"><a name="_edn101"></a>[100] Office of Management and Budget, Managing Information Collection and Dissemination, Fiscal Year 2003, 198 pp. (Table A1). See <a href="http://www.whitehouse.gov/omb/inforeg/2003_info_coll_dism.pdf">http://www.whitehouse.gov/omb/inforeg/2003_info_coll_dism.pdf</a>.</span></p>
<p><span style="font-size: x-small;"><a name="_edn102"></a>[101] NFIB, Paperwork and Record-keeping, NFIB National Small Business Poll, Vol. 3, Issue 5. See <a href="http://www.nfib.com/object/4131277.html">http://www.nfib.com/object/4131277.html</a>.</span></p>
<p><span style="font-size: x-small;"><a name="_edn103"></a>[102] U.S. Small Business Administration, Final Report of the Small Business Paperwork Relief Task Force, June 27, 2003, 64 pp. See <a href="http://www.sbaonline.sba.gov/advo/laws/final_paperwork03.pdf">http://www.sbaonline.sba.gov/advo/laws/final_paperwork03.pdf</a>.</span></p>
<p><span style="font-size: x-small;"><a name="_edn104"></a>[103] IRS, Civil Penalties Assessed and Abated, by Type of Penalty and Type of Tax (Table 26), September 20, 2002. See <a href="http://www.irs.gov/pub/irs-soi/02db26cp.xls">http://www.irs.gov/pub/irs-soi/02db26cp.xls</a>.</span></p>
<p><span style="font-size: x-small;"><a name="_edn105"></a>[104] Except as footnoted, the figures below are drawn from the OMB Public Budget Tables. Civil penalties for crime victims have been excluded from these figures. See <a href="http://www.whitehouse.gov/omb/budget/fy2005/db.html">http://www.whitehouse.gov/omb/budget/fy2005/db.html</a>.</span></p>
<p><span style="font-size: x-small;"><a name="_edn106"></a>[105] Obtained orders in SEC judicial and administrative proceedings requiring securities law violators to disgorge illegal profits of approximately $1.293 billion. Civil penalties ordered in SEC proceedings totaled approximately $101 million. See SEC <a href="http://www.sec.gov/pdf/annrep02/ar02enforce.pdf">http://www.sec.gov/pdf/annrep02/ar02enforce.pdf</a>.</span></p>
<p><span style="font-size: x-small;"><a name="_edn107"></a>[106] T. L. Sansonetti, U.S. Department of Justice, testimony before the House Committee on the Judiciary, Subcommittee on Commercial and Administrative Law, March 9, 2004. See <a href="http://www.house.gov/judiciary/sansonetti030904.htm">http://www.house.gov/judiciary/sansonetti030904.htm</a>.</span></p>
<p><span style="font-size: x-small;"><a name="_edn108"></a>[107] Argy, Wiltse &amp; Robinson, Business Insights, Summer 2003, 4 pp. See <a href="http://www.awr.com/news_let/Argy%20Summer%202003.pdf">http://www.awr.com/news_let/Argy%20Summer%202003.pdf</a>.</span></p>
<p><span style="font-size: x-small;"><a name="_edn109"></a>[108] Project on Government Oversight, Federal Contractor Misconduct: Failures of the Suspension and Debarment System, revised May 10, 2002. See <a href="http://www.pogo.org/p/contracts/co-020505-contractors.html">http://www.pogo.org/p/contracts/co-020505-contractors.html</a>.</span></p>
<p><span style="font-size: x-small;"><a name="_edn110"></a>[109] Corporate Crime Reporter, Top 100 False Claims Act Settlements, December 30, 2003, 64 pp. See <a href="http://www.corporatecrimereporter.com/fraudrep.pdf">http://www.corporatecrimereporter.com/fraudrep.pdf</a>.</span></p>
<p><span style="font-size: x-small;"><a name="_edn111"></a>[110] According to Alchemia Corporation testimony citing a Price Waterhouse Coopers study, FDA Hearing, Jan. 17, 2002. See http://www.fda.gov/ohrms/dockets/dockets/ 00d1538/00d-1538_mm00023_01_vol7.doc.</span></p>
<p><span style="font-size: x-small;"><a name="_edn112"></a>[111] For example, see <a href="http://www.medschool.ucsf.edu/curriculum/clinical/guide/section2/confidentiality.asp">http://www.medschool.ucsf.edu/curriculum/clinical/guide/section2/confidentiality.asp</a>.</span></p>
<p><span style="font-size: x-small;"><a name="_edn113"></a>[112] From Table 17.</span></p>
<p><span style="font-size: x-small;"><a name="_edn114"></a>[113] From Table 16 after adjusting by total number of employees for all firms as shown on Table 2, and removal of total burdens as shown in Table 17.</span></p>
<p><span style="font-size: x-small;"><a name="_edn115"></a>[114] From Table 18.</span></p>
<p><span style="font-size: x-small;"><a name="_edn116"></a>[115] All ‘State and Local’ items are based on the ratio of state and local budgets in relation to the federal budget, excluding direct federal transfers, and applied to those factors for the federal sector. This ratio is 0.563. See <a href="http://www.gpoaccess.gov/usbudget/fy01/guide01.html">http://www.gpoaccess.gov/usbudget/fy01/guide01.html</a>.</span></p>
<p><span style="font-size: x-small;"><a name="_edn117"></a>[116] All ‘Large Firm’ estimates are based on the ratio of large firm documents to total firm documents; see Table 2.</span></p>
<p><span style="font-size: x-small;"><a name="_edn118"></a>[117] For example, see <a href="http://www.nfr.com/why/mandates.php#gramm">http://www.nfr.com/why/mandates.php#gramm</a></span></p>
]]></content:encoded>
			<wfw:commentRss>http://www.mkbergman.com/871/brown-bag-lunch-untapped-assets-the-3-trillion-value-of-us-documents/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>The Murky Depths of the &#8216;Deep Web&#8217;</title>
		<link>http://www.mkbergman.com/343/the-murky-depths-of-the-deep-web/</link>
		<comments>http://www.mkbergman.com/343/the-murky-depths-of-the-deep-web/#comments</comments>
		<pubDate>Wed, 21 Feb 2007 18:22:14 +0000</pubDate>
		<dc:creator>Mike Bergman</dc:creator>
				<category><![CDATA[Deep Web]]></category>
		<category><![CDATA[Document Assets]]></category>

		<guid isPermaLink="false">http://www.mkbergman.com/?p=343</guid>
		<description><![CDATA[	
	<span class="Z3988" title="ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Adc&amp;rfr_id=info%3Asid%2Focoins.info%3Agenerator&amp;rft.title=The Murky Depths of the &#8216;Deep Web&#8217;&amp;rft.aulast=Bergman&amp;rft.aufirst=Mike&amp;rft.subject=Deep Web&amp;rft.subject=Document Assets&amp;rft.source=AI3:::Adaptive Information&amp;rft.date=2007-02-21&amp;rft.type=blogPost&amp;rft.format=text&amp;rft.identifier=http://www.mkbergman.com/343/the-murky-depths-of-the-deep-web/&amp;rft.language=English"></span>
It&#8217;s Taken Too Many Years to Re-visit the &#8216;Deep Web&#8217; Analysis It&#8217;s been seven years since Thane Paulsen and I first coined the term &#8216;deep Web&#8217;, perhaps representing a couple of full generational cycles for the Internet. What we knew then and what &#8220;Web surfers&#8221; did then has changed markedly. And, of course, our coining [...]]]></description>
			<content:encoded><![CDATA[	
	<span class="Z3988" title="ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Adc&amp;rfr_id=info%3Asid%2Focoins.info%3Agenerator&amp;rft.title=The Murky Depths of the &#8216;Deep Web&#8217;&amp;rft.aulast=Bergman&amp;rft.aufirst=Mike&amp;rft.subject=Deep Web&amp;rft.subject=Document Assets&amp;rft.source=AI3:::Adaptive Information&amp;rft.date=2007-02-21&amp;rft.type=blogPost&amp;rft.format=text&amp;rft.identifier=http://www.mkbergman.com/343/the-murky-depths-of-the-deep-web/&amp;rft.language=English"></span>
<p><img hspace="5" align="left" alt="Deep Web" title="Deep Web" src="http://www.mkbergman.com/wp-content/themes/ai3/images/2007Posts/070221a_DeepWeb.png" /><font color="#820000"><strong>It&#8217;s Taken Too Many Years to Re-visit the &#8216;Deep Web&#8217; Analysis </strong></font></p>
<p>It&#8217;s been seven years since <a title="Thane Paulsen" href="http://www.brightplanet.com/company/brightplanet/managers-and-directors.html">Thane Paulsen</a> and I first coined the term &#8216;<a title="Deep Web" href="http://en.wikipedia.org/wiki/Deep_web">deep Web</a>&#8217;, perhaps representing a couple of full generational cycles for the Internet.  What we knew then and what &#8220;Web surfers&#8221; did then has changed markedly.  And, of course, our coining of the term and <a title="BrightPlanet" href="http://www.brightplanet.com">BrightPlanet&#8217;s</a> publishing of the first quantitative study on the deep Web did nothing to create the phenomenon of dynamic content itself &#8212; we merely gave it a name and helped promote a bit of understanding within the general public of some powerful subterranean forces driving the nature and tectonics of the emerging Web.</p>
<p>The first public release of <em><a title="Original Deep Web Paper Release" href="http://web.archive.org/web/20000816013534/128.121.227.57/download/deepwebwhitepaper.pdf">The Deep Web:  Surfacing Hidden Value</a></em> (courtesy of the Internet Archive&#8217;s <a title="Internet Archive's Wayback Machine" href="http://www.archive.org/web/web.php">Wayback Machine</a>), in July 2000, opened with a bold claim:</p>
<blockquote><p><em>BrightPlanet has uncovered the &quot;deep&quot; Web &#8212; a vast reservoir of Internet content that is 500 times larger than the known &quot;surface&quot; World Wide Web.  What makes the discovery of the deep Web so significant is the quality of content found within.  There are literally hundreds of billions of highly valuable documents hidden in searchable databases that cannot be retrieved by conventional search engines.</em></p></blockquote>
<p>The day the study was released we needed to increase our servers nine-fold to meet news demand after <a title="CNN" href="http://www.cnn.com">CNN</a> and then 300 major news outlets eventually picked up the story.  By 2001 when the University of Michigan&#8217;s <em><a title="JEP Deep Web Version" href="http://www.press.umich.edu/jep/07-01/bergman.html">Journal of Electronic Publishing</a></em> and its wonderful editor, <a title="Judith Axler Turner" href="http://www.turner.net/employee-Judith_Axler_Turner.html">Judith A. Turner</a>, decided to give the topic renewed thrust, we were able to clean up the presentation and language quite a bit, but did little to actually update many of the statistics.  (That version, in fact, is the one mostly cited today.)</p>
<p>Over the years there have been some books published and other estimates put forward, more often citing lower amounts in the deep Web than my original estimates, but, with one exception (see below), none of these were backed by new analysis.  I was asked numerous times to update the study, and indeed had even begun collating new analysis at a couple of points, but the work needed to finish it was substantial, always took a back seat to other duties, and so was never completed.</p>
<p><strong>Recent Updates and Criticisms</strong></p>
<p>It was thus with some surprise and pleasure that I first found reference yesterday to Dirk Lewandowski&#8217;s and Philipp Mayr&#8217;s 2006 paper, <em><a title="Exploring the Academic Invisible Web" href="http://www.durchdenken.de/lewandowski/doc/LHT_Preprint.pdf">&#8220;Exploring the Academic Invisible Web&#8221;</a></em> [<em>Library Hi Tech</em> <strong>24</strong>(4), 529-539], that takes direct aim at the analysis in my original paper.  (Actually, they worked from the 2001 JEP version, but, as noted, the analysis is virtually identical to the original 2000 version.)  The authors pretty soundly criticize some of the methodology in my original paper and, for the most part, I agree with them.</p>
<p>My original analysis combined a manual evaluation of the &#8220;top 60&#8221; then-extant Web databases with an estimate of the total number of searchable databases (estimated at about 200,000, which they incorrectly cite as 100,000) and assessments of the mean size of each database based on a random sampling of those databases.  Lewandowski and Mayr note conceptual flaws in the analysis at these levels:</p>
<ul>
<li>First, by use of <u><em>mean</em></u> database size rather than <u><em>median</em></u> size, the size is overestimated,</li>
<li>Second, databases whose content is of questionable relevance to their interest in academic content (such as weather records from NOAA or Earth survey data by satellite) skewed my estimates upward, and</li>
<li>Third, my estimates were based on database size estimates (in GBs) and not internal record counts.</li>
</ul>
<p>On the other hand, the authors also criticized my definition of deep content as too narrow, since it overlooked certain content types such as PDFs that are now routinely indexed and retrieved on the surface Web.  We also have had uncertain but tangible growth in standard search engine content &#8212; with the last cited amounts about 20 billion documents since Google and Yahoo! ceased their war of index numbers.</p>
<p>Though not really offering an alternative, full-blown analysis, the authors use the <a title="Gale Directory of Databases" href="http://library.dialog.com/bluesheets/html/bl0230.html">Gale Directory of Databases</a> to derive an alternative estimate of perhaps 20 billion to 100 billion documents on the deep Web of interest for academic purposes, which they later seem to imply also needs to be discounted by further percentages to get at &#8220;word-oriented&#8221; and &#8220;full-text or bibliographic&#8221; records that they deem appropriate.</p>
<p><strong>My Assessment of the Criticisms</strong></p>
<p>As noted, I generally agree with these criticisms.  For example, since the time of original publication, we have seen the power-law distribution of most things on the Internet, including popularity and traffic.  Such heavily skewed distributions will always result in overestimates when calculations are based on <u><em>means</em></u> rather than <u><em>medians</em></u>.  I also think that meaningful content types were both overused (more database-like records) and underused (PDF content that is now routinely indexed) in my original analysis.</p>
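<p>The gap between mean-based and median-based estimates can be sketched numerically.  The Python example below is only an illustration under stated assumptions, not the method used in the original study: the Pareto shape parameter, the minimum size, the sample of 1,000 databases and the helper name <code>pareto_sizes</code> are all hypothetical.  Only the count of about 200,000 searchable databases comes from the discussion above.</p>

```python
import statistics


def pareto_sizes(n, alpha=1.2, min_size=1.0):
    """Return n hypothetical database sizes with a Pareto (power-law) shape.

    Evenly spaced quantiles are used instead of random draws so that the
    result is reproducible. alpha and min_size are assumed values.
    """
    sizes = []
    for i in range(n):
        q = (i + 0.5) / n  # quantile strictly between 0 and 1
        sizes.append(min_size / (1.0 - q) ** (1.0 / alpha))
    return sizes


# Hypothetical sample of 1,000 databases (arbitrary size units).
sizes = pareto_sizes(1000)
mean_size = statistics.mean(sizes)
median_size = statistics.median(sizes)

# About 200,000 searchable databases, the figure cited in the post.
n_databases = 200_000

print(f"mean size:        {mean_size:.2f}")
print(f"median size:      {median_size:.2f}")
print(f"total via mean:   {n_databases * mean_size:,.0f}")
print(f"total via median: {n_databases * median_size:,.0f}")
```

<p>In this sketch a handful of very large databases pull the mean well above the median, so the two totals diverge.  How far they diverge depends entirely on the assumed shape parameter.</p>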
<p>However, the authors&#8217; third criticism is patently wrong, since three different methods were used to estimate internal database record counts and the average sizes of each record they contained.  I would also have preferred a more careful reading by the authors of my actual paper, since there are numerous other citations in error and mis-characterizations.</p>
<p>On an epistemological level, I disagree with the authors&#8217; use of the term &#8220;invisible Web&#8221;, a label that we tried hard in the paper to overturn and that is fading as a current term of art. <a title="Internet Tutorials on the Deep Web" href="http://www.internettutorials.net/deepweb.html">Internet Tutorials</a> (initially, <a title="SUNY at Albany Library" href="http://library.albany.edu/internet/deepweb.html">SUNY at Albany Library</a>) addresses this topic head-on, preferring &#8220;deep Web&#8221; on a number of compelling grounds, including that <em>&#8220;there is no such thing as recorded information that is invisible. Some information may be more of a challenge to find than others, but this is not the same as invisibility.&#8221;</em></p>
<p>Finally, I am not persuaded by the authors&#8217; simplistic, alternate partial estimate based solely on the Gale database, but they readily acknowledge not doing a full-blown analysis and having different objectives in mind.  I agree with the authors in calling for a full, alternative analysis.  I think we all agree that is a non-trivial undertaking and could itself be subject to newer methodological pitfalls.</p>
<p><strong>So, What is the Quantitative Update?</strong></p>
<p>Within a couple of years after the initial publication of my paper, I suspected that the &#8220;500 times&#8221; claim for the greater size of the deep Web in comparison to what is discoverable by search engines may have been too high.  Indeed, in later corporate literature and PowerPoint presentations, I backed off the initial 2000-2001 claims and began speaking in ranges from a &#8220;few times&#8221; to as high as &#8220;100 times&#8221; greater for the size of the deep Web.</p>
<p>In the last seven years, the only other <a title="'Deep Web' on Google Scholar" href="http://scholar.google.com/scholar?q=%22deep+web%22&#038;hl=en&#038;lr=&#038;btnG=Search">quantitative study of its kind</a> of which I am aware is documented in the paper, <a title="Chang et al., Structured Databases on the Web" href="http://eagle.cs.uiuc.edu/pubs/2004/dwsurvey-sigmodrecord-chlpz-aug04.pdf"><em>&#8220;Structured Databases on the Web:  Observations and Implications,&#8221;</em></a> conducted by Chang <em>et al.</em> in April 2004 and published in ACM SIGMOD Record, that estimated 330,000 deep Web sources with over 1.2 million query forms, reflecting a fast 3-7 times increase in 4 years from the date of my original paper.  Unlike the Lewandowski and Mayr partial analysis, this effort and others by that group suggest an even larger deep Web than my initial estimates!</p>
<p>The truth is, we didn&#8217;t know then, and we don&#8217;t know now, what the actual size of the dynamic Web truly is.  (And, aside from a sound bite, does it really matter?  It is huge by any measure.)  Heroic efforts such as these quantitative analyses or the still-more ambitious efforts of UC Berkeley&#8217;s SIMS on <a title="How Much Information?  2003" href="http://www2.sims.berkeley.edu/research/projects/how-much-info-2003/execsum.htm"><em>How Much Information?</em></a> still have a role in helping to bound our understanding of information overload.  As long as such studies gain news traction, they will be pursued.  So, what might today&#8217;s story look like?</p>
<p>First, the methodological problems in my original analysis remain and (I believe today) resulted in overestimates.  Another factor today leading to a potential overestimate of the deep Web <em>v.</em> the surface Web would be the fact that much &#8220;deep&#8221; content is becoming more exposed to standard search engines, be it through Google Scholar, Yahoo!&#8217;s library relationships, individual site indexing and sharing such as through search appliances, and other &#8220;gray&#8221; factors we noted in our 2000-2001 studies.  These factors, and certainly more, act to narrow the difference between exposed search engine content (&#8220;surface Web&#8221;) and what we have termed the &#8220;deep Web.&#8221;</p>
<p>However, countering these factors are two newer trends.  First, foreign language content is growing at much higher rates and is often under-sampled.  Second, blogs and other democratized sources of content are exploding.  What these trends may be doing to content balances is, frankly, anyone&#8217;s guess.</p>
<p>So, while awareness of the qualitative nature of Web content has grown tremendously in the past near-decade, our quantitative understanding remains weak.  Improvements in technology and harvesting can now overcome earlier limits.</p>
<p>Perhaps there is another Ph.D. candidate or three out there who may want to tackle this question in a better (and more definitive) way.  According to Chang and Cho in their paper, <a title="Chang and Cho Paper" href="http://eagle.cs.uiuc.edu/pubs/2006/webaccesstutorial-sigmod06-cc-mar06.pdf"><em>&#8220;Accessing the Web: From Search to Integration,&#8221;</em></a> presented at the 2006 ACM SIGMOD International Conference on Management of Data in Chicago:</p>
<blockquote><p><em>On the other hand, for the deep Web, while the proliferation of structured sources has promised unlimited possibilities for more precise and aggregated access, it has also presented new challenges for realizing large scale and dynamic information integration. These issues are in essence related to data management, in a large scale, and thus present novel problems and interesting opportunities for our research community. </em></p></blockquote>
<p>Who knows?  For the right researcher with the right methodology, there may be a <a title="Science Magazine" href="http://www.sciencemag.org/"><em>Science</em></a> or <a title="Nature Journal" href="http://www.nature.com/nature/index.html"><em>Nature</em></a> paper in prospect!</p>
]]></content:encoded>
			<wfw:commentRss>http://www.mkbergman.com/343/the-murky-depths-of-the-deep-web/feed/</wfw:commentRss>
		<slash:comments>7</slash:comments>
		</item>
		<item>
		<title>The Commoditization of Content Software</title>
		<link>http://www.mkbergman.com/277/the-commoditization-of-content-software/</link>
		<comments>http://www.mkbergman.com/277/the-commoditization-of-content-software/#comments</comments>
		<pubDate>Fri, 08 Sep 2006 15:50:29 +0000</pubDate>
		<dc:creator>Mike Bergman</dc:creator>
				<category><![CDATA[Adaptive Information]]></category>
		<category><![CDATA[Document Assets]]></category>
		<category><![CDATA[Open Source]]></category>
		<category><![CDATA[Software and Venture Capital]]></category>

		<guid isPermaLink="false">http://www.mkbergman.com/?p=277</guid>
		<description><![CDATA[	
	<span class="Z3988" title="ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Adc&amp;rfr_id=info%3Asid%2Focoins.info%3Agenerator&amp;rft.title=The Commoditization of Content Software&amp;rft.aulast=Bergman&amp;rft.aufirst=Mike&amp;rft.subject=Adaptive Information&amp;rft.subject=Document Assets&amp;rft.subject=Open Source&amp;rft.subject=Software and Venture Capital&amp;rft.source=AI3:::Adaptive Information&amp;rft.date=2006-09-08&amp;rft.type=blogPost&amp;rft.format=text&amp;rft.identifier=http://www.mkbergman.com/277/the-commoditization-of-content-software/&amp;rft.language=English"></span>
John Newton (co-founder formerly of Documentum, now of Alfresco) puts a telling marker on the table in his recent post on the Commoditization of ECM. Though noting the term &#34;enterprise content management&#34; did not even exist prior to 1998, he goes on to observe that expansion of the definition of what was appropriate in ECM [...]]]></description>
			<content:encoded><![CDATA[	
	<span class="Z3988" title="ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Adc&amp;rfr_id=info%3Asid%2Focoins.info%3Agenerator&amp;rft.title=The Commoditization of Content Software&amp;rft.aulast=Bergman&amp;rft.aufirst=Mike&amp;rft.subject=Adaptive Information&amp;rft.subject=Document Assets&amp;rft.subject=Open Source&amp;rft.subject=Software and Venture Capital&amp;rft.source=AI3:::Adaptive Information&amp;rft.date=2006-09-08&amp;rft.type=blogPost&amp;rft.format=text&amp;rft.identifier=http://www.mkbergman.com/277/the-commoditization-of-content-software/&amp;rft.language=English"></span>
<p>John Newton (co-founder formerly of <a title="Former Documentum Site" href="http://www.documentum.com/">Documentum</a>, now of <a title="Alfresco" href="http://www.alfresco.com/">Alfresco</a>) puts a telling marker on the table in his recent post on the <a target="_blank" title="Site: Content Log" class="bl_itemtitle" href="http://newton.typepad.com/content/2006/09/commoditization.html">Commoditization of ECM</a>. Though noting the term &quot;enterprise content management&quot; did not even exist prior to 1998, he goes on to observe that expansion of the definition of what was appropriate in ECM and the consolidation of the leading players occurred rapidly.  He concludes that this process has commoditized the market, with competitive differentiation now based on market size rather than functionality.  The platforms from the leading vendors (IBM, Microsoft and EMC-Documentum) can all manage documents, Web content, images, forms and records via basic library services, metadata management, search and retrieval, workflow, portal integration, and development kits.</p>
<p>If such consolidation and standardization of functionality were Newton&#8217;s only point, one could say &#8220;ho, hum&#8221;; such has been true in all major enterprise software markets.</p>
<p>But, in my reading, he goes on to make two more important and fundamental points, both of which existing enterprise software vendors ignore at their peril.</p>
<p><strong>Poor Foundations and Poor Performance </strong></p>
<p>Newton notes that ECM applications are never bought based on the nature of their repositories, but an inefficient repository can result in the rejection of the system.  He also acknowledges that ECM installations are costly to set up and maintain, are difficult to use, perform poorly and lack essential automation (such as classification).  (Kind of sounds like most enterprise software initiatives, doesn&#8217;t it?)</p>
<p>Indeed, I have <a title="Posts on 'Document Assets'" href="http://www.mkbergman.com/?cat=7">repeatedly documented these gaps</a> for virtually all large-scale document-centric or federated applications.  The root cause, besides rampant poor interface design, has been, in my opinion, poorly suited data management foundations.  Relational and IR-based systems both perform poorly, for different reasons, in <a title="Enterprise Semantic Webs Demand New Database Paradigms" href="http://www.mkbergman.com/?p=185">managing semi-structured data</a>.  This problem will not be solved by open source <em>per se</em> (see below), though there are some interesting options emerging from open source that may point the way to new alternatives, as well as incipient designs from <a title="BrightPlanet" href="http://www.brightplanet.com">BrightPlanet</a> and others.</p>
<p><strong>The Proprietary Killers of Open Standards and Open Source</strong></p>
<p>Service-oriented architectures (SOA), the various Web services standards (WS-*), certain JSRs (170 and 283 in documents, but also 168 and others), plus all of the various XML and semantic derivatives are moving rapidly with the very real prospect of &#8220;pluggability&#8221; and the substitution of various packages, components and applications across the entire enterprise stack.</p>
<p>Newton cites his own case at Alfresco: by aggregating the following existing open source components, they were able to get their ECM product ready in less than one year:</p>
<blockquote>
<ul>
<li><em>Spring &#8211; A framework that provides the wiring of the repository and the tools to extend capabilities without rebuilding the repository (Aspect-Oriented Programming)</em></li>
<li><em>Hibernate &#8211; An object-relational mapping tool that stores content metadata in database and handles all the idiosyncrasies of each SQL dialect</em></li>
<li><em>Lucene &#8211; An internet-scale full-text and general purpose information retrieval engine that supports federated search, taxonomic, XML and full-text search</em></li>
<li><em>EHCache &#8211; Distributed intelligent caching of content and metadata in a loosely coupled environment</em></li>
<li><em>jBPM &#8211; A full featured enterprise production workflow and business process engine that includes BPEL4WS support</em></li>
<li><em>Chiba &#8211; A complete Xforms interface that can be used for the configuration and management of the repository</em></li>
<li><em>Open Office &#8211; Provides a server-based and Linux-compatible transformation of MS Office based content</em></li>
<li><em>ImageMagic &#8211; Supports transformation and watermarking of images.</em></li>
</ul>
</blockquote>
<p>Moreover, the combination of these components led to an inherent architecture including pluggable modules, rules and templating engines, workflow and business process management, security, and other enterprise-level capabilities.  In prior times, I estimate that a proprietary vendor would have needed ten times the effort or more to accomplish this.</p>
<p><strong>Similar Trends and Challenges in the Entire Enterprise Space</strong></p>
<p>Newton is obviously well placed to comment on these trends within ECM.  But similar trends can be seen in every major enterprise software space.  For virtually every component one can imagine, there is a very capable open source offering.  Many of the newer open source ventures are indeed centered around aggregating and integrating various open source components followed by either dual-source licensing or support services as the basis of their <a title="Open Source Business Models" href="http://www.mkbergman.com/?p=115">business models</a>.  At its most extreme, this trend has expanded to the whole process of enterprise application integration (EAI) itself through offerings such as <a title="LogicBlaze FUSE" href="http://www.logicblaze.com/">LogicBlaze FUSE</a> with its SOA-oriented standards and open source components.  Initiatives such as SCA (<a title="Service Component Architecture" href="http://en.wikipedia.org/wiki/Service_component_architecture">service component architecture</a>) will continue to fuel this trend.</p>
<p>So, enterprise software vendors, listen to your wake-up call.  It is as if gold doubloons, pearls and jewels are lying all over the floor.  If you and your developers don&#8217;t take the time to bend over and pick them up, someone else will.  As Joel Mokyr has <a title="Origins of the Knowledge Economy" href="http://www.mkbergman.com/?p=249">compellingly researched,</a> the innovation of systems or how to integrate pieces can be every bit as important as the &#8216;Aha!&#8217; discovery.  Open source is now giving a whole new breed of bakers new ingredients for baking the cake.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.mkbergman.com/277/the-commoditization-of-content-software/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Full Report:  Why Are $800 Billion in Document Assets Wasted Annually?</title>
		<link>http://www.mkbergman.com/197/full-report-why-are-800-billion-in-document-assets-wasted-annually/</link>
		<comments>http://www.mkbergman.com/197/full-report-why-are-800-billion-in-document-assets-wasted-annually/#comments</comments>
		<pubDate>Tue, 04 Apr 2006 15:29:14 +0000</pubDate>
		<dc:creator>Mike Bergman</dc:creator>
				<category><![CDATA[Adaptive Information]]></category>
		<category><![CDATA[Document Assets]]></category>
		<category><![CDATA[Information Automation]]></category>

		<guid isPermaLink="false">http://www.mkbergman.com/?p=197</guid>
		<description><![CDATA[	
	<span class="Z3988" title="ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Adc&amp;rfr_id=info%3Asid%2Focoins.info%3Agenerator&amp;rft.title=Full Report:  Why Are $800 Billion in Document Assets Wasted Annually?&amp;rft.aulast=Bergman&amp;rft.aufirst=Mike&amp;rft.subject=Adaptive Information&amp;rft.subject=Document Assets&amp;rft.subject=Information Automation&amp;rft.source=AI3:::Adaptive Information&amp;rft.date=2006-04-04&amp;rft.type=blogPost&amp;rft.format=text&amp;rft.identifier=http://www.mkbergman.com/197/full-report-why-are-800-billion-in-document-assets-wasted-annually/&amp;rft.language=English"></span>
Author&apos;s Note: An earlier blog series by me has now been turned into a PDF white paper under the auspices of BrightPlanet Corp The citation for this effort is: M.K. Bergman, &#34;Why Are $800 Billion in Document Assets Wasted Annually?&#8221; BrightPlanet Corporation White Paper, April 2006, 27 pp. Click here to obtain a PDF copy [...]]]></description>
			<content:encoded><![CDATA[	
	<span class="Z3988" title="ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Adc&amp;rfr_id=info%3Asid%2Focoins.info%3Agenerator&amp;rft.title=Full Report:  Why Are $800 Billion in Document Assets Wasted Annually?&amp;rft.aulast=Bergman&amp;rft.aufirst=Mike&amp;rft.subject=Adaptive Information&amp;rft.subject=Document Assets&amp;rft.subject=Information Automation&amp;rft.source=AI3:::Adaptive Information&amp;rft.date=2006-04-04&amp;rft.type=blogPost&amp;rft.format=text&amp;rft.identifier=http://www.mkbergman.com/197/full-report-why-are-800-billion-in-document-assets-wasted-annually/&amp;rft.language=English"></span>
<p><strong><em>Author&apos;s Note:</em></strong> An earlier blog series of mine has now been turned into a PDF white paper under the auspices of <a title="BrightPlanet Corporation" href="http://www.brightplanet.com">BrightPlanet Corp.</a>  The citation for this effort is:</p>
<p style="margin-left: 40px"><em>M.K. Bergman, &quot;Why Are $800 Billion in Document Assets Wasted Annually?&quot; BrightPlanet Corporation White Paper, April 2006, 27 pp.</em></p>
<p><em><a href="http://www.mkbergman.com/wp-content/themes/ai3/files/2006Posts/WhyDocumentAssetsWasted060403.pdf"><img style="border: 0px solid " alt="Download PDF file" src="http://www.mkbergman.com/wp-content/themes/ai3/images/pdfdoc.gif" /></a> <a href="http://www.mkbergman.com/wp-content/themes/ai3/files/2006Posts/WhyDocumentAssetsWasted060403.pdf">Click here</a> to obtain a PDF copy of this full report (27 pp, 203 KB)</em></p>
<p>It is a tragedy of no small import when $800 billion in readily available savings from creating, using and sharing documents is wasted in the United States <strong><em>each year</em></strong>. How can waste of such magnitude <a name="_ednref1"></a>occur right under our noses? And how can this waste occur so silently, so insidiously, and so ubiquitously that none of us can see it?</p>
<p>This free white paper attempts to address these questions.  This report is the result of a series of posts in response to an earlier white paper I authored under <a href="http://www.brightplanet.com">BrightPlanet</a> sponsorship entitled, <em><a title="$3 Trillion in Untapped Document Assets White Paper" href="http://www.mkbergman.com/?p=82">Untapped Assets: The $3 Trillion Value of U.S. Enterprise Documents</a>. </em><a name="_ednref3"></a>[1]</p>
<p>This full report integrates information from earlier blog postings:</p>
<ul>
<li><a href="http://www.mkbergman.com/?p=130">Part I: The &apos;Nature&apos; of Information and Its Ownership in the Commons</a></li>
<li><a href="http://www.mkbergman.com/?p=135">Part II: Barriers to Collaboration</a></li>
<li><a href="http://www.mkbergman.com/?p=136">Part III: The Perceived High Costs of Enterprise &apos;Solutions&apos;</a></li>
<li><a href="http://www.mkbergman.com/?p=137">Part IV: The Closeness and Ubiquity of the Problem</a>, and</li>
<li><a href="http://www.mkbergman.com/?p=138">Summary</a>.</li>
</ul>
<p>Public and enterprise expenditures to address the wasted document assets problem remain comparatively small, with growth in those expenditures flat in comparison to the rate of document production. This report attempts to bring attention and focus to the various ways that technology, people, and process can bring real document savings to our collective pocketbooks.</p>
<hr width="33%" size="1" align="left" /><p><a name="_edn3"></a>[1] Michael K. Bergman, &quot;Untapped Assets: The $3 Trillion Value of U.S. Enterprise Documents,&quot; <em>BrightPlanet Corporation White Paper</em>, July 2005, 42 pp. The paper contains 80 references, 150 citations, and many data tables.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.mkbergman.com/197/full-report-why-are-800-billion-in-document-assets-wasted-annually/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>

