Posted:August 19, 2005

Mike’s note:  The following are comments submitted by Hiranya K Nath of Sam Houston State University on my earlier posted paper, "Untapped Assets: The $3 Trillion Value of U.S. Enterprise Documents."  I subsequently referred to Hiranya’s and Uday M. Apte’s paper, "Size, Structure and Growth of the US Information Economy," in a follow-on to that post discussing supporting views for document assets occupying trillions of dollars in US economic activity.  The following is reprinted with Hiranya’s permission.

This is an important and interesting study that attempts to measure the value of corporate 'documents' in the U.S. It measures not only the cost of creating new documents but also the cost of handling or mishandling them, as well as the benefits from improved document access and use. The value of corporate documents is assessed under three major categories: internal documents, web documents (which generally reside in the public domain) and 'opportunities and threats'. The first two categories provide information for internal or external use, while documents in the third category are created to obtain solicited grants and contracts or to satisfy regulatory requirements.

In an economy that has become increasingly information-based, the importance and challenges of managing information have reached proportions never witnessed before. Quantifying the value of creating and handling documents is extremely important and, to my knowledge, this study is one of the first attempts in that direction. However, as the author admits, the estimates are compiled from various sources and, therefore, they are extremely fragmentary and may be inconsistent. Nevertheless, this white paper has done an excellent job in initiating a research agenda. In the following paragraphs, I present my thoughts on how I would proceed if I were to conduct the study.

First, define and explain the terms and concepts. The terms and concepts used in the study need some explanation so that a reader can get a good grasp of the issues associated with quantifying the value of documents. Some of the terms related to the information economy have not yet entered the general vocabulary. The dictionary meaning of 'document' is proof or evidence in a written format. The Oxford dictionary has extended the definition to include the digital format as well. Concepts like knowledge industry, knowledge worker, information industry and information worker also need to be defined. Studies such as Machlup (1962) and Porat (1977) have conceptualized these terms, but they have not entered the mainstream research vocabulary. More generally, a settled, well-accepted vocabulary has not yet been developed.


Merriam-Webster Dictionary:  Document

Function: noun

1 a archaic : PROOF, EVIDENCE b : an original or official paper relied on as the basis, proof, or support of something c : something (as a photograph or a recording) that serves as evidence or proof

2 a : a writing conveying information b : a material substance (as a coin or stone) having on it a representation of thoughts by means of some conventional mark or symbol

Function: transitive verb

1 : to furnish documentary evidence of

2 : to furnish with documents

3 a : to provide with factual or substantial support for statements made or a hypothesis proposed; especially : to equip with exact references to authoritative supporting information b (1) : to construct or produce (as a movie or novel) with authentic situations or events (2) : to portray realistically

4 : to furnish (a ship) with ship’s papers

Oxford Advanced Learner's Dictionary :  Document

Function: noun
1 an official paper or book that gives information about sth, or that can be used as evidence or proof of sth: legal documents travel documents Copies of the relevant documents must be filed at court. One of the documents leaked to the press was a memorandum written by the head of the security police.
2 a computer file that contains text that has a name that identifies it: Save the document before closing.

Function: verb
1 to record the details of sth: Causes of the disease have been well documented. The results are documented in Chapter 3.
2 to prove or support sth with documents: documented evidence


Second, categorize and identify various documents. This is important because it will provide an operational guideline for collecting relevant information for estimating the value of documents. From the standpoint of an organization, I think the documents can be divided into the following categories:

  1. The first category includes traditional/conventional documents which are necessary for the operation of the organization. These documents play the role of 'intermediate inputs' in the production process. They do not directly contribute to the creation of new knowledge but mostly act as a store of information. There are two sub categories:

    i) The first comprises documents that record operational details such as documents created by the accounting or payroll departments. Gathering of information and documentation thereof follow standard practices. With the advent of new technology the format of these documents may have changed but the standards have not changed much. I would assume that the cost of mishandling this category of documents is minimal. This category also includes documents which are created to satisfy legal requirements.

    ii) The other sub category includes documents which are mainly for dissemination of information. An organization generally interacts with a target group and for optimum outcome from this interaction it is important that this target group is fully informed. Basically, these documents are created to reduce information asymmetry among agents so that problems related to asymmetric information do not arise. There is a marketing aspect to this category of documents. With the availability of new technology, constant changes in media, and people's access to diverse sources of information, this category of documents is expected to grow in volume and value. But this might cause substantial reduction in overall cost of production by reducing inefficiency that arises from information asymmetry.

  2. The second category includes documents that directly contribute to knowledge creation. In the process of production, the firm/corporation constantly tries to invent and innovate. Invention and innovation add to the pool of existing knowledge. Documentation of newly created knowledge is crucial for the progression of human civilization. The cost of creating this type of document is expected to be relatively higher than that of other types. Since some of the documents created in this process may turn out to be useless later on, the cost of creating and recreating these documents could be enormous.

Third, develop a methodology to measure the value of the documents. The first category should not be too hard to measure. Since every organization has well-defined departments responsible for creating and handling this category of documents, the information should be relatively easily available. I would anticipate some formidable problems in measuring the value of the second category. These documents may be created without specific planning or without following standard practices. Also, measuring the value of the large amount of 'unnecessary' documents will be challenging yet important for the overall value of the documents.

Among other issues, since most documents are created and used at the intermediate level, a cost-based valuation will be appropriate. But some documents could be priced and if price-based valuation is used for those documents then appropriate adjustments should be made to make them consistent with each other.

References:
Machlup, F. (1962), The Production and Distribution of Knowledge in the United States, (Princeton University Press, Princeton, NJ).

Porat, Marc U and Rubin, Michael R. (1977), The Information Economy (9 volumes), Office of Telecommunications Special Publication 77-12 (US Department of Commerce, Washington D.C.)

Posted by AI3's author, Mike Bergman Posted on August 19, 2005 at 11:36 am in Document Assets | Comments (0)
The URI link reference to this post is: https://www.mkbergman.com/117/naths-comments-on-untapped-assets-the-3-trillion-value-of-us-enterprise-documents/
The URI to trackback this post is: https://www.mkbergman.com/117/naths-comments-on-untapped-assets-the-3-trillion-value-of-us-enterprise-documents/trackback/
Posted:August 15, 2005

As my efforts proceeded in getting this blog set up, I began to realize I was devoting substantially more time and effort to the activity than I originally anticipated. It was roughly at this time of realization that I began tracking time and effort. To date, I have spent about 300 hours(!!) getting my site ready to go, but I know this is in no way typical.

In fact, with services like Blogger, you can be up and running for free in 5 minutes and posting immediately. For reasons noted in my ‘Prepare to Blog’ diary, this de minimis effort may not be advisable. On the other hand, my own needs and demands should not be taken as indicative either. In any event, I present below the breakdown of my time and effort tracking and discuss what may be more "typical" expectations.

Unusual Demands for AI3

As I’ve stated elsewhere, there are some unique and unusual circumstances I have placed on my set-up and investigations leading to AI3. I have wanted, for example, to:

  • Understand the blogging and self-publishing phenomenon
  • Get my hands dirty with respect to existing tools and infrastructure
  • Actually put in place a procedure that will allow me to continue to contribute in an efficient way
  • Be aggressive about capabilities and understand "gaps" for bloggers (esp. the "top 1%") in moving forward
  • Learn and test tools and techniques to discover gaps and friction points suitable for commercial attention
  • Push the edge of the envelope on performance, scale and functionality so as to approach industrial-strength blog sites, perhaps suitable for enterprise use; and
  • In general, thoroughly immerse myself into this new culture and technology.

I think I’ve been successful in these aims but, as noted before, the time and effort I incurred are not typical. As I present the numbers below, I will try to be specific about what may be applicable. Please understand that the viewpoint I present is that of a serious blog content site, perhaps applicable to only 10% of bloggers or so.

Time and Effort Breakdowns

The table below presents the results of my time and effort tracking. It shows that about 310 hours were spent getting AI3 ready over the roughly three months from the decision to blog until release. These times and efforts are well removed from a 5-min Blogger site!



Hours by Major Area

| Blog Link | Date | Research / Set-up / Add Tools / Techniques | Composition | Posting | Total |
| First Post – Decided to Blog | 4/27/05 | 1.0 | 0.2 | 0.1 | 1.3 |
| First Blog Test Drive | 4/28/05 | 1.0, 0.5 | 0.2 | 0.1 | 1.8 |
| WordPress | 4/29/05 | 4.0, 2.0 | 0.6 | 0.3 | 6.9 |
| Local Hosting | 5/2/05 | Research 5.0 | 1.0 | 0.5 | 6.5 |
| Install Difficulties and Then Success! | 5/5/05 | Research 2.0, Set-up 14.0 | 1.2 | 0.6 | 17.8 |
| Design and Hacking CSS | 5/6/05 | 5.0, 2.0 | 0.2 | 0.1 | 7.3 |
| No Local Images | 5/7/05 | Research 3.0, Set-up 0.5, Techniques 2.0 | 0.4 | 0.2 | 6.1 |
| Posts/Comments Behavior | 5/8/05 | 1.0, 1.5 | 0.1 | | 2.6 |
| Advanced Functionality | 5/9/05 | Research 5.0 | 0.2 | 0.1 | 5.3 |
| Site Transfer | 5/17/05 | Set-up 6.0, Techniques 1.0 | 0.1 | | 7.1 |
| Begin Content | 5/18/05 | | 0.5 | 0.1 | 0.6 |
| Release Checklist | 5/20/05 | 2.0, 2.0 | 4.0 | 2.0 | 10.0 |
| Editor Comparisons | 5/20/05 | Research 6.0, Set-up 2.0, Techniques 1.0 | 6.0 | 3.0 | 18.0 |
| Xinha Integration | 5/31/05 | Research 1.0, Set-up 1.0, Add Tools 6.0, Techniques 1.0 | 0.4 | 0.2 | 9.6 |
| External Credits and Thanks | 6/13/05 | 3.0, 1.0 | 0.8 | 0.4 | 5.2 |
| Permalink Problems | 6/15/05 | Research 3.0, Set-up 0.8, Techniques 3.0 | 1.6 | 0.8 | 9.2 |
| Standard Site Content | 6/15/05 | 1.0, 3.0 | 16.0 | 8.0 | 28.0 |
| Word Docs to HTML | 6/16/05 | Research 4.0, Set-up 2.0, Add Tools 3.0, Techniques 6.0 | 8.0 | 4.0 | 27.0 |
| Site Project Management | 6/17/05 | 1.0, 1.5 | 3.0 | 0.3 | 5.8 |
| Not Playing Nice in the Sandbox | 6/19/05 | 2.0, 6.0 | 2.0 | 1.0 | 11.0 |
| Use of Styles and Style Sheets | 6/20/05 | 3.0, 6.0 | 4.0 | 1.5 | 14.5 |
| Clean Up Posts [not posted] | 6/22/05 | | 2.0 | 8.0 | 10.0 |
| Some Best Practices | 6/22/05 | 1.0, 4.0 | 6.0 | 3.0 | 14.0 |
| Large Document Transfer | 6/24/05 | 1.0, 1.0, 2.0 | 1.0 | 8.0 | 13.0 |
| Cross-browser Compatibility | 6/24/05 | 2.0, 3.0 | 4.0 | 2.0 | 11.0 |
| File Organization and Naming | 6/25/05 | 1.0 | 1.0 | 0.5 | 2.5 |
| The Purposeful Blogger | 6/25/05 | | 2.0 | 1.0 | 3.0 |
| Time Estimates | 6/26/05 | Research 2.5 | 0.8 | 0.4 | 3.7 |
| Word Docs to HTML II | 6/26/05 | Research 1.0, Techniques 2.0 | 3.0 | 1.0 | 7.0 |
| W3C XHTML Validation | 6/26/05 | Research 4.0, Set-up 0.1, Techniques 3.0 | 1.5 | 0.5 | 9.1 |
| Screen Resolution Fix | 7/5/05 | 4.0, 0.1, 1.5 | 1.5 | 0.5 | 7.6 |
| Trackback and Ping Setup/Testing | 7/12/05 | 3.0, 1.5 | 1.0 | 0.3 | 5.8 |
| Better Quicktags for Comments | 7/14/05 | Research 3.0, Set-up 1.0, Add Tools 1.0, Techniques 0.5 | 2.0 | 0.5 | 8.0 |
| Formal Site Release! | 7/18/05 | | 3.0 | 1.0 | 4.0 |
| Prepare to Blog Summary and PDF | 7/20/05 | Research 1.5, Techniques 0.5 | 6.0 | 2.0 | 10.0 |

| Summary | Research | Set-up | Add Tools | Techniques | Composition | Posting | Total |
| ‘Typical’ Tasks: Total | 37.0 | 10.6 | 8.5 | 33.0 | 16.0 | 8.0 | 113.1 |
| ‘Typical’ Tasks: % of Total | 32.7% | 9.4% | 7.5% | 29.2% | 14.1% | 7.1% | |
| All AI3 Tasks (incl. red): Total | 74.0 | 36.0 | 11.5 | 51.5 | 85.3 | 52.0 | 310.3 |
| All AI3 Tasks (incl. red): % of Total | 23.9% | 11.6% | 3.7% | 16.6% | 27.5% | 16.7% | |
| Non-‘Typical’ AI3 Tasks: Total | 37.0 | 25.4 | 3.0 | 18.5 | 69.3 | 44.0 | 197.2 |
| Non-‘Typical’ AI3 Tasks: % of Total | 18.8% | 12.9% | 1.5% | 9.4% | 35.2% | 22.3% | |

The table lists about 30 subtasks (generally documented as individual posts on AI3) broken into the six major activity areas of Research, Set-up, Adding (or integrating) Tools, Techniques, Composing Posts, and Posting clean posts with review and formatting. Please note the red entries, since these are deemed to be specific to my unusual demands for AI3 and are therefore not typical of what a serious blogger without these aims might experience. The unusual entries are either entire tasks associated with investigating tools and techniques or the efforts spent in composing and posting the ‘Preparing to Blog’ diary.
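For anyone who wants to rework the summary rows with their own tracking data, here is a minimal Python sketch of the arithmetic. The hour figures are copied from the ‘Typical’ and ‘All AI3 Tasks’ total rows of the table; the function names are my own and purely illustrative.

```python
# Column totals copied from the summary rows of the table, in the order
# Research, Set-up, Add Tools, Techniques, Composition, Posting.
AREAS = ["Research", "Set-up", "Add Tools", "Techniques", "Composition", "Posting"]
TYPICAL = [37.0, 10.6, 8.5, 33.0, 16.0, 8.0]
ALL_TASKS = [74.0, 36.0, 11.5, 51.5, 85.3, 52.0]


def shares(hours):
    """Return each area's share of the row total, in percent, to one decimal."""
    total = sum(hours)
    return [round(100 * h / total, 1) for h in hours]


def non_typical(all_hours, typical_hours):
    """Hours left over once the 'typical' hours are subtracted from all hours."""
    return [round(a - t, 1) for a, t in zip(all_hours, typical_hours)]


if __name__ == "__main__":
    for area, pct in zip(AREAS, shares(TYPICAL)):
        print(f"{area:12s} {pct:5.1f}% of typical effort")
    print("Non-typical hours:", non_typical(ALL_TASKS, TYPICAL))
```

Subtracting the ‘typical’ row from the ‘all tasks’ row reproduces the non-‘typical’ totals, which is a quick consistency check on the table.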

Observations and Guidance for the Serious Blogger

A serious blogger should be able to get a fairly comprehensive and well-designed site up and running in less than 100 hours; less if some of the lessons and guidance from the ‘Preparing to Blog’ diary are followed, and less still if standard site content (mission, about me, etc.) is shorter than what I provided on the AI3 site. Moreover, much quicker turnarounds than the three months it took to get AI3 released could easily be accomplished. The longer time for AI3 was caused by the roughly three-times effort associated with the site’s unusual demands, business travel and other work demands, and one family vacation!

Some other observations that may guide planning for serious blogging from these numbers are (with the obvious caveats that different styles and skills may significantly alter these points):

  • As a rule of thumb, consider that research and reading in advance of a given post takes about two times longer than actually writing up the results
  • Besides normal composition time, consider adding another 50% of time to make sure the formatting is correct and the posting will display properly. In other words, preparing a "content-rich" document for your blog may require 150% of the time it formerly took you
  • Set-up time, checklists, site management techniques, naming and filing conventions, etc., are well worth getting worked out in advance to reduce ongoing maintenance and relieve you to post and respond, and
  • Continue to record and maintain best practices as you encounter them.

Finally, ongoing requirements and care-and-feeding will remain demands. If one assumes roughly three "good" posts per week to keep a blog active, the numbers above suggest a weekly effort of about 20-25 hours per week or about 1-2 hrs per day, exclusive of responding to user comments. This may suggest lowering expectations to only a couple of quality postings per week.
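The rules of thumb above can be turned into a rough planning calculation. This is only a sketch: the two hours of writing per post is an assumed input for illustration, not a figure I tracked, and the function names are hypothetical.

```python
def hours_per_post(writing_hours):
    """Estimate total effort for one post from its writing time.

    Applies the two rules of thumb: research takes about twice as long
    as writing, and formatting/posting adds about 50% of the writing time.
    """
    research = 2.0 * writing_hours
    formatting = 0.5 * writing_hours
    return research + writing_hours + formatting


def weekly_hours(writing_hours, posts_per_week=3):
    """Estimate weekly effort for a given posting frequency."""
    return posts_per_week * hours_per_post(writing_hours)


if __name__ == "__main__":
    assumed_writing_hours = 2.0  # assumption for illustration only
    print("Hours per post:", hours_per_post(assumed_writing_hours))
    print("Hours per week:", weekly_hours(assumed_writing_hours))
```

With two hours of writing per post, three posts per week work out to about 21 hours, which falls inside the 20-25 hour weekly range mentioned above. Plug in your own writing time to see how the estimate moves.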

 

Author’s Note:  I actually decided to commit to a blog on April 27, 2005, and began recording soon thereafter my steps in doing so.  Because of work demands and other delays, the actual site was not released until July 18, 2005.  To give my ‘Prepare to Blog …’ postings a more contemporaneous feel, I arbitrarily changed posting dates on this series one month forward, which means some aspects of the actual blog were better developed than some of these earlier posts indicate.  However, the sequence and the content remain unchanged.  A re-factored complete guide will be posted at the conclusion of the ‘Prepare to Blog …’ series, targeted for release about August 18, 2005.  mkb

Posted by AI3's author, Mike Bergman Posted on August 15, 2005 at 1:24 pm in Blogs and Blogging, Site-related | Comments (2)
The URI link reference to this post is: https://www.mkbergman.com/94/preparing-to-blog-time-and-effort-estimates/
The URI to trackback this post is: https://www.mkbergman.com/94/preparing-to-blog-time-and-effort-estimates/trackback/
Posted:August 14, 2005

Author’s Note: There is zipped Javascript code that supports the information in this post. If you develop improvements, please email Mike and let him know of your efforts.


Download JS quicktags code file Click here to download the zipped Javascript code file and LGPL license (14 KB)

Most blogs provide some text explanation above the comment field for what HTML tags the commenter may provide in her response.  However, I’d seen some other sites that had some buttons that automated some of this process. I set out to add this capability to my site with two objectives in mind:  1) make it easy for the non-HTML user to format a post comment; and 2) keep options narrowed to what I thought was an appropriate subset of HTML format tags.

The WordPress Quicktags Facility

The internal WordPress post and page editor comes with what it calls “quicktags.”  These tags, and their explanations, are:

str
“Strong” – creates a <strong> tag that gives strong emphasis (read – bold) to your text
em
“Emphasis” – creates a <em> tag that gives emphasis (read – italics) to your text
link
Link – creates a hyperlink to a web address which you supply in the pop-up box that is activated by the quicktag. If you select text before you click on the link tag, that text will be used as the link text (the clickable stuff) that will be displayed in your post
b-quote
Blockquote – creates a set of blockquote tags that indent text on both the left and right margins. An example follows

This text is within a blockquote tag

del
Delete – deleted text, text that has a strikethrough line through it
ins
Insert – inserted text, text that has been inserted; marked with an underline
img
Image – this works in much the same way as the link tag, you enter the URL of an image into a pop-up box and the image will be inserted into your post
ul
Unordered List – this adds the opening tag to create an unordered list (bulleted) in your post
ol
Ordered List – this adds the opening tag to create an ordered list (numbered) in your post
li
List Item – this adds a single list item to either the unordered or ordered lists. This tag requires either the ordered or unordered list tags to precede it.
code
Code – text that is formatted in a monospaced font to differentiate between code clips and regular text
more
More – this tag adds a <!--more--> tag to your post, which puts in a “more . . . ” link and puts the rest of your post on another page
page
Page – this tag adds a <!--nextpage--> tag to your post, which continues your post on a second page
Dict.
Dictionary lookup – looks up the word you enter into the pop-up box at dictionary.com and opens the definition page in a new window of your browser
Close Tags
Close Tags – this closes any tags (str, em, link, b-quote, del, ins, ul, ol, li, code) that must be closed in order to stop the formatting from continuing down the page. Each tag also turns into a close tag sign (ul becomes /ul, etc.) to allow you to close that individual tag as well.

Alex King’s Expanded Javascript Quicktags

Alex King expanded this listing and provided it as an LGPL-licensed Javascript download, for use either as an addition to the existing administrator quicktags or for formatting online comments.  This Javascript Quicktags is up to version 1.2 and is the basis for what I incorporated on my site for formatting online comments.

My Modifications

While I liked this version, it had some limitations in meeting my objectives:

  1. Too many buttons were provided; I wanted a simpler subset of format tools
  2. None of the buttons had tooltips, so that it was hard to understand what some of them did
  3. Some of the labels needed clarification
  4. I wanted a ‘Preview’ option that would show respondents what their final comments would look like once posted.

The result was that I modified Alex’s starting code.  If you would like to install the same version I have on this site — see the comment field with its buttons below — download the file noted at the top of this post. It includes all of the license and installation instructions of Alex’s current 1.2 version with my modifications.
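To make the idea concrete, here is a small Python sketch of what a quicktag button does in principle: it wraps the commenter’s selected text in an opening and closing tag, and only for tags on a short allowed list. This is not Alex King’s code (which is Javascript running in the browser); the function name and the particular subset of tags are assumptions for illustration.

```python
# Hypothetical narrowed subset of format tags offered to commenters.
ALLOWED_TAGS = ("strong", "em", "blockquote", "code")


def wrap_selection(text, start, end, tag):
    """Wrap text[start:end] in <tag>...</tag>, as a quicktag button would.

    Raises ValueError for tags outside the allowed subset or for a
    selection range that does not fit inside the text.
    """
    if tag not in ALLOWED_TAGS:
        raise ValueError(f"tag not offered to commenters: {tag}")
    if not 0 <= start <= end <= len(text):
        raise ValueError("selection is outside the text")
    return f"{text[:start]}<{tag}>{text[start:end]}</{tag}>{text[end:]}"


if __name__ == "__main__":
    comment = "make this bold"
    print(wrap_selection(comment, 10, 14, "strong"))
```

Keeping the allowed list short is the whole point of the second objective: the fewer tags on offer, the less a commenter can get wrong.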

Future Efforts

I will work to add a preview capability so that the commenter can see her formatted comments before submitting the post.  Longer term, I want to install a limited format WYSIWYG editor, similar to what I am using internally with Xinha.


Posted by AI3's author, Mike Bergman Posted on August 14, 2005 at 10:55 am in Blogs and Blogging, Site-related | Comments (0)
The URI link reference to this post is: https://www.mkbergman.com/109/preparing-to-blog-better-quicktags-for-comment-entries/
The URI to trackback this post is: https://www.mkbergman.com/109/preparing-to-blog-better-quicktags-for-comment-entries/trackback/
Posted:August 12, 2005

Setting a Ping List

Pinging is a powerful way to alert posting and blog searching services that you have posted a new entry. There is a facility within the WordPress administration center for setting a ping list under Options-Writing-Update Services; the standard ping service provided with the default installation is Pingomatic. Elliott Back has provided a useful starting list of ping sites to include with WordPress. In reviewing his list, I decided to include these entries, excluding dead links and most foreign sites:

     www.a2b.cc/setloc/bp.a2b
    api.feedster.com/ping
    api.moreover.com/ping
    api.my.yahoo.com/rss/ping
    www.blogdigger.com/RPC2
    blogmatcher.com/u.php
    www.blogshares.com/rpc.php
    www.blogsnow.com/ping
    www.blogstreet.com/xrbin/xmlrpc.cgi
    coreblog.org/ping/
    www.mod-pubsub.org/kn_apps/blogchatter/ping.php
    www.newsisfree.com/xmlrpctest.php
    ping.blo.gs/
    ping.feedburner.com
    ping.syndic8.com/xmlrpc.php
    ping.weblogalot.com/rpc.php
    www.popdex.com/addsite.php
    rpc.blogrolling.com/pinger/
    rpc.pingomatic.com/
    rpc.technorati.com/rpc/ping
    rpc.weblogs.com/RPC2
    www.snipsnap.org/RPC2
    topicexchange.com/RPC2
    xping.pubsub.com/ping/
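For the curious, a ping is just a small XML-RPC call. The sketch below uses Python’s standard xmlrpc.client module to build and send the widely used weblogUpdates.ping request. It assumes the receiving service accepts that standard call; not every endpoint in the list above necessarily does, and the function names are my own.

```python
import xmlrpc.client


def build_ping_request(blog_name, blog_url):
    """Return the XML-RPC request body for a weblogUpdates.ping call."""
    return xmlrpc.client.dumps((blog_name, blog_url),
                               methodname="weblogUpdates.ping")


def send_ping(endpoint, blog_name, blog_url):
    """Send the ping to one service and return its reply.

    `endpoint` must be a full URL including the scheme, for example
    "http://rpc.pingomatic.com/". This performs a network request.
    """
    server = xmlrpc.client.ServerProxy(endpoint)
    return server.weblogUpdates.ping(blog_name, blog_url)


if __name__ == "__main__":
    # Print the request body only; no network traffic is generated here.
    print(build_ping_request("AI3", "https://www.mkbergman.com/"))
```

WordPress does the equivalent for every address on the ping list each time a post is published, so there is normally no need to ping by hand except when testing.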

What is Trackback?

Trackback is a mechanism for a third party to post on its own site a detailed response to one of your posts.  When done, the third party site pings your site, provides an excerpt that displays in the standard comment field, and can then be viewed by URL reference from your comments list.  Thus, the third party respondent need not post on both sites and more detailed responses can be provided on each author’s respective site.  Trackbacks can also be used for content aggregation purposes by topic.

There are many trackback overviews available on the Web, though the concept can seem hard to understand because of its poor name.  A couple that are useful include the newbie guide from Movable Type, the original developer of the trackback function and protocol, and a different short version.
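Under the hood, a trackback ping is an ordinary HTTP POST of form-encoded fields (url, title, excerpt, blog_name) to the trackback URL, answered with a tiny XML document whose error element is 0 on success. The Python sketch below builds such a request body and checks a reply; the function names and example values are illustrative assumptions, and nothing is actually sent over the network.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET


def build_trackback_body(url, title="", excerpt="", blog_name=""):
    """Form-encode the fields of a trackback ping, skipping empty ones.

    The body would be POSTed to the target's trackback URL with the
    content type application/x-www-form-urlencoded.
    """
    fields = {"url": url, "title": title, "excerpt": excerpt,
              "blog_name": blog_name}
    return urlencode({key: value for key, value in fields.items() if value})


def trackback_succeeded(response_xml):
    """Return True when the reply's <error> element contains 0."""
    error = ET.fromstring(response_xml).findtext("error")
    return error is not None and error.strip() == "0"


if __name__ == "__main__":
    print(build_trackback_body("http://example.com/post", title="A reply"))
    print(trackback_succeeded("<response><error>0</error></response>"))
```

Seeing the request laid out this way makes it clearer that the trackback URL is something a third party’s blog software posts to, not a page meant to be opened in a browser.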

Should Trackback be Used?

Initial difficulties in getting trackback configured for my site led me to question whether the function was even desirable.  In fact, within the past year, there has been an explosion of spamming against trackback facilities.  ‘Trackback is Dead’ is one of the more provocative discussions arguing that the time for trackback is past; Matthew Mullenweg, a founding developer of WordPress, has spent time looking at how spamming can be overcome.

Because WordPress 1.5 has what appears to be a pretty effective trackback moderating facility — the same as what is used for comments — and because there appear to be some trackback spam filters and other plugin utilities, I decided to implement the feature for now, see if it is used, and if spam does occur deal with that problem at that time.

Configuring and Testing Trackback

My attempts to grapple with trackback had some difficulties, compounded by my fundamental misunderstandings.  First, the term is not descriptive, and it took me quite a bit of time to understand exactly what trackback was designed to accomplish.  Second, most blogs that have a trackback function show it as a link prior to the comments section.  I first had difficulty writing my comments code such that the trackback link URL displayed properly (for some reason, I needed to pass arguments in single quotes, not double quotes).  Then, when that problem was fixed, clicking on the link produced a 403 error from my hosting server.  This latter problem suggested that I did not have trackback properly configured, when, actually, it was my misunderstanding of how the function worked that was the problem.

The breakthrough in understanding came from external sites that offer ping and trackback testing.  The one I used was from Red Alt.  Definitely use these utilities!  I also tested pinging with Pingomation.

The Red Alt instructions showed that my site was indeed sending and receiving pings properly.  In fact, the use of the Trackback link appeared solely to be a means to get the reference URL to display for the third party to properly link to it.  That caused me to re-think the use of a link in the first place.

Displaying Trackback and Permalinks within Comments Section

To prevent others from having the same confusion I had, I decided to make two changes to how most sites handle trackbacks. First, I decided to eliminate the active links for both trackbacks and permalinks, instead replacing them with more descriptive text fields with URLs that can be copied directly off the browser page. For example, for this post, the references are:

 

The URI link reference to this post is: http://mkbergman.com/?p=103

The URI to trackback this post is: http://mkbergman.com/wp-context/trackback/?p=103

 

Second, I also decided to make a clear distinction between direct comments and trackbacks in my post comments field. I did this by testing for the type of comment; if it is a trackback, it is shown as a different type. The PHP code for implementing this is as follows, placed into the comments.php file where the comment display loop is shown:

      <?php if ($comment->comment_type == 'trackback') : // the comment is a trackback ?>
          provided a trackback on
      <?php else : // the comment is a true comment ?>
          commented on
      <?php endif; ?>

When activated, a regular comment will show as XXX commented on date; if a trackback, it will show as XXX provided a trackback on date.

With these changes, I now had pinging working, trackback working, and clear distinctions in my comments fields as to true comments or trackbacks.

 


Posted by AI3's author, Mike Bergman Posted on August 12, 2005 at 5:34 pm in Blogs and Blogging, Site-related | Comments (0)
The URI link reference to this post is: https://www.mkbergman.com/103/preparing-to-blog-trackback-and-ping-testing/
The URI to trackback this post is: https://www.mkbergman.com/103/preparing-to-blog-trackback-and-ping-testing/trackback/
Posted:August 10, 2005

Reports began surfacing in recent months about rekindled interest by venture capital firms (VC) in open source software companies. The first wave of VC interest in 1999-2000 or so resulted in $714 million in venture funding.[1] Most of these open source companies were based in one manner or another around the Linux operating system. Of the reported 71 open source companies that received VC financing at that time, most failed ($150 M in VC financing alone for Linuxcare and TurboLinux), though Red Hat among some other notables succeeded quite well.

Matt Asay, the organizer of the Open Source Business Conference (OSBC), among other open source advocacies, was the first to note the renewed interest by VCs in next-generation open source companies. In April of this year, he provided a rough tally showing that about $150 million in VC funding had come into open source companies in 2004. This story was picked up by Gary Rivlin of the New York Times in late April. Using estimates from the VentureOne database, Gary estimated 20 open source companies had received $149 million in VC funding in 2004. On Monday, Aug. 8, the Wall Street Journal updated a retrieval from the Dow Jones VentureOne database suggesting $290 million was invested by VCs in new open source start-ups in 2004.[2]

New and more mature business models, plus the growing acceptance of open source and the need for related services by business, as others and I have documented elsewhere, are fueling this rekindled interest. In fact, this new interest began around 2003, though it is accelerating today.

With Matt Asay’s assistance, I have assembled a listing of about 45 firms that have received more than $425 million in VC financing over the past 18-24 months. The trigger point or date appears to be the last financing round into MySQL of $20 million in 2003. Some of these firms, such as JasperSoft, are already in their third round (Series C) of financing.

The table below lists these firms and financing received since 2003. The companies were broadly clustered as either professional services firms (installation, training, support, services, custom programming, or commercial software add-ons), subscription (hosted applications usually provided under per user fees), or dual license where there is a mix of open source and commercial licenses.

| Subscription | $$ (M) | Professional | $$ (M) | Dual License | $$ (M) |
| JotSpot | $5.2 | 5Bridge | $2.7 | Active Endpoints | $2.0 |
| OpenLogic | $4.0 | Aduva | $7.8 | ActiveGrid | $13.0 |
| Simula Labs | $12.5 | Black Duck | $5.0 | Akibia | $8.0 |
| SpikeSource | $15.0 | Cymphonix | $4.0 | Astaro | $12.9 |
| | | Emic Networks | $10.0 | Coridan | $2.5 |
| | | Groundwork IT | $11.5 | db4objects | $1.5 |
| | | JBoss | $10.0 | Forum Systems | $30.5 |
| | | Medsphere | $10.0 | Funambol | $5.0 |
| | | Optaros | $7.0 | Gluecode | $5.0 |
| | | Palamida | $5.0 | Green Plum | $20.0 |
| | | Ping Identity | $13.3 | Jabber | $7.2 |
| | | Pingtel | $10.0 | JasperSoft | $23.3 |
| | | Rally Software | $4.5 | Klocwork | $24.0 |
| | | Realm Systems | $8.5 | Laszlo Systems | $18.3 |
| | | Social Text | $0.5 | LignUp | $5.9 |
| | | SourceLabs | $3.5 | MySQL | $19.5 |
| | | Transitive | $24.5 | Scalix | $19.2 |
| | | Univa | $1.0 | Six Apart | $13.0 |
| | | Xen Source | $6.0 | SugarCRM | $7.8 |
| | | Zend Technologies | $6.0 | | |
| SUB-CATEGORY | $36.7 | | $150.8 | | $238.5 |
| TOTAL | | | | | $426.0 |

This information likely has omissions and other errors. Data has been collected from standard venture databases, plus news releases and open reporting. Corrections and updates are welcomed. Though Matt’s assistance is greatly appreciated, any errors are my own.

Dual licensing opportunities have received the largest share of the funding, though recent trends have tended to support the professional services and subscription models. Very few of the most recent wave of financings are a straight Linux play, and then mostly only for large clustered applications. Services around certification and interoperability have been especially attractive to the VC community.
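The relative shares are easy to work out from the sub-category totals in the table. The short Python sketch below does so; the dollar figures are copied from the SUB-CATEGORY row, while the function name is just for illustration.

```python
# Sub-category totals in $ millions, copied from the table's SUB-CATEGORY row.
FUNDING_M = {
    "Subscription": 36.7,
    "Professional": 150.8,
    "Dual License": 238.5,
}


def funding_shares(funding):
    """Return each business model's share of total funding, in percent."""
    total = sum(funding.values())
    return {model: round(100 * amount / total, 1)
            for model, amount in funding.items()}


if __name__ == "__main__":
    print("Total ($M):", round(sum(FUNDING_M.values()), 1))
    for model, pct in funding_shares(FUNDING_M).items():
        print(f"{model:13s} {pct:5.1f}%")
```

On these figures the dual license firms account for a little over half of the funding tallied, with professional services firms taking roughly a third.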

Though there is always a high failure rate for VC-backed software companies, the more mature and sophisticated business models surrounding the new crop of open source start-ups suggest some cause for optimism. Clearly, both the market and the vendor community are beginning to discover new roles and new needs surrounding open source use in the enterprise. Open-source based companies appear to be moving into the mainstream from the standpoint of venture capitalists.


[1] Gary Rivlin, “Open Wallets for Open Source Software,” New York Times, April 27, 2005. See http://www.nytimes.com/2005/04/27/technology/27open.html?ex=1272254400&en=87f44523b543a6a5&ei=5088&partner=rssnyt&emc=rss

[2] Robert A. Guth and Don Clark, “Linux Feels Growing Pains as Users Demand More Features,” Wall Street Journal, p. B1, August 8, 2005.

Posted by AI3's author, Mike Bergman Posted on August 10, 2005 at 3:14 pm in Open Source, Software and Venture Capital | Comments (0)
The URI link reference to this post is: https://www.mkbergman.com/113/new-425-million-wave-in-open-source-vc-funding/
The URI to trackback this post is: https://www.mkbergman.com/113/new-425-million-wave-in-open-source-vc-funding/trackback/