Posted:September 16, 2007

Sweet Tools Listing

AI3's Sweet Tools Listing Updated to Version 10

This AI3 blog maintains Sweet Tools, the largest listing of about 800 semantic Web and -related tools available. Most are open source. Click here to see the current listing!

AI3's listing of semantic Web and -related tools has just been updated to version 10. This version adds 36 new tools since the last update on June 19, bringing the new total to 578 tools.

This version 10 update of Sweet Tools also includes an upgrade to version 2 of the lightweight Exhibit display (thanks again, MIT's Simile program and David Huynh, plus congratulations on your Ph.D, David!) and is separately provided as a simple table for quick download and copying.

Background on prior listings and earlier statistics may be found on these previous posts:

With interim updates periodically over that period.

Because comments on prior posts have expired, this entry is now the place to suggest a new tool. Simply provide your information in the comments section, and the tool will be included in the next update.

Posted:September 11, 2007

Zitgist’s Plug-in Exposes Linked Data for Hundreds of Thousands of WordPress Sites

Notice Anything New at the End of AI3’s Links?

The essence of the Web is the link. We use it to navigate, discover, form communities and get high rankings (or not!) for our Web pages on search engines. But, each link carries much more behind it than what has generally been exposed. That is, until now . . . .

Frédérick Giasson is a pragmatic innovator of the structured Web and semantic Web. Most recently, his efforts have included Ping the Semantic Web (that aggregates RDF published on the Web), the Zitgist semantic Web browser (that enables that RDF data to be viewed in useful ways), TalkDigger (for finding and sharing topical Web discussions), and efforts on a variety of ontologies, including jointly with me on UMBEL.

I have been an aggressive “linker” for some time and try to refer to Wikipedia often for definitions or background as well. Thus, Fred’s most recent efforts to continue to add value to the link as the basic coin of the Web realm really caught my eye.

What is zLinks?

In the early days of the Web, links were used solely to visit specific Web pages or locations within those documents. Somewhat later, actions such as searching or purchasing items could be associated with a link. Most recently, with the emergence of the semantic Web, the very nature of the link has become ambiguous, potentially representing any of the link’s former uses or either direct or indirect references to data and resources.

The Zitgist zLinks plug-in now makes these link uses explicit from within WordPress blogs.

Thus, we see that links can fulfill three different purposes, in rough order of their emergence:

  1. To visit Web pages and locations
  2. To potentially take actions (say, buy or search), and
  3. To retrieve data regarding resources.

The emergence of linked data and the semantic Web (or at least the provision of data via the structured Web) are making the use of the link more complicated and ambiguous. Moreover, sometimes a link is an indirect reference to where data exists, and not the actual resource itself.

What Zitgist’s zLinks does is to make these uses explicit and to remove ambiguities. Further, if a link is not to an actual resource but only a reference to it, zLinks resolves to the link’s correct destination. And, still further, a zLinks link is the gateway to still additional links from its reference destination, making the service a powerful jumping off point in the true spirit of the interlinked Web.

To my knowledge, zLinks is the first and purest implementation of what Kingsley Idehen has termed the “enhanced anchor” or <a++>. RDFa and embedded RDF have similar objectives but are not premised on resolving the existing link.

Like the SIOC Import Plug-in, which imports SIOC metadata into a WordPress blog, the zLinks tool recognizes the importance of standard blogging software and automated background tools to expose data and capabilities. Since WordPress has many hundreds of thousands of site owners and bloggers — not to mention hundreds of millions of visitors — zLinks could be an important first exposure for many to the real power of linking and the semantic Web.

How Do You Use It?

For a site owner, zLinks works like any other plug-in: simply install it and it works smoothly and easily.

As a site user who encounters a zLinks icon in a WordPress blog, all you need to do is click on or mouse over the zLinks launcher icon at the end of any visible link. You will first get an alert that the system is working, retrieving all of the necessary background link information. You will then get a popup showing the results, similar to this one for my own AI3 blog:

Sample Zitgist Browser Linker Popup

The zLinks popup offers direct and related links, with the icons and other associated information an indicator as to the nature of the link and its purpose. In our example case, I click on my name reference, which brings up my FOAF file in the Zitgist browser:

Example FOAF File from Zitgist Browser

Note how picture, mapping and other information is automatically “meshed” with my FOAF file. From this Zitgist browser location, I could obviously continue to explore still further links and relationships. In this manner, zLinks adds an entirely new dynamic dimension to the concept of ‘interlinking.’

If the initial zLinks link references data, that data is now resolved to its proper direct location, and is presented as RDF with further meshing and manipulation available. Other resources may take you directly to a Web page or perform other actions. Some of those actions, for example, may be to format data results in specific views (timelines, maps, charts, tables, graphs, structured reports, etc.). If the sources are data, the ability to make transformations or present the data in various views opens a rich horizon of options.

Tweaks and Caveats

I made some minor tweaks to the Zitgist distribution as provided. First, I replaced the initial link icon with one that is smaller and more in keeping with my local WordPress theme. I did this simply by replacing the mini_rdf.gif image in the /public_html/wp-content/plugins/zitgist-browser-linker/imgs/ directory.

Then, also in keeping with my local theme, I made the text in the popup a bit smaller. I did this simply by adding a font-size: 80%; property to the style.css stylesheet in the /public_html/wp-content/plugins/zitgist-browser-linker/css/ directory.

And, that was it! Simple and sweet.

It is also important to realize that this is just a first-release prototype. Some initial bugs have been discovered and worked out, sometimes the server site is down, and longer-term potentialities are only now beginning to emerge. But, this is still professional software with much thought behind it and much potential in front of it. If it breaks, so what? It is free and it is fun.

Where Next?

To all of you out there new to RDF and structured, linked data, I say: Play and enjoy!

zLinks is only beginning to touch the most visible part of the iceberg. It is pretty clear that the use and usefulness of links are only now being understood. Harking back to the original listing of three possible uses for a link, it is clear that “actions” and the use of the link itself as a referrer and “mini-banner” on the Web are still not appreciated, let alone exploited.

It is interesting that AdaptiveBlue has also come out with a SmartLinks approach that differs somewhat from the Zitgist approach (items and linkages are constructed and then referred to from a central location), but their screenshot does affirm the untapped potential of links.

The W3C semantic Web community continues to grapple with resource/link terminology and nuances, the implications of which will be deferred to another day and another blog entry. However, suffice it to say that with a growing ‘Web of data’ and linked data, not to mention the original document vision and then one of commerce and services, the once lowly link is growing mighty indeed!

Posted:September 10, 2007

Astoria is Whistling Past the Graveyard to Irrelevance

I was pleased to see in my blog reader this morning a post from the Microsoft Astoria team on anticipated data formats for its pending formal release. I have been working on modeling Web data models and hoped to see some insight in the piece.

As the project team states,

The goal of Astoria is to make data available to loosely coupled systems for querying and manipulation. In order to do that we need to use protocols that define the interaction model between the producer and the consumer of that data, and of course we have to serialize the data in some form that all the involved parties understand. So protocols and formats are an important topic in our design process.

With that said, the team announced that the first formal Astoria release will support these three formats (with the single HTTP protocol):

  • ATOM / APP
  • JSON, the JavaScript Object Notation, and
  • Web3S, a Microsoft marketing wonder that as far as I know is only used by the MS Live group.

The latter is a strange mapping of a tree data model to the record base of Astoria, in the process also abandoning the straight XML implementation of earlier versions.

Also notable for its absence is RDF (Resource Description Framework). The defensive response of the Astoria team to this absence speaks for itself:

The May [announcement on Astoria] included support for RDF. While we got positive comments about the fact we supported it, we didn't see any early user actually using it and we haven't seen a particular popular scenario where RDF was a must-have. So we are thinking that we may not include RDF as a format in the first release of Astoria, and focus on the other 3 formats (which are already a bunch from the development/testing perspective).

My personal take is that while I understand how RDF fits in the picture of the semantic web and related tools, the semantic web goes well beyond a particular format. The point is to have well-defined, derivable semantics from services. I believe that Astoria does this independently of the format being used. That, combined with the fact that we didn't see a strong demand for it, put RDF lower in our priority lists for formats.

There was a funny Glenn Ford movie from 1964 called “Advance to the Rear”. The problem is, this is not a movie, but the largest software company in the world taking two steps back for each one forward. Congratulations on alienating still further many thought leaders on the Web.

This is yet another stunning and lame attempt by Microsoft to replace open standards with proprietary ones. Get a clue, Redmond!

Posted:September 9, 2007

This Past Week’s Theme is Lightweight, and Cool!

Danny Ayers has recently instituted a semantic Web links re-cap each week (hey folks, set up a specific category!, though the del.icio.us tag “semweb weekly” is a nice touch), which is a fantastic service by Talis and one that most of us in the community look forward to. This post is not meant to be competitive with it. But, with the stuff happening last week, I could not resist a bit of a re-cap of my own.

While I have a pending post on Zitgist’s zLinks tool for WordPress (stay tuned!), there were some other announcements of the past week I thought were unique and noteworthy:

  • CouchDB: this is a very lightweight, scalable and flexible database approach that involves simple declarations on Web pages and a simple Erlang server component to render them. What is exciting about the approach is the ease with which non-developers or users can create structured and transferable data from unstructured and semi-structured bases; the technical overview is a good place to start
  • RDF/JSON: the JavaScript community and many of us desirous of seeing semWeb efforts broaden to mainstream developers welcomed Keith Alexander’s announcement and request for input on RDF/JSON; Keith has been a pragmatic innovator of the first rank and I suspect this initiative will be widely embraced
  • CoScripter: a simple, natural language system for automating repeated tasks within a Web browser, based on the Koala initiative (see, for example, the CHI 2007 paper) from IBM, for which many of us have been keeping our ears to the railroad tracks, and
  • zLinks: see my next post.
Posted:August 23, 2007

Was the Industrial Revolution Truly the Catalyst?

Why, roughly beginning in 1820, did historical economic growth patterns skyrocket?

This is a question of no small import, and one that has occupied economic historians for many decades. We know what some of the major transitions have been in recorded history: the printing press, Renaissance, Age of Reason, Reformation, scientific method, Industrial Revolution, and so forth. But, which of these factors were outcomes, and which were causative?

This is not a new topic for me. Some of my earlier posts have discussed Paul Ormerod’s Why Most Things Fail: Evolution, Extinction and Economics, David Warsh’s Knowledge and the Wealth of Nations: A Story of Economic Discovery, David M. Levy’s Scrolling Forward: Making Sense of Documents in the Digital Age, Elizabeth Eisenstein’s classic Printing Press, Joel Mokyr’s Gifts of Athena : Historical Origins of the Knowledge Economy, Daniel R. Headrick’s When Information Came of Age : Technologies of Knowledge in the Age of Reason and Revolution, 1700-1850, and Yochai Benkler’s, The Wealth of Networks: How Social Production Transforms Markets and Freedoms. Thought provoking references, all.

But, in my opinion, none of them posits the central point.

Statistical Leaps of Faith

Statistics (originally derived from the concept of information about the state) really only began to be collected in France in the 1700s. For example, the first true population census (as opposed to the enumerations of biblical times) occurred in Spain in that same century, with the United States being the first country to set forth a decennial census beginning around 1790. Pretty much everything of a quantitative historical basis prior to that point is a guesstimate, and often a lousy one to boot.

Because no data was collected (indeed, the idea of data and statistics did not exist), attempts in our modern times to re-create economic and population assessments for earlier centuries are truly a heroic, and estimation-laden, exercise. Nonetheless, the renowned economic historian who has written a number of definitive OECD studies, Angus Maddison, and his team have prepared economic and population growth estimates for the world and various regions going back to AD 1 [1].

One summary of their results shows:

Year (AD)   Avg. Per Capita GDP (1990 $)   Avg. Annual Growth Rate   Years Required for Doubling
1           461
1000        450                            -0.002%                   N/A
1500        566                            0.046%                    1,504
1600        596                            0.051%                    1,365
1700        615                            0.032%                    2,167
1820        667                            0.067%                    1,036
1870        874                            0.542%                    128
1900        1,262                          1.235%                    56
1913        1,526                          1.470%                    47
1950        2,111                          0.881%                    79
1967        3,396                          2.836%                    25
1985        4,764                          1.898%                    37
2003        6,432                          1.682%                    42

Note that through at least 1000 AD economic growth per capita (as well as population growth) was approximately flat. Indeed, up to the nineteenth century, Maddison estimates that a doubling of economic well-being per capita only occurred every 3000 to 4000 years. But, from 1820 or so onward, this doubling accelerated at warp speed to every 50 years or so.
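For readers who want to check the arithmetic behind the last two columns of the table above, here is a minimal sketch in Python. It assumes (my assumption, not something stated in the Maddison sources) that the growth rate is a compound average between adjacent benchmark years and that the doubling time is ln 2 divided by ln(1 + rate). The function names are made up for illustration. Because the GDP figures in the table are rounded, the early-century results come out close to, but not exactly equal to, the published values.

```python
import math

# Benchmark years and average per capita GDP (1990 $), copied from the table above.
GDP_PER_CAPITA = [
    (1, 461), (1000, 450), (1500, 566), (1600, 596), (1700, 615),
    (1820, 667), (1870, 874), (1900, 1262), (1913, 1526),
    (1950, 2111), (1967, 3396), (1985, 4764), (2003, 6432),
]


def annual_growth_rate(start_value, end_value, years):
    """Compound average annual growth rate between two benchmarks."""
    return (end_value / start_value) ** (1.0 / years) - 1.0


def doubling_time(rate):
    """Years needed to double at a constant compound rate; None if not growing."""
    if rate <= 0:
        return None
    return math.log(2) / math.log(1 + rate)


def growth_table(series):
    """Return (year, gdp, rate, doubling_years) for each benchmark after the first."""
    rows = []
    for (y0, v0), (y1, v1) in zip(series, series[1:]):
        rate = annual_growth_rate(v0, v1, y1 - y0)
        rows.append((y1, v1, rate, doubling_time(rate)))
    return rows


if __name__ == "__main__":
    for year, gdp, rate, years in growth_table(GDP_PER_CAPITA):
        doubling = "N/A" if years is None else f"{years:,.0f}"
        print(f"{year:>5} {gdp:>6,} {rate:>8.3%} {doubling:>8}")
```

The negative rate for the first millennium has no doubling time at all, which is why that row reads N/A.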

Looking at a Couple of Historical Breakpoints

The first historical shift in millennial trends occurred roughly about 1000 AD, when flat or negative growth began to accelerate slightly. The growth trend looks comparatively impressive in the figure below, but that is only because the doubling of economic per capita wealth has now dropped to about every 1000 to 2000 years (note the relatively small differences in the income scale). These are annual growth rates about 30 times lower than today, which, with compounding, prove anemic indeed (see estimated rates in the table above).
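To make the compounding point concrete, the short sketch below compares how far income grows over one century at a pre-modern rate and at a recent one. The two rates are taken from the table above; the 100-year horizon and the names are my own choices for illustration.

```python
def compound_multiplier(annual_rate, years):
    """Factor by which a quantity grows at a constant compound annual rate."""
    return (1.0 + annual_rate) ** years


# Rates copied from the table above; the century horizon is an arbitrary choice.
PRE_MODERN_RATE = 0.00046  # 0.046% a year, the figure for 1000 to 1500
RECENT_RATE = 0.01682      # 1.682% a year, the figure for 1985 to 2003
HORIZON_YEARS = 100

if __name__ == "__main__":
    for label, rate in (("pre-modern", PRE_MODERN_RATE), ("recent", RECENT_RATE)):
        factor = compound_multiplier(rate, HORIZON_YEARS)
        print(f"{label:>10}: income multiplied by {factor:.2f} over {HORIZON_YEARS} years")
```

At the pre-modern rate a century adds only a few percent to income, while at the recent rate income multiplies more than fivefold over the same span.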

At about 1000 AD, however, there is an inflection point, though a small one. It is also one that corresponds somewhat to the adoption of raw linen paper versus skins and vellum (among other correlations that might be drawn).

When the economic growth scale gets expanded to include today, these optics change considerably. Yes, there was a bit of growth inflection around 1000 AD, but it is almost lost in the noise over the longer historical horizon. The real discontinuity in economic growth appears to have occurred in the early 1800s compared to all previous recorded history. At this major inflection point in the early 1800s, historically flat income averages skyrocketed. Why?

The fact that this inflection point does not correspond to earlier events such as invention of the printing press or Reformation (or other earlier notable transitions) — and does more closely correspond to the era of the Industrial Revolution — has tended to cement in popular histories and the public’s mind that it was machinery and mechanization that was the causative factor creating economic growth.

Had a notable transition occurred in the mid-1400s to 1500s, it would have been obvious to ascribe more modern economic growth trends to the availability of information and the printing press. And, while the printing press indeed had massive effects, as Elizabeth Eisenstein has shown, the empirical record of changes in economic growth is not directly linked with adoption of the printing press. Moreover, as the graph above shows, something huge did happen in the early 1800s.

Pulp Paper and Mass Media

In its earliest incarnations, the printing press was an instrument of broader idea dissemination, but still largely to and through a relatively small and elite educated class. That is because books and printed material were still too expensive (I would submit largely due to the exorbitant cost of paper), even though somewhat more available to the wealthy classes. Ideas were fermenting, but the relative percentage of participants in that direct ferment was small. The overall situation was better than monks laboriously scribing manuscripts, but not disruptively so.

However, by the 1800s, those base conditions had changed, as reflected in the figure above. The combination of mechanical presses and paper production with the innovation of cheaper “pulp” paper was what truly brought information to the “masses.” Yet, some have even taken “mass media” to be its own pejorative. But, look closely at what that term means and its importance to bringing information to the broader populace.

In Paul Starr’s Creation of the Media, he notes how in 15 years from 1835 to 1850 the cost of setting up a mass-circulation paper increased from $10,000 to over $2 million (in 2005 dollars). True, mechanization was increasing costs, but from the standpoint of consumers, the cost of information content was dropping to zero and approaching a near-time immediacy. The concept of “news” was coined, delivered by the “press” for a now-emerging “mass media.” Hmmm.

This mass publishing and pulp paper were emerging to bring an increasing storehouse of content and information to the public at levels never before seen. Though mass media may prove to be an historical artifact, its role in bringing literacy and information to the “masses” was generally an unalloyed good and the basis for an improvement in economic well being the likes of which had never been seen.

More recent trends show an upward blip in growth shortly after the turn of the 20th century, corresponding to electrification, but then a much larger discontinuity beginning after World War II:

In keeping with my thesis, I would posit that organizational information efforts and early electromechanical and then electronic computers resulting from the war effort, which in turn led to more efficient processing of information, were possible factors for this post-WWII growth increase.

It is silly, of course, to point to single factors or offer simplistic slogans about why this growth occurred and when. Indeed, the scientific revolution, industrial revolution, increase in literacy, electrification, printing press, Reformation, rise in democracy, and many other plausible and worthy candidates have been brought forward to explain these historical inflections in accelerated growth. By my own lights, I believe each and every one of these factors had its role to play.

But at a more fundamental level, I believe the drivers for this growth change came from the global increase in, and access to, prior human information. Surely, the printing press helped to increase absolute volumes. Declining paper costs (a factor I believe to be greatly overlooked but also conterminous with the growth spurt and the transition from rag to pulp paper in the early 1800s) made information access affordable and universal. With accumulations in information volume came the need for better means to organize and present that information (title pages, tables of contents, indexes, glossaries, encyclopedias, dictionaries, journals, logs, ledgers, etc., all innovations of relatively recent times) that themselves worked to further fuel growth and development.

Of course, were I an economic historian, I would need to argue and document my thesis in a 400-pp book. And, even then, my arguments would appropriately be subject to debate and scrutiny.

Information, Not Machines

Tools and physical artifacts distinguish us from other animals. When we see the lack of a direct correlation of growth changes with the invention of the printing press, or growth changes approximate to the age of machines corresponding to the Industrial Revolution, it is easy and natural for us humans to equate such things to the tangible device. Indeed, our current fixation on technology is in part due to our comfort as tool makers. But, is this association with the technology and the tangible reliable, or (hehe) “artifactual”?

Information, specifically non-biological information passed on through cultural means, is what truly distinguishes us humans from other animals. We have been easily distracted looking at the tangible, when it is the information artifacts (“symbols”) that make us the humans who we truly are.

So, the confluence of cheaper machines (steam printing presses) with cheaper paper (pulp) brought information to the masses. And, in that process, more people learned, more people shared, and more people could innovate. And, yes, folks, we innovated like hell, and continue to do so today.

If the nature of the biological organism is to contain within it genetic information from which adaptations arise that it can pass to offspring via reproduction — an information volume that is inherently limited and only transmittable by single organisms — then the nature of human cultural information is a massive shift to an entirely different plane.

With the fixity and permanence of printing and cheap paper — and now cheap electrons — all prior discovered information across the entire species can now be accumulated and passed on to subsequent generations. Our storehouse of available information is thus accreting in an exponential way, and available to all. These factors make the fitness of our species a truly quantum shift from all prior biological beings, including early humans.

What Now Internet?

The means by which information itself is produced and disseminated are also changing and growing. This is an infrastructural innovation that applies multiplier benefits upon the standard multiplier benefit of information. In other words, innovation in the basis of information use and dissemination itself is disruptive. Over history, writing systems, paper, the printing press, mass paper, and electronic information have all had such multiplier effects.

The Internet is but the latest example of such innovations in the infrastructural groundings of information. The Internet will continue to support the inexorable trend to more adaptability, more wealth and more participation. The multiplier effect of information itself will continue to empower and strengthen the individual, not in spite of mass media or any other ideologically based viewpoint but due to the freeing and adaptive benefits of information itself. Information is the natural antidote to entropy and, longer term, to the concentrations of wealth and power.

If many of these arguments about the importance of the availability of information prove correct, then we should conclude that the phenomenon of the Internet and global information access promises still more benefits to come. We are truly seeing access to meaningful information leapfrog anything seen before in history, with soon nearly every person on Earth contributing to the information dialog and leverage.

Endnote: And, oh, to answer the rhetorical question of this piece: No, it is information that has been the source of economic growth. The Industrial Revolution was but a natural expression of then-current information and, through its innovations, a source of still newer information, all continuing to feed economic growth.


[1] The historical data were originally developed in three books by Angus Maddison: Monitoring the World Economy 1820-1992, OECD, Paris 1995; The World Economy: A Millennial Perspective, OECD Development Centre, Paris 2001; and The World Economy: Historical Statistics, OECD Development Centre, Paris 2003. All these contain detailed source notes. Figures for 1820 onwards are annual, wherever possible.

For earlier years, benchmark figures are shown for 1 AD, 1000 AD, 1500, 1600 and 1700. These figures have been updated to 2003 and may be downloaded by spreadsheet from the Groningen Growth and Development Centre (GGDC), a research group of economists and economic historians at the Economics Department of the University of Groningen headed by Maddison. See http://www.ggdc.net/.