Zotero, first released in October, is perhaps the best Firefox extension that most users have never heard of, unless you are an academic historian or social scientist, in which case Zotero is becoming quite the rage. It is also percolating into other academic fields, including law, math and science.
Zotero is a complete research citation platform, or what its developer, George Mason University’s Center for History and New Media (CHnM), calls “The next-generation research tool.” Zotero allows academics and researchers to extract, manage, annotate, organize, export and publish citations in a variety of formats and from a variety of sources, all within the Firefox browser and all while the user is interacting directly with the Web.
What it Is
Like all Firefox extensions, Zotero is easy to install. From the Firefox add-on site or the Zotero site itself, a single click downloads the app to your browser. After a prompted restart, the app is active. (Later alerts for any version upgrades are similarly automatic, as for any Firefox extension.)
Upon installation, Zotero puts an icon in your status bar and places new options on menus. When you encounter a site that Zotero supports (currently more than 150, mostly university libraries but also Amazon and major publication outlets; here is a listing of Zotero’s so-called translators), you will see a folder symbol in your address bar telling you Zotero is active. A single click downloads the citations from that site automatically to your local system.
Citations have traditionally been one of the more “semantically challenging” data sets, with variations in style, order, format, presentation and coverage all rampant. The fact that Zotero supports a given source site means that it understands these nuances and is ready to store the information in a single, canonical representation. Once downloaded, this citation representation can be easily managed and organized. More importantly, you can export this internal, standard representation into a multitude of export formats (including, most recently, MS Word). In short, like for-fee citation software in the past, Zotero now provides a free and universal mechanism for managing this chaos.
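To make the idea of one canonical record feeding many export formats concrete, here is a minimal Python sketch. It is not Zotero's code or data model: the record layout, the field names, the sample citation and both helper functions are hypothetical, and only a handful of BibTeX and RIS fields are covered.

```python
# Hypothetical canonical citation record; field names and values are
# illustrative only and are NOT Zotero's internal representation.
citation = {
    "key": "doe2006",
    "authors": ["Doe, Jane"],
    "title": "An Example Article",
    "journal": "Journal of Examples",
    "year": 2006,
}


def to_bibtex(c):
    """Render the canonical record as a BibTeX @article entry."""
    fields = [
        ("author", " and ".join(c["authors"])),
        ("title", c["title"]),
        ("journal", c["journal"]),
        ("year", str(c["year"])),
    ]
    body = ",\n".join(f"  {name} = {{{value}}}" for name, value in fields)
    return f"@article{{{c['key']},\n{body}\n}}"


def to_ris(c):
    """Render the same record as a RIS journal-article entry."""
    lines = ["TY  - JOUR"]
    lines += [f"AU  - {author}" for author in c["authors"]]
    lines += [
        f"TI  - {c['title']}",
        f"JO  - {c['journal']}",
        f"PY  - {c['year']}",
        "ER  - ",
    ]
    return "\n".join(lines)


if __name__ == "__main__":
    print(to_bibtex(citation))
    print(to_ris(citation))
```

The point of the design is that each site-specific scraper only has to produce the canonical record once; every export format is then a small, independent function over that record.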
While the address icon acts to download one or more citations (yes, they also work in groups if there are multiple listings on the page!), choosing the Zotero icon itself invokes the full Zotero as an app within the browser, as this screen shot shows:
The left panes provide organization and management and tag support; the middle pane shows the active resources; and the right pane shows the structure associated with the active citation. This is all supported with attractive icons and logical tooltips and organization.
Zotero also offers utilities for creating your own scrapers (“translators”) for new sites not yet in the standard, supported roster. This capability is itself an extension to Zotero, called Scaffold, that also points to the building block nature of the core app. (Other utilities such as Solvent from MIT or others surely to come could either enhance or replace the current Scaffold framework.)
What is Impressive
Though supposedly in “beta,” Zotero already shows a completeness, sophistication and attention to detail not evident in most Firefox extensions. Indeed, this system approaches a complete application in its scope and professionalism. The fact it can be so easily installed and embedded in the browser itself is worth noting.
Firefox extensions have continuously evolved from single-function wonders to crude apps and now, as Zotero and a handful of other extensions show, complete functional applications. And, like OSes of the past, these extensions also adhere to standards and practices that make them pretty easy to use across applications. Firefox is indeed becoming a full-fledged platform.
This system is also using the new SQLite local database function (“mozStorage”) in Firefox 2.x to manage the local data (perhaps one of the first Firefox extensions to do so). This provides a clean and small install footprint for the extension, and opens it up to other standard data utilities.
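Because the data lives in an ordinary SQLite file, any standard SQLite tool or binding can in principle read it. The sketch below uses Python's built-in sqlite3 module with an in-memory database to show the general pattern of storing citations and their tags. The table and column names are assumptions made up for illustration; they are not Zotero's actual schema, and the sample row is invented.

```python
import sqlite3

# In-memory database so the sketch leaves nothing on disk.
conn = sqlite3.connect(":memory:")

# Hypothetical schema (NOT Zotero's real one): items plus free-form tags.
conn.execute(
    "CREATE TABLE items (id INTEGER PRIMARY KEY, title TEXT NOT NULL, year INTEGER)"
)
conn.execute(
    "CREATE TABLE tags (item_id INTEGER REFERENCES items(id), tag TEXT NOT NULL)"
)

# Store one made-up citation and attach two tags to it.
cur = conn.execute(
    "INSERT INTO items (title, year) VALUES (?, ?)", ("An Example Article", 2006)
)
item_id = cur.lastrowid
conn.executemany(
    "INSERT INTO tags (item_id, tag) VALUES (?, ?)",
    [(item_id, "history"), (item_id, "methods")],
)

# Any standard SQL query now works against the stored data.
rows = conn.execute(
    "SELECT i.title, t.tag FROM items i "
    "JOIN tags t ON t.item_id = i.id ORDER BY t.tag"
).fetchall()

for title, tag in rows:
    print(title, tag)
```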
What it Implies
So, beyond its own considerable capabilities, Zotero carries some important implications. First, full-bodied apps, building on many piece-parts, can now be expected around the Firefox platform. (Indeed, I earlier noted the emergence of such “Web OS” prospects as Parakey, whose developers also come from earlier Firefox legacies. One of those developers, Joe Hewitt, is also the author of the impressive Firebug extension.)
Second, the openness of Firefox for web-centric computing will, as I’ve stated before, continue to put competitive pressure on Microsoft’s Internet Explorer. This is good for all users at large and will continue to spur innovation.
Third, the pending version 2.0 of Zotero is slated to have a server-side component. What we are potentially seeing, then, are local client-side instantiations in the browser that can then communicate with remote data servers. This opens up a wealth of possibilities in social networking and collaboration.
And, last, and more specific to Zotero itself (but also enabled with Firefox’s native RDF support), we are now seeing a complete app framework for dealing with structured information and tagging on the Web. While clearly Zotero has a direct audience for citation management and research, the same infrastructure and techniques used by the system could become a general semantic Web or data framework for any other structured application.
Hmmm. Now that sounds like an opportunity . . . .
|An AI3 Jewels & Doubloon Winner|
The past couple of days have seen a flurry of activity and much excitement revolving around a new “database-free” mashup and publication system called Exhibit. Another in a string of sneaky-cool software from MIT’s Simile program (and written by David Huynh, a pragmatic semantic Web developer of the first order), Exhibit (and the rapid innovations sure to follow it) will truly revolutionize Web publishing and the visualization and presentation of structured data. Exhibit is quite simply “structure for the masses.”
What is It?
Exhibit requires no traditional database technology, no server-side code, and no need for a web server. Here is a sampling of Exhibit’s current capabilities:
Exhibit is as simple as defining a spreadsheet; after that you have a complete database! And, if you want to get wild and crazy with presentation and display, then that is easy as well!
What Are Some Examples?
Though Exhibit was released barely one month ago, already there are some pretty impressive examples:
What Are People Saying?
Granted, we’re only talking about the last 24 hours or so, but interesting people are noticing and commenting on this phenomenon:
What is Coming?
Johan Sundström has created an Instant Google Spreadsheets Exhibit, which lets you turn any Google spreadsheet (with certain formatting requirements) into an “exhibit” just by pasting in its public feed URL with immediate faceted browsing; maps and timelines are forthcoming.
Well, a WordPress plug-in is in the works (to be announced, with Derek helping to take the lead on it). Though incorporation into a blog is easy, it does require the author to have system administration rights and access to the WordPress server. A plug-in could remove those hurdles and make usage still easier.
Exhibit’s very helpful online tutorials are being expanded, particularly with more examples and more templates. For those seriously interested in the technology, definitely monitor the Simile project site.
There continues to be activity and expansion of the Babel translation formats. You can now convert BibTeX, Excel, Notation 3 (N3), RDF/XML or tab-separated values (TSV) to a choice of Exhibit JSON, N3 or RDF/XML. And, since Exhibit itself internally stores its data representation as triples, it is tantalizing to think that another Simile project, RDFizers, with its impressive storehouse of RDF converters, may also be more closely tied with Babel. Is it possible that Exhibit JSON may become the lingua franca of small-scale data representation formats?
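As a rough illustration of what a TSV-to-Exhibit-JSON conversion involves, here is a minimal Python sketch. It is not Babel's implementation. It assumes the commonly shown Exhibit JSON layout of a top-level "items" array whose entries carry a "label" and a "type" alongside arbitrary properties; the function name, the default type and the sample rows are hypothetical, and the Simile documentation should be checked for the authoritative format.

```python
import csv
import io
import json


def tsv_to_exhibit(tsv_text, item_type="Item"):
    """Convert tab-separated text (header row first) to an Exhibit-style dict.

    Assumption: each row becomes one entry in a top-level "items" list,
    and rows lacking a "type" column receive the default item_type.
    """
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    items = []
    for row in reader:
        # Drop empty cells so absent values do not appear as properties.
        item = {name: value for name, value in row.items() if value}
        item.setdefault("type", item_type)
        items.append(item)
    return {"items": items}


# Made-up sample data for illustration.
sample = "label\tyear\nZotero\t2006\nExhibit\t2006\n"
data = tsv_to_exhibit(sample, item_type="Tool")

if __name__ == "__main__":
    print(json.dumps(data, indent=2))
```

All values stay as strings here; a fuller converter would also need to decide how to type numbers and dates and how to split multi-valued cells.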
And, within the project team of Huynh and his Ph.D. thesis advisor, David Karger, there are also efforts underway to extend the syntax and functionality of Exhibit. We’ve just seen the expansion to direct Google spreadsheet support, and support for more spreadsheet functionality is desired, including possible string concatenation and numeric operations.
Exhibit itself has been designed with extensibility in mind; its linkage to Timeline is one such example. What will be critical in the weeks and months ahead is the development of a developer and user community surrounding Exhibit. There is presently a fairly active mailing list and I’m sure the MIT folks would welcome serious contributions.
Finally, other aspects of the Simile project itself and related initiatives at MIT have direct and growing ties to Exhibit both in terms of team members and developers and in terms of philosophy. You may want to check out these additional MIT projects including Longwell, Piggy Bank, Solvent, Semantic Bank, Welkin, DSpace, Haystack, Dwell, Ajax, Sifter, Relo plugin, Re:Search, Chickenfoot, and LAPIS. This is a program on the move, to which the closest attention is warranted.
Expected Growing Pains
There are some known display issues in the Safari and Opera browsers; these are being worked on and should be resolved shortly. There are also some style issues and conflicts when embedding in blogs (easily fixed with CSS modifications). There are likely performance problems when data sets get into the hundreds or thousands, but that exceeds Exhibit’s lightweight objectives anyway. There may be other problems that emerge as use broadens.
These issues are to be expected and should not keep you from playing with the system immediately. You’ll be amazed at what you can do, and how rapidly, with so little code.
It has been a fun few days. It’s exciting to be able to be a camp follower during one of those seminal moments in Web development. And, so I say to David and colleagues at MIT and the band of merry collaborators on their mailing list: Thanks! This is truly cool.
|An AI3 Jewels & Doubloon Winner|
Getting the Words Right
There has been some laudable progress in test-driven development (TDD), leading to what is now being touted as “behaviour-driven development” (note the English spelling). Two key proponents of this approach, and of setting up the BDD organization, have been Dave Astels and Dan North, among others.
According to Dave’s first posting on this subject more than a year ago:
Maybe 10% of the people I talk to really understand what [TDD is] really about. Maybe only 5%. That sucks. What’s wrong? Well… one thing is that people think it’s about testing. That’s just not the case.
Sure, there are similarities, and you end up with a nice low level regression suite… but those are coincidental or happy side effects. So why have things come to this unhappy state of affairs? Why do so many not get it?
The thing about BDD is that it is not a new discipline or a radical change from earlier initiatives. It begins from the observation that test-driven design deals mostly with behavior and only in a small portion with unit tests. It extends the metaphor from development to engage the sponsor and (as I argue below) the market as well.
One of the things I find most compelling about the BDD approach is its emphasis on what sales people in the SPIN methodology have called “common language” and the domain-driven design people have called “ubiquitous language.” The notion is that all stakeholders in a project (including, importantly, the market, users and sponsors) need to have a common vocabulary that is simple, accurate, accessible, descriptive and consistent. In short, if such a language can be defined and used assiduously, it becomes compelling and memorable. From the standpoint of development, this leads to consistency and clear communications, with the real side benefit of being more productive. From the standpoint of use and acceptance (“sales”), clear language leads to broader and quicker adoption.
Mindset matters. The language we use in our actual code, the language we use to describe our projects internally, the language we use to communicate the wonderful stuff we have created to the outside world, all of this matters. (Three cheers for dynamic languages and domain-specific languages – DSLs.) In fact, it matters so much, that if we are not taking the market’s viewpoint about what and how to explain this stuff we are likely producing crap that no one is interested in.
We all reflect the tools and the terminology that we use to work our way in the world. Development, testing (behavioral design), and programming languages should all be in sync with our users’ end goals. What is wrong with users being able to read our code and understand what it is intending to do?
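As a small illustration of code that a non-programmer stakeholder could plausibly read, here is a Python sketch in the spirit of BDD. It uses no BDD framework; the Account class, the spec names and the tiny runner are all hypothetical, chosen only to show behavior stated in the domain's own vocabulary.

```python
class Account:
    """Hypothetical domain object used only for this illustration."""

    def __init__(self, balance=0):
        self.balance = balance

    def withdraw(self, amount):
        if amount > self.balance:
            raise ValueError("insufficient funds")
        self.balance -= amount


# Each spec is named as a sentence describing expected behavior.
def an_account_with_sufficient_funds_should_allow_a_withdrawal():
    account = Account(balance=100)
    account.withdraw(30)
    assert account.balance == 70


def an_account_with_insufficient_funds_should_refuse_a_withdrawal():
    account = Account(balance=20)
    try:
        account.withdraw(30)
    except ValueError:
        refused = True
    else:
        refused = False
    assert refused and account.balance == 20


SPECS = [
    an_account_with_sufficient_funds_should_allow_a_withdrawal,
    an_account_with_insufficient_funds_should_refuse_a_withdrawal,
]


def run_specs():
    """Run every spec and return the behaviors that held, as plain sentences."""
    passed = []
    for spec in SPECS:
        spec()  # raises AssertionError if the behavior does not hold
        passed.append(spec.__name__.replace("_", " "))
    return passed


if __name__ == "__main__":
    for sentence in run_specs():
        print(sentence)
```

The output of the runner is a list of plain-language statements about what the system does, which is the kind of shared vocabulary the approach is after.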
The BDD Web site does not yet offer any “cookbooks” for how such language is actually developed nor what specific steps need to be followed. (All practitioners would agree this is a hard process that requires focused attention.) But I think the protagonists are on to something very meaningful and real here.
Modular code development through agile dynamic languages, well-tested, and designed for clarity and purpose with all stakeholders is good code. I encourage the community to pay close attention and to get involved with BDD.
|An AI3 Jewels & Doubloon Winner|
The late Douglas Adams, of Doctor Who and The Hitchhiker’s Guide to the Galaxy fame, produced an absolutely fascinating, prescient and entertaining TV program 16 years ago for BBC2 presaging the Internet. Called Hyperland (see also the IMDB write up), this self-labelled ‘fantasy documentary’ 50-min video from 1990 can now be seen in its entirety from Google video. Mind you, this was well in advance of the World Wide Web (remember the source for ‘www’?) and the browser, though both that name and hypertext are liberally sprinkled throughout the show.
The presentation, written by and starring Adams as the protagonist having a fantasy dream, features Tom, the semantic simulacrum (actually, Tom Baker from Doctor Who), who is the “obsequious, and fully customizable” personal software agent who introduces, anticipates and guides Adams through what in actuality is a semantic Web of interconnected information. Laptops (actually an early Apple), pointing devices, icons and avatars sprinkle this tour de force in an uncanny glimpse into the (now) future.
Sure, some details are gotten wrong and perhaps there is a bit too much emphasis (given today’s realities) on virtual reality, but the vision presented is exactly that promised by the semantic Web and an interconnected global digital library of information and multimedia. Wow! And entertaining and fun to boot!
This is definitely Must See TV!
I’d like to thank Buzzsort for first writing about the availability of this video. Apparently fans and aficionados have been clamoring for some time to see this show again, which has only recently been posted. Indeed, the access to an archived video such as this is a great example of Hyperland coming to reality.
|An AI3 Jewels & Doubloon Winner|
“Why are we not using the data we have?”
So asks Hans Rosling, a professor of international health at Sweden’s world-renowned Karolinska Institute, in a recent TED 2006 talk, now available on the Web. (The specific link is http://www.ted.com/tedtalks/tedtalksplayer.cfm?key=hans_rosling&flashEnabled=1, recorded in February 2006.) This 20-minute video is perhaps the most cogent and entertaining presentation you will ever see regarding how data can be made real and meaningful through appropriate visualization. Professor Rosling inspires us to unlock understanding from the manifest data all around us.
Fortunately, the data visualization techniques he uses can be obtained from the non-profit organization he has founded, Gapminder, which brings global health and demographic data to life using the free Trendalyzer software.
The TED (Technology Entertainment Design) annual conference draws about 1,000 attendees to Monterey, CA, for the bargain price of $4400 per attendee. TED 2007 is already oversubscribed.
|An AI3 Jewels & Doubloon Winner|