Another Innovative Faceted Browser from UVa and the Humanities
A bit over a year ago I spotlighted Collex, a set of tools for COLLecting and EXhibiting information in the humanities. Collex was developed for the NINES project (which stands for the Networked Infrastructure for Nineteenth-century Electronic Scholarship, a trans-Atlantic federation of scholars). Collex has now spawned Blacklight, a library faceted browser and discovery tool.
Project Blacklight
Blacklight is intended as a general faceted browser with keyword inclusion for use by libraries and digital collections. As with Collex, Blacklight is based on the Lucene/Solr facet-capable full-text engine. The name Blacklight is based on the combination of Solr + UV(a).
Blacklight is being prototyped on UVa’s Digital Collections Repository. It was first shown at the 2007 code4lib meeting, but has recently been unveiled on the Web and released as an open source project. More on this aspect can be found at the Project Blacklight Web site.
Blacklight was developed by Erik Hatcher, the lead developer of Flare and Collex, with help from library staff Bess Sadler, Bethany Nowviskie, Erin Stalberg, and Chris Hoebeke. You can experiment yourself with Blacklight at: http://blacklight.betech.virginia.edu/.
The figure below shows a typical output. Various pre-defined facets, such as media type, source, library held, etc., can be combined with standard keyword searches.
Many others have pursued facets, and the ones in this prototype are not uniquely interesting. What is interesting, however, is the interface design and the relative ease of adding, removing or altering the various facets or queries to drive results:
BlacklightDL
An extension of this effort, BlacklightDL, provides image and other digital media support to the basic browser engine. This instance, drawn from a separate experiment at UVa, shows a basic search of ‘Monticello’ when viewed through the Image Gallery:
Like the main Blacklight browser, flexible facet selections and modification are offered. With the current DL prototype, using similar constructs from Collex, there are also pie chart graphics to show the filtering effects of these various dimensions (in this case, drilling down on ‘Monticello’ by searching for ‘furniture’):
BlacklightDL is also working in conjunction with the OpenSource Connections (a resource worth reviewing in its own right).
Blacklight has just been released as an open source OPAC (online public access catalog). That means libraries (or anyone else) can use it to allow people to search and browse their collections online. Blacklight uses Solr to index and search, and uses Ruby on Rails for front-end configuration. Currently, Blacklight can index, search, and provide faceted browsing for MaRC records and several kinds of XML documents, including TEI, EAD, and GDMS; the code is available for downloading here.
Faceted Browsing
There is a rich and relatively long history of faceted browsing in the humanities and library science community. Notably, of course is Flamenco, one of the earliest dating from 2001 and still active, to MIT’s SIMILE Exhibit, which I have written of numerous times. Another online example is Footnote, a repository of nearly 30 million historical images. It has a nice interface and an especially nifty way of using a faceted timeline. Also see Solr in Libraries from Ryan Eby.
In fact, faceted browsing and search, especially as it adapts to more free-form structure, will likely be one of the important visualization paradigms for the structured Web. (It is probably time for me to do a major review of the area. 🙂 )
The library and digital media and exhibits communities (such as museums) are working hard at the intersection of the Web, search, display and metadata and semantics. For example, we also have recently seen the public release of the Omeka exhibits framework from the same developers of Zotero, one of my favorite Firefox plug-ins. And Talis continues to be a leader in bringing the semantic Web to the library community.
The humanities and library/museum communities have clearly joined the biology community as key innovators of essential infrastructure to the semantic Web. Thanks, community. The rest of us should be giving more than a cursory wave to these developments.
* * *
BTW, I’d very much like to thank Mark Baltzegar for bringing many of these initiatives to my attention.