Posted: November 2, 2006

Just in Time for Christmas: Vista in the Crosshairs

Or, Give your computer the bird.

Computers are frustrating. Creating documents, finding files, sharing information — why do everyday things still seem so tedious and counterintuitive?

Dave Kushner interviews Blake Ross and gets a preview of his new Parakey venture in the November issue of IEEE Spectrum. Ross, a 20-year-old wunderkind and one of the driving forces behind the Firefox browser, has teamed with Joe Hewitt of Firefox and Firebug fame to create an absolutely disruptive new approach to computing. Quoting from Kushner’s article:

Just as with Firefox, Ross began this project by asking himself one simple question: What's bad about today's software? The answer . . . resided in the gap between the desktop and the Web. . . . The problem, according to Ross, is there's no simple, cohesive tool to help people store and share their creations online. Currently, the steps involved depend on the medium. If you want to upload photos, for example, you have to dump your images into one folder, then transfer them to an image-sharing site such as Flickr. The process for moving videos to YouTube or a similar site is completely different. If you want to make a personal Web page within an online community, you have to join a social network, say, MySpace or Friendster. If you intend to rant about politics or movies, you launch a blog and link up to it from your other pages. The mess of the Web, in other words, leaves you trapped in one big tangle of actions, service providers, and applications. Ross's answer is . . . Parakey, "a Web operating system that can do everything an OS can do." Translation: it makes it really easy to store your stuff and share it with the world. Most or all of Parakey will be open source, under a license similar to Firefox's.

Thus, Parakey aims to bridge the divide between desktop operating systems and the Internet, using the browser as the common user interface. Parakey will let users host their own Web sites from their desktop. Even though Parakey works within the browser (all leading ones are to be supported), it actually runs on the local computer. This enables developers to do many things not possible on a traditional Web site. Using assignable “keys”, the desktop owner can also post or allow access to content of their choosing (documents, photos, files), making it “public” to the distribution lists associated with those keys. Remote users are issued cookies so that their access to the local resources is seamless.
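Parakey's actual design has not been published, so the following is only a minimal sketch of how a key-and-cookie sharing model like the one described above might work. Every name in it (`KeyRing`, `assign`, `share`, `issue_cookie`, `can_access`, the sample users and file path) is a hypothetical illustration, not Parakey's API.

```python
import secrets
from dataclasses import dataclass, field


@dataclass
class KeyRing:
    """Hypothetical model: each named key maps to a distribution list."""
    members: dict = field(default_factory=dict)  # key name -> set of user ids
    shared: dict = field(default_factory=dict)   # resource path -> set of key names
    cookies: dict = field(default_factory=dict)  # cookie token -> user id

    def assign(self, key, users):
        """Add users to the distribution list behind a key."""
        self.members.setdefault(key, set()).update(users)

    def share(self, resource, key):
        """Make a local resource visible to everyone holding the key."""
        self.shared.setdefault(resource, set()).add(key)

    def issue_cookie(self, user):
        """Give a remote user a random token so later requests need no login."""
        token = secrets.token_hex(16)
        self.cookies[token] = user
        return token

    def can_access(self, token, resource):
        """Allow access only if the cookie's user holds a key on the resource."""
        user = self.cookies.get(token)
        if user is None:
            return False
        return any(user in self.members.get(key, set())
                   for key in self.shared.get(resource, set()))


# Assumed example data: a "family" key shared with two users.
ring = KeyRing()
ring.assign("family", {"alice", "bob"})
ring.share("photos/vacation.jpg", "family")
alice_cookie = ring.issue_cookie("alice")
carol_cookie = ring.issue_cookie("carol")
```

In this sketch, `ring.can_access(alice_cookie, "photos/vacation.jpg")` is true because alice is on the "family" list, while carol's cookie is refused. The owner manages access by editing lists rather than individual files, which is the convenience the key idea seems to be aiming for.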

Similar to the models of the Firefox plugin or Web services, the basic Parakey platform can be easily extended. Ross and Hewitt have created a programming language, JUL (for ‘Just another User interface Language’), likely similar to the Mozilla XUL, for developers to write these components and extensions. Though the launch date for Parakey is being kept under wraps, all signals point to before January. The pre-launch company site allows interested parties to enter their email address to receive formal notification of the launch.

It is rather amazing that this article came out on the same day, yesterday, as John Milan’s blog post on Richard MacManus’s Read/Write Web blog. In that post, Milan posits Mozilla as another one of the gorillas (elephants) in the room and Adobe’s Apollo project as another “under the radar” approach to the desktop/Internet browser convergence.

All of this seems rather ironic as the world (Redmond) awaits the release of the long-delayed Windows update, Vista. Even the mighty do indeed live in interesting times.

Posted by AI3's author, Mike Bergman Posted on November 2, 2006 at 11:48 am in Adaptive Information, Open Source | Comments (3)
The URI link reference to this post is: http://www.mkbergman.com/298/bridging-the-divide-with-parakey/
Posted: October 5, 2006

Eric Blue has assembled a valuable list of about 30 reference links and examples of information visualization in his post, Dataesthetics: The Power and Beauty of Data Visualization. I especially liked the http://infosthetics.com/ reference. Across the board, these links provide a pretty compelling showcase of the approaches and techniques available for effective data presentation.

For example, as one of the many hundreds of options available, here is Sun’s interactive Java concept map (will spawn a new window and requires Flash viewer):

Launch Concept Map

I highly recommend Eric Blue’s posting.

Posted by AI3's author, Mike Bergman Posted on October 5, 2006 at 9:45 am in Adaptive Information | Comments (0)
The URI link reference to this post is: http://www.mkbergman.com/292/valuable-data-visualization-reference/
Posted: October 4, 2006
This AI3 blog maintains Sweet Tools, the largest available listing of about 800 semantic Web and related tools. Most are open source. Click here to see the current listing!

Since my recent posting of 175 semantic Web tools, I have received many suggestions from users (thanks to all of you!) and also came across another great reference site, Krugle. (You can also use the engine for finding white papers, technical papers and projects, in addition to code. I used the project search with keywords such as “semantic web”, ontology, annotation and the like. There is a useful demo of Krugle as well.)

At any rate, these sources have now enabled me to add another 75 or so new semantic Web tools to the previous listing. The resulting comprehensive update to 250 SW tools is shown below, with the new additions having the colored background.

NAME (URL) DESCRIPTION
3store A core C library that uses MySQL to store its raw RDF data and caches, forming an important part of the infrastructure required to support a range of knowledgeable services
4Suite 4RDF 4Suite 4RDF is an open-source platform for XML and RDF processing implemented in Python with C extensions
ActiveRDF ActiveRDF is a library for accessing RDF data from Ruby programs. It can be used as data layer in Ruby-on-Rails. You can address RDF resources, classes, properties, etc. programmatically, without queries
Adaptiva A user-centred ontology building environment, based on using multiple strategies to construct an ontology, minimising user input by using adaptive information extraction
Aduna Metadata Server The Aduna Metadata Server automatically extracts metadata from information sources, like a file server, an intranet or public web sites. The Aduna Metadata Server is a powerful and scalable store for metadata
AeroText Entity extraction engine from Lockheed Martin
AJAX Client for SPARQL AJAX Client for SPARQL is a simple AJAX client that can be used for running SELECT queries against a service and then integrating them with client-side Javascript code
AKT Research Map A competence map for members of the AKT project
AKT-Bus An open, lightweight, Web standards-based communication infrastructure to support interoperability among knowledge services.
AllegroGraph Franz Inc’s AllegroGraph is a system to load, store and query RDF data. It includes a SPARQL interface and RDFS reasoning. It has a Java and a Prolog interface
Alembic The Alembic Workbench project from Mitre has as its goal the creation of a natural language engineering environment for the development of tagged corpora
Almo An ontology-based workflow engine in Java
Altova SemanticWorks Visual RDF and OWL editor that auto-generates RDF/XML or nTriples based on visual ontology design
AMALGAM The AMALGAM (Automatic Mapping Among Lexico-Grammatical Annotation Models) project is an attempt to create a set of mapping algorithms to map between the main tagsets and phrase structure grammar schemes used in various research corpora. Software has been developed to tag text with up to 8 annotation schemes
Amilcare An adaptive information extraction tool designed to support document annotation for the Semantic Web.
Amine Amine is a Multi-Layer Platform implemented in Java. It provides various Engines and GUIs to build a wide variety of Ontology-based applications, Conceptual Graph based applications, Intelligent Systems and Multi-Agents Systems
Anacubis Anacubis is a visual analysis tool that lets its users visualize the relationships between entities in a collection of information. The visualization is rather similar to concept maps.
ANNIE – Open Source Information Extraction An open-source robust information extraction system
Aperture Aperture is a Java framework for extracting and querying full-text content and metadata from various information systems (e.g. file systems, web sites, mail boxes) and the file formats (e.g. documents, images) occurring in these systems
Applications of FCA in AKT Formal Concept Analysis (FCA) is used in a variety of application scenarios in AKT in order to perform concept-based domain analysis and automatically deduce a taxonomy lattice of that domain.
Aqua AQUA is a system which answers questions written in English. It combines several technologies: Natural Language Processing, Logic, Information Retrieval and Ontologies
ARC ARC is a lightweight, SPARQL-enabled RDF system for mainstream Web projects. It is written in PHP and has been optimized for shared Web environments
Armadillo Exploits the redundancies apparent in the Internet, combining many information sources to perform document annotation with minimal human intervention.
ArtEquAkt A system that automatically extracts information about artists from the web, populates an ontology, then uses the knowledge to generate personalised biographies
ASIS for GNAT ASIS (Ada Semantic Interface Specification) for GNAT on gcc. ASIS is a published international ISO standard (ISO/IEC 15291:1999). ASIS based tools are available as well
ATLAS ATLAS (Architecture and Tools for Linguistic Analysis Systems) is a joint initiative of NIST, MITRE and the LDC to build a general purpose annotation architecture and a data interchange format. The starting point is the annotation graph model, with some significant generalizations
Automatic Support for Enterprise Modelling and Workflow Knowledge management using multi-modelling techniques and how modelling activities may be assisted with automation based on formal methods.
AutoSemantix AutoSemantix is a round-trip code generation tool designed to streamline the creation of Semantic Web applications for the Java platform
BBN OWL Validator BBN OWL Validator
Beagle++ Beagle++ is an extension to the Beagle search tool for the personal information space. Beagle++ now makes that search semantic, moving towards a vision of the Semantic Desktop
Bibster A semantics-based bibliographic peer-to-peer system
Bossam Bossam, a rule-based OWL reasoner (free, well-documented, closed-source)
Brahms Brahms is a fast main-memory RDF/S storage, capable of storing, accessing and querying large ontologies. It is implemented as a set of C++ classes
BrownSauce The BrownSauce RDF browser is a project to aggregate and present arbitrary RDF data in as pleasing a manner as possible, that is a ‘semantic web browser’. Brownsauce is a local http server; however it should be trivial to add other front-ends
BuddySpace Instant messaging with custom map visualizations, semantics of presence (beyond ‘offline’/‘online’/‘away’ status) and value-added web services (group alerts, bots, inferences via personal profiles)
Callisto The Callisto annotation tool was developed to support linguistic annotation of textual sources for any Unicode-supported language with annotation support from jATLAS
CARA CARA (CARMEN RDF API) provides an API for the Resource Description Framework (RDF). The API is based on the graph model of RDF, supports in-memory and persistent storage and includes an RDF Parser
CASD A tool for producing system architecture diagrams from service and data descriptions.
CASheW-s Engine The purpose of this project is to facilitate the composition of semantic web services. It consists of two parts, of which this is one
CCM Content-Based Cross-Site Mining (CCM) of Web Data Records algorithm combines techniques of extracting data records based on the structure of documents (HTML tags) with an analysis of the semantics of the content for better data record extraction
Cerebra Server A technology platform that is used by enterprises to build model-driven applications and highly adaptive information integration infrastructure; company recently bought by webMethods
COCKATOO A knowledge acquisition tool which can be used to produce a set of cases for use with a Case-Based Reasoning system.
COHSE – Conceptual Open Hypermedia Services Environment COHSE researches methods to improve significantly the quality, consistency and breadth of linking of WWW documents at retrieval and authoring time.
CS AKTiveSpace CS AKTiveSpace is a smart browser interface for a Semantic Web application that provides ontologically motivated information about the UK computer science research community.
ClassAKT A text classification web service for classifying documents according to the ACM Computing Classification System.
Compendium Compendium is a semantic, visual hypertext tool for supporting collaborative domain modelling and real time meeting capture
ConRef A service discovery system which uses ontology mapping techniques to support different user vocabularies
ConcepTool A system to model, analyse, verify, validate, share, combine, and reuse domain knowledge bases and ontologies, reasoning about their implication.
Corese Corese stands for Conceptual Resource Search Engine. It is an RDF engine based on Conceptual Graphs (CG) and written in Java. It enables the processing of RDF Schema and RDF statements within the CG formalism, provides a rule engine and a query engine accepting the SPARQL syntax
cwm The Closed World Machine (CWM) is a data manipulator, rules processor and query system that mostly uses the Notation 3 textual RDF syntax. It also has incomplete OWL Full support and SPARQL access. It is written in Python
Cypher Cypher generates RDF and SeRQL representations of natural language statements and phrases
D2R MAP Processor D2R MAP is a declarative language to describe mappings between relational database schemata and OWL ontologies. This D2R processor implements the D2R mapping language and exports data from a relational database as RDF, N3, N-TRIPLES or as Jena models
D2R Server D2R Server turns relational databases into SPARQL endpoints, based on Jena’s Joseki
D3E – Digital Document Discourse Environment D3E enables the easy conversion of websites or structured documents into interactive discussion sites
DBIN DBin brings the Semantic Web to the end users. By joining P2P groups and communities, users can annotate any topic or subject of interest and enjoy browsing and editing in a semantically rich environment.
Deep Query Manager Search federator from deep Web sources
DEVONthink DEVONthink is a single database for organizing and annotating all desktop and Web documents using semantic concepts; it runs only on Mac OS X
DOME A programmable XML editor which is being used in a knowledge extraction role to transform Web pages into RDF, and available as Eclipse plug-ins. DOME stands for DERI Ontology Management Environment
DOSE A distributed platform for semantic annotation
Drive Drive is an RDF parser written in C# for the .NET platform
ekoss.org A collaborative knowledge sharing environment where model developers can submit advertisements
Ellogon Ellogon is a multi-lingual, cross-platform, general-purpose language engineering environment, based on the earlier TIPSTER approach
Endeca Facet-based content organizer and search platform
Eprep An add-on for the Eprints document archive which uses text extraction to automatically create the bibliographic metadata needed for the submission of a new document.
eServices The e-Services framework provides advanced scholarly services (in particular visualisations) using distributed metadata.
Euler Euler is an inference engine supporting logic based proofs. It is a backward-chaining reasoner enhanced with Euler path detection. It has implementations in Java, C#, Python, Javascript and Prolog. Via N3 it is interoperable with W3C Cwm
Exteca The Exteca platform is an ontology-based technology written in Java for high-quality knowledge management and document categorisation. It can be used in conjunction with search engines
ExtrAKT ExtrAKT is a tool for extracting ontologies from Prolog knowledge bases.
F-Life F-Life is a tool for analysing and maintaining life-cycle patterns in ontology development.
FaCT++ FaCT++ is an OWL DL Reasoner implemented in C++
Fastr Fastr is a parser for term and variant recognition. Fastr takes as input a corpus and a list of terms and outputs the indexed corpus in which terms and variants are recognized
Floodsim A prototype system which demonstrates the benefits of applying semantically rich service descriptions (expressed using Semantic Web technologies) to Web Services.
FOAF-o-matic Online FOAF generator
FOAM Framework for ontology alignment and mapping
Foxtrot Foxtrot is a recommender system which represents user profiles in ontological terms, allowing inference, bootstrapping and profile visualization
FreeLing FreeLing is an open source language analysis tool suite. The FreeLing package consists of a library providing language analysis services (such as morphological analysis, date recognition, PoS tagging, etc.). The current version (1.2) of the package provides tokenizing, sentence splitting, morphological analysis, NE detection, date/number/currency recognition, PoS tagging, and chart-based shallow parsing
Fresh Framework Fresh Framework is a CMS designed for the Semantic Web, with WYSIWYG page editing, RDF summaries of profiles and news, and countless other quality features you expect to find in a CMS
GATE – General Architecture for Text Engineering GATE is a stable, robust, and scalable open-source infrastructure which allows users to build and customise language processing components, while it handles mundane tasks like data storage, format analysis and data visualisation.
Gnowsis A semantic desktop environment
GNOWSYS GNOWSYS, Gnowledge Networking and Organizing System, is a web based hybrid knowledge base with a kernel for semantic computing. It is developed in Python and works as an installed product in ZOPE
Graphl Graphl is a generic graph visualization and manipulation tool written in Java. Graphl reads and writes RDF files, visualizes them in a flexible and customizable way and allows users to edit them intuitively
Groove Graph transformation, model transformation, object-oriented verification, behavioural semantics
GrOWL Open source graphical ontology browser and editor
HALoGEN HALoGEN is an extremely powerful and easy to use general-purpose natural language generation system. It consists of a symbolic generator, a forest ranker, and some sample inputs. The symbolic generator includes the Sensus Ontology dictionary based on WordNet. The forest ranker includes a 250 million word ngram language model (unigram, bigram, and trigram) trained on the Wall Street Journal newspaper text. The symbolic generator is written in LISP and requires a Lisp interpreter
HAWK OWL repository framework and toolkit
Haystack Haystack is a tool designed to let individuals manage all their information in ways that make the most sense to them. By removing arbitrary barriers created by applications that handle only certain information “types” and that record only a fixed set of relationships defined by the developer, we aim to let users define whichever arrangements of, connections between, and views of information they find most effective
Heart of Gold Heart of Gold is a middleware for the integration of deep and shallow natural language processing components. It provides a uniform and flexible infrastructure for building applications that use Robust Minimal Recursion Semantics (RMRS) and/or general XML standoff annotation produced by NLP components
HELENOS A Knowledge discovery workbench for the semantic Web
hMAFRA (Harmonize Mapping Framework) hMAFRA is a set of tools supporting semantic mapping definition and data reconciliation between ontologies. The targeted formats are XSD, RDFS and KAON
hypKNOWsys hypKNOWsys aims at developing a Java-based workbench for knowledge discovery and knowledge management. Currently, hypKNOWsys has released two intermediate tools: DIAsDEM Workbench (text mining for semantic tagging) and WUMprep (Web mining pre-processing)
I-X Process Panels The I-X tool suite supports principled collaborations of human and computer agents in the creation or modification of some product
IBM Semantics Toolkit IBM Semantics Toolkit is designed for storage, manipulation, query, and inference of ontologies and corresponding instances. A major purpose is to establish an end-to-end ontology engineering environment tightly integrated with dominant Meta-Object Facility (MOF)-based modeling and application development tools. The semantics toolkit contains three main components (Orient, EODM, and RStar), which are designed for users of different levels.
Identify Knowledge Base Identify-Knowledge-Base is a tool for topic identification in knowledge bases
IF-Map IF-Map is an Information Flow based ontology mapping method. It is based on the theoretical grounds of logic of distributed systems and provides an automated streamlined process for generating mappings between ontologies of the same domain.
IkeWiki IkeWiki is a new kind of Wiki (a so-called “Semantic Wiki”) developed by Salzburg Research
ILP for Information Extraction To overcome the knowledge acquisition bottleneck, we apply Inductive Logic Programming techniques to learn Information Extraction rules.
Internet Reasoning Service The Internet Reasoning Service provides a number of tools which support the publication, location, composition and execution of heterogeneous web services, specified using semantic web technology
IODT IBM’s toolkit for ontology-driven development
IsaViz IsaViz is a visual authoring tool for browsing and authoring RDF models represented as graphs. Developed by Emmanuel Pietriga of W3C and Xerox Research Centre Europe
Jambalaya Protégé plug-in for visualizing ontologies
Jastor Open source Java code generator that emits Java Beans from ontologies
Javascript RDF/Turtle parser Javascript RDF/Turtle parser, can be used with Jibbering
Jena Jena is a Java framework to construct Semantic Web Applications. It provides a programmatic environment for RDF, RDFS, OWL and SPARQL, and includes a rule-based inference engine. It also has the ability to be used as an RDF database via its Joseki layer. See the Jena discussion list for more information
JessTab JessTab is a plug-in for Protégé that allows you to use Jess and Protégé together. JessTab provides a Jess console window where you can interact with Jess while running Protégé. Furthermore, JessTab extends Jess with additional functions that allows you to map Protégé knowledge bases to Jess facts. Also, there are functions for manipulating Protégé knowledge bases from Jess
Jibbering Jibbering, a simple javascript RDF Parser and query thingy
Joseki Jena’s Joseki layer offers an RDF Triple Store facility with SPARQL interface (see also the entry on Jena)
JRDF JRDF (Java RDF Binding) is an attempt to create a standard set of APIs and base implementations for RDF using Java. Includes a SPARQL GUI.
KAON Open source ontology management infrastructure
KAON2 KAON2 is an infrastructure for managing OWL-DL, SWRL, and F-Logic ontologies. It is capable of manipulating OWL-DL ontologies; queries can be formulated using SPARQL
Kazuki Generates a java API for working with OWL instance data directly from a set of OWL ontologies
KIM Platform KIM is a software platform for the semantic annotation of text, automatic ontology population, indexing and retrieval, and information extraction from Ontotext
KnoZilla
Knowledge Broker The knowledge broker addresses the problem of knowledge service location in distributed environments.
Kowari Open source database for RDF and OWL
KRAFT – I-X TIE Supports collaboration among members of a virtual organisation by integrating workflow and communication technology with constraint solving.
Kraken Kraken is an application for managing knowledge objects, which can be documents, remote or locally cached Web pages, personal information, todo list items, appointments, and so on. It is especially useful for researchers or students to manage their information. Users can annotate these knowledge objects with metadata, perform complex queries, and present the results as HTML pages. Kraken uses RDF as its native format, allowing its data to be easily read by external applications
LingPipe LingPipe is a suite of Java tools designed to perform linguistic analysis on natural language data. LingPipe’s flexibility and included source make it appropriate for research use. Version 1.0 tools include a statistical named-entity detector, a heuristic sentence boundary detector, and a heuristic within-document coreference resolution engine
LinguaStream LinguaStream is an integrated experimentation environment (IEE) targeted to researchers in Natural Language Processing. LinguaStream allows processing streams to be assembled visually, picking individual components in a “palette” (the standard set contains about fifty components, and is easily extensible using a Java API, a macro-component system, and templates). Some components are specifically targeted to NLP, while others solve various issues related to document engineering (especially to XML processing). Other components are to be used in order to perform computations on the annotations produced by the analysers, to visualise annotated documents, to generate charts, etc.
LinKFactory Language & Computing’s LinKFactory is an ontology management tool that provides an effective and user-friendly way to create, maintain and extend extensive multilingual terminology systems and ontologies (English, Spanish, French, etc.). It is designed to build, manage and maintain large, complex, language independent ontologies.
Longwell Longwell is a web-based RDF-powered highly-configurable faceted browser
Lucene Apache Lucene is a high-performance, full-featured text search engine library written entirely in Java. It is a technology suitable for nearly any application that requires full-text search, especially cross-platform. It is open source
LuMriX A commercial search engine using semantic Web technologies
Machinese Syntax Machinese Syntax provides a full analysis of texts by showing how words and concepts relate to each other in sentences – still with very competitive speed and accuracy. Machinese Syntax helps analytic applications understand text beyond the level of words, phrases and entities: also their interrelations (such as events, actions, states and circumstances); from Connexor
MAFRA Toolkit Ontology MApping FRAmework Toolkit allows users to create semantic relations between two (source and target) ontologies, and apply such relations in translating source ontology instances into target ontology instances
Magpie Magpie supports the interpretation of web documents through on-the-fly ontologically based enrichment. Semantic services can be invoked either by the user or be automatically triggered by patterns of browsing activity
MatrixBrowser Visualization Kit The MatrixBrowser project presents a new approach for visualizing and exploring large networked information structures which may represent, for instance, linked information resources or metadata structures such as ontologies
Melita Melita is a semi-automatic annotation tool using an Adaptive Information Extraction engine (Amilcare) to support the user in document annotation
MetaDesk MetaDesk is an RDF authoring tool that emphasizes entry of facts, rather than construction of ontologies. MetaDesk places no restrictions on vocabulary: users can invent terms on-the-fly, which the system converts into underlying RDF structures.
MetaMatrix Semantic vocabulary mediation and other tools
Metatomix Commercial semantic toolkits and editors
MindRaider Open source semantic Web outline editor
MnM MnM is an annotation tool which provides both automated and semi-automated support for annotating web pages with semantic contents. MnM integrates a web browser with an ontology editor and provides open APIs to link to ontology servers and for integrating information extraction tools
Model Futures OWL Editor Simple OWL tools, featuring UML (XMI), ErWin, thesaurus and imports
Morla Morla is an editor of RDF documents that allows you to manage multiple RDF documents simultaneously, visualize graphs, and use templates for quick writing. You can import RDFS documents and use their content to write new RDF triples. Templates are also RDF documents, and they make Morla easily personalizable and expandable. You can also use Morla as an RDF navigator, browsing the RDF documents present on the Internet exactly as you are used to doing with normal browsers
Mulgara The Mulgara Semantic Store is an Open Source, massively scalable, transaction-safe, purpose-built database for the storage and retrieval of RDF, written in Java. It is an active fork of Kowari
Muskrat-II Given a set of knowledge bases and problem solvers, the Muskrat system will try to identify which knowledge bases could be combined with which problem solvers to solve a given problem.
MyPlanet MyPlanet allows users to create a personalised version of a web based newsletter using an ontologically based profile.
Net OWL Entity extraction engine from SRA International
NMARKUP NMARKUP helps the user build ontologies by detecting nouns in texts and by providing support for the creation of an ontology based on the entities extracted.
Nokia Semantic Web Server An RDF based knowledge portal for publishing both authoritative and third party descriptions of URI denoted resources
Nuin BDI Agent Engine A Java BDI agent engine for semantic web agents
Oink OINK is a browser for RDF data. OINK queries data in an RDF triple store, and renders it as XHTML pages (essentially, one page per each node in the graph, on demand). This allows one to view RDF (and OWL) data in a very clear, intuitive way. OINK is built on top of Wilbur
OMCSNet-WordNet The OMCSNet-WordNet project aims to improve the quality of the OMCSNet dataset by using automated processes to map WordNet synonym sets to OMCSNet concepts and import additional semantic linkage data from WordNet. It is based on OMCSNet 1.2, a semantic network and inference toolkit written in Python/Java. OMCSNet currently contains over 280,000 separate pieces of common sense information extracted from the raw OMCS dataset. This project is also based on WordNet, an online lexical reference system that in recent years has become a popular tool for AI researchers
ONDEX Suite Framework for text mining, data integration and data analysis. Keywords: ontology and graph alignment, relation mining, warehouse, semantic database integration, bioinformatics, systems biology, microarray, Java, Postgres, LINUX
ONTOCOPI A tool which uncovers Communities Of Practise by analysing the connectivity of instances in the 3store knowledge base.
OntoEdit/OntoStudio Engineering environment for ontologies
Ontology Organizer A DAML+OIL ontology editor with constraint propagation functionality to ensure that constraints applied to properties and restrictions are correctly propagated through an ontology, and datatype management functionality for manipulating custom datatypes
OntoMat Annotizer Interactive Web page OWL and semantic annotator tool
OntoPortal Enables the authoring and navigation of large semantically-powered portals
OpenLink Data Spaces (ODS) ODS is a distributed collaborative application platform for creating Semantic Web applications such as: blogs, wikis, feed aggregators, etc., with built-in SPARQL support and incorporation of shared ontologies such as SIOC, FOAF, and Atom OWL. ODS is an application of OpenLink Virtuoso and is available in Open Source and Commercial Editions.
Oracle Spatial 10g Oracle Spatial 10g includes an open, scalable, secure and reliable RDF management platform
Oyster Peer-to-peer system for storing and sharing ontology metadata
OWL API A Java interface and implementation for the W3C Web Ontology Language (OWL), used to represent Semantic Web ontologies. The API is focused towards OWL Lite and OWL DL and offers an interface to inference engines and validation functionality
OWL Consistency checker OWL Consistency checker (based on Pellet)
OWL-DL Validator WonderWeb OWL-DL Validator
OWLJessKB OWLJessKB is a description logic reasoner for OWL. The semantics of the language is implemented using Jess, the Java Expert System Shell. It currently supports most of the common features of OWL Lite, plus some and minus some
OWLLib This is a PHP library for accessing OWL files. OWL is a W3C standard for storing semantic information
OWLIM OWLIM is a high-performance semantic repository, packaged as a Storage and Inference Layer (SAIL) for the Sesame RDF database
OWLViz OWLViz is a visual editor for OWL and is available as a Protégé plug-in
Pedro Pedro is an application that creates data entry forms based on a data model written in a particular style of XML Schema. Users can enter data through the forms to create data files that conform to the schema. They can use controlled vocabularies to mark-up text fields and have the application perform basic validation on field data
Pellet Pellet is an open-source Java based OWL DL reasoner. It can be used in conjunction with both Jena and OWL API libraries; it can also be downloaded and be included in other applications
Piggy Bank A Firefox-based semantic Web browser
Pike A dynamic programming (scripting) language similar to Java and C for the semantic Web
Platypus Wiki Platypus Wiki is an enhanced Wiki Wiki Web with ideas taken from the Semantic Web. It offers a simple user interface to create a Wiki page plus metadata according to W3C standards. It uses RDF/RDFS and OWL to create ontologies and manage metadata
POR Protege+OWL+Ruby (POR) Utilities provides an ontology and a set of Ruby classes and methods to simplify the development of Protege+OWL ontology-driven applications. At the moment the project is limited to JRuby
pOWL Semantic Web development platform
Protégé Open source visual ontology editor written in Java with many plug-in tools
Pytypus Wiki Pytypus is a Semantic Web project. In Pytypus, RDF is the basis of communication between agents in the semantic net. Every URI in the semantic net has an owner that rules its behavior
RACER A collection of Projects and Tools to be used with the semantic reasoning engine RacerPro
RacerPro RacerPro is an OWL reasoner and inference server for the Semantic Web
Raptor The Raptor RDF parser toolkit is a free software / Open Source C library that provides a set of parsers and serializers that generate Resource Description Framework (RDF) triples by parsing syntaxes or serialize the triples into a syntax. The supported parsing syntaxes are RDF/XML, N-Triples, Turtle, RSS tag soup including Atom 1.0 and 0.3, and GRDDL for XHTML and XML. The serializing syntaxes are RDF/XML (regular, and abbreviated), N-Triples, RSS 1.0, Atom 1.0 and Adobe XM
Rasqal Rasqal is a C library for querying RDF, supporting the RDQL and SPARQL languages. It provides APIs for creating a query and parsing query syntax. It features pluggable triple-store source and matching interfaces, an engine for executing the queries and an API for manipulating results as bindings. It uses the Raptor RDF parser to return triples from RDF content and can alternatively work with the Redland RDF library’s persistent triple stores. It is portable across many POSIX systems
rdfabout.com’s Validator RDF/XML and N3 validator
RDF Filter This program acts as a filter layer between SAX (the Simple API for XML) and the higher-level RDF (Resource Description Framework), an XML-based object-serialization and metadata format. The RDF filter library is used by several RDF-based projects
RDF Gateway Intellidimension’s RDF Gateway is an RDF Triple database with RDFS reasoning and SPARQL interface
RDF InferEd Intellidimension’s RDF InferEd is an authoring environment with the ability to navigate and edit RDF documents
RDFizers RDFizers are little conversion tools for converting a source file in a given format to RDF. RDFizers are provided for JPEG, MARC/MODS, OAI-PMH, OCW, EMail, BibTEX, Flat, Weather, Java, Javadoc, Jira, Subversion and Random. In addition, the project page has links to other third-party RDF converters for iCal, Palm, Outlook, RFC822, Garmin, EXIF, Fink, D2RQ, D2RMAP, XLS, CSV, XSD, XML and MPEG-7/CS
RDFLib RDFLib is an RDF library for Python, including a SPARQL API. The library also contains both in-memory and persistent Graph backends
RDFReactor Access RDF from Java using inferencing
RDF Server The RDF server of the PHP RAP environment
RDFStore RDFStore is an RDF store with Perl and C APIs and SPARQL facilities
RDFSuite The ICS-FORTH RDFSuite provides open source, high-level scalable tools for the Semantic Web. This suite includes a Validating RDF Parser (VRP), an RDF Schema Specific DataBase (RSSDB) and a supporting RDF Query Language (RQL)
RDFX RDFX is a suite of plug-ins for the Eclipse platform designed to encourage and facilitate experimentation of semantically enhanced applications
Redfoot Redfoot is a hypercoding system which is being used to create a webized operating system and is also being used to create applications. It is built around the notion of an RDF Graph for persistence rather than a File Tree
Redland The Redland RDF Application Framework is a set of free software libraries that provide support for RDF. It provides parsers for RDF/XML, Turtle, N-Triples, Atom and RSS; has a SPARQL and GRDDL implementation; and has language interfaces to C#, Python, Obj-C, Perl, PHP, Ruby, Java and Tcl
RelationalOWL Automatically extracts the semantics of virtually any relational database and transforms this information automatically into RDF/OW
ReTAX+ ReTAX is an aide to help a taxonomist create a consistent taxonomy and in particular provides suggestions as to where a new entity could be placed in the taxonomy whilst retaining the integrity of the revised taxonomy (c.f., problems in ontology modelling).
Refiner++ REFINER++ is a system which allows domain experts to create and maintain their own Knowledge Bases, and to receive suggestions as to how to remove inconsistencies, if they exist.
Rhizome Wiki Rhizome is a Wiki-like content management and delivery system that exposes the entire site including content, structure, and metadata as editable RDF. This means that instead of creating a site with URLs that correspond to a page of HTML, you can create URLs that represent just about anything. It was designed to enable non-technical users to create these representations in an easy, ad-hoc manner. For developers, this allows both content and structure to be easily repurposed and complex Web applications to be rapidly developed
Rx4RDF Rx4RDF shields developers from the complexity of RDF by enabling you to use familiar XML technologies like XPath, XSLT and XUpdate to query, transform and manipulate RDF. Also included is Rhizome, a wiki-like application for viewing and editing RDF models
Seamark Navigator Siderean’s Seamark Navigator provides a platform to combine Web search pages with product catalog databases, document servers, and other digital information from both inside and outside the enterprise
Searchy Searchy is a metasearch engine that is able to integrate information from a wide range of sources by performing a semantic translation into RDF. It has a distributed nature and is especially suitable for integrating information across different organisations with minimum coupling
SECO SECO provides mediation services for Semantic Web data, comprising data acquisition and data integration mediators. A SECO mediator comprises an HTTP server, an RDQL parser, and means to fetch data via RDQL/HTTP. User interface and scutter can accept commands via HTTP GET, where the user interface serves HTML pages, and the scutter fetches a page
Semantic Annotation with MnM MnM is a semantic annotation tool which provides manual, automated and semi-automated support for annotating web pages with ‘semantics’, i.e., machine interpretable descriptions.
Semantical Open source semantic Web search engine
Semantic Engine The Semantic Engine is a standalone indexer/search application. Mac OSX only; Windows and Linux versions are on their way
Semantic Explorer The Semantic Explorer allows you to enter a search query and watch as the resulting sub-graph is laid out on screen, visually clustering documents and terms together. Mac OS X only
Semantic Tools for Web Services Semantic Tools for Web Services is a set of Eclipse plug-ins that allow developers to insert semantic annotations into a WSDL document to describe the semantics of the input, output, preconditions, and effects of service operations. A second plug-in matches the description of the service or composition of services to that for which a developer is searching. This technology is part of the Emerging Technologies Toolkit (ETTK)
Semantic Web It includes Ontology, Knowledge-base Representation, Description Logic, and Agent Development for the next Generation Web – the Semantic Web. It is designed to use OWL, DAML+OIL, RDFs, RDF, or XML syntax to design ontology; developed using J2EE
Semantic Web Assistant The Semantic Web Assistant combines the capabilities of production rule systems with RDF data on the Semantic Web. It lets users define rules that work with RDF data in order to carry out actions like e-mail notification etc.
SemanticWorks A visual RDF/OWL Editor from Altova
Semantic Mediawiki Semantic extension to the MediaWiki wiki
Semantic Net Generator Utility for generating topic maps automatically
SemWeb SemWeb for .NET supports persistent storage in MySQL, PostgreSQL, and SQLite; has been tested with 10-50 million triples; supports SPARQL
Sesame Sesame is an open source RDF database with support for RDF Schema inferencing and querying. It offers a large scale of tools to developers to leverage the power of RDF and RDF Schema
SHAME (Standardized Hyper Adaptible Metadata Editor) SHAME is a metadata editing and presentation framework for RDF metadata. Annotation profiles are used to generate user interfaces for editing, presentation or querying purposes. The user interface may be realized in a web setting (both JSP and Velocity versions exist) or in a stand-alone application (a Java/Swing version exists)
SMART System for Managing Applications based on RDF Technology
SMORE OWL markup for HTML pages
SOFA SOFA is a Java API for modeling ontologies and Knowledge Bases in ontology and Semantic Web applications. It provides a simple, abstract and language neutral ontology object model, inferencing mechanism and representation of the model with OWL, DAML+OIL and RDFS languages; from java.dev
Solvent Solvent is a Firefox extension that helps you write Javascript screen scrapers for Piggy Bank
SPARQL Query language for RDF
SPARQLer SPARQL query demo and service
SPARQLette A SPARQL demo query service
SPARQL JavaScript Library The SPARQL JavaScript Library interfaces to the SPARQL Protocol and interprets the return values as part of an Ajax framework
Surnia Surnia can check an OWL ontology/knowledge base for inconsistency and entailments. It is implemented as a wrapper around a first-order theorem prover (OTTER, for now at least). Unlike Hoolet (which turns the OWL into FOL), Surnia just turns the OWL into triples and mixes in axioms
SWCLOS A semantic Web processor using Lisp
SWI-Prolog SWI-Prolog is a comprehensive Prolog environment, which also includes an RDF Triple store. There is also a separate Prolog library to handle OWL
Swish Swish is a framework for performing deductions in RDF. It has similar features to CWM. It is written for Haskell developers
Swoogle A semantic Web search engine with 1.5 M resources
SWOOP A lightweight ontology editor
Thema Thema is an XML based data format (DTD) for thesauri, glossaries, lexicons, conceptual maps etc. up to ontologies. It contains publishing tools to convert into HTML, RDF etc. and to read different formats, and it has a connection to the Semantic Web
ThoughtTreasure ThoughtTreasure is a comprehensive platform for natural language processing and commonsense reasoning. It runs on PCs and Unix and includes 20,000 concepts organized into a hierarchy, 50,000 English and French words and phrases, a syntactic and semantic parser and an English and French generator. Application programs can use ThoughtTreasure to obtain answers to questions easily answered by humans but previously difficult for computers
Timeline Timeline is a DHTML-based AJAXy widget for visualizing time-based events. It is like Google Maps for time-based information
TopBraid Composer TopQuadrant’s TopBraid Composer is a complete standards-based platform for developing, testing and maintaining Semantic Web applications
Treebank The Penn Treebank Project annotates naturally-occurring text for linguistic structure. Most notably, we produce skeletal parses showing rough syntactic and semantic information — a bank of linguistic trees. We also annotate text with part-of-speech tags
Trellis Trellis is an interactive environment that allows users to add their observations, viewpoints, and conclusions as they analyze information by making semantic annotations to documents and other on-line resources
TRIPLE TRIPLE is an RDF query, inference, and transformation language for the Semantic Web
Trippi Trippi is a Java library providing a consistent, thread-safe access point for updating and querying a triplestore. It is similar in spirit to JDBC, but for RDF databases
Tucana Suite Northrop Grumman’s Tucana Suite is an industrial quality version of the Kowari metastore
Turtle Terse RDF “Triple” language
Visualisations for the CS AKTive Portal Maps are used to geographically illustrate knowledge from the Triplestore, such as highlighting the locations in the UK that are active in a particular research area.
VisualText VisualText® is an integrated development environment for building information extraction systems, natural language processing systems, and text analyzers
W3C’s RDF Validator W3C’s RDF Validator
WebCAT WebCAT is an extensible tool to extract meta-data and generate RDF descriptions from existing Web documents. Implemented in Java, it provides a set of APIs (Application Programming Interfaces) that allow one to analyse text documents from the Web without having to write complicated parsers
WebOnto WebOnto supports the browsing, creation and editing of ontologies through coarse grained and fine grained visualizations and direct manipulation.
Welkin Welkin is a graph-based RDF visualizer.
WGFA WGFA (Web Gateway for Fact Assessment) is a web application to create and manage W3C-OWL based ontologies, index websites, extract XML-RDF or Dublin-Core metadata and provide search and query operations on the websites based on the created semantic webs
Wilbur Wilbur is a Lisp-based toolkit for Semantic Web programming. Wilbur is Nokia Research Center’s toolkit for programming Semantic Web applications that use RDF, written in Common Lisp
WOM The IBM Web Ontology Manager (WOM) is a lightweight, J2EE Web-based system for managing Web Ontology Language (OWL) ontologies. It enables developers to browse or search the ontologies registered with the system by class or property names. In addition, they can submit a new ontology file
Wraf Wraf (Web resource application framework) implements an RDF API that hopes to realize the Semantic Web. The framework uses RDF for data, user interface, modules and object methods. It uses interfaces to other sources in order to integrate all data in one environment, regardless of storage
WSMO Studio A semantic Web service editor compliant with WSMO as a set of Eclipse plug-ins
WSMT Toolkit The Web Service Modeling Toolkit (WSMT) is a collection of tools for use with the Web Service Modeling Ontology (WSMO), the Web Service Modeling Language (WSML) and the Web Service Execution Environment (WSMX)
WSMX Execution environment for dynamic use of semantic Web services
xml2owl Up to now, most ontologies have been created manually, which is very time-consuming. The goal is to produce ontologies automatically via XSLT that fit a given XML file or XML Schema file as well as possible
XML Army Knife XML Army Knife
XMP A labeling technology from Adobe that enables data about a file to be embedded as metadata into the file itself.
YARS YARS (Yet Another RDF Store) is a data store for RDF in Java and allows for querying RDF based on a declarative query language, which offers a somewhat higher abstraction layer than the APIs of RDF toolkits such as Jena or Redland
Zeus Agent Toolkit Zeus provides a graphical environment to build distributed agent systems. A rule engine, planner and visualisation tools are included. The released version contains some extensions for the DAML semantic web project and Web Services integration features
Zotero Firefox add-in (in development) that allows the auto-completion of online citations
ZTM (Zope Topic Map) ZTM aims to enable distributed development and maintenance of ‘topic map’-driven ‘semantic’ web sites by handling data model information items derived from the ISO 13250 Topic Map Data Model as managed content using the Zope CMF

Posted by AI3's author, Mike Bergman Posted on October 4, 2006 at 10:03 am in Adaptive Information, Semantic Web, Semantic Web Tools | Comments (26)
The URI link reference to this post is: http://www.mkbergman.com/291/comprehensive-listing-of-250-semantic-web-tools-updated/
The URI to trackback this post is: http://www.mkbergman.com/291/comprehensive-listing-of-250-semantic-web-tools-updated/trackback/
Posted:September 29, 2006

Matt Asay, of OSBC and Alfresco, makes a very telling point in a recent post: One power of open source (if done right) is its suitability to interoperability and extensibility. As Matt states:

. . . let me give him/you an idea of what we’re already doing in this space. It’s not a question of what we might do, but what we’re already doing. You can get Alfresco integrated with Asterisk (VoiceRD from Novacoast) and SugarCRM (CRM) today. (And since our 1.4 Business Process Management release, we already have BPM in spades.)

Now extend this. Add some JasperSoft or Pentaho for Business Intelligence (perhaps reporting capabilities). Some DimDim for web conferencing. Some Zimbra or Scalix for email/collaboration. Want to scale this out on a grid? Get yourself some 3Tera. Etc. The great thing about all of this is that we don’t have to do all of it ourselves. In many instances, enterprises are already extending Alfresco (or these other projects) to meet these and other needs. Hence, when a large pharmaceutical/medical devices company wanted wiki functionality in Alfresco, it didn’t ask us. It just built it in.

One could certainly make the argument that first-generation open source like Linux was adopted for cost, risk and code-access purposes, and that second-generation open source like JBoss or Red Hat was adopted because of completeness and support across a broader portion of the stack. But I think what we are now seeing in third-generation open source efforts like Alfresco or LogicBlaze is the enterprise-scale integration and interoperability of components.

Open source combined with open standards avoids vendor lock-in and points the way to a very, very different application and deployment paradigm: identifying, evaluating and gluing, rather than baking the cake each time from scratch.

Posted by AI3's author, Mike Bergman Posted on September 29, 2006 at 12:55 pm in Open Source | Comments (0)
The URI link reference to this post is: http://www.mkbergman.com/290/the-power-of-open-source/
The URI to trackback this post is: http://www.mkbergman.com/290/the-power-of-open-source/trackback/
Posted:September 27, 2006

Thanks to a post from NewsForge on Open source search technology goes beyond keywords, I was directed to a description of the Semantic Indexing Project at Middlebury College. Aaron Coburn, the lead developer of the project, says his team is currently documenting its open source search toolkit and finishing up a new desktop search application that should be released later this month. From the project Web site:

The National Institute for Technology in Liberal Education (NITLE) and Middlebury College have been experimenting with algorithms to help unstructured data organize itself into conceptually useful categories without human intervention. Part of our motivation is to find an alternative to spending prohibitive amounts of time and money on marking up course materials, documents, and online collections with metadata by hand. For many of the most common markup standards in use today, such as SCORM or Dublin Core, it can actually take longer to create markup than it did to create the course materials themselves.

The method being applied is a more scalable variant of latent semantic indexing that the team calls contextual network graphing. A PDF paper from the project, Semantic Search of Unstructured Data using Contextual Network Graphs by Maciej Ceglowski, Aaron Coburn and John Cuadrado explains this promising technique in greater detail and notes its debt to a 1981 Ph.D. dissertation by Scott Preece at the University of Illinois describing an almost identical technique under the name spreading activation search.
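To make the idea of spreading activation concrete, here is a minimal Python sketch. It is not the project's code: the term weighting (simple relative frequency), the function names, and the `decay`, `threshold` and `max_hops` parameters are all assumptions chosen for illustration. Documents and terms form a bipartite graph; a query injects energy at its term nodes, and that energy flows outward across weighted edges, fading at each hop, so documents can be retrieved even when they do not contain the query term itself.

```python
from collections import defaultdict


def build_graph(docs):
    """Build an undirected bipartite term-document graph.

    Edge weight is the term's relative frequency in the document
    (an illustrative choice, not the project's actual weighting).
    """
    graph = defaultdict(dict)
    for doc_id, text in docs.items():
        words = text.lower().split()
        for word in set(words):
            weight = words.count(word) / len(words)
            graph[("term", word)][("doc", doc_id)] = weight
            graph[("doc", doc_id)][("term", word)] = weight
    return dict(graph)


def spread(graph, query_terms, decay=0.5, threshold=0.001, max_hops=4):
    """Spread energy from the query terms and rank documents by energy received."""
    energy = defaultdict(float)
    # Each query term that exists in the graph starts with one unit of energy.
    frontier = {("term", t.lower()): 1.0
                for t in query_terms if ("term", t.lower()) in graph}
    for hop in range(max_hops + 1):
        for node, amount in frontier.items():
            energy[node] += amount
        if hop == max_hops:
            break
        next_frontier = defaultdict(float)
        for node, amount in frontier.items():
            for neighbour, weight in graph[node].items():
                passed = amount * weight * decay
                # Energy too weak to matter is dropped, which ends the spread.
                if passed >= threshold:
                    next_frontier[neighbour] += passed
        if not next_frontier:
            break
        frontier = next_frontier
    ranked = [(node[1], amount) for node, amount in energy.items()
              if node[0] == "doc"]
    return sorted(ranked, key=lambda pair: pair[1], reverse=True)


docs = {
    "d1": "cat sat mat",
    "d2": "cat chased mouse",
    "d3": "stock market fell",
}
ranked = spread(build_graph(docs), ["mat"])
for doc_id, amount in ranked:
    print(doc_id, round(amount, 4))
```

In this toy collection a query for "mat" ranks d1 first, but d2 also receives a little energy through the shared term "cat", while the unrelated d3 receives none. That indirect retrieval is the behaviour that distinguishes this family of techniques from plain keyword matching.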

The Semantic Indexing Project is an umbrella effort over a number of subsidiary projects including a blog census, literary analysis tool, refinement of search and clustering algorithms, bioinformatics, use of ontologies, and semantic relationship visualization through a Semantic Explorer, as this example shows:

All of the source code is available for download from the project, published under the terms of the GNU General Public License. The project’s core technology is the Semantic Engine, which is distributed with its C++ code, Perl bindings, and all the necessary code for building the GUI. A new desktop application, called the Standalone Engine, will be available later this month.

This work looks very, very promising as a step forward to bringing automation to semantic Web markup, among related advantages deriving from tagged documents.