I wrote the following in response to a recent press inquiry asking me to define the terms “semantic Web” and “industry standards”. It got me thinking about how some new companies are misappropriating the terms:
The “Semantic Web” is a vision first promoted by Tim Berners-Lee, inventor of the World Wide Web and director of the W3C standards consortium [1,2,3]. In its full sense, which is understood to require many years to reach fruition, today’s document Web evolves into a Web of data in which machines can understand the meaning of that data, interoperate, and act on it, performing many useful tasks for people such as finding relevant information and carrying out and interconnecting tasks automatically. This longer-term vision is often expressed as the “uppercase” Semantic Web.
Nearer term, the evolution to a Web of data still occurs, but the aspirations are more immediate and at hand. Important Web data is broken out and expressed in ways that aid interconnection and interoperability. Key sources, such as Wikipedia, Census data and much else, are now expressed at the more atomic level of data and objects (as opposed to Web pages), which leads to meaningful linkages and interoperability.
This partial vision, also supported by Berners-Lee (and, of course, many others), is being demonstrated by the linked data initiative, bringing meaningful results to both machines and humans, and is often called the “lowercase” semantic Web. Others have also called this phase in the Web’s evolution “Web 3.0” (a phrase I dislike, however, because it conveys little meaning and implies no compliance with any standards).
Many wonderful and dedicated people have been working towards these visions for a decade or more. Some adhere more to the “pure” uppercase expression of the vision; others are more near-term, pragmatic and lowercase in nature. The press sometimes likes to see these differences in viewpoint as expressions of controversy or dispute in the community but, by my own lights, they are differences in perspective more than in objectives.
The common thread is the “semantics”, or the meaning, of the data. If we know that two pieces of information or data are related in meaning, then we can act accordingly upon them.
In any case, the mechanisms by which semantic interoperability occurs are standards, nearly all developed and promulgated by the W3C. Key semantic Web standards include URIs (of course); the Resource Description Framework (RDF), which defines the “triples” used to express data relationships between subjects and objects (the two pieces of data); RDF Schema and the Web Ontology Language (OWL), for describing data domains and their structure and vocabularies; GRDDL, for converting common information to RDF; SPARQL, for querying compliant semantic data stores; and, of course, many others.
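To make the idea of a “triple” a little more concrete, here is a minimal sketch in plain Python. It is only an illustration of the subject, predicate and object shape, not real RDF tooling and not SPARQL. The `Triple` type, the `match` helper and every `http://example.org/` URI are hypothetical names I made up for this example.

```python
from typing import List, NamedTuple, Optional


class Triple(NamedTuple):
    """One statement: a subject related to an object by a predicate."""
    subject: str
    predicate: str
    obj: str


# Hypothetical namespace; these URIs are placeholders, not real vocabularies.
EX = "http://example.org/"

triples: List[Triple] = [
    Triple(EX + "TimBernersLee", EX + "directs", EX + "W3C"),
    Triple(EX + "W3C", EX + "publishes", EX + "RDF"),
    Triple(EX + "W3C", EX + "publishes", EX + "SPARQL"),
]


def match(store: List[Triple],
          s: Optional[str] = None,
          p: Optional[str] = None,
          o: Optional[str] = None) -> List[Triple]:
    """Return every triple that agrees with the given terms.

    A term left as None acts as a wildcard, loosely in the spirit of a
    variable in a query pattern.
    """
    return [
        t for t in store
        if (s is None or t.subject == s)
        and (p is None or t.predicate == p)
        and (o is None or t.obj == o)
    ]


# Everything the (hypothetical) W3C resource "publishes".
published = [t.obj for t in match(triples, s=EX + "W3C", p=EX + "publishes")]
```

Because every term is a URI, two datasets that use the same URI for the same thing can be merged by simply concatenating their triples, which is the kind of interoperability the standards are meant to guarantee.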
By “industry compliant”, we mean conformance to all of these open standards guiding this evolution to the Semantic Web. And, obviously, via this compliance, we are then able to interoperate easily with others who also conform.
While there are certainly cases and issues where I disagree with the definitions or specific uses of semantic Web concepts by the World Wide Web Consortium (W3C), without these standards there would be chaos and no interoperability moving forward.
So, while I think it is fair game to criticize and lobby for changes in the W3C’s promulgations, ignoring the standards is not “compliant”. Beware of emerging companies that claim the mantle of the semantic Web (or, worse still, push the fluff of Web 3.0) but do not adhere to these standards. For in the end, they are not the semantic Web, but just another example of the proprietary dinosaurs of the past.