Marko Rodriguez has been one of the most exciting voices in graph applications and theory with relevance to the semantic Web over the past five years. He is personally innovating an entire ecosystem of graph systems and tools for which all of us should be aware.
The other thing about Marko I like is that he puts thoughtful attention and graphics to all of his posts. (He also likes logos and whimsical product names.) The result is that, when he presents a new post, it is more often than not a gem.
Today Marko posted what I think is a keeper on graph-related stuff:
I personally think it is a nice complement to my own Age of the Graph of a few months back. In any event, put Marko’s blog in your feed reader. He is one of the go-to individuals in this area.
Shun the Frumious Bandersnatch?The Web and open source have opened up a whole new world of opportunities and services. We can search the global information storehouse, connect with our friends and make new ones, form new communities, map where stuff is, and organize and display aspects of our lives and interests as never before. These advantages compound into still newer benefits via emergent properties such as social discovery or bookmarking, adding richness to our lives that heretofore had not existed.
And all of these benefits have come for free.
Of course, as our use and sophistication of the Web and open source have grown we have come to understand that the free provision of these services is rarely (ever?) unconditional. For search, our compact is to accept ads in return for results. For social networks, our compact is give up some privacy and control of our own identities. For open source, our compact is the acceptance of (generally) little or no support and often poor documentation.
We have come to understand this quid pro quo nature of free. Where the providers of these services tend to run into problems is when they change the terms of the compact. Google, for example, might change how its search results are determined or presented or how it displays its ads. Facebook might change its privacy or data capture policies. Or, OpenOffice or MySQL might be acquired by a new provider, Oracle, that changes existing distribution, support or community involvement procedures.
Sometimes changes may fit within the acceptable parameters of the compact. But, if such changes fundamentally alter the understood compact with the user community, users may howl or vote with their feet. Depending, the service provider may relent, the users may come to accept the new changes, or the user may indeed drop the service.
But there is another aspect of the use of free services, the implications of which have been largely unremarked. What happens if a service we have come to depend upon is no longer available?
Abandonment or changes in service may arise from bankruptcy or a firm being acquired by another. My favorite search service of a decade ago, AltaVista, and Delicious are two prominent examples here. Existing services may be dropped by a provider or APIs removed or deprecated. For Google alone, examples include Wave and Gears, Google Labs, and many, many APIs. (The howls around Google Translate actually caused it to be restored.) And existing services may be altered, such as moving from free to fee or having capabilities significantly modified. Ning and Babbel are two examples here. There are literally thousands of examples of Web-based free services that have gone through such changes. Most have not seen widespread use, but have affected their users nonetheless.
There is nothing unique about free services in these regards. Ford was able to cease production of its Edsel and change the form factor of the Thunderbird despite some loyal fans. Sugar Pops morphed into a variety of breakfast cereal brands. Sony Betamax was beat out by VHS, which then lost out to CDs and now DVDs. My beloved Saabs are heading for the dustbin, or Chinese ownership.
In all of these cases, as consumers we have no guarantees about the permanence of the service or the infrastructure surrounding it. The provider is solely able to make these determinations. It is no different when the service or offering is free. It is the reality of the marketplace that causes such changes.
But, somehow, with free Web services, it is easy to overlook these realities. I offer a couple of personal case studies.
I have earlier described the five different versions of site search that I have gone through for this blog. The thing is, my current option, Relevanssi, is also a free plug-in. What is notable about this example, though, is the multiple attempts and (unanticipated) significant effort to discover, evaluate and then implement alternatives. Unfortunately, I rather suspect my current option may itself — because of the nature of free on the Web — need to be replaced at some time down the road.
Part of what caused me to abandon Google Custom Search as one of the above search options was the requirement I serve ads on my blog to use it. So, when I decided to eliminate ads entirely in 2010 I not only gave up this search option, but I also lost some of the better tracking and analytics options also provided for free by Google. Fortunately, I had also adopted FeedBurner early in the life of this blog. It was also becoming increasingly clear that feed subscribers — in addition to direct site visitors — were becoming an essential metric for gauging traffic.
I thus had a replacement means for measuring traffic trends. Google (strange how it keeps showing up!) had purchased FeedBurner in 2007, and had made some nice site and feature improvements, including turning some paid services into free. The service was performing quite well, despite FeedBurner’s infamous knack to lose certain feed counts periodically. However, this performance broke last Summer when my site statistics indicated a massive drop in subscribers.
The figure below, courtesy of Feed Compare, shows the daily subscriber statistics for my AI3 blog for the past two years. The spikiness of the curve affirms the infamous statistics gaps of the service. The first part of the curve also shows nice, steady growth of readers, growing to more than 4000 by last Summer. Then, on August 16, there was a massive drop of 85% in my subscriber counts. I monitored this for a couple of days, thinking it was another temporary infamous event, then realized something more serious was afoot:

It was at this point I became active on the Google group for FeedBurner. Many others had noted the same service drop. (The major surmise is that FeedBurner now is having difficulty including Feedfetcher feeds, which is interesting because it is the feed of Google’s own Reader service, and the largest feed aggregation source on the Web.)
Over the ensuing months until last week I posted periodic notices to the official group seeking clarification as to the source of these errors and a fix to the service. In that period, no Google representative ever answered me, nor any of the numerous requests by others. I don’t believe there has been a single entry on any matter by Google staff for nearly the past year.
I made requests and inquiries no fewer than eight times over these months. True, Google had announced it was deprecating the FeedBurner API in May 2011, but, in that announcement, there was no indication that bug fixes or support to their own official group would cease. While it is completely within Google’s purview to do as it pleases, this behavior hardly lends itself to warm feelings by those using the service.
Finally, last week I dropped the FeedBurner stats and installed a replacement WordPress plugin service [1]. It was clear no fixes were forthcoming and I needed to regain an understanding of my actual subscriber base. The counts you now see on this site use this new service; they show the continuation of this site’s historical growth trend.
It is not surprising that in the prior discussions Google figures prominently. It is the largest provider of APIs and free services on the Web. But, even with its continuing services, I am seeing trends that disturb me in terms of what I thought the “compact” was with the company.
I’m not liking recent changes to Google’s bread and butter, search. While they are doing much to incorporate more structure in their results, which I applaud, they are also making ranking, formatting and presentation changes I do not. I am now spending at least us much of my search time on DuckDuckGo, and have been mightily impressed with its cleanliness, quality and lack of ads in results.
I also do not like how all of my current service uses of Google are now being funneled into Google Plus. I am seeing an arrogance that Google knows what is best and wants to direct me to workflows and uses, reminiscent of the arrogance Microsoft came to assume at the height of its market share. How does that variant of Lord Acton’s dictum go? “Market share tends to corrupt, and absolute market share corrupts absolutely.”
We are seeing Google’s shift to monetize extremely popular APIs such as Maps and Translate. My company, Structured Dynamics, has utilized these services heavily for client work in the past. We now must find alternatives or cost the payment for these services into the ongoing economics of our customer installations. Of course, charging for these services is Google’s right, but it does change the equation and causes us to evaluate alternatives.
I fear that Google may be turning into a frumious Bandersnatch. I’m not sure we will shun it, but we certainly are changing our views of the basis by which we engage or not with the company and its services. Once we shift from a basis of free, our expectations as to permanence and support change as well.
This is not a diatribe against Google nor a woe is us. Us big kids have come to know that there is no such thing as a free lunch. But that message is getting reaffirmed now more strongly in the Web context.
There can be benefits from seeking, installing or adapting to new alternatives with different service profiles when dependent services are abandoned or deprecated. Learning always takes place. Accepting one’s own responsibility for desired services also leads to control and tailoring for specific needs. Early use of free services also educates about what is desired or not, which can lead to better implementation choices if and when direct responsibility is assumed.
But, in some areas, we are seeing services or uses of the Web that we should adopt only with care or even shun. Business opportunities that depend on third-party services or APIs are very risky. Strong reliance on single-provider service ecosystems adds fragility to dependence. Own systems should be designed to not depend too strongly on specific API providers and their unique features or parameters.
Free is not forever, and it is conditional. Substitutability is a good design practice to embrace.
Overcoming the Limitations of WordPress SearchSince the inception of this AI3 blog a bit over six years ago, I have gone through five different approaches to local site search, all geared to overcome the limitations of WordPress‘ native search function. The current and last iteration uses the Relevanssi plug-in, the best I have used so far. (Check it out yourself in the search box to the upper right.) I describe these five iterations in this post.
When first released, AI3 used the native search that comes with the WordPress installation (when first installed that was WP version 1.5; the current version is at 3.2.1). That was OK when few knew of my site and the number of visitors was low.
But the WP search is known to suck, mostly because of search results based on date posted not relevance and its slow performance. Once I began to get more traffic, it was time for a change.
The option I have kept longest on this site is Google’s Custom Search. When first announced at the end of 2006 it was a real godsend and very innovative. I installed my first version in January 2007 and continued to make modifications and use it up through April 2010. I used it on various sites with many different types of Custom Search implementations.
Unfortunately, to use the free version it is necessary to include ads that Google provides. For a while, this served my purposes, since I was actively trying to learn whether ad revenues were viable for a standard blog and what kinds of traffic are necessary to produce meaningful revenues. However, by early 2010 I had come to the conclusion that — even with a quite popular blog for its niche — that ad revenues would never be that meaningful and it was not worth cluttering up my site. So I ended my experiment with Google ads and, being cheap, chose not to use the paid version of the search service and thus dropped the system.
What I liked:
What I did not like:
Microsoft’s Bing was starting to come on strongly at that time so I decided next to try the Bing Site Owner’s service. I began this new approach immediately upon retiring Google.
What I liked:
However, without direct notice, Microsoft ended this service as of April of this year.
What I did not like:
I was pretty pleased with the Bing service and would likely have continued using it because it wasn’t broke. But, the sudden plug-pulling was offputting.
Thus, I decided, heck, if I was going to have to go through the effort of learning the new Bing API, I might as well learn to do it all myself.
So, it was back to researching options and WP plug-ins on the Web. After assembling the options, I first chose to go with WPSearch 2. The thing that most initially attracted me to this option was its reliance on the Lucene open source search engine, the same option that my company Structured Dynamics uses in its Solr text indexing for the Open Semantic Framework (OSF).
Since my AI3 blog theme is of my own design with many changes over the years, I had lost its original capabilities in having a native search form and search results page. So, my first task after installing the WPSearch plugin and indexing my content was to add these pages to my theme. The WP Codex has an OK set of instructions on creating a search page and related discussion.
There are some valuable tutorials out there that explain how this is done; I refer to them rather than repeat such information here.
I completed this work and kept WPSearch 2 up and active on my site for roughly the past week. But, I also kept trying to achieve some of the aspects I wanted in formatting and organizing search results and became increasingly frustrated. I also experienced numerous freezes and white screens and fatal PHP errors while editing new pages or deleting comment spam that told me I simply had to abandon this option.
In summary, what I liked:
What I did not like:
I’m sorry that I needed to abandon this option, since I do view highly the underlying Lucene text engine. But, the integration with existing WP functionality and other modules was not fully baked. I think with more work, including exposing more of the Lucene search API functionality, that this option could redeem itself. But, as of today, it is not reliable enough for my site.
In trying to find hacks and workarounds to some of the desires and issues noted above, I had come across reference to the Relevanssi plug-in, which appeared to embrace much of what I was looking to achieve. The download is quite small (100 K) and must therefore use the native WP MySQL for the index, but it is feature rich and has a strong relevance-ranking and with ranking flexibility. There is great flexibility and configurability in how search results get presented, also an attraction.
Installation of this system and then indexing was very clean and straightforward. It has a syntax that readily supports the Boolean AND operator (the default behavior I have set for the site) (if the AND search finds no matches, it will automatically do an OR search) and phrase searching, with the prior links showing examples from this blog (also see the search form at upper right).
As implemented, then, here is the listing of major features in Relevanssi:
Here is a screen capture of the complete configuration menu in WordPress for Relevanssi:
For further information, you may also want to see some more advanced search functions and the Relevanssi knowledge base.

As of yesterday, the readership on this AI3 blog passed 3000 daily for the first time. It has been steadily inching upward, and finally passed that minor milestone. Thank you!
I’ve been writing this blog for five years now, with some 400 total posts, or about 1.5 blog posts per week. I know my style is toward longer articles and less frequent posting, most often of a fairly detailed or technical nature. And, while I have a Twitter account, I do not bleat. My style is for more meaty discussions. Perhaps it belies my age.

The real growth in this blog, however, has come about with my conscious attempt to write for the enterprise audience. RDF, the semantic enterprise, linked data and ontologies need a bridge from the technical community to the one of practitioners. Much progress and uptake has been occurring with these business and government audiences.
At the recent SemTech meeting, I was taken aside by many individuals noting my blog posts and thanking me for the thought and effort behind them. Thank you for noticing, and reading, and you are welcome. We need more translation of semantic topics and technologies to pragmatic terms.
If you have been following the standard W3C and SemWeb mailing lists recently, you will have noticed an anxiety and a continuation of the fractious nature of this “community”. In part this comes about because there are efforts afoot to revisit the RDF specs. But, mostly, I think, it is the ongoing nature of many in this group to snatch defeat from the jaws of victory. The search by some for perfection and insistence on parochial needs and preferences can give a pettiness to this “community” that is unbecoming.
Many of us have abandoned those forums for those reasons. As for myself, I will continue to evangelize to the buying market and keep the gaze pointing outward. There is a wealth of need for tools, techniques, methods, documentation, structures, and narratives. Thanks to all of you, the readership of this blog, for continuing to affirm this value.
So, in the great scheme of things, the readership of this blog is quite small in comparison with the big boys. On the other hand, very few individuals have higher numbers, and all of this for a fairly esoteric area. I think this proves there is a market and a need out there for semantic solutions.
Thanks again! And, for those in the United States, have a most enjoyable 4th of July holiday!

OK. After an experiment of more than three years, I have just now canceled my Google AdSense participation. (Which, Google, by the way, makes almost impossible to do: Finding the cancel link is hard enough; but who remembers the day they first signed up for ads and how many impressions they got that day? Both are required to get a cancellation request approved. Give me a break. It is worse than banks claiming small digits from bank interest for their own income!!)
Despite my sub-title, I never did expect to make much (or, really, any) money from Google ads. When I first signed up for it in Dec 2006, I stated I was doing it to find out how this ad-based business really works.
Well, from my standpoint, it does not work well; actually, not well at all.
Over the years I have seen visits on this site climb to nearly 3 K per day, and other nice growth factors. Perhaps if I were really focused on ad revenue I would have rotated stuff, tried alternative placements, yada yada. But, mostly, I was just trying to see who made out in this ad game.
It is certainly not the standard blog. I think my stats put me somewhere in the top 1% of all sites visited, but even that is not enough to even pay my monthly server charges (now higher with Amazon EC2).
Yet, in recent months, I have noticed some vendors have specifically targeted advertising on my blog and there also has been an increase in full banner ads (away from the standard, unobtrusive link Google ads of years past). Maybe they know something I don’t and they are winning, but my monthly ad income has dropped or remained flat.
And, then, I began to get full panel flashing ads on my site that just screamed Hit me! Hit me!. WTF. It was the last straw. Where did the unobtrusive link stuff go? Screw it; I can afford to pay my own monthly chump change.
This is probably not the time or place to discuss business models on the Web, but the woeful state of ad-based revenue is apparent. My goodness, I’m getting tired of ReadWriteWeb, as an example and one of the biggest at that, shilling with repeats and big ads with stories for their prominent advertisers each weekend. And, they are one of the only few ad winners!
My honest guess is that fewer than 1/10 of 1% of Web sites with advertising make enough to cover their bandwidth and server costs. How do you spell s-m-a-r-t?
So, the experiment is over. I will now think a bit about how I can reclaim that valuable Web page space from my former charitable contribution to the Google cafeteria. Bring on the sushi!