Evolution
AI³
Adaptive Information
Adaptive Innovation
Adaptive Infrastructure
a·dap·tive adj. Showing or having a capacity to make fit for new or special situations; flexible; a successful adjustment.

Blogasbörd (cloud version):
Send Email   Get SIOC Profile   Get FOAF Profile   Syndicate full contents for this site using RSS 20
Main Links
Categories
Calendar
February 2012
S M T W T F S
« Jan    
 1234
567891011
12131415161718
19202122232425
26272829  
Archives
More . . .  
Credits
Blog software courtesy of WordPress

View Mike's profile on LinkedIn
Search
Date:   December 12, 2011

State of SemWeb Tools - 2011Number of Semantic Web Tools Passes 1000 for First Time; Many Other Changes

We have been maintaining Sweet Tools, AI3‘s listing of semantic Web and -related tools, for a bit over five years now. Though we had switched to a structWSF-based framework that allows us to update it on a more regular, incremental schedule [1], like all databases, the listing needs to be reviewed and cleaned up on a periodic basis. We have just completed the most recent cleaning and update. We are also now committing to do so on an annual basis.

Thus, this is the inaugural ‘State of Tooling for Semantic Technologies‘ report, and, boy, is it a humdinger. There have been more changes — and more important changes — in this past year than in all four previous years combined. I think it fair to say that semantic technology tooling is now reaching a mature state, the trends of which likely point to future changes as well.

In this past year more tools have been added, more tools have been dropped (or abandoned), and more tools have taken on a professional, sophisticated nature. Further, for the first time, the number of semantic technology and -related tools has passed 1000. This is remarkable, given that more tools have been abandoned or retired than ever before.

Click here to browse the Sweet Tools listing. There is also a simple listing of URL links and categories only.

We first present our key findings and then overall statistics. We conclude with a discussion of observed trends and implications for the near term.

Key Findings

Some of the key findings from the 2011 State of Tooling for Semantic Technologies are:

  • As of the date of this article, there are 1010 tools in the Sweet Tools listing, the first it has passed 1000 total tools
  • A total of 158 new tools have been added to the listing in the last six months, an increase of 17%
  • 75 tools have been abandoned or retired, the most removed at any period over the past five years
  • A further 6%, or 55 tools, have been updated since the last listing
  • Though open source has always been an important component of the listing, it now constitutes more than 80% of all listings; with dual licenses, open source availability is about 83%. Online systems contribute another 9%
  • Key application areas for growth have been in SPARQL, ontology-related areas and linked data
  • Java continues to dominate as the most important language.

Many of these points are elaborated below.

The Statistical Picture

The updated Sweet Tools listing now includes nearly 50 different tools categories. The most prevalent categories, each with over 6% of the total, are information extraction, general RDF tools, ontology tools, browser tools (RDF, OWL), and parsers or converters. The relative share by category is shown in this diagram (click to expand):

Sweet Tools Applications

Since the last listing, the fastest growing categories have been SPARQL, linked data, knowledge bases and all things related to ontologies. The relative changes by tools category are shown in this figure:

Last Year Sweet Tools Applications

Though it is true that some of this growth is the result of discovery, based on our own tool needs and investigations, we have also been monitoring this space for some time and serendipity is not a compelling explanation alone. Rather, I think that we are seeing both an increase in practical tools (such as for querying), plus the trends of linked data growth matched with greater sophistication in areas such as ontologies and the OWL language.

The languages these tools are written in have also been pretty constant over the past couple of years, with Java remaining dominant. Java has represented half of all tools in this space, which continues with the most recent tools as well (see below). More than a dozen programming or scripting languages have at least some share of the semantic tooling space (click to expand):

Sweet Tools Languages

With only 160 new tools it is hard to draw firm trends, but it does appear that some languages (Haskell, XSLT) have fallen out of favor, while popularity has grown for Flash/Flex (from a small base), Python and Prolog (with the growth of logic tools):

Last Year Change in Sweet Tools Languages

PHP will likely continue to see some emphasis because of relations to many content management systems (WordPress, Drupal, etc.), though both Python and Ruby seem to be taking some market share in that area.

New Tools

The newest tools added to the listing show somewhat similar trends. Again, Java is the dominant language, but with much increased use of JavaScript and Python and Prolog:

Sweet Tools Languages

The higher incidence of Prolog is likely due to the parallel increase in reasoners and inference engines associated with ontology (OWL) tools.

The increase in comprehensive tool suites and use of Eclipse as a development environment would appear to secure Java’s dominance for some time to come.

Trends and Observations

These dry statistics tend to mask the feel one gets when looking at most of the individual tools across the board. Older academic and government-funded project tools are finally getting cleaned out and abandoned. Those tools that remain have tended to get some version upgrades and improved Web sites to accompany them.

The general feel one gets with regard to semantic technology tooling at the close of 2011 has these noticeable trends:

  • A three-tiered environment – the tools seem to segregate into: 1) a bottom tier of tools (largely) developed by individuals or small groups, now most often found on Google Code or Github; 2) a middle-tier of (largely) government-funded projects, sometimes with multiple developers, often older, but with no apparent driving force for ongoing improvements or commercialization; and 3) a top-tier of more professional and (often) commercially-oriented tools. The latter category is the most noticeable with respect to growth and impact
  • Professionalism – the tools in the apparent top tiers feel to have more professionalism and better (and more attractive) packaging. This professionalism is especially true for the frameworks and composite applications. But, it also applies to many of the EU-funded projects from Europe, which has always been a huge source of new tool developments
  • More complete toolsets – similarly, the upper levels of tools are oriented to pragmatic problems and problem-solving, which often means they embody multiple functions and more complete tooling environments. This category actually appears to be the most visible one exhibiting growth
  • Changing nature of academic releases – yet, even the academic releases seem to be increasing in professionalism and completeness. Though in the lowest tier it is still possible to see cursory or experimental tool releases, newer academic releases (often) seem to be more strategically oriented and parts of broader programmatic emphases. Programs like AKSW from the University of Leipzig or the Freie Universität Berlin or Finland’s Semantic Computing Research Group (SeCo), among many others, tend to be exemplars of this trend
  • Rise of commercial interests and enterprise adoption – the growing maturity of semantic technologies is also drawing commercial interest, and the incubation of new start-ups by academic and research institutions acts to reinforce the above trends. Promising projects and tools are now much more likely to be spun off as potential ventures, with accompanying better packaging, documentation and business models
  • Multiple languages and applications – with this growing complexity and sophistication has also come more complicated apps, combining multiple languages and functions. In fact, for some time the Sweet Tools listing has been justifiably criticized by some as overly “simplifying” the space by classifying tools under (largely) single applications or single languages. By the 2012 survey, it will likely be necessary to better classify the tools using multiple assignments
  • Google code over SourceForge for open source (and an increase in Github, as well) – virtually all projects on SourceForge now feel abandoned or less active. The largest source of open source projects in the semantic technology space is now clearly Google Code. Though of a smaller footprint today, we are also seeing many of the newer open source projects also gravitate to Github. Open source hosting environments are clearly in flux.

I have said this before, and been wrong about it before, but it is hard to see the tooling growth curve continue at its current slope into the future. I think we will see many individual tools spring up on the open source hosting sites like Google and Github, perhaps at relatively the same steady release rate. But, old projects I think will increasingly be abandoned and older projects will not tend to remain available for as long a time. While a relatively few established open source standards, like Solr and Jena, will be the exception, I think we will see shorter shelf lives for most open source tools moving forward. This will lead to a younger tools base than was the case five or more years ago.

I also think we will continue to see the dominance of open source. Proprietary software has increasingly been challenged in the enterprise space. And, especially in semantic technologies, we tend to see many open source tools that are as capable as proprietary ones, and generally more dynamic as well. The emphasis on open data in this environment also tends to favor open source.

Yet, despite the professionalism, sophistication and complexity trends, I do not yet see massive consolidation in the semantic technology space. While we are seeing a rapid maturation of tooling, I don’t think we have yet seen a similar maturation in revenue and business models. While notable semantic technology start-ups like Powerset and Siri have been acquired and are clear successes, these wins still remain much in the minority.


[1] Please use the comments section of this post for suggesting new or overlooked tools. We will incrementally add them to the Sweet Tools listing. Also, please see the About tab of the Sweet Tools results listing for prior releases and statistics.

Posted by AI3's author, Mike Bergman

Posted on December 12, 2011 at 8:29 am in Open Source, Semantic Web Tools, Structured Web | Comments (5)
The URI link reference to this post is: http://www.mkbergman.com/991/the-state-of-tooling-for-semantic-technologies/
The URI to trackback this post is: http://www.mkbergman.com/991/the-state-of-tooling-for-semantic-technologies/trackback/
Date:   October 17, 2011

Structured DynamicsToday’s Post is a Testimony to the Value of Vacations

My partner, Fred Giasson, today posted the second part of his series on open source. Since returning from a well-earned vacation a few weeks back — after more than three years without a break — Fred has been writing and developing up a storm. As someone said to me last week, “Fred’s on fire!” I could not agree more.

I think Fred’s post speaks for itself as to why and how Structured Dynamics has made a conscious choice to embrace open source. The major reason he puts forth — to bootstrap the company without the need for external investment — is unusual in itself. But one thing he is silent about is why this is a compelling reason. I’ll comment on that.

Fred and I have both worked for others dependent on their capital for our ventures (a few more times in my case). Capital is great for expansion and operations, but it can be deadly when visions requiring patience are in play. Structured Dynamics is only now a bit more than halfway through its five-year plan. While semantics technologies are exciting with a world of upside potential, they have also been incubated in academic labs with (as yet) a general lack of practical deployment. The promise is there, but often the delivery and maturation have been lacking. We are committed to play a visible role in correcting that.

The approach Fred outlines was not perhaps easily available to new startups a decade ago. But now, with open source and the Internet, costs of entry and ongoing development have dropped markedly. Yet, surprisingly, the idea of financing a startup via revenues is still not talked about sufficiently — let alone often used as an actual basis for building a company.

I’ve been fortunate to be able to partner with a young, world-class technologist whose maturity exceeds that of individuals many years his senior. He understands that in order to achieve important visions that the stewardship of those ideas can not be left to venture capitalists committed solely or mostly to gaming terms or near-term returns. We’re placing our bets on the paying customer and our own judgment.

So, it is great to see Fred continue his phenomenal development productivity since he returned from Hawaii. The benefit of his vacation is that we are also now getting his insights on his blog again.

Posted by AI3's author, Mike Bergman

Posted on October 17, 2011 at 10:19 pm in Open Semantic Framework, Open Source, Software Development, Structured Dynamics | Comments (0)
The URI link reference to this post is: http://www.mkbergman.com/981/freds-hair-is-on-fire/
The URI to trackback this post is: http://www.mkbergman.com/981/freds-hair-is-on-fire/trackback/
Date:   August 15, 2011

World's Tallest Flagpole; see ref [9]The New Paradigm of ‘Substantive Marketing’ for Innovative IT

This decade has clearly marked a sea change in the move of enterprise software from proprietary to open source, as I have recently discussed [1]. It is instructive that only a mere six years ago I was in heated fights with my then Board about open source; today, that seems so quaint and dated.

Also during this period many have noted how open source has changed the capital required to begin a new software startup [2]. Open source both provides the tooling and the components for cobbling together specialty apps and extensions. Six and seven and even eight figure startup costs common just a decade ago have now dropped to four or five figures. When we see the explosion of hundreds of thousands of smartphone apps we are seeing the glowing residue of these additional sea changes. Dropping startup costs by one to three orders of magnitude is truly democratizing innovation.

But something else has been going on that is changing the face of enterprise software (besides consolidation, another factor I also recently commented on). And that factor is “marketing”. Much less commentary is made about this change, but it, too, is greatly lowering costs and fundamentally changing market penetration strategies. That topic — and my personal experience with it — is the focus of this article.

The Obsolete Recent Past

Besides the few remaining big providers of enterprise software — like IBM, Oracle, HP, SAP — most vendors have totally remade their sales practices of just a few years ago. Large sales forces with big commissions and a year to two year sales cycles can no longer be justified when software license fees and the percentage maintenance annuities that flow from them are dropping rapidly. Today’s mantras are doing more with less and doing it faster, hardly consistent with the traditional enterprise software model. Sure, big enterprises, especially big government and big business, have large sunk costs in legacy systems that will continue to be milked by existing vendors. But the flow is constricting with longer-term trends clear to see. The old enterprise software model is obsolete.

Even if it were not dying, it is hard to square huge investments in sales and marketing when product development has become inexpensive and agile. The proliferation of three-letter marketing acronyms for branding “new” product areas and standard formulas for product hype of just a few years ago also feels old and dated. Cozy relationships with conventional trade press pundits and market analysts seem to be diminishing in importance, possibly because the authoritativeness of their influence is also diminishing. It is harder to justify market firm subscription costs when priority budget items are being cut and new information outlets have emerged.

In response to this, many developers have forsaken the enterprise market for the consumer one. Indeed enterprises themselves are looking more and more to the consumer sector and commodity apps for innovation and answers. But, still, problems unique to enterprises remain and how to effectively reach them in this brave new world is today’s marketing problem for enterprise software vendors.

Most entities today, when opining about these challenges, tend to emphasize the need for “laser focus” and “rifle-shot” targeting of prospects. The advice takes the form of: 1) emphasize well-defined verticals; 2) know your market well; and 3) target and go after your likely prospects. Prospect data mining and targeted ad analysis are the proferred elixirs.

But, there is little evidence such refined methods for prospect identification and targeting are really working. Like politicians doing focus groups and opinion polling to capture the desired “message” of their potential electorates, these are all still “push” models of marketing. Yet we are swamped with pushed messages and marketing everywhere we turn. The model is failing.

Besides message overload, there are two issues with laser targeting. First, despite all that we try to know about ready buyers (for enterprise software), we really don’t know if any particular individual is truly needful, in a position to buy, has the authority to buy, or is the right advocate to make the internal sell. Second, though the idea of “laser” carries with it the image of focus and not flailing, it is in fact expensive to identify the targets and send a focused message their way. Because of these issues, decay rates for laser prospects throughout conventional sales pipelines continue to rise.

A New Marketing ParadigmNew Paradigm Roadsign

There has always been the phenomenon of the “fish jumping into the boat“; that is, the unanticipated inbound inquiry from a previously unknown prospect leading to a surprisingly swift sale. But we have seen this phenomenon increase markedly in recent years. Structured Dynamics‘ current customer base — including recurring customers — comes almost exclusively from this source. As we have noted this trend in comparison with more targeted outreach, we have spent much time trying to understand why it is occurring and how we can leverage what Peter Drucker called the “unexpected success” [3].

What we are seeing, I believe, is a shift from sales to marketing, and within marketing from direct or outbound marketing to a new paradigm of marketing. Others have likened this to inbound marketing [4] or content marketing [5] or permission marketing [6]. What we are seeing at Structured Dynamics bears many resemblances to parts of what is claimed for these other approaches, but not all. And, it is also true that what we are seeing may pertain mostly to innovative IT for emerging enterprise markets, and not a generalized paradigm suitable to other products or markets.

For lack of a better term, what we are seeing we can term “substantive marketing”. By this we mean offering valuable content and solutions-oriented systems for free and without restriction. This shares aspects with content marketing. Then, in keeping with the trend for buyers doing their own research and analysis to fulfill their own needs, similar to the premises of inbound or permission marketing, potential consumers can make their own judgments as to relevance and value of our offerings.

Sometimes, of course, some prospects find our approaches and solutions lacking. Sometimes, they may grab what we have offered for free and use them on their own without compensation to us. But where the match is right — and we need to be honest with both ourselves and the customer when it is not — we can better spend the customer’s limited time and resources to tailor our generic solutions to their specific needs. In doing so, we offer higher value (tailored services) while learning better about another spectrum of consumer need that can virtuously enhance our substantive offerings for the next prospect.

So, let’s decompose these components further to see what they can tell us about this new practice of substantive marketing and how to use it as an engine for moving forward.

Substantive Marketing

The Virtuous Cycle Begins with Substantive Solutions

The premise of substantive marketing is to offer square-deal value to the marketplace in the form of solutions-based content. Like content marketing that offers “the creation or sharing of content for the purpose of engaging current and potential consumer bases” [5], substantive marketing goes even further. The whole basis and premise of the approach is to provide substantive content, in one of more of these areas, preferably all:

  • Knowledge — this substantive area includes papers, commentary, survey results or listings of tools and references useful to the target market
  • Analysis — this content area includes unique analysis of market trends, data, technologies or reviews that pertain to the target market
  • Code — this area relates to the provision of open source code and tools, preferably under licenses that allow users to use the software without restriction (two examples are the Apache 2 license and the MIT license)
  • Documentation — a critical substantive area is the documentation in how to install, use, modify or customize these tools, including a prejudice to APIs and tutorial information
  • Methodologies, workflows and best practices — it is important to also discuss how to properly operate and utilize these tools and information. Taking care to document lessons learned and best practices also helps the user community avoid common mistakes and to speed adoption and utility, and
  • Demos — this area involves setting up (and sharing code and procedures for same) demos that show how the code and its methods actually work. Demos also become first use cases to aid the new user in learning and setting up the code bases.

Further, this substantive content is offered without strings, restrictions or customer fill-in forms. The content is not a come on or a teaser. We are not trying to gather leads or prospect names, because we have no intent to dun them with emails or follow-ups.

This substantive content is as complete as can be to enable new users to adopt the information and tools in their current state without further assistance. (In some cases, the information also educates the marketplace in order to prepare future customers for adoption.) Most importantly, this substantive content is offered for free, either open source (for code) or creative commons for documentation and other content. In return, it is fair to request — and we do — attribution when this material is used.

We have previously termed this complete panoply of substantive content a total open solution [7]. Some might find the provision of such robust information crazy: How can we give away the store of our proprietary knowledge and systems? But we find this kind of thinking old school. In an open source world where so much information is now available online, with a bit of effort customers can find this information anyway. Rather, our mindset is that customers do not want to pay again for what has already been done, but are willing to pay for what can be done with that knowledge for their own specific problems. Offering the complete storehouse of our knowledge in fact signals our interest in only charging the customer for new answers, new value or new formulations. The customers we like to work with feel they are getting an honest, square deal.

Flagpole Venues Help Increase Awareness

Consider your substantive content to be your flag, a unique banner for conveying and packaging your specific brand. It is thus important to find appropriate flagpoles — in the virtual territories that your customers visit — for raising this content high for them to see. Since the role of these flagpoles is to create awareness in potential prospects — who you do not likely know individually or even by group in advance — it makes sense to raise your offerings up on many flagpoles and on the highest flagpoles. Visibility is the object of the approach.

This approach is distinctly not leafletting or cramming links or emails into as many spaces as possible. The idea of substantive marketing is to fly valuable content high enough that desirous potential customers can discover and then inspect the information on their own, and only if they so choose. In this regard, substantive marketing resembles permission marketing [6].

Being visible helps ensure that the needful, questing prospect that you would never have been able to target on your own is able to see and be aware of your offerings. And, since they are seeking information and answers, your collateral needs to be of a similar nature. Solutions and substance are what they are seeking; what you have run up the flagpole should respond to that.

The mindset here is to respect your prospective customers and to allow them to chose to receive and inspect your offerings, but only if they so choose. If flown in the right venues with the right visibility, customers will see your flags and inspect them if they meet their requirements.

Some of the venues at which you can raise your flags include:

  • Blogs — this venue is especially helpful, since you have complete control over content, message, voice and packaging
  • Social networks — the value of social networks is now accepted, and should be a core component of any visibility strategy. However, it is also important to make sure that your contributions are driven by substance and value and do not become part of the cacophonous background noise
  • Vertical media — there are always existing outlets well-read and -respected by your customer propects. Establishing relationships and value with these third-party outlets can extend your reach
  • Web sites — this venue includes your standard Web sites, of course. But, you should also consider setting up specific project-related sites or sites dedicated to documentation (c.f., our TechWiki site of 300+ technical articles) or to methodologies (the excellent MIKE2.0 site is one great example) or to other ways by which particular content (such as tools with the Sweet Tools site) can raise another flag
  • User forums — user discussion groups and forums also become their own attractants for like-interested prospects, and
  • Conferences and tradeshows — while potentially valuable, presence at conferences and tradeshows must be carefully evaluated. Since participation and opportunity costs are high, the venues should be clearly relevant to your market space with likely decision makers in attendance.

The observant reader will have already concluded that each of these venues develops slowly, and therefore raising visibility is generally a slow-and-steady game that requires patience. Start-up vendors backed by venture firms or those looking for quick visibility and cashout will not find this approach suitable. On the other hand, customer prospects looking for answers and self-sustaining solutions are not much interested in flash in the pan vendors, either.

A Model Responsive to the Changing Nature of Customer Prospects

The real drivers for this changing paradigm come from customer prospects. Sophisticated buyers of enterprise IT and instrumental change agents within organizations share most if not all of these characteristics:

  • They are inundated with marketing messages and jaded about hype and “pushed” messages
  • They are generally knowledgeable about their needs and problem spaces and about approximate technologies. They are eager and desirous of learning independently and know that their recommendations affect their personal reputations and standing within their enterprises
  • With the many volatile external and internal changes, including staff reductions and fluid assignments, leadership for new technology adoption can come from many different and unknown corners of the organization; it is extremely difficult to identify and target prospects
  • The economic and competitive environment places a premium on affordability and low-risk evaluations of new technologies
  • Lock-ins of any kind — be it to specific vendors or technologies — are understood as inherently risky. This understanding is raising the importance of open and standards-based approaches
  • Being the subject of a pushy sales effort is distasteful and a negative to an eventual sale. Education and learning, however, is respected
  • Because of all that is at stake, honesty with no bullshit is highly appreciated. If you as a vendor do not offer an appropriate solution or have fulfillment weaknesses, tell the prospect so. Further, tell them who can supply the solution. One never knows when and where the next problem may arise, and providing trustworthy advice can lead to later engagements.

More often than not we find our customers to have already installed and used our existing substantive materials for some time before they approach us about further work. They appreciate the tutorial information and have taught themselves much in advance. By the time we engage, both parties are able to cost-effectively focus on what is truly missing and needed and to deliver those answers in a quick way. Re-engagements tend to occur when a next set of gaps or challenges arise.

Though it may sound trite or even unbelievable to those who have not yet experienced such a relationship, the square deal value offered by substantive marketing can really lead to true partnerships and trust between vendor and customer. We experience it daily with our customers, and vice versa. We also think this is the adaptive approach that our new environment demands.

The Free Path to Open Source and Solutions

Once prospects learn of our substantive offerings, many may decide independently that what we have is not suitable. Others may simply download and use the information on their own, for which we often never know let alone receive revenue. We are completely fine with this, as shown for three different cases.

First, some of these prospects need no more than what we already have. This increases our user base, increases our visibility and often results in contributions to our forums and documentation.

Then, some of these prospects come to learn they need or want more than what our current offerings provide, leading to two possible forks. In one fork, the second case, they may have sufficient skills internally or with other suppliers to extend the system on their own. Some of this flows back to an improved code base or improved installation or documentation bases.

In the other fork, the third case, they may decide to engage us in tailoring a solution for them. That case is the only one of the three that leads to a direct revenue path.

In all three cases we win, and the customer wins. Maybe enterprise software vendors of decades past rue this reality of lower margins and shared benefits; we agree that the absolute profit potential of substantive marketing is much less. But we gladly accept the more enjoyable work and steady revenue relationships resulting from these changes. We are not engaged in some pollyann-ish altruism here, but in a steely-eyed honest brokering that best serves our own self-interest (and fairly that of the customer, as well).

A Square Deal Baseline for Tailored Services

Great IT product does not come from idle musings or dreamed up functionality. It comes solely and directly from solving customer problems. Only via customers can software be refined and made more broadly usable.

A slipstream of those who have previously become aware and tested our offerings will choose to engage our services. This generally takes the form of an inbound call, where the prospect not only qualifies itself, but also establishes the terms and conditions for the sale. They have chosen to select us; they are fish that have jumped into the boat.

To again quote Peter Drucker, “. . . the aim of marketing is to make selling superfluous. The aim of marketing is to know and understand the customer so well that the product or service fits him and sells itself. Ideally, marketing should result in a customer who is ready to buy. All that should be needed then is to make the product or service available . . .” [8]. This is precisely what I meant earlier about the shift in emphasis from sales to marketing.

Even at this point there may be mismatches in needs and our skills and availabilities. If such is the case, we do not hesitate to say so, and attempt to point the prospect in another direction (from which we also gain invaluable market knowledge). If there is indeed a match, we then proceed to try to find common ground on schedule and budget.

Paradoxically, this square deal and honesty about the readiness and weaknesses of our offerings often leads to forgiveness from our customers. For example, for some time we have lacked automated installation scripts that would make it easier for prospects to install our open semantic framework. But, because of compensating value in other areas, such gaps can be overlooked and tackled later on (indeed, as a current customer is now funding). By not pretending to be everything to everyone, we can offer what we do have without embarrassment and get on with the job of solving problems.

For larger potential engagements, we typically suggest a fixed price initial effort to develop an implementation plan. The interviews and research to support this typical 4- to 6-weeks effort (generally in the $5 K to $10 K range, depending) then result in a detailed fulfillment proposal, with firm tasks, budget and schedule, specific to that customer’s requirements. Just as we respect our prospects’ time and budget, we expect the same and do not conduct these detailed plans without compensation. With respect to fulfillment contracts, we cap contract amount and limit milestone payments to pre-set percentages or time expended, whichever is lower.

This approach ensures we understand the customer’s needs and have budgeted and tasked accordingly. Capped contracts also put the onus on us the contractor to understand our own effort and tasking structures and realities, which leads to better future estimating. For the customer, this approach caps risk and potential exposure, and ensures milestones are being met no matter the time expenditures by us, the contractor. This approach extends our square-deal basis to also embrace risks and payments.

New (and Open Source) Developments Fuel the Substance Pipeline

Thus, when customers engage us, they spend almost solely on new functionality specifically tailored to their needs. In doing so, we suggest they agree to release the new developments they fund as open source. We argue — and customers predominantly agree — that they are already benefitting from lower overall costs because other customers have funded sharable, open source before them. We point out that the new customers that follow them will also be independently creating new functionality, to which they will also later benefit.

(This argument does not apply to specific customer data or ontologies, which are naturally proprietary to the customer. Also, if the customer wants to retain intellectual ownership of extensions, we charge higher development fees.)

Once these new developments are completed, they are fed back into a new baseline of valuable content and code. From this new baseline the cycle of substantive marketing can be augmented anew and perpetuated.

Three Guidelines to Leverage Substantive Marketing

All of these points can really be boiled down to three guidelines for how to make substantive marketing effective:

  • First, whatever your domain or market, provide useful and substantive content. The content you offer is indeed your marketing collateral. Prospective customers can gauge from it directly whether it meets their needs, appears sound and workable, and has value. If you have little of substance to offer, this paradigm is not for you
  • Second, plant many flagpoles and raise your flags high in territories your market prospects are likely to visit. This is a process that requires thoughtfulness and patience. Thoughtfulness, because that is how you determine where to plant your flags. If you yourself are a consumer of what you offer, it is easier to find those venues. And patience, because it takes time to stack valuable content upon valuable content in order to raise visibility
  • And, third, be honest and respectful. Help your prospect work within available budget to achieve the most possible at lowest risk. And help them find others, if need be, who might be better able than you to truly solve their problems.

What we are finding — as we continue to refine our understanding of this new paradigm — is that through substantive marketing the fish are finding us and they sometimes jump into the boat. We like our enterprise customers to pre-qualify themselves and already be “sold” once they knock on the door. One never knows when that phone might ring or the email might come in. But when it does, it often results in a collaborative customer as a partner who is a joy to work with to solve exciting new problems.


[1] M.K. Bergman, 2011. “Declining IT Innovation in the Enterprise,” in AI3:::Adaptive Innovation blog, January 17, 2011. See http://www.mkbergman.com/940/declining-it-innovation-in-the-enterprise/.
[2] Paul Graham has been the most prominent observer of this scene; see P. Graham, 2008. “Why There Aren’t Any More Googles,” April 2008 (see http://www.paulgraham.com/googles.html) and subsequent articles.
[3] See esp. Peter F. Drucker, 1985. Innovation and Entrepreneurialship: Practice and Principals, Harper & Row, New York, NY, 277 pp.
[4] Inbound marketing is a marketing strategy that focuses on getting found by customers. According to David Meerman Scott, inbound marketers “earn their way in” (via publishing helpful information on a blog etc.) in contrast to outbound marketing where they used to have to “buy, beg, or bug their way in” (via paid advertisements, issuing press releases in the hope they get picked up by the trade press, or paying commissioned sales people, respectively). Brian Halligan, cofounder and CEO of HubSpot, claims he first coined the term of inbound marketing.
[5] Content marketing is an umbrella term encompassing all marketing formats that involve the creation or sharing of content for the purpose of engaging current and potential consumer bases. In contrast to traditional marketing methods that aim to increase sales or awareness through interruption techniques, content marketing subscribes to the notion that delivering high-quality, relevant and valuable information to prospects and customers drives profitable consumer action. See also Holger Shulze, 2011. B2B Content Marketing Trends slideshow, see http://www.slideshare.net/hschulze/b2b-content-marketing-report.
[6] Seth Godin coined the term permission marketing wherein marketers obtain permission before advancing to the next step in the purchasing process. It is mostly used by online marketers, notably email marketers and search marketers, as well as certain direct marketers who send a catalog in response to a request. Godin contrasts this approach to traditional “interruption marketing” where messages are sent without prior permission.
[7] See the three-part series, M.K. Bergman, 2010. “Listening to the Enterprise: Total Open Solutions,” “Part 1,” “Part 2” and “Part 3,” AI3:::Adaptive Information blog, May 12 – 31, 2010.
[8] Peter F. Drucker, 1974. Management: Tasks, Responsibilities, Practices. New York, NY: Harper & Row. pp. 864. ISBN 0-06-011092-9.
[9] The intro photo is of the world’s tallest flagpole (at 165 m), in Dushanbe, Tajikistan. The photo is courtesy of CentralAsiaOnline.com.

Posted by AI3's author, Mike Bergman

Posted on August 15, 2011 at 2:25 am in Open Source, Software and Venture Capital, Structured Dynamics | Comments (2)
The URI link reference to this post is: http://www.mkbergman.com/969/of-flagpoles-and-fishes/
The URI to trackback this post is: http://www.mkbergman.com/969/of-flagpoles-and-fishes/trackback/
Date:   August 8, 2011

Geshi NetworkVisualization + Analysis Pushes Aside Cytoscape

Though I never intended it, some posts of mine from a few years back dealing with 26 tools for large-scale graph visualization have been some of the most popular on this site. Indeed, my recommendation for Cytoscape for viewing large-scale graphs ranks within the top 5 posts all time on this site.

When that analysis was done in January 2008 my company was in the midst of needing to process the large UMBEL vocabulary, which now consists of 28,000 concepts. Like anything else, need drives research and demand, and after reviewing many graphing programs, we chose Cytoscape, then provided some ongoing guidelines in its use for semantic Web purposes. We have continued to use it productively in the intervening years.

Like for any tool, one reviews and picks the best at the time of need. Most recently, however, with growing customer usage of large ontologies and the development of our own structOntology editing and managing framework, we have begun to butt up against the limitations of large-scale graph and network analysis. With this post, we announce our new favorite tool for semantic Web network and graph analysis — Gephi — and explain its use and showcase a current example.

The Cytoscape Baseline and Limitations

Three and one-half years ago when I first wrote about Cytoscape, it was at version 2.5. Today, it is at version 2.8, and many aspects have seen improvement (including its Web site). However, in other respects, development has slowed. For example, version 3.x was first discussed more than three years ago; it is still not available today.

Though the system is open source, Cytoscape has also largely been developed with external grant funds. Like other similarly funded projects, once and when grant funds slow, development slows as well. While there has clearly been an active community behind Cytoscape, it is beginning to feel tired and a bit long in the tooth. From a semantic Web standpoint, some of the limitations of the current Cytoscape include:

  • Difficult conversion of existing ontologies — Cytoscape requires creating a CSV input; there was an earlier RDFscape plug-in that held great promise to bridge the software into the RDF and semantic Web sphere, but it has not remained active
  • Network analysis — one of the early and valuable generalized network analysis plug-ins was NetworkAnalyzer; however, that component has not seen active development in three years, and dynamic new generalized modules suitable for social network analysis (SNA) and small-world networks have not been apparent
  • Slow performance and too-frequent crashes — Cytoscape has always had a quirky interface and frequent crashes; later versions are a bit more stable, but usability remains a challenge
  • Largely supported by the biomedical community — from the beginning, Cytoscape was a project of the biomedical community. Most plug-ins still pertain to that space. Because of support for OBO (Open Biomedical and Biological Ontologies) formats and a lack of uptake by the broader semantic Web community, RDF- and OWL-based development has been keenly lacking
  • Aside from PDFs, poor ability to output large graphs in a viewable manner
  • Limited layout support — and poor performance for many of those included with the standard package.

Undoubtedly, were we doing semantic technologies in the biomedical space, we might well develop our own plug-ins and contribute to the Cytoscape project to help overcome some of these limitations. But, because I am a tools geek (see my Sweet Tools listing with nearly 1000 semantic Web and -related tools), I decided to check out the current state of large-scale visualization tools and see if any had made progress on some of our outstanding objectives.

Choosing Geshi and Using It

There are three classes of graph tools in the semantic technology space:

  1. Ontology navigation and discovery, to which the Relation Browser and RelFinder are notable examples
  2. Ontology structure visualization (and sometimes editing), such as the GraphViz (OWLViz) or OntoGraf tools used in Protégé (or the nice FlexViz, again used by the OBO community), and
  3. Large-scale graph visualization in order to gain a complete picture and macro relationships in the ontology.

One could argue that the first two categories have received the most current development attention. But, I would also argue that the third class is one of the most critical:  to understand where one is in a large knowledge space, much better larger-scale visualization and navigation tools are needed. Unfortunately, this third category is also the one that appears to be receiving the least development attention. (To be sure, large-scale graphs pose computational and performance challenges.)

In the nearly four years since my last major survey of 26 tools in this category, the new entrants appear quite limited. I’ve surely overlooked some, but the most notable are Gruff, NAViGaTOR, NetworkX and Gephi [1]. Gruff actually appears to belong most in Category #2; I could find no examples of graphs on the scale of thousands of nodes. NAViGaTOR is biomedical only. NetworkX has no direct semantic graph importing and — while apparently some RDF libraries can be used for manipulating imports — alternative workflows were too complex for me to tackle for initial evaluation. This leaves Gephi as the only potential new candidate.

From a clean Web site to well-designed intro tutorials, first impressions of Gephi are strongly positive. The real proof, of course, was getting it to perform against my real use case tests. For that, I used a “big” ontology for a current client that captures about 3000 different concepts and their relationships and more than 100 properties. What I recount here — from first installing the program and plug-ins and then setting up, analyzing, defining display parameters, and then publishing the results — took me less than a day from a totally cold start. The Gephi program and environment is surprisingly easy to learn, aided by some great tutorials and online info (see concluding section).

The critical enabler for being able to use Gephi for this source and for my purposes is the SemanticWebImport plug-in, recently developed by Fabien Gandon and his team at Inria as part of the Edelweiss project [2]. Once the plug-in is installed, you need only open up the SemanticWebImport tab, give it the URL of your source ontology, and pick the Start button (middle panel):

SemWeb Plug-in for GephiNote the SemanticWebImport tool also has the ability (middle panel) to issue queries to a SPARQL endpoint, the results of which return a results graph (partial) from the source ontology. (This feature is not further discussed herein.) This ontology load and display capability worked without error for the five or six OWL 2 ontologies I initially tested against the system. 

Once loaded, an ontology (graph) can be manipulated with a conventional IDE-like interface of tabs and views. In the right-hand panels above we are selecting various network analysis routines to run, in this case Average Degrees. Once one or more of these analysis options is run, we can use the results to then cluster or visualize the graph; the upper left panel shows highlighting the Modularity Class, which is how I did the community (clustering) analysis of our big test ontology. (When run you can also assign different colors to the cluster families.) I also did some filtering of extraneous nodes and properties at this stage and also instructed the system via the ranking analysis to show nodes with more link connections as larger than those nodes with fewer links.

At this juncture, you can also set the scale for varying such display options as linear or some power function. You can also select different graph layout options (lower left panel). There are many layout plug-in options for Gephi. The layout plugin called OpenOrd, for instance, is reported to be able to scale to millions of nodes.

At this point I played extensively with the combination of filters, analysis, clusters, partitions and rankings (as may be separately applied to nodes and edges) to: 1) begin to understand the gross structure and characteristics of the big graph; and 2) refine the ultimate look I wanted my published graph to have.

In our example, I ultimately chose the standard Yifan Hu layout in order to get the communities (clusters) to aggregate close to one another on the graph. I then applied the Parallel Force Atlas layout to organize the nodes and make the spacings more uniform. The parallel aspect of this force-based layout allows these intense calculations to run faster. The result of these two layouts in sequence is then what was used for the results displays.

Upon completion of this analysis, I was ready to publish the graph. One of the best aspects of Gephi is its flexibility and control over outputs. Via the main Preview tab, I was able to do my final configurations for the published graph:

Publication Options for GephiThe graph results from the earlier-worked out filters and clusters and colors are shown in the right-hand Preview pane. On the left-hand side, many aspects of the final display are set, such as labels on or off, font sizes, colors, etc. It is worth looking at the figure above in full size to see some of the options available. 

Standard output options include either SVG (vector image) or PDFs, as shown at the lower left, with output size scaling via slider bar. Also, it is possible to do standard saves under a variety of file formats or to do targeted exports.

One really excellent publication option is to create a dynamically zoomable display using the Seadragon technology via a separate Seadragon Web Export plug-in. (However, because of cross-site scripting limitations due to security concerns, I only use that option for specific sites. See next section for the Zoom It option — based on Seadragon — to workaround that limitation.)

Outputs Speak for Themselves

I am very pleased with the advances in display and analysis provided by Gephi. Using the Zoom It alternative [3] to embedded Seadragon, we can see our big ontology example with:

  • All 3000 nodes labeled, with connections shown (though you must must zoom to see) and
  • When zooming (use scroll wheel or + icon) or panning (via mouse down moves), wait a couple of seconds to get the clearest image refresh:

Note: at standard resolution, if this graph were to be rendered in actual size, it would be larger than 7 feet by 7 feet square at full zoom !!!

To compare output options, you may also;

Still, Some Improvements Would be Welcomed

It is notable that Gephi still only versions itself as an “alpha”. There is already a robust user community with promise for much more technology to come.

As an alpha, Gephi is remarkably stable and well-developed. Though clearly useful as is, I measure the state of Gephi against my complete list of desired functionality, with these items still missing:

  • Real-time and interactive navigation — the ability to move through the graph interactively and to issue queries and discover relationships
  • Huge node numbers — perhaps the OpenOrd plug-in somewhat addresses this need. We will be testing Gephi against UMBEL, which is an order of magnitude larger than our test big ontology
  • More node and edge control — Cytoscape still retains the advantage in the degree to which nodes and edges can be graphically styled
  • Full round-tripping — being able to use Gephi in an edit mode would be fantastic; the edit functionality is fairly straightforward, but the ability to round-trip in appropriate formats (OWL, RDF or otherwise) may be the greater sticking point.

Ultimately, of course, as I explained in an earlier presentation on a Normative Landscape for Ontology Tools, we would like to see a full-blown graphical program tie in directly with the OWL API. Some initial attempts toward that have been made with the non-Gephi GLOW visualization approach, but it is still in very early phases with ongoing commitments unknown. Optimally, it would be great to see a Gephi plug-in that ties directly to the OWL API.

In any event, while perhaps Cytoscape development has stalled a bit for semantic technology purposes, Gephi and its SemanticWebImport plug-in have come roaring into the lead. This is a fine toolset that promises usefulness for many years to come.

Some Further Gephi Links

To learn more about Gephi, also see the:

Also, for future developments across the graph visualization spectrum, check out the Wikipedia general visualization tools listing on a periodic basis.


[1] The R open source math and statistics package is very rich with apparently some graph visualization capabilities, such as the dedicated network analysis and visualization project statnet. rrdf may also provide an interesting path for RDF imports. R and its family of tools may indeed be quite promising, but the commitment necessary to R appears quite daunting. Longer-term, R may represent a more powerful upgrade path for our general toolsets. Neo4j is also a rising star in graph databases, with its own visualization components. However, since we did not want to convert our underlying data stores, we also did not test this option.
[2] Erwan Demairy is the lead developer and committer for SemanticWebImport. The first version was released in mid-April 2011.
[3] For presentations like this blog post, the Seadragon JavaScript enforces some security restrictions against cross-site scripting. To overcome that, the option I followed was to:
  • Use Gephi’s SVG export option
  • Open the SVG in Inkscape
  • Expand the size of the diagram as needed (with locked dimensions to prevent distortion)
  • Save As a PNG
  • Go to Zoom It and submit the image file
  • Choose the embed function, and
  • Embed the link provided, which is what is shown above.
(Though Zoom.it also accepts SVG files directly, I found performance to be spotty, with many graphical elements dropped in the final rendering.)

Posted by AI3's author, Mike Bergman

Posted on August 8, 2011 at 3:27 am in Ontologies, Open Source, Semantic Web Tools, UMBEL | Comments (1)
The URI link reference to this post is: http://www.mkbergman.com/968/a-new-best-friend-gephi-for-large-scale-networks/
The URI to trackback this post is: http://www.mkbergman.com/968/a-new-best-friend-gephi-for-large-scale-networks/trackback/
Date:   May 16, 2011

A Video Introduction to a New Online Ontology Editor and Manager

Structured Dynamics is pleased to unveil structOntology — its ontology manager application within the conStruct open source semantic technology suite. We are doing so via a video, which provides a bit more action about this exciting new app.

structOntology has been on our radar for more than two years. But, it was only in embracing the OWLAPI some eight months back that we finally saw our way clear to how to implement the system.

The app, superbly developed by Fred Giasson, has many notable advantages — some of which are covered by the video — but two deserve specific attention:  1) the superior search function (if you have been using Protégé or similar, you will love the fact this search indexes everything, courtesy of Solr); and 2) the availability of its functionality directly within the applications that are driven by the ontologies. Of course, there’s other cool stuff too!:

 

 

(If you have trouble seeing this, here is the direct YouTube link or an alternate local Flash version if you can not access YouTube.)

More information on structOntology will be forthcoming over the coming weeks. We will be posting it as open source as part of the Open Semantic Framework by early summer.

Posted by AI3's author, Mike Bergman

Posted on May 16, 2011 at 9:06 pm in Ontologies, Open Semantic Framework, Open Source, Structured Dynamics | Comments (3)
The URI link reference to this post is: http://www.mkbergman.com/959/intro-to-structontology/
The URI to trackback this post is: http://www.mkbergman.com/959/intro-to-structontology/trackback/
Page 1 of 161234510...Last »
Copyright © 2004–2012 Michael K. Bergman.   This work is licensed under a Creative Commons License