Last week I came across a reference from Search Engine Watch (to which I have subscribed for many years, and at whose conferences I have spoken) that TOTALLY FRIED me. It relates to a topic near and dear to me because I am both its father and its steward: the general topic of the “deep Web.” I began a public response to last week’s posting but then, after cooling down, simply notified the author, Gary Price, of my attribution concerns. He graciously amended his posting with appropriate attribution. Thanks, Gary, for proper and ethical behavior!
With some of the issues handled privately, I decided that discretion was the better part of valor and I would let the topic alone with respect to some of the other parties in the chain of lack of attribution. After all, Gary was merely reporting information from a reporter. The genesis of the issues resided elsewhere.
Then, today, I saw the issue perpetuated still further by the VC backer of Glenbrook Networks, piling onto the previous egregious oversights. I could sit still no longer.
First, let me say, I am not going to get into the question of “invisible Web” versus “deep Web” (the latter being the term which Thane Paulsen and I coined nearly 5 years ago to reflect dynamic content not accessible via standard search engine crawlers). Deep Web has become the term of art, much like Kleenex, and if you know what the term means then the topic of this post needs no further intro.
However, I’m going to make a few points below about the misappropriation of the term ‘deep Web’ and the technology around it. I believe that some may legitimately say, “Tough luck; it is your responsibility to monitor such things, and if they did not credit or acknowledge your rights, that is your own damn fault.” Actually, I will generally agree with this sentiment.
My real point in this posting, therefore, is not my term versus your term, but the integrity of intellectual property, attribution and “truth” in the dynamic Internet. If I step back from my own circumstance and disappointment, the real implication, I believe, is that future historians will be terribly hard-pressed to discern past truths from Internet content. If we think it is difficult to extract traceable DNA from King Tut today, it will be close to impossible to discern the true genesis, progression, linkages and idea flow based on Internet digital information into the future. But I digress …
Last Week’s Posting
The genesis of this issue began with a posting on Silicon Beat by Matt Marshall, Diving deep into the web: Glenbrook Networks. Marshall is a reporter for the San Jose Mercury News. Much was made of the “deep Web” phenomenon and the fact that Glenbrook Networks now had technology to tap into it. This story was then picked up by the Search Engine Watch blog. SEW is one of the best and most authoritative sources for search engine related information on the Web. The blog author was Gary Price. The SEW blog entry cited two references on deep Web topics, both of which referenced my seminal paper as their own first references. Neither of these press articles mentioned BrightPlanet. I notified Gary Price of what I thought was an oversight of attribution, and he properly and graciously added an addendum to the original piece:
Using this press, Jeff Clavier, one of the VCs backing the vendor, Glenbrook Networks, began flogging the press coverage on his own blog site. There were assertions made in that original piece that deserved countering, but there have been vendors that have come and gone in the past (see below) that have attempted to misappropriate this “space” and its technology and have generally fallen by the wayside or gone out of business. I chose to let the matter go quiet publicly, ground some more enamel off my teeth, and referred the matter to our general counsel for private action.
Today’s Posting
The flogging continued today under a new posting on Jeff Clavier’s site, Glenbrook Networks: Trawling the Deep Web. This new posting extended the misappropriation further and, since it is part of an ongoing series obviously planned to push the investment, goaded me to finally make a public response. In part, here is some of what that new post said:
Because the Deep Web contains a lot of factual information, it can be seen metaphorically as an ocean with a lot of fish. That is why we call the system that navigates the Deep Web a trawler.
Note that the figures used come directly out of our research and are frequently used by others without attribution, as is the case here. However, the trawler imagery is especially egregious, since it is a direct rip-off of our original papers! In fact, here are the two trawler images from our original Deep Web: Surfacing Hidden Value, first published in 2000, the first representing surface content retrieval:

The next image represents deep Web content retrieval:

The post then goes on to overview some “technology” with fancy names that is very straightforward, has been documented extensively before by BrightPlanet, and is covered by existing patents to our company.
Misappropriation is Nothing New
Such misappropriations have happened before. In one instance, a competitor (now out of business) plagiarized complete portions of BrightPlanet’s white paper on its home page. We have also had competitors name themselves after the deep Web (e.g., Deep Web Technologies), appropriate the name and grab Web addresses (Quigo, with http://www.deepweb.com, now largely abandoned), government agencies make videos (the Deep Web DOE deep Web search engine), national clubs form (Deep Web Club), and competitors push products and technologies citing our findings and insights (e.g., Grokker from Groxis or Connotate), in all instances without attribution or mention of BrightPlanet.
Imitation is the sincerest form of flattery and enforcement of intellectual property rights depends on the vigilance of the owner. We understand this, though small company size often means it is difficult to discover and police. Indeed, in the initial naming of the “deep Web” we wanted it to become the term of art. By not keeping it proprietary, it largely has. We have thus welcomed the growth of the concept. However, we do not welcome the blatant infringement on intellectual property and technology by competitors. We particularly expect VC-backed companies to adhere to ethical standards. I admonish Glenbrook Networks and its financial backers to provide attribution where attribution is due. This degree of misappropriation is too great. Shame, shame ….
BTW, for the record, you can see the most recent update of my and BrightPlanet’s deep Web paper and analysis at the University of Michigan’s Journal of Electronic Publishing, July 2001. Of course, there remains the definitive information on this topic in spades at BrightPlanet’s Web site.
Defense via Electrons
Actually, the real sadness here is that perhaps what is “truth” is only as good as what has been posted last. Post it last, say it loudest, and the whole world only knows what it sees. The Internet certainly poses challenges to past institutions such as peer review or professional publishing that helped to reinforce standards of truth, verification and defensibility. What standards will emerge on the Web to help affirm authoritativeness?
Certainly, one hopes that the community itself, which has shown constantly it can do so, will find and expose lies, deceit, fraud, or other crimes of the information commons. This appears to work well in the political arena, perhaps is working okay in the academic arena, but how well is it working in the general arena of ideas and intellectual property? Unfortunately, as perhaps this example shows, maybe not so great at times. The thing that I fear is that defense can only occur by how many electrons we shower onto the Internet, how broadly we broadcast them, and how frequently we do so. May the electrons be with you ….
Mike’s note: The following are comments submitted by Hiranya K Nath of Sam Houston State University on my earlier posted paper, "Untapped Assets: The $3 Trillion Value of U.S. Enterprise Documents." I subsequently referred to Hiranya’s and Uday M. Apte’s paper, "Size, Structure and Growth of the US Information Economy," in a follow-on to that post discussing supporting views for document assets occupying trillions of dollars in US economic activity. The following is reprinted with Hiranya’s permission.
This is an important and interesting study that attempts to measure the value of corporate 'documents' in the U.S. It not only measures the cost of creating new documents but also the cost of handling or mishandling of documents. This study measures the benefits from improved document access and use. The value of corporate documents is assessed under three major categories: internal documents, web documents (which generally reside in public domain) and 'opportunities and threats'. The first two categories provide information for internal or external use while the third category of documents is to obtain solicited grants and contracts or to satisfy regulatory requirements.
In an economy which has increasingly been information-based, the importance and challenges of managing information have reached a proportion that was never witnessed before. Quantifying the value of creating and handling documents is extremely important and, to my knowledge, this study is one of the first attempts in that direction. However, as the author admits, the estimates are compiled from various sources and, therefore, they are extremely fragmentary and may have been inconsistent. In the following paragraphs, I present my thoughts on how I would proceed if I were to conduct the study. Nevertheless, this white paper has done an excellent job in initiating a research agenda.
First, define and explain the terms and concepts. The terms and concepts used in the study need some explanations as they may be useful for a reader to have a good grasp over the issues associated with quantifying the value of documents. Some of the terms related to information economy have not yet entered the general vocabulary. The dictionary meaning of 'document' is proof or evidence in a written format. Oxford dictionary has extended the definition to include the digital format as well. Also, concepts like knowledge industry, knowledge worker, information industry, information worker need to be defined. Studies like Machlup (1962), Porat (1977) have conceptualized these terms but they have not entered mainstream research vocabulary. More generally, a more settled, well accepted vocabulary has not been developed yet.
Merriam-Webster Dictionary: Document
Function: noun
1 a archaic : PROOF, EVIDENCE b : an original or official paper relied on as the basis, proof, or support of something c : something (as a photograph or a recording) that serves as evidence or proof
2 a : a writing conveying information b : a material substance (as a coin or stone) having on it a representation of thoughts by means of some conventional mark or symbol
Function: transitive verb
1 : to furnish documentary evidence of
2 : to furnish with documents
3 a : to provide with factual or substantial support for statements made or a hypothesis proposed; especially : to equip with exact references to authoritative supporting information b (1) : to construct or produce (as a movie or novel) with authentic situations or events (2) : to portray realistically
4 : to furnish (a ship) with ship’s papers
Oxford Advanced Learner's Dictionary : Document
Function: noun
1 an official paper or book that gives information about sth, or that can be used as evidence or proof of sth: legal documents travel documents Copies of the relevant documents must be filed at court. One of the documents leaked to the press was a memorandum written by the head of the security police.
2 a computer file that contains text that has a name that identifies it: Save the document before closing.
Function: verb
1 to record the details of sth: Causes of the disease have been well documented. The results are documented in Chapter 3.
2 to prove or support sth with documents: documented evidence
Second, categorize and identify various documents. This is important because it will provide an operational guideline for collecting relevant information for estimating the value of documents. From the standpoint of an organization, I think the documents can be divided into the following categories:
i) The first comprises documents that record operational details such as documents created by the accounting or payroll departments. Gathering of information and documentation thereof follow standard practices. With the advent of new technology the format of these documents may have changed but the standards have not changed much. I would assume that the cost of mishandling this category of documents is minimal. This category also includes documents which are created to satisfy legal requirements.
ii) The second category includes documents which are mainly for dissemination of information. An organization generally interacts with a target group and for optimum outcome from this interaction it is important that this target group is fully informed. Basically, these documents are created to reduce information asymmetry among agents so that problems related to asymmetric information do not arise. There is a marketing aspect to this category of documents. With the availability of new technology, constant changes in media, and people's access to diverse sources of information, this category of documents is expected to grow in volume and value. But this might cause substantial reduction in overall cost of production by reducing inefficiency that arises from information asymmetry.
Third, develop a methodology to measure the value of the documents. The first category should not be too hard to measure. Since every organization has well-defined departments responsible for creating and handling this category of documents, the information should be relatively easily available. I would anticipate some formidable problems in measuring the value of the second category. These documents may be created without specific planning or without following standard practices. Also, measuring the value of the large amount of 'unnecessary' documents will be challenging yet important for the overall value of the documents.
Among other issues, since most documents are created and used at the intermediate level, a cost-based valuation will be appropriate. But some documents could be priced and if price-based valuation is used for those documents then appropriate adjustments should be made to make them consistent with each other.
References:
Machlup, F. (1962), The Production and Distribution of Knowledge in the United States, (Princeton University Press, Princeton, NJ).
Porat, Marc U and Rubin, Michael R. (1977), The Information Economy (9 volumes), Office of Telecommunications Special Publication 77-12 (US Department of Commerce, Washington D.C.)
As my efforts proceeded in getting this blog set up, I began to realize I was devoting substantially more time and effort to the activity than I originally anticipated. It was roughly at this time of realization that I began tracking time and effort. To date, I have spent about 300 hours(!!) getting my site ready to go, but I know this is in no way typical.
In fact, with services like Blogger, you can be up and running in 5 minutes, for free, and posting comments immediately. For reasons noted in my ‘Prepare to Blog’ diary, this de minimis effort may not be advisable. On the other hand, my own needs and demands should not be indicative either. In any event, I present below the breakdown of my time and effort tracking and discuss what may be more "typical" expectations.
Unusual Demands for AI3
As I’ve stated elsewhere, there are some unique and unusual circumstances I have placed on my set-up and investigations leading to AI3. I have wanted, for example, to:
I think I’ve been successful in these aims but, as noted before, the time and effort I incurred are not typical. As I present the numbers below, I will try to be specific about what may be applicable. Please understand that the reference and viewpoint I present are for a serious blog content site, perhaps applicable to only 10% of bloggers or so.
Time and Effort Breakdowns
The table below presents the results of my time and effort tracking. It shows that about 310 hours have been spent getting AI3 ready, over a period of about three months from the decision to blog until commercial release. These times and efforts are well removed from a 5-min Blogger site!
| | | Hours by Major Area | | | | | | |
| Blog Link | Date | Research | Set-up | Add Tools | Techniques | Composition | Posting | Total |
| First Post – Decided to Blog | 4/27/05 | 1.0 | 0.2 | 0.1 | 1.3 | | | |
| First Blog Test Drive | 4/28/05 | 1.0 | 0.5 | 0.2 | 0.1 | 1.8 | | |
| WordPress | 4/29/05 | 4.0 | 2.0 | 0.6 | 0.3 | 6.9 | | |
| Local Hosting | 5/2/05 | 5.0 | | | | 1.0 | 0.5 | 6.5 |
| Install Difficulties and Then Success! | 5/5/05 | 2.0 | 14.0 | | | 1.2 | 0.6 | 17.8 |
| Design and Hacking CSS | 5/6/05 | 5.0 | 2.0 | 0.2 | 0.1 | 7.3 | | |
| No Local Images | 5/7/05 | 3.0 | 0.5 | | 2.0 | 0.4 | 0.2 | 6.1 |
| Posts/Comments Behavior | 5/8/05 | 1.0 | 1.5 | 0.1 | | 2.6 | | |
| Advanced Functionality | 5/9/05 | 5.0 | | | | 0.2 | 0.1 | 5.3 |
| Site Transfer | 5/17/05 | | 6.0 | | 1.0 | 0.1 | | 7.1 |
| Begin Content | 5/18/05 | 0.5 | 0.1 | 0.6 | | | | |
| Release Checklist | 5/20/05 | 2.0 | 2.0 | 4.0 | 2.0 | 10.0 | | |
| Editor Comparisons | 5/20/05 | 6.0 | 2.0 | | 1.0 | 6.0 | 3.0 | 18.0 |
| Xinha Integration | 5/31/05 | 1.0 | 1.0 | 6.0 | 1.0 | 0.4 | 0.2 | 9.6 |
| External Credits and Thanks | 6/13/05 | 3.0 | 1.0 | 0.8 | 0.4 | 5.2 | | |
| Permalink Problems | 6/15/05 | 3.0 | 0.8 | | 3.0 | 1.6 | 0.8 | 9.2 |
| Standard Site Content | 6/15/05 | 1.0 | 3.0 | 16.0 | 8.0 | 28.0 | | |
| Word Docs to HTML | 6/16/05 | 4.0 | 2.0 | 3.0 | 6.0 | 8.0 | 4.0 | 27.0 |
| Site Project Management | 6/17/05 | 1.0 | 1.5 | 3.0 | 0.3 | 5.8 | | |
| Not Playing Nice in the Sandbox | 6/19/05 | 2.0 | 6.0 | 2.0 | 1.0 | 11.0 | | |
| Use of Styles and Style Sheets | 6/20/05 | 3.0 | 6.0 | 4.0 | 1.5 | 14.5 | | |
| Clean Up Posts [not posted] | 6/22/05 | | | | | 2.0 | 8.0 | 10.0 |
| Some Best Practices | 6/22/05 | 1.0 | 4.0 | 6.0 | 3.0 | 14.0 | | |
| Large Document Transfer | 6/24/05 | 1.0 | 1.0 | 2.0 | 1.0 | 8.0 | 13.0 | |
| Cross-browser Compatibility | 6/24/05 | 2.0 | 3.0 | 4.0 | 2.0 | 11.0 | | |
| File Organization and Naming | 6/25/05 | 1.0 | 1.0 | 0.5 | 2.5 | | | |
| The Purposeful Blogger | 6/25/05 | 2.0 | 1.0 | 3.0 | | | | |
| Time Estimates | 6/26/05 | 2.5 | | | | 0.8 | 0.4 | 3.7 |
| Word Docs to HTML II | 6/26/05 | 1.0 | | | 2.0 | 3.0 | 1.0 | 7.0 |
| W3C XHTML Validation | 6/26/05 | 4.0 | 0.1 | | 3.0 | 1.5 | 0.5 | 9.1 |
| Screen Resolution Fix | 7/5/05 | 4.0 | 0.1 | 1.5 | 1.5 | 0.5 | 7.6 | |
| Trackback and Ping Setup/Testing | 7/12/05 | 3.0 | 1.5 | 1.0 | 0.3 | 5.8 | | |
| Better Quicktags for Comments | 7/14/05 | 3.0 | 1.0 | 1.0 | 0.5 | 2.0 | 0.5 | 8.0 |
| Formal Site Release! | 7/18/05 | 3.0 | 1.0 | 4.0 | 0.0 | | | |
| Prepare to Blog Summary and PDF | 7/20/05 | 1.5 | | | 0.5 | 6.0 | 2.0 | 10.0 |
| Summary – ‘Typical’ Tasks | | | | | | | | |
| Total | | 37.0 | 10.6 | 8.5 | 33.0 | 16.0 | 8.0 | 113.1 |
| % of Total | | 32.7% | 9.4% | 7.5% | 29.2% | 14.1% | 7.1% | |
| Summary – All AI3 Tasks (incl. red) | | | | | | | | |
| Total | | 74.0 | 36.0 | 11.5 | 51.5 | 85.3 | 52.0 | 310.3 |
| % of Total | | 23.9% | 11.6% | 3.7% | 16.6% | 27.5% | 16.7% | |
| Summary – Non-‘Typical’ AI3 Tasks | | | | | | | | |
| Total | | 37.0 | 25.4 | 3.0 | 18.5 | 69.3 | 44.0 | 197.2 |
| % of Total | | 18.8% | 12.9% | 1.5% | 9.4% | 35.2% | 22.3% | |
The table lists about 30 subtasks (generally documented as individual posts on AI3) broken into the six major activity areas of Research, Set-up, Adding (or integrating) Tools, Techniques, Composing Posts, and Posting clean posts with review and formatting. Please note the red entries, since these are deemed to be specific to my unusual demands for AI3 and are therefore not typical of what a serious blogger without these aims might experience. The unusual entries are either entire tasks associated with investigating tools and techniques or the efforts spent in composing and posting the ‘Preparing to Blog’ diary.
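For readers who want to redo the arithmetic behind the "% of Total" rows, here is a minimal sketch. It takes the six area totals from the "Summary – ‘Typical’ Tasks" row, sums them, and expresses each as a share of the total rounded to one decimal place. The names `typicalHours` and `shareOfTotal` are made up for illustration and are not part of any tool used on AI3.

```javascript
// Hours by major area, copied from the "Summary – 'Typical' Tasks" row.
const typicalHours = {
  "Research": 37.0,
  "Set-up": 10.6,
  "Add Tools": 8.5,
  "Techniques": 33.0,
  "Composition": 16.0,
  "Posting": 8.0,
};

// Sum the areas and express each one as a percentage of that sum.
// Rounding to one decimal place mirrors how the table presents its figures.
function shareOfTotal(hoursByArea) {
  const total = Object.values(hoursByArea).reduce((sum, h) => sum + h, 0);
  const shares = {};
  for (const [area, hours] of Object.entries(hoursByArea)) {
    shares[area] = Math.round((hours / total) * 1000) / 10;
  }
  return { total: Math.round(total * 10) / 10, shares };
}

const summary = shareOfTotal(typicalHours);
console.log(summary.total);  // total hours across the six areas
console.log(summary.shares); // percentage share per area
```

The same function can be pointed at the other two summary rows; small last-digit differences from the published percentages may appear, depending on how the original figures were rounded.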
Observations and Guidance for the Serious Blogger
A serious blogger should be able to get a fairly comprehensive and well-designed site up and running in less than 100 hours, less if some of the lessons and guidance from the ‘Preparing to Blog’ diary are followed, and less still if standard site content (mission, about me, etc.) is shorter than what I provided on the AI3 site. Moreover, unlike the three months it took to get AI3 released, much quicker turnarounds could easily be accomplished. The longer time for AI3 was exacerbated by the roughly threefold effort associated with the site’s unusual demands, business travel and demands, and one family vacation!
Some other observations that may guide planning for serious blogging from these numbers are (with the obvious caveats that different styles and skills may significantly alter these points):
Finally, ongoing requirements and care-and-feeding will remain demands. If one assumes roughly three "good" posts per week to keep a blog active, the numbers above suggest a weekly effort of about 20-25 hours per week or about 1-2 hrs per day, exclusive of responding to user comments. This may suggest lowering expectations to only a couple of quality postings per week.
Author’s Note: I actually decided to commit to a blog on April 27, 2005, and began recording soon thereafter my steps in doing so. Because of work demands and other delays, the actual site was not released until July 18, 2005. To give my ‘Prepare to Blog …’ postings a more contemporaneous feel, I arbitrarily changed posting dates on this series one month forward, which means some aspects of the actual blog were better developed than some of these earlier posts indicate. However, the sequence and the content remain unchanged. A re-factored complete guide will be posted at the conclusion of the ‘Prepare to Blog …’ series, targeted for release about August 18, 2005. mkb
Author’s Note: There is zipped Javascript code that supports the information in this post. If you develop improvements, please email Mike and let him know of your efforts.
Click here to download the zipped Javascript code file and LGPL license (14 KB)
Most blogs provide some text explanation above the comment field for what HTML tags the commenter may provide in her response. However, I’d seen some other sites that had some buttons that automated some of this process. I set out to add this capability to my site with two objectives in mind: 1) make it easy for the non-HTML user to format a post comment; and 2) keep options narrowed to what I thought was an appropriate subset of HTML format tags.
The WordPress Quicktags Facility
The internal WordPress post and page editor comes with what it calls “quicktags.” These tags, and their explanations, are:
This text is within a blockquote tag
text that is formatted in a monospaced font to differentiate between code clips and regular text
Alex King’s Expanded Javascript Quicktags
As an addition to the existing administrator quicktags, or for use in formatting online comments, Alex King expanded this listing and provided it as an LGPL-licensed Javascript download. This Javascript Quicktags is up to version 1.2 and is the basis for what I incorporated on my site for formatting online comments.
My Modifications
While I liked this version, it had some limitations in meeting my objectives:
The result was that I modified Alex’s starting code. If you would like to install the same version I have on this site — see the comment field with its buttons below — download the file noted at the top of this post. It includes all of the license and installation instructions of Alex’s current 1.2 version with my modifications.
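To give a feel for what a comment quicktag button does, here is a minimal sketch of the core idea: wrap the commenter's selected text in an HTML tag, but only if that tag is in a deliberately narrow allowlist. This is not Alex King's code or my modified version of it. The function name `wrapSelection` and the contents of `ALLOWED_TAGS` are assumptions chosen for illustration.

```javascript
// Hypothetical allowlist; the tags actually offered on a given site may differ.
const ALLOWED_TAGS = ["strong", "em", "blockquote", "code"];

// Wrap the selected range [start, end) of `text` in <tag>...</tag>.
// Tags outside the allowlist are refused, which keeps the options narrow.
function wrapSelection(text, start, end, tag) {
  if (!ALLOWED_TAGS.includes(tag)) {
    throw new Error(`Tag not allowed in comments: ${tag}`);
  }
  if (start < 0 || end > text.length || start > end) {
    throw new RangeError("Invalid selection range");
  }
  return (
    text.slice(0, start) +
    `<${tag}>` + text.slice(start, end) + `</${tag}>` +
    text.slice(end)
  );
}

// Example: bold the first two words of a comment.
console.log(wrapSelection("Nice post, thanks!", 0, 9, "strong"));
// prints "<strong>Nice post</strong>, thanks!"
```

In a browser, the `start` and `end` values would typically come from the comment textarea's `selectionStart` and `selectionEnd` properties, with each button passing its own tag name.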
Future Efforts
I will work to add a preview capability so that the commenter can see her formatted comments before submitting the post. Longer term, I want to install a limited format WYSIWYG editor, similar to what I am using internally with Xinha.
Setting a Ping List
Pinging is a powerful way to alert posting and blog searching services that you have posted a new entry. There is a facility within the WordPress administration center for setting a ping list under Options-Writing-Update Services; the standard ping provided with the default installation is Pingomatic. Elliott Back has provided a useful starting list of ping sites to include with WordPress. In reviewing his list, I decided to include these entries, excluding dead links and most foreign sites:
www.a2b.cc/setloc/bp.a2b
api.feedster.com/ping
api.moreover.com/ping
api.my.yahoo.com/rss/ping
www.blogdigger.com/RPC2
blogmatcher.com/u.php
www.blogshares.com/rpc.php
www.blogsnow.com/ping
www.blogstreet.com/xrbin/xmlrpc.cgi
coreblog.org/ping/
www.mod-pubsub.org/kn_apps/blogchatter/ping.php
www.newsisfree.com/xmlrpctest.php
ping.blo.gs/
ping.feedburner.com
ping.syndic8.com/xmlrpc.php
ping.weblogalot.com/rpc.php
www.popdex.com/addsite.php
rpc.blogrolling.com/pinger/
rpc.pingomatic.com/
rpc.technorati.com/rpc/ping
rpc.weblogs.com/RPC2
www.snipsnap.org/RPC2
topicexchange.com/RPC2
xping.pubsub.com/ping/
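For the curious, here is a rough sketch of what a ping looks like on the wire. Many update services accept the XML-RPC method `weblogUpdates.ping`, which takes the blog's name and its URL; whether every endpoint in the list above does so is an assumption, not something I have verified. The helper names (`escapeXml`, `buildPingRequest`, `toPingUrl`, `sendPing`) are illustrative only, and `sendPing` assumes a runtime with a global `fetch` (a current browser or Node 18+).

```javascript
// Escape the characters that are significant inside XML text content.
function escapeXml(value) {
  return String(value)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;");
}

// Build the XML-RPC request body for weblogUpdates.ping(blogName, blogUrl).
function buildPingRequest(blogName, blogUrl) {
  return [
    '<?xml version="1.0"?>',
    "<methodCall>",
    "<methodName>weblogUpdates.ping</methodName>",
    "<params>",
    `<param><value><string>${escapeXml(blogName)}</string></value></param>`,
    `<param><value><string>${escapeXml(blogUrl)}</string></value></param>`,
    "</params>",
    "</methodCall>",
  ].join("\n");
}

// The list entries omit the scheme; assume plain http unless one is given.
function toPingUrl(entry) {
  return /^https?:\/\//.test(entry) ? entry : `http://${entry}`;
}

// POST the ping to one service and return the raw XML reply.
// Not called here, so running this file makes no network requests.
async function sendPing(service, blogName, blogUrl) {
  const response = await fetch(toPingUrl(service), {
    method: "POST",
    headers: { "Content-Type": "text/xml" },
    body: buildPingRequest(blogName, blogUrl),
  });
  return response.text();
}
```

WordPress does all of this for you once the list is saved; the sketch is only meant to show that a ping is a single small XML request per service.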
What is Trackback?
Trackback is a mechanism for a third party to post on its own site a detailed response to one of your posts. When done, the third-party site pings your site, provides an excerpt that displays in the standard comment field, and can then be viewed by URL reference from your comments list. Thus, the third party respondent need not post on both sites and more detailed responses can be provided on each author’s respective site. Trackbacks can also be used for content aggregation purposes by topic.
There are many trackback overviews available on the Web, though the concept sometimes seems hard to understand because of its poor name. A couple that are useful include the newbie guide from Movable Type, the original developer of the trackback function and protocol, and a different short version.
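The mechanics described above can be sketched in a few lines. As I understand the trackback protocol, the responding site sends an HTTP POST with form-encoded fields (`url` is required; `title`, `excerpt` and `blog_name` are optional) to the trackback URL, and the receiving site answers with a small XML document whose `<error>` element is 0 on success. The function names below are illustrative and the regular-expression parsing is a simplification, not a full XML parser.

```javascript
// Build the form-encoded body of a trackback ping.
// `url` is the permalink of the responding post; the other fields are optional.
function buildTrackbackBody({ url, title, excerpt, blogName }) {
  const params = new URLSearchParams();
  params.set("url", url);
  if (title) params.set("title", title);
  if (excerpt) params.set("excerpt", excerpt);
  if (blogName) params.set("blog_name", blogName);
  return params.toString();
}

// Interpret the XML reply: <error>0</error> means the ping was accepted;
// any other value signals failure, usually with a <message> explaining why.
function parseTrackbackResponse(xml) {
  const error = /<error>\s*(\d+)\s*<\/error>/.exec(xml);
  const message = /<message>([\s\S]*?)<\/message>/.exec(xml);
  return {
    ok: error !== null && error[1] === "0",
    message: message ? message[1].trim() : "",
  };
}

// Example body for a reply posted at a hypothetical address.
console.log(buildTrackbackBody({ url: "http://example.com/reply", title: "My reply" }));
```

The body would be sent with a `Content-Type` of `application/x-www-form-urlencoded` to the trackback URI that the original post advertises.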
Should Trackback be Used?
Initial difficulties in getting trackback configured for my site led me to question whether the function was even desirable. In fact, within the past year, there has been an explosion of spamming against trackback facilities. ‘Trackback is Dead’ is one of the more provocative discussions arguing that the time for trackback is past; Matthew Mullenweg, a founding developer of WordPress, has spent time looking at how spamming can be overcome.
Because WordPress 1.5 has what appears to be a pretty effective trackback moderating facility — the same as what is used for comments — and because there appear to be some trackback spam filters and other plugin utilities, I decided to implement the feature for now, see if it is used, and if spam does occur deal with that problem at that time.
Configuring and Testing Trackback
My attempts to grapple with trackback had some difficulties, compounded by my fundamental misunderstandings. First, the term is not descriptive, and it took me quite a bit of time to understand exactly what trackback was designed to accomplish. Second, most blogs that have a trackback function show it as a link prior to the comments section. I first had difficulty writing my comments code such that the trackback link URL displayed properly (for some reason, I needed to pass arguments in single quotes, not double quotes). Then, when that problem was fixed, clicking on the link produced a 403 error from my hosting server. This latter problem suggested that I did not have trackback properly configured, when, actually, it was my misunderstanding of how the function worked that was the problem.
The breakthrough in understanding came from external sites that offer ping and trackback testing. The one I used was from Red Alt. Definitely use these utilities! I also tested pinging with Pingomation.
The Red Alt instructions showed that my site was indeed sending and receiving pings properly. In fact, the use of the Trackback link appeared solely to be a means to get the reference URL to display for the third party to properly link to it. That caused me to re-think the use of a link in the first place.
Displaying Trackback and Permalinks within Comments Section
To prevent others from having the same confusions I had, I decided to make two changes to how most sites handle trackbacks. First, I decided to eliminate the active links for both trackbacks and permalinks, instead replacing them with more descriptive text fields with URLs that can be copied directly off the browser page. For example, for this post, the references are:
The URI link reference to this post is: http://mkbergman.com/?p=103
The URI to trackback this post is: http://mkbergman.com/wp-context/trackback/?p=103
Second, I also decided to make a clear distinction between direct comments and trackbacks in my post comments field. I did this by testing for the type of comment; if it is a trackback, it is shown as a different type. The PHP code for implementing this is as follows, placed into the comments.php file where the comment display loop is shown:
<?php if ($comment->comment_type == 'trackback') : ?>
provided a trackback on
<?php else : // the comment is a true comment ?>
commented on
<?php endif; ?>
When activated, a regular comment will show as XXX commented on date; if a trackback, it will show as XXX provided a trackback on date.
With these changes, I now had pinging working, trackback working, and clear distinctions in my comments fields as to true comments or trackbacks.