Posted: August 25, 2006

Author’s Note: There is zipped HTML and Javascript code that supports the information in this post. If you develop improvements, please email Mike and let him know of your efforts.


Download Language Translator 2 and JS code file Click here to download the zipped file (2 KB)

On the recommendation of BrightPlanet’s lead programmer, Will Bushee, I was also pointed to another online translation service, from Applied Language (AL). In addition to the nine languages handled by Google in its English translations covered in my previous post, AL adds Russian and Dutch. And so, again with Graham Beynon’s able Javascript assistance, my online translation system has now been expanded to 11 languages across two services: Arabic, Chinese (simplified and traditional), Dutch, French, German, Italian, Japanese, Korean, Portuguese, Russian and Spanish. Thanks again, guys!

This now allows you, the user, to directly compare translation quality, which both services acknowledge is less than perfect. Over time, it will also show me which services, and which languages, readers of my site prefer.

An Example Comparison

My earlier post drew some criticism of machine translation from Graham Higgins, using the example of his sister-in-law’s German Web site translated to English. To test some of these changes myself, I decided to roundtrip both the Google and Applied Language services for my site, arbitrarily picking Portuguese for the example. I began with some text from my online bio, using simple cut-and-pastes with no HTML formatting on a base of about 150 words:

Michael K. Bergman is a co-founder, chief technology officer, and chairman of BrightPlanet Corporation, and a Web scientist.

Mr. Bergman began his professional career in the mid-1970s as a project director for the U.S. EPA for a major energy study called the Coal Technology Assessment (CTA). He later taught in the Graduate School of Engineering at the University of Virginia and was a fellow in the Energy Policies Study Center. Mr. Bergman joined the American Public Power Association in 1982 and was soon promoted to director of energy research. He quadrupled the trade association’s voluntarily supported R&D program to 500 members.

He remained at APPA for seven years then formed his own successful technology commercialization company, Instar Community Systems, Inc. At Instar, Mr. Bergman helped create three technology commercialization groups in fuel cells, photovoltaics and biomass energy. In late 1994 he founded VisualMetrics, the developer of the technology now owned by BrightPlanet.

Here is Google’s translation to Portuguese based on its online service:

Michael K. Bergman é um co-founder, um oficial principal da tecnologia, e um presidente de BrightPlanet Corporaçõ, e um cientista da correia fotorreceptora. O Sr. Bergman começou sua carreira profissional nos mid-1970s como um diretor do projeto para os ESTADOS UNIDOS EPA para um estudo principal da energia chamado a avaliação da tecnologia de carvão (CTA). Mais tarde ensinou na escola graduada de projetar na universidade de Virgínia e foi um companheiro no centro do estudo das políticas de energia. O Sr. Bergman juntou a associação pública americana do poder em 1982 e foi promovido logo ao diretor da pesquisa de energia. Quadrupled o programa voluntàriamente suportado do R&D de associação comercial a 500 membros. Remanesceu em APPA por sete anos a seguir deu forma a sua própria companhia bem sucedida do commercialization da tecnologia, sistemas da comunidade de Instar, Inc. Em Instar, em Sr. Bergman ajudado criar três grupos do commercialization da tecnologia em células combustíveis, em photovoltaics e em energia do biomass. Em 1994 atrasado fundou VisualMetrics, o colaborador da tecnologia possuída agora por BrightPlanet.

Here is Applied Language’s translation to Portuguese based on its online service:

Michael K. Bergman é um co-founder, um oficial principal da tecnologia, e um presidente de BrightPlanet Corporaçõ, e um cientista da correia fotorreceptora.

O Sr. Bergman começou sua carreira profissional nos mid-1970s como um diretor do projeto para os ESTADOS UNIDOS EPA para um estudo principal da energia chamado a avaliação da tecnologia de carvão (CTA). Mais tarde ensinou na escola graduada de projetar na universidade de Virgínia e foi um companheiro no centro do estudo das políticas de energia. O Sr. Bergman juntou a associação pública americana do poder em 1982 e foi promovido logo ao diretor da pesquisa de energia. Quadrupled o R voluntàriamente suportado de associação comercial . . .

Here is the Google Portuguese translation roundtripped back to English:

Michael K. Bergman is to co-founder, a main officer of the technology, and a president of BrightPlanet Corporaçõ, and a scientist of the fotorreceptora leather strap. Mr. Bergman started its professional career in mid-1970s as a director of the project for United States EPA for a main study of the energy called the evaluation the coal technology (CTA). Later he taught in the graduated school to project in the university of Virginia and was an friend in the center of the study of the energy politics. Mr. Bergman joined the American public association of the power in 1982 and was soon promoted the director of the energy research. Quadrupled the program voluntàriamente supported of the R&D of trade association the 500 members. It remained in APPA per seven years to follow gave to form its proper successful company of commercialization of the technology, systems of the community To urge, Incorporation In Urging, Mr. helped Bergman to create three groups of commercialization of the technology in combustible cells, photovoltaics and energy of biomass. In 1994 been slow it established VisualMetrics, the collaborator of the possessed technology now for BrightPlanet.

Here is the Applied Language Portuguese translation roundtripped back to English:

Michael K. Bergman is to co-founder, a main officer of the technology, and a president of BrightPlanet Corporaçõ, and a scientist of the fotorreceptora leather strap.

Mr. Bergman started its professional career in mid-1970s as a director of the project for United States EPA for a main study of the energy called the evaluation the coal technology (CTA). Later he taught in the graduated school to project in the university of Virginia and was an friend in the center of the study of the energy politics. Mr. Bergman joined the American public association of the power in 1982 and was soon promoted the director of the energy research. Quadrupled the R voluntàriamente supported of trade association . . .

Observations

It is clear that neither machine translation is perfect and that AL’s online service also truncates to a shorter result (both handle length fine when used as an embedded site translator, as my blog example indicates). Technical terms (I love the fotorreceptora leather strap roundtripped translation for Web!) and capitalization, not to mention pronouns and possessives, appear to be especially problematic. Nonetheless, for a non-native speaker, the translations do generally convey the subject matter and thrust of the original document. While no one would argue that these machine translations could be depended upon for actionable intelligence (qualified human translators are likely necessary for that), machine translation can get us close to the ballpark.

Of course, each iteration in a machine translation cascade introduces errors, and a roundtrip, as the examples above show, compounds two levels of them. For example, here is the still further degraded result after five roundtrips using the Google Portuguese service:

Michael K. Bergman is to co-founder, a main officer of the technology, and a president of BrightPlanet Corporaçõ, and a scientist of the brace of the leather of the fotorreceptora. Mr. Bergman started its professional career in mid-1970s where a director of the project for United States EPA for a main study of the energy called the evaluation the coal the technology (CTA). Later he taught in the graduated school to project itself in the university of Virginia and was a friend in the center of the study of the politics of the energy. Mr. Bergman joined the American public association of the power in 1982 and was soon promoted the director of the energy research. Voluntàriamente Quadrupled trade association supported of the program of the R&D the 500 members. It remained in APPA per seven years to follow gave to give to give to give to give to form its appropriate successful company of commercialization of the technology, of the systems of the community to agitate above of, of the incorporation in agitating, Mr. Bergman helped above creating three groups of commercialization of the technology in the stacks, photovoltaics and the fuel of the energy of biomass. In 1994 been slow it it established VisualMetrics, collaborator of the possessed technology now for BrightPlanet.

Compare that to the first roundtrip and you can see that further errors were introduced.
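If you would rather put a rough number on that drift than eyeball it, one simple option is to count how many distinct words two versions share. The sketch below is only my illustration, not anything either service provides: the function names are made up, the tokenizer only handles plain ASCII words, and word overlap says nothing about grammar or meaning.

```javascript
// Hypothetical helper names; a crude drift measure, not a translation-quality metric.

// Lowercase the text and split it into simple ASCII word tokens.
function tokenize(text) {
  return text.toLowerCase().match(/[a-z0-9&]+/g) || [];
}

// Jaccard similarity of the two texts' distinct words:
// shared words divided by all distinct words seen in either text.
function wordOverlap(original, roundtripped) {
  const a = new Set(tokenize(original));
  const b = new Set(tokenize(roundtripped));
  if (a.size === 0 && b.size === 0) {
    return 1; // two empty texts are treated as identical
  }
  let shared = 0;
  a.forEach((word) => {
    if (b.has(word)) {
      shared += 1;
    }
  });
  return shared / (a.size + b.size - shared);
}
```

Feeding it the original bio and each roundtripped version in turn gives a score between 0 and 1, where lower scores suggest more vocabulary was lost or mangled along the way.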

Naturally, even greater degradation occurs when passing through multiple languages, with this example of English-German-French-English:

Michael K. Bergman is a more technologieoffizier one Cogründer, and a president de BrightPlanet corporation and scientists of network. Mr Bergman began his professional career in the middle of the years 70-iger as a director of project of the United States EPA for an energy energy study which was called the estimate of coal technology (CTA). He informed later of the vehicles in the study of energy policy in the school obtained a diploma of the technique at the university of Virginia and was a medium. Mr Bergman connected American general energy to relation in 1982 and was encouraged soon with the director of energy research. He quadrupled voluntarily the supported relation AND RESEARCH program for 500 members. It remained, melted then with APPA its own company of marketing of successful technology, Instar for seven years of Community systems, helped Inc with of Instar, Mr Bergman to cause three groups of marketings of technologies in the fuel cells, photovoltaics and in the energy of mass of life. Late in 1994 it founded VisualMetrics, the promoter of the technology which was had now by BrightPlanet.

The key point, however, is that as a screening tool and as a means for non-native speakers to generally grasp subjects and topic areas, machine translation can be an impressive and productive aid.

Instructions for Updating Your Own Translations

To add the Applied Language translations and its added languages, as my blog now sports, you will first need to sign up and get a unique key, plus list the languages desired. Please see here for the Applied Language sign up page. AL will then send you an email with the HTML and the unique ID for your domain. You NEED this key! Using the Javascript listed above, you can then put your unique key in the “value” field and embed the system within your blog or Web site, similar to the instructions in my previous machine translation post. Good luck!

Posted by AI3's author, Mike Bergman Posted on August 25, 2006 at 1:11 pm in Information Automation, Site-related | Comments (3)
The URI link reference to this post is: https://www.mkbergman.com/267/more-languages-and-a-comparison-translation-services/
The URI to trackback this post is: https://www.mkbergman.com/267/more-languages-and-a-comparison-translation-services/trackback/
Posted: August 23, 2006
NOTE: With Google’s recent announcement of its language translation service (see http://translate.google.com), there is no longer a use or need for this AI3 blog to maintain its language translation service and Javascript. Thus, the downloads listed below are still available, but no longer maintained or supported. MKB

Author’s Note: There is zipped HTML and Javascript code that supports the information in this post. If you develop improvements, please email Mike and let him know of your efforts.


Download Language Translator and JS code file Click here to download the zipped file (2 KB)

For those of you who follow BrightPlanet, we have been moving aggressively for some time now into international document harvesting and all that that implies regarding language and encoding detection and roundtripping. In fact, there is a fairly definitive tutorial post on my blog dealing with these so-called i18n (internationalization) issues that has become quite the reference on these matters. Through its partnership with Basis Tech, BrightPlanet can now harvest documents in about 140 different languages, with accurate encoding translation in multiple legacy forms for about 40 of them and morphological analysis for another 20 or so. There can be no doubt that the need for multi-lingual searching, harvesting and encoding support is an abiding trend of the evolving Internet.

So it was a great surprise and pleasure to encounter Lorelle VanFossen’s blog site where she has cleverly linked in Google’s machine language translation capabilities. Her explanation of that approach is provided by this specific posting. So, using these techniques, my site has now embraced these language translation capabilities for the nine languages shown as follows:

So add some language translation links to your sidebar or posts and help spread the word about your blog to the world. (Go ahead, actually click on these!):


  • Translate into Spanish
  • Translate into German
  • Translate into French
  • Translate into Portuguese
  • Translate into Italian
  • Translate into Arabic
  • Translate into Japanese
  • Translate into Chinese
  • Translate into Korean

You will also note that my blog now has a standard panel link (different format; see below) to translations into these languages on the main and subsidiary pages.

Try this! It’s fun and impressive. Some have criticised the “ultimate” quality of these translations, but Google improves them continuously over time.

Actual Implementation and Javascript

You should note that Google itself limits the amount of actual text it will translate at any given time. Thus, if you use the translate links from this site’s main page with its many cascading prior posts, you will see only a few posts translated. If you use the links on specific posts, however, you will find that most of the content, even for my longish entries, translates fully.

Also, these translations are uni-directional. Don’t continue to cascade from language to language; you will get processing errors. Always begin with the English pages as originally published on this site.

There are also two other flaws in the straight implementation as described above:

  1. Google’s listing of machine translated languages is growing, and the nine listed above already take up some real estate. We’re probably already past the point of buttonitis.
  2. There is no context for picking up the dynamic URL of wherever a user might be in a Web site or blog.

So, Graham Beynon, one of BrightPlanet’s senior developers, wrote a more generalized Javascript approach, a variant of which presently appears on this site. Via standard option listings, the languages can easily be expanded should more become available from Google, simply by adding another option entry and using the appropriate two-letter language code. Great work, Graham, and thanks.
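To make the idea concrete, here is a minimal sketch of that kind of option-driven approach. To be clear, this is not the code in the zipped download: the endpoint, the "u" and "langpair" parameter names, and all function and variable names are my assumptions for illustration only.

```javascript
// Hypothetical sketch; NOT the code shipped in the zipped download.
// The endpoint and the "u" / "langpair" parameter names are assumptions
// about how a translation service accepts a page URL and a language pair.
const TRANSLATE_ENDPOINT = "http://translate.google.com/translate";

// Adding a language means adding one entry with its two-letter code.
const LANGUAGES = [
  { code: "es", label: "Spanish" },
  { code: "de", label: "German" },
  { code: "fr", label: "French" },
  { code: "pt", label: "Portuguese" }
];

// Build the translation link for whatever page the reader is currently on.
function buildTranslateUrl(pageUrl, targetCode, sourceCode) {
  const source = sourceCode || "en"; // pages on this site start out in English
  return TRANSLATE_ENDPOINT +
    "?u=" + encodeURIComponent(pageUrl) +
    "&langpair=" + encodeURIComponent(source + "|" + targetCode);
}

// Render the <option> entries for a single drop-down instead of many buttons.
function buildOptionsHtml(languages) {
  return languages
    .map((lang) => '<option value="' + lang.code + '">' + lang.label + "</option>")
    .join("\n");
}
```

In a browser you would pass window.location.href as the page URL when the reader picks a language, which addresses both flaws above: one drop-down replaces the row of buttons, and the link always reflects the page the reader is actually on.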

If you inspect the source code, you’ll also see a couple of other choices you can make in the code operation by removing or adding comments. And, of course, should you choose to use this snippet, make sure you get rid of the test query and remove the HTML header stuff. You can, however, use the LanguageTranslator.html as is.

To download this file, click on the link at the top of this post. And, enjoy!

So, Welcome to Adaptive Information on the Modern Web. Or, rather:

  • In Spanish — Recepción a la información adaptante sobre el Web moderno
  • In German — Willkommen zu den anpassungsfähigen Informationen über das moderne Netz
  • In French — Bienvenue à l’information adaptative sur le Web  moderne
  • In Portuguese — Boa vinda à informação adaptável na correia fotorreceptora moderna
  • In Italian — Benvenuto alle informazioni adattabili sul fotoricettore moderno
  • In Arabic — ارحب التكيف معلومات عن شبكه حديثه
  • In Japanese — 現代網の適応性がある情報への歓迎
  • In Chinese (simplified) — 欢迎在适应现代信息网络
  • In Korean — 현대 웹에 적합한 정보에 환영.

Why is it that all of us get on particular jags? I started yesterday with a minor interest in pursuing some semantic Web relationships with my blog site and soon found I was cruising in all directions among plugins.

My last post describes the first stages of semantic readiness for my blog, with a number of follow-on enhancements in the works.  But in the process, I also came upon a nice display of social bookmark sites at the Ebiquity blog (thanks, Tim Finin, it is one of my favorites) that I immediately envied.

Well, after an amazingly short time in research, I came across one plugin that puts such links on WordPress — Sociable — that is a very clean and capable plugin.  It not only installs like a dream in WordPress (standard procedure), but it also adds some nice option settings that allow you, the blog administrator, to:

  • Select which pages the bookmark links appear on
  • Select which bookmarks get listed
  • And, select the order and some other display options.

In fact, the install was so easy that, looking at the PHP, I was confused as to how the obvious settings and options in the plugin could be set. My only silly criticism of the Sociable plugin is that there is virtually NO documentation and NO hype, which meant it took me a little poking around to see that these site administrator options become available once the plugin is installed. But once discovered, cool. How these setting options look in my WordPress administration center is shown below:

Now we’re talking real power and coolness in a plugin.

Sociable was inspired by Paul Stamatiou; Kirk Montgomery developed and released the first version of the plugin as WP-Sociable in January 2006. In February 2006, Peter Harkins took over development. So, as you notice the new cool icons at the bottoms of my posts, credit goes to Peter and his predecessors and the Sociable plugin. Nice work, guys.

Posted by AI3's author, Mike Bergman Posted on August 23, 2006 at 9:06 am in Blogs and Blogging, Site-related | Comments (3)
The URI link reference to this post is: https://www.mkbergman.com/262/sociable-more-plugin-method-to-the-madness/
The URI to trackback this post is: https://www.mkbergman.com/262/sociable-more-plugin-method-to-the-madness/trackback/
Posted: August 22, 2006

Once one starts talking the talk, it becomes time to walk the walk.  And, so I have now done so.  I’ve added semantic Web capabilities to this blog site, AI3:::Adaptive Information.

I’ve done so via a suite of nifty tools:

  • SIOC Exporter for WordPress. The plugin files can be found at http://sw.deri.org/svn/sw/2005/08/sioc/wordpress/. SIOC, which stands for Semantically-Interlinked Online Communities, is one of the standard RDF ontologies. The SIOC Exporter is extremely easy to install as a standard WordPress plugin; it took me less than five minutes to copy the two files and activate the system from within my blog administrator. SIOC Exporter was written by Uldis Bojars and works with all WordPress versions above 1.5.
  • Okay, so that’s well and good, but what the heck does adding this plugin do? To see the RDF annotations provided by the SIOC Exporter plugin you can use the Semantic Radar extension to Mozilla/Firefox. Semantic Radar detects links to RDF metadata (via auto-discovery information) (naturally!) and displays a status bar icon at the lower right of the browser status bar. Semantic Radar is also very easy to install, only requiring a re-start of the browser. The icon appears when RDF data is detected; clicking on the icon will allow you to browse SIOC RDF data (see below), as well as FOAF and DOAP metadata. Semantic Radar was also written by Uldis Bojars.
  • The presence of Semantic Radar provides another new feature, which is to ping the Semantic Web Ping Service when metadata are detected. This allows for a community based discovery of the Semantic Web data. The Semantic Web Ping Service was written by Frederick Giasson. The purposes of the service are to notify that a new semantic web document has been published on the Web, to archive its location, and to give its location to other web services.
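For the curious, the auto-discovery step mentioned above can be sketched in a few lines. This is only my illustration of the general idea and not Semantic Radar's actual code: I am assuming pages advertise their metadata with a link element whose type is application/rdf+xml, and the function names are made up.

```javascript
// Hypothetical sketch of RDF auto-discovery; NOT Semantic Radar's actual code.
// Assumption: pages advertise metadata with
//   <link ... type="application/rdf+xml" href="...">

// Pull one quoted attribute value out of a single tag string, or null if absent.
function getAttribute(tag, name) {
  const pattern = new RegExp("\\b" + name + "\\s*=\\s*[\"']([^\"']*)[\"']", "i");
  const match = tag.match(pattern);
  return match ? match[1] : null;
}

// Return every RDF metadata link found in a page's HTML source.
function findRdfLinks(html) {
  const linkTags = html.match(/<link\b[^>]*>/gi) || [];
  const results = [];
  linkTags.forEach((tag) => {
    const type = getAttribute(tag, "type");
    const href = getAttribute(tag, "href");
    if (type && type.toLowerCase() === "application/rdf+xml" && href) {
      results.push({ href: href, title: getAttribute(tag, "title") || "" });
    }
  });
  return results;
}
```

A detector along these lines would show its icon whenever findRdfLinks returns at least one entry, and could use the title to label each source. A real extension would work from the browser's parsed page rather than regular expressions over raw HTML; the regex approach here just keeps the sketch self-contained.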

After these simple installations, I now had the SIOC icon for the Semantic Radar detector on my browser and my site was generating SIOC data due to the SIOC Exporter.  Clicking on that icon brought up the SIOC browser, which provides an entry for the blog as a whole and all posts listed on the main page:

One of the SIOC browser links, among other options, enables you to review validated RDF using the W3C service.  Here is what my triples looked like:

And, as noted above, the viewing of this data also pinged the Semantic Web Ping Service, which now showed my entries under both the SIOC and FOAF categories:

All of this was extremely easy to implement. My next phase in using such tools is to expand the breadth of the subject ontologies to parse against my blog site and to discover information extractors or metadata generators that create those attributes on the fly. Stay tuned!

Posted by AI3's author, Mike Bergman Posted on August 22, 2006 at 3:56 pm in Semantic Web, Site-related | Comments (2)
The URI link reference to this post is: https://www.mkbergman.com/260/enabling-sioc-for-ai3-adaptive-information/
The URI to trackback this post is: https://www.mkbergman.com/260/enabling-sioc-for-ai3-adaptive-information/trackback/
Posted: August 20, 2006

It is often easy to glaze over in most discussions of the semantic Web. I argue in many places that the federation of meaning (e.g., resolving semantic heterogeneities), which is the ultimate objective of the semantic Web, is the historically inevitable next step after recent accomplishments in resolving physical and data syntax heterogeneities. But such a statement hardly sounds compelling, or is compelling, in itself.

That is why examples are so powerful.

TopQuadrant has posted a really cool mashup demo of the use of its TopBraid Composer semantic modeling toolset with Google Maps, all in the context of a standard geospatial OWL ontology. The demo also simply explains how all of the pieces work together and shows why OWL ontologies make sense in the first place. The demo also shows well the use of SPARQL, the RDF query language.

To see this video in action, choose the Geography and Mapping Support link from the http://www.topbraidcomposer.com/videos.html page.

Posted by AI3's author, Mike Bergman Posted on August 20, 2006 at 9:05 pm in Adaptive Information, Semantic Web, Semantic Web Tools | Comments (0)
The URI link reference to this post is: https://www.mkbergman.com/259/owl-geospatial-mashups-with-topbraid-composer/
The URI to trackback this post is: https://www.mkbergman.com/259/owl-geospatial-mashups-with-topbraid-composer/trackback/