New Crowdsourced Translation Option

This post originally appeared on my blog on Monday, May 7, 2012.

Many organizations don’t have the budget to guide them through a full translation / localization project, and some don’t even know where to start. In late 2009 I wrote about low/no-cost options from Google (machine translation) and Facebook (human-powered): Facebook and Google Want to Translate Your Site

A new option has emerged recently, covered in the Mashable piece Free Online Human Translation Service Takes On Babelfish, Google Translate. Unfortunately the writer of that piece doesn’t seem to understand the rigor that has to go into the translation process, so opportunities to provide a deeper analysis are missed in the article.

The service is called Ackuna, a free offering from a translation agency. Mashable’s suggestion that this service takes on the two translation giants on which most web users rely is silly. Google and Babelfish provide real-time machine translation; Ackuna is neither real-time nor machine-driven. Ackuna uses people to provide translation and does so at the pace of the volunteer translators.

I have already made a case against machine translation for anything other than casual or immediate needs. I almost always counsel my clients against its use, including the free Google Translate widget you can drop into a web site. There are exceptions, of course, but those are outside the scope of what I am addressing here.

Because Ackuna uses humans for translation, there are a number of questions that anyone looking to use Ackuna should ask. I detailed a set of questions in my 2009 post, but I’ll recap here (excluding the questions regarding Facebook Connect):

  1. Does Ackuna attract users who are fluent in the desired target language?
  2. Are these users willing to help translate your content for free?
  3. Is the translator a subject matter expert?
  4. Is the translator part of your target audience (including geographic and demographic breakdown)?
  5. Are you (or your client) comfortable letting unknown third parties translate your message?
  6. Is time budgeted to identify content for translation?
  7. Is time budgeted to have someone review the translation?

Ackuna’s FAQ page answers some of these questions, but doesn’t really explain how you qualify a translator. Ackuna’s translators are ranked in the site by a combination of user feedback and badges. Think upvotes and downvotes, with points determined by whether a translation (or a step) was accepted. Badges are awarded based on other translators marking submitted translations as accurate.

When it comes to deciding whether a translation is correct, assuming you don’t speak the target language, Ackuna doesn’t make any guarantees:

Use a translator’s reputation and badges as an indicator of their credibility, and take into account the comments and feedback left on each translation by other users. Use these factors and your best judgment before accepting the translation of your text.

If timing is a concern, remember that translators are providing translations because they want to. The only pay-offs for these translators are badges and points. When you have no contract and no way to pressure someone for work, there is no guarantee it will ever be completed. In case you can’t wait and decide to walk away with what’s been translated so far, from the FAQ:

How do I download my completed translation?

[…] You will not be able to view a completed translation until every segment in your project has at least one translation submitted.

Not being able to secure translations can be a bit tricky, too, especially if some of your content is sensitive or personal. Given this clause in the terms & conditions, you may want to think hard about what you post for translation:

[Y]ou give the right to Ackuna and its affiliates to store your input indefinitely and reuse it at any time and for any purpose at our discretion.

Ackuna needs critical mass to produce good translations (or translators whose profiles don’t read like hipster spam-bots). It needs many translators reviewing each other’s work to produce robust translations in timeframes that matter for businesses. Ackuna needs more users ranking one another’s work, otherwise it may be too hard to know if that Simplified Chinese translation really conveys your message properly, especially when the translators all have a similar rating. Ackuna’s bare-bones interface may not help it attract good Samaritans who just want to translate, since it’s not easy to see all the projects in one pass (you have to page through them) and the search feature doesn’t work (yet, it claims).

Ackuna itself is not a bad idea. A translation workflow and process is a necessity in any translation project and Ackuna provides some of that. If you already have translators available to you, it might even make an effective no-cost solution to manage the workflow and get others to weigh in on the work.

What Ackuna could do is counsel its users on what makes good translation, maybe even cross-selling its parent company’s services. From there it should group translations into industries or subject matter so that those with experience in them can find content more relevant to their skills. In addition, finding a way to indicate that a translator has expertise in a specific industry or region, and providing a ranking system for that expertise, can go a long way toward helping a user understand whether his or her translation is as good as it could be.

I want to be clear that I am not criticizing Ackuna (though I could be criticizing Mashable’s presentation of Ackuna). Translation is so rooted in the complexities of human language that providing it as a free service asks more than Ackuna’s technology alone can deliver. As I have commented before about free services, you get what you pay for.


HTML5 Will Play Nice with Translation

This is an update to a note originally posted on my blog on January 17. As of yesterday (February 7, 2012), this new attribute is officially part of the HTML5 specification. If you are interested you can read the part of the bug report where this change was accepted.

Back in late 2009 I wrote a little something talking about Google Translate and the risks associated with relying on machine translation for anything critical (“Facebook and Google Want to Translate Your Site”). I even offered some examples of things that are tough to translate.

One real-world example I did not list was when I used machine translation to process a page with someone’s name. The first name we’ll say was “Bill,” but the last name was definitely “Belt.” Somehow instead of “Bill Belt” being retained as his name throughout the article, he was renamed to “Bill of Leather Strap.”

This particular example is one step closer to being a thing of the past. In the latest W3C Open Web Platform Weekly Summary, a new attribute has been announced for HTML5 that will allow authors to exclude specific content from being translated — for any service that will honor it. The announcement:

A global translate attribute will be added to HTML5. The values are yes or no with the same inheritance policy than the lang attribute. The goal is to specify if a piece of text should or not should not be translated automatically.

Of course, if I want to exclude Bill Belt from being translated, I’ll probably have to wrap his name in a span in order to throw a translate="no" in there since I doubt I’ll have an otherwise semantic or structural element in place already. This does, however, offer a far better solution than the previous suggestion of using a class to achieve the same effect.
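To make the Bill Belt case concrete, here is a minimal sketch of how a translation tool might honor the attribute, including the inheritance described in the announcement. It is an illustration under my own assumptions, not how any particular service works: the class name TranslatableTextCollector, the helper split_for_translation, and the sample markup are all hypothetical, and the only real API used is Python’s standard html.parser module. Values other than yes or no are simply treated as “inherit from the parent” here.

```python
from html.parser import HTMLParser

# Elements with no closing tag; they must not be pushed on the state stack.
VOID_ELEMENTS = {"area", "base", "br", "col", "embed", "hr", "img",
                 "input", "link", "meta", "source", "track", "wbr"}


class TranslatableTextCollector(HTMLParser):
    """Hypothetical helper: sorts text into translatable and protected."""

    def __init__(self):
        super().__init__()
        self.stack = [True]      # assumed default: content is translatable
        self.translatable = []
        self.protected = []

    def handle_starttag(self, tag, attrs):
        if tag in VOID_ELEMENTS:
            return
        value = (dict(attrs).get("translate") or "").lower()
        if value == "no":
            state = False
        elif value == "yes":
            state = True
        else:
            state = self.stack[-1]   # no usable value: inherit from parent
        self.stack.append(state)

    def handle_endtag(self, tag):
        if tag not in VOID_ELEMENTS and len(self.stack) > 1:
            self.stack.pop()

    def handle_data(self, data):
        text = data.strip()
        if text:
            target = self.translatable if self.stack[-1] else self.protected
            target.append(text)


def split_for_translation(markup):
    """Return (translatable, protected) text fragments for a snippet."""
    collector = TranslatableTextCollector()
    collector.feed(markup)
    collector.close()
    return collector.translatable, collector.protected


sample = ('<p>The award went to <span translate="no">Bill Belt</span> '
          'for his work.</p>')
translatable, protected = split_for_translation(sample)
print(translatable)   # ['The award went to', 'for his work.']
print(protected)      # ['Bill Belt']
```

Only the fragments in the first list would be handed to a translator, human or machine, so the name never gets the chance to become a leather strap. The sketch assumes reasonably well-formed markup; a real tool would need to cope with unclosed tags.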

To be fair, Google Translate already has its own support for excluding content from automatic translation, specifically using class="notranslate". Head over to the Google Translate Help page and expand the bottom-most option, “General information for webmasters” (nice to see they make it easy for a direct link).
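Until every service honors the new attribute, one cautious approach is to emit both hooks on the same element. The sketch below is only an illustration of that idea: the function names protect and opts_out are my own, and the check looks at a single element’s attributes without handling inheritance from parent elements.

```python
from html import escape


def protect(text):
    """Wrap text in a span carrying both the class and the attribute."""
    return f'<span class="notranslate" translate="no">{escape(text)}</span>'


def opts_out(attrs):
    """True if an element's own attributes (a dict) exclude it from translation."""
    classes = (attrs.get("class") or "").split()
    translate = (attrs.get("translate") or "").lower()
    return "notranslate" in classes or translate == "no"


print(protect("Bill Belt"))
# <span class="notranslate" translate="no">Bill Belt</span>
```

A content management system could call something like protect() for fields that hold names, product codes, or addresses, so content authors don’t have to remember the markup themselves.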

If you are curious about the process this went through to become a change for HTML5, you can see the bug report that started it all back on April 4, 2011: Bug 12417 – HTML5 is missing attribute for specifying translatability of content.

I don’t believe that machine translation is ever a good way to translate or localize content for anything more than casual use. For example, legal matters, healthcare, and things like that are poor candidates for machine translation (I have far more to say on this point in the post linked above). For organizations that do provide manual human translation, this attribute can be a boon to them as well, allowing them to identify pieces of content that do not need to be processed, saving time, effort and cost for everyone in the translation workflow.

As developers it’s our responsibility to make sure the attribute is used correctly, most likely by helping to train content authors.