The results of the 2005 NIST machine translation evaluation are now online.
The winner, hands down, appears to be Google, which translated texts from Arabic and Chinese into English with the best BLEU-4 scores. If you don’t know what the BLEU metric measures, NIST has an explanation:
Machine translation quality was measured automatically using an N-gram co-occurrence statistic metric developed by IBM and referred to as BLEU. BLEU measures translation accuracy according to the N-grams, or sequences of N words, that it shares with one or more high quality reference translations. Thus, the more co-occurrences the better the score. BLEU is an accuracy metric, ranging from "0" to "1" with "1" being the best possible score.
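To make the metric concrete, here is a minimal Python sketch of the core calculation: clipped n-gram precision for one candidate translation against one reference. The sentences and function name are invented for illustration, and full BLEU-4 goes further, combining the precisions for n = 1 through 4 with a geometric mean, applying a brevity penalty, and allowing several references.

from collections import Counter

def ngram_precision(candidate, reference, n):
    # Share of the candidate's n-grams that also occur in the
    # reference, counting each reference n-gram at most as often
    # as it actually appears there (the "clipping").
    cand = candidate.split()
    ref = reference.split()
    cand_ngrams = Counter(tuple(cand[i:i + n]) for i in range(len(cand) - n + 1))
    ref_ngrams = Counter(tuple(ref[i:i + n]) for i in range(len(ref) - n + 1))
    overlap = sum(min(c, ref_ngrams[g]) for g, c in cand_ngrams.items())
    total = sum(cand_ngrams.values())
    return overlap / total if total else 0.0

candidate = "the cat sat on the mat"
reference = "the cat is on the mat"
for n in range(1, 5):
    print(n, ngram_precision(candidate, reference, n))

Run on this pair, the unigram precision is 5/6 and the bigram precision 3/5; the longer the word sequences a translation shares with the reference, the higher the higher-order precisions and the better the final score.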
How does a parallel statistical translation model work? You feed the engine two parallel text streams in the languages of choice, and it learns to associate words and phrases in one language with words and phrases in the other. For example, feeding it:
Me falta el tiempo.
and the English version
I don’t have time!
The engine may associate el tiempo with time fairly strongly, since it has seen that pairing before, but learn a special rule for the phrase me falta, which in this context means I don’t have rather than anything derived regularly from to lack.
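A toy sketch of that association step, assuming nothing fancier than raw co-occurrence counting over a made-up three-pair corpus (real engines estimate translation probabilities over millions of sentence pairs with statistical alignment models such as the IBM ones):

from collections import defaultdict

# Hypothetical parallel corpus, invented for illustration.
corpus = [
    ("me falta el tiempo", "i don't have time"),
    ("el tiempo pasa", "time passes"),
    ("me falta dinero", "i don't have money"),
]

# Count how often each Spanish word co-occurs with each English
# word inside an aligned sentence pair.
cooc = defaultdict(lambda: defaultdict(int))
for src, tgt in corpus:
    for s in src.split():
        for t in tgt.split():
            cooc[s][t] += 1

# "tiempo" pairs cleanly with "time"; "falta" ties with all of
# "i", "don't", "have", a hint that it belongs to a phrase rule.
for s in ("tiempo", "falta"):
    ranked = sorted(cooc[s].items(), key=lambda kv: -kv[1])
    print(s, "->", ranked[:3])

Even on three sentences the counts point the right way: tiempo lines up with time, while falta spreads its weight evenly across i don't have, which is exactly the kind of evidence that pushes a model toward a phrase-level rule rather than a word-for-word one.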