Machine Translation (TMS)

Integrations with many machine translation (MT) providers are available. Machine translation can be used through Phrase Language AI or through direct connections to third-party MT engines such as DeepL, Amazon Translate, and Google Translate.

Use Cases

Raw machine translation

Where a perfect translation is not required, such as for internal communication, raw machine translation can be used. Depending on the quality of the translation memory, even raw machine translation output can be of acceptable quality. Raw machine translation is generated in pre-translation.

Machine translation post-editing (MTPE)

Higher-quality translation requires professional linguists to review and edit the raw machine translation. A translation workflow should include machine translation, editing, and review steps, along with management and client notifications. Post-editing translation decisions are captured in translation memories and term bases for reuse.

Machine Translation Engine Settings

The Machine translation engines page presents the Engine use table. From this page, Phrase Language AI can be configured and third-party machine translation engines can be added.

To access machine translation engines settings, follow these steps:

  1. From the Settings page, scroll down to the Integrations section.

  2. Click on Machine translation engines.

Character Usage

The Machine translation engines page presents a character usage chart for all MT engines associated with a profile.

MT character usage is calculated on the source text, including spaces. Usage is also counted when segments that have no alt-trans are confirmed (based on settings).

Data for a specific engine can be viewed by deselecting the names of the other engines at the bottom of the chart. The chart only displays data from the last 30 days.

By default, project managers can only see data related to projects they have created.

Phrase Language AI

The Phrase Language AI Add-on includes unlimited machine translation for post-editing workflows.

Configure Third-party MT Engines

If Phrase Language AI is not being used, third-party machine translation engines can be accessed via an API. If using third-party MT in Phrase, ensure the terms and conditions of the given MT provider have been read and understood. Some MT engines have specific restrictions.

To set up a new machine translation engine, have an Admin user follow these steps:

  1. From the Settings page, scroll down to the Integrations section.

  2. Click on Machine translation engines.

    The Machine translation engines page opens.

  3. Click on Create.

    The Create machine translation engine window opens.

  4. Select MT engine type from the drop-down list and click Create.

    The Create page opens.

  5. Fill in required fields as per MT engine type.

  6. Select options (not applicable to all MT engine types):

    • Add labels (Key and Value)

    • Include tags

    • Make default

  7. Click Save.

    The engine is added to the list of machine translation engines.

Some machine translation engines are not for public use, and some are only accessible in specific editions.

MT Data Cleaning

Available for

  • Enterprise plan (Legacy)

Get in touch with Sales for licensing questions.

Raw data can be filtered and extracted from translation memories for use in training custom machine translation engines via an API.

API documentation

Data cleaning process:

  1. Raw data from specified locale pairs is downloaded from translation memories.

  2. Data is filtered.

    Segments whose source and target are identical are discarded, tags are removed, and the data is de-duplicated.

    Filtering is more relaxed for CJK languages.

  3. 10% of the data with the lowest semantic similarity between the source and the target is removed.

    The ratio of discarded data may be defined using preserveRatio.

  4. Data is converted to TXT/TSV based on the outputFormat setting.
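
The process above can be sketched in a few lines of Python. This is a minimal illustration, not the actual Phrase implementation: the tag pattern, the function names, and the `score` callback (a stand-in for the semantic similarity measure) are all assumptions, as is the choice to remove tags before comparing source and target. Only `preserveRatio` and the TSV output format are taken from the description above.

```python
import re

# Hypothetical tag pattern: matches inline markup such as <b>, </b>, or {1}.
TAG_PATTERN = re.compile(r"<[^>]+>|\{\d+\}")


def strip_tags(text):
    """Remove inline tags and collapse the whitespace left behind."""
    return re.sub(r"\s+", " ", TAG_PATTERN.sub("", text)).strip()


def clean_pairs(pairs, score, preserve_ratio=0.9):
    """Filter (source, target) pairs and keep the best-scoring share.

    `score` is a placeholder for a semantic similarity function that
    takes (source, target) and returns a number; higher means more similar.
    """
    seen = set()
    filtered = []
    for source, target in pairs:
        source, target = strip_tags(source), strip_tags(target)
        if source == target:
            continue  # identical source and target: discard
        if (source, target) in seen:
            continue  # duplicate pair: discard
        seen.add((source, target))
        filtered.append((source, target))

    # Sort by similarity, best first, and drop the lowest-scoring tail.
    ranked = sorted(filtered, key=lambda pair: score(*pair), reverse=True)
    keep = int(len(ranked) * preserve_ratio)
    return ranked[:keep]


def to_tsv(pairs):
    """Serialize pairs as tab-separated lines, one pair per line."""
    return "\n".join(f"{source}\t{target}" for source, target in pairs)
```

With the assumed default of `preserve_ratio=0.9`, the lowest-scoring 10% of the filtered pairs is dropped, which mirrors step 3.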

Cleaning criteria:

  • Minimum/maximum length

    Each segment requires at least 5 characters and at least 3 letters in any alphabet/script. The segment must be shorter than 1000 characters. Non-conforming segments are discarded.

  • Length ratio

    Language specific.

    The ratio of source segment length to target segment length (and vice versa) cannot be larger than 2.

    Example:

    If the source segment has 30 characters and the target has 70 characters, the ratio is 70/30 ≈ 2.33, which exceeds 2, so the segment pair is discarded.

  • LASER score

    Segment pairs that pass the initial checks are scored using the LASER metric and sorted according to the score; segment pairs scoring more than 90% are kept. LASER automatically detects whether sentences in two different languages are similar. This check is mainly targeted at misaligned/noisy TM entries.
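
The length and length-ratio criteria above can be expressed as simple checks. The following Python sketch is illustrative only: the function and constant names are hypothetical, the length criterion is assumed to apply to both source and target, and the ratio limit of 2 is used as a fixed value even though the criterion is described as language specific. LASER scoring is not included.

```python
MIN_CHARS = 5           # at least 5 characters
MIN_LETTERS = 3         # at least 3 letters in any alphabet/script
MAX_CHARS = 1000        # must be shorter than 1000 characters
MAX_LENGTH_RATIO = 2.0  # assumed fixed; described as language specific


def length_ok(segment):
    """Check the minimum/maximum length criterion for one segment."""
    # str.isalpha() is Unicode-aware, so letters in any script are counted.
    letters = sum(1 for ch in segment if ch.isalpha())
    return MIN_CHARS <= len(segment) < MAX_CHARS and letters >= MIN_LETTERS


def ratio_ok(source, target, max_ratio=MAX_LENGTH_RATIO):
    """Check the length ratio in both directions (longer / shorter)."""
    longer = max(len(source), len(target))
    shorter = min(len(source), len(target))
    return shorter > 0 and longer / shorter <= max_ratio


def passes_initial_checks(source, target):
    """Return True if the pair would go on to similarity scoring."""
    return length_ok(source) and length_ok(target) and ratio_ok(source, target)
```

For the example above, `ratio_ok` rejects a pair with a 30-character source and a 70-character target, because 70 / 30 is greater than 2.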

Supported Third-Party Machine Translation Engines

  • Alexa Translations A.I. (formerly Yappn)

  • Amazon Translate

  • Apertium

  • Closed NMT

  • CrossLang

  • DeepL Pro Advanced and DeepL Pro Ultimate (contact DeepL's support for an API v1 key for CAT tools)

  • Fair Trade Translation

  • Globalese

  • Globalese NMT (Neural Machine Translation)

  • Google AutoML

    Note

    When adding Google AutoML to Phrase Language AI, the Bucket name is required because MT glossaries are supported and may be used.

  • Google Translate

    Important

    As of November 3rd, 2020, all new projects created with Google Translate as the selected engine will automatically use Google’s Neural Machine Translation (NMT) engine by default instead of Google’s Statistical Machine Translation (SMT) engine.

  • Human Science

  • KantanMT

  • Kodensha MT

  • Language Weaver (formerly SDL BeGlobal)

  • Microsoft Custom Translator (in Microsoft Azure, set location to "global")

  • Microsoft Translator / Microsoft Translator Hub

  • Mirai Translator

  • MoraviaMT

  • NICT

  • NpatMT

  • Omniscien Technologies

  • PangeaMT

  • PROMT

  • Rozetta T-3MT

  • Rozetta T-4OO

  • SDL Language Cloud

  • Skrivanek

  • Sunda MT

  • Systran

  • Systran PNMT

  • T-tact AN-ZIN

  • Tauyou

  • Tauyou Real-time

  • Tencent TranSmart

    Important

    Only available through Phrase Language AI. Supports zh -> en and en -> zh translations (Simplified Chinese only).

  • Tilde MT

  • Toshiba

  • Ubiqus NMT

  • Yandex
