Integrations with many machine translation providers are available. Machine translation can be used through Phrase Language AI or through direct connections to third-party MT engines such as DeepL, Amazon Translate, and Google Translate.
Raw machine translation
Where a perfect translation is not required, such as for internal communication, a raw machine translation can be used. Depending on the quality of the translation memory, even a raw machine translation can be of reasonable quality. A raw machine translation is generated in pre-translation.
Machine translation post-editing (MTPE)
Higher quality translation requires the review and editing of raw machine translation by professional linguists. A translation workflow should include the machine translation, the editing and review processes along with management and client notifications. Post-editing translation decisions are captured in translation memories and term bases for reuse.
From the
page, the table is presented, Phrase Language AI can be configured and third-party machine translation engines can be added.
To access machine translation engines settings, follow these steps:
The
page presents a character usage chart for all MT engines associated with a profile. MT character usage is calculated on the source text, including spaces, and when confirming segments that have no alt-trans (based on settings).
Data for a specific engine can be viewed by deselecting the names of the other engines at the bottom of the chart. The chart only displays data from the last 30 days.
By default, project managers can only see data related to projects they have created.
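As a rough illustration of how usage counted on the source text with spaces included adds up, the following minimal sketch sums the character lengths of source segments. The function name `mt_character_usage` is hypothetical, and the sketch assumes plain-text segments; it does not model how tags or the alt-trans setting affect the reported figures.

```python
def mt_character_usage(source_segments):
    """Sum the length of each source segment, counting spaces as characters.

    Assumption: segments are plain strings; tag handling is not modeled.
    """
    return sum(len(segment) for segment in source_segments)


segments = ["Hello world", "Save file"]
# "Hello world" has 11 characters and "Save file" has 9, spaces included.
print(mt_character_usage(segments))  # 20
```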
Phrase Language AI
The Phrase Language AI Add-on includes unlimited machine translation for post-editing workflows.
If Phrase Language AI is not being used, third-party machine translation engines can be accessed via an API. If using third-party MT in Phrase, ensure the terms and conditions of the given MT provider have been read and understood. Some MT engines have specific restrictions.
To set up a new machine translation engine, have an Admin user follow these steps:
From the Settings page, scroll down to the
section.
-
Click on Machine translation engines.
The
page opens.
-
Click on Create.
The
window opens.
-
Select the MT engine type from the drop-down list and click Create.
The
page opens. Fill in the required fields for the selected MT engine type.
-
Select options (not applicable to all MT engine types):
Add labels (Key and Value)
Include tags
Make default
-
Click Save.
The engine is added to the list of machine translation engines.
Some machine translation engines are not for public use, and some are only accessible in specific editions.
Raw data can be filtered and extracted from translation memories for use in training custom machine translation engines via an API.
API documentation
Data cleaning process:
Raw data from specified locale pairs is downloaded from translation memories.
-
Data is filtered.
Segments with identical source and target are discarded, tags are removed, and the data is de-duplicated.
Filtering is more relaxed for CJK languages.
-
10% of the data with the lowest semantic similarity between the source and the target is removed.
The ratio of discarded data may be defined using preserveRatio.
Data is converted to TXT/TSV based on the outputFormat setting.
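The cleaning steps above can be sketched roughly as follows. This is an illustration under stated assumptions, not the actual service implementation: the tag pattern, the order of the steps, the `keep_fraction` parameter (a stand-in whose relation to the real preserveRatio setting is not specified here), and the `toy_similarity` scorer are all hypothetical. A real pipeline would use a semantic similarity model rather than a length-based placeholder.

```python
import re

# Assumed tag syntax for illustration only: XML-like tags and {1}-style placeholders.
TAG_PATTERN = re.compile(r"<[^>]+>|\{\d+\}")


def strip_tags(text):
    """Remove inline tags and trim surrounding whitespace."""
    return TAG_PATTERN.sub("", text).strip()


def clean_pairs(pairs, similarity, keep_fraction=0.9):
    """Filter (source, target) pairs.

    1. Strip tags, drop pairs whose source equals the target, de-duplicate.
    2. Sort by similarity (highest first) and keep the top keep_fraction,
       which removes the lowest-scoring share of the data.
    """
    seen = set()
    kept = []
    for source, target in pairs:
        source, target = strip_tags(source), strip_tags(target)
        if source == target or (source, target) in seen:
            continue
        seen.add((source, target))
        kept.append((source, target))
    kept.sort(key=lambda pair: similarity(*pair), reverse=True)
    return kept[: int(len(kept) * keep_fraction)]


def to_tsv(pairs):
    """Serialize pairs as one tab-separated source/target line each."""
    return "\n".join(f"{source}\t{target}" for source, target in pairs)


def toy_similarity(source, target):
    # Placeholder score (shorter length / longer length); NOT a semantic metric.
    return min(len(source), len(target)) / max(len(source), len(target))


pairs = [
    ("Open the file", "Datei öffnen"),
    ("Open the file", "Datei öffnen"),        # duplicate, dropped
    ("OK", "OK"),                             # source equals target, dropped
    ("<b>Save</b>", "<b>Speichern</b>"),      # tags stripped
    ("Cancel", "Abbrechen"),
    ("Print", "Drucken und dann noch viel mehr Text"),  # lowest toy score
]
cleaned = clean_pairs(pairs, toy_similarity, keep_fraction=0.75)
print(to_tsv(cleaned))
```

With four unique pairs and `keep_fraction=0.75`, the single lowest-scoring pair is removed; the default of 0.9 mirrors the 10% removal described above.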
Cleaning criteria:
-
Minimum/maximum length
Each segment requires at least 5 characters and at least 3 letters in any alphabet/script. The segment must be shorter than 1000 characters. Non-conforming segments are discarded.
-
Length ratio
Language specific.
The ratio of source segment length to target segment length (and vice versa) cannot be larger than 2.
Example:
If the source segment has 30 characters and the target has 70 characters, the ratio is 70 / 30 ≈ 2.33, so the segment pair will be discarded.
-
LASER score
Segment pairs that pass initial checks are scored using the LASER metric and sorted according to the score; segment pairs scoring more than 90% are kept. LASER automatically detects whether sentences in two different languages are similar. This check is mainly targeted at misaligned/noisy TM entries.
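The length and length-ratio criteria above can be expressed as simple predicates. The sketch below is a minimal illustration with hypothetical function names; it uses a fixed ratio limit of 2 and does not model the language-specific adjustments or the relaxed filtering for CJK languages. The LASER scoring step is not covered.

```python
MIN_CHARS = 5            # at least 5 characters
MIN_LETTERS = 3          # at least 3 letters in any alphabet/script
MAX_CHARS = 1000         # segment must be shorter than this
MAX_LENGTH_RATIO = 2.0   # assumed fixed limit; the real check is language specific


def length_ok(segment):
    """Check the minimum/maximum length criterion for one segment."""
    # str.isalpha() is Unicode-aware, so letters from any script are counted.
    letters = sum(1 for ch in segment if ch.isalpha())
    return MIN_CHARS <= len(segment) < MAX_CHARS and letters >= MIN_LETTERS


def ratio_ok(source, target):
    """Check that neither segment is more than twice as long as the other."""
    shorter, longer = sorted((len(source), len(target)))
    return shorter > 0 and longer / shorter <= MAX_LENGTH_RATIO


def passes_criteria(source, target):
    """Apply both checks to a segment pair."""
    return length_ok(source) and length_ok(target) and ratio_ok(source, target)


# The example from the text: 30 vs. 70 characters gives a ratio above 2.
print(ratio_ok("a" * 30, "b" * 70))  # False
```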
Alexa Translations A.I. (formerly Yappn)
Amazon Translate
Apertium
Closed NMT
CrossLang
DeepL Pro Advanced and DeepL Pro Ultimate (contact DeepL's support for an API v1 key for CAT tools)
Fair Trade Translation
Globalese
Globalese NMT (Neural Machine Translation)
-
Google AutoML
Note
When adding Google AutoML to Phrase Language AI, Bucket name is required information due to support and potential usage of MT glossaries.
-
Google Translate
Important
As of November 3rd, 2020, all new projects created with Google Translate as the selected engine will automatically use Google’s Neural Machine Translation (NMT) engine by default instead of Google’s Statistical Machine Translation (SMT) engine.
Human Science
KantanMT
Kodensha MT
Language Weaver (formerly SDL BeGlobal)
Microsoft Custom Translator (in Microsoft Azure, set location to "global")
Microsoft Translator / Microsoft Translator Hub
Mirai Translator
MoraviaMT
NICT
NpatMT
Omniscien Technologies
PangeaMT
PROMT
Rozetta T-3MT
Rozetta T-4OO
SDL Language Cloud
Skrivanek
Sunda MT
Systran
Systran PNMT
T-tact AN-ZIN
Tauyou
Tauyou Real-time
-
Tencent TranSmart
Important
Only available through Phrase Language AI. Supports zh -> en and en -> zh translations (simplified Chinese only).
Tilde MT
Toshiba
Ubiqus NMT
Yandex