Custom AI

Automated Evaluations

Automated evaluations are provided for every MT model. Click a model name or the ellipsis in the More column to view them.

Phrase Custom AI offers detailed data and visualizations designed to provide a deeper understanding of custom NextMT model quality:

  • The Overview tab provides a summary of the evaluation results, featuring intuitive visualizations and metadata about the MT model.

    • The Performance Comparison table compares the performance of generic versus custom NextMT models across four MT quality metrics (a minimal scoring sketch follows this list). The table has two main sections:

      • Baseline Performance

        Shows automated MT quality scores for Phrase NextMT and a custom NextMT model without TM leverage.

      • RAG Performance

        Shows automated MT quality scores where TM fuzzy matches are leveraged to adapt MT output.

      The Best Engine column highlights the highest-performing model for each metric.

    • The Model metadata panel provides essential information about the evaluated custom NextMT model.

  • The Visualizations tab presents MT evaluation results graphically through donut charts, breaking down evaluated translation segments by quality category (a category-breakdown sketch follows this list).

    • Select the desired MT quality metric from the dropdown menu at the top to benchmark the custom NextMT model against the generic Phrase NextMT model.

    • Hover over a category in a donut chart to view the percentage and number of segments in that category.

  • The Evaluation sample tab presents a segment sample preview from the evaluation set, displaying a list of source segments with relevant baseline and RAG performance scores.

    When a segment is selected, the right panel displays:

    • Segment-specific scores and quality level indicators for baseline and RAG performance.

    • A comparison of the translation output generated by the custom and generic NextMT models against the reference translation from the dataset. Select Show differences in the engine output to highlight where each output diverges from the reference (see the diff sketch after this list).
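
For readers who want to reproduce this kind of comparison outside Phrase, the sketch below scores a generic and a custom output against a reference translation with two common automated MT metrics. The article does not name the four metrics Phrase uses, so chrF and BLEU from the open-source sacrebleu package serve here purely as illustrative stand-ins, and the sentences are hypothetical.

```python
# Minimal sketch of automated MT scoring like the Performance Comparison
# table. chrF and BLEU are stand-ins; the article does not name the exact
# four metrics Phrase computes.
import sacrebleu

# Hypothetical evaluation data: one reference and two engine outputs.
references = ["The quick brown fox jumps over the lazy dog."]
generic_output = ["A quick brown fox jumped over the lazy dog."]
custom_output = ["The quick brown fox jumps over a lazy dog."]

for name, outputs in [("generic", generic_output), ("custom", custom_output)]:
    bleu = sacrebleu.corpus_bleu(outputs, [references])
    chrf = sacrebleu.corpus_chrf(outputs, [references])
    print(f"{name:>7}: BLEU={bleu.score:.1f}  chrF={chrf.score:.1f}")
```

RAG Performance scores in the table would be produced the same way, only over outputs that were first adapted with TM fuzzy matches rather than raw baseline outputs.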
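
The donut-chart breakdown on the Visualizations tab amounts to bucketing per-segment scores into quality categories and reporting counts and percentages. The thresholds and category names below are hypothetical, since the article does not specify Phrase's categories; this is only a sketch of the idea.

```python
# Minimal sketch of bucketing per-segment scores into quality categories.
# Thresholds and category names are hypothetical.
from collections import Counter

def category(score: float) -> str:
    # Hypothetical cut-offs on a 0-100 metric scale.
    if score >= 80:
        return "High quality"
    if score >= 50:
        return "Medium quality"
    return "Low quality"

segment_scores = [92.4, 77.1, 48.9, 85.0, 63.2]  # hypothetical per-segment scores
counts = Counter(category(s) for s in segment_scores)
total = len(segment_scores)
for cat, n in counts.most_common():
    print(f"{cat}: {n} segments ({n / total:.0%})")
```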
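
Show differences-style highlighting can be approximated with a word-level diff against the reference. Phrase's actual rendering is not documented here; this sketch uses Python's standard difflib and brackets the spans of the engine output that diverge from the reference.

```python
# Minimal sketch of "Show differences"-style highlighting using difflib.
import difflib

def highlight_differences(output: str, reference: str) -> str:
    """Bracket spans of the engine output that differ from the reference."""
    ref_words, out_words = reference.split(), output.split()
    matcher = difflib.SequenceMatcher(None, ref_words, out_words)
    pieces = []
    for op, _i1, _i2, j1, j2 in matcher.get_opcodes():
        words = out_words[j1:j2]
        if not words:  # pure deletion from the reference; nothing to show
            continue
        pieces.append(" ".join(words) if op == "equal" else "[" + " ".join(words) + "]")
    return " ".join(pieces)

print(highlight_differences(
    "The quick brown fox jumps over a lazy dog.",
    "The quick brown fox jumps over the lazy dog.",
))
# -> The quick brown fox jumps over [a] lazy dog.
```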
