Phrase Data (TMS)

Phrase Data is available in two tiers:

  • Basic

    Available for

    • Team, Professional, Business and Enterprise plans

    Get in touch with Sales for licensing questions.

  • Premium

    The Premium tier offers the same access as the Basic tier, plus access to segment-level data.

    Available for

    • Business and Enterprise plans

    • Enterprise plan (Legacy)

    Get in touch with Sales for licensing questions.

Cloud data warehouses (such as Snowflake) enable customers to securely access their data through an SQL interface.

Phrase Data displays data relevant to the customer's usage of TMS from the date the customer first subscribed to Phrase to the present date. Phrase reserves the right to change the period for which the data is displayed, upon notice to the customer.

Full technical documentation of the Phrase Data integration is available.

Phrase Data aids strategic decision-making, demonstrates business impact and can justify investments. Data can be combined with broader company metrics to reveal impact on revenue, market penetration, and customer satisfaction:

  • Web traffic, such as page views, bounce rates, and user demographics.

  • Marketing data, such as click-through rates, conversion rates, and social media engagement.

  • Customer support data, such as whether localized support materials lead to faster issue resolution and higher customer satisfaction.

  • Sales figures, comparing localization costs with revenue generated from localized content.

Insights help to understand the reach and reputation of content globally and ensure a company’s message succeeds across languages.

  • Return on Investment (ROI) of Localization Efforts

    Used to support maintaining or increasing budgets by proving that investments in translation drive returns. It also points out cost savings, such as how money saved by using machine translation over time contributes to overall ROI.

  • Content Adaptation, Efficacy, and User Engagement

    Understand how localized content performs with audiences. Tracking customer retention in each locale suggests whether translations are meeting user needs. This feedback can influence a team’s quality strategy or how style is adjusted to better resonate with local users. One customer use case correlates content adaptations with engagement metrics: different prompts are used with AutoAdapt to personalize content based on demographics (e.g. age, gender, region, language), and performance uplift is then tracked at a local or campaign level.

  • Market and Language Prioritization

    Helps decide which languages or regions to focus on for maximum impact. It ensures the team allocates resources to translations that yield the greatest business return. It may also guide the sequence of new language rollouts or justify localization for markets where the company wants to grow.

  • Process Optimization & Technology Strategy

    Evaluating workflow duration and automation impact leads to continuous process improvements and helps choose the right technology. Measuring the effect of increased machine translation use on quality and speed (e.g. post-editing time and quality scores) quantifies productivity gains.

  • Internal Performance Benchmarking

    Over time, internal benchmarks such as average cost per word, average turnaround for a given content type, and average quality score become strategic targets for improvement. They reveal the results of smart practices and efficiency gains, which further justify localization programs.

Basic Tier Use Cases

In any industry that relies on translation and linguistic localization, teams need robust data insights to guide decisions. However, analytics reporting and insights differ by team (Localization, Content, Executive Stakeholders): some need operational metrics (day-to-day efficiency, throughput, costs), others strategic insights (long-term ROI, user impact, market growth).

Operational analytics help localization managers streamline workflows and manage costs on a daily basis.

  • Volume (also known as throughput)

    Tracking the amount of content translated over time (e.g. words per week/month) indicates the team’s capacity. This helps with resource planning.

  • Timeliness (also known as turnaround time)

    Turnaround time measures how long it takes to translate content from start to finish. Localization teams track whether translations are delivered on schedule or face delays. This is crucial for meeting product launch dates, SLAs, and investigating delays.

  • Vendor and linguist performance

    If external translation vendors or freelance linguists are used, the localization team will want to evaluate their performance. Metrics like turnaround time per vendor, on-time delivery per vendor and quality scores per linguist are tracked.

  • Quality metrics (Linguistic Quality Assurance)

    Measure the percentage of translations that pass QA checks on the first attempt (no rework needed), indicating effective initial translation. Similarly, detail the category and severity of the errors found, e.g. terminology, accuracy, etc.

  • Cost and efficiency metrics

    Teams typically track cost per word and total spend by project, language, or department to ensure budgets are met, as well as savings comparing the raw translation cost with the discounted cost after applying a net rate scheme, which highlights the value of maintaining TM matches and repetitions. For example, if 30% of words in a new project were translated via 100% TM matches, the team can quantify the cost saved by not retranslating those segments.

  • Leverage of Automation (MT and TM)

    • Post-editing effort, such as the average edits or time needed on MT outputs, to help evaluate MT.

    • TM leverage rate, i.e. the % of content covered by TM matches, which indicates TM reuse efficiency and cost savings.

    • MT usage rate, i.e. the % of segments initially translated by MT, which shows automation coverage and opportunities for cost/time reduction (Premium only). A sketch query illustrating these rates follows this list.
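
As a rough illustration, these automation rates can be approximated from the Premium segment-level table. The query below is a sketch only: it assumes translation_origin also carries a value such as 'tm' for TM-sourced segments (only 'mt' is confirmed in the query templates further down) and that words_processed and editing_time_ms are populated per segment.

-- Share of segments and words by translation origin over the last 28 days
SELECT
    translation_origin
    ,COUNT(DISTINCT segment_id) AS segments
    ,SUM(words_processed) AS words
    ,AVG(editing_time_ms) AS avg_editing_time_ms -- rough post-editing effort per origin
    ,ROUND(100 * SUM(words_processed) / NULLIF(SUM(SUM(words_processed)) OVER (), 0), 1) AS pct_of_words
FROM segment_statistic_v2
WHERE
    date_created >= DATEADD('day', -28, CURRENT_DATE)
    AND date_created < CURRENT_DATE
GROUP BY
    translation_origin;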

Premium Tier Use Cases

Translation Memory (TM) utilization

Track when a TM was last used and how many segments were reused over time. This helps assess TM freshness, value contribution, and whether outdated TMs are still worth maintaining.

Suggested analysis (a sketch query follows this list):

  • Use segment_statistic_v2 and filter on translation_origin = TM.

    • A TM can be identified using the translation_memory_id.

    • Aggregate segments by TM and use the date_created field to see when the TM was last used. Also calculate the average editing_time_ms and score.

  • Join with job_v2 using the job_id to extract language information e.g. target_locale.

  • Join with project_v2 using the project_id to contextualize usage by domain, client, business unit, etc.
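
A minimal sketch of these steps, assuming TM-sourced segments are stored with translation_origin = 'tm' (only 'mt' is confirmed in the query templates below); project_v2 can additionally be joined via job_v2.project_id for business context:

SELECT
    ssv.translation_memory_id
    ,jv.target_locale
    ,MAX(ssv.date_created) AS last_used
    ,COUNT(DISTINCT ssv.segment_id) AS tm_segments
    ,AVG(ssv.editing_time_ms) AS avg_editing_time_ms
    ,AVG(ssv.score) AS avg_score
FROM segment_statistic_v2 ssv
JOIN job_v2 jv
    ON ssv.job_id = jv.job_id
WHERE
    ssv.date_created >= DATEADD('day', -365, CURRENT_DATE)
    AND ssv.date_created < CURRENT_DATE
    AND ssv.translation_origin = 'tm' -- segments pre-translated from a TM (assumed value)
GROUP BY
    ssv.translation_memory_id
    ,jv.target_locale;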

Benefits:

  • Cost savings: Helps sunset unused or low-performing TMs to reduce maintenance overhead.

  • Quality: Identifies aging TMs with high post-editing effort and flags them for suitable action.

  • Efficiency: Use high-performing TMs that reduce edit time and increase auto-confirmation. Look into adjusting the TM threshold accordingly.

Machine Translation (MT) engine optimization

Compare MT engines by QPS (quality score per segment) and post-editing time, allowing for a more intelligent routing of content to the right MT engine.

Suggested analysis (a sketch query follows this list):

  • Use segment_statistic_v2 and filter on translation_origin = MT.

  • Join with machine_translate_setting_v2 using the machine_translate_setting_id.

    • An MT engine can be identified using the machine_translate_setting_name.

    • Aggregate segments by MT engine and calculate the segment count (segment_id), the average QPS, and the average editing_time_ms.

  • Join with job_v2 using the job_id to extract language information e.g. target_locale.

  • Join with project_v2 using the project_id to contextualize by content type, e.g. domain, client, business unit, etc.
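
A minimal sketch of these steps, using the joins and column names from the query templates below (averaging qps per engine is an analytical choice, not a prescribed method):

SELECT
    mtsv.machine_translate_setting_name
    ,jv.target_locale
    ,COUNT(DISTINCT ssv.segment_id) AS mt_segments
    ,AVG(ssv.qps) AS avg_qps
    ,AVG(ssv.editing_time_ms) AS avg_editing_time_ms
FROM segment_statistic_v2 ssv
JOIN machine_translate_setting_v2 mtsv
    ON ssv.machine_translate_setting_id = mtsv.machine_translate_setting_id
JOIN job_v2 jv
    ON ssv.job_id = jv.job_id
WHERE
    ssv.date_created >= DATEADD('day', -28, CURRENT_DATE)
    AND ssv.date_created < CURRENT_DATE
    AND ssv.translation_origin = 'mt' -- segments pre-translated by MT
GROUP BY
    mtsv.machine_translate_setting_name
    ,jv.target_locale;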

Benefits:

  • Lower cost: Identify and use the engines most suitable for a given content type and language, based on historically low edit rates.

  • Quantify turnaround time: By looking into the editing times per language pair, content type, etc.

Maximize auto-confirm segments (i.e. no touch content)

Identify the point at which high QPS and zero editing time make segments eligible for auto-confirmation, so that as much content as possible is excluded from human review.

Suggested analysis (a sketch query follows this list):

  • Use segment_statistic_v2:

    • Filter where segment translation_origin = MT, is_confirmed = true, editing_time_ms = 0, and confirmation_source = MT.

    • Aggregate segment count (segment_id) and word count (words_processed) by QPS.

    • Join with job_v2 using the job_id to extract language information e.g. target_locale.

    • Join with project_v2 using the project_id to extract content metadata.
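
A minimal sketch combining these filters, assuming is_confirmed is a boolean column and that confirmation_source uses the lowercase codes shown in the query templates below:

SELECT
    ssv.qps
    ,jv.target_locale
    ,COUNT(DISTINCT ssv.segment_id) AS autoconfirmed_segments
    ,SUM(ssv.words_processed) AS autoconfirmed_words
FROM segment_statistic_v2 ssv
JOIN job_v2 jv
    ON ssv.job_id = jv.job_id
WHERE
    ssv.date_created >= DATEADD('day', -28, CURRENT_DATE)
    AND ssv.date_created < CURRENT_DATE
    AND ssv.translation_origin = 'mt' -- segments pre-translated by MT
    AND ssv.is_confirmed = TRUE -- confirmed...
    AND COALESCE(ssv.editing_time_ms, 0) = 0 -- ...without any human edits
    AND LOWER(ssv.confirmation_source) = 'mt' -- auto-confirmed from the MT match
GROUP BY
    ssv.qps
    ,jv.target_locale;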

Benefits:

  • Cost Savings: Find the optimal QPS threshold to increase auto-confirmation and reduce editing time where the quality is already good enough.

  • Reduce turnaround time: Avoid content being unnecessarily reviewed.

Query templates

Sample queries to use as a guide when getting started with the relevant tables:

Pre-translated MT output from month to month

SELECT
    ssv.qps
    ,DATE(DATE_TRUNC('month', ssv.date_created)) AS date_month
    ,COUNT(DISTINCT segment_id) AS mt_segments
FROM segment_statistic_v2 ssv
WHERE
    ssv.date_created >= DATEADD('day', -365, CURRENT_DATE)
    AND ssv.date_created < CURRENT_DATE
    AND ssv.translation_origin = 'mt' -- segments pre-translated by MT
GROUP BY
    ssv.qps
    ,DATE(DATE_TRUNC('month', ssv.date_created));

Pre-translated MT quality variation across custom project metadata

SELECT
    ssv.qps
    ,pcmv.custom_field_name
    ,pcmv.custom_field_values
    ,COUNT(DISTINCT segment_id) AS mt_segments
FROM segment_statistic_v2 ssv
JOIN job_v2 jv
    ON ssv.job_id = jv.job_id
JOIN project_custom_metadata_v2 pcmv
    ON jv.project_id = pcmv.project_id
WHERE
    ssv.date_created >= DATEADD('day', -28, CURRENT_DATE)
    AND ssv.date_created < CURRENT_DATE
    AND ssv.translation_origin = 'mt' -- segments pre-translated by MT
GROUP BY
    ssv.qps
    ,pcmv.custom_field_name
    ,pcmv.custom_field_values;

Pre-translated MT quality variation across locales

SELECT
    ssv.qps
    ,jv.locale_pair
    ,COUNT(DISTINCT segment_id) AS mt_segments
FROM segment_statistic_v2 ssv
JOIN job_v2 jv
    ON ssv.job_id = jv.job_id
WHERE
    ssv.date_created >= DATEADD('day', -28, CURRENT_DATE)
    AND ssv.date_created < CURRENT_DATE
    AND ssv.translation_origin = 'mt' -- segments pre-translated by MT
GROUP BY
    ssv.qps
    ,jv.locale_pair;

Pre-translated MT volume and quality

SELECT
    ssv.qps
    ,mtsv.machine_translate_setting_type
    ,COUNT(DISTINCT segment_id) AS mt_segments
FROM segment_statistic_v2 ssv
JOIN machine_translate_setting_v2 mtsv
    ON ssv.machine_translate_setting_id = mtsv.machine_translate_setting_id
WHERE
    ssv.date_created >= DATEADD('day', -28, CURRENT_DATE)
    AND ssv.date_created < CURRENT_DATE
    AND ssv.translation_origin = 'mt' -- segments pre-translated by MT
GROUP BY
    ssv.qps
    ,mtsv.machine_translate_setting_type;

Quantity of pre-translated MT segments auto-confirmed by QPS

SELECT
    qps
    ,COUNT(DISTINCT segment_id) AS mt_segments
    ,COUNT(DISTINCT IFF(LOWER(confirmation_source) IN ('mt', 'tm', 'nt', 'ir', 'ut'), segment_id, NULL)) AS mt_segments_autoconfirmed
FROM segment_statistic_v2
WHERE
    date_created >= DATEADD('day', -28, CURRENT_DATE)
    AND date_created < CURRENT_DATE
    AND translation_origin = 'mt' -- segments pre-translated by MT
GROUP BY
    qps;

Quantity of pre-translated MT segments edited by QPS

SELECT
    qps
    ,COUNT(DISTINCT segment_id) AS mt_segments
    ,COUNT(DISTINCT IFF(COALESCE(editing_time_ms, 0) > 0, segment_id, NULL)) AS mt_segments_edited
FROM segment_statistic_v2
WHERE
    date_created >= DATEADD('day', -28, CURRENT_DATE)
    AND date_created < CURRENT_DATE
    AND translation_origin = 'mt' -- segments pre-translated by MT
GROUP BY
    qps;

Quantity of words saved from human review by reducing the QPS threshold

WITH qps_segments AS (
    SELECT
        ssv.qps
        ,SUM(words_processed) AS words_processed
    FROM segment_statistic_v2 ssv
    WHERE
        ssv.date_created >= DATEADD('day', -28, CURRENT_DATE)
        AND ssv.date_created < CURRENT_DATE
        AND ssv.translation_origin = 'mt' -- segments pre-translated by MT
    GROUP BY
        ssv.qps
)

SELECT
    qps AS new_qps_threshold
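    -- running total of MT words at or above each candidate threshold, i.e. words that could skip human review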
    ,SUM(words_processed) OVER (ORDER BY qps DESC) AS words_saved
FROM qps_segments;