Analysis (TMS)

Analysis calculates the character and word counts in selected files and identifies average characters per word, repetitions, non-translatables, translation memory matches, internal fuzzy matches and machine translation matches. Analysis can also show the number of revisions made by a reviewer.

Analyses can be created by Project managers or Administrators. Linguists cannot be given permission to run their own analyses. Vendors may create analyses for shared jobs/projects.

Some CAT tools refer to analysis as statistics.

Organizational analytics are provided by the analytics dashboard.

Since different billing units are used in different countries, three calculation methods are available:

  • Characters 

    Without spaces.

  • Words 

    For languages that use spaces between words—excluding Chinese, Japanese, and Thai.

  • Pages 

    1800 characters with spaces—unrelated to the actual number of pages in a file.
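As a rough illustration of the three units, the sketch below computes them for a short string. It assumes that "without spaces" means all whitespace is excluded, uses a naive whitespace split for words (the actual word count rules are more detailed), and leaves pages unrounded because no rounding rule is stated. The function name `billing_units` is hypothetical.

```python
PAGE_SIZE = 1800  # characters with spaces per page, as stated above


def billing_units(text: str) -> dict:
    """Return the three billing units for a piece of text (illustrative only)."""
    chars_without_spaces = sum(1 for ch in text if not ch.isspace())
    chars_with_spaces = len(text)
    return {
        "characters": chars_without_spaces,
        # Naive whitespace split; an assumption, not the full counting rules.
        "words": len(text.split()),
        # Left unrounded because no rounding rule is stated.
        "pages": chars_with_spaces / PAGE_SIZE,
    }


units = billing_units("I bought a new car.")
print(units["characters"], units["words"])  # 15 5
```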

Word Count

Due to different counting methods across different languages, word count as presented may not be the same as word counts produced by other applications.

  • Each join tag is replaced with one space.

  • Other tags are removed.

In languages that separate words with whitespace (e.g., English):

  • Each number sequence, including the characters +-,. is replaced with one character (using the regular expression [+-]?[0-9]+([., -]?[0-9]++)*+).

  • Each sequence of whitespace characters is replaced with one space.

  • Whitespace at the beginning and the end of the segment is removed.

  • Each sequence of characters different from space is counted as one word.
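The steps above can be sketched in Python as follows. This is a minimal approximation, not the product's implementation: tag handling is left out because the tag format is not described here, the possessive quantifiers of the documented expression are replaced with ordinary greedy ones (which should match the same text, since nothing follows them in the pattern), and the function name `count_words_ws` and the placeholder character are assumptions.

```python
import re

# Documented number pattern, with possessive quantifiers (++, *+) written
# as plain greedy quantifiers so it runs on any Python 3 version.
NUMBER_RE = re.compile(r"[+-]?[0-9]+(?:[., -]?[0-9]+)*")


def count_words_ws(segment: str) -> int:
    """Count words in a segment of a whitespace-delimited language."""
    # 1. Replace each number sequence with a single placeholder character.
    text = NUMBER_RE.sub("0", segment)
    # 2. Collapse each run of whitespace into one space.
    text = re.sub(r"\s+", " ", text)
    # 3. Remove whitespace at the beginning and end of the segment.
    text = text.strip()
    # 4. Each run of non-space characters is one word.
    return len(text.split(" ")) if text else 0


print(count_words_ws("I bought 1 000 cars in 2023."))  # 6
```

Note that "1 000" collapses into one placeholder because the pattern allows a single space between digit groups, so it counts as one word.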

In languages that do not separate words with whitespace (e.g., Japanese):

  • Some punctuation marks are removed from the text (using the regular expression [\u2000-\u206F\u2E00-\u2E7F\u3000-\u3004\u3006-\u301F\\p{P}]).

  • The segment is split into sequences of characters belonging to the Han, Hiragana, Katakana, and Thai scripts (non-whitespace, NWS) and sequences of characters not belonging to those scripts (whitespace, WS).

  • Total number of words = (number of words from NWS) + (number of words from WS).

  • The number of words from WS is computed as for English.

  • The number of words from NWS is the number of characters, excluding whitespace.

Note

Characters from CJK languages are counted as both characters and words.

Create an Analysis

To create an analysis, follow these steps:

  1. From a Project page, select one or more Jobs.

  2. Click Analyze.

    The Analyze window opens.

  3. Select a Type from the dropdown list.

  4. Provide a name if required.

    • Available macros for Analysis naming:

      • {projectName}

      • {sourceLang}

        Adds source language

      • {targetLang}

        Adds target language. If multiple languages are analyzed, the language will be empty.

      • {userName}

        Adds username of the assigned Linguist or Vendor. If multiple Linguists are assigned, the name will be empty.

      • {workflow}

      • {innerId}

      • {fileName}

        If more than one file/job is analyzed, {fileName} will be empty.

  5. Select analysis options. In particular:

    • Applying the Exclude numbers option affects the word count, as numbers will not be counted as words.

    • The Include internal fuzzies option compares segments in the analyzed job for similarities within the file as opposed to only comparing them against a TM.

      If Separate internal fuzzies is checked, internal fuzzy matches are displayed as a separate category in newly created analyses. For example:

      A translation job with 10 source words includes the following segments, where only the last character differs:

      • I bought a new car.

      • I bought a new car!

      In case no matches are found in the TM, a default analysis will display:

      IF options                                  TM category: 0%-49%   TM category: 95%-99%   IF category: 95%-99%
      Include IF disabled                         10 words              -                      -
      Include IF enabled + Separate IF disabled   5 words               5 words                -
      Include IF + Separate IF enabled            5 words               -                      5 words

  6. Click Analyze.

    The analysis or analyses are added to the list.

  7. Click on an analysis in the list to view it in a simple table or download it for rendering in a project management application.
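The naming macros listed in step 4 behave like placeholders that are substituted when the analysis is created. The sketch below illustrates the idea with a hypothetical `expand_name` helper; it is not the product's implementation, and it assumes that a macro without a single value (for example {targetLang} when several languages are analyzed) is simply replaced with an empty string.

```python
import re

MACRO_RE = re.compile(r"\{(\w+)\}")


def expand_name(template: str, values: dict[str, str]) -> str:
    """Replace {macro} placeholders; missing values become empty strings."""
    return MACRO_RE.sub(lambda m: values.get(m.group(1), ""), template)


name = expand_name(
    "{projectName}-{targetLang}-{workflow}",
    # Hypothetical values; targetLang is empty because several
    # target languages are analyzed together in this example.
    {"projectName": "Website", "targetLang": "", "workflow": "Translation"},
)
print(name)  # Website--Translation
```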

Note

Analysis options can be set when creating an analysis, at the project level, or globally under Settings.

Three analysis types are provided:

Default Analysis

Default analysis is the standard analysis run on source segments before translation. It provides the baseline analysis of a job that can be used with the Post-editing Analysis to determine how much effort was put into translating that job. This baseline is also used as the basis for generating quotes for clients.

A breakdown of segment, word, and character counts is produced. If a TM is used in the project, TM matches are identified along with non-translatable matches, internal fuzzy matches, and QPS (if enabled).

Running a Default Analysis after translation produces incorrect analyses.

Post-editing Analysis

Post-editing analysis is run on target segments and indicates editing effort: how much editing the text required from a linguist or proofreader. It is run after post-editing is complete.

When a linguist clicks on an untranslated segment, the current highest translation memory match, machine translation, and/or non-translatable match is saved for that segment and is used in post-editing analysis.

Post-editing analysis can be launched from any workflow step and is calculated as the difference between the text inserted from available source (e.g. TM/MT) and the post-edited result in the segment target.

Post-editing analysis extends the traditional translation memory analysis to include machine translation (MT) and non-translatables (NT).

Post-analysis options

Post-editing options are used for calculating the post-editing effort required for matches from the translation memory (TM), non-translatables (NT) and machine translation (MT).

Analyze TM post-editing enabled

  • Intended for low-quality TMs that contain high percent matches that require Linguist editing.

  • Indicates post-editing effort for the TM.

  • Contains only 100% matches in the analysis. In-context 101% matches from the TM have no effect on the calculation.

Analyze TM post-editing disabled

  • Intended for high-quality TM where matches should be edited as little as possible to reduce cost.

  • Indicates both 101% and 100% matches.

  • Indicates TM matches offered to the Linguist when the segment is opened (not the actual Linguist's post-editing effort).

  • Indicates post-editing effort for machine translation and non-translatables.

Analyze NT/MT post-editing enabled

  • If the MT or NT suggestion was accepted without further editing it is presented as a 100% match in the analysis.

  • If the Linguist changes the MT, the match rate will be lower. The score-counting algorithm is the same as that used to calculate the score of translation memory fuzzy matches.

  • Editing of an NT will cause the segment to be presented as 0-49% NT.

Analyze NT/MT post-editing disabled

  • Entries from MT/NT without any estimated score will be considered TM 0%-49% matches. They will be indicated as translated by the Linguist with the MT not considered.

  • QPS and Phrase Language AI matches higher than 75% will be in the MT column in their respective matches.

Automatically generate post-editing analysis before a source update

  • Analysis is created:

    • For each updated job.

    • For each provider individually, and assigned to that provider.

  • Analysis is not created if:

    • No linguist or vendor is assigned.

  • Analysis counts confirmed and translated segments.

  • Analysis follows the naming convention:

    • UpdateSource #{innerID}{workflow}

  • Analysis will be created with Units counted (source), Analyze NT post-editing, Analyze TM post-editing and Analyze MT post-editing selected.

Count units of the

  • source/target

    Select which word count will be presented in the analysis. A target word count may be higher than a source word count.

    Does not affect match scoring.

Compare Analysis

Available for

  • Team, Ultimate and Enterprise plans (Legacy)

Get in touch with Sales for licensing questions.

The Compare analysis feature is only available in projects with workflow steps. It compares two versions of a file in different Workflow steps on a segment level and analyzes how the two versions differ. If there are no project-specific settings for the analysis, default settings are used, which may result in incorrect reports.

Example

A comparison between the translation and review steps indicates the actual effort of a reviewer by identifying how much the translation changed during the review step.

Analysis can be run on multiple jobs and can be grouped in two ways:

  • Analyze by provider 

    • For a project with many jobs assigned to various Linguists or Vendors. Used to:

      • Create separate analyses containing files assigned to individual Linguists or Vendors.

      • Assign analyses to a provider making the analyses visible to their Linguists/Vendors.

      Net rate scheme will be pre-selected as an option if one is applied to the provider.

  • Analyze by language 

    • If a project contains multiple target languages, the analyses of all files can be run in a batch creating a separate analysis for each individual language.

      To analyze by language, follow these steps:

      1. From a Project page, select all Job files.

      2. Click Analyze.

        The Analyze window opens.

      3. Maintain default settings and select Analyze by language.

      4. Click Analyze.

        Job analysis is prepared by language.
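Both grouping modes amount to partitioning the selected jobs by one attribute and producing one analysis per group. The sketch below illustrates this with made-up job records; the field names `file`, `target_lang`, and `provider` and the helper `group_jobs` are assumptions for illustration, not the product's data model.

```python
from collections import defaultdict


def group_jobs(jobs: list[dict], key: str) -> dict[str, list[str]]:
    """Group job file names by the given attribute (one analysis per group)."""
    groups = defaultdict(list)
    for job in jobs:
        groups[job[key]].append(job["file"])
    return dict(groups)


# Hypothetical jobs selected on a Project page.
jobs = [
    {"file": "home.html", "target_lang": "de", "provider": "Anna"},
    {"file": "home.html", "target_lang": "fr", "provider": "Luc"},
    {"file": "faq.html", "target_lang": "de", "provider": "Anna"},
]

by_language = group_jobs(jobs, "target_lang")
by_provider = group_jobs(jobs, "provider")
print(by_language)  # {'de': ['home.html', 'faq.html'], 'fr': ['home.html']}
```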

Analysis Recalculation

If the source file used for an analysis is updated, the analysis is marked as outdated (yellow warning icon) in the analysis table.

Recalculating applies settings used for the original analysis.

Vendors are not allowed to recalculate analyses created by Buyers.

To recalculate using the new source file, follow these steps:

  1. Select the outdated analysis or analyses.

  2. Click Recalculate.

    The Recalculate analysis window stays open until the recalculation is processed. When it closes, the recalculation is complete and the outdated indicator is cleared.

Customize the Analysis view

The Segments, Pages, Words, Characters, and Percents columns can be displayed or hidden in the Analysis table. The Editing time column is also available for post-editing analysis and indicates how many seconds were spent editing a segment.

Download an analysis

To download an analysis, follow these steps:

  1. Click Download to present the dropdown menu and select:

    • CSV (Comma Separated Values) with or without characters and readable with spreadsheet applications.

    • LOG (Similar to SDL Trados format) and readable with most project management applications.

    • JSON (JavaScript Object Notation), a lightweight data-interchange format.

    Only analysis downloaded in JSON format will include a breakdown of NT, MT, TM and internal fuzzies (IF) data per match type. 

  2. Selecting a file type triggers the download.

These files can be imported into most project management software systems.

Apply a net rate scheme

A discount to words/characters/pages can be applied in an analysis. A discounted translation volume is immediately calculated and displayed directly in the analysis in the Net rate row.

To remove the net rate scheme from the analysis, leave the field next to the Apply net rate button empty.

When a net rate scheme is applied to the analysis, the downloaded file with the analysis shows weighted word counts in each match category.
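As a worked example of how weighted counts could be derived, the sketch below multiplies the word count of each match category by a percentage rate and sums the results. The categories, counts, and rates are invented for illustration, and the exact calculation applied by a real net rate scheme is an assumption here.

```python
def weighted_counts(counts: dict[str, int], rates: dict[str, float]) -> dict[str, float]:
    """Weight each category's word count by its net rate (in percent)."""
    return {cat: words * rates[cat] / 100 for cat, words in counts.items()}


# Hypothetical analysis result and net rate scheme.
counts = {"100%": 200, "95%-99%": 100, "0%-49%": 700}
rates = {"100%": 10, "95%-99%": 30, "0%-49%": 100}

weighted = weighted_counts(counts, rates)
net_total = sum(weighted.values())
print(weighted)   # {'100%': 20.0, '95%-99%': 30.0, '0%-49%': 700.0}
print(net_total)  # 750.0
```

With these made-up numbers, 1,000 analyzed words are reduced to a discounted volume of 750 words.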

Assign Analysis to a Provider

To assign an analysis to a provider, follow these steps:

  1. Select an analysis from the list and click Edit.

    The editing page opens.

  2. Select a Provider from the dropdown list.

  3. Click Save.

    The analysis will be available to the assigned provider on the linguist portal.
