Analysis (TMS)

Analysis calculates the character and word counts in selected files and identifies:

- Repetitions (including cross-file repetitions)
  
  Only the first occurrence of a repeated segment is counted in its regular match-percentage or internal-fuzzy category; each subsequent occurrence is counted as a repetition. For example, a job with exactly two identical segments shows Repetitions: 1 segment, not 2.
- Translation memory matches
- Non-translatables
- Internal fuzzy matches
- Machine translation suggestions
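
The counting rule for repetitions can be illustrated with a minimal sketch. It assumes that two segments are identical when their text is exactly equal; the function name is hypothetical and the product may normalize segments before comparing them.

```python
from collections import Counter

def count_repetitions(segments):
    """Count every occurrence of a segment after its first occurrence."""
    seen = Counter()
    repetitions = 0
    for text in segments:
        if seen[text] > 0:
            repetitions += 1  # second and later occurrences are repetitions
        seen[text] += 1
    return repetitions

# Two identical segments: the first is counted normally, the second is a repetition.
print(count_repetitions(["Save the file.", "Save the file."]))  # 1
print(count_repetitions(["Open", "Save", "Open", "Open"]))      # 2
```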

Analyses can also show the number of revisions made by a reviewer.

Analyses can be created by Project managers or Administrators. Linguists cannot be given permission to run their own analyses. Vendors may create analyses for shared jobs/projects.

Some CAT tools refer to analysis as statistics.

Organizational analytics are provided by the analytics dashboard.

Since different billing units are used in different countries, three calculation methods are available:

- Characters
  
  Without spaces.
- Words
  
  For languages that use spaces between words, which excludes Chinese, Japanese, and Thai.
- Pages
  
  1800 characters with spaces. This is unrelated to the actual number of pages in a file.
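
As a rough illustration of the Characters and Pages units, the sketch below counts characters without spaces and divides the character count with spaces by 1800. The function names are hypothetical. Treating every whitespace character as a space and leaving the page value unrounded are assumptions, since the rounding rule is not stated here.

```python
def count_characters(text):
    """Characters unit: every character except whitespace."""
    return sum(1 for ch in text if not ch.isspace())

def count_pages(text, chars_per_page=1800):
    """Pages unit: characters including spaces, divided by 1800 (unrounded)."""
    return len(text) / chars_per_page

sample = "I bought a new car."
print(count_characters(sample))  # 15
print(len(sample))               # 19 characters with spaces
print(count_pages("x" * 4500))   # 2.5
```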

Word Count

Due to different counting methods across different languages, word count as presented may not be the same as word counts produced by other applications.

Word count for analysis is different from that of MTU calculation.

Tags are handled before words are counted:

- Each join tag is replaced with one space.
- Other tags are removed.

In languages that separate words with whitespace (e.g., English):

- Each number sequence, including the characters +-,. is replaced with one character (using the regular expression [+-]?[0-9]+([., -]?[0-9]++)*+).
- Each sequence of whitespace characters is replaced with one space.
- Whitespace at the beginning and the end of the segment is removed.
- Each sequence of characters other than spaces is counted as one word.
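
The steps above can be sketched in Python as follows. This is an illustration, not the product's implementation: tag handling is left out, the placeholder character is an arbitrary choice, and the possessive quantifiers (`++`, `*+`) of the documented pattern are written as ordinary greedy ones so the sketch also runs on Python versions older than 3.11. Because nothing follows them in the pattern, this should match the same text.

```python
import re

# Documented number pattern, with possessive quantifiers written as greedy ones.
NUMBER = re.compile(r"[+-]?[0-9]+(?:[., -]?[0-9]+)*")

def count_words(text, placeholder="0"):
    """Count words in a segment of a whitespace-separated language."""
    text = NUMBER.sub(placeholder, text)       # each number sequence -> one character
    text = re.sub(r"\s+", " ", text).strip()   # collapse whitespace, trim both ends
    return len(text.split(" ")) if text else 0

print(count_words("  I bought   a new car.  "))  # 5
print(count_words("Price: 1 000,50 EUR"))        # 3 ("1 000,50" is one number)
```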

In languages that do not separate words with whitespace (e.g., Japanese):

- Some punctuation marks are removed from the text (using the regular expression [\u2000-\u206F\u2E00-\u2E7F\u3000-\u3004\u3006-\u301F\p{P}]).
- The segment is split into sequences of characters belonging to the non-whitespace (NWS) scripts Han, Hiragana, Katakana, and Thai, and sequences of characters not belonging to those scripts (WS).
- Total number of words = (number of words from NWS) + (number of words from WS).
- The number of words from WS is computed as for English.
- The number of words from NWS is the number of characters without whitespace.
- For Thai, the number of words is the number of characters without whitespace divided by 2.

Note

Characters from CJK languages are counted as both characters and words.

Create an Analysis

To create an analysis, follow these steps:

From a Project page, select one or more Jobs.
Click Analyze.

The Analyze window opens.
Select a Type from the dropdown list.
Provide a name if required.
- Available macros for Analysis naming:
  - {projectName}
  - {sourceLang}
    
    Adds source language
  - {targetLang}
    
    Adds the target language. If multiple languages are analyzed, {targetLang} will be empty.
  - {userName}
    
    Adds username of the assigned Linguist or Vendor. If multiple Linguists are assigned, the name will be empty.
  - {workflow}
  - {innerId}
  - {fileName}
    
    If more than one file/job is analyzed, {fileName} will be empty.

Select analysis options. In particular:

Applying the Exclude numbers option affects the word count because numbers are not counted as words.

The Include internal fuzzies option compares segments in the analyzed job for similarities within the file as opposed to only comparing them against a TM.

If Separate internal fuzzies is checked, internal fuzzy matches are displayed as a separate category in newly created analyses. For example:

A translation job with 10 source words includes the following segments, where only the last character differs:

I bought a new car.
I bought a new car!

In case no matches are found in the TM, a default analysis will display:

| IF options | TM category: 0%-49% | TM category: 95%-99% | IF category: 95%-99% |
| --- | --- | --- | --- |
| Include IF disabled | 10 words | | |
| Include IF enabled + Separate IF disabled | 5 words | 5 words | |
| Include IF + Separate IF enabled | 5 words | | 5 words |

Click Analyze.

The analysis or analyses are added to the list.
Click on an analysis in the list to view it in the Analysis detail page or download it for rendering in a project management application.
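
The naming macros listed in the steps above behave like placeholders in a template. The sketch below shows the idea with plain string replacement. The function name and the sample values are hypothetical, and the exact text the product substitutes for each macro (for example, the language code format) is an assumption.

```python
def expand_analysis_name(template, values):
    """Replace each {macro} with its value; unavailable values become empty."""
    macros = ["projectName", "sourceLang", "targetLang", "userName",
              "workflow", "innerId", "fileName"]
    for macro in macros:
        template = template.replace("{" + macro + "}", values.get(macro) or "")
    return template

# Several files are analyzed together, so {fileName} expands to an empty string.
name = expand_analysis_name(
    "{projectName}_{sourceLang}-{targetLang}_{fileName}",
    {"projectName": "Website", "sourceLang": "en", "targetLang": "de", "fileName": None},
)
print(name)  # Website_en-de_
```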

Note

Analysis options can be set when creating an analysis, at the project level, or globally under Settings.

Three analysis types are provided:

Default Analysis

Default analysis is the standard analysis run on source segments before translation. It provides the baseline analysis of a job that can be used with the post-editing analysis to determine how much effort was put into translating that job. This baseline is also used as the basis for generating quotes for clients.

The TM threshold set in pre-translation is used by default but can be changed if required.

A breakdown of segment, word, and character counts is produced. Where they are used in the project, TM matches are identified along with non-translatables, internal fuzzy matches, and QPS (if enabled).

Important

Running a default analysis after translation produces incorrect analyses.

Post-editing Analysis

Post-editing analysis is run on target segments and indicates editing effort, that is, how much editing the text required from a linguist or proofreader. It is run after post-editing is complete.

When a linguist clicks on an untranslated segment, the current highest translation memory match, machine translation suggestion, and/or non-translatable is saved for that segment and is used in post-editing analysis.

Post-editing analysis can be launched from any workflow step and is calculated as the difference between the text inserted from an available source (e.g., TM or MT) and the post-edited result in the segment target.
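
The idea of scoring the difference between the inserted text and the post-edited target can be sketched as below. The product's actual scoring algorithm is not described here, so this example uses Python's `difflib.SequenceMatcher` purely as a stand-in similarity measure; the function name and the resulting percentages are illustrative and will not match the figures in a real analysis.

```python
from difflib import SequenceMatcher

def post_editing_score(inserted, final):
    """Similarity (0-100) between the inserted suggestion and the final target.

    Stand-in measure only: NOT the scoring algorithm used by the product.
    """
    if not inserted and not final:
        return 100
    return round(100 * SequenceMatcher(None, inserted, final).ratio())

suggestion = "Ich habe ein neues Auto gekauft."
# An unedited suggestion scores 100; the more it is edited, the lower the score.
print(post_editing_score(suggestion, suggestion))
print(post_editing_score(suggestion, "Ich habe einen neuen Wagen gekauft."))
```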

Post-editing analysis extends the traditional translation memory analysis to include machine translation (MT) and non-translatables (NT). Third-party MT engines are also supported.

Important

Disabling Analyze TM post-editing and Analyze NT/MT post-editing does not exclude TM/MT matches from the analysis. In this case, the analysis considers the score of the highest available match instead of the post-editing effort.

A 101% TM result in a post-editing analysis does not necessarily mean the segment was translated from TM.

Example

A job translated using MT can still appear as 101% TM if Analyze TM post-editing is disabled and a 101% TM match was available when the segment was opened.

Post-analysis options

Post-editing options are used for calculating the post-editing effort required for matches from the translation memory (TM), non-translatables (NT) and machine translation (MT).

Analyze TM post-editing enabled

Intended for low-quality TMs that contain high percent matches that require Linguist editing.
Indicates post-editing effort for the TM.
Contains only 100% matches in the analysis. In-context 101% matches from the TM have no effect on the calculation.

Analyze TM post-editing disabled

Intended for high-quality TM where matches should be edited as little as possible to reduce cost.
Indicates both 101% and 100% matches.
Indicates TM matches offered to the Linguist when the segment is opened (not the actual Linguist's post-editing effort).
Indicates post-editing effort for machine translation and non-translatables.

Analyze NT/MT post-editing enabled

If the MT or NT suggestion was accepted without further editing it is presented as a 100% match in the analysis.
If the Linguist changes the MT, the match rate will be lower. The score-counting algorithm is the same as that used to calculate the score of translation memory fuzzy matches.
Editing of an NT will cause the segment to be presented as 0-49% NT.

Analyze NT/MT post-editing disabled

Entries from MT/NT without any estimated score will be considered TM 0%-49% matches. They will be indicated as translated by the Linguist with the MT not considered.
QPS and Phrase Language AI matches higher than 75% will be in the MT column in their respective matches.
Indicates NT/MT matches offered to the Linguist when the segment is opened (not the actual Linguist's post-editing effort).

Automatically generate post-editing analysis before a source update

Analysis is created:
- For each updated job.
- Separately for each provider, and assigned to that provider.
Analysis is not created if:
- No linguist or vendor is assigned.
Analysis counts confirmed and translated segments.
Analysis follows the naming convention:
- UpdateSource #{innerID}{workflow}
Analysis will be created with Units counted (source), Analyze NT post-editing, Analyze TM post-editing and Analyze MT post-editing selected.

Count units of the source/target

Select which word count will be presented in the analysis. A target word count may be higher than a source word count.

Does not affect match scoring.

Compare Analysis

Available for: Team, Ultimate, and Enterprise plans (Legacy)

Get in touch with Sales for licensing questions.

The Compare analysis feature is only available in projects with workflow steps. It compares two versions of a file in different workflow steps at the segment level and analyzes how the two versions differ. If there are no project-specific settings for the analysis, default settings are used, which may result in incorrect reports.

Example

A comparison between the translation and review steps indicates the actual effort of a reviewer by identifying how much the translation changed during the review step.

Analysis can be run on multiple jobs and can be grouped in two ways:

Analyze by provider
- For a project with many jobs assigned to various Linguists or Vendors. Used to:
  - Create separate analyses containing files assigned to individual Linguists or Vendors.
  - Assign analyses to a provider making the analyses visible to their Linguists/Vendors.
  Net rate scheme will be pre-selected as an option if one is applied to the provider.
Analyze by language
- If a project contains multiple target languages, the analyses of all files can be run in a batch creating a separate analysis for each individual language.
  
  To analyze by language, follow these steps:
  1. From a Project page, select all Job files.
  2. Click Analyze.
    
    The Analyze window opens.
  3. Maintain default settings and select Analyze by language.
  4. Click Analyze.
    
    Job analysis is prepared by language.
  Note
  
  Unselect this option to create one single analysis for a multilingual project.

Analysis Recalculation

When the source file for an analysis is updated, the analysis is marked as outdated in the analysis table.

Recalculating applies settings used for the original analysis.

Vendors are not allowed to recalculate analyses created by Buyers.

To recalculate using the new source file, follow these steps:

Select the outdated analysis or analyses.
Click Recalculate.

The Recalculate analysis window stays open until the recalculation is processed. When it closes, the recalculation is complete and the outdated indicator is cleared.

Customize the Analysis view

Segments, Pages, Words, Characters, and Percents columns can be displayed/hidden in the Analysis table. The Editing time column is also available for post-editing analysis and indicates how many seconds were spent editing a segment.

Download an analysis

To download an analysis, follow these steps:

Click Download to present the dropdown menu and select:
- CSV (Comma Separated Values) with or without characters and readable with spreadsheet applications.
- LOG (Similar to SDL Trados format) and readable with most project management applications.
- JSON (JavaScript Object Notation), a lightweight data-interchange format.
Only analyses downloaded in JSON format include a breakdown of NT, MT, TM and internal fuzzies (IF) data per match type.
Selecting a file type triggers the download.

These files can be imported into most project management software systems.

Apply a net rate scheme

A discount to words/characters/pages can be applied in an analysis. A discounted translation volume is immediately calculated and displayed directly in the analysis in the Net rate row.

To remove the net rate scheme from the analysis, leave the field next to the Apply net rate button empty.

When a net rate scheme is applied to the analysis, the downloaded file with the analysis shows weighted word counts in each match category.
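
As an illustration of weighted word counts, the sketch below multiplies the word count of each match category by a percentage rate. The category names, rates, and counts are hypothetical and do not come from a real net rate scheme, and the function name is an assumption of this example.

```python
def weighted_counts(word_counts, net_rates_percent):
    """Weight each category's word count by its net rate (percent, default 100)."""
    return {
        category: words * net_rates_percent.get(category, 100) / 100
        for category, words in word_counts.items()
    }

counts = {"TM 100%": 200, "TM 95%-99%": 100, "TM 0%-49%": 500}
rates = {"TM 100%": 10, "TM 95%-99%": 30, "TM 0%-49%": 100}  # hypothetical scheme

weighted = weighted_counts(counts, rates)
print(weighted)                # per-category weighted word counts
print(sum(weighted.values()))  # 550.0 weighted words in total
```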

Assign Analysis to a Provider

To assign an analysis to a provider, follow these steps:

Select an analysis from the list and click Edit.

The editing page opens.
Select a Provider from the dropdown list.
Click Save.

The analysis will be available to the assigned provider on the linguist portal.
