Modify Term Bases (TMS)

External terminology (or glossary) files can be imported in Excel (XLSX) or TBX format. Files of up to 1 GB can be uploaded. An exported term base contains all languages of the given term base.

To import content, follow these steps:

  1. From a term base page, click Import.

    The Import TBX/XLSX window opens.

  2. Choose a file to import:

    • XLSX

      An XML based file format for spreadsheet applications.

      XLSX is the easiest way to import terms into a term base. A plain list of terms can be imported, but more complex terminology imports are also supported (import of synonyms, morphology, terms with various attributes, etc.).

    • TBX

      An exchange format for use in other CAT tools. Can also be used for editing content in external tools such as Okapi Olifant.

  3. Select options:

    • Create new terms

    • Update existing terms

    • Strict locale matching

      Prevents the import of a language if it has a different locale than the project.

      Example:

      A file with an EN_US designation will not be imported into a term base that is designated only EN rather than EN_US.
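
The effect of Strict locale matching can be sketched in a few lines of Python. This is only an illustration of the rule described above, not the actual import logic: the function name `should_import` is hypothetical, and the behavior without strict matching (falling back to the base language) is an assumption.

```python
def should_import(file_locale: str, term_base_locales: set[str], strict: bool) -> bool:
    """Decide whether a language from the import file would be accepted.

    Hypothetical helper: locale codes are compared case-insensitively and
    "-" is treated like "_".
    """
    normalized = file_locale.lower().replace("-", "_")
    targets = {loc.lower().replace("-", "_") for loc in term_base_locales}
    if normalized in targets:
        return True
    if strict:
        # Strict matching: a different locale is never imported.
        return False
    # Assumption: without strict matching, the base language is enough.
    base = normalized.split("_")[0]
    return any(target.split("_")[0] == base for target in targets)


print(should_import("EN_US", {"en"}, strict=True))
print(should_import("EN_US", {"en_us"}, strict=True))
```

With strict matching, the first call rejects EN_US for a term base that only has EN, which mirrors the example above.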

Term Metadata in a Term Base

Every term in a term base has a list of attributes that can be exported to, or imported from TBX or XLSX files. Some of these attributes can be edited directly in the term setting or edited externally in the XLSX or TBX file.

The attributes of a term base (Client, Domain, etc.) have no effect on the individual term attributes.

XLSX Files

Term metadata such as forbidden, preferred, case, or exact are stored as Boolean (TRUE/FALSE) values in Excel. Depending on your Windows locale settings, these values may appear in your language in the XLSX export (e.g. WAHR/FALSCH for German). When editing a term base in Excel, follow the pattern and type these values in your language so that Excel can recognize them and file integrity is maintained. For other columns, use the English values listed below.

  • CID

    Phrase Concept ID. The term concept includes the source and all its targets and synonyms.

  • concept_domain

  • concept_subdomain

  • concept_url

  • concept_definition

  • concept_note

  • TID

    Phrase Term ID. The ID of the specific term in the specific language.

  • {Language code}

    A term's language code based on our supported languages.

  • status

    Either New or Approved.

  • forbidden

    True or False.

  • preferred

    True or False.

  • case

    Meaning case sensitive. The case can be either True or False.

  • exact

    Meaning exact match. This can be either True or False (for fuzzy match).

  • note

    Only the target note will be displayed in the Editor.

  • usage

    Only the target usage will be displayed in the Editor.

  • POS

    Part of Speech; values can be Adjective, Noun, Verb, or Adverb.

  • gender

    Values can be Masculine, Feminine or Neutral.

  • number

    Values can be Singular, Plural or Uncountable.

  • short_translation

  • term_type

    Values can be Full_form, Short_form, Acronym, Abbreviation, Phrase, or Variant.

  • created_by

    Only Phrase usernames are supported.

  • created_at

    Date and time of the term creation.

  • modified_by

    Only Phrase usernames are supported.

  • modified_at

    Date and time of the last modification of the term.
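
Before importing an edited file, it can help to check that the columns with a fixed set of values only contain values from the list above. The following Python sketch does this for a single row represented as a dictionary. The helper name `find_invalid_values` and the row representation are assumptions made for illustration, and the comparison is deliberately case-sensitive because this article does not say whether the import tolerates other spellings.

```python
# Allowed values, copied from the column descriptions above.
ALLOWED_VALUES = {
    "status": {"New", "Approved"},
    "POS": {"Adjective", "Noun", "Verb", "Adverb"},
    "gender": {"Masculine", "Feminine", "Neutral"},
    "number": {"Singular", "Plural", "Uncountable"},
    "term_type": {
        "Full_form", "Short_form", "Acronym",
        "Abbreviation", "Phrase", "Variant",
    },
}


def find_invalid_values(row: dict[str, str]) -> list[str]:
    """Return one message per column whose value is not in the allowed set.

    Hypothetical helper: empty or missing cells are treated as valid.
    """
    problems = []
    for column, allowed in ALLOWED_VALUES.items():
        value = row.get(column)
        if value not in (None, "") and value not in allowed:
            problems.append(f"{column}: {value!r} is not one of {sorted(allowed)}")
    return problems


print(find_invalid_values({"status": "Approved", "POS": "Noun"}))
print(find_invalid_values({"gender": "Neuter"}))
```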

TBX Files

  • <descrip type="conceptId">

    Phrase Concept ID (needed for reimporting updated terms). The term concept includes the source and all its targets and synonyms.

  • <descrip type="conceptDefinition">

  • <descrip type="conceptDomain">

  • <descrip type="conceptNote">

  • <descrip type="conceptSubdomain">

  • <descrip type="conceptUrl">

  • <langSet xml:lang="cs">

    A term's language code based on supported languages.

  • <termNote type="termId">

    Phrase Term ID (needed for reimporting updated terms). This is the ID of the specific term in the specific language.

  • <note>

    Term's note

  • <termNote type="partOfSpeech">

  • <termNote type="grammaticalGender">

  • <termNote type="grammaticalNumber">

  • <termNote type="usageNote">

  • <termNote type="forbidden">

    True or False

  • <termNote type="preferred">

    True or False

  • <termNote type="exactMatch">

    True or False

  • <termNote type="status">

    New or Approved

  • <termNote type="caseSensitive">

    True or False

  • <termNote type="createdBy">

    Phrase username

  • <termNote type="createdAt">

    Unix time

  • <termNote type="lastModifiedBy">

    Phrase username

  • <termNote type="lastModifiedAt">

    Unix time

  • <termNote type="shortTranslation">

  • <termNote type="termType">
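
To inspect these elements outside Phrase, a TBX fragment can be read with Python's standard `xml.etree.ElementTree` module. The sketch below collects the `termNote` values of one `langSet` into a dictionary and converts the True/False fields to Booleans. The sample fragment, the `<tig>`/`<term>` nesting inside it, and the helper name `read_term_notes` are assumptions for illustration; a real export may be structured differently.

```python
import xml.etree.ElementTree as ET

# The xml: prefix is predefined in XML and maps to this namespace.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"
BOOLEAN_TYPES = {"forbidden", "preferred", "exactMatch", "caseSensitive"}

# Illustrative fragment only; the nesting is an assumption.
SAMPLE = """
<langSet xml:lang="cs">
  <tig>
    <term>smlouva</term>
    <termNote type="status">Approved</termNote>
    <termNote type="forbidden">False</termNote>
    <termNote type="caseSensitive">True</termNote>
  </tig>
</langSet>
"""


def read_term_notes(lang_set_xml: str) -> dict:
    """Collect the language code and all termNote values of one langSet."""
    lang_set = ET.fromstring(lang_set_xml)
    result = {"lang": lang_set.get(XML_LANG)}
    for note in lang_set.iter("termNote"):
        kind = note.get("type")
        text = (note.text or "").strip()
        # True/False fields become Python booleans, everything else stays text.
        result[kind] = (text.lower() == "true") if kind in BOOLEAN_TYPES else text
    return result


print(read_term_notes(SAMPLE))
```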

Prepare XLSX for Import to a Term Base

XLSX files must be formatted in a specific manner before being imported.

To prepare the file, follow these steps:

  1. In the XLSX file, organize all terms into columns with each column representing one language.

  2. In the first row, apply the language code for each language.

    Example:

         A                 B                      C
    1    en                de_de                  it
    2    Agreement         Abkommen               accordo
    3    Joint Committee   Gemischte Kommission   Commissione mista
    4    Federal Council   Bundesrat              Consiglio federale

  3. Save the file.

Synonyms

Synonyms can be accommodated by adding a second column with the same language code.

Example:

en                en         de_de
Agreement         Contract   Abkommen
Joint Committee              Gemischte Kommission
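
As a rough illustration of this layout, the following Python sketch turns a list of concepts into spreadsheet rows, repeating a language code in the header once for each synonym column it needs. The function name `build_rows` and the input structure are hypothetical. Writing the rows to an actual XLSX file is left to a spreadsheet library of your choice.

```python
def build_rows(concepts: list[dict[str, list[str]]], languages: list[str]) -> list[list[str]]:
    """Build a header row plus one row per concept.

    Each concept maps a language code to its terms (first term plus synonyms).
    A language gets as many columns as its longest synonym list.
    """
    width = {
        lang: max((len(concept.get(lang, [])) for concept in concepts), default=0) or 1
        for lang in languages
    }
    header = [lang for lang in languages for _ in range(width[lang])]
    rows = [header]
    for concept in concepts:
        row = []
        for lang in languages:
            terms = concept.get(lang, [])
            # Pad with empty cells so later languages stay in their columns.
            row.extend(terms + [""] * (width[lang] - len(terms)))
        rows.append(row)
    return rows


concepts = [
    {"en": ["Agreement", "Contract"], "de_de": ["Abkommen"]},
    {"en": ["Joint Committee"], "de_de": ["Gemischte Kommission"]},
]
for row in build_rows(concepts, ["en", "de_de"]):
    print(row)
```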

Terms with Attributes

Terms can be imported with specified attributes. Some are generated by Phrase and are available only in files exported from a Phrase TB.

To apply an attribute to a term, follow these steps:

  1. Place a column with the attribute name after each term or synonym column.

  2. Place the value of the attribute in the row with the associated term.
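
The two steps above can be sketched in Python as follows. The sketch places the chosen attribute columns directly after each language column and fills in the values row by row. The function name `build_rows_with_attributes`, the input structure, and the sample note text are assumptions for illustration only.

```python
def build_rows_with_attributes(
    concepts: list[dict[str, dict[str, str]]],
    languages: list[str],
    attributes: list[str],
) -> list[list[str]]:
    """Build rows where every language column is followed by its attribute columns."""
    header = []
    for lang in languages:
        header.append(lang)
        header.extend(attributes)
    rows = [header]
    for concept in concepts:
        row = []
        for lang in languages:
            term = concept.get(lang, {})
            row.append(term.get("term", ""))
            # Missing attributes become empty cells.
            row.extend(term.get(attr, "") for attr in attributes)
        rows.append(row)
    return rows


concepts = [
    {
        "en": {"term": "Agreement", "status": "Approved"},
        "de_de": {"term": "Abkommen", "status": "New", "note": "sample note"},
    }
]
for row in build_rows_with_attributes(concepts, ["en", "de_de"], ["status", "note"]):
    print(row)
```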

Terms with Challenging Morphology

Terms that are being imported follow the same morphology rules as terms created directly in a term base.

Apart from working with synonyms and Fuzzy/Exact matches, a pipe character can be added as a boundary between the word stem (the part that does not change) and the suffix (the part that does change).

Example:

The term smíšen|ý in Czech can also come up as smíšeného, smíšenou, etc. Putting the | character before the ý ensures that all three endings will be considered matches.
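
The idea behind the pipe character can be illustrated with a very small Python sketch. It treats everything before the pipe as the stem and accepts any word that starts with it. This is a simplification under stated assumptions: the actual matching logic in Phrase is not described in this article, and the function name `stem_matches` is hypothetical.

```python
def stem_matches(term_entry: str, word: str) -> bool:
    """Check whether a word matches a term entry such as 'smíšen|ý'.

    Assumption: with a pipe, any word beginning with the stem matches;
    without a pipe, only the identical word matches.
    """
    stem, pipe, _suffix = term_entry.partition("|")
    if not pipe:
        return word.casefold() == term_entry.casefold()
    return word.casefold().startswith(stem.casefold())


for word in ("smíšený", "smíšeného", "smíšenou", "smích"):
    print(word, stem_matches("smíšen|ý", word))
```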

TBX Import Format

The TBX format is supported for terminology imports (and exports). The TBX standard is considered a loose standard. If a TBX file is imported from another CAT tool, some metadata may not get imported.

See Term Metadata in a Term Base for more details.

If importing terminology between two term bases, use the TBX format. Inside the Phrase environment, data will be correctly imported.

SDL Trados uses a special TBX.XML format and it has different specifications for import.

Multiterm TBX

The import process from Multiterm TBX files has been optimized and the following metadata will be imported:

  • Timestamps (created at, last modified at)

  • Value in element <descrip type="usageNote"> to the attribute usage of all the terms of the concept

  • Value in element <descrip type="note"> to the attribute note of all the terms of the concept

Import TBX.xml from SDL Trados

SDL Trados does not support the TBX format for term bases and uses the XML format with a TBX schema. Importing this XML format is supported but not with all attributes.

Attributes specified for the whole term concept will be added to every individual term's Note (each language, each synonym, etc.).

Imported attributes:

  • Source

  • Target

  • Synonyms

  • Date of Creation

  • Date of Modification

  • Names of Author and Reviewer

    These will be imported only if the name is the same as the username of an existing Phrase user. Either edit the names in the TBX.xml or add the users to Phrase.

  • Customized Attributes

    These will be imported into the term’s Note. Every attribute will have a separate line starting with the attribute’s name. For example:

    • Origin: Wikipedia

    • Theme: Law

    • Status: New
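
The resulting Note text can be pictured with a short Python sketch that joins each customized attribute into its own line. The function name `attributes_to_note` is hypothetical and the sketch only mirrors the format described above; it is not the import code itself.

```python
def attributes_to_note(attributes: dict[str, str]) -> str:
    """Render customized attributes as one 'Name: value' line each."""
    return "\n".join(f"{name}: {value}" for name, value in attributes.items())


print(attributes_to_note({"Origin": "Wikipedia", "Theme": "Law", "Status": "New"}))
```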

Edit the TBX.xml Before Import

To make the best use of your data, edit the TBX.xml file before importing it. To edit the file, open it in a text editor that supports multiline regular expressions in its search-and-replace feature (such as Notepad++).

Editing Note, Usage and Status

Customized attributes in TBX.xml files have the following format. Actual names of the attributes will be different since they are not standardized:

<descripGrp>
<descrip type="Comment">term =API= should not be translated</descrip>
</descripGrp>
<descripGrp>
<descrip type="Definition">API = application programming interface</descrip>
</descripGrp>
<descripGrp>
<descrip type="Example">Phrase offers a set of API calls.</descrip>
</descripGrp>
<descripGrp>
<descrip type="Status">confirmed</descrip>
</descripGrp>

These attributes will be automatically imported into the Note:

  • Comment: term =API= should not be translated

  • Definition: API = application programming interface

  • Example: Phrase offers a set of API calls.

  • Status: confirmed

This behavior can be changed so that, for example:

  • Only the Comment is imported as a Note

  • Example is imported as Usage

  • Status is imported as Approved or New

  • Definition is not imported

To do so, edit the TBX.xml file to match the Phrase format for TBX files:

<note>term =API= should not be translated</note>
<termNote type="usageNote">Phrase offers a set of API calls.</termNote>
<termNote type="status">Approved</termNote>

Changing Comment to Note

Search:

<descripGrp>.[^\<]+<descrip type="Comment">([^\<]+)</descrip>.[^\<]+</descripGrp>

Replace:

<note>\1</note>

Changing Example to Usage

Search:

<descripGrp>.[^\<]+<descrip type="Example">([^\<]+)</descrip>.[^\<]+</descripGrp>

Replace:

<termNote type="usageNote">\1</termNote>

Setting Status to Approved

Search:

<descripGrp>.[^\<]+<descrip type="Status">[^\<]+</descrip>.[^\<]+</descripGrp>

Replace:

<termNote type="status">Approved</termNote>

Deleting Definition

Search:

<descripGrp>.[^\<]+<descrip type="Definition">([^\<]+)</descrip>.[^\<]+</descripGrp>

Replace:

Leave the replacement field empty.
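
For larger files, the four replacements above can also be scripted. The following Python sketch applies them with the standard `re` module. It is an adaptation rather than a copy of the expressions above: it uses `\s*` for the whitespace between elements so that it works with any line ending and indentation, and the function names are hypothetical. As in the manual steps, every Status value is set to Approved regardless of its original content.

```python
import re


def _group(descrip_type: str) -> str:
    """Pattern for one <descripGrp> wrapping a <descrip> of the given type."""
    return (
        r'<descripGrp>\s*<descrip type="' + re.escape(descrip_type)
        + r'">([^<]+)</descrip>\s*</descripGrp>'
    )


def convert_custom_attributes(tbx_xml: str) -> str:
    """Apply the Comment, Example, Status, and Definition edits in one pass."""
    text = re.sub(_group("Comment"), r"<note>\1</note>", tbx_xml)
    text = re.sub(_group("Example"), r'<termNote type="usageNote">\1</termNote>', text)
    text = re.sub(_group("Status"), '<termNote type="status">Approved</termNote>', text)
    # Remove the Definition group together with the whitespace that follows it.
    text = re.sub(_group("Definition") + r"\s*", "", text)
    return text


SAMPLE = "\n".join([
    "<descripGrp>",
    '<descrip type="Comment">term =API= should not be translated</descrip>',
    "</descripGrp>",
    "<descripGrp>",
    '<descrip type="Definition">API = application programming interface</descrip>',
    "</descripGrp>",
    "<descripGrp>",
    '<descrip type="Example">Phrase offers a set of API calls.</descrip>',
    "</descripGrp>",
    "<descripGrp>",
    '<descrip type="Status">confirmed</descrip>',
    "</descripGrp>",
])

print(convert_custom_attributes(SAMPLE))
```

Run the script on a copy of the file and review the result before importing it.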

Adding an Author to Note

To add the author to a Note, remove the author from the <transacGrp / origination> element and add it to a <descrip> element.

<transacGrp>
<transac type="terminologyManagementTransactions">origination</transac>
<date>2006-09-27T11:25:19</date>
<transacNote type="responsibility">MikeS</transacNote>
</transacGrp>

should be replaced by:

<transacGrp>
<transac type="terminologyManagementTransactions">origination</transac>
<date>2006-09-27T11:25:19</date>
</transacGrp>
<descripGrp>
<descrip type="author">MikeS</descrip>
</descripGrp>

The regular expression will be:

Search:

(origination</transac>.[^\<]+<date>[^\<]+</date>.[^\<]+)<transacNote type="responsibility">([^\<]+)</transacNote>.[^\<]+</transacGrp>

Replace:

\1</transacGrp>\r\n<descripGrp>\r\n<descrip type="author">\2</descrip>\r\n</descripGrp>

Adding Edited by to a Note

To add Edited by to a Note, remove the editor from the <transacGrp / modification> element and add it to a <descrip> element.

<transacGrp>
<transac type="terminologyManagementTransactions">modification</transac>
<date>2006-09-27T11:25:19</date>
<transacNote type="responsibility">lauraB</transacNote>
</transacGrp>

should be replaced by:

<transacGrp>
<transac type="terminologyManagementTransactions">modification</transac>
<date>2006-09-27T11:25:19</date>
</transacGrp>
<descripGrp>
<descrip type="Edited by">lauraB</descrip>
</descripGrp>

The regular expression will be:

Search:

(modification</transac>.[^\<]+<date>[^\<]+</date>.[^\<]+)<transacNote type="responsibility">([^\<]+)</transacNote>.[^\<]+</transacGrp>

Replace:

\1</transacGrp>\r\n<descripGrp>\r\n<descrip type="Edited by">\2</descrip>\r\n</descripGrp>
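
Both this edit and the author edit before it follow the same pattern, so they can be scripted with one function. The Python sketch below is an adaptation of the search and replace expressions, not a copy: it uses `\s*` between elements and writes `\n` line endings, and the function name `move_responsibility` and its parameters are hypothetical.

```python
import re


def move_responsibility(tbx_xml: str, transaction: str, descrip_type: str) -> str:
    """Move the responsible user of a transaction into its own <descripGrp>.

    transaction is 'origination' or 'modification'; descrip_type is the
    name the value should get in the Note, e.g. 'author' or 'Edited by'.
    """
    pattern = re.compile(
        r"(" + re.escape(transaction) + r"</transac>\s*<date>[^<]+</date>\s*)"
        r'<transacNote type="responsibility">([^<]+)</transacNote>\s*</transacGrp>'
    )

    def replace(match: re.Match) -> str:
        return (
            f"{match.group(1)}</transacGrp>\n<descripGrp>\n"
            f'<descrip type="{descrip_type}">{match.group(2)}</descrip>\n</descripGrp>'
        )

    return pattern.sub(replace, tbx_xml)


SAMPLE = "\n".join([
    "<transacGrp>",
    '<transac type="terminologyManagementTransactions">modification</transac>',
    "<date>2006-09-27T11:25:19</date>",
    '<transacNote type="responsibility">lauraB</transacNote>',
    "</transacGrp>",
])

print(move_responsibility(SAMPLE, "modification", "Edited by"))
```

Calling the function with "origination" and "author" covers the author case in the same way.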

Export Term Base for Modification

If the administrator has provided rights, terms can be exported to an XLSX file for modification before being imported back. This can be used for bulk changes or deletions.

To modify terms externally, follow these steps:

  1. From a term base page, click Export.

    The Export TBX/XLSX window opens.

  2. Select XLSX as the Format.

  3. Select term attributes for export.

  4. Click Export.

    The XLSX file is created and downloaded to the system.

  5. Modify the XLSX file with required changes without deleting CID or TID information.

    Rewrite existing terms in the column for the given language to update them.

    Unlike when updating translation memories, the |update suffix on a CID or TID is not required. The update works correctly with or without it.

    New terms can also be added to the column for the given language as additional rows. They will be imported as new to the existing term base.

    Terms can be deleted in two ways:

    • Add |delete as a suffix to the CID of a term to remove the term from all languages.

    • Add |delete as a suffix to the TID of a term to delete the term from a specific language.

  6. Save the XLSX file and import it back with the option Update existing terms.
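
For bulk deletions, the |delete suffix can be added programmatically before the file is imported back. The Python sketch below shows the idea on a single row held as a dictionary. The helper name `mark_for_deletion`, the row representation, and the sample ID values are assumptions for illustration; reading and writing the XLSX file itself is left to a spreadsheet library of your choice.

```python
DELETE_SUFFIX = "|delete"


def mark_for_deletion(row: dict[str, str], id_column: str) -> dict[str, str]:
    """Return a copy of the row with |delete appended to the given ID column.

    Use the CID column to remove the term from all languages, or a TID
    column to remove it from one language. Already marked IDs are left alone.
    """
    value = row[id_column]
    if not value.endswith(DELETE_SUFFIX):
        value += DELETE_SUFFIX
    return {**row, id_column: value}


# Hypothetical row; real CID and TID values come from the export.
row = {"CID": "c1", "TID": "t1", "en": "Agreement"}
print(mark_for_deletion(row, "TID"))
print(mark_for_deletion(row, "CID"))
```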
