Files to translate

Right-to-left languages

File formats


Right-to-left languages

The justification of source and target segments depends on the project languages. By default, text is left-justified for LTR languages and right-justified for RTL languages. You can toggle between display modes by pressing Ctrl+Shift+O (the letter O, not the numeral 0). The Ctrl+Shift+O toggle has three states.

Ctrl+Shift+O changes both text input and display in OmegaT to RTL. It can be applied separately to each of the three panes (Editor, Fuzzy Matches and Glossary) by clicking the pane and then toggling the display mode. It also works in all the input fields found in OmegaT (search window, segmentation rules, etc.).

Using the RTL mode in OmegaT has no influence whatsoever on the display mode of the translated documents created in OmegaT. The display mode of the translated documents has to be changed within the application commonly used to display or modify them (check the relevant manuals for details).

Note: Mac OS X uses the same shortcut (and not Cmd+Shift+O).

Mixing RTL and LTR strings in segments

When writing purely RTL text, the default (LTR) view may be used. In many cases, however, it is necessary to embed LTR text in RTL text. Examples include OmegaT tags, product names that must remain in the LTR source language, placeholders in localization files, and numbers in text. In such cases it becomes necessary to switch to RTL mode so that the RTL (in fact bidirectional) text is displayed correctly. Note that when OmegaT is in RTL mode, both source and target are displayed in RTL mode. If the source language is LTR and the target language is RTL, or vice versa, it may therefore be necessary to toggle back and forth between RTL and LTR modes to view the source and enter the target comfortably in their respective modes.
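To judge in advance which segments are likely to need RTL mode, it can help to detect segments that mix strongly RTL and strongly LTR characters. The following is a minimal Python sketch, not part of OmegaT; the function name `direction_profile` is hypothetical, and the check relies only on the Unicode bidirectional classes exposed by the standard `unicodedata` module. Digits have a weak direction class and are deliberately not counted here.

```python
import unicodedata

# Bidirectional classes of strongly directional characters.
RTL_CLASSES = {"R", "AL"}  # Hebrew-type and Arabic-type letters
LTR_CLASSES = {"L"}        # Latin-type letters and similar


def direction_profile(segment: str) -> str:
    """Classify a segment as 'ltr', 'rtl', 'mixed' or 'neutral'.

    Only strongly directional characters are counted; digits, spaces
    and punctuation have weak or neutral classes and are ignored.
    """
    classes = {unicodedata.bidirectional(ch) for ch in segment}
    has_rtl = bool(classes & RTL_CLASSES)
    has_ltr = bool(classes & LTR_CLASSES)
    if has_rtl and has_ltr:
        return "mixed"
    if has_rtl:
        return "rtl"
    if has_ltr:
        return "ltr"
    return "neutral"
```

For instance, `direction_profile("שלום OmegaT")` reports a mixed segment, which is the situation in which switching to RTL mode is most likely to be needed.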

OmegaT tags in RTL segments

As noted above, OmegaT tags are LTR strings. When translating between RTL and LTR languages, the translator may have to toggle between LTR and RTL modes numerous times to read the tags correctly in the source and to enter them properly in the target.

If the document allows, the translator is strongly encouraged to remove style information from the original document so that as few tags as possible appear in the OmegaT interface. Follow the indications given in Hints for tags management. Frequently validate tags (see Tag validation) and produce translated documents (see below and Menu) at regular intervals to make it easier to catch any problems that arise. It should be possible to translate a plain text version of the text and to later add the necessary style in the relevant application.
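As a complement to frequent tag validation, the idea of comparing the tags in a source segment with those in its translation can be sketched in a few lines of Python. This is an illustration only and not OmegaT's own validation: the function name `tag_differences` is hypothetical, and the tag shape matched by the pattern (`<f0>`, `</f0>`, `<br1/>`) is an assumption that should be adjusted to the tags that actually appear in your segments.

```python
import re
from collections import Counter

# ASSUMPTION: tags look like <f0>, </f0> or <br1/>. Adjust the pattern
# to the tag shape that actually appears in your segments.
TAG_PATTERN = re.compile(r"</?[a-zA-Z]+\d+/?>")


def tag_differences(source: str, target: str) -> dict:
    """Return the tags whose counts differ between source and target."""
    src = Counter(TAG_PATTERN.findall(source))
    tgt = Counter(TAG_PATTERN.findall(target))
    return {
        # Tags present in the source but not (or less often) in the target.
        "missing_in_target": sorted((src - tgt).elements()),
        # Tags present in the target but not (or less often) in the source.
        "extra_in_target": sorted((tgt - src).elements()),
    }
```

The sketch compares tag counts only. It does not check tag order or nesting, which is where bidirectional text tends to be most confusing on screen.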

Creating translated RTL documents

When the translated document is created, its display direction will be the same as that of the original document. If the original document was LTR, the display direction of the target document must be changed manually to RTL in its viewing application. Each output format has specific ways to deal with RTL display; check the relevant application manuals for details.

To avoid changing the target files' display parameters each time the files are opened, it is sometimes possible to change the source file's display parameters so that they are inherited by the target files. Such modifications are possible in OpenOffice.org files, for example.


File formats

With OmegaT you can translate files in a number of file formats. These fall into two basic types: plain text formats and formatted text formats.

Plain text files

Plain text files contain text only, so translating them is as simple as typing the translation. There are several methods to specify the file's encoding so that its contents are not garbled when opened in OmegaT. Such files contain no formatting information beyond the "white space" used to align text, indicate paragraphs or insert page breaks. They cannot contain or retain information about the color, font, etc. of the text. Currently, OmegaT supports the following plain text formats:

Other plain text file types can be handled by OmegaT by associating their file extension with a supported file type (for example, .pod files could be associated with the ASCII text filter) and by pre-processing them with specific segmentation rules.
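Before choosing an encoding for a plain text file, it can be useful to test which candidate encodings decode the file without errors. The following Python sketch is independent of OmegaT; the function name `first_clean_encoding` is hypothetical, and the list of candidates is whatever you consider plausible for the file.

```python
from typing import Iterable, Optional


def first_clean_encoding(data: bytes, candidates: Iterable[str]) -> Optional[str]:
    """Return the first candidate encoding that decodes `data` without errors.

    Returns None if no candidate works. Unknown encoding names are skipped.
    """
    for name in candidates:
        try:
            data.decode(name)
        except (UnicodeDecodeError, LookupError):
            continue
        return name
    return None
```

A clean decode does not prove that the encoding is the right one, since some single-byte encodings accept almost any byte sequence. List the strictest candidates (such as UTF-8) first and check the decoded text by eye.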

Formatted text files

Formatted text files contain text as well as information such as font type, size, color, etc. They are commonly created in word processors or home page editors, and their formats are designed to retain formatting information. Such formatting information can be as simple as "this is bold" or as complex as table data with different font sizes, colors, positions, etc. In most translation jobs it is considered important for the translated document to look similar to the original. OmegaT allows you to do this by marking the characters or words that have special formatting with easy-to-manipulate tags. Simplifying the formatting of the original text greatly reduces the number of tags. Where possible, consider unifying the fonts, font sizes, colors, etc. to simplify the translation and reduce the number of possible tag errors. Each file type is handled differently in OmegaT. Specific behavior can be set up in the file filters. Currently, OmegaT supports the following formatted text formats:

Other formatted text file types can be handled by OmegaT by associating their file extension with a supported file type and by pre-processing them with specific segmentation rules.

Other file formats

Other plain text or formatted text file formats may also be processed in OmegaT.

External tools can be used to convert files to supported formats. Please remember that the translated files will need to be converted back to the original format. In this way, a number of plain text formats (including LaTeX, etc.) can be translated in OmegaT through conversion to the PO format. Similarly, a number of formatted text formats (including Microsoft Office files) can be translated in OmegaT through conversion to the OpenDocument format.

The quality of the translated file will depend on the quality of the round-trip conversion. Make sure you have tested all your options before proceeding with such conversions. Available free conversion tools include:


OpenOffice.org
OpenOffice.org official page

OmegaT does not offer direct support for the Microsoft Office formats Word, Excel and PowerPoint. However, OpenOffice.org (and variants) can be used to convert such formats to OpenDocument, which OmegaT supports natively.

Okapi Framework
Okapi for Mono, tutorial

The Text Extraction Utility from the Okapi Framework has an option for creating an OmegaT project folder tree. It is also possible to create an OmegaT-specific XLIFF file. The Java implementation of the Okapi Framework is available at Okapi.opentag.com.

Translate Toolkit
Translate Toolkit official page

The Translate Toolkit, a Python tool set, provides users with a number of converters to and from Portable Object, including converters for Mozilla .properties and DTD files, CSV files, Qt .ts files and XLIFF files. It also includes a number of tools to manipulate such files before or after their translation in OmegaT.

Po4a
po4a official page

po4a is a Debian Perl tool. It can convert file formats such as LaTeX, TeX, POD, etc. to and from Portable Object.
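Whichever tool is used, the advice to test the round-trip conversion before starting work can be made concrete for text-based formats. The Python sketch below compares a file's text before conversion with its text after converting to the intermediate format and back without translating anything. The function name `round_trip_report` is hypothetical, and the conversion itself is assumed to have been done beforehand with the tool of your choice.

```python
import difflib


def round_trip_report(original: str, round_tripped: str) -> tuple:
    """Compare text before and after a convert-and-convert-back cycle.

    Returns (similarity, diff): a line-based similarity ratio between
    0 and 1, and the list of unified-diff lines (empty when the two
    texts have identical lines).
    """
    a = original.splitlines()
    b = round_tripped.splitlines()
    similarity = difflib.SequenceMatcher(None, a, b).ratio()
    diff = list(difflib.unified_diff(a, b, "original", "round-tripped", lineterm=""))
    return similarity, diff
```

An empty diff suggests that the conversion preserves the file as far as its text lines are concerned. For binary formats such as word processor files, comparing the rendered documents visually remains necessary.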

