Unicode Text String Description Files
- Last Updated: Jan 23, 2023
Users who take advantage of the Unicode features of the base product may have issues when exporting and importing a model. The primary danger arises when catalogue profile or material names in the base product contain Unicode characters. A dictionary file may therefore be written to convert any base product strings that contain Unicode characters into equivalent strings that do not. It is these replacement strings that appear in the file. A replacement may be a description of the profile rather than a recognized profile name, so that the user of the external application is informed or alerted as to what the profile is meant to be.
Because strings other than profile names also appear in the file, this Unicode description file should be regarded as a dictionary, a look-up for any string, rather than just a profile name description table. Its format and use are therefore different from those of the other mapping files. It is applied immediately before the text strings are written out, and it needs to contain only the strings that require Unicode characters to be stripped. It is also used on file import.

This Unicode description file may be customized for each target system, in the same way as the other mapping tables. The file contains:
- An indication of the encoding to be used for the file:
  - Unicode UTF-8 with a Byte Order Mark (BOM): ENCODING:UTF8BOM
  - Unicode UTF-8 without BOM: ENCODING:UTF8noBOM
  - Force to Default encoding: ENCODING:DEFAULT
  - Force to ASCII: ENCODING:ASCII
- A set of string substitutions to be used whenever a quoted string is written to the file:
  - Because potentially any character can be used in a string, the first non-space character of each line is used as the delimiter for that line.
  - Only complete strings are substituted.
  - Comment lines are indicated by '#'.
  - Not every string needs to be in the dictionary. Strings that are not listed are passed through without error, but trigger a warning if characters are lost in the encoding.
So a dictionary for export might look like:
ENCODING:ASCII
# header texts
# non-ASCII profiles
:Γρxνxθ:Corinthian Column, 1 cubit:
# non-ASCII materials
|χάλυβα, ανθρα|Green-veined marble|
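The layout described above can be read with a few lines of code. The following Python sketch is illustrative only and is not the product's own parser: the function name `parse_dictionary` is hypothetical, and it assumes that each substitution line ends with the same delimiter it starts with (as in the example), that a line beginning with `ENCODING:` is always the encoding statement rather than a substitution, and that a substitution appearing before the encoding statement is an error.

```python
def parse_dictionary(text):
    """Parse dictionary text into (encoding, {original: replacement}).

    Hypothetical helper; the layout is inferred from the example file.
    """
    encoding = None
    substitutions = {}
    for number, raw in enumerate(text.splitlines(), start=1):
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # blank line or comment line
        if line.startswith("ENCODING:"):
            encoding = line[len("ENCODING:"):].strip()
            continue
        if encoding is None:
            # Assumption: ENCODING: must precede every substitution line.
            raise ValueError(f"line {number}: ENCODING: statement missing")
        # The first non-space character is the delimiter for this line only.
        delimiter = line[0]
        parts = line.split(delimiter)
        # A well-formed line splits into ['', original, replacement, ''].
        if len(parts) != 4 or parts[0] or parts[3]:
            raise ValueError(f"line {number}: malformed substitution")
        substitutions[parts[1]] = parts[2]
    return encoding, substitutions


sample = """ENCODING:ASCII
# non-ASCII profiles
:Γρxνxθ:Corinthian Column, 1 cubit:
# non-ASCII materials
|χάλυβα, ανθρα|Green-veined marble|
"""
encoding, substitutions = parse_dictionary(sample)
```

Choosing the delimiter per line is what allows the first entry to use ':' and the second to use '|', so neither the original string nor its replacement has to avoid any particular character other than the delimiter picked for that line.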
You are not required to prepare one of these files: the system will cope if one does not exist. The strings would simply not be translated, and you would have to take your chances with the target package. The file can also be empty, but if it contains any strings requiring description, the ENCODING: statement must be the first non-comment line in the file.
Note that text fields have a fixed maximum width. If the resulting text is too long, it is truncated and a warning is issued. Ideally, the translation should succeed with no truncation warnings, with all profiles mapped either by the main mapping file or through the Unicode description file. If, however, you decide to output UTF-8 characters, and thereby diverge from the literal interpretation of the format, truncation does not take place.
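The export-side behaviour described here (whole-string substitution, a warning when characters are lost, and truncation to the field width except for UTF-8 output) can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the product's implementation: the function name `prepare_string` is hypothetical, the default width of 24 is a placeholder because the actual field width is not given here, and replacing unrepresentable characters with '?' is simply Python's behaviour, chosen for the sketch.

```python
import warnings


def prepare_string(value, substitutions, encoding="ascii", max_width=24):
    """Return the text to write for one quoted string (illustrative only)."""
    # Only complete strings are substituted; anything else passes through.
    text = substitutions.get(value, value)

    # Characters the target encoding cannot represent become '?'.
    written = text.encode(encoding, errors="replace").decode(encoding)
    if written != text:
        warnings.warn(f"characters lost when encoding {text!r}")

    # UTF-8 output is not truncated; other encodings respect the field width.
    is_utf8 = encoding.lower().replace("-", "") == "utf8"
    if not is_utf8 and len(written) > max_width:
        warnings.warn(f"{written!r} truncated to {max_width} characters")
        written = written[:max_width]
    return written
```

With a dictionary entry that maps a Unicode profile name to "Corinthian Column, 1 cubit" and a hypothetical `max_width=10`, the call returns "Corinthian" and emits one truncation warning. A string that is not in the dictionary is written as-is, with a warning only if encoding it loses characters.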
On import the system will attempt to recognize names in the dictionary, and to translate them back into the original names. This is in addition to the usual profile and material mapping files.
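The reverse translation on import amounts to looking up each incoming name among the replacement strings. The Python sketch below shows one way this could work; the function names are hypothetical, and the check for duplicate replacements is an assumption drawn from the round-trip requirement rather than documented behaviour. If two original names shared the same replacement, the import could not tell which one to restore.

```python
def build_import_lookup(substitutions):
    """Invert {original: replacement} into {replacement: original}."""
    reverse = {}
    for original, replacement in substitutions.items():
        if replacement in reverse and reverse[replacement] != original:
            # Assumption: ambiguous replacements are treated as an error.
            raise ValueError(f"{replacement!r} maps back to more than one name")
        reverse[replacement] = original
    return reverse


def restore_name(name, reverse):
    """Return the original name if known, otherwise the name unchanged."""
    return reverse.get(name, name)
```

Names that are not in the dictionary are returned unchanged, which leaves them to be handled by the usual profile and material mapping files.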