Index: openacs-4/packages/acs-core-docs/www/i18n-convert.html
===================================================================
RCS file: /usr/local/cvsroot/openacs-4/packages/acs-core-docs/www/i18n-convert.html,v
diff -u -r1.20.2.2 -r1.20.2.3
--- openacs-4/packages/acs-core-docs/www/i18n-convert.html 6 Jul 2009 11:14:26 -0000 1.20.2.2
+++ openacs-4/packages/acs-core-docs/www/i18n-convert.html 11 Sep 2009 23:41:26 -0000 1.20.2.3
@@ -1,5 +1,5 @@
-
-
+
+
For multilingual websites we recommend using the UTF8 charset. In order for AOLserver to use UTF8 you need to set the config parameters OutputCharset and @@ -9,53 +9,53 @@ variable set to .UTF8. You should set this variable in the nsd-oracle run script (use the acs-core-docs/www/files/nsd-oracle.txt template file). -
Replace all text with temporary message tags. From /acs-admin/apm/
, select a
+
Replace all text with temporary message tags. From /acs-admin/apm/, select a
package and then click on
- Internationalization
, then
- Convert ADP, Tcl, and SQL files to using the
- message catalog.
This pass only changes the adp files; it does not affect catalog files or the catalog in the database.
You will now be walked through all of the selected adp pages. The UI shows you the intended changes and lets you edit or cancel them key by key.
Replace the temporary message tags in ADP files. From the same Convert ADP ...
page in /acs-admin/apm
as in the last step, repeat the process but deselect Find human language text ...
and select Replace <# ... #> tags ...
and click OK. This step replaces all of the temporary tags with "short" message lookups, inserts the message keys into the database message catalog, and then writes that catalog out to an xml file.
Replace human-readable text in TCL files with temporary tags. Examine all of the tcl files in the packages for human-readable text and replace it with temporary tags. The temporary tags in TCL are slightly different from those in ADP. If the first character in the temporary tag is an underscore (_
), then the message keys will be auto-generated from the original message text. Here is an unmodified tcl file:
-set title "Messages for $a(name) in $b(label)" -set context [list [list . "SimPlay"] \ + Internationalization, then + Convert ADP, Tcl, and SQL files to using the + message catalog. This pass only changes the adp files; it does not affect catalog files or the catalog in the database. You will now be walked through all of the selected adp pages. The UI shows you the intended changes and lets you edit or cancel them key by key.
Replace the temporary message tags in ADP files. From the same Convert ADP ... page in /acs-admin/apm as in the last step, repeat the process but deselect Find human language text ... and select Replace <# ... #> tags ... and click OK. This step replaces all of the temporary tags with "short" message lookups, inserts the message keys into the database message catalog, and then writes that catalog out to an xml file.
Replace human-readable text in TCL files with temporary tags. Examine all of the tcl files in the packages for human-readable text and replace it with temporary tags. The temporary tags in TCL are slightly different from those in ADP. If the first character in the temporary tag is an underscore (_), then the message keys will be auto-generated from the original message text. Here is an unmodified tcl file:
+set title "Messages for $a(name) in $b(label)" +set context [list [list . "SimPlay"] \ [list [export_vars -base case-admin { case_id }] \ - "Administer $a(name)"] \ - "Messages for $a(name)"] + "Administer $a(name)"] \ + "Messages for $a(name)"]
... and here is the same file after temporary message tags have been manually added:
set title <#admin_title Messages for %a.name% in %b.label%#> set context [list [list . <#_ SimPlay#>] \ [list [export_vars -base case-admin { case_id }] \ <#_ Administer %a.name%#>] \ <#_ Messages for %a.name%#>] -
Note that the message key admin_title
was manually selected, because an autogenerated key for this text, with its substitution variables, would have been very confusing.
-
Replace the temporary message tags in TCL files. Repeat step 2 for tcl files. Here is the example TCL file after conversion:
+
Note that the message key admin_title was manually selected, because an autogenerated key for this text, with its substitution variables, would have been very confusing. +
Replace the temporary message tags in TCL files. Repeat step 2 for tcl files. Here is the example TCL file after conversion:
set title [_ simulation.admin_title] set context [list [list . [_ simulation.SimPlay]] \ [list [export_vars -base case-admin { case_id }] \ [_ simulation.lt_Administer_name_gt]] \ [_ simulation.lt_Messages_for_role_pre]] -
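Before running the conversion pass, it can help to list the temporary tags in a file and see which ones carry an explicit key and which ones ask for an auto-generated key. The following is a minimal sketch in Python, not part of OpenACS; the regular expression and the function name `find_temp_tags` are assumptions made for illustration, and the sketch treats a key of exactly `_` as the auto-generation marker.

```python
import re

# Matches a temporary tag of the form <#key text#>.
# Assumption: the key is the first whitespace-delimited token after "<#".
TEMP_TAG = re.compile(r"<#(\S+)\s+(.*?)#>", re.DOTALL)


def find_temp_tags(source):
    """Return (key, text, auto) tuples for every temporary tag in source.

    auto is True when the key is "_", meaning the converter is expected
    to generate the message key from the text.
    """
    tags = []
    for match in TEMP_TAG.finditer(source):
        key = match.group(1)
        text = match.group(2).strip()
        tags.append((key, text, key == "_"))
    return tags


# Lines taken from the tagged example above.
tcl_source = "\n".join([
    "set title <#admin_title Messages for %a.name% in %b.label%#>",
    "set context [list [list . <#_ SimPlay#>]",
    "    <#_ Administer %a.name%#>]",
])

for key, text, auto in find_temp_tags(tcl_source):
    label = "auto-generated key" if auto else "explicit key " + key
    print(label + ": " + text)
```

A listing like this makes it easy to spot long texts with substitution variables that would benefit from a manually chosen key.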
Internationalize SQL Code. If there is any user-visible text in the .sql or .xql files, internationalize it the same way as for the TCL files.
Internationalize Package Parameters. - See Multilingual APM Parameters -
Internationalize Date and Time queries.
Find datetime in .xql files. Use command line tools to find suspect SQL code:
grep -r "to_char.*H" * -grep -r "to_date.*H" * -
In SQL statements, replace the format string with the ANSI standard format, YYYY-MM-DD HH24:MI:SS
and change the field name to *_ansi so that it cannot be confused with earlier, improperly formatted fields. For example,
to_char(timestamp,'MM/DD/YYYY HH:MI:SS') as foo_date_pretty
becomes
to_char(timestamp,'YYYY-MM-DD HH24:MI:SS') as foo_date_ansi
In TCL files where the date fields are used, convert the datetime from local server timezone, which is how it's stored in the database, to the user's timezone for display. Do this with the localizing function lc_time_system_to_conn
:
-set foo_date_ansi [lc_time_system_to_conn $foo_date_ansi]
When a datetime will be written to the database, first convert it from the user's local time to the server's timezone with lc_time_conn_to_system
.
-
When a datetime field will be displayed, format it using the localizing function lc_time_fmt
. lc_time_fmt takes two parameters, datetime and format code. Several format codes are usable for localization; they are placeholders that format dates with the appropriate codes for the user's locale. These codes are: %x, %X, %q, %Q, and %c.
set foo_date_pretty [lc_time_fmt $foo_date_ansi "%x %X"]
- Use the _pretty
version in your ADP page.
-
+
Internationalize SQL Code. If there is any user-visible text in the .sql or .xql files, internationalize it the same way as for the TCL files.
Internationalize Package Parameters. + See Multilingual APM Parameters +
Internationalize Date and Time queries.
Find datetime in .xql files. Use command line tools to find suspect SQL code:
grep -r "to_char.*H" * +grep -r "to_date.*H" * +
In SQL statements, replace the format string with the ANSI standard format, YYYY-MM-DD HH24:MI:SS and change the field name to *_ansi so that it cannot be confused with earlier, improperly formatted fields. For example,
to_char(timestamp,'MM/DD/YYYY HH:MI:SS') as foo_date_pretty
becomes
to_char(timestamp,'YYYY-MM-DD HH24:MI:SS') as foo_date_ansi
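One reason to prefer the ANSI format is that a 12-hour `HH` without an AM/PM marker is ambiguous, while `HH24` is not. The sketch below illustrates this with Python's `strftime`, using `%I` and `%H` as rough analogues of the SQL `HH` and `HH24` codes; it is an illustration only and does not touch the database.

```python
from datetime import datetime

afternoon = datetime(2002, 11, 18, 15, 30, 0)
early_morning = datetime(2002, 11, 18, 3, 30, 0)

# %I is a 12-hour clock, comparable to HH in the old format string.
old_style = "%m/%d/%Y %I:%M:%S"
# %H is a 24-hour clock, comparable to HH24 in the ANSI format string.
ansi_style = "%Y-%m-%d %H:%M:%S"

# Two different instants render identically in the old format.
print(afternoon.strftime(old_style))      # 11/18/2002 03:30:00
print(early_morning.strftime(old_style))  # 11/18/2002 03:30:00

# The ANSI format keeps them distinct and parses back without loss.
ansi_text = afternoon.strftime(ansi_style)
print(ansi_text)                          # 2002-11-18 15:30:00
round_trip = datetime.strptime(ansi_text, ansi_style)
print(round_trip == afternoon)            # True
```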
In TCL files where the date fields are used, convert the datetime from local server timezone, which is how it's stored in the database, to the user's timezone for display. Do this with the localizing function lc_time_system_to_conn:
+set foo_date_ansi [lc_time_system_to_conn $foo_date_ansi]
When a datetime will be written to the database, first convert it from the user's local time to the server's timezone with lc_time_conn_to_system. +
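The two conversions are inverses of each other. The following Python sketch shows the idea under simplifying assumptions: the server and user timezones are hypothetical fixed UTC offsets, and the helper names `system_to_conn` and `conn_to_system` only mirror the OpenACS procedure names; this is not how `lc_time_system_to_conn` or `lc_time_conn_to_system` are implemented.

```python
from datetime import datetime, timedelta, timezone

ANSI = "%Y-%m-%d %H:%M:%S"

# Hypothetical fixed offsets; real timezones also have daylight saving rules.
SERVER_TZ = timezone(timedelta(hours=-5))
USER_TZ = timezone(timedelta(hours=1))


def system_to_conn(ansi_text):
    """Convert an ANSI datetime string from server time to user time."""
    naive = datetime.strptime(ansi_text, ANSI)
    return naive.replace(tzinfo=SERVER_TZ).astimezone(USER_TZ).strftime(ANSI)


def conn_to_system(ansi_text):
    """Convert an ANSI datetime string from user time to server time."""
    naive = datetime.strptime(ansi_text, ANSI)
    return naive.replace(tzinfo=USER_TZ).astimezone(SERVER_TZ).strftime(ANSI)


stored = "2002-11-18 20:00:00"          # as read from the database
shown = system_to_conn(stored)          # what the user should see
print(shown)                            # 2002-11-19 02:00:00
print(conn_to_system(shown) == stored)  # True
```

Note that the date itself can change during conversion, which is why the conversion has to happen before the value is formatted for display.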
When a datetime field will be displayed, format it using the localizing function lc_time_fmt. lc_time_fmt takes two parameters, datetime and format code. Several format codes are usable for localization; they are placeholders that format dates with the appropriate codes for the user's locale. These codes are: %x, %X, %q, %Q, and %c.
set foo_date_pretty [lc_time_fmt $foo_date_ansi "%x %X"]
+ Use the _pretty version in your ADP page. +
%c: Long date and time (Mon November 18, 2002 12:00 AM) -
+
%x: Short date (11/18/02) -
+
%X: Time (12:00 AM) -
+
%q: Long date without weekday (November 18, 2002) -
+
%Q: Long date with weekday (Monday November 18, 2002)
- The "q" format strings are OpenACS additions; the rest follow unix standards (see man
- strftime
).
-
Internationalize Numbers.
- To internationalize numbers, use lc_numeric $value
, which formats the number using the appropriate decimal point and thousand separator for the locale.
-
Internationalizing Forms. When coding forms, remember to use message keys for each piece of text that is user-visible, including form option labels and button labels.
Checking the Consistency of Catalog Files. + The "q" format strings are OpenACS additions; the rest follow unix standards (see man + strftime). +
Internationalize Numbers. + To internationalize numbers, use lc_numeric $value, which formats the number using the appropriate decimal point and thousand separator for the locale. +
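To make the effect of locale-specific separators concrete, here is a small Python sketch. The `SEPARATORS` table and the function `format_number` are hypothetical stand-ins written for this example; they are not the `lc_numeric` implementation, which reads its rules from the locale data.

```python
# (thousands separator, decimal point) per locale; illustrative subset only.
SEPARATORS = {
    "en_US": (",", "."),
    "de_DE": (".", ","),
}


def format_number(value, locale_name, places=2):
    """Format value with the separators of the given locale."""
    thousands, decimal = SEPARATORS[locale_name]
    # Start from the en_US-style rendering, e.g. "1,234,567.89".
    text = f"{value:,.{places}f}"
    whole, _, fraction = text.partition(".")
    whole = whole.replace(",", thousands)
    return whole + (decimal + fraction if fraction else "")


print(format_number(1234567.891, "en_US"))  # 1,234,567.89
print(format_number(1234567.891, "de_DE"))  # 1.234.567,89
```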
Internationalizing Forms. When coding forms, remember to use message keys for each piece of text that is user-visible, including form option labels and button labels.
Checking the Consistency of Catalog Files.
This section describes how to check that the set of keys used in
message lookups in tcl, adp, and info files and the set of keys in
the catalog file are identical. The scripts below assume that
@@ -64,23 +64,23 @@
are always done with one of the valid lookups described above. The script further assumes
that you have perl installed and in your path. Run the script like
this:
-
+
acs-lang/bin/check-catalog.sh package_key
-
+
where package_key is the key of the package that you want to test. If you don't provide the package_key argument then all packages with catalog files will be checked. The script will run its checks primarily on en_US xml catalog files. -
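The consistency check boils down to comparing two sets of keys. The sketch below shows that comparison in Python; it is not the logic of check-catalog.sh. The regular expressions, the function names, and the simplification that catalog keys carry their package prefix are all assumptions made for this example.

```python
import re

# Assumed lookup forms: #package.key# in ADP files, [_ package.key] in Tcl.
ADP_LOOKUP = re.compile(r"#([\w-]+\.[\w-]+)#")
TCL_LOOKUP = re.compile(r"\[_ ([\w-]+\.[\w-]+)\]")


def used_keys(adp_text, tcl_text):
    """Collect every message key looked up in the given sources."""
    return set(ADP_LOOKUP.findall(adp_text)) | set(TCL_LOOKUP.findall(tcl_text))


def compare(used, catalog):
    """Return (missing from catalog, unused in code), each sorted."""
    return sorted(used - catalog), sorted(catalog - used)


adp = '<a title="#bug-tracker.View_the_bug_fo_component#">#bug-tracker.one_bug#</a>'
tcl = "set title [_ bug-tracker.New_Bug]"
catalog = {"bug-tracker.one_bug", "bug-tracker.New_Bug", "bug-tracker.N_bugs"}

missing, unused = compare(used_keys(adp, tcl), catalog)
print("missing from catalog:", missing)
print("unused in code:", unused)
```

Both lists should be empty for a package whose lookups and catalog file agree.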
Replace complicated keys with longer, simpler keys. When writing in one language, it is possible to create clever code to make correct text. In English, for example, you can put an if
command at the end of a word which adds "s" if a count is anything but 1. This pluralizes nouns correctly based on the data. However, it is confusing to read and, when internationalized, may result in message keys that are both confusing and impossible to set correctly in some languages. While internationalizing, watch out that the automated converter does not create such keys. Also, refactor compound text as you encounter it.
The automated system can easily get confused by tags within message texts, so that it tries to create two or three message keys for one long string with a tag in the middle. In these cases, uncheck those keys during the conversion and then edit the files directly. For example, this code:
<p class="form-help-text"><b>Invitations</b> are sent, - when this wizard is completed and casting begins.</p>
has a bold tag which confuses the converter into thinking there are two message keys for the text beginning "Invitations ..." where there should be one.
Instead, we cancel those keys, edit the file manually, and put in a single temporary message tag:
<p class="form-help-text"> <#Invitations_are_sent <b>Invitations</b> are sent, +
Replace complicated keys with longer, simpler keys. When writing in one language, it is possible to create clever code to make correct text. In English, for example, you can put an if command at the end of a word which adds "s" if a count is anything but 1. This pluralizes nouns correctly based on the data. However, it is confusing to read and, when internationalized, may result in message keys that are both confusing and impossible to set correctly in some languages. While internationalizing, watch out that the automated converter does not create such keys. Also, refactor compound text as you encounter it.
The automated system can easily get confused by tags within message texts, so that it tries to create two or three message keys for one long string with a tag in the middle. In these cases, uncheck those keys during the conversion and then edit the files directly. For example, this code:
<p class="form-help-text"><b>Invitations</b> are sent, + when this wizard is completed and casting begins.</p>
has a bold tag which confuses the converter into thinking there are two message keys for the text beginning "Invitations ..." where there should be one.
Instead, we cancel those keys, edit the file manually, and put in a single temporary message tag:
<p class="form-help-text"> <#Invitations_are_sent <b>Invitations</b> are sent, when this wizard is completed and casting begins.#> - </p>
Complex if statements may produce convoluted message keys that are very hard to localize. Rewrite these if statements. For example:
Select which case <if @simulation.casting_type@ eq "open">and + </p>
Complex if statements may produce convoluted message keys that are very hard to localize. Rewrite these if statements. For example:
Select which case <if @simulation.casting_type@ eq "open">and role</if> to join, or create a new case for yourself. If you do not -select a case <if @simulation.casting_type@ eq "open">and role</if> +select a case <if @simulation.casting_type@ eq "open">and role</if> to join, you will be automatically assigned to a case <if -@simulation.casting_type@ eq "open">and role</if> when the -simulation begins.
... can be rewritten:
<if @simulation.casting_type@ eq "open"> +@simulation.casting_type@ eq "open">and role</if> when the +simulation begins.
... can be rewritten:
<if @simulation.casting_type@ eq "open"> Select which case and role to join, or create a new case for yourself. If you do not select a case and role to join, you will @@ -96,7 +96,7 @@ begins. </else>
Another example, where bugs are concatenated with a number:
<if @components.view_bugs_url@ not nil> - <a href="@components.view_bugs_url@" title="View the @pretty_names.bugs@ for this component"> + <a href="@components.view_bugs_url@" title="View the @pretty_names.bugs@ for this component"> </if> @components.num_bugs@ <if @components.num_bugs@ eq 1> @@ -110,7 +110,7 @@ </if> <if @components.view_bugs_url@ not nil> -<a href="@components.view_bugs_url@" title="#bug-tracker.View_the_bug_fo_component#"> +<a href="@components.view_bugs_url@" title="#bug-tracker.View_the_bug_fo_component#"> </if> @components.num_bugs@ <if @components.num_bugs@ eq 1> @@ -124,39 +124,39 @@ </if>
It would probably be better to do this as something like:
<if @components.view_bugs_url@ not nil> <if @components.num_bugs@ eq 1> - <a href="@components.view_bugs_url@" title="#bug-tracker.View_the_bug_fo_component#">#bug-tracker.one_bug#</a> + <a href="@components.view_bugs_url@" title="#bug-tracker.View_the_bug_fo_component#">#bug-tracker.one_bug#</a> </if><else> - <a href="@components.view_bugs_url@" title="#bug-tracker.View_the_bug_fo_component#">#bug-tracker.N_bugs#</a> + <a href="@components.view_bugs_url@" title="#bug-tracker.View_the_bug_fo_component#">#bug-tracker.N_bugs#</a> </else> -</if>
Don't combine keys in display text. Converting a phrase from one language to another is usually more complicated than simply replacing each word with an equivalent. When several keys are concatenated, the resulting word order will not be correct for every language. Different languages may use expressions or idioms that don't match the phrase key-for-key. Create complete, distinct keys instead of building text from several keys. For example:
Original code:
multirow append links "New [bug_tracker::conn Bug]"
Problematic conversion:
multirow append links "[_ bug-tracker.New] [bug_tracker::conn Bug]"
Better conversion:
set bug_label [bug_tracker::conn Bug] -multirow append links "[_ bug-tracker.New_Bug]" "${url_prefix}bug-add"
... and include the variable in the key: "New %bug_label%"
. This gives translators more control over the phrase.
In this example of bad i18n, the full name is created by concatenating the first and last names (admittedly, this is pervasive in the toolkit):
<a href="@past_version.maintainer_url@" title="#bug-tracker.Email# @past_version.maintainer_email@"> -@past_version.maintainer_first_names@ @past_version.maintainer_last_name@</a>
Avoid unnecessary duplicate keys. When phrases are exactly the same in several places, use a single key.
For common words such as - Yes and No, you can use a library of keys at acs-kernel. +</if>
Don't combine keys in display text. Converting a phrase from one language to another is usually more complicated than simply replacing each word with an equivalent. When several keys are concatenated, the resulting word order will not be correct for every language. Different languages may use expressions or idioms that don't match the phrase key-for-key. Create complete, distinct keys instead of building text from several keys. For example:
Original code:
multirow append links "New [bug_tracker::conn Bug]"
Problematic conversion:
multirow append links "[_ bug-tracker.New] [bug_tracker::conn Bug]"
Better conversion:
set bug_label [bug_tracker::conn Bug] +multirow append links "[_ bug-tracker.New_Bug]" "${url_prefix}bug-add"
... and include the variable in the key: "New %bug_label%". This gives translators more control over the phrase.
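The benefit of a single key with an embedded variable is that each translation decides where the variable goes. The Python sketch below illustrates this with a toy catalog; the `CATALOG` dictionary, the `lookup` function, and the made-up locale `xx_XX` with its message text are hypothetical and exist only for this example.

```python
import re

# Toy catalog: one complete message per locale, with a %variable% placeholder.
CATALOG = {
    "en_US": {"bug-tracker.New_Bug": "New %bug_label%"},
    # Made-up locale whose word order puts the label first.
    "xx_XX": {"bug-tracker.New_Bug": "%bug_label% (new)"},
}


def lookup(locale_name, key, values):
    """Fetch the message for key and substitute %name% placeholders."""
    template = CATALOG[locale_name][key]
    return re.sub(r"%(\w+)%", lambda m: str(values[m.group(1)]), template)


values = {"bug_label": "Bug"}
print(lookup("en_US", "bug-tracker.New_Bug", values))  # New Bug
print(lookup("xx_XX", "bug-tracker.New_Bug", values))  # Bug (new)
```

With two concatenated keys, the second word order could not be produced without changing the code.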
In this example of bad i18n, the full name is created by concatenating the first and last names (admittedly, this is pervasive in the toolkit):
<a href="@past_version.maintainer_url@" title="#bug-tracker.Email# @past_version.maintainer_email@"> +@past_version.maintainer_first_names@ @past_version.maintainer_last_name@</a>
Avoid unnecessary duplicate keys. When phrases are exactly the same in several places, use a single key.
For common words such as
+ Yes and No, you can use a library of keys at acs-kernel.
For example, instead of using
- myfirstpackage.Yes
, you
- can use acs-kernel.Yes
.
- You can also use the Message Key Search facility to find duplicates.
+ myfirstpackage.Yes, you
+ can use acs-kernel.Yes.
+ You can also use the Message Key Search facility to find duplicates.
Be careful, however, when building up sentences from keys
because grammar and other elements may not be consistent
- across different locales.
Additional discussion: Re: - Bug 961 ("Control Panel" displayed instead of - "Administer"), Translation - server upgraded, and Localization questions.
Don't internationalize internal code words. Many packages use code words or key words, such as "open" and "closed", which will never be shown to the user. They may match key values in the database, or be used in a switch or if statement. Don't change these.
For example, the original code is
workflow::case::add_log_data \ + across different locales.Additional discussion: Re: + Bug 961 ("Control Panel" displayed instead of + "Administer"), Translation + server upgraded, and Localization questions.
Don't internationalize internal code words. Many packages use code words or key words, such as "open" and "closed", which will never be shown to the user. They may match key values in the database, or be used in a switch or if statement. Don't change these.
For example, the original code is
workflow::case::add_log_data \ -entry_id $entry_id \ - -key "resolution" \ + -key "resolution" \ -value [db_string select_resolution_code {}]
This is incorrectly internationalized to
workflow::case::add_log_data \ -entry_id $entry_id \ - -key "[_ bug-tracker.resolution]" \ - -value [db_string select_resolution_code {}]
But resolution
is a keyword in a table and in the code, so this breaks the code. It should not have been internationalized at all. Here's another example of text that should not have been internationalized:
{show_patch_status "open"}
It is broken if changed to
{show_patch_status "[_ bug-tracker.open]"}
Fix automatically truncated message keys. The automatic converter may create unique but cryptic message keys. Watch out for these and replace them with more descriptive keys. For example:
-<msg key="You">You can filter by this %component_name% by viisting %filter_url_string%</msg> -<msg key="You_1">You do not have permission to map this patch to a bug. Only the submitter of the patch + -key "[_ bug-tracker.resolution]" \ + -value [db_string select_resolution_code {}]
But resolution is a keyword in a table and in the code, so this breaks the code. It should not have been internationalized at all. Here's another example of text that should not have been internationalized:
{show_patch_status "open"}
It is broken if changed to
{show_patch_status "[_ bug-tracker.open]"}
Fix automatically truncated message keys. The automatic converter may create unique but cryptic message keys. Watch out for these and replace them with more descriptive keys. For example:
+<msg key="You">You can filter by this %component_name% by viisting %filter_url_string%</msg> +<msg key="You_1">You do not have permission to map this patch to a bug. Only the submitter of the patch and users with write permission on this Bug Tracker project (package instance) may do so.</msg> -<msg key="You_2">You do not have permission to edit this patch. Only the submitter of the patch -and users with write permission on the Bug Tracker project (package instance) may do so.</msg>
These would be more useful if they were "you_can_filter", "you_do_not_have_permission_to_map_this_patch", and "you_do_not_have_permission_to_edit_this_patch". Don't worry about exactly matching the English text, because that might change; instead try to capture the meaning of the phrase. Ask yourself, if I were a translator and didn't know how this application worked, would this key and text make translation easy for me? -
Sometimes the automatic converter creates keys that don't semantically match their text. Fix these:
<msg key="Fix">for version</msg> -<msg key="Fix_1">for</msg> -<msg key="Fix_2">for Bugs</msg>
Another example: "Bug-tracker component maintainer"
was converted to "[_ bug-tracker.Bug-tracker]"
. Instead, it should be bug_tracker_component_maintainer
.
Avoid "clever" message reuse. Translations may need to differ depending on the context in which +<msg key="You_2">You do not have permission to edit this patch. Only the submitter of the patch +and users with write permission on the Bug Tracker project (package instance) may do so.</msg>
These would be more useful if they were "you_can_filter", "you_do_not_have_permission_to_map_this_patch", and "you_do_not_have_permission_to_edit_this_patch". Don't worry about exactly matching the English text, because that might change; instead try to capture the meaning of the phrase. Ask yourself, if I were a translator and didn't know how this application worked, would this key and text make translation easy for me? +
Sometimes the automatic converter creates keys that don't semantically match their text. Fix these:
<msg key="Fix">for version</msg> +<msg key="Fix_1">for</msg> +<msg key="Fix_2">for Bugs</msg>
Another example: "Bug-tracker component maintainer" was converted to "[_ bug-tracker.Bug-tracker]". Instead, it should be bug_tracker_component_maintainer.
Avoid "clever" message reuse. Translations may need to differ depending on the context in which the message appears. -
Avoid plurals. Different languages create plurals differently. Try to avoid keys which will change based on the value of a number. OpenACS does not currently support internationalization of plurals. If you use two different keys, a plural and a singular form, your application will not localize properly for locales which use different rules or have more than two forms of plurals.
Quoting in the message catalog for tcl. Watch out for quoting and escaping when editing text that is also code. For example, the original string
set title "Patch \"$patch_summary\" is nice."
breaks if the message text retains all of the escaping that was in the tcl command:
<msg>Patch \"$patch_summary\" is nice.</msg>
When it becomes a key, it should be:
<msg>Patch "$patch_summary" is nice.</msg>
Also, some keys had %var;noquote%, which is not needed since those +
Avoid plurals. Different languages create plurals differently. Try to avoid keys which will change based on the value of a number. OpenACS does not currently support internationalization of plurals. If you use two different keys, a plural and a singular form, your application will not localize properly for locales which use different rules or have more than two forms of plurals.
Quoting in the message catalog for tcl. Watch out for quoting and escaping when editing text that is also code. For example, the original string
set title "Patch \"$patch_summary\" is nice."
breaks if the message text retains all of the escaping that was in the tcl command:
<msg>Patch \"$patch_summary\" is nice.</msg>
When it becomes a key, it should be:
<msg>Patch "$patch_summary" is nice.</msg>
Also, some keys had %var;noquote%, which is not needed since those variables are not quoted (and in fact the variable won't even be - recognized so you get the literal %var;noquote% in the output).
Be careful with curly brackets. Code within curly brackets isn't evaluated. TCL uses curly brackets as an alternative way to build lists. But TCL also uses curly brackets as an alternative to quotation marks for quoting text. So this original code
array set names { key "Pretty" ...}
... if converted to
array set names { key "[_ bug-tracker.Pretty]" ...}
... won't work since the _ procedure will not be called. Instead, it should be
array set names [list key [_ bug-tracker.Pretty] ...]
Be careful with curly brackets. Code within curly brackets isn't evaluated. TCL uses curly brackets as an alternative way to build lists. But TCL also uses curly brackets as an alternative to quotation marks for quoting text. So this original code
array set names { key "Pretty" ...}
... if converted to
array set names { key "[_ bug-tracker.Pretty]" ...}
... won't work since the _ procedure will not be called. Instead, it should be
array set names [list key [_ bug-tracker.Pretty] ...]