Index: openacs-4/packages/acs-core-docs/www/i18n.html =================================================================== RCS file: /usr/local/cvsroot/openacs-4/packages/acs-core-docs/www/i18n.html,v diff -u -r1.15 -r1.16 --- openacs-4/packages/acs-core-docs/www/i18n.html 11 Dec 2003 23:08:46 -0000 1.15 +++ openacs-4/packages/acs-core-docs/www/i18n.html 4 Feb 2004 16:47:32 -0000 1.16 @@ -1,4 +1,4 @@ -
+
By Peter Marklund and Lars Pind
@@ -28,7 +28,17 @@ database must use internationalized functions. All displayed dates must use internationalized functions. All displayed numbers must use internationalized functions. -+ For multilingual websites we recommend using the UTF8 + charset. In order for AOLserver to use utf8 you need to set + the config parameters OutputCharset and + URLCharset to utf-8 in your AOLserver config file (use the etc/config.tcl + template file). For sites running on Oracle you need to make + sure that AOLserver is running with the NLS_LANG environment + variable set to .UTF8. You should set this variable in the + nsd-oracle run script (use the + acs-core-docs/www/files/nds-oracle.txt template file). +
Localizable text must be handled in ADP files, in TCL files, and in APM Parameters. OpenACS provides two approaches, message keys and localized ADP files. For ADP pages which are mostly @@ -38,7 +48,7 @@ which are static and mostly text, it may be easier to create a new ADP page for each language. In this case, the pages are distinguished by a file naming convention. -
If the request processor finds a file named filename.locale.adp, where locale matches the user's locale, it will process that file instead of filename.adp. For example, for a user with locale tl_PH, the file index.tl_PH.adp, if found, will be used instead of index.adp. The locale-specific file should thus contain text in the language appropriate for that locale. The code in the page, however, should still be in English. Message keys are still processed.
+
If the request processor finds a file named filename.locale.adp, where locale matches the user's locale, it will process that file instead of filename.adp. For example, for a user with locale tl_PH, the file index.tl_PH.adp, if found, will be used instead of index.adp. The locale-specific file should thus contain text in the language appropriate for that locale. The code in the page, however, should still be in English. Message keys are still processed.
Internationalizing templates is about replacing human readable text in a certain language with internal message keys, which can then be dynamically replaced with real human language in @@ -88,7 +98,109 @@ auto-generated by the APM. Example: <_ Title>
We recommend the short notation for new package development. -
+ In adp files message lookups are typically done with the syntax + \#package_key.message_key\#. In Tcl + files all message lookups *must* be on either of the following formats: +
+
+
+ Translatable texts in page TCL scripts are often found in page titles, + context bars, and form labels and options. Many times the texts are + enclosed in double quotes. The following is an example of grep commands + that can be used on Linux to highlight translatable text in TCL files: +
+ # Find text in double quotes + find -iname '*.tcl'|xargs egrep -i '"[a-z]' + # Find untranslated text in form labels, options and values + find -iname '*.tcl'|xargs egrep -i '\-(options|label|value)'|egrep -v '<#'|egrep -v '\-(value|label|options)[[:space:]]+\$[a-zA-Z_]+[[:space:]]*\\?[[:space:]]*$' + # Find text in page titles and context bars + find -iname '*.tcl'|xargs egrep -i 'set (title|page_title|context_bar) '|egrep -v '<#' + # Find text in error messages + find -iname '*.tcl'|xargs egrep -i '(ad_complain|ad_return_error)'|egrep -v '<#' ++ You may mark up translatable text in TCL library files and TCL pages + with temporary tags on the <#key text#> syntax. + If you have a sentence or paragraph of text with + variables and/or procedure calls in it you should in most cases + try to turn the whole text into one + message in the catalog (remember that the translators' work is made easier the longer the phrases to translate are). In those cases, follow these steps: +
+ The variable values in the message are usually fetched with upvar, here is an example from dotlrn: +
+ ad_return_complaint 1 "Error: A [parameter::get -parameter classes_pretty_name] + must have <em>no</em>[parameter::get -parameter class_instances_pretty_plural] to be deleted" ++ was replaced by: +
+ set subject [parameter::get -localize -parameter classes_pretty_name] + set class_instances [parameter::get -localize -parameter class_instances_pretty_plural] + + ad_return_complaint 1 [_ dotlrn.class_may_not_be_deleted] ++ This kind of interpolation also works in adp files where adp variable values will be inserted into the message. +
+ Alternatively, you may pass in an array list of the variable values to be interpolated into the message so that + our example becomes: +
+ set msg_subst_list [list subject [parameter::get -localize -parameter classes_pretty_name] + class_instances [parameter::get -localize -parameter class_instances_pretty_plural]] + + ad_return_complaint 1 [_ dotlrn.class_may_not_be_deleted $msg_subst_list] ++ When we were done going through the tcl files we ran the following + commands to check for mistakes: +
+ # Message tags should usually not be in curly braces since then the message lookup may not be + # executed (you can usually replace curly braces with the list command). Find message tags + # in curly braces (should return nothing, or possibly a few lines for inspection) + find -iname '*.tcl'|xargs egrep -i '\{.*<#' + # Check if you've forgotten space between default key and text in message tags (should return nothing) + find -iname '*.tcl'|xargs egrep -i '<#_[^ ]' + # Review the list of tcl files with no message lookups + for tcl_file in $(find -iname '*.tcl'); do egrep -L '(<#|\[_)' $tcl_file; done ++ When you feel ready you may visit your package in the + package manager + and run the action "Replace tags with keys + and insert into catalog" on the TCL files that you've edited to + replace the temporary tags with calls to the message lookup + procedure. +
+ This section describes how to check that the set of keys used in + message lookups in tcl, adp, and info files and the set of keys in + the catalog file are identical. The script below assumes that + message lookups in adp and info files are on the format + \#package_key.message_key\#, and that message lookups in tcl files + are always done with one of the valid lookups described above. The script further assumes + that you have perl installed and in your path. Run the script like + this: +
+ acs-lang/bin/check-catalog.sh package_key ++ where package_key is the key of the package that you want to + test. If you don't provide the package_key argument then all + packages with catalog files will be checked. + The script will run its checks primarily on en_US xml catalog files. +
Some parameters contain text that need to be localized. In this case, instead of storing the real text in the parameter, you should use message keys using the short notation above, @@ -151,98 +263,15 @@ Use the *_pretty version in your ADP page.
To internationalize numbers, use lc_numeric $value, which formats the number using the appropriate decimal point and thousand separator for the locale. -
When coding forms, remember to use message keys for each piece of text that is user-visible, including form option labels and button labels.
When coding forms, remember to use message keys for each piece of text that is user-visible, including form option labels and button labels.
Acs-lang includes tools to automate some internationalization. From /acs-admin/apm/, select a package and then click on Internationalization, then Convert ADP, Tcl, and SQL files to using the - message catalog..
- Replace text with tags: - Choose Find human language text and replace with <# ... #> tags. This automated process - automatically locates chunks of translatable text, - generates a reasonable message key, and replaces the text - with a "temporary" tag as described above. -
Any pieces of text found but not extractable -- for - example, pieces of text with embedded adp variables - (i.e. @var_name@) -- will be listed on the result - page. Make sure to take note of these texts and translate - them manually. Suppose for example that our script tells you - that it left the text "Manage forum @forum_name@" - untouched. What you should do then is to edit the - corresponding adp file and manually replace that text with - something like "<#manage_forum Manage forum @forum_name@#>" - (to save you from too much typing you may use the shorthand - <#_ Manage forum @forum_name@#>; an underscore key will - result in the script auto-generating a key for you based on - the text). After you have made all such manual edits you can - simply run the second action labeled "Replace tags with keys - and insert into catalog". -
Note: running this action will not find translatable text within HTML or adp tags on adp pages (i.e. text in alt tags of images), nor will it find translatable text in tcl files. Such texts will have to be found manually. If those texts are in adp files they are best replaced with the <#message_key text#> tags that can be extracted by the action described below. Here are some commands that we used on Linux to look for texts in adp pages not found by the script:
-# List image tags with alt attributes, look for alt attributes with literal text -find -iname '*.adp'|xargs egrep -i '<img.*alt=' -# List submit buttons, look for text in the value attribute -find -iname '*.adp'|xargs egrep -i '<input[^>]*type="?submit' -
- When you run this step, any modified files are backed up in - a file with a ".orig" suffix. Those files are - never overwritten, though, so the .orig file will always be - the original page file, not the second-to-last file. Running - this action multiple times is harmless. -
- Manually verify each ADP - file. If necessary, you can add additional - <#...#> tags, or you can move or remove the ones set - by the automated step. -
- Manually mark up Tcl - files, marking up translatable text with the - <#...#> notation. -
Translatable texts are often found in page titles, context bars, and form labels and options. Many times the texts are enclosed in double quotes. Use the following grep commands on Linux to highlight translatable text in tcl files:
# Find text in double quotes -find -iname '*.tcl'|xargs egrep -i '"[a-z]' -# Find untranslated text in form labels, options and values -find -iname '*.tcl'|xargs egrep -i '\-(options|label|value)'|egrep -v '<#'|egrep -v '\-(value|label|options)[[:space:]]+\$[a-zA-Z_]+[[:space:]]*\\?[[:space:]]*$' -# Find text in page titles and context bars -find -iname '*.tcl'|xargs egrep -i 'set (title|page_title|context_bar) '|egrep -v '<#' -# Find text in error messages -find -iname '*.tcl'|xargs egrep -i '(ad_complain|ad_return_error)'|egrep -v '<#'
You may mark up translatable text in tcl library files and tcl pages with temporary tags (on the <#key text#> syntax mentioned previously). If you have a sentence or paragraph of text with variables and or procedure calls in it you should in most cases try to turn the whole text into one message in the catalog. In those cases, follow these steps:
For each message call in the text, decide on a variable - name and replace the procedure call with a variable - lookup on the syntax %var_name%. Remember to initialize - a tcl variable with the same name on some line above the - text.
If the text is in a tcl file you must replace - variable lookups (occurrences of $var_name or - ${var_name}) with %var_name%
You are now ready to follow the normal procedure - and mark up the text using a temporary message tag (<#_ - text_with_percentage_vars#>) and run the action replace - tags with keys in the APM.
The variable values in the message are usually fetched with upvar, here is an example from dotlrn:
ad_return_complaint 1 "Error: A [parameter::get -parameter classes_pretty_name] - must have no[parameter::get -parameter class_instances_pretty_plural] to be deleted" -
was replaced by:
set subject [parameter::get -localize -parameter classes_pretty_name] -set class_instances [parameter::get -localize -parameter class_instances_pretty_plural] -ad_return_complaint 1 [_ dotlrn.class_may_not_be_deleted] - -
This kind of interpolation also works in adp files where adp variable values will be inserted into the message.
Alternatively, you may pass in an array list of the variable values to be interpolated into the message so that our example becomes:
set msg_subst_list [list subject [parameter::get -localize -parameter classes_pretty_name] - class_instances [parameter::get -localize -parameter class_instances_pretty_plural]] - -ad_return_complaint 1 [_ dotlrn.class_may_not_be_deleted $msg_subst_list] -
When we were done going through the tcl files we ran the following commands to check for mistakes: -
# Message tags should usually not be in curly braces since then the message lookup may not be -# executed then (you can usually replace curly braces with the list command). Find message tags -# in curly braces (should return nothing, or possibly a few lines for inspection) -find -iname '*.tcl'|xargs egrep -i '\{.*<#' -# Check if you've forgotten space between default key and text in message tags (should return nothing) -find -iname '*.tcl'|xargs egrep -i '<#_[^ ]' -# Review the list of tcl files with no message lookups -for tcl_file in $(find -iname '*.tcl'); do egrep -L '(<#|\[_)' $tcl_file; done -
When you feel ready you may run the action "Replace tags with keys and insert into catalog" on the tcl files that you've edited to replace the temporary tags with calls to the message lookup procedure.
The acs-lang/bin/check-catalog.sh script checks that the set of keys used in message lookups in tcl, adp, and info files and the set of keys in the catalog file are identical. The scripts below assume that message lookups in adp and info files are on the format #package_key.message_key#, and that message lookups in tcl files are always done with the underscore procedure. The script assumes that you have perl installed and in your path. Run the script like this:
acs-lang/bin/check-catalog.sh package_key
where package_key is the key of the package that you want to test. If you don't provide the package_key argument then all packages with catalog files will be checked. The script will run its checks on en_US xml catalog files.
- Replace tags with keys: - This is an automated process, which will replace the - temporary <#...#> notation in both ADP and Tcl files - with the appropriate notation for the type of file, and - store the text in the message catalog. You need to run the - process twice, once for ADP files, and once for Tcl files. -
- See Multilingual APM Parameters -
Find datetime in .xql files. Use command line tools to find suspect SQL code:
grep -r "to_char.*H" * + message catalog..
Find datetime in .xql files. Use command line tools to find suspect SQL code:
grep -r "to_char.*H" * grep -r "to_date.*H" *
In SQL statements, replace the format string with the ANSI standard format, YYYY-MM-DD HH24:MI:SS and change the field name to *_ansi so that it cannot be confused with previous, improperly formatting fields. For example,
to_char(timestamp,'MM/DD/YYYY HH:MI:SS') as foo_date_pretty
becomes
to_char(timestamp,'YYYY-MM-DD HH24:MI:SS') as foo_date_ansi
In TCL files where the date fields are used, convert the datetime from local server timezone, which is how it's stored in the database, to the user's timezone for display. Do this with the localizing function lc_time_system_to_conn:
set foo_date_ansi [lc_time_system_to_conn $foo_date_ansi]
When a datetime will be written to the database, first convert it from the user's local time to the server's timezone with lc_time_conn_to_system.
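The datetime audit described above can be tried out on a scratch directory before running it against a real package. The following is only a minimal sketch: the temporary directory, the file name foo.xql, and the two sample queries are hypothetical, and the second grep (which filters out calls that already use the ANSI format string) is an illustrative addition, not part of the documented procedure.

```shell
# Sketch: audit a scratch query file for non-ANSI datetime format strings.
# The directory, file name, and sample queries are hypothetical.
workdir=$(mktemp -d)
cat > "$workdir/foo.xql" <<'EOF'
select to_char(timestamp,'MM/DD/YYYY HH:MI:SS') as foo_date_pretty from foo
select to_char(timestamp,'YYYY-MM-DD HH24:MI:SS') as foo_date_ansi from foo
EOF

# Same pattern as in the text: every to_char call with an hour field.
all_hits=$(grep -r "to_char.*H" "$workdir" | wc -l | tr -d ' ')

# Assumed extra step: drop calls that already use the ANSI format string,
# leaving only the ones that still need to be converted.
suspect_hits=$(grep -r "to_char.*H" "$workdir" \
  | grep -v "YYYY-MM-DD HH24:MI:SS" | wc -l | tr -d ' ')

echo "to_char calls with hour field: $all_hits, still to convert: $suspect_hits"
rm -rf "$workdir"
```

In a real package the same two greps would be run from the package root, and each remaining hit reviewed by hand, since the pattern can also match unrelated uses of to_char.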