Index: openacs-4/packages/acs-core-docs/www/i18n.html =================================================================== RCS file: /usr/local/cvsroot/openacs-4/packages/acs-core-docs/www/i18n.html,v diff -u -r1.3 -r1.4 --- openacs-4/packages/acs-core-docs/www/i18n.html 20 Aug 2003 16:20:16 -0000 1.3 +++ openacs-4/packages/acs-core-docs/www/i18n.html 14 Oct 2003 11:02:58 -0000 1.4 @@ -1,26 +1,25 @@ - -Internationalization

Internationalization

+Internationalization

Internationalization

By Peter Marklund and Lars Pind


OpenACS docs are written by the named authors, and may be edited by OpenACS documentation staff. -

Introduction

+

Introduction

This document describes how to develop internationalized OpenACS packages, including writing new packages with internationalization and converting old packages. Text that - users might see is "localizable text"; replacing monolingual text + users might see is "localizable text"; replacing monolingual text and single-locale date/time/money functions with generic - functions is "internationalization"; translating first - generation text into a specific language is "localization." At + functions is "internationalization"; translating first + generation text into a specific language is "localization." At a minimum, all packages should be internationalized. If you do not also localize your package for different locales, volunteers - may use a public "localization server" to submit suggested text. + may use a public "localization server" to submit suggested text. Otherwise, your package will not be usable for all locales.

The main difference between monolingual and internationalized packages is that all user-visible text in an internationalized - package is coded as "message keys." The message keys + package is coded as "message keys." The message keys correspond to a message catalog, which contains versions of the text for each available language. Both script files (ADP/TCL) and APM parameters are affected. @@ -29,7 +28,7 @@ database must use internationalized functions. All displayed dates must use internationalized functions. All displayed numbers must use internationalized functions. -

Using the Message Catalog

+

Using the Message Catalog

Localizable text must be handled in ADP files, in TCL files, and in APM Parameters. OpenACS provides two approaches, message keys and localized ADP files. For ADP pages which are mostly @@ -39,32 +38,32 @@ which are static and mostly text, it may be easier to create a new ADP page for each language. In this case, the pages are distinguished by a file naming convention. -

Separate Templates for each Locale

If the request processor finds a file named filename.locale.adp, where locale matches the user's locale, it will process that file instead of filename.adp. For example, for a user with locale tl_PH, the file index.tl_PH.adp, if found, will be used instead of index.adp. The locale-specific file should thus contain text in the language appropriate for that locale. The code in the page, however, should still be in English. Message keys are still processed.

Message Keys in Template Files (ADP Files)

+

Separate Templates for each Locale

If the request processor finds a file named filename.locale.adp, where locale matches the user's locale, it will process that file instead of filename.adp. For example, for a user with locale tl_PH, the file index.tl_PH.adp, if found, will be used instead of index.adp. The locale-specific file should thus contain text in the language appropriate for that locale. The code in the page, however, should still be in English. Message keys are still processed.
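
The file naming convention above can be illustrated with a small shell sketch. This is only a model of the selection rule as described, not the request processor's actual code; `pick_template` is a hypothetical helper name introduced for the example.

```shell
#!/bin/sh
# Sketch only: pick_template is a hypothetical helper, not an OpenACS command.
# Usage: pick_template <directory> <basename> <locale>
pick_template() {
    dir=$1
    base=$2
    locale=$3
    if [ -n "$locale" ] && [ -f "$dir/$base.$locale.adp" ]; then
        # A locale-specific template exists, so it is used.
        printf '%s\n' "$dir/$base.$locale.adp"
    else
        # Otherwise fall back to the generic template.
        printf '%s\n' "$dir/$base.adp"
    fi
}

# Demo in a throwaway directory.
demo_dir=$(mktemp -d)
touch "$demo_dir/index.adp" "$demo_dir/index.tl_PH.adp"
pick_template "$demo_dir" index tl_PH   # selects index.tl_PH.adp
pick_template "$demo_dir" index de_DE   # selects index.adp
rm -rf "$demo_dir"
```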

Message Keys in Template Files (ADP Files)

Internationalizing templates is about replacing human readable text in a certain language with internal message keys, which can then be dynamically replaced with real human language in the desired locale. Message keys themselves should be in ASCII English, as should all code. Three different syntaxes are possible for message keys.

- "Short" syntax is the recommended syntax and should be used + "Short" syntax is the recommended syntax and should be used for new development. When internationalizing an existing - package, you can use the "temporary" syntax, which the APM can + package, you can use the "temporary" syntax, which the APM can use to auto-generate missing keys and automatically translate - to the short syntax. The "verbose" syntax is useful while + to the short syntax. The "verbose" syntax is useful while developing, because it allows default text so that the page is usable before you have done localization.

  • The short: - #package_key.message_key# + #package_key.message_key#

    The advantage of the short syntax is that it's short. It's as simple as inserting the value of a variable. Example: #forum.title#

  • - The verbose: <trn - key="package_key.message_key" - locale="locale">default + The verbose: <trn + key="package_key.message_key" + locale="locale">default text</trn>

    The verbose syntax allows you to specify a default text in @@ -74,10 +73,11 @@ in the message catalog yet, because what it'll do is create the message key with the default text from the tag as the localized message. Example: <trn - key="forum.title" locale="en_US">Title</trn> + key="forum.title" locale="en_US">Title</trn>

  • The temporary: - <#message_key original text#> + <#message_key + original text#>

    This syntax has been designed to make it easy to internationalize existing pages. This is not a syntax that @@ -88,7 +88,7 @@ auto-generated by the APM. Example: <#_ Title#>

We recommend the short notation for new package development. -

APM Parameters

+

APM Parameters

Some parameters contain text that needs to be localized. In this case, instead of storing the real text in the parameter, you should use message keys using the short notation above, @@ -101,9 +101,9 @@

Here are a couple of examples. Say we have the following two parameters, taken directly from the dotlrn package. -

Table 13.1.

Parameter Name | Parameter Value
class_instance_pages_csv | #dotlrn.class_page_home_title#,Simple 2-Column;#dotlrn.class_page_calendar_title#,Simple 1-Column;#dotlrn.class_page_file_storage_title#,Simple 1-Column
departments_pretty_name | #departments_pretty_name#

+

Table 13.1.

Parameter Name | Parameter Value
class_instance_pages_csv | #dotlrn.class_page_home_title#,Simple 2-Column;#dotlrn.class_page_calendar_title#,Simple 1-Column;#dotlrn.class_page_file_storage_title#,Simple 1-Column
departments_pretty_name | #departments_pretty_name#

Then, depending on how we retrieve the value, here's what we get: -

Table 13.2.

Command used to retrieve Value | Retrieved Value
parameter::get -localize -parameter class_instances_pages_csv | Kurs Startseite,Simple 2-Column;Kalender,Simple 1-Column;Dateien,Simple 1-Column
parameter::get -localize -parameter departments_pretty_name | Abteilung
parameter::get -parameter departments_pretty_name | #departments_pretty_name#

+

Table 13.2.

Command used to retrieve Value | Retrieved Value
parameter::get -localize -parameter class_instances_pages_csv | Kurs Startseite,Simple 2-Column;Kalender,Simple 1-Column;Dateien,Simple 1-Column
parameter::get -localize -parameter departments_pretty_name | Abteilung
parameter::get -parameter departments_pretty_name | #departments_pretty_name#

The value in the rightmost column in the table above is the value returned by an invocation of parameter::get. Note that for localization to happen you must use the -localize flag. @@ -113,27 +113,27 @@ locale.
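
To make the effect of the -localize flag concrete, here is a minimal shell model of the substitution it performs on a parameter value. It is a sketch under stated assumptions, not how parameter::get is implemented: `lookup_message` and `localize_value` are hypothetical helpers, and the small German catalog is hard-coded from the example values in the table above.

```shell
#!/bin/sh
# Sketch only: lookup_message and localize_value are hypothetical helpers
# that model the idea behind "parameter::get -localize"; they are not OpenACS code.

# Stand-in for the message catalog (German texts from the example table).
lookup_message() {
    case "$1" in
        dotlrn.class_page_home_title)         printf '%s' 'Kurs Startseite' ;;
        dotlrn.class_page_calendar_title)     printf '%s' 'Kalender' ;;
        dotlrn.class_page_file_storage_title) printf '%s' 'Dateien' ;;
        *) return 1 ;;
    esac
}

# Replace every #key# token in a value with its catalog text.
localize_value() {
    rest=$1
    out=
    while :; do
        # Stop when fewer than two '#' characters remain.
        case $rest in
            *'#'*'#'*) ;;
            *) break ;;
        esac
        prefix=${rest%%#*}   # text before the first '#'
        tail=${rest#*#}      # text after the first '#'
        key=${tail%%#*}      # candidate key, up to the next '#'
        rest=${tail#*#}      # remainder after the closing '#'
        if text=$(lookup_message "$key"); then
            out="$out$prefix$text"
        else
            # Unknown key: keep the token unchanged.
            out="$out$prefix#$key#"
        fi
    done
    printf '%s\n' "$out$rest"
}

localize_value '#dotlrn.class_page_home_title#,Simple 2-Column;#dotlrn.class_page_calendar_title#,Simple 1-Column'
```

Without the substitution step the raw value, message keys included, is what the caller sees, which corresponds to calling parameter::get without -localize.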

Developers are responsible for creating the keys in the message - catalog, which is available at /acs-lang/admin/ -

Dates, Times, and Numbers

+ catalog, which is available at /acs-lang/admin/ +

Dates, Times, and Numbers

Dates and times must be converted when stored in the database, when retrieved from the database, and when displayed. All dates are stored in the database in the server's timezone, which is an APM Parameter set at - /acs-lang/admin/set-system-timezone + /acs-lang/admin/set-system-timezone and readable at - lang::system::timezone. When + lang::system::timezone. When retrieved from the database and displayed, dates and times must be localized to the user's locale.

  1. Get the date in ANSI format from the database (YYYY-MM-DD HH24:MI:SS; the time portion is optional). By convention, we identify dates in ANSI format by ending the column name - with _ansi. + with _ansi. Example:

    select to_char(posting_date, 'YYYY-MM-DD HH24:MI:SS') as posting_date_ansi
       from table
     
  2. - Use the Tcl command lc_time_fmt to format the - date in "pretty" format. Several standard formats localize automatically: + Use the Tcl command lc_time_fmt to format the + date in "pretty" format. Several standard formats localize automatically:

    • %c: Long date and time (Mon November 18, 2002 12:00 AM)

    • @@ -145,47 +145,47 @@

    • %Q: Long date with weekday (Monday November 18, 2002)

    - The "q" format strings are OpenACS additions; the rest follow unix standards (see man + The "q" format strings are OpenACS additions; the rest follow unix standards (see man strftime). -

    set posting_date_pretty [lc_time_fmt $posting_date_ansi "%q"]
  3. - Use the *_pretty version in your ADP page. +

    set posting_date_pretty [lc_time_fmt $posting_date_ansi "%q"]
  3. + Use the *_pretty version in your ADP page.

- To internationalize numbers, use lc_numeric $value, which formats the number using the appropriate decimal point and thousand separator for the locale. -

Internationalizing Forms

When coding forms, remember to use message keys for each piece of text that is user-visible, including form option labels and button labels.

Internationalizing Existing Packages

Internationalize Message text in ADP and TCL

Acs-lang includes tools to automate some + To internationalize numbers, use lc_numeric $value, which formats the number using the appropriate decimal point and thousand separator for the locale. +

Internationalizing Forms

When coding forms, remember to use message keys for each piece of text that is user-visible, including form option labels and button labels.

Internationalizing Existing Packages

Internationalize Message text in ADP and TCL

Acs-lang includes tools to automate some internationalization. From - /acs-admin/apm/, select a + /acs-admin/apm/, select a package and then click on - Internationalization, then - Convert ADP, Tcl, and SQL files to using the + Internationalization, then + Convert ADP, Tcl, and SQL files to using the message catalog.

  1. Replace text with tags: - Choose Find human language text and replace with <# ... #> tags. This automated process + Choose Find human language text and replace with <# ... #> tags. This automated process locates chunks of translatable text, generates a reasonable message key, and replaces the text - with a "temporary" tag as described above. + with a "temporary" tag as described above.

    Any pieces of text found but not extractable -- for example, pieces of text with embedded ADP variables (e.g. @var_name@) -- will be listed on the result page. Make sure to take note of these texts and translate them manually. Suppose for example that our script tells you - that it left the text "Manage forum @forum_name@" + that it left the text "Manage forum @forum_name@" untouched. What you should do then is to edit the corresponding ADP file and manually replace that text with - something like "<#manage_forum Manage forum @forum_name@#>" + something like "<#manage_forum Manage forum @forum_name@#>" (to save you from too much typing you may use the shorthand <#_ Manage forum @forum_name@#>; an underscore key will result in the script auto-generating a key for you based on the text). After you have made all such manual edits you can - simply run the second action labeled "Replace tags with keys - and insert into catalog". + simply run the second action labeled "Replace tags with keys + and insert into catalog".

    Note: running this action will not find translatable text within HTML or ADP tags on ADP pages (e.g. text in alt attributes of images), nor will it find translatable text in Tcl files. Such texts will have to be found manually. If those texts are in ADP files they are best replaced with the <#message_key text#> tags that can be extracted by the action described below. Here are some commands that we used on Linux to look for texts in ADP pages not found by the script:

     # List image tags with alt attributes, look for alt attributes with literal text
     find -iname '*.adp'|xargs egrep -i '<img.*alt='
     # List submit buttons, look for text in the value attribute 
    -find -iname '*.adp'|xargs egrep -i '<input[^>]*type="?submit'
    +find -iname '*.adp'|xargs egrep -i '<input[^>]*type="?submit'
     

    When you run this step, any modified files are backed up in - a file with a ".orig" suffix. Those files are + a file with a ".orig" suffix. Those files are never overwritten, though, so the .orig file will always be the original page file, not the second-to-last file. Running this action multiple times is harmless. @@ -199,7 +199,7 @@ files, marking up translatable text with the <#...#> notation.

    Translatable text is often found in page titles, context bars, and form labels and options, frequently enclosed in double quotes. The following grep commands, run on Linux, highlight translatable text in Tcl files:

    # Find text in double quotes
    -find -iname '*.tcl'|xargs egrep -i '"[a-z]'
    +find -iname '*.tcl'|xargs egrep -i '"[a-z]'
     # Find untranslated text in form labels, options and values
     find -iname '*.tcl'|xargs egrep -i '\-(options|label|value)'|egrep -v '<#'|egrep -v '\-(value|label|options)[[:space:]]+\$[a-zA-Z_]+[[:space:]]*\\?[[:space:]]*$'
     # Find text in page titles and context bars
    @@ -214,8 +214,8 @@
                   ${var_name}) with %var_name%

  2. You are now ready to follow the normal procedure and mark up the text using a temporary message tag (<#_ text_with_percentage_vars#>) and run the action replace - tags with keys in the APM.

The variable values in the message are usually fetched with upvar. Here is an example from dotlrn:

ad_return_complaint 1 "Error: A [parameter::get -parameter classes_pretty_name] 
-             must have no[parameter::get -parameter class_instances_pretty_plural] to be deleted"
+              tags with keys in the APM.

The variable values in the message are usually fetched with upvar. Here is an example from dotlrn:

ad_return_complaint 1 "Error: A [parameter::get -parameter classes_pretty_name] 
+             must have no[parameter::get -parameter class_instances_pretty_plural] to be deleted"
 

was replaced by:

set subject [parameter::get -localize -parameter classes_pretty_name] 
 set class_instances [parameter::get -localize -parameter class_instances_pretty_plural]
 ad_return_complaint 1 [_ dotlrn.class_may_not_be_deleted]
@@ -233,18 +233,18 @@
 find -iname '*.tcl'|xargs egrep -i '<#_[^ ]'
 # Review the list of tcl files with no message lookups
 for tcl_file in $(find -iname '*.tcl'); do egrep -L '(<#|\[_)' $tcl_file; done
-

When you feel ready you may run the action "Replace tags with keys and insert into catalog" on the tcl files that you've edited to replace the temporary tags with calls to the message lookup procedure.

The acs-lang/bin/check-catalog.sh script checks that the set of keys used in message lookups in Tcl, ADP, and info files and the set of keys in the catalog file are identical. The script assumes that message lookups in ADP and info files are in the format #package_key.message_key#, and that message lookups in Tcl files are always done with the underscore procedure. It also assumes that you have perl installed and in your path. Run the script like this:

acs-lang/bin/check-catalog.sh package_key

where package_key is the key of the package that you want to test. If you don't provide the package_key argument then all packages with catalog files will be checked. The script will run its checks on en_US xml catalog files.

  • +

    When you feel ready you may run the action "Replace tags with keys and insert into catalog" on the tcl files that you've edited to replace the temporary tags with calls to the message lookup procedure.

    The acs-lang/bin/check-catalog.sh script checks that the set of keys used in message lookups in Tcl, ADP, and info files and the set of keys in the catalog file are identical. The script assumes that message lookups in ADP and info files are in the format #package_key.message_key#, and that message lookups in Tcl files are always done with the underscore procedure. It also assumes that you have perl installed and in your path. Run the script like this:

    acs-lang/bin/check-catalog.sh package_key

    where package_key is the key of the package that you want to test. If you don't provide the package_key argument then all packages with catalog files will be checked. The script will run its checks on en_US xml catalog files.

  • Replace tags with keys: This is an automated process, which will replace the temporary <#...#> notation in both ADP and Tcl files with the appropriate notation for the type of file, and store the text in the message catalog. You need to run the process twice, once for ADP files, and once for Tcl files. -

  • Internationalize Package Parameters with visible messages

    +

    Internationalize Package Parameters with visible messages

    See Multilingual APM Parameters -

    Internationalize Date and Time queries

    1. Find datetime in .xql files. Use command line tools to find suspect SQL code:

      grep -r "to_char.*H" *
      -grep -r "to_date.*H" *
      -
    2. In SQL statements, replace the format string with the ANSI standard format, YYYY-MM-DD HH24:MI:SS, and change the field name to *_ansi so that it cannot be confused with previous, improperly formatted fields. For example,

      to_char(timestamp,'MM/DD/YYYY HH:MI:SS') as foo_date_pretty

      becomes

      to_char(timestamp,'YYYY-MM-DD HH24:MI:SS') as foo_date_ansi
    3. In TCL files where the date fields are used, convert the datetime from local server timezone, which is how it's stored in the database, to the user's timezone for display. Do this with the localizing function lc_time_system_to_conn:

      -set foo_date_ansi [lc_time_system_to_conn $foo_date_ansi]

      When a datetime will be written to the database, first convert it from the user's local time to the server's timezone with lc_time_conn_to_system. -

    4. When a datetime field will be displayed, format it using the localizing function lc_time_fmt. lc_time_fmt takes two parameters, datetime and format code. Several format codes are usable for localization; they are placeholders that format dates with the appropriate codes for the user's locale. These codes are: %x, %X, %q, %Q, and %c.

      set foo_date_pretty [lc_time_fmt $foo_date_ansi "%x %X"]

    Design Notes

    User locale is a property of ad_conn, ad_conn locale. The request processor sets this by calling lang::conn::locale, which looks for the following in order of precedence:

    1. Use user preference for this package (stored in ad_locale_user_prefs)

    2. Use system preference for the package (stored in apm_packages)

    3. Use user's general preference (stored in user_preferences)

    4. Use Browser header (Accept-Language HTTP header)

    5. Use system locale (an APM parameter for acs_lang)

    6. Default to en_US

    For ADP pages, message key lookup occurs in the templating engine. For TCL pages, message key lookup happens with the _ function. In both cases, if the requested locale is not found but a locale which is the default for the language which matches your locale's language is +

    Internationalize Date and Time queries

    1. Find datetime in .xql files. Use command line tools to find suspect SQL code:

      grep -r "to_char.*H" *
      +grep -r "to_date.*H" *
      +
    2. In SQL statements, replace the format string with the ANSI standard format, YYYY-MM-DD HH24:MI:SS, and change the field name to *_ansi so that it cannot be confused with previous, improperly formatted fields. For example,

      to_char(timestamp,'MM/DD/YYYY HH:MI:SS') as foo_date_pretty

      becomes

      to_char(timestamp,'YYYY-MM-DD HH24:MI:SS') as foo_date_ansi
    3. In TCL files where the date fields are used, convert the datetime from local server timezone, which is how it's stored in the database, to the user's timezone for display. Do this with the localizing function lc_time_system_to_conn:

      +set foo_date_ansi [lc_time_system_to_conn $foo_date_ansi]

      When a datetime will be written to the database, first convert it from the user's local time to the server's timezone with lc_time_conn_to_system. +

    4. When a datetime field will be displayed, format it using the localizing function lc_time_fmt. lc_time_fmt takes two parameters, datetime and format code. Several format codes are usable for localization; they are placeholders that format dates with the appropriate codes for the user's locale. These codes are: %x, %X, %q, %Q, and %c.

      set foo_date_pretty [lc_time_fmt $foo_date_ansi "%x %X"]
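
As a complement to the grep commands in step 1, the following sketch narrows the search to date expressions that still need converting, by filtering out those already using the ANSI format string. `find_non_ansi_dates` is a hypothetical helper name; the patterns are a heuristic and may miss calls that span several lines.

```shell
#!/bin/sh
# Sketch only: find_non_ansi_dates is a hypothetical helper, not part of acs-lang.
# Usage: find_non_ansi_dates [directory]
find_non_ansi_dates() {
    # List to_char/to_date calls in .xql files that carry an hour (H) component,
    # then drop the ones that already use the ANSI format string.
    grep -rnE --include='*.xql' "to_(char|date)\(.*H" "${1:-.}" \
        | grep -v "YYYY-MM-DD HH24:MI:SS"
}
```

Running `find_non_ansi_dates .` from a package directory prints file name, line number, and matching line for each remaining candidate; an empty result suggests that step 2 is complete for single-line expressions.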

    Design Notes

    User locale is a property of ad_conn, ad_conn locale. The request processor sets this by calling lang::conn::locale, which looks for the following in order of precedence:

    1. Use user preference for this package (stored in ad_locale_user_prefs)

    2. Use system preference for the package (stored in apm_packages)

    3. Use user's general preference (stored in user_preferences)

    4. Use Browser header (Accept-Language HTTP header)

    5. Use system locale (an APM parameter for acs_lang)

    6. Default to en_US

    For ADP pages, message key lookup occurs in the templating engine. For TCL pages, message key lookup happens with the _ function. In both cases, if the requested locale is not found, but the default locale for that locale's language is found, then that default locale is offered instead.
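
The precedence list and the language fallback described above can be modelled in a few lines of shell. This is an illustrative sketch, not the code of lang::conn::locale: the helper names are hypothetical, and the language-to-default-locale table is an assumption made up for the example.

```shell
#!/bin/sh
# Sketch only: all three functions are hypothetical helpers for illustration.

# Return the first non-empty candidate; callers pass candidates in order of
# precedence (user package preference, system package preference, user general
# preference, browser header, system locale). en_US is the final default.
first_locale() {
    for candidate in "$@"; do
        if [ -n "$candidate" ]; then
            printf '%s\n' "$candidate"
            return 0
        fi
    done
    printf '%s\n' en_US
}

# Assumed table of default locales per language (example data only).
default_locale_for_language() {
    case "$1" in
        es) printf '%s\n' es_ES ;;
        de) printf '%s\n' de_DE ;;
        en) printf '%s\n' en_US ;;
        *) return 1 ;;
    esac
}

# Usage: resolve_catalog_locale <requested locale> <available locales...>
resolve_catalog_locale() {
    requested=$1
    shift
    # An exact match wins.
    for available in "$@"; do
        if [ "$available" = "$requested" ]; then
            printf '%s\n' "$requested"
            return 0
        fi
    done
    # Otherwise try the default locale of the same language (es_MX -> es -> es_ES).
    fallback=$(default_locale_for_language "${requested%%_*}") || return 1
    for available in "$@"; do
        if [ "$available" = "$fallback" ]; then
            printf '%s\n' "$fallback"
            return 0
        fi
    done
    return 1
}

first_locale "" "" "" de_DE en_GB
resolve_catalog_locale es_MX en_US es_ES
```

In this model a user whose only known preference is the browser header gets that locale, and a request for es_MX is served from the es_ES catalog when no es_MX texts exist.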
