Establishing Style and Supporting Multi-Lingualism

using the ArsDigita Community System by Philip Greenspun
This document explains how to establish site-wide style and presentation conventions. A core element of the system is AOLserver's ADP template parsing system.

The Big Problem

Here are some of the challenges that we need to attack:

Some Trivial Solutions

Suppose that you simply want a consistent look and feel, changeable by editing only one file, across thousands of dynamic pages:

What about static HTML pages? You can put regsub calls in ad_serve_html_page (in /tcl/ad-html.tcl) to consistently change the appearance of outgoing pages.
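
As a rough illustration of the regsub approach for static pages, the sketch below rewrites the BODY tag of an outgoing page. The procedure name restyle_html and the colors are hypothetical; how ad_serve_html_page would actually invoke such a helper is not shown in this document.

```tcl
# Hypothetical helper: force a site-wide BODY tag on a static HTML page.
# The colors are placeholders, not values taken from the ACS.
proc restyle_html {html} {
    set site_body {<body bgcolor="#ffffff" text="#000000">}
    # Replace the first <body ...> tag, whatever attributes it carries.
    regsub -nocase {<body[^>]*>} $html $site_body html
    return $html
}

puts [restyle_html {<html><BODY bgcolor=pink><p>hi</p></body></html>}]
```

Because the pattern only matches an opening BODY tag, a page without one passes through unchanged.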

If you're building from scratch, you could build in ADP instead of HTML and use ns_register_adptag to augment the HTML a bit.

Some Trivial Solutions in a Perfect World

In a perfect world, you'd modify ad_header or your static HTML to reference a cascading style sheet (CSS). See the HTML chapter of Philip and Alex's Guide to Web Publishing for an explanation and also do a View Source on the document to see a style sheet reference from the HEAD of a document.

This doesn't work out too great because (1) only the 4.0 browsers interpret style sheets, and (2) Brand M and Brand N browsers do very different things given the same instructions (each implements a subset of the CSS standard).

Why These Trivial Solutions Won't Work for You

Publishers and the designers they hire want to control much more than background, text, alink, and vlink colors. They want to move around the elements on each page.

So what's the big deal? Let them write whatever HTML they want.

The problem is that they want control over pages that are generated by querying the database and executing procedures but they don't want to learn how to program. Your naive solution is to let the designers build static HTML files and show them to you. You'll work these elements into Tcl string literals and write programs that print them to the browser. In the end you'll have programs that query the database and produce output exactly like what the designer wanted... on Monday. By Friday, the designer has changed his or her mind. Would you rather spend your life attacking the hard problem of Web-based collaboration or moving strings around inside .tcl pages?

Templates

Suppose that you send your staff the following message:
To: Web Developers

I want you to put all the SQL queries into Tcl functions that get loaded
at server start-up time.  The graphic designers are to build ADP pages
that call a Tcl procedure which will set a bunch of local variables with
values from the database.  They are then to stick <%= $variable_name %> in
the ADP page wherever they want one of the variables to appear.

Alternatively, write .tcl scripts that implement the "business logic"
and, after stuffing a bunch of local vars, call ns_adp_parse to drag in
the ADP created by the graphic designer.
In future, a change to the look of a site won't require a programmer, only someone who knows HTML and who is careful enough not to disturb references to variables.
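
The division of labor in that message can be sketched in plain Tcl. The procedure below is only a stand-in for AOLserver's ADP parser: it handles nothing but <%= $name %> markers, so that the example runs in an ordinary tclsh. The names render_template and show_page, and the variables, are all hypothetical.

```tcl
# Stand-in for ADP parsing (an assumption, not the real ns_adp_parse):
# replace each <%= $name %> marker with the value of the caller's
# local variable of that name.
proc render_template {template} {
    set out ""
    set pos 0
    set marker {<%=\s*\$(\w+)\s*%>}
    foreach {whole name} [regexp -all -inline -indices $marker $template] {
        lassign $whole start end
        # Copy the designer's HTML that precedes the marker.
        append out [string range $template $pos [expr {$start - 1}]]
        # Look the variable up one level up, in the calling script.
        set var [string range $template {*}$name]
        append out [uplevel 1 [list set $var]]
        set pos [expr {$end + 1}]
    }
    append out [string range $template $pos end]
    return $out
}

# The programmer's side: "business logic" that stuffs local variables.
proc show_page {} {
    set headline "Hello"
    set body_text "World"
    # The designer's side: HTML with variable references.
    return [render_template {<h2><%= $headline %></h2><p><%=$body_text%></p>}]
}

puts [show_page]
```

A marker that names a variable the script never set raises a Tcl error, which is one reason the designer must be careful not to disturb the variable references.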

Putting It All Together

Putting it all together in an ArsDigita Community System-based site:

Why are the templates stored under a separate directory structure from the .tcl scripts? Isn't this inconvenient? Yes, if you're one person maintaining a site. However, the whole point of this system is that a bunch of programmers and designers are collaborating. The programmers will probably be happier if the designers never get FTP access to the directories containing .tcl scripts. Also, from a security point of view, if someone is going to upload files to your server via FTP, you don't want those files ending up directly underneath the Web server root.

Caveat nerdor: remember that AOLserver sources private Tcl libraries alphabetically. So your calls to ad_register_styletag must be in a Tcl file that sorts alphabetically after "ad-style.tcl" (we suggest that you stick to a convention of "sitename-styles.tcl", e.g., "photonet-styles.tcl" would be the photo.net styles).
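
A quick way to check a proposed file name is to compare it against "ad-style.tcl". This sketch assumes that the load order matches Tcl's default lsort (ASCII) ordering, which this document does not confirm beyond "alphabetically"; the procedure name is hypothetical.

```tcl
# Hypothetical check: would this library file be sourced after ad-style.tcl,
# assuming files load in plain ASCII order?
proc loads_after_ad_style {filename} {
    return [expr {[string compare $filename "ad-style.tcl"] > 0}]
}

# The order in which three library files would be sourced.
puts [lsort {photonet-styles.tcl ad-style.tcl ad-html.tcl}]

puts [loads_after_ad_style photonet-styles.tcl]   ;# 1: safe
puts [loads_after_ad_style aaa-styles.tcl]        ;# 0: sourced too early
```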

How we represent languages

Languages are represented by lowercase ISO 639 two-character abbreviations, e.g., "en" for English, "km" for Cambodian, "ja" for Japanese (not "jp" as you might expect; jp is the country code for Japan, not the language code for the Japanese language). For a complete list, check your Netscape preferences (click on "languages" and then try to add one), visit http://www.w3.org/International/O-charset-lang.html, or refer to this list below (we're not going to make sure that it is kept up to date, so you might want to visit the source).

Language Name     Code  Language Family
Abkhazian         ab    Ibero-Caucasian
Afan (Oromo)      om    Hamitic
Afar              aa    Hamitic
Afrikaans         af    Germanic
Albanian          sq    Indo-european (other)
Amharic           am    Semitic
Arabic            ar    Semitic
Armenian          hy    Indo-european (other)
Assamese          as    Indian
Aymara            ay    Amerindian
Azerbaijani       az    Turkic/altaic
Bashkir           ba    Turkic/altaic
Basque            eu    Basque
Bengali; Bangla   bn    Indian
Bhutani           dz    Asian
Bihari            bh    Indian
Bislama           bi    [not given]
Breton            br    Celtic
Bulgarian         bg    Slavic
Burmese           my    Asian
Byelorussian      be    Slavic
Cambodian         km    Asian
Catalan           ca    Romance
Chinese           zh    Asian
Corsican          co    Romance
Croatian          hr    Slavic
Czech             cs    Slavic
Danish            da    Germanic
Dutch             nl    Germanic
English           en    Germanic
Esperanto         eo    International aux.
Estonian          et    Finno-ugric
Faroese           fo    Germanic
Fiji              fj    Oceanic/indonesian
Finnish           fi    Finno-ugric
French            fr    Romance
Frisian           fy    Germanic
Galician          gl    Romance
Georgian          ka    Ibero-caucasian
German            de    Germanic
Greek             el    Latin/greek
Greenlandic       kl    Eskimo
Guarani           gn    Amerindian
Gujarati          gu    Indian
Hausa             ha    Negro-african
Hebrew            iw    Semitic
Hindi             hi    Indian
Hungarian         hu    Finno-ugric
Icelandic         is    Germanic
Indonesian        in    Oceanic/indonesian
Interlingua       ia    International aux.
Interlingue       ie    International aux.
Inupiak           ik    Eskimo
Irish             ga    Celtic
Italian           it    Romance
Japanese          ja    Asian
Javanese          jv    Oceanic/indonesian
Kannada           kn    Dravidian
Kashmiri          ks    Indian
Kazakh            kk    Turkic/altaic
Kinyarwanda       rw    Negro-african
Kirghiz           ky    Turkic/altaic
Kurundi           rn    Negro-african
Korean            ko    Asian
Kurdish           ku    Iranian
Laothian          lo    Asian
Latin             la    Latin/greek
Latvian; Lettish  lv    Baltic
Lingala           ln    Negro-african
Lithuanian        lt    Baltic
Macedonian        mk    Slavic
Malagasy          mg    Oceanic/indonesian
Malay             ms    Oceanic/indonesian
Malayalam         ml    Dravidian
Maltese           mt    Semitic
Maori             mi    Oceanic/indonesian
Marathi           mr    Indian
Moldavian         mo    Romance
Mongolian         mn    [not given]
Nauru             na    [not given]
Nepali            ne    Indian
Norwegian         no    Germanic
Occitan           oc    Romance
Oriya             or    Indian
Pashto; Pushto    ps    Iranian
Persian (Farsi)   fa    Iranian
Polish            pl    Slavic
Portuguese        pt    Romance
Punjabi           pa    Indian
Quechua           qu    Amerindian
Rhaeto-romance    rm    Romance
Romanian          ro    Romance
Russian           ru    Slavic
Samoan            sm    Oceanic/indonesian
Sangho            sg    Negro-african
Sanskrit          sa    Indian
Scots Gaelic      gd    Celtic
Serbian           sr    Slavic
Serbo-croatian    sh    Slavic
Sesotho           st    Negro-african
Setswana          tn    Negro-african
Shona             sn    Negro-african
Sindhi            sd    Indian
Singhalese        si    Indian
Siswati           ss    Negro-african
Slovak            sk    Slavic
Slovenian         sl    Slavic
Somali            so    Hamitic
Spanish           es    Romance
Sundanese         su    Oceanic/indonesian
Swahili           sw    Negro-african
Swedish           sv    Germanic
Tagalog           tl    Oceanic/indonesian
Tajik             tg    Iranian
Tamil             ta    Dravidian
Tatar             tt    Turkic/altaic
Telugu            te    Dravidian
Thai              th    Asian
Tibetan           bo    Asian
Tigrinya          ti    Semitic
Tonga             to    Oceanic/indonesian
Tsonga            ts    Negro-african
Turkish           tr    Turkic/altaic
Turkmen           tk    Turkic/altaic
Twi               tw    Negro-african
Ukrainian         uk    Slavic
Urdu              ur    Indian
Uzbek             uz    Turkic/altaic
Vietnamese        vi    Asian
Volapuk           vo    International aux.
Welsh             cy    Celtic
Wolof             wo    Negro-african
Xhosa             xh    Negro-african
Yiddish           ji    Germanic
Yoruba            yo    Negro-african
Zulu              zu    Negro-african

What about language variants, e.g., British English versus correct English? The standard way to handle variants is with suffixes, e.g., "zh-CN" and "zh-TW" for Chinese from China and Taiwan respectively, "en-GB" and "en-US" for UK and US English, "fr-CA" and "fr-FR" for Quebecois and French French. We think this is cumbersome and can't imagine anyone wanting to have templates named "foobar.fancy.en-US.adp". Our system doesn't require that the two-character coding be ISO-standard. A publisher who wished to serve British and American readers could use "gb" and "us", for example. Non-standard? Yes. But in my defence, let me note that if you've flown over to England in an aeroplane, gone out in a mackintosh with a brolly, and rotted your teeth on fairy cakes with coloured frosting, you probably have worse problems than non-standard file names.
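
If a site does receive suffixed variant tags from somewhere, one possible way to reduce them to two-character codes is sketched below. The procedure name and the override table are hypothetical, and the "gb"/"us" mapping simply follows the example in the paragraph above; nothing like this is claimed to exist in ad-style.tcl.

```tcl
# Hypothetical: map a variant tag such as "en-GB" to a two-character
# site code, falling back to the primary language for unlisted variants.
proc site_language_code {tag} {
    set tag [string tolower $tag]
    # Site-specific, non-ISO choices made by the publisher (an assumption).
    set overrides {en-gb gb en-us us}
    if {[dict exists $overrides $tag]} {
        return [dict get $overrides $tag]
    }
    # "zh-tw" -> "zh"; a plain "ja" is returned as-is.
    return [lindex [split $tag -] 0]
}

puts [site_language_code en-GB]
puts [site_language_code zh-TW]
puts [site_language_code ja]
```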

How we pick the right template

At the end of /foo/bar.tcl, release your database handle (good practice; this way other threads can reuse it while AOLserver is streaming bytes out to your client) and then call ad_return_template.

If you need to set a cookie, add it to the ns_conn outputheaders set before calling ad_return_template.

How does ad_return_template work? It goes up one Tcl level so that it can have access to all the local vars that bar.tcl might have set. Then it looks at the user's language and graphics preferences (from the users_preferences table defined in community-core.sql). Then it looks in the templates subtree of the file system to see what the closest matching template is (language preference overrides graphics preference).
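
The "closest match" lookup might be sketched as follows. This is not the code in ad-style.tcl: the file naming (base.style.language.adp, with styles "plain" and "fancy"), the procedure name, and the fallback order after the second step are all assumptions. The only rule taken from the text is that a language match beats a graphics match.

```tcl
# Hypothetical template chooser. "available" is a list of file names,
# e.g. the contents of one directory in the templates subtree.
proc pick_template {available base style lang} {
    set other_style [expr {$style eq "fancy" ? "plain" : "fancy"}]
    # 1. Exact match; 2. right language, other graphics style.
    foreach candidate [list "$base.$style.$lang.adp" "$base.$other_style.$lang.adp"] {
        if {$candidate in $available} { return $candidate }
    }
    # 3. Right graphics style in any language (assumed fallback).
    foreach f [lsort $available] {
        if {[string match "$base.$style.*.adp" $f]} { return $f }
    }
    # 4. Any template at all for this page (assumed fallback).
    foreach f [lsort $available] {
        if {[string match "$base.*.adp" $f]} { return $f }
    }
    return ""
}

set templates {bar.fancy.en.adp bar.plain.en.adp bar.fancy.ja.adp}
# A text-only Japanese reader gets the fancy Japanese page,
# not the plain English one.
puts [pick_template $templates bar plain ja]
```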

Note that ad_return_template returns headers and content bytes to the connection but does not terminate the thread. So you can do logging or other database activity following the service of the parsed ADP template to the user.

Standard Cookie Names

If you're supporting registered users, you'll be pulling graphics and language preferences from users_preferences. You might want to offer casual users a choice of languages or graphics complexity (see scorecard.org for an example). In this case, you need to use cookies to record what the user said he or she wanted.

It is tough to know how and where the publisher will want to present users with language and graphics options. But we can build standard Tcl API calls into /tcl/ad-style.tcl if we agree to standardize on cookie names. So let's agree on the same names as the columns in users_preferences: "prefer_text_only_p" (value "t" or "f") and "language_preference" (two-char lowercase code).

Note that the code in ad-style.tcl will only look for cookies if PlainFancyCookieP and LanguageCookieP parameters are turned on.
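
Reading the two standard cookies could look roughly like this. The sketch works on a raw Cookie header string so that it runs in a plain tclsh; the procedure names are hypothetical, and how ad-style.tcl actually obtains and checks the cookies is not shown in this document.

```tcl
# Hypothetical: return the value of one cookie from a Cookie header,
# or the empty string if it is absent.
proc cookie_value {cookie_header name} {
    foreach pair [split $cookie_header ";"] {
        set pair [string trim $pair]
        set eq [string first "=" $pair]
        if {$eq < 0} { continue }
        if {[string range $pair 0 [expr {$eq - 1}]] eq $name} {
            return [string range $pair [expr {$eq + 1}] end]
        }
    }
    return ""
}

# Hypothetical: extract the two preferences, discarding malformed values.
proc preferences_from_cookies {cookie_header} {
    set text_only [cookie_value $cookie_header prefer_text_only_p]
    set lang [cookie_value $cookie_header language_preference]
    # prefer_text_only_p must be "t" or "f".
    if {$text_only ni {t f}} { set text_only "" }
    # language_preference must be a two-character lowercase code.
    if {![regexp {^[a-z][a-z]$} $lang]} { set lang "" }
    return [list prefer_text_only_p $text_only language_preference $lang]
}

puts [preferences_from_cookies "foo=1; prefer_text_only_p=t; language_preference=ja"]
```

An empty value means "no usable cookie", in which case the caller would fall back to whatever default the site uses.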


philg@mit.edu