Index: openacs-4/packages/acs-core-docs/www/programming-with-aolserver.html
===================================================================
RCS file: /usr/local/cvsroot/openacs-4/packages/acs-core-docs/www/programming-with-aolserver.html,v
diff -u -N -r1.42 -r1.42.4.1
--- openacs-4/packages/acs-core-docs/www/programming-with-aolserver.html 17 Jul 2006 05:38:32 -0000 1.42
+++ openacs-4/packages/acs-core-docs/www/programming-with-aolserver.html 3 Feb 2008 12:07:40 -0000 1.42.4.1
@@ -1,65 +1,66 @@
-Programming with AOLserver

Programming with AOLserver

By Michael Yoon, Jon Salz and Lars Pind.

+ +Programming with AOLserver

Programming with AOLserver

By Michael Yoon, Jon Salz and Lars Pind.

OpenACS docs are written by the named authors, and may be edited by OpenACS documentation staff. -

The global command

+

The global command

When using AOLserver, remember that there are effectively two types of global namespace, not one:

  1. Server-global: As you'd expect, there is only one server-global namespace per server, and variables set within it can be accessed by any Tcl code running subsequently, in any of the server's -threads. To set/get server-global variables, use AOLserver 3's nsv API -(which supersedes ns_share from the pre-3.0 API). +threads. To set/get server-global variables, use AOLserver 3's nsv API +(which supersedes ns_share from the pre-3.0 API).

  2. Script-global: Each Tcl script (ADP, Tcl page, registered proc, filter, etc.) executing within an AOLserver thread has its own global namespace. Any variable set in the top level of a script is, by definition, script-global, meaning that it is accessible only by subsequent code in the same script and only for the duration of the current script execution.

-The Tcl built-in command global +The Tcl built-in command global accesses script-global, not server-global, variables from within a procedure. This distinction is important to understand in order to use -global correctly when programming AOLserver. +global correctly when programming AOLserver.

Also, AOLserver purges all script-global variables in a thread (i.e., Tcl interpreter) between HTTP requests. If it didn't, that would affect (and complicate) our use of script-global variables dramatically, which would then be better described as thread-global variables. Given -AOLserver's behaviour, however, "script-global" is a more -appropriate term.

Threads and Scheduled Procedures

-ns_schedule_proc and ad_schedule_proc each take a --thread flag to cause a scheduled procedure to run +AOLserver's behaviour, however, "script-global" is a more +appropriate term.

Threads and Scheduled Procedures

+ns_schedule_proc and ad_schedule_proc each take a +-thread flag to cause a scheduled procedure to run asynchronously, in its own thread. It almost always seems like a good idea to specify this switch, but there's a problem. -

It turns out that whenever a task scheduled with ns_schedule_proc --thread or ad_schedule_proc -thread t is run, AOLserver +

It turns out that whenever a task scheduled with ns_schedule_proc +-thread or ad_schedule_proc -thread t is run, AOLserver creates a brand new thread and a brand new interpreter, and reinitializes the procedure table (essentially, loads all procedures that were created during server initialization into the new interpreter). This happens every time the task is executed - and it is a very expensive process that should not be taken lightly!

The moral: if you have a lightweight scheduled procedure -which runs frequently, don't use the -thread +which runs frequently, don't use the -thread switch.

Note also that the thread is initialized with a copy of what was installed during server startup, so if the procedure table has changed since startup (e.g. using the APM watch facility), that will not be reflected in the scheduled -thread.

Using return

-The return command in Tcl returns control to the caller procedure. +thread.

Using return

+The return command in Tcl returns control to the caller procedure. This definition allows nested procedures to work properly. However, this -definition also means that nested procedures cannot use return to +definition also means that nested procedures cannot use return to end an entire thread. This situation is most common in exception conditions that can be triggered from inside a procedure e.g., a permission denied exception. At this point, the procedure that detects invalid permission wants to write an error message to the user, and completely abort execution of the -caller thread. return doesn't work, because the procedure may be -nested several levels deep. We therefore use ad_script_abort -to abort the remainder of the thread. Note that using return instead -of ad_script_abort may raise some security issues: an attacker could +caller thread. return doesn't work, because the procedure may be +nested several levels deep. We therefore use ad_script_abort +to abort the remainder of the thread. Note that using return instead +of ad_script_abort may raise some security issues: an attacker could call a page that performed some DML statement, pass in some arguments, and get a permission denied error -- but the DML statement would still be -executed because the thread was not stopped. Note that return -code -return can be used in circumstances where the procedure will only be +executed because the thread was not stopped. Note that return -code +return can be used in circumstances where the procedure will only be called from two levels deep. -
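The behaviour of return -code return can be sketched in plain Tcl. This is a minimal illustration with hypothetical procedure names (deny_access, page_handler); it does not use ad_script_abort, which is only available inside OpenACS.

```tcl
proc deny_access {} {
    # "return -code return" makes the *caller* of this proc return as well,
    # so it unwinds exactly one extra level of the call stack.
    return -code return "permission denied"
}

proc page_handler {} {
    deny_access
    # Never reached: deny_access forced page_handler to return.
    return "DML executed"
}

set result [page_handler]
puts $result
```

Here page_handler returns "permission denied" and the line after the check never runs. If deny_access were called from a deeper level of nesting, only its immediate caller would return and the outer code would keep executing, which is why ad_script_abort is the safer choice in the general case.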

Returning More Than One Value From a Function

-Many functions have a single return value. For instance, empty_string_p +

Returning More Than One Value From a Function

+Many functions have a single return value. For instance, empty_string_p returns a number: 1 or 0. Other functions need to return a composite value. For instance, consider a function that looks up a user's name and email address, given an ID. One way to implement this is to return a three-element
@@ -74,33 +75,33 @@

AOLserver/Tcl generally has three mechanisms that we like, for returning more than one value from a function. When to use which depends on the circumstances.

Using Arrays and Pass-By-Value

-The one we generally prefer is returning an array -get-formatted list. It has all the nice properties of +The one we generally prefer is returning an array +get-formatted list. It has all the nice properties of pass-by-value, and it uses Tcl arrays, which have good native support.

 ad_proc ad_get_user_info { user_id } {
     db_1row user_info { select first_names, last_name, email from users where user_id = :user_id }
     return [list \
-        name "$first_names $last_name" \
+        name "$first_names $last_name" \
     email $email \
-    namelink "<a href=\"/shared/community-member?user_id=[ns_urlencode $user_id]\">$first_names $last_name</a>" \
-    emaillink "<a href=\"mailto:$email\">$email</a>"]
+    namelink "<a href=\"/shared/community-member?user_id=[ns_urlencode $user_id]\">$first_names $last_name</a>" \
+    emaillink "<a href=\"mailto:$email\">$email</a>"]
 }
 
 array set user_info [ad_get_user_info $user_id]
 
-doc_body_append "$user_info(namelink) ($user_info(emaillink))"
+doc_body_append "$user_info(namelink) ($user_info(emaillink))"
 

You could also have done this by using an array internally and using -array get: +array get:

 
 ad_proc ad_get_user_info { user_id } {
     db_1row user_info { select first_names, last_name, email from users where user_id = :user_id }
-    set user_info(name) "$first_names $last_name"
+    set user_info(name) "$first_names $last_name"
     set user_info(email) $email
-    set user_info(namelink) "<a href=\"/shared/community-member?user_id=[ns_urlencode $user_id]\">$first_names $last_name</a>"
-    set user_info(emaillink) "<a href=\"mailto:$email\">$email</a>"
+    set user_info(namelink) "<a href=\"/shared/community-member?user_id=[ns_urlencode $user_id]\">$first_names $last_name</a>"
+    set user_info(emaillink) "<a href=\"mailto:$email\">$email</a>"
     return [array get user_info]
 }
 
@@ -116,7 +117,7 @@
 millisecond. The time depends almost completely on the number of entries, and
 almost not at all on the size of the entries.

You implement pass-by-reference in Tcl by taking the name of an array -as an argument and upvar it. +as an argument and upvar it.

 
 ad_proc ad_get_user_info { 
@@ -125,30 +126,30 @@
 } {
     upvar $array user_info
     db_1row user_info { select first_names, last_name, email from users where user_id = :user_id }
-    set user_info(name) "$first_names $last_name"
+    set user_info(name) "$first_names $last_name"
     set user_info(email) $email
-    set user_info(namelink) "<a href=\"/shared/community-member?user_id=[ns_urlencode $user_id]\">$first_names $last_name</a>"
-    set user_info(emaillink) "<a href=\"mailto:$email\">$email</a>"
+    set user_info(namelink) "<a href=\"/shared/community-member?user_id=[ns_urlencode $user_id]\">$first_names $last_name</a>"
+    set user_info(emaillink) "<a href=\"mailto:$email\">$email</a>"
 }
 
 ad_get_user_info -array user_info $user_id
 
-doc_body_append "$user_info(namelink) ($user_info(emaillink))"
+doc_body_append "$user_info(namelink) ($user_info(emaillink))"
 
 

We prefer pass-by-value over pass-by-reference. Pass-by-reference makes the code harder to read and debug, because changing a value in one place has side effects in other places. Especially if you have a chain of -upvars through several layers of the call stack, you'll have -a hard time debugging.

Multisets: Using ns_sets and Pass-By-Reference

+upvars through several layers of the call stack, you'll have +a hard time debugging.

Multisets: Using ns_sets and Pass-By-Reference

An array is a type of set, which means you can't have multiple entries with the same key. Data structures that can have multiple entries for the same key are known as multisets or bags.
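The set property of arrays can be seen in a small plain-Tcl sketch. The header names and values below are hypothetical and serve only as an illustration.

```tcl
# Two entries share the key Set-Cookie; an array keeps only the last one.
array set headers {Set-Cookie a=1 Set-Cookie b=2}

set n [array size headers]
set v $headers(Set-Cookie)
puts "$n entry, value $v"
```

The second assignment silently overwrites the first, so the array ends up with a single entry whose value is b=2. Data of this shape needs a multiset.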

If your data can have multiple entries with the same key, -you should use the AOLserver built-in -ns_set. You can also do a case-insensitive lookup on an -ns_set, something you can't easily do on an array. This is +you should use the AOLserver built-in +ns_set. You can also do a case-insensitive lookup on an +ns_set, something you can't easily do on an array. This is especially useful for things like HTTP headers, which happen to have these -exact properties.

You always use pass-by-reference with ns_sets, since they +exact properties.

You always use pass-by-reference with ns_sets, since they don't have any built-in way of generating and reconstructing themselves from a string representation. Instead, you pass the handle to the set.

 
@@ -157,34 +158,34 @@
     user_id
 } {
     db_1row user_info { select first_names, last_name, email from users where user_id = :user_id }
-    ns_set put $set name "$first_names $last_name"
+    ns_set put $set name "$first_names $last_name"
     ns_set put $set email $email
-    ns_set put $set namelink "<a href=\"/shared/community-member?user_id=[ns_urlencode $user_id]\">$first_names $last_name</a>"
-    ns_set put $set emaillink "<a href=\"mailto:$email\">$email</a>"
+    ns_set put $set namelink "<a href=\"/shared/community-member?user_id=[ns_urlencode $user_id]\">$first_names $last_name</a>"
+    ns_set put $set emaillink "<a href=\"mailto:$email\">$email</a>"
 }
 
 set user_info [ns_set create]
 ad_get_user_info -set $user_info $user_id
 
-doc_body_append "[ns_set get $user_info namelink] ([ns_set get $user_info emaillink])"
+doc_body_append "[ns_set get $user_info namelink] ([ns_set get $user_info emaillink])"
 
 

-We don't recommend ns_set as a general mechanism for passing +We don't recommend ns_set as a general mechanism for passing sets (as opposed to multisets) of data. Not only do they inherently use pass-by-reference, which we dislike, they're also somewhat clumsy to use, since Tcl doesn't have built-in syntactic support for them. -

Consider for example a loop over the entries in a ns_set as +

Consider for example a loop over the entries in a ns_set as compared to an array:

 
 # ns_set variant
 set size [ns_set size $myset]
 for { set i 0 } { $i < $size } { incr i } {
-    puts "[ns_set key $myset $i] = [ns_set value $myset $i]"
+    puts "[ns_set key $myset $i] = [ns_set value $myset $i]"
 }
 
 # array variant
 foreach name [array names myarray] {
-    puts "$name = $myarray($name)"
+    puts "$name = $myarray($name)"
 }
 
 

@@ -204,9 +205,9 @@ ]

-ns_sets are designed to be lightweight, so memory consumption -should not be a problem. However, when using ns_set get to +ns_sets are designed to be lightweight, so memory consumption +should not be a problem. However, when using ns_set get to perform lookup by name, they perform a linear lookup, whereas arrays use a -hash table, so ns_sets are slower than arrays when the number of +hash table, so ns_sets are slower than arrays when the number of entries is large.

($Id$)
View comments on this page at openacs.org