Index: openacs-4/packages/acs-core-docs/www/db-api.adp =================================================================== RCS file: /usr/local/cvsroot/openacs-4/packages/acs-core-docs/www/db-api.adp,v diff -u -N --- /dev/null 1 Jan 1970 00:00:00 -0000 +++ openacs-4/packages/acs-core-docs/www/db-api.adp 23 Sep 2015 11:54:26 -0000 1.1.2.1 @@ -0,0 +1,648 @@ + +{/doc/acs-core-docs {Documentation}} {The OpenACS Database Access API} +The OpenACS Database Access API + + +
+

+The OpenACS Database Access API

By Pete Su and Jon Salz. Modified by Roberto Mello.

+

+Overview

One of OpenACS's great strengths is that code written for it is +very close to the database. It is very easy to interact with the +database from anywhere within OpenACS, and we have a coherent API +for database access which makes this even easier.

More detailed information about the DB api is available at +Database Access API.

+
+

+DB API Examples

The OpenACS database API is meant to save developers from making +common mistakes and to provide a more structured syntax for +specifying database operations, including transactions. Here's an +example of the API.

+set count 0
+set tcl_var "foo"
+set sql {
+        SELECT foo, bar, baz
+       FROM some_table, some_other_table
+       WHERE some_table.id = some_other_table.id
+         and some_table.condition_p = :tcl_var
+}
+
+db_transaction {
+    db_foreach my_example_query_name $sql {
+        lappend rows [list $foo $bar $baz]
+        incr count
+    }
+    foreach row $rows { 
+        call_some_proc $foo $bar $baz
+    }
+}
+

There are several things to note here:

    +
  1. No explicit code for grabbing and releasing handles. Usage of +the Database API implicitly deals with all handle management +issues.

  2. The db_transaction command +makes the scope of a transaction clear; db_transaction takes the code block +argument and automatically runs it in the context of a transaction. +If you use something like db_foreach though, you need to make sure +that there are no calls in the code block which would take a second +db handle since the transaction is only valid for one handle (thats +why we build up a list of returned values and call a second proc +outside the db_foreach loop).

  3. The command db_foreach +writes our old while loop for us.

  4. Every SQL query has a name, which is used in conjunction with +.XQL files to support multiple databases.

  5. Finally and most importantly, there API implements bind +variables, which we will cover next.

  6. +
+
+

+Bind Variables

Bind variables are placeholders for literal values in an SQL +query being sent to the server. In the old way, data was generally +passed to directly to the DB backend, via Tcl string interpolation. +In the example above, the query would look like:

+select foo, bar, baz 
+from some_table, some_other_table
+where some_table.id=some_other_table.id  
+and some_table.condition_p = '$foo'
+

There are a few problems with this:

    +
  1. If the value of $foo is a huge string, then we waste a lot of +time in the database server doing useless parsing.

  2. Second, if the literal value contains characters like single +quotes, we have to be careful to properly escape them, because not +quoting them will lead to surprising errors.

  3. +

    Third, no type checking occurs on the literal value. Finally, if +the Tcl variable is passed in or between web forms or otherwise +subject to external modification, there is nothing keeping +malicious users from setting the Tcl variable to some string that +changes the query textually. This type of attack, called +SQL smuggling, can be very +damaging - entire tables can be exposed or have their contents +deleted, for example.

    Another very important reason for using bind variables is +performance. Oracle can cache previously parsed queries. If there +are values in the where clause, that is how the query is cached. It +also performs bind variable susbstitution after parsing the SQL +statement. This means that SQL statements that use bind variables +will always match (assuming all else is the same) while SQL +statements that do not use bind variables will not match unless the +values in the statement are exactly the same. This will improve the +query cache considerably, which can make the server much more +efficient.

    +
  4. +

What the DB API (in conjuntion with the database drivers +implemented for aolserver) do is send the SQL statement to the +server for parsing, then bind values to the variables and sends +those values along seperately as a second step. This seperate +binding step is where the term bind +variable comes from.

This split has several advantages. First, type checking happens +on the literal. If the column we are comparing against holds +numbers, and we send a string, we get a nice error. Second, since +string literals are no longer in the query, no extra quoting is +required. Third, substitution of bind variables cannot change the +actual text of the query, only the literal values in the +placeholders. The database API makes bind variables easy to use by +hooking them smoothly into the Tcl runtime so you simply provide +:tclvar and the value of $tclvar is sent to the backend to actually +execute the query.

The database API parses the query and pulls out all the bind +variable specifications and replaces them with generic +placeholders. It then automatically pulls the values of the named +Tcl vars out of the runtime environment of the script, and passes +them to the database.

Note that while this looks like a simple syntactic change, it +really is very different from how interpolated text queries work. +You use bind variables to replace what would otherwise be a literal +value in a query, and Tcl style string interpolation does not +happen. So you cannot do something like:

+set table "baz"
+set condition "where foo = bar"
+
+db_foreach my_query { select :table from some_table where :condition }
+    
+

SQL will not allow a literal to occur where we've put the bind +variables, so the query is syntactically incorrect. You have to +remember that while the bind variable syntax looks similar to +variable interpolation in Tcl, It is not +the same thing at all.

Finally, the DB API has several different styles for passing +bind variable values to queries. In general, use the style +presented here because it is the most convenient.

+

+Usage

Every db_* command accepting +a SQL command as an argument supports bind variables. You can +either

    +
  • Specify the -bind switch to +provide a set with bind variable values, or

  • Specify the -bind switch to +explicitly provide a list of bind variable names and values, or

  • Not specify a bind variable list at all, in which case Tcl +variables are used as bind variables.

  • +

The default behavior (i.e., if the -bind switch is omitted) is that these +procedures expect to find local variables that correspond in name +to the referenced bind variables, e.g.:

+
+set user_id 123456
+set role "administrator"
+
+db_foreach user_group_memberships_by_role {
+    select g.group_id, g.group_name
+    from user_groups g, user_group_map map
+    where g.group_id = map.user_id
+    and map.user_id = :user_id
+    and map.role = :role
+} {
+    # do something for each group of which user 123456 is in the role
+    # of "administrator"
+}
+
+      
+

The value of the local Tcl variable user_id (123456) is bound to the +user_id bind variable.

The -bind switch can takes +the name of an ns_set +containing keys for each bind variable named in the query, +e.g.:

+
+set bind_vars [ns_set create]
+ns_set put $bind_vars user_id 123456
+ns_set put $bind_vars role "administrator"
+
+db_foreach user_group_memberships_by_role {
+    select g.group_id, g.group_name
+    from user_groups g, user_group_map map
+    where g.group_id = map.user_id
+    and map.user_id = :user_id
+    and map.role = :role
+} -bind $bind_vars {
+    # do something for each group in which user 123456 has the role
+    # of "administrator"
+}
+
+      
+

Alternatively, as an argument to -bind you can specify a list of alternating +name/value pairs for bind variables:

+        
+db_foreach user_group_memberships_by_role {
+    select g.group_id, g.group_name
+    from user_groups g, user_group_map map
+    where g.group_id = map.user_id
+    and map.user_id = :user_id
+    and map.role = :role
+} -bind [list user_id 123456 role "administrator"] {
+    # do something for each group in which user 123456 has the role
+    # of "administrator"
+}
+
+      
+
+
+

+Nulls and Bind Variables

When processing a DML statement, Oracle coerces empty strings +into null. (This coercion does +not occur in the +WHERE clause of a query, i.e. +col = '' and col is null are not equivalent.)

As a result, when using bind variables, the only way to make +Oracle set a column value to null is to set the corresponding bind +variable to the empty string, since a bind variable whose value is +the string "null" will be interpreted as the literal string +"null".

These Oracle quirks complicate the process of writing clear and +abstract DML difficult. Here is an example that illustrates +why:

+
+#
+# Given the table:
+#
+#   create table foo (
+#           bar        integer,
+#           baz        varchar(10)
+#   );
+#
+
+set bar ""
+set baz ""
+
+db_dml foo_create "insert into foo(bar, baz) values(:bar, :baz)"
+#
+# the values of the "bar" and "baz" columns in the new row are both
+# null, because Oracle has coerced the empty string (even for the
+# numeric column "bar") into null in both cases
+
+      
+

Since databases other than Oracle do not coerce empty strings +into null, this code has +different semantics depending on the underlying database (i.e., the +row that gets inserted may not have null as its column values), +which defeats the purpose of SQL abstraction.

Therefore, the Database Access API provides a +database-independent way to represent null (instead of the Oracle-specific idiom +of the empty string): db_null.

Use it instead of the empty string whenever you want to set a +column value explicitly to null, e.g.:

+set bar [db_null]
+set baz [db_null]
+
+db_dml foo_create "insert into foo(bar, baz) values(:bar, :baz)"
+#
+# sets the values for both the "bar" and "baz" columns to null
+
+
+
+

+Sequence Pooling

The database library can transparently maintain pools of +sequence values, so that each request for a new sequence value +(using db_nextval) does not +incur a roundtrip to the server. For instance, this functionality +is very useful in the security/sessions library, which very +frequently allocates values from the sec_id_seq sequence. To utilize this +functionality for a particular sequence, register the sequence to +be pooled, either using the db_register_pooled_sequence procedure at +server startup time, or by including a configuration parameter of +the form

+
+PoolSequence.sequence_name_seq=count
+

in any configuration +section in the yourservername.ini file, e.g.,

+
+[ns/server/yourservername/acs/security]
+PoolSequence.sec_id_seq=20
+
+    
+

The database library will allocate this number of sequence +values at server startup. It will periodically scan pools and +allocate new values for sequences which are less than half-full. +(This normally occurs every 60 seconds, and is configurable via the +PooledSequenceUpdateInterval +parameter in the [ns/server/yourservername/acs/database] configuration section.)

+
+

+Basic API

The Database API has several functions that wrap familiar parts +of the AOLserver database API.

Note that you never have to use ns_db anymore (including ns_db gethandle)! Just start doing stuff, +and (if you want) call db_release_unused_handles when you're done +as a hint to release the database handle.

+
+db_abort_transaction
+
+db_abort_transaction
+          
+

Aborts all levels of a transaction. That is if this is called +within several nested transactions, all of them are terminated. Use +this insetead of db_dml "abort" "abort +transaction".

+
+db_multirow
+
+db_multirow [ -local ] [ -append ] [ -extend column_list ] \
+    var-namestatement-namesql \
+    [ -bind bind_set_id | -bind bind_value_list ] \
+    code_block [ if_no_rows if_no_rows_block ]
+

Performs the SQL query sql, +saving results in variables of the form +var_name:1, +var_name:2, etc, setting + +var_name:rowcount to the total +number of rows, and setting +var_name:columns to a list of +column names.

Each row also has a column, rownum, automatically added and set +to the row number, starting with 1. Note that this will override +any column in the SQL statement named 'rownum', also if you're +using the Oracle rownum pseudo-column.

If the -local is passed, the +variables defined by db_multirow will be set locally (useful if +you're compiling dynamic templates in a function or similar +situations).

You may supply a code block, which will be executed for each row +in the loop. This is very useful if you need to make computations +that are better done in Tcl than in SQL, for example using +ns_urlencode or ad_quotehtml, etc. When the Tcl code is executed, +all the columns from the SQL query will be set as local variables +in that code. Any changes made to these local variables will be +copied back into the multirow.

You may also add additional, computed columns to the multirow, +using the -extend { col_1col_2 ... } switch. This is +useful for things like constructing a URL for the object retrieved +by the query.

If you're constructing your multirow through multiple queries +with the same set of columns, but with different rows, you can use +the -append switch. This causes +the rows returned by this query to be appended to the rows already +in the multirow, instead of starting a clean multirow, as is the +normal behavior. The columns must match the columns in the original +multirow, or an error will be thrown.

Your code block may call continue in order to skip a row and not +include it in the multirow. Or you can call break to skip this row and quit +looping.

Notice the nonstandard numbering (everything else in Tcl starts +at 0); the reason is that the graphics designer, a non programmer, +may wish to work with row numbers.

Example:

+db_multirow -extend { user_url } users users_query {
+    select user_id first_names, last_name, email from cc_users
+} {
+    set user_url [acs_community_member_url -user_id $user_id]
+}
+    
+

You can also iterate over a multirow after it has been created - +check the documentation for template::multirow

For example,

+db_multirow assets assets {
+  select asset_id,
+    from ...
+}
+
+..
+
+set asset_id_l [list]
+multirow foreach assets {
+  lappend asset_id_l $asset_id
+}
+          
+

Technically it's equivalent to using a code block on the end of +your db_multirow.

+
+db_null
+
db_null

Returns a value which can be used in a bind variable to +represent the SQL value null. +See Nulls and Bind Variables +above.

+
+db_foreach
+
+db_foreach statement-name sql [ -bind bind_set_id | -bind bind_value_list ] \
+    [ -column_array array_name | -column_set set_name ] \
+    code_block [ if_no_rows if_no_rows_block ]
+

Performs the SQL query sql , executing code_block once for each row +with variables set to column values (or a set or array populated if +-column_array or column_set is specified). If the query +returns no rows, executes if_no_rows_block (if +provided).

Example:

+
+db_foreach select_foo "select foo, bar from greeble" {
+    doc_body_append "<li>foo=$foo; bar=$bar\n"
+} if_no_rows {
+    doc_body_append "<li>There are no greebles in the database.\n"
+}
+
+          
+

The code block may contain break statements (which terminate the loop +and flush the database handle) and continue statements (which continue to the +next row of the loop).

+
+db_1row
+
+db_1row statement-namesql [ -bind bind_set_id | -bind bind_value_list ] \
+    [ -column_array array_name | -column_set set_name ]
+          
+

Performs the SQL query sql, setting variables to +column values. Raises an error if the query does not return exactly +1 row.

Example:

+
+db_1row select_foo "select foo, bar from greeble where greeble_id = $greeble_id"
+# Bombs if there's no such greeble!
+# Now $foo and $bar are set.
+
+          
+
+
+db_0or1row
+
+db_0or1row statement-namesql [ -bind bind_set_id | -bind bind_value_list ] \
+    [ -column_array array_name | -column_set set_name ]
+          
+

Performs the SQL query sql. If a row is returned, sets +variables to column values and returns 1. If no rows are returned, +returns 0. If more than one row is returned, throws an error.

+
+db_nextval
+
+db_nextval sequence-name
+

Returns the next value for the sequence sequence-name (using a SQL statement +like SELECTsequence-name.nextval FROM DUAL). If sequence pooling is +enabled for the sequence, transparently uses a value from the pool +if available to save a round-trip to the database (see Sequence Pooling).

+
+db_register_pooled_sequence
+
+db_register_pooled_sequence sequence-namepool-size
+

Registers the sequence sequence-name to be pooled, with a pool +size of pool-size sequence +values (see Sequence +Pooling).

+
+db_string
+
+db_string statement-namesql [ -default default ] [ -bind bind_set_id | -bind bind_value_list ]
+          
+

Returns the first column of the result of SQL query sql. +If sql doesn't return a row, +returns default (or throws an error if +default is unspecified). +Analogous to database_to_tcl_string and database_to_tcl_string_or_null.

+
+db_list
+
+db_list statement-namesql [ -bind bind_set_id | -bind bind_value_list ]
+          
+

Returns a Tcl list of the values in the first column of the +result of SQL query sql. If sql +doesn't return any rows, returns an empty list. Analogous to +database_to_tcl_list.

+
+db_list_of_lists
+
+db_list_of_lists statement-namesql [ -bind bind_set_id | -bind bind_value_list ]
+          
+

Returns a Tcl list, each element of which is a list of all +column values in a row of the result of SQL query sql. +If sql doesn't return any rows, +returns an empty list. (Analogous to database_to_tcl_list_list.)

+
+db_dml
+
+db_dml statement-namesql \
+    [ -bind bind_set_id | -bind bind_value_list ] \
+    [ -blobs blob_list | -clobs clob_list |
+      -blob_files blob_file_list | -clob_files clob_file_list ]
+          
+

Performs the DML or DDL statement sql.

If a length-n list of +blobs or clobs is provided, then the SQL should return n blobs or clobs into the bind variables +:1, :2, ... :n. +blobs or clobs, if specified, should be +a list of individual BLOBs or CLOBs to insert; blob_files or clob_files, if specified, +should be a list of paths to +files containing the data to insert. Only one of +-blobs, -clobs, -blob_files, and -clob_files may be provided.

Example:

+
+db_dml insert_photos "
+        insert photos(photo_id, image, thumbnail_image)
+        values(photo_id_seq.nextval, empty_blob(), empty_blob())
+        returning image, thumbnail_image into :1, :2
+    "  -blob_files [list "/var/tmp/the_photo" "/var/tmp/the_thumbnail"] 
+
+          
+

This inserts a new row into the photos table, with the contents of the +files /var/tmp/the_photo and +/var/tmp/the_thumbnail in the +image and thumbnail columns, respectively.

+
+ +db_write_clob, + +db_write_blob, + +db_blob_get_file +
+
+db_write_clob statement-namesql [ -bind bind_set_id | -bind bind_value_list ]
+
+db_write_blob statement-namesql [ -bind bind_set_id | -bind bind_value_list ]
+
+db_blob_get_file statement-namesql [ -bind bind_set_id | -bind bind_value_list ]
+          
+

Analagous to ns_ora +write_clob/write_blob/blob_get_file.

+
+db_release_unused_handles
+
+            db_release_unused_handles
+          
+

Releases any allocated, unused database handles.

+
+db_transaction
+
+db_transaction code_block [ on_error { code_block } ]
+          
+

Executes code_block transactionally. +Nested transactions are supported (end +transaction is transparently ns_db dml'ed when the outermost transaction +completes). The db_abort_transaction command can be used to +abort all levels of transactions. It is possible to specify an +optional on_error code block +that will be executed if some code in code_block throws an exception. The +variable errmsg will be bound +in that scope. If there is no on_error code, any errors will be +propagated.

Example:

+
+proc replace_the_foo { col } {
+    db_transaction {
+        db_dml "delete from foo"
+        db_dml "insert into foo(col) values($col)"
+    }
+}
+
+proc print_the_foo {} {
+    doc_body_append "foo is [db_string "select col from foo"]<br>\n"
+}
+
+replace_the_foo 8
+print_the_foo ; # Writes out "foo is 8"
+
+db_transaction {
+    replace_the_foo 14
+    print_the_foo ; # Writes out "foo is 14"
+    db_dml "insert into some_other_table(col) values(999)"
+    ...
+    db_abort_transaction
+} on_error {
+    doc_body_append "Error in transaction: $errmsg"
+}
+    
+
+print_the_foo ; # Writes out "foo is 8"
+
+          
+
+
+db_resultrows
+
+db_resultrows
+          
+

Returns the number of rows affected or returned by the previous +statement.

+
+db_with_handle
+
+db_with_handle varcode_block
+

Places a database handle into the variable var +and executes code_block. This is useful when +you don't want to have to use the new API (db_foreach, db_1row, etc.), but need to use database +handles explicitly.

Example:

+
+proc lookup_the_foo { foo } {
+    db_with_handle db {
+        return [db_string unused "select ..."]
+    }
+}
+
+db_with_handle db {
+    # Now there's a database handle in $db.
+    set selection [ns_db select $db "select foo from bar"]
+    while { [ns_db getrow $db $selection] } {
+        set_variables_after_query
+
+        lookup_the_foo $foo
+    }
+}
+
+          
+
+
+db_nullify_empty_string
+
+db_nullify_empty_string string
+

For true SQL purists, we provide the convenience function +db_nullify_empty_string, which +returns [db_null] if its string argument is the empty +string and can be used to encapsulate another Oracle quirk:

+
+set baz ""
+
+# Clean out the foo table
+#
+db_dml unused "delete from foo"
+
+db_dml unused "insert into foo(baz) values('$baz')"
+
+set n_rows [db_string unused "select count(*) from foo where baz is null"]
+#
+# $n_rows is 1; in effect, the "baz is null" criterion is matching
+# the empty string we just inserted (because of Oracle's coercion
+# quirk)
+
+          
+

To balance out this asymmetry, you can explicitly set +baz to null by writing:

+
+db_dml foo_insert "insert into foo(baz) values(:1)" {[db_nullify_empty_string $baz]}
+
+          
+
+
+
($Id: db-api.xml,v 1.13 2009/07/12 01:08:30 +donb Exp $)
+
+

+Caching Database API Results

The database API allows for direct caching of query results. +Repeated calls will return the cached value until it is either +explicitly flushed using db_flush_cache, times out (configured the +ns_cache is called to create the cache), or another cached query +fills the cache, causing older entries to be flushed.

Values returned by a query are cached if you pass the +"-cache_key" switch to the database procedure. The switch value +will be used as the key in the ns_cache eval call used to execute +the query and processing code. The db_flush proc should be called +to flush the cache when appropriate. The "-cache_pool" parameter +can be used to specify the cache pool to be used, and defaults to +db_cache_pool. The size of the default cache is governed by the +kernel parameter "DBCacheSize" in the "caching" section.

Currently db_string, db_list, db_list_of_lists, db_1row, +db_0or1row, and db_multirow support caching.

For caching to be effective, one must carefully design a +cache_pool and cache_key strategy that uniquely identifies a query +within the system, including the relevant objects being referenced +by the query. Typically a cache_key should include one or more +object_ids and a name that identifies the operation being done.

Here is an example from the layout-manager package:

+
+# Query to return the elements of a page as a list. The prefix "page_" is used to denote
+# that this is a page-related query, page_id is used to uniquely identify the query
+# by object, and the suffix uniquely defines the operation being performed on the
+# page object.
+
+db_list -cache_key page_${page_id}_get_elements get_elements {}
+
+# When the contents of a page are changed, we flush all page-related queries for the given
+# page object using db_flush_cache.
+
+db_flush_cache -cache_key_pattern page_${page_id}_*
+
+    
+
+
+
+ +
View comments +on this page at openacs.org