By Pete Su and Jon Salz. Modified by Roberto Mello.
One of OpenACS's great strengths is that code written for it is +very close to the database. It is very easy to interact with the +database from anywhere within OpenACS, and we have a coherent API +for database access which makes this even easier.
More detailed information about the DB api is available at +Database Access API.
+The OpenACS database API is meant to save developers from making +common mistakes and to provide a more structured syntax for +specifying database operations, including transactions. Here's an +example of the API.
+set count 0 +set tcl_var "foo" +set sql { + SELECT foo, bar, baz + FROM some_table, some_other_table + WHERE some_table.id = some_other_table.id + and some_table.condition_p = :tcl_var +} + +db_transaction { + db_foreach my_example_query_name $sql { + lappend rows [list $foo $bar $baz] + incr count + } + foreach row $rows { + call_some_proc $foo $bar $baz + } +} +
There are several things to note here:
No explicit code for grabbing and releasing handles. Usage of +the Database API implicitly deals with all handle management +issues.
The db_transaction
command
+makes the scope of a transaction clear; db_transaction
takes the code block
+argument and automatically runs it in the context of a transaction.
+If you use something like db_foreach though, you need to make sure
+that there are no calls in the code block which would take a second
+db handle since the transaction is only valid for one handle (thats
+why we build up a list of returned values and call a second proc
+outside the db_foreach loop).
The command db_foreach
+writes our old while loop for us.
Every SQL query has a name, which is used in conjunction with +.XQL files to support multiple databases.
Finally and most importantly, there API implements bind +variables, which we will cover next.
Bind variables are placeholders for literal values in an SQL +query being sent to the server. In the old way, data was generally +passed to directly to the DB backend, via Tcl string interpolation. +In the example above, the query would look like:
+select foo, bar, baz +from some_table, some_other_table +where some_table.id=some_other_table.id +and some_table.condition_p = '$foo' +
There are a few problems with this:
If the value of $foo is a huge string, then we waste a lot of +time in the database server doing useless parsing.
Second, if the literal value contains characters like single +quotes, we have to be careful to properly escape them, because not +quoting them will lead to surprising errors.
Third, no type checking occurs on the literal value. Finally, if +the Tcl variable is passed in or between web forms or otherwise +subject to external modification, there is nothing keeping +malicious users from setting the Tcl variable to some string that +changes the query textually. This type of attack, called +SQL smuggling, can be very +damaging - entire tables can be exposed or have their contents +deleted, for example.
Another very important reason for using bind variables is +performance. Oracle can cache previously parsed queries. If there +are values in the where clause, that is how the query is cached. It +also performs bind variable susbstitution after parsing the SQL +statement. This means that SQL statements that use bind variables +will always match (assuming all else is the same) while SQL +statements that do not use bind variables will not match unless the +values in the statement are exactly the same. This will improve the +query cache considerably, which can make the server much more +efficient.
+What the DB API (in conjuntion with the database drivers +implemented for aolserver) do is send the SQL statement to the +server for parsing, then bind values to the variables and sends +those values along seperately as a second step. This seperate +binding step is where the term bind +variable comes from.
This split has several advantages. First, type checking happens +on the literal. If the column we are comparing against holds +numbers, and we send a string, we get a nice error. Second, since +string literals are no longer in the query, no extra quoting is +required. Third, substitution of bind variables cannot change the +actual text of the query, only the literal values in the +placeholders. The database API makes bind variables easy to use by +hooking them smoothly into the Tcl runtime so you simply provide +:tclvar and the value of $tclvar is sent to the backend to actually +execute the query.
The database API parses the query and pulls out all the bind +variable specifications and replaces them with generic +placeholders. It then automatically pulls the values of the named +Tcl vars out of the runtime environment of the script, and passes +them to the database.
Note that while this looks like a simple syntactic change, it +really is very different from how interpolated text queries work. +You use bind variables to replace what would otherwise be a literal +value in a query, and Tcl style string interpolation does not +happen. So you cannot do something like:
+set table "baz" +set condition "where foo = bar" + +db_foreach my_query { select :table from some_table where :condition } + +
SQL will not allow a literal to occur where we've put the bind +variables, so the query is syntactically incorrect. You have to +remember that while the bind variable syntax looks similar to +variable interpolation in Tcl, It is not +the same thing at all.
Finally, the DB API has several different styles for passing +bind variable values to queries. In general, use the style +presented here because it is the most convenient.
Every db_*
command accepting
+a SQL command as an argument supports bind variables. You can
+either
Specify the -bind
switch to
+provide a set with bind variable values, or
Specify the -bind
switch to
+explicitly provide a list of bind variable names and values, or
Not specify a bind variable list at all, in which case Tcl +variables are used as bind variables.
The default behavior (i.e., if the -bind
switch is omitted) is that these
+procedures expect to find local variables that correspond in name
+to the referenced bind variables, e.g.:
+ +set user_id 123456 +set role "administrator" + +db_foreach user_group_memberships_by_role { + select g.group_id, g.group_name + from user_groups g, user_group_map map + where g.group_id = map.user_id + and map.user_id = :user_id + and map.role = :role +} { + # do something for each group of which user 123456 is in the role + # of "administrator" +} + + +
The value of the local Tcl variable user_id
(123456) is bound to the
+user_id
bind variable.
The -bind
switch can takes
+the name of an ns_set
+containing keys for each bind variable named in the query,
+e.g.:
+ +set bind_vars [ns_set create] +ns_set put $bind_vars user_id 123456 +ns_set put $bind_vars role "administrator" + +db_foreach user_group_memberships_by_role { + select g.group_id, g.group_name + from user_groups g, user_group_map map + where g.group_id = map.user_id + and map.user_id = :user_id + and map.role = :role +} -bind $bind_vars { + # do something for each group in which user 123456 has the role + # of "administrator" +} + + +
Alternatively, as an argument to -bind
you can specify a list of alternating
+name/value pairs for bind variables:
+ +db_foreach user_group_memberships_by_role { + select g.group_id, g.group_name + from user_groups g, user_group_map map + where g.group_id = map.user_id + and map.user_id = :user_id + and map.role = :role +} -bind [list user_id 123456 role "administrator"] { + # do something for each group in which user 123456 has the role + # of "administrator" +} + + ++
When processing a DML statement, Oracle coerces empty strings
+into null
. (This coercion does
+not occur in the
+WHERE
clause of a query, i.e.
+col = ''
and col is null
are not equivalent.)
As a result, when using bind variables, the only way to make
+Oracle set a column value to null
is to set the corresponding bind
+variable to the empty string, since a bind variable whose value is
+the string "null" will be interpreted as the literal string
+"null".
These Oracle quirks complicate the process of writing clear and +abstract DML difficult. Here is an example that illustrates +why:
+ +# +# Given the table: +# +# create table foo ( +# bar integer, +# baz varchar(10) +# ); +# + +set bar "" +set baz "" + +db_dml foo_create "insert into foo(bar, baz) values(:bar, :baz)" +# +# the values of the "bar" and "baz" columns in the new row are both +# null, because Oracle has coerced the empty string (even for the +# numeric column "bar") into null in both cases + + +
Since databases other than Oracle do not coerce empty strings
+into null
, this code has
+different semantics depending on the underlying database (i.e., the
+row that gets inserted may not have null as its column values),
+which defeats the purpose of SQL abstraction.
Therefore, the Database Access API provides a
+database-independent way to represent null
(instead of the Oracle-specific idiom
+of the empty string): db_null
.
Use it instead of the empty string whenever you want to set a
+column value explicitly to null
, e.g.:
+set bar [db_null] +set baz [db_null] + +db_dml foo_create "insert into foo(bar, baz) values(:bar, :baz)" +# +# sets the values for both the "bar" and "baz" columns to null ++
The database library can transparently maintain pools of
+sequence values, so that each request for a new sequence value
+(using db_nextval
) does not
+incur a roundtrip to the server. For instance, this functionality
+is very useful in the security/sessions library, which very
+frequently allocates values from the sec_id_seq
sequence. To utilize this
+functionality for a particular sequence, register the sequence to
+be pooled, either using the db_register_pooled_sequence
procedure at
+server startup time, or by including a configuration parameter of
+the form
+ +PoolSequence.sequence_name_seq=count +
in any configuration
+section in the yourservername.ini
file, e.g.,
+
+[ns/server/yourservername/acs/security]
+PoolSequence.sec_id_seq=20
+
+
+
The database library will allocate this number of sequence
+values at server startup. It will periodically scan pools and
+allocate new values for sequences which are less than half-full.
+(This normally occurs every 60 seconds, and is configurable via the
+PooledSequenceUpdateInterval
+parameter in the [ns/server/
yourservername
/acs/database]
configuration section.)
The Database API has several functions that wrap familiar parts +of the AOLserver database API.
Note that you never have to use ns_db
anymore (including ns_db gethandle
)! Just start doing stuff,
+and (if you want) call db_release_unused_handles
when you're done
+as a hint to release the database handle.
+db_abort_transaction
+db_abort_transaction + +
Aborts all levels of a transaction. That is if this is called
+within several nested transactions, all of them are terminated. Use
+this insetead of db_dml "abort" "abort
+transaction"
.
+db_multirow
+db_multirow [ -local ] [ -append ] [ -extend column_list ] \ + var-namestatement-namesql \ + [ -bind bind_set_id | -bind bind_value_list ] \ + code_block [ if_no_rows if_no_rows_block ] +
Performs the SQL query sql
,
+saving results in variables of the form
+var_name:1
,
+var_name:2
, etc, setting
+
+var_name:rowcount
to the total
+number of rows, and setting
+var_name:columns
to a list of
+column names.
Each row also has a column, rownum, automatically added and set +to the row number, starting with 1. Note that this will override +any column in the SQL statement named 'rownum', also if you're +using the Oracle rownum pseudo-column.
If the -local
is passed, the
+variables defined by db_multirow will be set locally (useful if
+you're compiling dynamic templates in a function or similar
+situations).
You may supply a code block, which will be executed for each row +in the loop. This is very useful if you need to make computations +that are better done in Tcl than in SQL, for example using +ns_urlencode or ad_quotehtml, etc. When the Tcl code is executed, +all the columns from the SQL query will be set as local variables +in that code. Any changes made to these local variables will be +copied back into the multirow.
You may also add additional, computed columns to the multirow,
+using the -extend { col_1col_2 ... }
switch. This is
+useful for things like constructing a URL for the object retrieved
+by the query.
If you're constructing your multirow through multiple queries
+with the same set of columns, but with different rows, you can use
+the -append
switch. This causes
+the rows returned by this query to be appended to the rows already
+in the multirow, instead of starting a clean multirow, as is the
+normal behavior. The columns must match the columns in the original
+multirow, or an error will be thrown.
Your code block may call continue
in order to skip a row and not
+include it in the multirow. Or you can call break
to skip this row and quit
+looping.
Notice the nonstandard numbering (everything else in Tcl starts +at 0); the reason is that the graphics designer, a non programmer, +may wish to work with row numbers.
Example:
+db_multirow -extend { user_url } users users_query { + select user_id first_names, last_name, email from cc_users +} { + set user_url [acs_community_member_url -user_id $user_id] +} + +
You can also iterate over a multirow after it has been created - +check the documentation for template::multirow
For example,
+db_multirow assets assets { + select asset_id, + from ... +} + +.. + +set asset_id_l [list] +multirow foreach assets { + lappend asset_id_l $asset_id +} + +
Technically it's equivalent to using a code block on the end of +your db_multirow.
+
+db_null
db_null
Returns a value which can be used in a bind variable to
+represent the SQL value null
.
+See Nulls and Bind Variables
+above.
+db_foreach
+db_foreach statement-name sql [ -bind bind_set_id | -bind bind_value_list ] \ + [ -column_array array_name | -column_set set_name ] \ + code_block [ if_no_rows if_no_rows_block ] +
Performs the SQL query sql
, executing code_block
once for each row
+with variables set to column values (or a set or array populated if
+-column_array
or column_set
is specified). If the query
+returns no rows, executes if_no_rows_block
(if
+provided).
Example:
+ +db_foreach select_foo "select foo, bar from greeble" { + doc_body_append "<li>foo=$foo; bar=$bar\n" +} if_no_rows { + doc_body_append "<li>There are no greebles in the database.\n" +} + + +
The code block may contain break
statements (which terminate the loop
+and flush the database handle) and continue
statements (which continue to the
+next row of the loop).
+db_1row
+db_1row statement-namesql [ -bind bind_set_id | -bind bind_value_list ] \ + [ -column_array array_name | -column_set set_name ] + +
Performs the SQL query sql
, setting variables to
+column values. Raises an error if the query does not return exactly
+1 row.
Example:
+ +db_1row select_foo "select foo, bar from greeble where greeble_id = $greeble_id" +# Bombs if there's no such greeble! +# Now $foo and $bar are set. + + ++
+db_0or1row
+db_0or1row statement-namesql [ -bind bind_set_id | -bind bind_value_list ] \ + [ -column_array array_name | -column_set set_name ] + +
Performs the SQL query sql
. If a row is returned, sets
+variables to column values and returns 1. If no rows are returned,
+returns 0. If more than one row is returned, throws an error.
+db_nextval
+db_nextval sequence-name
+
Returns the next value for the sequence sequence-name (using a SQL statement
+like SELECT
sequence-name
.nextval FROM DUAL
). If sequence pooling is
+enabled for the sequence, transparently uses a value from the pool
+if available to save a round-trip to the database (see Sequence Pooling).
+db_register_pooled_sequence
+db_register_pooled_sequence sequence-namepool-size +
Registers the sequence sequence-name to be pooled, with a pool +size of pool-size sequence +values (see Sequence +Pooling).
+
+db_string
+db_string statement-namesql [ -default default ] [ -bind bind_set_id | -bind bind_value_list ] + +
Returns the first column of the result of SQL query sql
.
+If sql
doesn't return a row,
+returns default
(or throws an error if
+default
is unspecified).
+Analogous to database_to_tcl_string
and database_to_tcl_string_or_null
.
+db_list
+db_list statement-namesql [ -bind bind_set_id | -bind bind_value_list ] + +
Returns a Tcl list of the values in the first column of the
+result of SQL query sql
. If sql
+doesn't return any rows, returns an empty list. Analogous to
+database_to_tcl_list
.
+db_list_of_lists
+db_list_of_lists statement-namesql [ -bind bind_set_id | -bind bind_value_list ] + +
Returns a Tcl list, each element of which is a list of all
+column values in a row of the result of SQL query sql
.
+If sql
doesn't return any rows,
+returns an empty list. (Analogous to database_to_tcl_list_list
.)
+db_dml
+db_dml statement-namesql \ + [ -bind bind_set_id | -bind bind_value_list ] \ + [ -blobs blob_list | -clobs clob_list | + -blob_files blob_file_list | -clob_files clob_file_list ] + +
Performs the DML or DDL statement sql
.
If a length-n list of
+blobs or clobs is provided, then the SQL should return n blobs or clobs into the bind variables
+:1
, :2
, ... :n
.
+blobs
or clobs
, if specified, should be
+a list of individual BLOBs or CLOBs to insert; blob_files
or clob_files
, if specified,
+should be a list of paths to
+files containing the data to insert. Only one of
+-blobs
, -clobs
, -blob_files
, and -clob_files
may be provided.
Example:
+ +db_dml insert_photos " + insert photos(photo_id, image, thumbnail_image) + values(photo_id_seq.nextval, empty_blob(), empty_blob()) + returning image, thumbnail_image into :1, :2 + " -blob_files [list "/var/tmp/the_photo" "/var/tmp/the_thumbnail"] + + +
This inserts a new row into the photos
table, with the contents of the
+files /var/tmp/the_photo
and
+/var/tmp/the_thumbnail
in the
+image
and thumbnail
columns, respectively.
+db_write_clob
,
+
+db_write_blob
,
+
+db_blob_get_file
++db_write_clob statement-namesql [ -bind bind_set_id | -bind bind_value_list ] + +db_write_blob statement-namesql [ -bind bind_set_id | -bind bind_value_list ] + +db_blob_get_file statement-namesql [ -bind bind_set_id | -bind bind_value_list ] + +
Analagous to ns_ora
+write_clob/write_blob/blob_get_file
.
+db_release_unused_handles
+ db_release_unused_handles + +
Releases any allocated, unused database handles.
+
+db_transaction
+db_transaction code_block [ on_error { code_block } ] + +
Executes code_block
transactionally.
+Nested transactions are supported (end
+transaction
is transparently ns_db dml
'ed when the outermost transaction
+completes). The db_abort_transaction
command can be used to
+abort all levels of transactions. It is possible to specify an
+optional on_error
code block
+that will be executed if some code in code_block throws an exception. The
+variable errmsg
will be bound
+in that scope. If there is no on_error
code, any errors will be
+propagated.
Example:
+ +proc replace_the_foo { col } { + db_transaction { + db_dml "delete from foo" + db_dml "insert into foo(col) values($col)" + } +} + +proc print_the_foo {} { + doc_body_append "foo is [db_string "select col from foo"]<br>\n" +} + +replace_the_foo 8 +print_the_foo ; # Writes out "foo is 8" + +db_transaction { + replace_the_foo 14 + print_the_foo ; # Writes out "foo is 14" + db_dml "insert into some_other_table(col) values(999)" + ... + db_abort_transaction +} on_error { + doc_body_append "Error in transaction: $errmsg" +} + + +print_the_foo ; # Writes out "foo is 8" + + ++
+db_resultrows
+db_resultrows + +
Returns the number of rows affected or returned by the previous +statement.
+
+db_with_handle
+db_with_handle varcode_block +
Places a database handle into the variable var
+and executes code_block
. This is useful when
+you don't want to have to use the new API (db_foreach
, db_1row
, etc.), but need to use database
+handles explicitly.
Example:
+ +proc lookup_the_foo { foo } { + db_with_handle db { + return [db_string unused "select ..."] + } +} + +db_with_handle db { + # Now there's a database handle in $db. + set selection [ns_db select $db "select foo from bar"] + while { [ns_db getrow $db $selection] } { + set_variables_after_query + + lookup_the_foo $foo + } +} + + ++
+db_nullify_empty_string
+db_nullify_empty_string string
+
For true SQL purists, we provide the convenience function
+db_nullify_empty_string
, which
+returns [db_null] if its string
argument is the empty
+string and can be used to encapsulate another Oracle quirk:
+ +set baz "" + +# Clean out the foo table +# +db_dml unused "delete from foo" + +db_dml unused "insert into foo(baz) values('$baz')" + +set n_rows [db_string unused "select count(*) from foo where baz is null"] +# +# $n_rows is 1; in effect, the "baz is null" criterion is matching +# the empty string we just inserted (because of Oracle's coercion +# quirk) + + +
To balance out this asymmetry, you can explicitly set
+baz
to null
by writing:
+ +db_dml foo_insert "insert into foo(baz) values(:1)" {[db_nullify_empty_string $baz]} + + ++
The database API allows for direct caching of query results. +Repeated calls will return the cached value until it is either +explicitly flushed using db_flush_cache, times out (configured the +ns_cache is called to create the cache), or another cached query +fills the cache, causing older entries to be flushed.
Values returned by a query are cached if you pass the +"-cache_key" switch to the database procedure. The switch value +will be used as the key in the ns_cache eval call used to execute +the query and processing code. The db_flush proc should be called +to flush the cache when appropriate. The "-cache_pool" parameter +can be used to specify the cache pool to be used, and defaults to +db_cache_pool. The size of the default cache is governed by the +kernel parameter "DBCacheSize" in the "caching" section.
Currently db_string, db_list, db_list_of_lists, db_1row, +db_0or1row, and db_multirow support caching.
For caching to be effective, one must carefully design a +cache_pool and cache_key strategy that uniquely identifies a query +within the system, including the relevant objects being referenced +by the query. Typically a cache_key should include one or more +object_ids and a name that identifies the operation being done.
Here is an example from the layout-manager package:
+ +# Query to return the elements of a page as a list. The prefix "page_" is used to denote +# that this is a page-related query, page_id is used to uniquely identify the query +# by object, and the suffix uniquely defines the operation being performed on the +# page object. + +db_list -cache_key page_${page_id}_get_elements get_elements {} + +# When the contents of a page are changed, we flush all page-related queries for the given +# page object using db_flush_cache. + +db_flush_cache -cache_key_pattern page_${page_id}_* + + ++