By Pete Su and Jon Salz. Modified by Roberto Mello.
One of OpenACS's great strengths is that code written for it is very close to the database. It is very easy to interact with the database from anywhere within OpenACS, and we have a coherent API for database access which makes this even easier.
More detailed information about the DB API is available at Database Access API.
The OpenACS database API is meant to save developers from making common mistakes and to provide a more structured syntax for specifying database operations, including transactions. Here's an example of the API.
set count 0 set tcl_var "foo" set sql { SELECT foo, bar, baz FROM some_table, some_other_table WHERE some_table.id = some_other_table.id and some_table.condition_p = :tcl_var } db_transaction { db_foreach my_example_query_name $sql { lappend rows [list $foo $bar $baz] incr count } foreach row $rows { call_some_proc $foo $bar $baz } }
There are several things to note here:
No explicit code for grabbing and releasing handles. Usage of the Database API implicitly deals with all handle management issues.
The db_transaction
command
makes the scope of a transaction clear; db_transaction
takes the code block
argument and automatically runs it in the context of a transaction.
If you use something like db_foreach though, you need to make sure
that there are no calls in the code block which would take a second
db handle since the transaction is only valid for one handle
(that's why we build up a list of returned values and call a
second proc outside the db_foreach loop).
The command db_foreach
writes our old while loop for us.
Every SQL query has a name, which is used in conjunction with .XQL files to support multiple databases.
Finally and most importantly, there API implements bind variables, which we will cover next.
Bind variables are placeholders for literal values in an SQL query being sent to the server. In the old way, data was generally passed to directly to the DB backend, via Tcl string interpolation. In the example above, the query would look like:
select foo, bar, baz from some_table, some_other_table where some_table.id=some_other_table.id and some_table.condition_p = '$foo'
There are a few problems with this:
If the value of $foo is a huge string, then we waste a lot of time in the database server doing useless parsing.
Second, if the literal value contains characters like single quotes, we have to be careful to properly escape them, because not quoting them will lead to surprising errors.
Third, no type checking occurs on the literal value. Finally, if the Tcl variable is passed in or between web forms or otherwise subject to external modification, there is nothing keeping malicious users from setting the Tcl variable to some string that changes the query textually. This type of attack, called SQL smuggling, can be very damaging - entire tables can be exposed or have their contents deleted, for example.
Another very important reason for using bind variables is performance. Oracle can cache previously parsed queries. If there are values in the where clause, that is how the query is cached. It also performs bind variable susbstitution after parsing the SQL statement. This means that SQL statements that use bind variables will always match (assuming all else is the same) while SQL statements that do not use bind variables will not match unless the values in the statement are exactly the same. This will improve the query cache considerably, which can make the server much more efficient.
What the DB API (in conjunction with the database drivers implemented for AOLserver) do is send the SQL statement to the server for parsing, then bind values to the variables and sends those values along separately as a second step. This separate binding step is where the term bind variable comes from.
This split has several advantages. First, type checking happens on the literal. If the column we are comparing against holds numbers, and we send a string, we get a nice error. Second, since string literals are no longer in the query, no extra quoting is required. Third, substitution of bind variables cannot change the actual text of the query, only the literal values in the placeholders. The database API makes bind variables easy to use by hooking them smoothly into the Tcl run time so you simply provide :tclvar and the value of $tclvar is sent to the backend to actually execute the query.
The database API parses the query and pulls out all the bind variable specifications and replaces them with generic placeholders. It then automatically pulls the values of the named Tcl vars out of the run time environment of the script, and passes them to the database.
Note that while this looks like a simple syntactic change, it really is very different from how interpolated text queries work. You use bind variables to replace what would otherwise be a literal value in a query, and Tcl style string interpolation does not happen. So you cannot do something like:
set table "baz" set condition "where foo = bar" db_foreach my_query { select :table from some_table where :condition }
SQL will not allow a literal to occur where we've put the bind variables, so the query is syntactically incorrect. You have to remember that while the bind variable syntax looks similar to variable interpolation in Tcl, It is not the same thing at all.
Finally, the DB API has several different styles for passing bind variable values to queries. In general, use the style presented here because it is the most convenient.
Every db_*
command accepting
a SQL command as an argument supports bind variables. You can
either
Specify the -bind
switch to
provide a set with bind variable values, or
Specify the -bind
switch to
explicitly provide a list of bind variable names and values, or
Not specify a bind variable list at all, in which case Tcl variables are used as bind variables.
The default behavior (i.e., if the -bind
switch is omitted) is that these
procedures expect to find local variables that correspond in name
to the referenced bind variables, e.g.:
set user_id 123456 set role "administrator" db_foreach user_group_memberships_by_role { select g.group_id, g.group_name from user_groups g, user_group_map map where g.group_id = map.user_id and map.user_id = :user_id and map.role = :role } { # do something for each group of which user 123456 is in the role # of "administrator" }
The value of the local Tcl variable user_id
(123456) is bound to the
user_id
bind variable.
The -bind
switch can takes
the name of an ns_set
containing keys for each bind variable named in the query,
e.g.:
set bind_vars [ns_set create] ns_set put $bind_vars user_id 123456 ns_set put $bind_vars role "administrator" db_foreach user_group_memberships_by_role { select g.group_id, g.group_name from user_groups g, user_group_map map where g.group_id = map.user_id and map.user_id = :user_id and map.role = :role } -bind $bind_vars { # do something for each group in which user 123456 has the role # of "administrator" }
Alternatively, as an argument to -bind
you can specify a list of alternating
name/value pairs for bind variables:
db_foreach user_group_memberships_by_role { select g.group_id, g.group_name from user_groups g, user_group_map map where g.group_id = map.user_id and map.user_id = :user_id and map.role = :role } -bind [list user_id 123456 role "administrator"] { # do something for each group in which user 123456 has the role # of "administrator" }
When processing a DML statement, Oracle coerces empty strings
into null
. (This coercion does
not occur in the
WHERE
clause of a query, i.e.
col = ''
and
col is null
are not
equivalent.)
As a result, when using bind variables, the only way to make
Oracle set a column value to null
is to set the corresponding bind
variable to the empty string, since a bind variable whose value is
the string "null" will be interpreted as the literal
string "null".
These Oracle quirks complicate the process of writing clear and abstract DML difficult. Here is an example that illustrates why:
# # Given the table: # # create table foo ( # bar integer, # baz varchar(10) # ); # set bar "" set baz "" db_dml foo_create "insert into foo(bar, baz) values(:bar, :baz)" # # the values of the "bar" and "baz" columns in the new row are both # null, because Oracle has coerced the empty string (even for the # numeric column "bar") into null in both cases
The database library can transparently maintain pools of
sequence values, so that each request for a new sequence value
(using db_nextval
) does not
incur a roundtrip to the server. For instance, this functionality
is very useful in the security/sessions library, which very
frequently allocates values from the sec_id_seq
sequence. To utilize this
functionality for a particular sequence, register the sequence to
be pooled, either using the db_register_pooled_sequence
procedure at
server startup time, or by including a configuration parameter of
the form
PoolSequence.sequence_name_seq=count
in any configuration
section in the yourservername.ini
file, e.g.,
[ns/server/yourservername/acs/security]
PoolSequence.sec_id_seq=20
The database library will allocate this number of sequence
values at server startup. It will periodically scan pools and
allocate new values for sequences which are less than half-full.
(This normally occurs every 60 seconds, and is configurable via the
PooledSequenceUpdateInterval
parameter in the [ns/server/
yourservername
/acs/database]
configuration section.)
The Database API has several functions that wrap familiar parts of the AOLserver database API.
Note that you never have to use ns_db
anymore (including ns_db gethandle
)! Just start doing stuff,
and (if you want) call db_release_unused_handles
when you're
done as a hint to release the database handle.
db_abort_transaction
db_abort_transaction
Aborts all levels of a transaction. That is if this is called
within several nested transactions, all of them are terminated. Use
this instead of db_dml
"abort" "abort transaction"
.
db_multirow
db_multirow [ -local ] [ -append ] [ -extend column_list ] \ var-namestatement-namesql \ [ -bind bind_set_id | -bind bind_value_list ] \ code_block [ if_no_rows if_no_rows_block ]
Performs the SQL query sql
,
saving results in variables of the form
var_name:1
,
var_name:2
, etc, setting
var_name:rowcount
to the total
number of rows, and setting
var_name:columns
to a list of
column names.
Each row also has a column, rownum, automatically added and set to the row number, starting with 1. Note that this will override any column in the SQL statement named 'rownum', also if you're using the Oracle rownum pseudo-column.
If the -local
is passed, the
variables defined by db_multirow will be set locally (useful if
you're compiling dynamic templates in a function or similar
situations).
You may supply a code block, which will be executed for each row in the loop. This is very useful if you need to make computations that are better done in Tcl than in SQL, for example using ns_urlencode or ns_quotehtml, etc. When the Tcl code is executed, all the columns from the SQL query will be set as local variables in that code. Any changes made to these local variables will be copied back into the multirow.
You may also add additional, computed columns to the multirow,
using the -extend { col_1col_2 ... }
switch. This is
useful for things like constructing a URL for the object retrieved
by the query.
If you're constructing your multirow through multiple
queries with the same set of columns, but with different rows, you
can use the -append
switch.
This causes the rows returned by this query to be appended to the
rows already in the multirow, instead of starting a clean multirow,
as is the normal behavior. The columns must match the columns in
the original multirow, or an error will be thrown.
Your code block may call continue
in order to skip a row and not
include it in the multirow. Or you can call break
to skip this row and quit
looping.
Notice the nonstandard numbering (everything else in Tcl starts at 0); the reason is that the graphics designer, a non programmer, may wish to work with row numbers.
Example:
db_multirow -extend { user_url } users users_query { select user_id first_names, last_name, email from cc_users } { set user_url [acs_community_member_url -user_id $user_id] }
You can also iterate over a multirow after it has been created - check the documentation for template::multirow
For example,
db_multirow assets assets { select asset_id, from ... } .. set asset_id_l [list] multirow foreach assets { lappend asset_id_l $asset_id }
Technically it's equivalent to using a code block on the end of your db_multirow.
db_foreach
db_foreach statement-name sql [ -bind bind_set_id | -bind bind_value_list ] \ [ -column_array array_name | -column_set set_name ] \ code_block [ if_no_rows if_no_rows_block ]
Performs the SQL query sql
, executing code_block
once for each row
with variables set to column values (or a set or array populated if
-column_array
or column_set
is specified). If the query
returns no rows, executes if_no_rows_block
(if
provided).
Example:
db_foreach select_foo "select foo, bar from greeble" { doc_body_append "<li>foo=$foo; bar=$bar\n" } if_no_rows { doc_body_append "<li>There are no greebles in the database.\n" }
The code block may contain break
statements (which terminate the loop
and flush the database handle) and continue
statements (which continue to the
next row of the loop).
db_1row
db_1row statement-namesql [ -bind bind_set_id | -bind bind_value_list ] \ [ -column_array array_name | -column_set set_name ]
Performs the SQL query sql
, setting variables to
column values. Raises an error if the query does not return exactly
1 row.
Example:
db_1row select_foo "select foo, bar from greeble where greeble_id = $greeble_id" # Bombs if there's no such greeble! # Now $foo and $bar are set.
db_0or1row
db_0or1row statement-namesql [ -bind bind_set_id | -bind bind_value_list ] \ [ -column_array array_name | -column_set set_name ]
Performs the SQL query sql
. If a row is returned, sets
variables to column values and returns 1. If no rows are returned,
returns 0. If more than one row is returned, throws an error.
db_nextval
db_nextval sequence-name
Returns the next value for the sequence sequence-name (using a SQL statement
like SELECT
sequence-name
.nextval FROM DUAL
). If sequence pooling is
enabled for the sequence, transparently uses a value from the pool
if available to save a round-trip to the database (see Sequence Pooling).
db_register_pooled_sequence
db_register_pooled_sequence sequence-namepool-size
Registers the sequence sequence-name to be pooled, with a pool size of pool-size sequence values (see Sequence Pooling).
db_string
db_string statement-namesql [ -default default ] [ -bind bind_set_id | -bind bind_value_list ]
Returns the first column of the result of SQL query sql
.
If sql
doesn't return a row,
returns default
(or throws an error if
default
is unspecified).
Analogous to database_to_tcl_string
and database_to_tcl_string_or_null
.
db_list
db_list statement-namesql [ -bind bind_set_id | -bind bind_value_list ]
Returns a Tcl list of the values in the first column of the
result of SQL query sql
. If sql
doesn't return any rows, returns an empty list. Analogous to
database_to_tcl_list
.
db_list_of_lists
db_list_of_lists statement-namesql [ -bind bind_set_id | -bind bind_value_list ]
Returns a Tcl list, each element of which is a list of all
column values in a row of the result of SQL query sql
.
If sql
doesn't return any
rows, returns an empty list. (Analogous to database_to_tcl_list_list
.)
db_dml
db_dml statement-namesql \ [ -bind bind_set_id | -bind bind_value_list ] \ [ -blobs blob_list | -clobs clob_list | -blob_files blob_file_list | -clob_files clob_file_list ]
Performs the DML or DDL statement sql
.
If a length-n list of
blobs or clobs is provided, then the SQL should return n blobs or clobs into the bind variables
:1
, :2
, ... :n
.
blobs
or clobs
, if specified, should be
a list of individual BLOBs or CLOBs to insert; blob_files
or clob_files
, if specified,
should be a list of paths to
files containing the data to insert. Only one of
-blobs
, -clobs
, -blob_files
, and -clob_files
may be provided.
Example:
db_dml insert_photos { insert photos(photo_id, image, thumbnail_image) values(photo_id_seq.nextval, empty_blob(), empty_blob()) returning image, thumbnail_image into :1, :2 } -blob_files [list "/var/tmp/the_photo" "/var/tmp/the_thumbnail"]
This inserts a new row into the photos
table, with the contents of the
files /var/tmp/the_photo
and
/var/tmp/the_thumbnail
in the
image
and thumbnail
columns, respectively.
db_write_clob
,
db_write_blob
,
db_blob_get_file
db_write_clob statement-namesql [ -bind bind_set_id | -bind bind_value_list ] db_write_blob statement-namesql [ -bind bind_set_id | -bind bind_value_list ] db_blob_get_file statement-namesql [ -bind bind_set_id | -bind bind_value_list ]
Analogous to ns_ora
write_clob/write_blob/blob_get_file
.
db_release_unused_handles
db_release_unused_handles
Releases any allocated, unused database handles.
db_transaction
db_transaction code_block [ on_error { code_block } ]
Executes code_block
transactionally.
Nested transactions are supported (end
transaction
is transparently ns_db dml
'ed when the outermost
transaction completes). The db_abort_transaction
command can be used to
abort all levels of transactions. It is possible to specify an
optional on_error
code block
that will be executed if some code in code_block throws an exception. The
variable errmsg
will be bound
in that scope. If there is no on_error
code, any errors will be
propagated.
Example:
proc replace_the_foo { col } { db_transaction { db_dml delete {delete from foo} db_dml insert {insert into foo(col) values(:col)} } } proc print_the_foo {} { doc_body_append "foo is [db_string "select col from foo"]<br>\n" } replace_the_foo 8 print_the_foo ; # Writes out "foo is 8" db_transaction { replace_the_foo 14 print_the_foo ; # Writes out "foo is 14" db_dml insert_foo {insert into some_other_table(col) values(999)} ... db_abort_transaction } on_error { doc_body_append "Error in transaction: $errmsg" } print_the_foo ; # Writes out "foo is 8"
db_resultrows
db_resultrows
Returns the number of rows affected or returned by the previous statement.
db_with_handle
db_with_handle varcode_block
Places a database handle into the variable var
and executes code_block
. This is useful when
you don't want to have to use the new API (db_foreach
, db_1row
, etc.), but need to use database
handles explicitly.
Example:
proc lookup_the_foo { foo } { db_with_handle db { return [db_string unused "select ..."] } } db_with_handle db { # Now there's a database handle in $db. set selection [ns_db select $db "select foo from bar"] while { [ns_db getrow $db $selection] } { set_variables_after_query lookup_the_foo $foo } }
The database API allows for direct caching of query results. Repeated calls will return the cached value until it is either explicitly flushed using db_flush_cache, times out (configured the ns_cache is called to create the cache), or another cached query fills the cache, causing older entries to be flushed.
Values returned by a query are cached if you pass the "-cache_key" switch to the database procedure. The switch value will be used as the key in the ns_cache eval call used to execute the query and processing code. The db_flush proc should be called to flush the cache when appropriate. The "-cache_pool" parameter can be used to specify the cache pool to be used, and defaults to db_cache_pool. The size of the default cache is governed by the kernel parameter "DBCacheSize" in the "caching" section.
Currently db_string, db_list, db_list_of_lists, db_1row, db_0or1row, and db_multirow support caching.
For caching to be effective, one must carefully design a cache_pool and cache_key strategy that uniquely identifies a query within the system, including the relevant objects being referenced by the query. Typically a cache_key should include one or more object_ids and a name that identifies the operation being done.
Here is an example from the layout-manager package:
# Query to return the elements of a page as a list. The prefix "page_" is used to denote # that this is a page-related query, page_id is used to uniquely identify the query # by object, and the suffix uniquely defines the operation being performed on the # page object. db_list -cache_key page_${page_id}_get_elements get_elements {} # When the contents of a page are changed, we flush all page-related queries for the given # page object using db_flush_cache. db_flush_cache -cache_key_pattern page_${page_id}_*