Tcl procedures: /packages/acs-kernel/10-database-procs.tcl
Tcl initialization: /packages/acs-kernel/database-init.tcl
One of OpenACS's great strengths is that code written for it is +very close to the database. It is very easy to interact with the +database from anywhere within OpenACS. Our goal is to develop a +coherent API for database access which makes this even easier.
There were four significant problems with the way OpenACS
+previously used the database (i.e., directly through the
+ns_db
interface):
+Handle management. +We required code to pass database handles around, and for routines +which needed to perform database access but didn't receive a +database handle as input, it was difficult to know from which of +the three "magic pools" (main, subquery, and log) to allocate a new +handle.
+Nested
+transactions. In our Oracle driver, begin transaction
really means "turn
+auto-commit mode off" and end
+transaction
means "commit the current transaction and turn
+auto-commit mode on." Thus if transactional code needed to call a
+routine which needed to operate transactionally, the semantics were
+non-obvious. Consider:
+ +proc foo { db args } { + db_transaction { + ... + } +} + +db_transaction { +db_dml unused "insert into greeble(bork) values(33)" +foo $db +db_dml unused "insert into greeble(bork) values(50)" +} + +
This would insert greeble #33 and do all the stuff in
+foo
transactionally, but the
+end transaction
in foo
would actually cause a commit, and
+greeble #50 would later be inserted in auto-commit mode. This could
+cause subtle bugs: e.g., in the case that the insert for greeble
+#50 failed, part of the "transaction" would have already have been
+committed!. This is not a good thing.
+Unorthodox use of
+variables. The standard mechanism for mapping
+column values into variables involved the use of the set_variables_after_query
routine, which
+relies on an uplevel variable named selection
(likewise for set_variables_after_subquery
and
+subselection
).
+Hard-coded reliance on
+Oracle. It's difficult to write code supporting
+various different databases (dynamically using the appropriate
+dialect based on the type of database being used, e.g., using
+DECODE
on Oracle and
+CASE ... WHEN
on Postgres).
The Database Access API addresses the first three problems +by:
making use of database handles transparent
wrapping common database operations (including transaction +management) in Tcl control structures (this is, after all, what Tcl +is good at!)
It lays the groundwork for addressing the fourth problem by +assigning each SQL statement a logical name. In a future version of +the OpenACS Core, this API will translate logical statement names +into actual SQL, based on the type of database in use. (To smooth +the learning curve, we provide a facility for writing SQL inline +for a "default SQL dialect", which we assume to be Oracle for +now.)
To be clear, SQL abstraction is not fully implemented in OpenACS 3.3.1. +The statement names supplied to each call are not used by the API +at all. The API's design for SQL abstraction is in fact incomplete; +unresolved issues include:
how to add WHERE
clause
+criteria dynamically
how to build a dynamic ORDER
+BY
clause (Ben Adida has a proposed solution for this)
how to define a statement's formal interface (i.e., what bind
+variables it expects, what columns its SELECT
clause must contain if it's a query)
+without actually implementing the statement in a specific SQL
+dialect
So why is the incremental change of adding statement naming to +the API worth the effort? It is worth the effort because we know +that giving each SQL statement a logical name will be required by +the complete SQL abstraction design. Therefore, we know that the +effort will not be wasted, and taking advantage of the new support +for bind variables will already require code that uses 3.3.0 +version of the API to be updated.
+
+set_variables_after_query
is
+gone! (Well, it's still there, but you'll never need to use it.)
+The new API routines set local variables automatically. For
+instance:
+ +db_1row select_names "select first_names, last_name from users where user_id = [ad_conn user_id]" +doc_body_append "Hello, $first_names $last_name!" + +
Like ns_db 1row
, this will
+bomb if the query doesn't return any rows (no such user exists). If
+this isn't what you want, you can write:
+ +if { [db_0or1row select_names "select first_names, last_name from users where user_id = [ad_conn user_id]"] } { + doc_body_append "Hello, $first_names $last_name!" +} else { + # Executed if the query returns no rows. + doc_body_append "There's no such user!" +} + +
Selecting a bunch of rows is a lot prettier now:
+ +db_foreach select_names "select first_names, last_name from users" { + doc_body_append "Say hi to $first_names $last_name for me!<br>" +} + +
That's right, db_foreach
is
+now like ns_db select
plus a
+while
loop plus set_variables_after_query
plus an
+if
statement (containing code
+to be executed if no rows are returned).
+ +db_foreach select_names "select first_names, last_name from users where last_name like 'S%'" { + doc_body_append "Say hi to $first_names $last_name for me!<br>" +} if_no_rows { + doc_body_append "There aren't any users with last names beginnings with S!" +} + ++
The new API keeps track of which handles are in use, and +automatically allocates new handles when they are necessary (e.g., +to perform subqueries while a select is active). For example:
+ +doc_body_append "<ul>" +db_foreach select_names "select first_names, last_name, user_id from users" { + # Automatically allocated a database handle from the main pool. + doc_body_append "<li>User $first_names $last_name\n<ul>" + + db_foreach select_groups "select group_id from user_group_map where user_id = $user_id" { + # There's a selection in progress, so we allocated a database handle + # from the subquery pool for this selection. + doc_body_append "<li>Member of group #$group_id.\n" + } if_no_rows { + # Not a member of any groups. + doc_body_append "<li>Not a member of any group.\n" + } +} +doc_body_append "</ul>" +db_release_unused_handles + +
A new handle isn't actually allocated and released for every
+selection, of course - as a performance optimization, the API keeps
+old handles around until db_release_unused_handles
is invoked (or
+the script terminates).
Note that there is no analogue to ns_db gethandle
- the handle is always
+automatically allocated the first time it's needed.
Introduction
Most SQL statements require that the code invoking the statement +pass along data associated with that statement, usually obtained +from the user. For instance, in order to delete a WimpyPoint +presentation, a Tcl script might use the SQL statement
+
+delete from wp_presentations where presentation_id = some_presentation_id
+
where some_presentation_id
is a
+number which is a valid presentation ID of the presentation I want
+to delete. It's easy to write code handling situations like this
+since SQL statements can include bind
+variables, which represent placeholders for actual
+data. A bind variable is specified as a colon followed by an
+identifier, so the statement above can be coded as:
+ +db_dml presentation_delete { + delete from wp_presentations where presentation_id = :some_presentation_id +} + +
When this SQL statement is invoked, the value for the bind
+variable :some_presentation_id
+is pulled from the Tcl variable $some_presentation_id
(in the caller's
+environment). Note that bind variables are not limited to one per
+statement; you can use an arbitrary number, and each will pull from
+the correspondingly named Tcl variable. (Alternatively, you can
+also specify an list or ns_set
+providing bind variables' values; see Usage.)
The value of a bind variable is taken literally by the database
+driver, so there is never any need to put single-quotes around the
+value for a bind variable, or to use db_quote
to escape single-quotes contained
+in the value. The following works fine, despite the apostrophe:
+ +set exclamation "That's all, folks!" +db_dml exclamation_insert { insert into exclamations(exclamation) values(:exclamation) } + +
Note that you can use a bind variable in a SQL statement only
+where you could use a literal (a number or single-quoted string).
+Bind variables cannot be placeholders for things like SQL keywords,
+table names, or column names, so the following will not work, even
+if $table_name
is set
+properly:
+ +select * from :table_name + +
Why Bind Variables Are +Useful
Why bother with bind variables at all - why not just write the +Tcl statement above like this:
+ +db_dml presentation_delete " + delete from wp_presentations where presentation_id = $some_presentation_id +" + +
(Note the use of double-quotes to allow the variable reference
+to $some_presentation_id
to be
+interpolated in.) This will work, but consider the case where some
+devious user causes some_presentation_id
to be set to something
+like '3 or 1 = 1'
, which would
+result in the following statement being executed:
+ +delete from wp_presentations where presentation_id = 3 or 1 = 1 + +
This deletes every presentation in the database! Using bind
+variables eliminates this gaping security hole: since bind variable
+values are taken literally. Oracle will attempt to delete
+presentations whose presentation ID is literally '3 or 1 = 1'
(i.e., no presentations, since
+'3 or 1 = 1'
can't possibly be
+a valid integer primary key for wp_presentations
. In general, since Oracle
+always considers the values of bind variables to be literals, it
+becomes more difficult for users to perform URL surgery to trick
+scripts into running dangerous queries and DML.
Usage
Every db_*
command accepting
+a SQL command as an argument supports bind variables. You can
+either
specify the -bind
switch to
+provide a set with bind variable values, or
specify the -bind
switch to
+explicitly provide a list of bind variable names and values, or
not specify a bind variable list at all, in which case Tcl +variables are used as bind variables.
The default behavior (i.e., if the -bind
switch is omitted) is that these
+procedures expect to find local variables that correspond in name
+to the referenced bind variables, e.g.:
+ +set user_id 123456 +set role "administrator" + +db_foreach user_group_memberships_by_role { + select g.group_id, g.group_name + from user_groups g, user_group_map map + where g.group_id = map.user_id + and map.user_id = :user_id + and map.role = :role +} { + # do something for each group of which user 123456 is in the role + # of "administrator" +} + +
The value of the local Tcl variable user_id
(123456) is bound to the
+user_id
bind variable.
The -bind
switch can takes
+the name of an ns_set
+containing keys for each bind variable named in the query,
+e.g.:
+ +set bind_vars [ns_set create] +ns_set put $bind_vars user_id 123456 +ns_set put $bind_vars role "administrator" + +db_foreach user_group_memberships_by_role { + select g.group_id, g.group_name + from user_groups g, user_group_map map + where g.group_id = map.user_id + and map.user_id = :user_id + and map.role = :role +} -bind $bind_vars { + # do something for each group in which user 123456 has the role + # of "administrator" +} + +
Alternatively, as an argument to -bind
you can specify a list of alternating
+name/value pairs for bind variables:
+ +db_foreach user_group_memberships_by_role { + select g.group_id, g.group_name + from user_groups g, user_group_map map + where g.group_id = map.user_id + and map.user_id = :user_id + and map.role = :role +} -bind [list user_id 123456 role "administrator"] { + # do something for each group in which user 123456 has the role + # of "administrator" +} + +
When processing a DML statement, Oracle coerces empty strings
+into null
. (This coercion does
+not occur in the
+WHERE
clause of a query, i.e.
+col = ''
and col is null
are not equivalent.)
As a result, when using bind variables, the only way to make
+Oracle set a column value to null
is to set the corresponding bind
+variable to the empty string, since a bind variable whose value is
+the string "null" will be interpreted as the literal string
+"null".
These Oracle quirks complicate the process of writing clear and +abstract DML difficult. Here is an example that illustrates +why:
+ +# +# Given the table: +# +# create table foo ( +# bar integer, +# baz varchar(10) +# ); +# + +set bar "" +set baz "" + +db_dml foo_create "insert into foo(bar, baz) values(:bar, :baz)" +# +# the values of the "bar" and "baz" columns in the new row are both +# null, because Oracle has coerced the empty string (even for the +# numeric column "bar") into null in both cases + +
Since databases other than Oracle do not coerce empty strings
+into null
, this code has
+different semantics depending on the underlying database (i.e., the
+row that gets inserted may not have null as its column values),
+which defeats the purpose of SQL abstraction.
Therefore, the Database Access API provides a
+database-independent way to represent null
(instead of the Oracle-specific idiom
+of the empty string): db_null
.
Use it instead of the empty string whenever you want to set a
+column value explicitly to null
, e.g.:
+ +set bar [db_null] +set baz [db_null] + +db_dml foo_create "insert into foo(bar, baz) values(:bar, :baz)" +# +# sets the values for both the "bar" and "baz" columns to null + ++
We now require that each SQL statement be assigned a logical +name for the statement that is unique to the procedure or page in +which it is defined. This is so that (eventually) we can implement +logically named statements with alternative SQL for non-Oracle +databases (e.g., Postgres). More on this later.
+Normally, db_foreach
,
+db_0or1row
, and db_1row
places the results of queries in
+Tcl variables, so you can say:
+ +db_foreach users_select "select first_names, last_name from users" { + doc_body_append "<li>$first_names $last_name\n" +} + +
However, sometimes this is not sufficient: you may need to
+examine the rows returned, to dynamically determine the set of
+columns returned by the query, or to avoid collisions with existing
+variables. You can use the -column_array
and -column_set
switches to db_foreach
, db_0or1row
, and db_1row
to instruct the database routines
+to place the results in a Tcl array or ns_set
, respectively, where the keys are
+the column names and the values are the column values. For
+example:
+ +db_foreach users_select "select first_names, last_name from users" -column_set columns { + # Now $columns is an ns_set. + doc_body_append "<li>" + for { set i 0 } { $i < [ns_set size $columns] } { incr i } { + doc_body_append "[ns_set key $columns $i] is [ns_set value $columns $i]. \n" + } +} + +
will write something like:
first_names is Jon. last_name is Salz.
first_names is Lars. last_name is Pind.
first_names is Michael. last_name is Yoon.
Note that you never have to use ns_db
anymore (including ns_db gethandle
)! Just start doing stuff,
+and (if you want) call db_release_unused_handles
when you're done
+as a hint to release the database handle.
+db_null
db_null
Returns a value which can be used in a bind variable to
+represent the SQL value null
.
+See Nulls and Bind Variables
+above.
+db_foreach
+db_foreachstatement-name sql [ -bind bind_set_id | -bind bind_value_list ] \ + [ -column_array array_name | -column_set set_name ] \ + code_block [ if_no_rows if_no_rows_block ] +
Performs the SQL query sql
, executing code_block
once for each row
+with variables set to column values (or a set or array populated if
+-column_array
or column_set
is specified). If the query
+returns no rows, executes if_no_rows_block
(if
+provided).
Example:
+ +db_foreach select_foo "select foo, bar from greeble" { + doc_body_append "<li>foo=$foo; bar=$bar\n" +} if_no_rows { + doc_body_append "<li>There are no greebles in the database.\n" +} + +
The code block may contain break
statements (which terminate the loop
+and flush the database handle) and continue
statements (which continue to the
+next row of the loop).
+db_1row
+db_1rowstatement-namesql [ -bind bind_set_id | -bind bind_value_list ] \ + [ -column_array array_name | -column_set set_name ] +
Performs the SQL query sql
, setting variables to
+column values. Raises an error if the query does not return exactly
+1 row.
Example:
+ +db_1row select_foo "select foo, bar from greeble where greeble_id = $greeble_id" +# Bombs if there's no such greeble! +# Now $foo and $bar are set. + ++
+db_0or1row
+db_0or1rowstatement-namesql [ -bind bind_set_id | -bind bind_value_list ] \ + [ -column_array array_name | -column_set set_name ] +
Performs the SQL query sql
. If a row is returned, sets
+variables to column values and returns 1. If no rows are returned,
+returns 0. If more than one row is returned, throws an error.
+db_string
+db_stringstatement-namesql [ -default default ] [ -bind bind_set_id | -bind bind_value_list ] +
Returns the first column of the result of SQL query sql
.
+If sql
doesn't return a row,
+returns default
(or throws an error if
+default
is unspecified).
+Analogous to database_to_tcl_string
and database_to_tcl_string_or_null
.
+db_nextval
+db_nextvalsequence-name +
Returns the next value for the sequence sequence-name (using a SQL statement
+like SELECT
sequence-name
.nextval FROM DUAL
). If sequence pooling is
+enabled for the sequence, transparently uses a value from the pool
+if available to save a round-trip to the database.
+db_list
+db_liststatement-namesql [ -bind bind_set_id | -bind bind_value_list ] +
Returns a Tcl list of the values in the first column of the
+result of SQL query sql
. If sql
+doesn't return any rows, returns an empty list. Analogous to
+database_to_tcl_list
.
+db_list_of_lists
+db_list_of_listsstatement-namesql [ -bind bind_set_id | -bind bind_value_list ] +
Returns a Tcl list, each element of which is a list of all
+column values in a row of the result of SQL query sql
.
+If sql
doesn't return any rows,
+returns an empty list. (Analogous to database_to_tcl_list_list
.)
+db_list_of_ns_sets
+db_list_of_ns_setsstatement-namesql [ -bind bind_set_id | -bind bind_value_list ] +
Returns a list of ns_sets with the values of each column of each
+row returned by the sql
query
+specified.
+db_dml
+db_dmlstatement-namesql \ + [ -bind bind_set_id | -bind bind_value_list ] \ + [ -blobs blob_list | -clobs clob_list | + -blob_files blob_file_list | -clob_files clob_file_list ] +
Performs the DML or DDL statement sql
.
If a length-n list of
+blobs or clobs is provided, then the SQL should return n blobs or clobs into the bind variables
+:1
, :2
, ... :n
.
+blobs
or clobs
, if specified, should be
+a list of individual BLOBs or CLOBs to insert; blob_files
or clob_files
, if specified,
+should be a list of paths to
+files containing the data to insert. Only one of
+-blobs
, -clobs
, -blob_files
, and -clob_files
may be provided.
Example:
+ +db_dml insert_photos " + insert photos(photo_id, image, thumbnail_image) + values(photo_id_seq.nextval, empty_blob(), empty_blob()) + returning image, thumbnail_image into :1, :2 + " -blob_files [list "/var/tmp/the_photo" "/var/tmp/the_thumbnail"] + +
This inserts a new row into the photos
table, with the contents of the
+files /var/tmp/the_photo
and
+/var/tmp/the_thumbnail
in the
+image
and thumbnail
columns, respectively.
+db_write_clob
,
+
+db_write_blob
,
+
+db_blob_get_file
++db_write_clobstatement-namesql [ -bind bind_set_id | -bind bind_value_list ] + +db_write_blobstatement-namesql [ -bind bind_set_id | -bind bind_value_list ] + +db_blob_get_filestatement-namesql [ -bind bind_set_id | -bind bind_value_list ] +
Analagous to ns_ora
+write_clob/write_blob/blob_get_file
.
+db_release_unused_handles
db_release_unused_handles
Releases any allocated, unused database handles.
+
+db_transaction
+db_transactioncode_block [ on_error { code_block } ] +
Executes code_block
transactionally.
+Nested transactions are supported (end
+transaction
is transparently ns_db dml
'ed when the outermost transaction
+completes). The db_abort_transaction
command can be used to
+abort all levels of transactions. It is possible to specify an
+optional on_error
code block
+that will be executed if some code in code_block throws an exception. The
+variable errmsg
will be bound
+in that scope. If there is no on_error
code, any errors will be
+propagated.
Example:
+ +proc replace_the_foo { col } { + db_transaction { + db_dml "delete from foo" + db_dml "insert into foo(col) values($col)" + } +} + +proc print_the_foo {} { + doc_body_append "foo is [db_string "select col from foo"]<br>\n" +} + +replace_the_foo 8 +print_the_foo ; # Writes out "foo is 8" + +db_transaction { + replace_the_foo 14 + print_the_foo ; # Writes out "foo is 14" + db_dml "insert into some_other_table(col) values(999)" + ... + db_abort_transaction +} on_error { + doc_body_append "Error in transaction: $errmsg" +} + + +print_the_foo ; # Writes out "foo is 8" + ++
+db_abort_transaction
db_abort_transaction
Aborts all levels of a transaction. That is if this is called
+within several nested transactions, all of them are terminated. Use
+this insetead of db_dml "abort" "abort
+transaction"
.
+db_multirow
+db_multirow [ -local ] [ -append ] [ -extend column_list ] \ + var-namestatement-namesql \ + [ -bind bind_set_id | -bind bind_value_list ] \ + code_block [ if_no_rows if_no_rows_block ] +
Performs the SQL query sql
,
+saving results in variables of the form
+var_name:1
,
+var_name:2
, etc, setting
+
+var_name:rowcount
to the total
+number of rows, and setting
+var_name:columns
to a list of
+column names.
Each row also has a column, rownum, automatically added and set +to the row number, starting with 1. Note that this will override +any column in the SQL statement named 'rownum', also if you're +using the Oracle rownum pseudo-column.
If the -local
is passed, the
+variables defined by db_multirow will be set locally (useful if
+you're compiling dynamic templates in a function or similar
+situations).
You may supply a code block, which will be executed for each row +in the loop. This is very useful if you need to make computations +that are better done in Tcl than in SQL, for example using +ns_urlencode or ad_quotehtml, etc. When the Tcl code is executed, +all the columns from the SQL query will be set as local variables +in that code. Any changes made to these local variables will be +copied back into the multirow.
You may also add additional, computed columns to the multirow,
+using the -extend { col_1col_2 ... }
switch. This is
+useful for things like constructing a URL for the object retrieved
+by the query.
If you're constructing your multirow through multiple queries
+with the same set of columns, but with different rows, you can use
+the -append
switch. This causes
+the rows returned by this query to be appended to the rows already
+in the multirow, instead of starting a clean multirow, as is the
+normal behavior. The columns must match the columns in the original
+multirow, or an error will be thrown.
Your code block may call continue
in order to skip a row and not
+include it in the multirow. Or you can call break
to skip this row and quit
+looping.
Notice the nonstandard numbering (everything else in Tcl starts +at 0); the reason is that the graphics designer, a non programmer, +may wish to work with row numbers.
Example:
+db_multirow -extend { user_url } users users_query { + select user_id first_names, last_name, email from cc_users +} { + set user_url [acs_community_member_url -user_id $user_id] +} + ++
+db_resultrows
db_resultrows
Returns the number of rows affected or returned by the previous +statement.
+
+db_with_handle
+db_with_handlevarcode_block +
Places a database handle into the variable var
+and executes code_block
. This is useful when
+you don't want to have to use the new API (db_foreach
, db_1row
, etc.), but need to use database
+handles explicitly.
Example:
+ +proc lookup_the_foo { foo } { + db_with_handle db { + return [db_string unused "select ..."] + } +} + +db_with_handle db { + # Now there's a database handle in $db. + set selection [ns_db select $db "select foo from bar"] + while { [ns_db getrow $db $selection] } { + set_variables_after_query + + lookup_the_foo $foo + } +} + ++
+db_name
db_name
Returns the name of the database, as returned by the driver.
+
+db_type
db_type
Returns the RDBMS type (i.e. oracle, postgresql) this OpenACS +installation is using. The nsv ad_database_type is set up during +the bootstrap process.
+
+db_compatible_rdbms_p
+db_compatible_rdbms_p db_type
+
+
Returns 1 if the given db_type is compatible with the current +RDBMS.
+
+db_package_supports_rdbms_p
+db_package_supports_rdbms_p db_type_list
+
+
Returns 1 if db_type_list contains the current RDMBS type. A +package intended to run with a given RDBMS must note this in it's +package info file regardless of whether or not it actually uses the +database.
+
+db_legacy_package_p
+db_legacy_package_p db_type_list
+
+
Returns 1 if the package is a legacy package. We can only tell +for certain if it explicitly supports Oracle 8.1.6 rather than the +OpenACS more general oracle.
+
+db_version
db_version
Returns the RDBMS version (i.e. 8.1.6 is a recent Oracle +version; 7.1 a recent PostgreSQL version.
+
+db_current_rdbms
db_current_rdbms
Returns the current rdbms type and version.
+
+db_known_database_types
db_known_database_types
Returns a list of three-element lists describing the database +engines known to OpenACS. Each sublist contains the internal +database name (used in file paths, etc), the driver name, and a +"pretty name" to be used in selection forms displayed to the +user.
The nsv containing the list is initialized by the bootstrap +script and should never be referenced directly by user code. +Returns the current rdbms type and version.
+