<%= [openacs_header "Query Dispatcher"] %>

Query Dispatcher

<%= [openacs_menu] %>

Goals

OpenACS 4.x should be, to some degree, database-independent:

New RDBMS support should be implementable on a per-package basis without editing existing code.
Packages should be storage-aware, and potentially multi-storage capable.

Storage-Aware Architecture

Each ACS classic package contains storage-dependent pieces, including:

a data model for one RDBMS, usually Oracle 8
default queries using the new database API, written for Oracle 8

In order to safely extend the ACS package structure, an OpenACS package will include additional information, including:

data model files tagged by RDBMS type and version
RDBMS type and version of default queries
additional queries for other RDBMSs.

If this additional information isn't included in a package's meta-data, the OpenACS package manager will assume:

the single data model provided is built for Oracle 8
the default queries are written for Oracle 8

Because we don't want to bias OpenACS for one database or another, and because it is quite confusing that some queries are inline while others are not, OpenACS will aim to quickly move all queries outside of the Tcl code, including the Oracle queries.

Data Models

The data model of a package in ACS Classic 4.x is stored in package_root/sql. OpenACS 4.x will change this convention to allow for multiple data models. The package_root/sql location will be split into multiple directories, one per supported RDBMS. The directories will be named according to the RDBMS (and potentially version) they refer to. No attempt will be made at finding common data model components across RDBMSs, as this seems to complicate the situation more than simplify anything. However, it is expected that the SQL creation files are exactly parallel for all RDBMSs supported by a single package. While these specifications may not assume such a file hierarchy at this point in time, they may change to make such an assumption at a later date.

Thus, a directory package_root/sql might look like:

package_root/sql/oracle/
package_root/sql/oracle-9/
package_root/sql/postgres
package_root/sql/postgres-7.0

Such a structure would mean that an Oracle 8 installation would use the default oracle directory, while an Oracle 9 will use the specific oracle-9 directory (this supposes that Oracle 9 has some special features). Similarly, PostgreSQL 7.1 will use the default postgres directory, while someone might choose to make the system also compatible with PostgreSQL 7.0 by adding a specific data model for it in postgres-7.0.

In order to make this work in the APM architecture, the .info file should include RDBMS information. The <%= [ns_quotehtml ""] %> currently contains no additional information. The OpenACS package specification will add an RDBMS tag (NOT an attribute!) as follows:

<%= [ns_quotehtml "oracle9"] %>

For now, this information will be redundant with the naming scheme selected above. Again, this may change at any time. Thus, both the naming convention and the .info additions must be respected.

A Full Query

A query in ACS Classic 4.x is currently tagged uniquely by Tcl script or Tcl procedure. Since it is relatively easy to determine in which Tcl page or procedure a DB call is being made, we have a means of uniquely identifying queries in a system-wide manner. We must also attach all required meta-data to a query. A Full Query is thus defined as:

the SQL of the query
the bind variables expected (which may be implicit in the query)
the RDBMS(s) the query is designed for

The best way to specify a "full query" is via a human-readable, easily-parsable format. Using XML for this purpose is quite appropriate. A full query would then be defined as:

<%= [ns_quotehtml "

select * from foo where foo_id= :foo_id

foo_id

postgresql7

"] %>

The <%= [ns_quotehtml ""] %> component can be left out, or the <%= [ns_quotehtml ""] %> can be left out altogether. The Query Dispatcher will pick the most specific query possible for a given setup. Thus, it is possible to specify a generic SQL92 query that will apply to all RDBMSs. Only when an RDBMS isn't compatible with that generic query will there need to be a more precisely specified query for that particular RDBMS. This allows developers to focus on SQL92 queries, and to work on RDBMS-specific query as a "diff" from the standard. (The fact that this encourages the use of compliant RDBMSs is not a bad thing, either).

The full-query and query dispatching mechanism need to be well-abstracted for possible future enhancements in the means of storing and organizing these full queries. The following API will allow this:

db_fullquery_fetch fullquery_global_name
This retrieves a FullQuery data structure using the global name. The FullQuery data structure is then used by the following API calls to obtain further query information.
db_fullquery_get_text fullquery {rdbms}
Retrieves the SQL query text from the FullQuery data structure. If the RDBMS parameter is provided, the text for that RDBMS is returned. Otherwise, the current RDBMS is used. If no query exists for this RDBMS, this throws an exception.
db_fullquery_compatible_p fullquery {rdbms}
Checks if a FullQuery is compatible with a given RDBMS. If no RDBMS parameter is provided, the FullQuery is checked against the current RDBMS.
db_fullquery_get_bindvars fullquery
Returns a Tcl list of bind variable names for the FullQuery.
db_current_rdbms
Returns the current RDBMS, which can be detailed using the following two procedures.
db_rdbms_get_type rdbms
Returns the type of a given RDBMS.
db_rdbms_get_version rdbms
Returns the version of a given RDBMS.
db_rdbms_match rdbms_instance rdbms_pattern
Returns true if rdbms_instance matches the rdbms_patterns. rdbms_pattern might be something like oracle with no version, while rdbms_instance might be oracle 8.1.6, in which case the procedure returns true. If the rdbms_patterns is oracle version 9, though, the procedure will return false.

The Simple Dispatching Process

The Query Dispatcher must be able to locate the various versions of each query in order to create the FullQuery data structures. For this, each FullQuery must have a global name and, from that global name, a means of locating and loading the query.

Naming Scheme

The naming scheme for a FullQuery is specified as all-lowercase, as follows:

package_name.section_name.chunk_name.query_name

where :

the package_name is the name of the OpenACS package to which this query belongs. Packages are uniquely named in the system, and anything belonging to the kernel will be tagged acs_kernel for the purposes of this naming scheme.
the section_name is either tcl or www depending on whether the query is in a Tcl procedure or web page.
the chunk_name is either a Tcl procedure name, or a web page path.
the query_name is the actual query tag in the db_ API call.

Locating FullQuery Information

FullQuery information will be stored as XML text files. In order to provide the same flexibility as Tcl procedure files and independent web pages, FullQueries will be stored, by convention, in files that parallel the directory hierarchy of the code they pertain to.

For example, in package acs-subsite, the file www/register/deleted-user.tcl will be complemented by the file www/register/deleted-user.postgres.sql and potentially by the file www/register/deleted-user.oracle.sql.

The Query Dispatcher will look at any .sql files in the tcl and www directories, and load the FullQuery information from those files. The actual names of the files (deleted-user and .oracle extension) don't matter, as the fully-qualified name of the query and the RDBMS-compatibilities are defined in the FullQuery XML itself. The file naming is a convention to make it easier for developers to get at the right query text.

Storing & Refreshing FullQuery Information

The Query Dispatcher will load up all queries and store them in an nsv array at AOLserver/OpenNSD boot time. The file location of each query will also be stored in the nsv array so that, at a later date, the query information can be easily located and reloaded on a per-query-file basis. We assume that, if the definition of a query changes place (which shouldn't happen when users are following the proper naming conventions), an AOLserver/OpenNSD reboot will be necessary to properly reload the queries.

During development time, it is perfectly acceptable to reparse the queries on every DB query. The first version of the Query Dispatcher will not bother with much caching, in fact. The first production release, however, will provide two means of caching:

all queries loaded up only once at server startup (i.e. no ability to reload queries, good only for production sites)
queries reloaded when the source file has changed

Dynamic SQL Strategies

under development <%= [openacs_footer] %>