%= [openacs_header "Query Dispatcher"] %>
Query Dispatcher
<%= [openacs_menu] %>
Goals
OpenACS 4.x should be, to some degree, database-independent:
- New RDBMS support should be implementable on a per-package basis without editing existing code.
- Packages should be storage-aware, and potentially multi-storage capable.
Storage-Aware Architecture
Each ACS classic package contains storage-dependent pieces, including:
- a data model for one RDBMS, usually Oracle 8
- default queries using the new database API, written for Oracle 8
In order to safely extend the ACS package structure, an OpenACS package will include additional information, including:
- data model files tagged by RDBMS type and version
- RDBMS type and version of default queries
- additional queries for other RDBMSs.
If this additional information isn't included in a package's meta-data, the OpenACS package manager will assume:
- the single data model provided is built for Oracle 8
- the default queries are written for Oracle 8
Because we don't want to bias OpenACS for one database or another, and
because it is quite confusing that some queries are inline while
others are not, OpenACS will aim to quickly move all queries
outside of the Tcl code, including the Oracle queries.
Data Models
The data model of a package in ACS Classic 4.x is stored in package_root/sql. OpenACS 4.x will change this convention to allow for multiple data models. The package_root/sql location will be split into multiple directories, one per supported RDBMS. The directories will be named according to the RDBMS (and potentially version) they refer to. No attempt will be made at finding common data model components across RDBMSs, as this seems to complicate the situation more than simplify anything. However, it is expected that the SQL creation files are exactly parallel for all RDBMSs supported by a single package. While these specifications may not assume such a file hierarchy at this point in time, they may change to make such an assumption at a later date.
Thus, a directory package_root/sql might look like:
- package_root/sql/oracle/
- package_root/sql/oracle-9/
- package_root/sql/postgres
- package_root/sql/postgres-7.0
Such a structure would mean that an Oracle 8 installation would use the default oracle directory, while an Oracle 9 will use the specific oracle-9 directory (this supposes that Oracle 9 has some special features). Similarly, PostgreSQL 7.1 will use the default postgres directory, while someone might choose to make the system also compatible with PostgreSQL 7.0 by adding a specific data model for it in postgres-7.0.
In order to make this work in the APM architecture, the .info file should include RDBMS information. The <%= [ns_quotehtml ""] %> currently contains no additional information. The OpenACS package specification will add an RDBMS tag (NOT an attribute!) as follows:
<%= [ns_quotehtml "oracle9"] %>
For now, this information will be redundant with the naming scheme selected above. Again, this may change at any time. Thus, both the naming convention and the .info additions must be respected.
A Full Query
A query in ACS Classic 4.x is currently tagged uniquely by Tcl script or Tcl procedure. Since it is relatively easy to determine in which Tcl page or procedure a DB call is being made, we have a means of uniquely identifying queries in a system-wide manner. We must also attach all required meta-data to a query. A Full Query is thus defined as:
- the SQL of the query
- the bind variables expected (which may be implicit in the query)
- the RDBMS(s) the query is designed for
The best way to specify a "full query" is via a human-readable, easily-parsable format. Using XML for this purpose is quite appropriate. A full query would then be defined as:
<%= [ns_quotehtml "
select * from foo where foo_id= :foo_id
foo_id
postgresql7
"] %>
The <%= [ns_quotehtml ""] %> component can be left out, or the <%= [ns_quotehtml ""] %> can be left out altogether. The Query Dispatcher will pick the most specific query possible for a given setup. Thus, it is possible to specify a generic SQL92 query that will apply to all RDBMSs. Only when an RDBMS isn't compatible with that generic query will there need to be a more precisely specified query for that particular RDBMS. This allows developers to focus on SQL92 queries, and to work on RDBMS-specific query as a "diff" from the standard. (The fact that this encourages the use of compliant RDBMSs is not a bad thing, either).
The full-query and query dispatching mechanism need to be well-abstracted for possible future enhancements in the means of storing and organizing these full queries. The following API will allow this:
- db_fullquery_fetch fullquery_global_name
This retrieves a FullQuery data structure using the global name. The FullQuery data structure is then used by the following API calls to obtain further query information.
- db_fullquery_get_text fullquery {rdbms}
Retrieves the SQL query text from the FullQuery data structure. If the RDBMS parameter is provided, the text for that RDBMS is returned. Otherwise, the current RDBMS is used. If no query exists for this RDBMS, this throws an exception.
- db_fullquery_compatible_p fullquery {rdbms}
Checks if a FullQuery is compatible with a given RDBMS. If no RDBMS parameter is provided, the FullQuery is checked against the current RDBMS.
- db_fullquery_get_bindvars fullquery
Returns a Tcl list of bind variable names for the FullQuery.
- db_current_rdbms
Returns the current RDBMS, which can be detailed using the following two procedures.
- db_rdbms_get_type rdbms
Returns the type of a given RDBMS.
- db_rdbms_get_version rdbms
Returns the version of a given RDBMS.
- db_rdbms_match rdbms_instance rdbms_pattern
Returns true if rdbms_instance matches the rdbms_patterns. rdbms_pattern might be something like oracle with no version, while rdbms_instance might be oracle 8.1.6, in which case the procedure returns true. If the rdbms_patterns is oracle version 9, though, the procedure will return false.
The Simple Dispatching Process
The Query Dispatcher must be able to locate the various versions of
each query in order to create the FullQuery data structures. For this,
each FullQuery must have a global name and, from that global name, a
means of locating and loading the query.
Naming Scheme
The naming scheme for a FullQuery is specified as all-lowercase, as follows:
package_name.section_name.chunk_name.query_name
where :
- the package_name is the name of the OpenACS
package to which this query belongs. Packages are uniquely named in
the system, and anything belonging to the kernel will be tagged
acs_kernel for the purposes of this naming scheme.
- the section_name is either tcl or
www depending on whether the query is in a Tcl procedure or
web page.
- the chunk_name is either a Tcl procedure name, or
a web page path.
- the query_name is the actual query tag in the
db_ API call.
Locating FullQuery Information
FullQuery information will be stored as XML text files. In order to
provide the same flexibility as Tcl procedure files and independent
web pages, FullQueries will be stored, by convention, in files that
parallel the directory hierarchy of the code they pertain to.
For example, in package acs-subsite, the file
www/register/deleted-user.tcl will be complemented by the
file www/register/deleted-user.postgres.sql and potentially
by the file www/register/deleted-user.oracle.sql.
The Query Dispatcher will look at any .sql files in the
tcl and www directories, and load the FullQuery
information from those files. The actual names of the files
(deleted-user and .oracle extension) don't matter, as
the fully-qualified name of the query and the RDBMS-compatibilities
are defined in the FullQuery XML itself. The file naming is a
convention to make it easier for developers to get at the right query
text.
Storing & Refreshing FullQuery Information
The Query Dispatcher will load up all queries and store them in an nsv
array at AOLserver/OpenNSD boot time. The file location of each query
will also be stored in the nsv array so that, at a later date, the
query information can be easily located and reloaded on a
per-query-file basis. We assume that, if the definition of a query
changes place (which shouldn't happen when users are following the
proper naming conventions), an AOLserver/OpenNSD reboot will be
necessary to properly reload the queries.
During development time, it is perfectly acceptable to reparse the
queries on every DB query. The first version of the Query Dispatcher
will not bother with much caching, in fact. The first production
release, however, will provide two means of caching:
- all queries loaded up only once at server startup (i.e. no
ability to reload queries, good only for production sites)
- queries reloaded when the source file has changed
Dynamic SQL Strategies
under development
<%= [openacs_footer] %>