Upgrading existing ADPs to noquote templating

Introduction.

The variable substitution in the templating system has been changed to be friendlier towards quoting. The rationale for the change and the definitions of terms like quoting are given in the quoting article. As it discusses these concepts in some depth, we see no reason to repeat them here. Instead, we will assume that you have read that article and focus on the topic of this one: the changes you need to apply to make your module conform to the new quoting rules.

This text was written as a result of our efforts to make the ACS installation for the German Bank project work, so it is based on field experience rather than academic discussion. We hope you will find it useful.

Recap of the Theory.

The change to the templating system can be expressed in one sentence:
All variables are now quoted by default, except those explicitly protected by ;noquote or ;literal.
This means that the only way your code can fail is if the new code quotes a variable which is not meant to be quoted, and that is where ;noquote needs to be added. That is all the porting effort that is required. More precisely, variables are subject to both HTML-quoting and internationalization. The suffix ;noquote means that the variable's content will be internationalized but not HTML-quoted, while ;noi18n means quote but don't internationalize. Finally, ;literal means don't quote and don't internationalize.

This is not hard because most variables will not be affected by this change. Most variables either need to be quoted (those containing textual data that comes from the database or from the user) or are unaffected by quoting (numerical database IDs, etc.). The variables for which the new behavior is undesired are those that contain HTML which is expected to be included as part of the page, and those that are already quoted by Tcl code. Such variables should be protected from quoting by the ;noquote modifier.
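
The three kinds of variables can be told apart by what quoting does to them. The following is a minimal sketch in plain Tcl; quote_html is a hypothetical stand-in for the server's quoting routine (it is not part of ACS), and the sample values are invented for illustration.

```tcl
# Hypothetical stand-in for the HTML-quoting routine, written in plain Tcl
# so that the sketch runs in any tclsh.
proc quote_html {text} {
    return [string map [list & "&amp;" < "&lt;" > "&gt;" \" "&quot;"] $text]
}

set user_comment  {I <3 Tcl & ADP}   ;# text from the user: must be quoted
set object_id     4711               ;# numerical ID: unaffected by quoting
set html_fragment {<b>bold</b>}      ;# markup meant for the page

puts [quote_html $user_comment]      ;# I &lt;3 Tcl &amp; ADP
puts [quote_html $object_id]         ;# 4711
puts [quote_html $html_fragment]     ;# &lt;b&gt;bold&lt;/b&gt;
```

Only the last value is damaged by quoting: the user would see the tags as text. That is the kind of variable that needs ;noquote in the template.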

The Most Common Cases.

The most common cases where you need to add ;noquote to the variable name are easy to recognize.

Hidden form variables.
Also known as "hidden input fields", hidden form variables are form fields with pre-defined values which are not shown to the user. These days they are used for transferring internal state across several form pages. In HTML, hidden form variables look like this:

<form>
  <input type="hidden" name="var1" value="value1">
  <input type="hidden" name="var2" value="value2">
  ... real form stuff ...
</form>
      
ACS has a convenience function for creating hidden form variables, export_vars with the -form switch (older code uses export_form_vars). It accepts a list of variable names and returns the HTML code containing the hidden input tags that map those names to their values, as found in the Tcl environment. Typically, the Tcl code stores this HTML in a variable:
set form_vars [export_vars -form {var1 var2}]
      
The ADP will simply refer to the form_vars variable:
<form>
  @form_vars@              <!-- WRONG!  Needs noquote -->
  ... real form stuff ...
</form>
      
This will no longer work as intended because form_vars will be, like any other variable, quoted, and the user will end up seeing raw HTML text of the hidden variables. Even worse, the browser will not be aware of these form fields, and the page will not work. After protecting the variable with ;noquote, everything works as expected:
<form>
  @form_vars;noquote@
  ... real form stuff ...
</form>
      

Snippets of HTML produced by Tcl code, a.k.a. widgets.
Normally we try to fit all HTML code into the ADP template and have the Tcl code handle the "logic" of the program. And yet, sometimes pieces of relatively convoluted HTML need to be included in many templates. In such cases, it makes sense to generate the widget programmatically and include it into the template as a variable. A typical widget is a date entry widget which provides the user the input and selection boxes for year, month, and day, all of which default to the current date.

Another example of a widget is the context bar often found at the top of ACS pages.

Obviously, all widgets should be treated as HTML and therefore adorned with the ;noquote qualifier. This also assumes that the routines that build the widget are correctly written and that they will quote the components used to build the widget.
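
As an illustration of a "correctly written" widget routine, here is a minimal sketch in plain Tcl. Both quote_html and select_widget are hypothetical names made up for this example, not ACS procedures; the point is only that the routine quotes every component it interpolates, so the finished HTML can be trusted and passed through with ;noquote.

```tcl
# Hypothetical stand-in for the HTML-quoting routine (plain Tcl).
proc quote_html {text} {
    return [string map [list & "&amp;" < "&lt;" > "&gt;" \" "&quot;"] $text]
}

# Hypothetical widget builder: a select box made from value/label pairs.
# Every component is quoted here, because the template will not quote it.
proc select_widget {name options} {
    set html "<select name=\"[quote_html $name]\">\n"
    foreach {value label} $options {
        append html "  <option value=\"[quote_html $value]\">[quote_html $label]</option>\n"
    }
    append html "</select>"
    return $html
}

set dept_widget [select_widget dept {rd {R&D} ops {Ops <EMEA>}}]
puts $dept_widget
```

The template would then refer to the result as @dept_widget;noquote@. The labels still reach the browser as R&amp;D and Ops &lt;EMEA&gt;, because the builder quoted them itself.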

Pieces of text that are already quoted.
This quoting is usually part of a more general preparation for HTML rendering of the text. For instance, a bboard posting can be either HTML or text. If it is HTML, we transmit it as is; if not, we perform quoting, word-wrapping, etc. In both cases it is obvious that quoting performed by the templating system would be redundant, so we must be careful to add ;noquote to the ADP.
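
A minimal sketch of such a preparation step, in plain Tcl, might look as follows. The names quote_html and prepare_posting are hypothetical, and turning newlines into <br> tags merely stands in for the "word-wrapping, etc." mentioned above.

```tcl
# Hypothetical stand-in for the HTML-quoting routine (plain Tcl).
proc quote_html {text} {
    return [string map [list & "&amp;" < "&lt;" > "&gt;" \" "&quot;"] $text]
}

# Hypothetical helper: prepare the body of a posting for display.
proc prepare_posting {body is_html} {
    if {$is_html} {
        # HTML postings are transmitted as is.
        return $body
    }
    # Plain text is quoted first, then line breaks are made visible.
    return [string map [list \n "<br>\n"] [quote_html $body]]
}

set shown_text [prepare_posting "a < b\nsecond line" 0]
set shown_html [prepare_posting {<b>hi</b>} 1]
puts $shown_text
puts $shown_html
```

Either result is ready-to-render HTML, so the template refers to it with ;noquote; letting the templating system quote it again would show the <br> tags and entities to the user.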

The property and include Gotchas.

Transfer of parameters between included ADPs often requires manual addition of ;noquote. Let's review why.

The property tag is used to pass a piece of information to the master template. An ADP author uses it when deliberately letting the master template handle a variable given by the Tcl code. Typically page titles, headings, and context bars are handled this way. For example:

master:
<head>
  <title>@title@</title>
</head>
<body bgcolor="#ffffff">
  <h1>@heading@</h1>
  <slave>
</body>
      
slave:
<master>
<property name="title">@title@</property>
<property name="heading">@title@</property>
...
      
The obvious intention of the master is to allow its slave templates to provide a "title" and a "heading" of the page in a standardized fashion. The obvious intention of our slave template is to allow its corresponding Tcl code to set a single variable, title, which will be used for both title and heading. What's wrong with this code?

The problem is that title gets quoted twice, once by the slave template, and once by the master template. This is the result of how the templating system works: every occurrence of @variable@ is converted to [ad_quotehtml $variable], even when it is used only to set a property and you would expect the quoting to be suppressed.

Implementation note: Ideally, the templating system should avoid this pitfall by quoting the variable (or not) only once, at the point where the value is passed from the Tcl code to the templating system. However, no such point in time exists because what in fact happens is that the template gets compiled into code that simply takes what it needs from the environment and then does the quoting. Properties are passed to the master so that all the property variables are shoved into an environment; by the time the master template is executed, all information on which variable came from where and whether it might have already been quoted is lost.

This occurrence is often referred to as over-quoting. Over-quoting is sometimes hard to detect because things seem to work fine in most cases. To notice the problem in the example above (and in any other over-quoting example), the title needs to contain one of the characters <, > or &. If it does, they will appear quoted to the user instead of appearing as-is.
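
The effect is easy to reproduce outside the templating system. The following plain-Tcl sketch uses quote_html, a hypothetical stand-in for the quoting routine, and an invented title; it quotes the value once (as intended) and twice (as happens when both slave and master quote it).

```tcl
# Hypothetical stand-in for the HTML-quoting routine (plain Tcl).
proc quote_html {text} {
    return [string map [list & "&amp;" < "&lt;" > "&gt;" \" "&quot;"] $text]
}

set title {Q&A: <master> tags}

set once  [quote_html $title]   ;# what the browser should receive
set twice [quote_html $once]    ;# what it receives when over-quoted

puts $once    ;# Q&amp;A: &lt;master&gt; tags
puts $twice   ;# Q&amp;amp;A: &amp;lt;master&amp;gt; tags

# A title without special characters survives double quoting unchanged,
# which is why the bug so often goes unnoticed.
puts [quote_html [quote_html {Plain title}]]   ;# Plain title
```

A browser renders the first result as the original title, and the second as the literal text "Q&amp;A: &lt;master&gt; tags".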

Over-quoting is resolved by suppressing the quoting of one of the two variables. We strongly recommend that you add ;literal inside the property tag rather than ;noquote in the master. First, the master is conceptually the one that "shows" the variable, so it makes sense that it gets to quote it. Second, a property tag is supposed to merely transfer a piece of text to the master; it is much cleaner and more maintainable if this transfer is defined to be non-lossy. This becomes important in practice when there is a hierarchy of master templates -- e.g. one for the package and one for the whole site.

To reiterate, a bug-free version of the slave template looks like this:

slave sans over-quoting:
<master>
<property name="title">@title;literal@</property>
<property name="heading">@title;literal@</property>
...
      

The exact same problem arises when the include statement passes some text. Here is an example:

Including template:
<include src="user-kick-form" id=@kicked_id@ reason=@default_reason@>
      
Included template:
<form action="do-kick" method=POST>
  Kick user @name@.<br>
  Reason: <textarea name=reason>@reason@</textarea><br>
  <input type="submit" value="Kick">
</form>
      
Here an include statement is used to include an HTML form widget, parts of which are defined by the Tcl variables $kicked_id and $default_reason, whose values presumably come from the database.

What happens is that the reason that prefills the textarea is over-quoted. The cause is the same as in the last example: it gets quoted once by the includer, and a second time by the included page. The fix is also similar: when you transfer non-constant text to an included page, make sure to add ;literal.

Including template, sans over-quoting:
<include src="user-kick-form" id=@kicked_id;literal@ reason=@default_reason;literal@>
      

Upgrade Overview.

Upgrading a module to handle the new quoting rules consists of applying the process mentioned above to every ADP in the module. Using the knowledge gained above, we can specify exactly what needs to be done for each template. The items are sorted approximately by frequency of occurrence of the problem.
  1. Audit the template for variables that export form variables and add ;noquote to them.

  2. More generally, audit the template for variables that are known to contain HTML, e.g. those that contain widgets or HTML content provided by the user. Add ;noquote to them.

  3. Add ;literal to variables used inside the property tag.

  4. Add ;literal to textual variables whose values are attributes to the include tag.

  5. Audit the template for occurrences of <%= [ns_quotehtml @variable@] %> and replace them with @variable@.

  6. Audit the Tcl code for occurrences of ns_quotehtml. If it is used to build an HTML component, leave it, but take note of the variable the result gets saved to. Otherwise, remove the quoting.

  7. Add ;noquote to the "HTML component" variables noted in the previous step.
After that, test that the template behaves as it should, and you're done.
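
The audit steps above can be partly mechanized. The following plain-Tcl sketch lists the variables a template references without any modifier, so that each one can be checked by hand against the steps. The procedure name unmodified_refs and the pattern used for variable names are assumptions made for this example; the pattern is a heuristic and may report false matches (for instance, text between two unrelated @ signs).

```tcl
# Hypothetical audit helper: return the names of all @variable@ references
# in an ADP source text that carry no ;modifier.
proc unmodified_refs {adp_text} {
    set names {}
    # References with a modifier contain a semicolon and therefore do not
    # match this pattern.
    foreach {match name} [regexp -all -inline {@([A-Za-z_][A-Za-z0-9_.:]*)@} $adp_text] {
        lappend names $name
    }
    return $names
}

set adp {<master>
<property name="title">@title@</property>
<form>@form_vars;noquote@ Hello, @user_name@</form>}

puts [unmodified_refs $adp]   ;# title user_name
```

A variable appearing in the output is not necessarily wrong: user_name above should stay quoted, while title sits inside a property tag and needs ;literal. The helper only narrows down where to look.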

Testing.

Fortunately, most of the problems with automatic quoting are very easy to diagnose. The most important point for testing is that it covers as many cases as possible: ideally testing should cover all the branches in all the templates. But regardless of the quality of your coverage, it is important to know how to conduct proper testing for the quoting changes. Here are the cases you need to watch out for.
Hrvoje Niksic
Last modified: Thu Aug 20 18:38:05 CEST 2015