Hierarchical data

<authorblurb>

by Jade Rubick with help from many people in the OpenACS community

</authorblurb>

One of the nice things about using the OpenACS object system is that it has a built-in facility for tracking hierarchical data in an efficient way. The algorithm behind this is called tree_sortkey.

Any time your tables are subclasses of the acs_objects table, then you automatically get the ability to structure them hierarchically. The way you do this is currently via the context_id column of acs_objects (Note that there is talk of adding in a parent_id column instead, because the use of context_id has been ambiguous in the past). So when you want to build your hierarchy, simply set the context_id values. Then, when you want to make hierarchical queries, you can do them as follows:

      db_multirow categories blog_categories "
      SELECT
      c.*,
      o.context_id,
      tree_level(o.tree_sortkey)
      FROM
      blog_categories c,
      acs_objects o
      WHERE
      c.category_id = o.object_id
      ORDER BY
      o.tree_sortkey"
    

Note the use of the tree_level() function, which gives you the level, starting from 1, 2, 3...

Here's an example, pulling all of the children for a given parent:

      SELECT 
      children.*,
      tree_level(children.tree_sortkey) -
        tree_level(parent.tree_sortkey) as level
      FROM 
      some_table parent, 
      some_table children
      WHERE 
      children.tree_sortkey between parent.tree_sortkey and tree_right(parent.tree_sortkey)
      and parent.tree_sortkey <> children.tree_sortkey
      and parent.key = :the_parent_key;
      

The reason we subtract the parent's tree_level from the child's tree_level is that the tree_levels are global, so if you want the parent's tree_level to start with 0, you'll want the subtraction in there. This is a reason you'll commonly see magic numbers in tree_sortkey SQL queries, like tree_level(children.tree_sortkey) - 4. That is basically an incorrect way to do it, and subtracting the parent's tree_level is the preferred method.

This example does not include the parent. To return the entire subtree including the parent, leave out the non-equals clause:

      SELECT
      subtree.*,
      tree_level(subtree.tree_sortkey) -
        tree_level(parent.tree_sortkey) as level
      FROM some_table parent, some_table subtree
      WHERE 
      subtree.tree_sortkey between parent.tree_sortkey and tree_right(parent.tree_sortkey)
      and parent.key = :the_parent_key;
    

If you are using the Content Repository, you get a similar facility, but the parent_id column is already there. Note you can do joins with tree_sortkey:

      SELECT
      p.item_id,
      repeat(:indent_pattern, (tree_level(p.tree_sortkey) - 5)* :indent_factor) as indent,
      p.parent_id as folder_id,
      p.project_name
      FROM pm_projectsx p, cr_items i
      WHERE p.project_id = i.live_revision
      ORDER BY i.tree_sortkey
    

This rather long thread explains How tree_sortkeys work and this paper describes the technique for tree_sortkeys, although the OpenACS implementation has a few differences in the implementation, to make it work for many languages and the LIKE construct in Postgres.