Index: openacs-4/packages/acs-content-repository/www/doc/tutorial.adp =================================================================== RCS file: /usr/local/cvsroot/openacs-4/packages/acs-content-repository/www/doc/tutorial.adp,v diff -u -N -r1.2 -r1.2.2.1 --- openacs-4/packages/acs-content-repository/www/doc/tutorial.adp 27 Oct 2014 16:39:13 -0000 1.2 +++ openacs-4/packages/acs-content-repository/www/doc/tutorial.adp 25 Aug 2015 18:02:02 -0000 1.2.2.1 @@ -2,67 +2,88 @@ {/doc/acs-content-repository {Content Repository}} {ACS Content Repository Tutorial} ACS Content Repository Tutorial - -

How to use the content repository

-by Jade Rubick

Why use the content repository?

Let's say you're a developer making a package for OpenACS. + +by Jade Rubick +

Why use the content repository?

Let's say you're a developer making a package for OpenACS. You've heard statements like, "every package should use the content repository", or maybe a developer has suggested that you use it. Or maybe you just stumbled across it. Why would you want to spend your time reading this document and wasting a good afternoon when you -could get started coding right away?

The simple answer is that the content repository (CR) gives you -many different things for free:

The simple answer is that the content repository (CR) gives you +many different things for free:

support for versioning
hierarchical organization of items
permissions
extending your tables dynamically

State of this document

Introduction

For the sake of an example, I'm going to use a Tasks +

Introduction

For the sake of an example, I'm going to use a Tasks application. This application will keep track of all the tasks of an organization, deadlines for those tasks, and who needs to work -on them.

The reason I might be interested in using the content repository +on them.

The reason I might be interested in using the content repository (CR) in this case is that I can keep track of all the changes to a particular Task, so that if someone changes the deadline on an item, I can see what the original deadline was. In addition, I as a developer would like to have sub-tasks, and sub-sub-tasks, so I can have a whole hierarchy of things that need to be accomplished. Big tasks can be sub-divided so that several people can each do their -particular parts.

So I decide to create a Tasks table. Each of these Tasks has +particular parts.

So I decide to create a Tasks table. Each of these Tasks has various information associated with it, such as deadlines, -descriptions of what needs to be accomplished, and so on:

+descriptions of what needs to be accomplished, and so on:

 Task
           Title
           Description
           Task Number
         
-

Overview

First of all, let's get some terminology out of the way. Columns +

Overview

First of all, let's get some terminology out of the way. Columns of a table are referred to as attributes in content repository-speak.

What attributes do you want?
Define tables
Describe attributes

What attributes do you want?

The first step is to decide on what part of a Task you'd you'd + +

What attributes do you want?

The first step is to decide on what part of a Task you'd you'd like to have under revision control, and what portion you'd like to have just one version of. In our case, the only thing we wouldn't want under version control is the Task Number. This will be a unique identifier for the task, and we don't want that changing -every time someone edits it.

For our simple example:

+every time someone edits it.

For our simple example:

           Title - want versions
           Description - want versions
           Task Number - do NOT want versions
         
-

Define tables

You will have two tables: one with versioned attributes, and one -without versioned attributes.

Define tables

You will have two tables: one with versioned attributes, and one +without versioned attributes.

Convention: often, developers will name the first table by what it is (in my case pm_tasks), and the second, versioned table by the same name, but with _revisions at the end. Thus, I'll name my second table pm_tasks_revisions.

Versioned portion:

+Versioned portion:
+             create table pm_tasks_revisions (
             task_revision_id
             integer 
@@ -77,7 +98,9 @@
             varchar(4000)
             );
           
-
Unversioned portion:
+
+Unversioned portion:
+             create table pm_tasks (
             task_id
             integer
@@ -90,27 +113,34 @@
             integer
             )
           
-
One thing you have to be careful of when creating these tables
+

One thing you have to be careful of when creating these tables is that there are no columns that have the same names as any of the columns in the cr_items and cr_revisions tables. For example, you can't call you key on the pm_tasks_revisions table revision_id. Why? There are some views that are automatically generated that combine these tables for you, but they won't be created if the names conflict. I'll describe what these views are later, but they are useful. You -were warned.

Notice that each table uses as its primary key a reference to +were warned.

Notice that each table uses as its primary key a reference to either the cr_revisions table or the cr_items table. A content item is basically just some content: either text or binary data. The contents revisions table keeps track of which version from the tasks_revisions table is the most current, and which one is -live.

All this is going inside the +live.

All this is going inside the sql/postgresql/project-manager-create.sql file. Your -name will be different of course.

Describe attributes

After we've created the two tables, we need to let the content +name will be different of course.

Describe attributes

After we've created the two tables, we need to let the content repository know that we have a new type of structured data that we are storing in the content repository. Tasks are a "content type", because they have data associated with them, such as when they are -due, and what needs to be done.

I thus need to to

+due, and what needs to be done.
+I thus need to to
+           --create the content type
           select content_type__create_type (
           'pm_task', -- content_type   
@@ -122,10 +152,12 @@
           'content_revision.revision_name'
           );
         
-
You then need to add in all the attributes, so that the content
+

You then need to add in all the attributes, so that the content repository can do some magic things behind the scenes. The content repository doesn't know about what's inside of the pm_tasks -and pm_tasks_revisions tables, so we teach it:

+and pm_tasks_revisions tables, so we teach it:
+           -- add in attributes
           
           select content_type__create_attribute (
@@ -161,17 +193,21 @@
           'numeric' -- column_spec
           );
         
-

+

Side effect: once you've created the content type, the content repository creates a view for you called pm_tasks_revisionsx. Note the x at the end of the name. If you're using Postgres, I believe it will also create a view for you called pm_tasks_revisionsi -

Why are these two views created? the x view is created for +

Why are these two views created? the x view is created for selection, and the i view is created for inserts. They join the acs_objects, cr_revisions, and our pm_tasks_revisions tables together. Try viewing them to get an idea of how they might be -useful.

Advanced topic: Creating types and attributes

It is also possible to dynamically create tables, and extend +useful.

Advanced topic: Creating types and attributes

It is also possible to dynamically create tables, and extend them with extra columns. You could do this by using create table or alter table add column statements in SQL, but this also adds in some meta-data that will be useful to @@ -180,12 +216,14 @@ really cool stuff with it, like automatically generate interfaces that take advantage of the new columns and tables you've added. Another nice thing is that all that messy business of defining your -attributes through the API is taken care of.

+attributes through the API is taken care of.

Types is the content repository are another term for tables, although that doesn't explain it completely. Types are also kept track of within OpenACS, in the acs_object_types table, so the system knows about the tables you create, and can do -some intelligent things with them.

A lot of the intelligent things you can do with this +some intelligent things with them.

A lot of the intelligent things you can do with this information is still being built. But imagine for example that you are using the project manager package I've written. You work at an ice cream company, and every task that is done also has an @@ -196,24 +234,29 @@ automatically taken care of. You'll be able to select a flavor when you edit a task, and it will be shown on the task view page. This is the direction OpenACS development is going, and it will be -really really cool!

First, I'm going to describe how to extend other content +really really cool!

First, I'm going to describe how to extend other content repository tables using the CR API. Then, I'll describe how to set -up your own tables as well:

As you recall from earlier in this page, attributes are just +up your own tables as well:

As you recall from earlier in this page, attributes are just another term for columns in a table. The Content Repository has a mechanism for adding and removing columns via the pl/sql API. If you check your /api-doc: /api-doc/plsql-subprogram-one?type=FUNCTION&name=content%5ftype%5f%5fcreate%5fattribute , you'll see that there is a way to extend the columns -programmatically.

Why would you want to do this? For project manager, I decided to +programmatically.

Why would you want to do this? For project manager, I decided to do this because I wanted to customize my local version of the projects table, to account for company-specific information. That way, I can have a separate edit page for those types, but not have a separate table to join against.

 alter table pm_projects add column 
         target_date  date;

 select content_type__create_attribute(
@@ -227,6 +270,7 @@
         'date'
 );

the possibility of storing all the revisioned columns in a generic -table

How versioning works

pm_task__new_task_item

cr_revisions

cr_items

pm_tasks

pm_tasks_revisions

cr_revisions

cr_items

pm_tasks

pm_tasks_revisions

add

pm_task__new_task_revision

add

pm_task__new_task_revision

not

Defining your pl/sql functions

-CVS browser's entry for project-manager

 select define_function_args('pm_task__new_task_item', 'task_id, project_id, title, description, end_date, percent_complete, estimated_hours_work, estimated_hours_work_min, estimated_hours_work_max, creation_date, creation_user, creation_ip, package_id');
@@ -447,7 +504,9 @@
         return 0;
 end;
 $$ language plpgsql;
-

Explanation of the columns in cr_items and cr_revisions

item_id - unique id for this item, will be @@ -461,6 +520,7 @@ revision_id that is the latest version
publish_status - not sure
content_type - not sure
storage_type - not sure, probably text or binary?
storage_area_key - not sure
tree_sortkey - a utility column used in hierarchical queries.

revision_id - a unique id for this revision.
item_id - a reference to the item_id for this revision
title - you can use this for your application. For example, @@ -469,7 +529,9 @@ is for your use, or internal
mime_type - the mime type.
nls_language - I believe this is for internationalization
lob - the binary content.
content - the text content.
content_length - the length of the text or binary content?
-

Structuring your data into a hierarchy

Using this structure is optional, but useful in many -circumstances.

The facility for this is built into the cr_items +circumstances.

The facility for this is built into the cr_items data model. This makes sense, because you wouldn't want your hierarchy associated with each revision. Here's how Postgres -describes the cr_items table:

+describes the cr_items table:
+                          Table "public.cr_items"
       Column      |          Type          |          Modifiers          
 ------------------+------------------------+-----------------------------
@@ -499,10 +563,15 @@
  storage_area_key | character varying(100) | not null default 'CR_FILES'
  tree_sortkey     | bit varying            | 
 
-The parent_id refers to either a content item
-(cr_items), or a subclass of a content_item (such as
-cr_folders). I'll explain more later about
-cr_folders.
+
+The parent_id
+ refers to either a content item
+(cr_items
+), or a subclass of a content_item (such as
+cr_folders
+). I'll explain more later about
+cr_folders
+.
 One thing that you might want to do for your application is to
 give the application its own root directory. Because the content
 repository is shared among applications, this separates it off from
@@ -516,7 +585,8 @@
 your own root repository is that you application may be mounted
 several times. If you want to separate the directory structure
 between instances of your application, you need to create your own
-root directory:
+root directory:
+ -- Creates and returns a unique name for new project folders
 
 select define_function_args('pm_project__new_unique_name', 'package_id');
@@ -574,6 +644,7 @@
 end;
 $$ language plpgsql;
 
+
 Note that this example is for projects rather than tasks. This is
 because for the application I'm writing, projects are what tasks
 are stored inside of. A project has many component tasks. If you
@@ -582,32 +653,39 @@
 Typically, this definition would go in your
 sql/postgresql/project-manager-create.sql file. If
 this file is broken in several parts, this would go in the
-project-manager-create-functions.sql portion.
Once you've created your root directory, you will set the
+project-manager-create-functions.sql portion.
+Once you've created your root directory, you will set the
 parent_id of your items to the id for the new root
 repository (in our case, it's returned from the
-pm_project__new_root_folder function)
In the project-manager application, we'll create a root
+pm_project__new_root_folder function)
+In the project-manager application, we'll create a root
 repository, and make all projects under that root repository. That
 means they'll all have a parent_id set to the root
 repository. However, we also want to make projects that are
 sub-projects of other projects. In that case, we will set the
 parent_id of the sub-project to the
-item_id of the parent.
Understanding folders
+item_id of the parent.
+Understanding folders
+
 For a little while now, we have been talking about folders, but we
 haven't delved into what CR folders are. Folders are sub-classes of
-cr_items, and the only real difference is that they
+cr_items
+, and the only real difference is that they
 contain no data, except for a label and description.
 If you create folders for your application, then you'll need to
 make sure you manage them along with your other objects. For
 example, if you were to add a folder for each of your objects, then
 you would probably want to make sure you delete the folder when you
-delete the object.
However, in many cases you are not creating more than one
+delete the object.
+However, in many cases you are not creating more than one
 folder. In fact, the only folder you might have will be the root
 folder you create for each instance of your application (if you
 install the project-manager in two parts of your web server, for
 example, it should have two different root folders). When your
 application is running, it can determine the root folder by
 searching the cr_folders table. Here's the definition of that
-table:
+table:
+                  Table "public.cr_folders"
        Column       |          Type           |  Modifiers  
 --------------------+-------------------------+-------------
@@ -618,22 +696,30 @@
  has_child_symlinks | boolean                 | default 'f'
  package_id         | integer                 | 
 
-Note that there is a package_id column. The nice thing
+
+Note that there is a package_id
+ column. The nice thing
 about this column is that you can use it to find the root
 repository, if you only have one folder per instance of your
 application. You can get your package_id using this call within
 your .tcl file:
  set package_id [ad_conn package_id]
 
+
 Then you can find the root repository by using a query like this:
  select folder_id from cr_folders where package_id = :package_id;
-
Create scripts
Drop scripts
+
+Create scripts
+Drop scripts
+
 If you have problems with your drop script in OpenACS 4.6.2, then
 Tammy's
-drop scripts might be of interest to you.
+drop scripts
+ might be of interest to you.
 Using your data model
+
 You now have a shiny new data model that handles revisions and all
 sorts of other things we haven't gotten to yet. Now, in your Tcl
 pages and your ps/sql code, you can...
@@ -645,71 +731,92 @@
 Get latest revision (pl/sql)live_revision_id :=
 content_item__get_live_revision(:item_id);
 
-The item_id identifies the content item with which the revision
-is associated.
Likewise, the most recent revision of a content item can be
-obtained with the content_item__get_latest_revision function
Reference:

+
+The item_id identifies the content item with which the revision
+is associated.
+Likewise, the most recent revision of a content item can be
+obtained with the content_item__get_latest_revision function
+Reference:
+
 OpenACS Content
 Repository docs

 Dave's page on Using the Content Repository
-
Reference: Definitions

+
+Reference: Definitions
+
 Content Type
A set of attributes that may be associated with a text or
 binary content object. For example, a press_release content type
 may include a title, byline, and publication date. These attributes
 are stored in the cr_revisions table, and a table that
 you set up to store specialized data. In this case, the title (I
 think), byline, and publication date would be stored in a
 specialized table.
-

+
+
 Content Item
Items are the fundamental building blocks of the content
 repository. Each item represents a distinct text or binary content
 object that is publishable to the web, such as an article, report,
 message or photograph. An item my also include any number of
 attributes with more structured data, such as title, source, byline
 and publication date.
-

+
+
 Content Revision
A revision consists of the complete state of the item as it
 existed at a certain point in time. This includes the main text or
 binary object associated with the item, as well as all
 attributes.
-

+
+
 Content Folder
A folder is analogous to a folder or directory in a file
 system. It represents a level in the content item hierarchy. In the
 previous example, press-releases is a folder under the repository
 root, and products is folder within that.
-

+
+
 Symbolic Link
Analogous to a symlink, alias or shortcut in a file system.
 Allows an item to be accessed from multiple folders.
-

+
+
 Templates
Templates are merged with content items to render output in
 HTML or other formats. Templates are assumed to be text files
 containing static markup with embedded tags or code to incorporate
 dynamic content in appropriate places.
-
Content templates
+
+Content templates
+
 The only place content templates are used in OpenACS are in the 5.0
 version of file storage. See CR and
-content_template defined wrongTroubleshooting
+content_template defined wrong
+Troubleshooting
+
 One problem I ran into while trying to get my SQL create and drop
 scripts working was that sometimes I wasn't able to delete a
 content type because I would get errors like these:
  Referential Integrity: attempting to delete live_revision: 658
 
+
 The problem seems to be that there were still items in the
-cr_items table. You can remove them using select
-content_item__delete(648); in psql. You get the codes by
+cr_items
+ table. You can remove them using select
+content_item__delete(648);
+ in psql. You get the codes by
 doing a query like this:
  select i.item_id, r.revision_id, r.title, i.content_type from cr_items i, cr_revisions r where i.item_id = r.item_id order by i.item_id, r.revision_id;
 
+
 Really, however, what you need to do is make sure your __delete and
 drop scripts first go through and delete all children of those
 items. I'm not sure if you need to delete the items themselves -- I
 believe they may be dropped by themselves when the tables are
-dropped, because of the cascade portion of the SQL
+dropped, because of the cascade
+ portion of the SQL
 data model.
 When I was troubleshooting folders, I found this query
-useful:
+useful:
+ select f.folder_id,f.label,f.description,i.content_type from cr_folders f, cr_items i where f.folder_id = i.item_id;
-
Once again, thanks to daveb for help in tracking this down (he
+
+Once again, thanks to daveb for help in tracking this down (he
 rocks!).
-