2.2. Webmail Design Document

Table of Contents
2.2.1. Essentials
2.2.2. Introduction
2.2.3. Historical Considerations
2.2.4. Competitive Analysis
2.2.5. Design Tradeoffs
2.2.6. Data Model Discussion
2.2.7. Legal Transactions
2.2.8. API
2.2.8.1. PL/SQL Procedures
2.2.8.2. Tcl Procedures:
2.2.9. User Interface
2.2.10. Configuration/Parameters
2.2.11. Future Improvements/Areas of Likely Change
2.2.12. Authors

by Erik Bielefeldt (adapted from the original by Jin Choi)

2.2.1. Essentials

2.2.2. Introduction

Email handlers are among the first user-level programs written for any new operating system, and are one of the few core tools that almost anyone using a computer will use on a regular basis. Most recently, we have seen a blossoming of Web-based email systems such as Hotmail and Yahoo Mail. Why build yet another mail system?

Some of the desirable traits of a mail system are:

The webmail application addresses the first three traits (the last is a work in progress). These requirements argue for the primary message store to remain on a well-administered server. These are the same needs addressed by the designers of IMAP. IMAP solves all these issues except for one: availability; an IMAP client isn't always installed on every computer with a net connection, whereas a Web browser almost always is. But a Web browser is a less-than-ideal interface for reading email when compared to an all-singing, all-dancing mail client. Thus, the ideal mail solution is an IMAP server with a web interface that accesses the same message store as the IMAP client.

2.2.3. Historical Considerations

Mail systems with this architecture already exist. Oracle provides software that does what the webmail application does, and probably more. CriticalPath is a company that provides outsourced email with IMAP and web front ends. These may be better than webmail. CriticalPath certainly has the advantage that it requires no effort on the part of the user other than sending them a check every once in a while. However, Jin Choi reports that when he used CriticalPath, it was unreachable or unuseably slow about half the time (usually due to network problems to their server). He also ran out of patience attempting to install Oracle Email Server.

2.2.4. Competitive Analysis

The downside to these other systems is lack of control. It is difficult to modify the look or extend the features of an email server without access to the source code. In the case of CriticalPath, you are stuck with what they provide, and cannot integrate it to provide web-based email as a seamless service of your web site. If you are using the ArsDigita Community System, webmail provides the core of a web-based email system that relies on proven, reliable systems to do all of the hard work and that is simple to extend. If you are not using the ACS, then perhaps studying an implementation of a working system will aid you in building something suitable for your own needs.

2.2.5. Design Tradeoffs

In reworking Jin Choi's original implementation, we sought to improve it in a couple ways. First, Webmail was lacking some basic functionality that many web-based email services provide, like a decent folder system, signatures, a paged index view, forwarding messages, and a customizable interface. The second consideration was to improve performance, mainly through reworking the data model and resource consuming queries. In particular, the delivery of messages and the main index view are areas which need to be as efficient as possible. The former restricts the volume of incoming mail that Webmail may handle, and the latter affects both server load and usability. The index view is not only the most used, but one of the most expensive pages in terms of working the database. Where I made changes to the original data-model, it will be noted why the change was made below it.

2.2.6. Data Model Discussion

The following section will step through the data model, discussing important or interesting aspects.

    
    -- Domains for which we receive email.
    create table wm_domains (
            -- short text key
            short_name              varchar(100) not null primary key,
            -- fully qualified domain name
            full_domain_name        varchar(100) not null
    );
        

The wm_domains table contains the domains for which we expect to receive mail. The short_name field stores the text key used to differentiate mail to different domains as discussed above. Qmail must be configured to handle email for these domains manually.

    
    -- Maps email accounts to ACS users.
    create table wm_email_user_map (
        user_id         references users,
        email_user_name     varchar(100) not null,
        delivery_address    varchar(200) not null,
        domain          references wm_domains,
        primary key (user_id)
    );
        

wm_email_user_map assigns email addresses to ACS users. Why not just use the email column in the users table? This approach permits flexibility on which address is published to other registered users and provides an external contact if needed. As a row is inserted into this table, the appropriate .qmail alias files are created for that user.

delivery_address contains the full qmail delivery address: (wm_domains.short_name || '-' || email_user_name || '@' || wm_domains.full_domain_name) for ease and speed of lookup in the delivery process (otherwise we will have to re-create it for each user on each delivery). I have removed the possibility of one user being mapped to two Webmail accounts because of its confusing nature. There was also some extra code involved and performance drawbacks too.

    
    -- Maps mailboxes (folders, in more common terminology) to ACS users.
    create sequence wm_mailbox_id_sequence;
    
    create table wm_mailboxes (
            mailbox_id      integer primary key,
            name            varchar(100) not null,
            creation_user   references users(user_id),
            creation_date   date,
            uid_validity    integer, -- Needed for IMAP
            unique(creation_user, name)
    );
        

A "mailbox" is what other systems would term "folders."

    
    create table wm_messages (
            msg_id          integer primary key,
        mailbox_id  integer references wm_mailboxes,
            body            clob,
        -- plain text portions of MIME message; empty if 
        -- entire message is of type text/*.
        mime_text   clob,
            message_id      varchar(500), -- RFC822 Message-ID field
        msg_size    integer,
        date_value  date,
        subject_value   varchar(150),
        to_value    varchar(150),
        from_value  varchar(150),
        seen_p      char(1) default 'f' check(seen_p in ('t','f')),
        answered_p  char(1) default 'f' check(answered_p in ('t','f')),
        flagged_p   char(1) default 'f' check(flagged_p in ('t','f')),
        deleted_p   char(1) default 'f' check(deleted_p in ('t','f')),
        draft_p     char(1) default 'f' check(draft_p in ('t','f')),
        recent_p    char(1) default 't' check(recent_p in ('t','f'))
    );
    
    create index wm_messages_by_message_id on wm_messages(message_id);
        

This is the primary message table. It stores the body of the message, a parsed plain-text version with markers for attachments if this is a multipart MIME message, the mailbox that this message is currently filed in, a denormalized Message-ID field for easy reference by Message ID, and yet another ID field for IMAP bookkeeping. The message_id field is not unique, since the same message may have been received multiple times.

We also store the 4 header columns which are needed for the mailbox index view in the wm_messages table. These were separated from the other headers which are stored in a separate table (see below) because previous Webmail installations experienced problems with slow index views. The net effect of this is immediately visible: while index views of mailboxes with 500 messages used to be perceptively slow, they are now almost instantaneous with over 2000 messages. This is because we no longer have to join the wm_headers table four times with the wm_messages table to get all the needed headers for the index view. This change should vastly improve the scalability of Webmail.

The mapping of messages to mailboxes was also changed from the previous Webmail implementation; instead of using a mapping table, we use the mailbox_id column in the wm_messages table. This eliminates an extra join in a good number of queries, although it gives up the possibility of mapping messages to multiple mailboxes (which was a feature that was not utilized in the previous Webmail anyhow). One possibility to regain this functionality would be to have a column in wm_messages which references a separate table which would contain "common" messages which multiple users could view as normal messages. Such a feature could save resources for intra-Webmail spam, and may be implemented in the future if deemed necessary.

    
    -- Stores attachments for MIME messages.
    create table wm_attachments (
            msg_id          not null references wm_messages,
            -- File name associated with attachment.
            filename        varchar(600) not null,
            -- MIME type of attachment.
            content_type    varchar(100),
            data            blob,
            primary key (msg_id, filename)
    );
        

This table stores MIME attachments and associated information.

    
    -- Headers for a message.
    create table wm_headers (
            msg_id          integer not null references wm_messages,
            -- field name as specified in the email
            name            varchar(100) not null,
            value           varchar(4000),
            -- original order of headers
            sort_order      integer not null
    );
    
    create index wm_headers_by_msg_id_name on wm_headers (msg_id, lower_name);
        

Headers are stored separately from the message to aid in searching. The original ordering of the headers is maintained, both so that we can recreate the header block and because order is significant for certain fields.

    
    -- Table for recording messages that we failed to parse for whatever reason.
    create table wm_parse_errors (
            filename                varchar(255) primary key not null, -- message queue file
            error_message           varchar(4000),
            first_parse_attempt     date default sysdate not null
    );
        

If an error occurs while attempting to parse a message, we store a record of the error in this log for the administrator to review. Only the first occurrence of an error is logged for any file, to prevent hundreds of identical error messages from clogging the log.

    
    -- Used for storing attachments for outgoing messages.
    -- Should be cleaned out periodically.
    
    create sequence wm_outgoing_msg_id_sequence;
    
    create table wm_outgoing_messages (
            outgoing_msg_id         integer not null primary key,
            body                    clob,
            composed_message        clob,
            creation_date           date default sysdate not null,
            creation_user           not null references users
    );
    
    create table wm_outgoing_headers (
            outgoing_msg_id         integer not null references wm_outgoing_messages on delete cascade,
            name                    varchar(100) not null,
            value                   varchar(4000),
            sort_order              integer not null
    );
    
    create unique index wm_outgoing_headers_idx on wm_outgoing_headers (outgoing_msg_id, name);
    
    create sequence wm_outgoing_parts_sequence;
    create table wm_outgoing_message_parts (
            outgoing_msg_id         integer not null references wm_outgoing_messages on delete cascade,
            data                    blob,
            filename                varchar(600) not null,
            content_type            varchar(100), -- mime type of data
            sort_order              integer not null,
            primary key (outgoing_msg_id, sort_order)
    );
    
    -- Create a job to clean up orphaned outgoing messages every day.
    create or replace procedure wm_cleanup_outgoing_msgs as
    begin
      delete from wm_outgoing_messages
        where creation_date < sysdate - 1;
    end;
    /
    
    declare
      job number;
    begin
      dbms_job.submit(job, 'wm_cleanup_outgoing_msgs;',
                      interval => 'sysdate + 1');
    end;
    /
        

When composing messages for sending, the unsent message and any attachments are stored in the database. When the message is sent, a MIME message is composed consisting of the text of the message followed by any attachments (there is currently no facility to intersperse attachments with text). Instead of deleting this as soon as it is sent, we delete old messages daily, allowing users the chance to hit back on their browsers if they wish to resend the previously composed messages.

Unsent outgoing attachments could as well be stored in the filesystem, but it is easier to manage them when they are all contained within the database.

    
    -- PL/SQL bindings for Java procedures
    create or replace procedure wm_process_queue (queuedir IN VARCHAR)
    as language java
    name 'com.arsdigita.mail.MessageParser.processQueue(java.lang.String)';
    /
    
    create or replace procedure wm_compose_message (outgoing_msg_id IN NUMBER)
    as language java
    name 'com.arsdigita.mail.MessageComposer.composeMimeMessage(int)';
    /
        

These PL/SQL bindings for Java procedures are the heart of the system. wm_process_queue attempts to parse every file in the given directory as an email message, deliver it to a webmail user, and delete the file. It is scheduled with AolServer to run every minute. Various bugs in Oracle's dbms_job package have proven that this is a more reliable scheduling system.

    
    -- Trigger to delete subsidiary rows when a message is deleted.
    create or replace trigger wm_messages_delete_trigger
    before delete on wm_messages
    for each row
    begin
      delete from wm_headers where msg_id = :old.msg_id;
      delete from wm_attachments where msg_id = :old.msg_id;
    end;
    /
        

This trigger makes deleting messages easy; deleting from wm_messages will also delete the appropriate rows from any subsidiary tables.

    
    -- interMedia index on body of message
    create index wm_ctx_index on wm_messages (body)
    indextype is ctxsys.context parameters ('memory 250M');
    
    -- INSO filtered interMedia index for attachments.
    create index wm_att_ctx_index on wm_attachments (data)
    indextype is ctxsys.context parameters ('memory 250M filter ctxsys.inso_filter format column format');
    
    -- Trigger to update format column for INSO index.
    create or replace trigger wm_att_format_tr before insert on wm_attachments
    for each row
    declare
      content_type  varchar(100);
    begin
      content_type := lower(:new.content_type);
      if content_type like 'text/%' or content_type like 'application/msword%' then
        :new.format := 'text';
      else
        :new.format := 'binary';
      end if;
    end;
    /
    
    -- Resync the interMedia index every hour.
    declare
      job number;
    begin
      dbms_job.submit(job, 'ctx_ddl.sync_index(''wm_ctx_index'');',
                      interval => 'sysdate + 1/24');
      dbms_job.submit(job, 'ctx_ddl.sync_index(''wm_att_ctx_index'');',
                      interval => 'sysdate + 1/24');
    end;
    /
        

These indices and triggers enable full-text searches over messages. An INSO filtered index is also created to allow full-text searches over any attachments which contain text, including formatted documents.

2.2.7. Legal Transactions

/webmail/admin/

The following legal transactions can occur from the events administration pages located under /admin/webmail/:

domains
  • Domains may be created and deleted.

  • The account size limit may be set for the domain.

accounts
  • Email accounts may be created or deleted.

/webmail/

The following legal transactions can occur from the events administration pages located under /webmail/:

messages
  • Messages may be viewed, re-filed, or deleted.

composing messages
  • New messages may be composed.

  • Attachments may be added.

folders
  • Folders may be created.

  • Folders may be emptied of messages (their contents deleted).

  • User created folders may be renamed and deleted(including contents).

filters
  • Views may be created, edited, and deleted.

  • Action filters may be created, edited, and deleted.

preferences
  • User preferences may be edited.

2.2.8. API

2.2.8.1. PL/SQL Procedures

wm_process_queue (queuedir IN VARCHAR)

Processes the mail queue directory and inserts messages into the database (scheduled to run every minute by default)

wm_compose_message (outgoing_msg_id IN NUMBER)

Given an outgoing_msg_id, updates the wm_outgoing_messages table and sets the composed_message column to a complete message (including mail headers) which is ready to send.

2.2.8.2. Tcl Procedures:

ad_proc ad_acs_webmail_id_mem {}

This is for getting the package id so we can use ad_parameter in this file.

ad_proc wm_add_user { user_id username short_name }

Creates a new webmail account for the given user.

ad_proc wm_header_display { msg_id header_display_style user_id }

Creates a string of "header: value" pairs

ad_proc wm_quote_message { author msg_text }

quotes message with ">" on each line

ad_proc wm_msg_permission { msg_id user_id }

Does user_id have permission to access msg_id? Returns 0 or 1

ad_proc wm_mailbox_permission { mailbox_id user_id }

Does user_id have permission to access mailbox_id? Returns 0 or 1

ad_proc wm_get_mime_part { }

Processes requests for message attachments

ad_proc wm_return_error { errmsg }

Just redirects to the webmail-error page

ad_proc wm_move_to_next { msg_id }

Redirects to the next message in the "current_messages" client property

ad_proc wm_get_preference { user_id preference }

Gets specified preference for specified user from the wm_preferences table

ad_proc select_default_mailbox { user_id }

sets the default mailbox (INBOX) using ad_set_client_property

ad_proc wm_format_for_seen_or_deleted { seen_p deleted_p str }

Format an element differently for read or deleted messages.

ad_proc accumulate_msg_id { msg_id seen_p deleted_p }

collects message data for navigation in message.tcl

ad_proc wm_likefy { string }

escapes % and \ with \ for an Oracle "like" clause

ad_proc wm_build_view_sql { user_id view_id }

Builds the inner part of the complex index-view.tcl query for the specified view. If you pass view_id as -1, it will attempt to get the view from the client's browser properties (used for the "Custom View")

ad_proc create_read_string { comp_string }

Creates checkboxes for whether a message is read Helper to wm_create_filter_form

ad_proc create_constraint_string { comp_object comp_type comp_string i }

Creates constraint inputs Helper to wm_create_filter_form

ad_proc create_age_string { comp_type comp_string }

Creates an age input constraint Helper for wm_create_filter_form

ad_proc wm_create_filter_form { user_id edit_filter_id {format long} }

Creates strings for displaying the form for editing a filter view. Specify 0 for edit_filter_id to have an empty form pass format "flat" to have the mailbox option printed 5 to a row (default is 2)

ad_proc wm_build_index_view { mailbox_id view_id mailbox_name msg_per_page page_num orderby view_sql }

Creates a table of the messages in the current mailbox or view See index.tcl and index-view.tcl for use examples

ad_proc qmail {to from subject body {extraheaders {}}}

Creates a message and injects it into qmail. This proc was originally in qmail.tcl, but since qmail.tcl is no longer distributed with ACS 4.0, I added it here.<

ad_proc qmail_send_complete_message {from msg}

Injects a full formed message into qmail

2.2.9. User Interface

The user interface for webmail includes:

2.2.10. Configuration/Parameters

Please refer to the Installing WebMail ACS 4.0 doc for installation and configuration. It covers configuring qmail, loading the data-model, the java files, and testing and configuring the system.

2.2.11. Future Improvements/Areas of Likely Change

Future improvements will possibly include POP3 and/or IMAP access, voice-xml access to messages, and LDAP interface.

2.2.12. Authors

Erik Bielefeldt