Chat Design Document

by David Dao

I. Essentials

II. Introduction

We have our own chat server because:

III. Historical Considerations

Chat applications allow inexpensive and efficient social interactions between community members spread across the globe.  A community can use the Chat application  to allow its members to have an interactive session with a special guest or a community member through the means of a moderated chat. A customer support site can use the Chat application to offer instant responses to its customers regarding its products.

IV. Competitive Analysis

AOL Instant Messenger (AIM)

So why can't companies that want to do 1:1 conversations just use AIM?  AOL Instant Messenger works best with additional software installed on the user's machine. This software is free, ships with Netscape 4.x, and is certainly available to AOL service customers. But it is not universal, and you can't be guaranteed that someone connecting to your Web site has the AIM client.  Furthermore, AIM doesn't have a logging option, so there is no way for a site to offer a chat transcript to its members.

Yahoo Chat

Yahoo offers its members a wide range of services including chat. Their chat supports both HTML and Java applet clients. Unfortunately, their chat software is proprietary, so companies cannot incorporate their community model with Yahoo's chat software.

DigiChat

DigiChat is a standalone Java chat application. It offers a well thought out chat client interface and also supports moderated chats. However, like most third party applications, there are no easy means by which DigiChat can be integrated with a site's preexisting user data model.  DigiChat also comes with a heavy price tag, and as such it might not be suitable for small communities that need to support chat sessions.

The ArsDigita Java Chat application does not have a pretty client interface, nor does it support text formatting like Yahoo and DigiChat.  However, our Chat application is open source.  As such, any competent Java developer will be able to improve the interface based upon their or their employer's preferences. Using the ArsDigita Chat application gives site developers access to a rich community data model which is compatible with a variety of existing open-source applications.

V. Design Tradeoffs

Archive or not?

We have to drive the entire system design from a publishing decision: are we interested in seeing archives of chat sessions? If we are, then archiving into a single table makes a lot of sense.   We can perform a single SQL query to see everything that a user has contributed. We can perform a single SQL query to see how a customer service person is doing. And so on.

A disadvantage of archiving is that it chews up disk space.  Imagine for a moment that your service is the size of America Online, wherein one million subscribers chat or use AIM every day.  Imagine also that each person types 50 rows of chat/AIM content each of which requires 100 bytes of storage.   Such a scenario would require our table to grow by 50 million rows and 5 GB each day. After 20 days, we would begin to bump up against the billion-row table size that data warehouse experts suggest as a practical limit.
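The estimate above can be checked with a few lines of arithmetic. The snippet below is only a worked example in Python using the figures quoted in the paragraph; the variable names are illustrative.

```python
# Back-of-the-envelope check of the archive growth estimate.
SUBSCRIBERS = 1_000_000      # people chatting per day
ROWS_PER_PERSON = 50         # rows of chat content typed per person per day
BYTES_PER_ROW = 100          # storage required per row
ROW_LIMIT = 1_000_000_000    # practical table-size limit cited above

rows_per_day = SUBSCRIBERS * ROWS_PER_PERSON
bytes_per_day = rows_per_day * BYTES_PER_ROW
days_to_limit = ROW_LIMIT // rows_per_day

print(rows_per_day)          # 50000000
print(bytes_per_day / 1e9)   # 5.0 (GB, decimal)
print(days_to_limit)         # 20
```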

So it seems that on popular public sites we won't be able to store everything that users type. At the same time, a customer's interaction with a customer service person, or a special guest chat, should be archived.  Also, many sites see considerably less traffic than AOL does, and these sites may benefit from being able to log all chat/AIM communications.

The solution we came up with was to provide an option that allows the creator of a room to decide whether or not to archive the messages.

HTML vs. Applet

An advantage to using an HTML client as opposed to an applet-based client is the ability to have rich graphical representations as a part of the chat, such as color formatting, in-line images rendered by the browser in use, et cetera.  Furthermore, users who are still using older non-Java enabled browsers will be able to participate in the chat.  A limitation of the HTML client is that users cannot see messages in real time, as HTML provides no auto-update functionality.

By using a Java applet-based client, users can see messages updated in real time. This is important in, e.g., a customer service setting.   Since Sun JDK 1.1 doesn't provide a text rendering API, applet-based clients are limited in the format of the text that they can display. There are also limitations on the server as to how many TCP sockets can be open at the same time.  Since each user of an applet-based chat client occupies one TCP socket, there could be a limit on how many users can use the applet. This problem could be solved by employing customized hardware more suitable for the task.

Our system provides both an HTML and a Java applet-based chat client.  By providing both, we open up our Chat application to a broader pool of users, each of whom can pick the chat method that works best for them.

Java Servlet vs. Pure AOL server

Using a Java servlet, however, requires a somewhat nonstandard (for ACS, that is) installation method for the chat package, since the application requires Jakarta-tomcat and nstomcat to be preinstalled and configured properly. These extra steps are difficult and differ on each platform.

Using the AOL server socket API could improve performance and would not require installation of Jakarta-tomcat.  Unfortunately, AOL does not provide much by way of documentation regarding AOL socket. At present, major features that would be required to completely replace the Java/servlet method are still missing from the AOL socket API.

Initially, I chose a Java servlet as my development platform due to time constraints. As development progressed, I analyzed the use of the servlet in the current chat application. Since the servlet's sole purpose is to provide communication between the HTML and the Java applet clients, I concluded that AOL Server alone would be adequate for the present purposes, and that the extra effort required to set up Jakarta-tomcat with AOL server would be unjustified. After some experimentation, I was able to rebuild the bridge between the HTML and the Java applet clients using only AOL Server and the Java chat server. As a result, the Chat application may now be downloaded and installed just like any other ACS 4 application.

Chat message protocol: Text-based vs. Java serialized objects

In the earlier chat version, chat messages were broadcast to the applet client as Java serialized objects. The advantage of serialized objects is that they make it easy to retrieve information. The disadvantage of this method is that the messages can be read by Java clients only.

To support clients written in different programming languages and not limit our application to Java, I chose a text-based XML format for the chat message protocol.

Chat messages are stored in cache

Since we do not allow a user to view messages posted prior to his entrance into the chat room, there is no need for time-consuming Oracle queries to retrieve messages.  The server cache, however, cannot hold every message posted since the server started, so I decided to limit the number of messages cached per room.
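A bounded per-room cache of this kind can be sketched as follows. This is a minimal Python illustration, not the package's actual cache code; the class name RoomCache and the limit of 50 are assumptions made for the example.

```python
from collections import defaultdict, deque

MAX_CACHED_PER_ROOM = 50  # hypothetical limit; the real value is a deployment choice


class RoomCache:
    """Keeps only the most recent messages for each room."""

    def __init__(self, limit=MAX_CACHED_PER_ROOM):
        # A deque with maxlen silently drops the oldest entry when full.
        self._rooms = defaultdict(lambda: deque(maxlen=limit))

    def add(self, room_id, message):
        self._rooms[room_id].append(message)

    def recent(self, room_id):
        # Unknown rooms yield an empty list without creating a cache entry.
        return list(self._rooms.get(room_id, ()))
```

With a limit of 3, adding five messages to a room leaves only the last three in the cache, so memory use per room stays constant no matter how long the server runs.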

VI. API

Chat message XML definition


<login>
   <user_id></user_id>
   <user_name></user_name>
   <pw></pw>
   <room_id></room_id>
</login>

After connecting to the chat server, each client must identify itself via a login message. The chat server will disconnect the client if the first message is not a proper login message or if the user doesn't have proper permissions for the chat room.
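As an illustration of the login handshake, the sketch below builds a login message and checks whether a first message looks like a proper login. It is written in Python for brevity; the function names are hypothetical, and treating "proper" as "all four fields present and non-empty" is an assumption. The permission check against the chat room is left out.

```python
import xml.etree.ElementTree as ET

LOGIN_FIELDS = ("user_id", "user_name", "pw", "room_id")


def build_login(user_id, user_name, pw, room_id):
    """Serialize a <login> message with the four fields defined above."""
    login = ET.Element("login")
    for tag, value in zip(LOGIN_FIELDS, (user_id, user_name, pw, room_id)):
        ET.SubElement(login, tag).text = str(value)
    return ET.tostring(login, encoding="unicode")


def is_proper_login(xml_text):
    """Assumed rule: well-formed XML, root <login>, every field non-empty."""
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError:
        return False
    if root.tag != "login":
        return False
    return all((root.findtext(tag) or "").strip() for tag in LOGIN_FIELDS)
```

A server following this sketch would call is_proper_login on the first message received and close the connection when it returns False.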

<message>
   <message_id></message_id>
   <from_user_id></from_user_id>
   <from></from>
   <to_user_id></to_user_id>
   <to></to>
   <room_id></room_id>
   <status>approved | pending</status>
   <body></body>
</message>

To construct a public message that will be broadcast to everyone in the chat room, the to_user_id and to fields must be omitted from the message. When these two fields are present, the message is sent only to the specified user. Sending private messages to HTML users is not yet supported.
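The public/private distinction can be sketched like this. The Python helper names build_message and is_private are hypothetical; only the element names come from the definition above.

```python
import xml.etree.ElementTree as ET


def build_message(message_id, from_user_id, from_name, room_id, body,
                  status="approved", to_user_id=None, to_name=None):
    """Build a <message>; the to_* fields are emitted only for private messages."""
    msg = ET.Element("message")

    def add(tag, value):
        ET.SubElement(msg, tag).text = str(value)

    add("message_id", message_id)
    add("from_user_id", from_user_id)
    add("from", from_name)
    if to_user_id is not None and to_name is not None:
        add("to_user_id", to_user_id)
        add("to", to_name)
    add("room_id", room_id)
    add("status", status)
    add("body", body)
    return ET.tostring(msg, encoding="unicode")


def is_private(xml_text):
    """A message is private when both recipient fields are present."""
    root = ET.fromstring(xml_text)
    return root.find("to_user_id") is not None and root.find("to") is not None
```

Building the message through an XML library rather than string concatenation also escapes characters such as < and & in the body.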

<system>
   <user_enter>
      <user_id></user_id>
      <user_name></user_name>
   </user_enter>
</system>

<system>
   <user_leave>
      <user_id></user_id>
      <user_name></user_name>
   </user_leave>
</system>

Each time a client enters or exits the room, an appropriate message will be broadcast to all clients in the chat room notifying them of a change in the chat user list.

<system>
   <userlist>
      <user>
         <user_id></user_id>
         <user_name></user_name>
      </user>
      <user>
         <user_id></user_id>
         <user_name></user_name>
      </user>
      ...
      ...
   </userlist>
</system>

After the Java applet client has successfully logged into the chat room, a list of users currently in the room will be sent out from the server.
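A client in any language can keep its user list current by applying the three system messages as they arrive. The following Python sketch shows one way to do so; the function name and the use of a plain dictionary keyed by user_id are assumptions, not part of the protocol.

```python
import xml.etree.ElementTree as ET


def apply_system_message(users, xml_text):
    """Update the {user_id: user_name} dict in place from one <system> message."""
    root = ET.fromstring(xml_text)
    if root.tag != "system":
        return users  # not a system message; leave the list untouched
    for event in root:
        if event.tag == "userlist":
            # Full snapshot sent after login: replace whatever we had.
            users.clear()
            for user in event.findall("user"):
                users[user.findtext("user_id")] = user.findtext("user_name")
        elif event.tag == "user_enter":
            users[event.findtext("user_id")] = event.findtext("user_name")
        elif event.tag == "user_leave":
            users.pop(event.findtext("user_id"), None)
    return users
```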

API

All chat functionality accessible from the browser is also available as an API. Providing these APIs allows different applications to modify chat without an application-specific user interface.   These APIs will throw errors if their corresponding PL/SQL statements fail, so all applications employing them need to catch the error in order to display a nice error message to the user.

chat_message_post - This API inserts the chat message into the database if the chat room is in archive mode. It also broadcasts the message to all Java clients in the room.
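The behavior described for chat_message_post can be sketched as follows. This is a Python stand-in, not the real API: the Room class, the archived_rows list (standing in for the chat message table), and the callables standing in for applet connections are all hypothetical.

```python
archived_rows = []  # stand-in for the chat message table in the database


class Room:
    def __init__(self, room_id, archive):
        self.room_id = room_id
        self.archive = archive      # the room creator's archive option
        self.java_clients = []      # callables standing in for applet connections


def post_message(room, user_id, body):
    """Archive the message only if the room asks for it, then broadcast it."""
    message = {"room_id": room.room_id, "from_user_id": user_id, "body": body}
    if room.archive:
        archived_rows.append(message)
    for send in room.java_clients:
        send(message)
    return message
```

A caller of the real API would additionally wrap the call in error handling, since the database insert can fail.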

chat_message_retrieve - This API is used only by the HTML client, and is called each time the HTML client refreshes. The API does not require a database call; messages are retrieved from the AOL Server cache.

VII. Data Model Discussion

Should chat messages constitute a subtype of acs-object?

We are not implementing individual permissioning on each chat message. As a result, we can avoid the unnecessary complexity of making the chat message type a subtype of acs-object. Chat messages will have their own table.

VIII. User Interface

Types of chat we need to support

For either kind of chat room, we should support moderated chat. That is, in such a scenario a posting doesn't go live until it has been approved by someone who has the 'chat_room_moderate' privilege in the room. 
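The moderation rule can be sketched with the approved/pending status values from the message definition. In this Python illustration the function names and the representation of privileges as a set of strings are assumptions; only the 'chat_room_moderate' privilege name and the two status values come from this document.

```python
def submit(message, room_is_moderated):
    """New postings start as pending in a moderated room, approved otherwise."""
    message["status"] = "pending" if room_is_moderated else "approved"
    return message


def approve(message, privileges):
    """Only a holder of 'chat_room_moderate' may make a posting go live."""
    if "chat_room_moderate" not in privileges:
        raise PermissionError("chat_room_moderate privilege required")
    message["status"] = "approved"
    return message


def live_messages(messages):
    """Postings visible to ordinary participants."""
    return [m for m in messages if m["status"] == "approved"]
```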

We want to support one-to-one messages for customer support, so we need one layer on top of the Chat application to make sure that users can find an appropriate chat partner. For example, if Bill User says that he needs support for his Acme widget, the system has to find the least busy authorized Acme widget support person and start a one-to-one chat session between Bill and that person.
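The matching step in the example above might look like the following Python sketch. The agent records, their field names, and the use of the active chat count as the measure of "least busy" are all assumptions made for illustration.

```python
def pick_support_person(agents, product):
    """Return the authorized agent with the fewest active chats, or None."""
    candidates = [a for a in agents if product in a["products"]]
    if not candidates:
        return None
    return min(candidates, key=lambda a: a["active_chats"])


agents = [
    {"name": "alice", "products": {"Acme widget"}, "active_chats": 3},
    {"name": "bob", "products": {"Acme widget", "Acme gadget"}, "active_chats": 1},
    {"name": "carol", "products": {"Acme gadget"}, "active_chats": 0},
]
```

Here a request for Acme widget support would be routed to bob: carol is idle but not authorized for that product.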

For public community sites where nothing is being sold or supported, a publisher might wish to limit the load on the server from all of this one-to-one chatting.

Options for the publisher

Some options are configurable per-room, e.g.,

Options for room administrator

Why the HTML version can't autorefresh

The HTML page cannot have a Refresh header to force a redraw of the page; if it did, the user would be at risk of losing what he or she was typing into the post form.

IX. Configuration/Parameters

X. Future Improvements/Areas of Likely Change

A much more stable Applet client interface would be in order. It would also be nice if we allowed more concurrent applet connections to the server.

XI. Authors

XII. Revision History

Document Revision #   Action Taken, Notes                      When?        By Whom?
0.1                   Revision from chat 3.4 design document   11/17/2000   David Dao
0.2                   Editing and augmentation                 11/18/2000   David Dao and Josh Finkler
0.3                   Editing                                  12/04/2000   David Dao
0.4                   Editing                                  12/05/2000   Josh Finkler
0.5                   Revision for beta version                01/11/2001   David Dao


ddao@arsdigita.com