Portal 3.0 System Requirements

by Ian Baker

Introduction

This document records the requirements for the ACS package, Portal v3.0. Portal is an API and interface for integrating information from multiple, disparate sources on a single page or a single set of pages. For examples of other Internet portals, observe My Yahoo! and Slashdot.

Portal 3 is a significant departure from previous versions, both in its internal design and external interface. Specifically, we'd like to move away from the monolithic model necessitated by previous versions of ACS. Portal will gain a significant advantage in usability and functionality if it's tightly integrated with the templating, subsite and permission systems provided by ACS 4.x.

Also, as you'll see later, we've chosen to drop the plural from the name. From here on, the package will be called "Portal," not "Portals." The reasons should make themselves clear as you become familiar with the new architecture.

Vision Statement

The Internet is big. There's a lot of information there, and nobody's interested in all of it. If you run a successful website, you're likely to have a similar problem, albeit on a significantly smaller scale.

The Internet is complex. All that information comes from a lot of different places. Sorting through it is hard and time-consuming. Finding it in the first place is difficult, especially if you don't already know it's there. If you have a wide variety of content, your users may have the same problem within your site.

People are different. If you manage to solve the above problems for a single user, even a whole group of users, others will find little utility in your solution. People have different interests and passions. The things that user A cares about mean nothing to user B.

The portal provides a solution for all of these problems with a standard interface for aggregating content from a number of different sources on a single page, or group of pages, and a way to customize that content.
Generally, a user will be presented with a page containing summary data from a number of different parts of your site and/or other places on the Net. This page is called a "portal." The user may customize her portal by adding and removing elements, and by altering the way the elements are displayed.

System/Application Overview

[note: terms ("element", "datasource", "theme", etc.) are not set in stone. I chose them because what we had seemed like it could use improvement. If you have a way to describe these things better, I'd love to hear about it. -ib]

Portal structure:

The basic unit of a portal is the element. An element is usually rendered as a box with a title bar containing the name of the element, and, optionally, some buttons for manipulating it. The contents of the box might be an image, a list of stock quotes, the current weather, some news items, some links, etc. Synonyms for the element include Slashdot's Slashbox and Yahoo's content module. In previous versions of ACS Portals, this has been called a table, and later, a portlet.

The components of an element are the datasource, the theme, and the element parameters.

A datasource is an item which specifies what the contents of the box will be. The information may be in a variety of formats, and will often be just a pointer to info maintained elsewhere. Depending on the content-type of the datasource, some preprocessing may or may not be applied to it. See requirement 10.10.20 for more information. The analogous object in previous versions of the Portals Module is the data feed. It was renamed because the rest of the world seems to prefer this vocabulary (see Mozilla for an example).

The theme is a template that produces an HTML fragment which is wrapped around the content specified by the datasource. This template is responsible for producing the decorations (title bar and its buttons, the box surrounding the element). A single theme may be associated with multiple elements.
The element parameters are a set of key/value pairs that are associated with the element. They're used to provide a standard way to maintain element-level configuration. For example, an element might have the following parameters:

  context_id - the portion of the site-map upon which the datasource should act.
  display_format - a parameter that tells the datasource how its HTML should be rendered.
  foo - something I haven't thought of yet.

Parameters are passed to the element in whatever fashion is appropriate for the data-type (see requirement 10.10.30).

Elements are arranged in rows and columns within a layout. A layout is a separate template that defines a set of numbered areas in which elements may be inserted. Layouts will generally, for the moment, be implemented using HTML tables, though in the future, CSS may be substituted. Layouts may be designed using either a provided layout tool (which may be somewhat limited in the first few releases of Portal), or written directly by an administrator.

The grouping of themes and datasources into elements and the mapping of those elements onto a layout produces a portal. Each instance of the package defines exactly one portal. One portal defines exactly one complete web page and one site-map entry.

Here's a graphical representation:

[Portal Structure Diagram]

Important considerations:

When a user edits a portal, he is editing a personal copy of the portal which is generated automatically for him. His version is not available to other users.

An administrator defines a portal's default behavior. She may choose not to grant users write permission on the portal, in which case they can only see the default.

Use-cases and User-scenarios

Not available at the moment. -ib

Competitive Analysis:

It goes without saying that other people have implemented portal systems in the past. I'd like to focus this analysis on four similar systems: My Yahoo, Slash, Netscape Netcenter, and Plumtree.

Slash (or Slashcode):
Slash is the engine that's used to produce Slashdot, a fairly popular news portal, as well as a number of lesser-known sites that essentially do the same thing for a different group of users. Slash's portal features leave much to be desired with regard to layout and flexibility, though its open-source nature makes it ultimately customizable.

Slash's most prominent feature is its (admittedly kludgey) support for RDF, which allows the site administrator to include syndicated content from other websites almost effortlessly. It goes so far as to ship with a large number of predefined remote data sources already included. Essentially, it comes with an impressive array of content out-of-the-box, courtesy of other website maintainers who have chosen to syndicate. Even if the site administrator doesn't want to use any of that content, it provides a good example of the proper way to set it up. Remember: even if they don't get used, defaults are good.

Assuming RDF support, all of the portal-related functionality of the average Slash site could be duplicated within ACS.

Plumtree:

Plumtree is a closed-source "business portal" solution that actually duplicates a great deal of ACS functionality. I couldn't find a place from which to test-drive an actual installation of the software, so most of my conclusions have been jumped to from the shaky platform of screen shots and marketing literature. Still, there might be something of value here.

Plumtree implements a number of features (exchange connectivity, bboard, etc.) that are, or would be, handled by other parts of ACS. The difference with Plumtree is that all of these features are implemented as portal elements ("Plumtree Portal Gadgets[tm]"). As their software is entirely based on portals, we can probably assume that the engine that produces them is pretty good.
From what I can see, they have a standardized method for installing/removing gadgets and a gadget repository on their site (something like APM integrated with portal formatting), and some well-defined method for distributing gadget maintenance across multiple servers, which is nice. There's a lot of hoo-ha about using HTTP for server <-> server communication (à la Cybercash), which, depending on how they did it, could have been a really horrible design decision (HTTP is only useful for fairly trivial communication, as it has no error checking). Much of that network-based communication could be accomplished with ACS Portal, though there currently exists no secure framework for setting up any but the simplest data sources. This is something to seriously consider for a future version.

The title bars of Plumtree portal elements have a lot of buttons in them. I can guess what some of them do, but others leave me baffled. They may have some interesting UI ideas, but not likely anything our data model can't accommodate. It seems likely that their presentation is less flexible - it might not be possible to deviate from the "box with a title-bar with some set of buttons" metaphor, while ACS Portal can (though it probably won't very often).

Plumtree portals are, most likely, very tightly integrated as a consequence of being portal-based from the beginning. This is currently something ACS lacks, and won't gain until this package is finished and has been released for a while, and package authors have had a chance to produce good data sources for their code.

ACS's advantages over Plumtree, generally, would seem to be our open-sourcedness and our presentation flexibility. ACS doesn't have to be a portal.

My Yahoo and My Netcenter:

Yahoo and Netcenter are good examples of portal engines in use. I mention them here more for comparison than anything else. Neither of them competes directly with ACS Portal, as the code that drives them isn't available (as far as I know).
Yahoo, all around, has a fairly good portal engine. You have some layout choices (though you get a fairly limited number of them). Also, an admirable innovation at Yahoo is the "wide" and "narrow" distinction for elements ("content items"). The creator of an element can specify how much horizontal space they expect the element to occupy, giving the user a hint about its placement. The provision for some sort of similar layout hint is worth considering for ACS Portal.

Netscape's layout is quite restricted (three columns, one page, period), but they have a number of other novel ideas:

  By clicking a button, you can "minimize" or "shade" an element, so that only its title bar is displayed. Seems handy.
  You can tell Netcenter to set a refresh for the page, so that it's reloaded after a configurable number of minutes. Good for keeping content current, and would be easy to implement.
  You can customize the colors that Netcenter will use on your page. ACS Portal provides similar functionality.
  A fairly verbose description can be attached to a datasource/element. This is an excellent idea, and we should support it.

Building a better element selection UI than is possessed by any of the above products (excepting Plumtree, whose UI I haven't seen) should be fairly trivial. All of them essentially give you a screenful of checkboxes, and you check the things you want. Slash provides a "preview" of sorts. A select-list- and JavaScript-based approach would be much more friendly, falling back onto a page-of-checkboxes sort of system where JavaScript isn't available.

Related Links

  Requirements and design documents for ACS 3.4.x Portals. I need to find more current URLs for these. -ib
  Design document (doesn't actually exist yet).
  Similar systems:
    My Yahoo
    Slash (and an example at Slashdot)
    My Netscape
    Plumtree

Requirements: Data Model

10.10.10 Each instance of the package is a single portal.
A portal defines exactly one HTML page, though the content of that page may vary from user to user. Portals may be grouped using an element distributed with the package to produce multi-page arrangements. See requirement 70.10.10.

10.10.11 A second package, a 'singleton,' will be defined to encapsulate Portal's low-level administrative functionality. Portal should be usable without this package installed, but with, perhaps, only a limited set of defaults. See requirement 40.10.15.

10.10.14 A stylesheet may be associated with any portal along with its layout. The CSS should define the look for the element HTML itself, by defining classes for common data source components (link, title, etc.). The names of these classes will be documented, and should be used by package authors when creating data sources.

10.10.15 Which datasources are available to a portal when it's created is decided by where that portal is mounted in the site-map. Generally, only datasources for packages mounted at the same level and beneath will be made available. A visual example:

  subsite #1
      bboard
      news
      portal #1
      page
          chat
          portal #2

In this example, portal #1 has access to the bboard, news, and chat datasources, while portal #2 has access to only the chat datasource. There are, of course, two exceptions:

  "General" datasources - datasources that are available to any package. A number of general datasources will be distributed with portal, see requirement 70.10.10.
  If an element is identified as "exportable" in its configuration, it may be included by anyone who has permission to see it, no matter where in the site-map it resides. General datasources may be implemented as a subset of exportable datasources.

10.10.20 Each datasource should have a "content-type" attribute associated with it, which specifies the type of the datasource's output. For example, if a chunk of Tcl code returns an HTML fragment, the datasource's type should be text/html.
This is necessary so that correct processing of non-HTML formats (text/plain, or application/rdf) might be accomplished.

10.10.30 A datasource may contain data that is produced in different ways. To get the content for a particular datasource, for example, a stored Tcl procedure could be evaluated, a URL fetched, or some ADP parsed. The datasource should have a data-type attribute associated with it to note what should be done with the data. Some types might be "raw" (requires no processing), "tcl", "tcl-proc", "plsql", "adp", "url".

10.10.35 Any number of parameters may be associated with an element. A parameter is a key/value pair that is supplied to the data source in whatever format is appropriate for the data-type (passed to a Tcl procedure as an argument, sent with the query string in a URL, etc.). Multiple values with the same key may exist. When passed to a Tcl procedure, all values with the same key will be aggregated into a Tcl list. Parameters will be passed to Tcl in a format compatible with the argument parser described in requirement 50.20.20.

10.10.37 To make it possible to write generic tools for configuring datasources, a datasource must specify, when created, what parameters it will accept and what the default values for those parameters should be.

10.10.40 When duplicating an element, it must be possible to make either the templates associated with it, its configuration, or both immutable. That is, an administrator should be able to set whether the portal user may change any of the behavior of that element in his own configuration. Therefore, the data model must accommodate the representation of configuration, template, and element information separately.

10.10.50 When creating themes, a template for each output content-type should be defined. That is, for example, there should be a different template for text/plain than text/html for any particular theme.
This allows for support of formats with differing display requirements, as well as those with richer structure than a simple text stream. For example, an administrator may create a "green title bar with a curved corner" theme. She'd create one template defining the theme for text/html, which would simply include the datasource content directly. Next, she'd create one for text/plain that would surround the content in <pre> tags. Finally, she'd create one for application/rdf that would surround each channel item with a black box, using an ATS <multiple> tag. Note that a number of predefined themes will be distributed with Portal (requirement 70.10.20).

10.10.60 Parameters preventing the export and/or copy of an element should be defined for administrators who don't want their elements used on outside portals. "No Copy" is important, as the datasource or parameters of an element may contain passwords or other sensitive information.

10.10.65 An instance-level parameter should be defined regulating whether elements from external instances may be imported.

10.10.70 It would be good for there to be a way to change the predominant theme for an entire portal, rather than changing it for every single element individually.

10.10.80 There should exist a categorized repository for global data sources that an administrator might like to use in his own portal. A number of datasources will be distributed with Portal, and more will be added as packages are installed. See requirements 10.10.15, 30.20.30, 70.10.10, and 70.10.40.

10.10.90 Portals that share a parent_id (mounted at the same place on the site-map) are considered to be part of a group. See requirement 70.10.10, "Current Portal" for an example of where this applies.

Requirements: User Interface

30.10 Use:

30.10.10 A user should be able to edit her own personal copy of any portal. This is accomplished by making a copy of the current portal. This new portal should reference all of the parent's elements, rather than copying them.
It should only copy the elements if the user edits them directly. If the user has her own version of a portal she should see her copy, though there should be a way for her to view the default if she so desires. To revert to the default configuration for the portal, she should be able to easily delete her copy.

30.10.20 Generally, element theme design should mimic the window-system UI with which users are likely to be familiar. An element may have a "title bar" area, which should contain links that perform generic actions on that element (remove, open in new window, etc.). This should be available at least as an example to administrators, though they shouldn't be limited to this sort of layout. See requirement 70.10.20.

30.10.30 When customizing a page, a user may use datasources from other portals to which he has access.

30.10.35 Where permitted, a user should be able to set the theme for any element on his portal.

30.10.40 Some hints with regard to how much horizontal space an element would like to occupy should be provided to the user when placing elements in a layout, à la Yahoo's "wide" and "narrow" elements. It's not necessary, however, to force the user to have distinct wide and narrow columns.

30.20 Administration:

30.20.10 An administrator may define some number of possible page layouts from which users may choose when customizing a portal. Sensible defaults will be provided. Areas in those layouts may be designated 'immutable', in which case users will not be able to add or remove elements in them.

30.20.20 The administrator of a portal should compile a list of which elements are available to the users of the portal when making customizations. Elements from outside that portal's scope in the site-map (see requirement 10.10.15) may not be included, though the administrator may enable or disable the importation of elements into a customized portal.
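The site-map scope that requirement 30.20.20 relies on (defined in 10.10.15) can be sketched in a few lines. This is only an illustration in Python, not the ACS data model: the `Node` class, its `provides_datasource` flag, and both function names are hypothetical, and the two exceptions from 10.10.15 ("general" and "exportable" datasources) are deliberately left out.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    """Hypothetical site-map node; not an actual ACS data structure."""
    name: str
    provides_datasource: bool = False
    parent: Optional["Node"] = None
    children: List["Node"] = field(default_factory=list)

    def mount(self, name: str, provides_datasource: bool = False) -> "Node":
        """Mount a child node beneath this one and return it."""
        child = Node(name, provides_datasource, parent=self)
        self.children.append(child)
        return child


def datasources_in(node: Node) -> List[str]:
    """Names of datasource-providing nodes at `node` and anywhere beneath it."""
    names = [node.name] if node.provides_datasource else []
    for child in node.children:
        names.extend(datasources_in(child))
    return names


def available_datasources(portal: Node) -> List[str]:
    """Datasources mounted at the portal's own level and beneath (10.10.15)."""
    if portal.parent is None:
        return []
    names: List[str] = []
    for sibling in portal.parent.children:
        if sibling is not portal:
            names.extend(datasources_in(sibling))
    return sorted(names)


# The site-map from the example in requirement 10.10.15.
subsite = Node("subsite #1")
subsite.mount("bboard", provides_datasource=True)
subsite.mount("news", provides_datasource=True)
portal_1 = subsite.mount("portal #1")
page = subsite.mount("page")
page.mount("chat", provides_datasource=True)
portal_2 = page.mount("portal #2")

print(available_datasources(portal_1))  # ['bboard', 'chat', 'news']
print(available_datasources(portal_2))  # ['chat']
```

The result matches the example in 10.10.15: portal #1 sees bboard, news, and chat, while portal #2 sees only chat. A real implementation would presumably do this walk in the database against the site-map rather than in memory.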
30.20.30 An administrator should be able to make a copy of an element that's contained in another instance, if he has permission to do so.

30.20.40 A parameter browser should be provided that an administrator can use to set parameters on elements.

30.20.50 A UI for creating datasources should be defined.

Requirements: Security and Permissions

40.10.10 Permission is set using the main Portal package for the following groups of actions. I've also indicated the classes of users who might generally be performing these actions.

Read: performed by registered users and possibly casual users as well.
  Viewing a portal (that is, a package instance).
  Viewing an element. Inherited by default from the portal that contains the element.

Write: generally only performed by registered users.
  Generating and editing a personal copy of a portal.

Administer: performed by administrators who aren't site-wide administrators, and possibly by users on their own personal portals.
  Creating pages using predefined layouts.
  Altering the presence and placement of elements in those pages.
  Creating/editing datasources using only "safe" types of content (i.e. those that do not allow one to execute arbitrary code). Examples of such types include straight HTML, and possibly sanitized ADP (though most likely not). URLs may also be defined as safe content, if a robust system for retrieving them is carefully designed and implemented (probably in a future revision).
  Combining themes and datasources into elements.
  Editing the parameters associated with elements.
  Creating layout templates using a (hypothetical) template creation application. This will likely exist in a future revision of Portal.
  Reverting a subset of user pages in the current instance to the default (i.e. deleting them).

40.10.15 The following actions are performed using the Portal Administration package. Permission to use the Portal Admin package should be granted only to "trusted" administrators.
That is, those who the site admin feels comfortable allowing to execute arbitrary Tcl code on the server.
  Creating layouts by editing raw template code.
  Creating datasources with all types of content.
  Editing theme templates.
  Setting permissions.

40.10.20 A user must never be allowed to view an element when he does not have read permission on that element, even if the user had read permission in the past and imported the element into another portal.

40.10.30 Before insertion into a template, all ADP must be filtered out of the datasource's processed content. No ADP or Tcl may be accepted from an untrusted source. See above for a definition of "trusted".

Requirements: API

50.10 Template API:

Portal should provide the following functionality to the authors of layout templates and element themes:

50.10.5 "Is this the default portal?" - a procedure that will tell the template in question if it is being attached to the default portal, or a customized user portal.

50.10.10 "Kill this element" - a procedure that will return a link that, when followed, will remove the element in question from a page. Useful for making the "close box" interface widget.

50.10.15 "Spawn new window" - open the element in question in a new window, all by itself. Compare with My Yahoo.

50.10.20 All of the datasource's parameters are passed to the template as regular variables.

50.20 Developer API:

Support for the ACS developer seeking to integrate another package with portal. Of course, this doesn't cover the full range of available procedures. That's what documentation is for.

50.20.10 Include any API provisions from the ACS 3.4 series Portals package that remain relevant.

50.20.20 A procedure should be defined for accepting parameter arguments passed to a Tcl data source. It should allow the datasource author to retrieve arguments as either an array or an ns_set. The goal here is that the author shouldn't have to worry about the specific format in which arguments are passed to Tcl datasources.
He should simply be able to say, "Give me the parameters for this datasource." See requirement 10.10.35.

50.20.30 Portal should define procedures for producing commonly used interface widgets (like the dimensional sliders available in the ACS 3.4 series Portals) that function in an element-independent way. That is, for example, if a user is looking at a portal full of News elements, and changes one of them from the default 7-day view to the more verbose 30-day view, the other elements should remain unaffected.

50.30 PL/SQL:

50.30.10 There should exist a complete PL/SQL API that includes, among other things, a function for copying elements and duplicating portals (to make personal portals and/or move elements from one portal to another).

Requirements: Performance, Scalability, Robustness

60.10 Performance and Scalability:

60.10.10 Caching should be used wherever possible, either in the database or server memory, to reduce query overhead and improve response time. This is particularly important in Portal, as the typical page may use a large number of queries into the data from different packages. The core Portal engine itself should be as fast as possible, and should provide as much help as possible to the developer and administrator for increasing speed and scalability.

60.20 Robustness:

60.20.10 Though Portal may pull data from a remote server, errors on that remote machine should have as little effect on Portal as possible. We can't rely on other people to maintain reliable web services.

60.20.20 If/when an error occurs in a Portal component, the error should not affect the function of the rest of the system. For example, if a data source suddenly begins failing, it should disappear from the page, rather than producing a raw error message.
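As a rough illustration of requirement 60.20.20, here is a minimal Python sketch in which each datasource is fetched inside its own error boundary, so a failing one is logged and simply left off the page. The function name `render_elements`, the logger name, and the sample datasources are all made up for the example; Portal's real rendering path would go through the templating system instead.

```python
import logging
from typing import Callable, Dict, List

log = logging.getLogger("portal.sketch")  # hypothetical logger name


def render_elements(datasources: Dict[str, Callable[[], str]]) -> List[str]:
    """Render every element whose datasource succeeds; omit the ones that fail."""
    rendered: List[str] = []
    for name, fetch in datasources.items():
        try:
            content = fetch()
        except Exception:
            # Record the failure for the administrator, but never show a raw
            # error to the user: the element just disappears from the page.
            log.exception("datasource %r failed; omitting its element", name)
            continue
        rendered.append(f"[{name}] {content}")
    return rendered


def broken_weather() -> str:
    """Stand-in for a remote datasource whose server is down."""
    raise ConnectionError("remote server unavailable")


rendered_page = render_elements({
    "news": lambda: "3 new stories",
    "weather": broken_weather,
    "stocks": lambda: "2 quotes",
})
print(rendered_page)  # ['[news] 3 new stories', '[stocks] 2 quotes']
```

The weather element is dropped while news and stocks render normally. Catching errors per element rather than per page is the design choice that matters here; for remote datasources (60.20.10) one would likely add a timeout and a cached last-good copy as well.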
Requirements: Defaults

70.10.10 The following elements should be provided with Portal:

  "Current Portal", or similar -- display links to the other portals that share the same parent_id (that is, that reside at the same point in the site-map), and that the current user has permission to view. This should be set up as RDF or take a parameter or something so that it can either be displayed as an unordered list, or with a folder-tabs style UI.
  "My Portals" -- links to all of the other portals with the lowest sort_key in their parent_id that the user can currently access.

70.10.20 A number of ready-made element themes should be distributed with Portal.

70.10.30 There should be a number of sensible layout templates available. Examples include two- and three-column layouts, with or without a single, all-spanning row at the top.

70.10.40 Pending functioning RDF support, as many remote content datasources should be available in Portal as possible (e.g. Freshmeat, Brunching Shuttlecocks, Perl.com, PBS Online, Slashdot). Lots of people make directories of their content available; we should include links to them.

Revision History

Document Revision #   Action Taken, Notes   When?        By Whom?
0.1                   Creation              11/17/2000   Ian Baker
0.2                   Reviewed, revised.    11/28/2000   Michael Bryzek, Ian Baker
0.2                   Reviewed, revised.    12/3/2000    Michael Bryzek, Oumi Mehrotra, Ian Baker

ibaker@arsdigita.com

$Id: requirements.xml,v 1.1 2002/07/09 17:35:10 rmello Exp $