Server Clustering
part of the ArsDigita Community System by Jon Salz
- Tcl: /tcl/ad-server-cluster.tcl
The Problem
Many heavily-hit sites sit behind load balancers, which means that requests to a particular
site can be handled by one of several machines conspiring to appear as a single server.
For instance, requests to www.foobar.com might be routed to either www1.foobar.com,
www2.foobar.com, or www3.foobar.com,
three physically separate servers which share an Oracle tablespace (and
hence all the data in ACS).
Many database queries are memoized in individual servers' local memory
(using the util_memoize procedures) to minimize fetches from the database.
When a server updates an item in the database, the
old item needs to be removed from the server's local cache (using util_memoize_flush)
to force a database query the next time this item is accessed. But what happens when:
- www1.foobar.com does util_memoize "get_greeble_info 43" (incurring an actual
database lookup, SELECT * FROM greeble WHERE greeble_id = 43, and caching the result)
- www2.foobar.com does util_memoize "get_greeble_info 43" (incurring a
database lookup and caching the result)
- www1.foobar.com UPDATEs the info for greeble #43 and does
util_memoize_flush "get_greeble_info 43"
- www2.foobar.com does util_memoize "get_greeble_info 43" (returning a cached
value). The old info for greeble #43 hasn't been flushed from its local cache, so the result
is outdated!
In general, if any of several servers can
update an item, the old version of the item can remain in other servers' local caches.
Doh!
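The four steps above can be reproduced in a small simulation. This is only a sketch in Python, not ACS code: the Server class, its methods, and the in-memory "database" dictionary are hypothetical stand-ins for two servers that share one database but keep separate util_memoize caches.

```python
# Hypothetical stand-in for the shared Oracle table: greeble_id -> info.
database = {43: "old info"}


class Server:
    """One web server with its own private memoization cache."""

    def __init__(self, name):
        self.name = name
        self.cache = {}  # statement string -> cached result

    def memoize(self, greeble_id):
        # Mimics util_memoize "get_greeble_info <id>": query the database only on a miss.
        statement = f"get_greeble_info {greeble_id}"
        if statement not in self.cache:
            self.cache[statement] = database[greeble_id]
        return self.cache[statement]

    def memoize_flush(self, greeble_id):
        # Mimics a purely local flush: only this server's cache is touched.
        self.cache.pop(f"get_greeble_info {greeble_id}", None)


www1, www2 = Server("www1"), Server("www2")

www1.memoize(43)            # step 1: www1 caches "old info"
www2.memoize(43)            # step 2: www2 caches "old info"
database[43] = "new info"   # step 3: www1 updates the row ...
www1.memoize_flush(43)      # ... and flushes its own cache only

fresh = www1.memoize(43)    # www1 re-reads the database
stale = www2.memoize(43)    # step 4: www2 still serves its cached copy
```

After the last two calls, www1 sees the updated row while www2 keeps returning the value it cached in step 2.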
The Solution
We introduce the concept of a server cluster, a group of look-alike servers sharing an Oracle tablespace.
To set up a cluster, add the following to the ACS parameters/yourservername.ini file on each
of the servers in the cluster:
; address information for a cluster of load-balanced servers (to enable
; distributed util_memoize_flushing, for instance). One entry per
; server; this machine's IP may be included as well
[ns/server/click/acs/server-cluster]
; 192.168.16.1 is www1.foobar.com
ClusterMachine=192.168.16.1
; 192.168.16.2 is www2.foobar.com
ClusterMachine=192.168.16.2
; 192.168.16.3 is www3.foobar.com
ClusterMachine=192.168.16.3
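Note that the block repeats the ClusterMachine key once per server, so a reader that keeps only one value per key would lose entries. The following Python sketch shows one way to collect every entry and drop the local address. It is an illustration only: cluster_machines is a hypothetical helper, and how the ACS itself reads this section is not shown here.

```python
def cluster_machines(ini_text, section):
    """Return every ClusterMachine value listed under the given section header."""
    machines = []
    in_section = False
    for raw in ini_text.splitlines():
        line = raw.strip()
        if not line or line.startswith(";"):
            continue  # skip blank lines and ; comments
        if line.startswith("[") and line.endswith("]"):
            in_section = (line[1:-1] == section)
            continue
        if in_section and "=" in line:
            key, value = line.split("=", 1)
            if key.strip() == "ClusterMachine":
                machines.append(value.strip())
    return machines


SAMPLE = """\
[ns/server/click/acs/server-cluster]
; 192.168.16.1 is www1.foobar.com
ClusterMachine=192.168.16.1
ClusterMachine=192.168.16.2
ClusterMachine=192.168.16.3
"""

machines = cluster_machines(SAMPLE, "ns/server/click/acs/server-cluster")
# A server whose own address is 192.168.16.1 would contact only the other two.
peers = [ip for ip in machines if ip != "192.168.16.1"]
```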
Now when a server (say, www1.foobar.com) invokes
util_memoize_flush or util_memoize_seed, those routines use
server_cluster_httpget_from_peers
to issue an HTTP GET request to every other machine in the cluster (the local server is omitted):
- GET http://www2.foobar.com/SYSTEM/flush-memoized-statement.tcl?statement=tcl-statement
- GET http://www3.foobar.com/SYSTEM/flush-memoized-statement.tcl?statement=tcl-statement
causing the other machines (www2.foobar.com and www3.foobar.com) to flush the cached result for that Tcl statement
from their local caches. This is transparent and works with all existing code.
So don't think about it - just set up the server-cluster block in each server's yourservername.ini file,
and util_memoize and friends will be happy.
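To make the peer requests concrete, here is a minimal Python sketch that builds the flush URLs a server would request from its peers. It is not the implementation of server_cluster_httpget_from_peers: peer_flush_urls is a hypothetical helper, percent-encoding the statement is an assumption, and no request is actually sent.

```python
from urllib.parse import quote

FLUSH_PATH = "/SYSTEM/flush-memoized-statement.tcl"  # path taken from the text above


def peer_flush_urls(cluster, local, statement):
    """Build one flush URL per cluster member, skipping the local server."""
    # Assumption: the statement is percent-encoded before going into the query string.
    query = "statement=" + quote(statement, safe="")
    return [f"http://{host}{FLUSH_PATH}?{query}" for host in cluster if host != local]


cluster = ["www1.foobar.com", "www2.foobar.com", "www3.foobar.com"]
urls = peer_flush_urls(cluster, "www1.foobar.com", "get_greeble_info 43")
for url in urls:
    print(url)
```

When www1.foobar.com is the local server, the list contains one URL each for www2.foobar.com and www3.foobar.com and none for www1.foobar.com itself.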
jsalz@mit.edu