Home Articles Talks Links Contact Me ISA ThoughtWorks

Database Session State


Store session date as committed data in the database.

How it Works

When a call occurs from the client to the server, the server object will first pulls the data required for the request from the database. The server object then does the work it needs to do and saves all the data required back to the database.

In order to pull information from the database, the server object will need some information about the session - requiring at least a session ID number to be stored on the client. Usually, however, this information is nothing more than the appropriate set of keys that's needed to find the appropriate amount of data in the database.

The data that's involved will typically involve a mix of session data, that's only local to the current interaction, and committed data that's relevant to all interactions.

One of the key issues to consider here is how to deal with the fact that session data is usually considered local to the session and shouldn't affect other parts of the system until the session as a whole is committed. So if you are working on an order in a session and you want to save it's intermediate state to the database, you usually need to handle it differently to an order that is confirmed at the end of a session. This is because that often, you don't want pending orders to appear in queries that are run against the database for such things as book availability, revenue that day, and the like.

So how do you separate the session data? One route is to simply add a field to each database row that may have session data.The simplest form of this is to just have a boolean field isPending. However a better way is to store a session id in this field. This makes it much easier to find all the data for a particular session. All queries that want only record data now need to be modified with a clause sessionID is not NULL, or use a view that filters out that data.

Using a session ID field ends up being a very invasive solution because all applications that touch the record database need to know about the meaning of the field to avoid getting session data. Views will sometimes do the trick and remove the invasiveness, but they often impose other costs of their own.

A second alternative is to use a separate set of pending tables. So if you have an orders and an order lines tables in your database, you would add tables for pending orders and pending order lines. While session data is pending you save it to the pending table and when it becomes record data you save it to the real tables. This removes much of the invasiveness problem, but you'll need to add the appropriate table selection logic to your database mapping code, which will certainly add some complications.

Often the record data will have integrity rules that aren't really true for pending data. In this case the pending tables allow you to not have the rules when you don't want them, but to enforce them when you do. Validation rules also will typically not be applied when saving pending data. You may face different validation rules depending on where you are in the session - but this will typically appear in server object logic.

If you use pending tables, they should be pretty much exact clones of the real tables. That way you can keep your mapping logic as similar as possible. So use the same field names between the two tables. However do add a session id field to the pending tables so that you can easily find all the data for a session.

You'll need a mechanism to clean out the session data if a session is cancelled or abandoned. Using a session ID makes this very straightforward as you can find all data with that session ID and delete it. If users abandon the session without telling you, you'll need some kind of timeout mechanism. A daemon that runs every few minutes can look for old session data. This implies a table in the database that keeps a track of the time of the last interaction with the session.

Rollback is made much more complicated by updates. If you update an existing order in a session that allows a rollback of the whole session, how do you perform the rollback? One option is to not allow cancellation of a session like this. Any updates to existing record data become part of the record data at the end of the request. This is simple and often fits the users' view of the world. The alternative is awkward whether you use pending fields or pending tables. Pending tables can be easier as you can copy all the data that may be modified into pending tables, modify it there, and commit it back to the record tables at the end of the session. You can do a similar approach with a pending field, but only if the session ID becomes part of the key so that you can keep the old and new IDs in the same table at the same time - which can get very messy.

If you are going to use separate pending tables that are only read by objects that handle a session, then there may be little point in tabularizing the data. Then it's better to use a Serialized LOB. At this point we've crossed the boundary into a Server Session State

You can avoid all of the hassles of pending data by not having any. If you design your system so that all data is considered to be record data, then you avoid all issues of pending data. This isn't always possible, of course, and sometimes it's done in such an awkward way that the designers would be better off with thinking about explicit pending data. But if you have the option it makes Database Session State a lot easier to work with.

When to Use it

Database Session State is one alternative to handling session state and should be compared with Server Session State and Client Session State.

The first aspect to consider with this pattern is performance. You'll gain by allowing yourself to use stateless objects on the server, thus enabling pooling and easy clustering. However the cost you'll pay is the time to pull the data in and out of the database with each request. You can reduce this cost by caching the server object. This will save you having to read the data out of the database whenever the cache is hit. You'll still pay the write costs.

The second main issue is that programming effort, most of which centers around handling session state. If you have no session state and are able to save all your data as record data in each request - then this pattern is an obvious choice as you lose nothing in either effort or performance (if you cache your server objects).

In a choice between Database Session State and Server Session State the biggest issue may be in how easy it is to support clustering and fail over with Server Session State in your particular application server. Getting clustering and fail over to work with Database Session State is usually more straightforward, at least with the regular solutions.



© Copyright Martin Fowler, all rights reserved