As Web sites get more complicated and more dynamic, developers want to give users a more cohesive environment. This cohesion can provide all sorts of functionality, from a simple method of tracking a shopping basket to providing full-blown customization of stories, templates, and information shown to users as they use the Web site. The key to this system is the session — a unique identifier that enables developers to identify users, either for relatively short periods (e.g., in shopping baskets) or longer (full customization).
Apache, combined with Perl and PHP, is one option for managing sessions involving user identification on dynamic Web sites.
This article will look at how Apache can help with session management and how that information can be used with Perl and PHP scripts.
Defining a Session
A session is typically defined as a single, non-persistent visit to a Web site during which a user might conduct one or more transactions. For example, a session might be used to track a user’s progress through a store and record his shopping basket. At the end of the process, he buys the goods, and that is typically the end of the ‘session’.
Two points follow from this. First, the session is typically unique to a particular visit. If the same user visited the store again, he would get a new session ID and therefore a new shopping basket. Second, because we cannot tell whether a user has actually stopped using our Web site, we must use a timeout value to decide when the user is no longer visiting the site. The timeout value matters: it must be long enough that users can continue with their session even if they are interrupted, but short enough that a new session is triggered when they visit later in the day or week.
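The timeout behavior described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the mechanism Apache, Perl, or PHP actually use: the `SessionStore` class, the 30-minute default, and the in-memory dictionary are all hypothetical choices made for the example.

```python
import secrets
import time

# Assumed value: 30 minutes of inactivity ends the session.
SESSION_TIMEOUT = 30 * 60


class SessionStore:
    """Hypothetical in-memory session store keyed by session ID."""

    def __init__(self, timeout=SESSION_TIMEOUT, clock=time.time):
        self.timeout = timeout
        self.clock = clock      # injectable so expiry can be exercised in tests
        self._sessions = {}     # session_id -> (last_seen, data)

    def create(self):
        """Start a new session and return its unique ID."""
        session_id = secrets.token_hex(16)
        self._sessions[session_id] = (self.clock(), {})
        return session_id

    def get(self, session_id):
        """Return the session data, or None if unknown or timed out."""
        entry = self._sessions.get(session_id)
        if entry is None:
            return None
        last_seen, data = entry
        now = self.clock()
        if now - last_seen > self.timeout:
            # The user has been away too long: drop the old session.
            del self._sessions[session_id]
            return None
        # Any activity refreshes the timeout window.
        self._sessions[session_id] = (now, data)
        return data
```

A caller that receives `None` from `get` would create a fresh session, which is how a returning visitor ends up with a new ID and an empty basket.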
However, we can also use the principles of a session to enable a user to visit a site and get a customized or personalized environment without requiring users to log in or set their preferences again. In this case, the persistence of the session becomes a way of identifying the user, and the timeout value for the session may be a longer period (such as weeks or months), to enable the user to re-visit the site without having to log in again and re-create her session.
Whether you use the principles of the session to track single visits (often suitable for simple e-commerce sites) or longer-term identification of users (ideal on larger well-used e-commerce and community sites that are visited regularly), the fundamentals of a session system are actually quite straightforward.
There are two elements to a session. The first is the requirement that the browser supply a unique (but consistent) session identity when it accesses objects from the Web server. It must be unique to identify the user or his session, and consistent so that individual object requests use the same ID. The second is that the session is used to store information about the user (e.g., her shopping basket or site preferences). The former requires interaction from the browser (which we can control). The latter requires programming and storage on the Web server to control the content returned to the browser.
For communicating a unique session ID, two mechanisms are available: cookies and URL rewriting. This article concentrates on cookies, as they are the more practical of the two, although URL rewriting has its own advantages. PHP includes built-in support for cookies and for retaining information across object accesses. Within Perl and Python, standard and third-party modules make using sessions and retaining session data much easier.
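For comparison, URL rewriting means embedding the session ID in every link the site generates. The following Python sketch shows the idea using only the standard library. The function name `add_session_to_url` and the query parameter name `sid` are assumptions made for this example, not names used by Apache or PHP.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def add_session_to_url(url, session_id, param="sid"):
    """Return url with the session ID carried in its query string.

    'sid' is a hypothetical parameter name chosen for illustration.
    """
    parts = urlsplit(url)
    # Keep existing parameters, but drop any stale session ID.
    query = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key != param
    ]
    query.append((param, session_id))
    return urlunsplit(parts._replace(query=urlencode(query)))
```

Every link on every page has to pass through a function like this, which is why the approach needs more scripting than cookies do.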
Cookies write the session ID information into the user’s browser cookie database; this cookie is supplied automatically to the site when the user accesses a page. Cookies are a practical way of sharing information between page views without requiring complicated scripting to embed session information into the URL. The browser automatically supplies cookies, regardless of whether you are using dynamic or static components. As a result, they can be used on sites that combine static and dynamic content without having to worry about how to exchange the information.
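On the server side, the cookie arrives in the `Cookie` request header, and a script only has to pick out the one it cares about. A minimal Python sketch using the standard `http.cookies` module might look like this. The cookie name `session_id` and the helper name are assumptions for the example.

```python
from http.cookies import SimpleCookie


def session_id_from_header(cookie_header, name="session_id"):
    """Extract the session ID from a raw Cookie header value.

    Returns None when the header is missing or has no such cookie.
    'session_id' is a hypothetical cookie name.
    """
    cookie = SimpleCookie()
    cookie.load(cookie_header or "")
    morsel = cookie.get(name)
    return morsel.value if morsel is not None else None
```

In a CGI-style script the header value would typically come from the `HTTP_COOKIE` environment variable, for example `session_id_from_header(os.environ.get("HTTP_COOKIE"))`.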
A cookie is generally used to store a single piece of information, for example the session ID or a shopping basket item. Each cookie is specific to a site, optionally to a path within the site, and each has a specific name. You can therefore create multiple cookies to store multiple pieces of information.
Some users are wary of enabling cookies, mostly because of some early bad press and problems with the implementations that allowed cookie information to be read by different sites. Today, cookie security is much tighter.
Cookies are kept secure largely through parameters that should be defined when the cookie is created. Three of these control the cookie’s duration, the site or domain for which it is valid, and the path within that site for which it is valid. Together, these settings determine how and where the information in a cookie is made available.
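As an illustration of these three settings, the Python sketch below builds the value of a `Set-Cookie` header with the standard `http.cookies` module. The function name, the cookie name `session_id`, and the example domain and path are assumptions for the example. Duration is expressed here with the `Max-Age` attribute, one of two common ways to set it (the other being `Expires`).

```python
from http.cookies import SimpleCookie


def build_session_cookie(session_id, max_age, domain, path="/"):
    """Return a Set-Cookie header value for a session cookie.

    max_age: lifetime in seconds (the cookie's duration)
    domain:  site or domain on which the cookie is valid
    path:    path within the site where the cookie is valid
    """
    cookie = SimpleCookie()
    cookie["session_id"] = session_id       # hypothetical cookie name
    cookie["session_id"]["max-age"] = max_age
    cookie["session_id"]["domain"] = domain
    cookie["session_id"]["path"] = path
    return cookie["session_id"].OutputString()


# Example: a 30-minute cookie limited to the /store area of one site.
header_value = build_session_cookie("abc123", 30 * 60, "www.example.com", "/store")
```

A short `max_age` suits a shopping-basket session, while a value of weeks or months suits the longer-term identification described earlier.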