From Database to Web Page: Server Strategies When Turning Data Into Content

From Database to Web Page: Server Strategies When Turning Data Into Content


February 21, 2002

Take this scenario: You have information in a database somewhere, and you need to get it up on your Web site -- perhaps in an interactive page or a full-blown application. This tutorial is aimed at users who are not necessarily HTML jockeys but are discovering that data (and situations like this) is looming prominently in their future, and they are trying to figure out how to get it from the database server to the Web server.

So, what are the options?

Good news: There are a lot of them. But that's also the bad news because choosing the right approach isn't necessarily easy. This tutorial will offer a quick tour of some of the most important options with an eye toward helping users understand when and how they may be appropriate

Using HTML for managing data is like using a set of three metric wrenches to fix an American automobile. It can be done, maybe -- sort of, but it sure isn't comfortable or pretty.

This is because there simply aren't that many ways to deal with data in HTML, or in DHTML for that matter. It can work, as long as your needs are simple, the volume of data is trivial, and the data does not change much. For the most part HTML data is static -- a put it up and leave it there sort of thing.

These days this often isn't enough.

In the real-time environment of the Web, users want information now, and at some point an organization will need to have a dynamic relationship to data.

As an example, data is available on a server somewhere -- maybe an Oracle or Microsoft SQL Server database -- and you need to get information from it and probably to it.

The potential scope on this is broad and can be broken down into six steps:

  1. Know what data is needed and where to get it (i.e., the specific database)
  2. Make a connection to the data source -- i.e. the server (Note that this step can be quite technical.)
  3. Request the data from the server, usually with an SQL statement
  4. Process, validate, or pre-format the data returned from the server
  5. Present the data to the user (i.e., display it in a Web page)
  6. Possibly get data from the user or another source of interactive input

At least four pieces of software (and probably three computers) are involved in this process: a Web server, a script running processes on the Web server, a database server, and the browser.

All of these components must work together. Generally, the person overseeing the data migration also has responsibility for the correctness and integrity of the data. Is a database management expert needed to do this? No, although of course experience does help.)

In most cases, some knowledge of SQL is required. What is needed most though are tools that help with the process. Most of these are scripting frameworks -- typically software running on the Web server -- such as Perl, PHP, JSP, ASP.Net, ColdFusion, or an application server system. In the next few pages we will discuss each of these.

If you haven't yet met Perl ... well, you've missed quite a character. Perl has been around for a while, and data is hardly its only trick. Perl can do a good job with most phases of getting information from databases to a Web page. Its longevity has a couple of benefits: It has lots of support -- the database access API and DBI (http://dbi.symbolstone.org) for example -- and virtually every server running almost any operating system will have Perl on it to run programs.

Some people swear by Perl; others swear at it. It's not elegant, but it will do the job for data.

PHP is nearly as old as Perl (and in fact was originally written in Perl). Unlike most of the other tools, PHP addresses databases on an individual basis. This can be good, as in the case of its most common companion MySQL (www.mysql.com), an open source database manager in which specific functions support the database connection. In other cases, such as databases running under Windows, PHP is much more rudimentary. Mostly though, PHP and MySQL are a partnership found on almost every Linux server -- an economical and well-supported relationship worth exploiting.

Java Server Pages were something of a late comer to Java and had a rocky start, but that was a couple of years ago. Today, JSP is a well-accepted component of Java, and many tools use and support it. As a framework for scripting database connections, JSP usually works through JDBC (the Java equivalent of ODBC).

JSP is a good vehicle for moving between platforms; you don't need to be overly cautious about the effects of extensions (like extensions for Unix in PHP) or incompatibilities (like with Microsoft COM objects in ASP). A good place to start with JSP is the reference implementation, Tomcat (http://jakarta.apache.org/tomcat), an open source program.

As the successor and first upgrade in a long time to the original Active Server Pages (ASP), ASP.Net's data handling aspects have undergone more than just name changes (ODBC has generally migrated to OLE DB).

The .Net in the name indicates Microsoft's use of ASP to support Web services. The main disadvantage is a familiar one for anything from Microsoft: Porting to other platforms is difficult or impossible. Most organizations would not choose to start with ASP.Net if they knew they also had to run data on Linux with Apache servers. On the other hand, if an organization knows its going to run data on Microsoft servers (e.g., IIS), then there are optimizations available to ASP, a ton of COM objects to use, and many third-party development tools.

Macromedia ColdFusion (www.macromedia.com) was among the first and most successful products to insert its own markup language into HTML with a primary focus on incorporating data. Today, people still like ColdFusion for the data orientation focus.

Available for several different server platforms, CFML (Cold Fusion Markup Language) allows tight control when connecting to databases, formulates an SQL statement to fetch the data, and controls how the returned data is organized. Although CFML can be programmed in a simple text editor, we believe the Windows-only ColdFusion Studio is the best choice for a development environment. ColdFusion has come so far down the trail of sophistication that it is now a system with an application server at its heart.

As you go up the complexity scale in Web data -- multiple database servers, distributed applications, Web services, and fat clients -- it becomes increasingly difficult for simple coding with a markup language to manage the data. That's where a relatively new breed of software comes in: the application server. (See our tutorial "Web Servers vs. Application Servers: Making the Right Choice" for more information about app servers.)

Application servers have many functions, but they start with database connectivity and are designed to help organize and control the potentially vast amount of data flowing back and forth from applications, objects, and database servers. These days, application servers from the likes of Borland, IBM, Oracle, and Sun Microsystems also come hitched to massive software development systems (e.g., IBM Websphere Studio) that provide enough tools for even the largest enterprise applications.

In addition to the scripting frameworks, eXtensible Markup Language (XML) can be used to help orchestrate the migration. XML already plays a major role in data management in and around Web pages. It's a tag-based approach to organizing pages with data, passing data between dissimilar computer systems, and much more -- a complicated subject that we cannot do justice to here. XML is relatively new; its protocols, standards, and tools are still in the process of development and shakeout. However, if you're going to work with Web pages and data, you'll need to add XML knowledge to your toolkit.

If the need for data in your Web pages looks like it will grow substantially, then we recommend using Java (JSP) or .Net (ASP) because it's a safe assumption that in the long term these will be well-supported by major development environments and application servers. At more modest levels of data use, PHP and Perl still offer inexpensive and very effective ways to manage data within the HTML environment.

Also, as you may have noticed, we've piled up at least three different servers (application, Web, and database) in our descriptions. This arrangement becomes complicated very quickly. Management of data is often the heart of Web applications, and as we presumably move into the era of Web services, we'll see more of it.