Implementing a Search Engine in ASP

Scott Mitchell
As a web site grows, finding content on the site becomes increasingly difficult. To combat the difficulty of finding relevant information on a large site, many developers turn to writing a search engine for their site. This article discusses how to implement such a system using Active Server Pages and SQL Server.

There are two "types" of search engines. Both take a search string from the user to begin, but what, exactly, they search differs. A completely dynamic search engine for a completely dynamic web site will hit a database table which ties an article URL to the article's description. The database can then compare the user's search request to the descriptions of the available articles and return the relevant URLs.

Another approach is to do an actual text search through each of the files. For example, say that the user searched for "Microsoft." Your search engine would then look through all of your HTML files and return the URLs of those which had the word "Microsoft" somewhere in the document. Such a system is used for this web site's search engine. In my opinion, such a system is much easier to write in Perl (which this site's search engine is written in) than in Active Server Pages; however, it is quite possible to write a text-finding search system in ASP.

In this article I plan to implement the former search engine, the dynamic search engine. For this example I will make a table called ArticleURL, which will have the following definition:

ArticleURLID int   PK
URL varchar(255)
Title varchar(100)
Description varchar(255)
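
For readers who want to create the table from script, here is a minimal one-time setup sketch. It assumes a SQL Server connection string stored in Application("MyConnectString") and an account that is allowed to create tables. The IDENTITY(1,1) seed and the NULL / NOT NULL choices are my assumptions and are not part of the definition above.

```vbscript
<%
' One-time setup sketch: creates the ArticleURL table described above.
' Assumption: Application("MyConnectString") holds a valid SQL Server
' connection string. IDENTITY(1,1) and the nullability are assumptions too.
Dim Conn
Set Conn = Server.CreateObject("ADODB.Connection")
Conn.Open Application("MyConnectString")

Conn.Execute "CREATE TABLE ArticleURL (" & _
             "ArticleURLID int IDENTITY(1,1) NOT NULL PRIMARY KEY, " & _
             "URL varchar(255) NOT NULL, " & _
             "Title varchar(100) NOT NULL, " & _
             "Description varchar(255) NULL)"

Conn.Close
Set Conn = Nothing
%>
```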

Now that we've got our table definition, let's look at how our web visitors will enter their queries.

Search Querying
A search engine is rather useless unless queries can be made, and the results are returned. Let's examine how we will code the first needed part, the user search requests. All we will need is a simple HTML FORM which takes input from the user and passes it on to an ASP page. Here is an example of a file we'll call SearchStart.htm:

<FORM METHOD=POST ACTION="Search.asp?ID=0">
Search for: <INPUT TYPE=TEXT NAME="txtSearchString" SIZE="50">
<INPUT TYPE=SUBMIT VALUE="Search">
</FORM>

This, of course, is not a pretty HTML page, but its functionality is there. There are many things which could be done to enhance this page. It is recommended that JavaScript functions be present to make sure the user is searching for something (i.e., not just clicking Submit when there is no search string).

Now that we have the Query, we need to look at the second phase of any search engine: retrieving the data and presenting it to the user. Here is where the real fun begins!

Retrieving the Data and Presenting It:
Our ASP page Search.asp must perform a few steps. First, it must parse the FORM variable txtSearchString. Right now, I am assuming that the words in txtSearchString, separated by spaces, will be ANDed together. You can alter this (have them ORed), or, to make it more professional, you can give the user the option of which boolean operator to put between the words.
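
One way to offer that choice is sketched below. It assumes a hypothetical form field named optBoolean (for example, two radio buttons in SearchStart.htm with the values AND and OR); neither the field nor the GetSearchBoolean helper is part of the listing in this article. The helper only ever returns one of two fixed strings, so the raw form value never ends up in the SQL text.

```vbscript
<%
' Returns "AND" or "OR" for the operator placed between search words.
' Anything other than an explicit "OR" falls back to "AND".
' GetSearchBoolean is a hypothetical helper name.
Function GetSearchBoolean(ByVal strChoice)
    If UCase(Trim(strChoice)) = "OR" Then
        GetSearchBoolean = "OR"
    Else
        GetSearchBoolean = "AND"
    End If
End Function

' optBoolean is a hypothetical form field. This assignment would take
' the place of a fixed DefaultBoolean = "AND" line in Search.asp.
Dim DefaultBoolean
DefaultBoolean = GetSearchBoolean(Request.Form("optBoolean") & "")
%>
```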

Next, Search.asp will need to hit the database table ArticleURL and return the data in a user-friendly fashion. Also, we will want to display the results only 10 records at a time, so logic will need to be implemented to handle this as well. Let's look at some code.

<%
'Connect to Database
Dim Conn
Set Conn = Server.CreateObject("ADODB.Connection")
Conn.Open Application("MyConnectString")

'Set these up to your preference
Dim DefaultBoolean, RecordsPerPage
DefaultBoolean = "AND"
RecordsPerPage = 10

'Get our form variable, doubling single quotes so they cannot break the SQL
Dim strSearch
strSearch = Replace(Trim(Request.Form("txtSearchString")), "'", "''")

'Get our current ID. This lets us know where we are
Dim ID
ID = 0
If IsNumeric(Request.QueryString("ID")) Then ID = CLng(Request.QueryString("ID"))

'Set up our SQL Statement
Dim strSQL, tmpSQL
tmpSQL = "(Description LIKE "

'OK, we need to parse our string here
Dim Pos
Pos = 1
While Pos > 0
      Pos = InStr(1, strSearch, " ")
      If Pos = 0 Then
            'We have hit the end
            tmpSQL = tmpSQL & "'%" & strSearch & "%')"
      Else
            'Take the next word (without the space) and keep parsing the rest
            tmpSQL = tmpSQL & "'%" & Mid(strSearch,1,Pos-1) & "%' " & DefaultBoolean & " Description LIKE "
            strSearch = Mid(strSearch,Pos+1,Len(strSearch))
      End If
Wend

'Now, we've got to make sure we only get the right records
strSQL = "SELECT ArticleURLID, URL, Title FROM ArticleURL WHERE "
strSQL = strSQL & tmpSQL & " AND ArticleURLID > " & ID
strSQL = strSQL & " ORDER BY ArticleURLID"    'Important!

'Make our Recordset variable and get the results
Dim rsResults
Set rsResults = Server.CreateObject("ADODB.Recordset")

'Get the right number of records per page
rsResults.MaxRecords = RecordsPerPage

'Set our recordset properties (include ADOVBS.inc for the constant definitions!)
rsResults.CursorType = adForwardOnly

'Get our data
rsResults.Open strSQL, Conn

'OK, we've got the data, let's display it in HTML
'First, though, let's get the total number of records
Dim rsTotalRecords
Set rsTotalRecords = Conn.Execute("SELECT COUNT(*) FROM ArticleURL WHERE " & tmpSQL)

'We also need the max ID value for our search
Dim rsMaxID
Set rsMaxID = Conn.Execute("SELECT MAX(ArticleURLID) FROM ArticleURL WHERE " & tmpSQL)
%>

<% if rsResults.EOF then  'No matches found %>
No matches found! Try broadening your search criteria.<P>
<A HREF="SearchStart.htm">Return to Search</A>
<% Else
Dim iCurrentID
While Not rsResults.EOF
iCurrentID = rsResults("ArticleURLID") %>
<A HREF="<%=rsResults("URL")%>"><%=rsResults("Title")%></A><BR>

<% rsResults.MoveNext
Wend %>

<%=rsTotalRecords(0)%> Found!<BR>

<% if iCurrentID < rsMaxID(0) then %>
<!-- We have at least another record... -->
<FORM METHOD=POST ACTION="Search.asp?ID=<%=iCurrentID%>">
<INPUT TYPE=HIDDEN NAME="txtSearchString" VALUE="<%=Server.HTMLEncode(Request.Form("txtSearchString"))%>">
<INPUT TYPE=SUBMIT VALUE="Next">
</FORM>
<% end if
end if 'End if for .EOF clause above %>

Note: Please forgive me if there are many errors or typos. I wrote this code while writing this article. It has not been fully tested. In theory it should work. More important than running source code are the ideas behind the code. Source code is a mere transformation of ideas into something a computer can understand. If you truly understand the ideas, the code should write itself.

Hopefully you can understand what this code is doing. This file, Search.asp, will be called the first time a search is executed and each time the user wants to view the next N records. To start out, the file gets the search string and the current ID. The current ID is an important value: it tells this page which records we've already seen. The SQL searches for records that have an ArticleURLID greater than the passed-in ID. To start off, we pass in an ID of 0, so all records (assuming ArticleURLID was set as an IDENTITY(1,n)) will be included.

Next we parse out our Search String into a string variable called tmpSQL. If the user searched on "Magnum P I", tmpSQL would contain (Description LIKE '%Magnum%' AND Description LIKE '%P%' AND Description LIKE '%I%'). We then add to our WHERE clause ArticleURLID > ID, where ID is the ID we pass into Search.asp.
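
The same parsing step can also be written with VBScript's Split function instead of an InStr loop. The sketch below is one possible version, not the code this article uses: BuildDescriptionFilter is a hypothetical helper name, and it additionally doubles single quotes and skips the empty entries left by repeated spaces. LIKE wildcards typed by the user (% and _) are passed through unchanged.

```vbscript
<%
' Builds the parenthesised Description filter from a space-separated
' search string. BuildDescriptionFilter is a hypothetical helper name.
Function BuildDescriptionFilter(ByVal strSearch, ByVal strBoolean)
    Dim arrWords, i, strWord, strFilter
    strFilter = ""
    arrWords = Split(Trim(strSearch), " ")

    For i = 0 To UBound(arrWords)
        ' Double any single quotes so a word cannot end the SQL string
        strWord = Replace(arrWords(i), "'", "''")
        If Len(strWord) > 0 Then      'skip gaps left by repeated spaces
            If Len(strFilter) > 0 Then
                strFilter = strFilter & " " & strBoolean & " "
            End If
            strFilter = strFilter & "Description LIKE '%" & strWord & "%'"
        End If
    Next

    ' An empty search yields an empty string, so the caller can detect it
    If Len(strFilter) > 0 Then strFilter = "(" & strFilter & ")"
    BuildDescriptionFilter = strFilter
End Function

' Writes the filter for the "Magnum P I" example from the text
Response.Write BuildDescriptionFilter("Magnum P I", "AND")
%>
```

For "Magnum P I" this produces the same three ANDed LIKE conditions described above. Because an empty search returns an empty string, the calling page should check for that before appending the filter to a WHERE clause.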

Next, we create an instance of an ADO Recordset object, and set the MaxRecords property to N, where N is the number of rows we want to display per page. This will only return N records to our recordset object.

Finally we get the total number of records which match our search criteria and the maximum ID which matches our criteria. We need the maximum ID to determine if we are currently on the last page of results. Once we have all of this data we are ready to display our information.
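
Both values can be fetched in a single round trip if you prefer. The following is a sketch under assumptions rather than the article's listing: it presumes Conn is an already open ADODB.Connection and that tmpSQL holds the parenthesised Description filter built earlier in Search.asp. The names rsSummary, lngTotal and lngMaxID are hypothetical.

```vbscript
<%
' Sketch: fetch the match count and the highest matching ID in one query.
' Assumes Conn is an open ADODB.Connection and tmpSQL holds the
' parenthesised Description filter built earlier in Search.asp.
Dim rsSummary, lngTotal, lngMaxID
Set rsSummary = Conn.Execute( _
    "SELECT COUNT(*), MAX(ArticleURLID) FROM ArticleURL WHERE " & tmpSQL)

lngTotal = rsSummary(0)
If IsNull(rsSummary(1)) Then
    lngMaxID = 0              'MAX() is NULL when nothing matched
Else
    lngMaxID = rsSummary(1)
End If

rsSummary.Close
Set rsSummary = Nothing
%>
```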

We start out by seeing if we have any information in the first place! You'll note the if rsResults.EOF then test. If no records are found then we inform the user that we could find no results and provide a link back to the SearchStart.htm page from which they came. If, however, rsResults is not empty, we iterate through the recordset. We then check to see if our last ArticleURLID is less than the maximum ID. If it is, then we know we have at least one more record to show, so we display the "Next" button which will display the next N records.

Areas for Improvement:
As I'm sure you have noticed, this search engine solution leaves a lot to be desired as far as functionality goes when we compare it to standard internet search engines. For example, there is no Back button, only a Next button. Also, you cannot do any complex boolean searches, such as: "Microsoft AND 'Active Server Pages' AND NOT (VBScript OR JScript)". These can both be accomplished, though!

Personally, I have written a parser which accepted complex boolean searches similar to the one shown above and transformed it into a SQL WHERE clause. To implement a Back feature, I would recommend a dynamic Array (or stack). You would need to put this in a Session-level variable. Each time the user hits Next, you will want to push the Request.QueryString("ID") onto the stack. When they hit the "Back" button you'll want to pop the last ID off the stack and pass it as ID to Search.asp.
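
The Back stack idea can be sketched roughly as follows. The Session key "SearchIDStack" and the names PushSearchID and PopSearchID are hypothetical, and the sketch assumes session state is enabled for the application. Arrays are copied when read from or written to the Session object, so each routine stores the modified copy back.

```vbscript
<%
' Sketch of a Back stack kept in the Session object.
' "SearchIDStack" is a hypothetical Session key.

' Push the ID of the page being left onto the stack.
Sub PushSearchID(ByVal lngID)
    Dim arrStack
    arrStack = Session("SearchIDStack")
    If Not IsArray(arrStack) Then
        arrStack = Array()                'start with an empty stack
    End If
    ReDim Preserve arrStack(UBound(arrStack) + 1)
    arrStack(UBound(arrStack)) = lngID
    Session("SearchIDStack") = arrStack   'store the modified copy back
End Sub

' Pop the most recent ID; returns 0 (the first page) when the stack is empty.
Function PopSearchID()
    Dim arrStack, lngTop
    arrStack = Session("SearchIDStack")
    PopSearchID = 0
    If IsArray(arrStack) Then
        lngTop = UBound(arrStack)
        If lngTop >= 0 Then
            PopSearchID = arrStack(lngTop)
            If lngTop = 0 Then
                arrStack = Array()
            Else
                ReDim Preserve arrStack(lngTop - 1)
            End If
            Session("SearchIDStack") = arrStack
        End If
    End If
End Function
%>
```

Search.asp would call PushSearchID with the incoming Request.QueryString("ID") when the user hits Next, and a Back button would pass the result of PopSearchID as the ID.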

In this article we've examined how to implement a simplistic dynamic search engine using Active Server Pages and SQL Server. While the model implemented in this article is not exactly "feature-ful," it does search, and presents the basic ideas behind a search engine. Without major modifications, this system could be transformed into a very impressive, professional looking search engine.

This article was originally published on Nov 15, 1998