Creating Referer Logs with 4D Web Server



By Eric Saltzen, 4D Inc. Technical Support

Technical Note 00-57

Technical Notes for 00-12 December 2000

Introduction


Who's linking to your pages? By collecting Referer information from browser requests, 4th Dimension can find out. Note that the misspelling of the word "referrer" is, unfortunately, already part of the standard, so I will continue to spell the word as it is actually used in this context (with one 'r'). Referer information can appear in any HTTP request from a web browser. The browser sends it as one of the HTTP header lines that accompany the request, in this basic format:

"Referer: http://www.anothersite.com/pageWithALink.html"
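To make the header format concrete, here is a minimal sketch in Python (not part of the sample database) of how a Referer line could be picked out of raw request header text. The function name extract_referer and the sample request are assumptions made for illustration only.

```python
from typing import Optional


def extract_referer(raw_header: str) -> Optional[str]:
    """Return the value of the Referer header line, or None if it is absent."""
    # Header lines are separated by CRLF; the first line is the request line.
    for line in raw_header.split("\r\n")[1:]:
        if not line:
            break  # a blank line ends the header block
        name, sep, value = line.partition(":")
        # Header names are case-insensitive, so compare in lower case.
        if sep and name.strip().lower() == "referer":
            return value.strip()
    return None


# Hypothetical request, built around the example header shown above.
sample_request = (
    "GET /index.shtml HTTP/1.0\r\n"
    "Host: 127.0.0.1:8080\r\n"
    "Referer: http://www.anothersite.com/pageWithALink.html\r\n"
    "\r\n"
)
referer = extract_referer(sample_request)
```

A request typed directly into the browser's address bar normally carries no Referer line, which is why the function can return None.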

The more sites that link to your site, the more people are likely to visit your site. To create a successful site, one of your primary goals is to get others to link to you. This increases the chances that some random net surfer will stumble across a link to your site, visit it, and be so enthralled by your content that they'll come back again and again.

Before the Web exploded, you could actually make a decent estimate of how many links referred to your site by surfing around and counting them. Some of the more industrious folks created spiders that rummaged around finding links back to their sites. Once the Web began its exponential growth, this kind of exploratory accounting became impossible. Around that same time, the most recent release of the NCSA httpd server began creating a Referer log, using data passed to it from the browser that was connecting to your site. This data included the URL of the page currently displayed by the browser when it connected to your site. This URL, known as the referring page, gets written to the Referer log along with the document requested from your site.

This is very useful data that receives practically no attention. With a little bit of analysis, you can determine exactly which pages visitors are most often viewing just before they arrive at your site. If the same page shows up a lot, it almost certainly contains a link to your site. You may also find Referer entries for links between pages in your own site. These are a good indicator of the way people typically find pages on your site. For example, do they perform searches or follow the hierarchy of your site? How popular is your Site Map?
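As a rough illustration of that analysis, the following Python sketch separates referring pages on other sites from navigation within your own site and counts each. It is not part of the sample database; the function name summarize_referers and the host www.mysite.example are assumptions chosen for the example.

```python
from collections import Counter
from urllib.parse import urlsplit


def summarize_referers(referers, own_host):
    """Count external referring pages and internal navigation sources separately."""
    external = Counter()
    internal = Counter()
    for url in referers:
        # hostname is None for malformed URLs; treat those as external.
        host = urlsplit(url).hostname or ""
        if host.lower() == own_host.lower():
            internal[url] += 1
        else:
            external[url] += 1
    return external, internal


# Hypothetical Referer values collected from incoming requests.
sample_referers = [
    "http://www.anothersite.com/pageWithALink.html",
    "http://www.anothersite.com/pageWithALink.html",
    "http://www.mysite.example/sitemap.html",
]
external, internal = summarize_referers(sample_referers, "www.mysite.example")
```

Calling external.most_common() then lists the outside pages that send the most visitors, while internal shows which of your own pages (a Site Map, say) people navigate from.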

67RefererGrab and 67LinkerSample


67RefererGrab is a sample database (available in both Mac and PC format) that shows how to cull Referer information from browser requests. If you examine its Database Properties, you will find that 67RefererGrab is set to Publish Database at Startup on Port 8080 and Start without Context, and that it has a default HTML Root and Home Page set. "Use 4DVAR Comments instead of Brackets" is also checked. The database uses a non-standard web serving port (8080) so that the Referer methods can easily be tested by launching another web server on the same machine.

A second small database/web site, 67LinkerSample, is provided along with 67RefererGrab and can be opened at the same time using another copy of 4th Dimension. It is set to serve on port 80 and contains two sample web pages with links to 67RefererGrab's home page, hard-coded at 127.0.0.1 (the local loopback address).

The database 67RefererGrab itself contains only two methods: On Startup and RefererGrabber. The 67RefererGrab and 67LinkerSample sites consist of only three very simple, hand-written HTML pages: index.shtml, linker.html, and sublinker.html. The file index.shtml is the page we wish to track Referer information for; it contains a 4DSCRIPT tag which invokes the RefererGrabber project method (listed below) whenever the page is served. The files linker.html and sublinker.html are meant to be served from a separate web server and serve as the "referring pages" that contain links to our tracked page. Note that all three HTML files contain three http-equiv META tags that attempt to discourage browser caching.

On Startup

  

  ` 67RefererGrab Sample Database, November 2000
  ` by Eric Saltzen, 4D Inc. Technical Support
  ` Database Method: On Startup

  ` collect full HTTP_Header in HTTP_Requests table (see RefererGrabber method)
<>refererDiagnosticMode:=True
  ` write Referer information to log file (see RefererGrabber method)
<>refererLogFile:=True

Here we merely set some preferences for how we want RefererGrabber to operate. If <>refererDiagnosticMode is set to True, then the entire HTTP header text is stored in the [HTTP_Requests] table along with the Date, Time, URL and Referer. If <>refererLogFile is set to True, then all Referer information is appended to the file "RefererLog", which must already exist in the same folder as the database structure. This file is written in the standard format for web server Referer logs and can be analyzed by any log analysis program. For example, see Analog at:

http://www.statslab.cam.ac.uk/~sret1/analog/

RefererGrabber

  

  ` 67RefererGrab Sample Database, November 2000
  ` by Eric Saltzen, 4D Inc. Technical Support
  ` Project Method: RefererGrabber

ARRAY TEXT(nameArray;0)
ARRAY TEXT(valueArray;0)
C_TEXT(header;vtLogLine)
C_TEXT($requestedURL)
C_TIME(vhDocRef)
C_LONGINT($refererField;$XURLfield)

GET HTTP HEADER(nameArray;valueArray)
$refererField:=Find in array(nameArray;"Referer")
If ($refererField#-1)
   CREATE RECORD([HTTP_Requests])
   [HTTP_Requests]Request_ID:=Sequence number([HTTP_Requests])
   [HTTP_Requests]Date:=Current date
   [HTTP_Requests]Time:=Current time

   $requestedURL:=""  ` stays empty if no X-URL header is found
   $XURLfield:=Find in array(nameArray;"X-URL")
   If ($XURLfield#-1)
      $requestedURL:=valueArray{$XURLfield}
      [HTTP_Requests]URL:=$requestedURL
   End if
   [HTTP_Requests]Referer:=valueArray{$refererField}

   If (<>refererDiagnosticMode)  ` don't collect this much info about request normally
      GET HTTP HEADER(header)  ` get all text version of HTTP request header for posterity
      [HTTP_Requests]HTTP_Header:=header
   End if
   SAVE RECORD([HTTP_Requests])

   If (<>refererLogFile)
      vhDocRef:=Append document("RefererLog")
      If (OK=1)
           ` use $requestedURL rather than valueArray{$XURLfield}, which is an invalid index when X-URL is missing
         vtLogLine:=valueArray{$refererField}+" -> "+$requestedURL+Char(Carriage return)+Char(Line feed)
         SEND PACKET(vhDocRef;vtLogLine)
         CLOSE DOCUMENT(vhDocRef)  ` Close the document
      End if
   End if
End if

RefererGrabber uses the new 4th Dimension version 6.7 command GET HTTP HEADER to make short work of finding the Referer information and storing it in the [HTTP_Requests] table and optionally writing it to the RefererLog file. Note that $1 and $2 (normally the URL requested and full HTTP header text) cannot be accessed here because we are running as a 4DSCRIPT called while pre-processing an HTML page, not a 4DACTION called directly as a URL. Instead, we manually locate the information we need in the arrays returned by the GET HTTP HEADER command using the command Find in array.
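Each line that RefererGrabber appends to RefererLog has the form "referring page -> requested URL", terminated by a carriage return and line feed. As a minimal sketch of what a log analysis step might do with such a file, the Python below (not part of the sample database) tallies how often each link is followed. The function name parse_referer_log and the sample entries are made up for illustration; real entries depend on what the browser and 4th Dimension actually report.

```python
from collections import Counter


def parse_referer_log(text):
    """Yield (referer, requested_url) pairs from RefererLog-style text."""
    for line in text.splitlines():  # splitlines treats CR+LF as one line break
        referer, sep, target = line.partition(" -> ")
        if sep and referer:  # skip blank or malformed lines
            yield referer.strip(), target.strip()


# Hypothetical log contents, modeled on the two linker pages of 67LinkerSample.
sample_log = (
    "http://127.0.0.1/linker.html -> /index.shtml\r\n"
    "http://127.0.0.1/sublinker.html -> /index.shtml\r\n"
    "http://127.0.0.1/linker.html -> /index.shtml\r\n"
)
link_counts = Counter(parse_referer_log(sample_log))
top_link, hits = link_counts.most_common(1)[0]
```

In practice the text would be read from the RefererLog file itself instead of a string literal.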

Summary


This Technical Note and sample database provide an easy way to begin collecting Referer information with 4th Dimension version 6.7. Note that in order to activate Referer logging for any page on your web site, you must place the following tag inside the HTML for that page:

<!--4DSCRIPT/RefererGrabber-->

Also, the name of any page so enabled must end in the ".shtml" extension, which indicates that the page has server-side include elements (4D HTML tags) that need to be processed by 4th Dimension.

