The Windows Log Format 1.1

Bob Denny (May 13, 1998)

Introduction

WebSite Professional 2.3 supports a new revision (1.1) of the Windows Log Format (WLF). WLF was originally designed to make it easy to import log data into most Windows office productivity packages, such as Microsoft Excel and Microsoft Access. Because the fields are tab-delimited, WLF log entries are also much easier (and faster) to parse using Visual Basic or Perl. WLF log files are, however, larger than those in the NCSA Common and Combined formats (which are also supported by WebSite).

The WLF 1.1 revision addresses issues arising from the locale-dependent format of the date/time field in version 1.0. The date/time format varied with the Regional Settings of the host operating system, which created problems for developers of log analysis programs. Since Excel, Access, etc., can accept date/time strings of various formats for import, we decided to change to a fixed numeric format in WLF. The month and day ordering follows the US convention (month first), chosen because most of our customers are in the US and will be able to import the logs into Access, etc., with no special action.

NOTE: Some fields contain data taken directly from HTTP request header fields. If the browser sends corrupted or illegal information in a relevant header field, the log entry will contain this corrupted or illegal information.

Description of Windows Log Entries

Each HTTP request received by WebSite is logged as a single line. Fields in the log entry are tab-delimited text. There is no "encoding" or other alteration of the values in log entries. They are logged verbatim. The table below describes each field, starting with the leftmost (first) field in an entry. Tabs are included for empty fields.
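As a rough illustration of this layout, the following Python sketch splits one entry into its 13 fields. The field names and the sample entry are invented for the example; WLF itself only numbers the fields (see the table below).

```python
# Illustrative names only: the WLF description numbers the fields, it does not name them.
WLF_FIELDS = (
    "timestamp", "client", "server_host", "auth_realm", "auth_user",
    "method", "path", "referer", "from_email", "user_agent",
    "status", "content_bytes", "elapsed_ms",
)


def split_wlf_line(line):
    """Split one WLF 1.1 entry into a dict keyed by the illustrative names above."""
    # Remove only the line terminator. Tabs must be preserved, because
    # empty fields are still represented by their delimiting tabs.
    values = line.rstrip("\r\n").split("\t")
    if len(values) != len(WLF_FIELDS):
        raise ValueError(
            "expected %d fields, got %d" % (len(WLF_FIELDS), len(values))
        )
    return dict(zip(WLF_FIELDS, values))


# A made-up entry with empty realm, username, referer, and From fields.
sample = "\t".join([
    "05/13/1998 17:42:08", "192.0.2.10", "www.example.com", "", "",
    "GET", "/index.html", "", "", "ExampleBrowser/1.0", "200", "2048", "31",
])
entry = split_wlf_line(sample + "\n")
print(entry["method"], entry["path"], repr(entry["auth_realm"]))
```

Every value is kept as the string that was logged, since the entries are written verbatim. Any interpretation (numbers, dates) is left to a later step.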

Windows Log Entry Fields

Field Description

1 The date and time according to the server's system clock at which the request was received.
The format is:
MM/DD/YYYY HH:MM:SS
which includes the full 4-digit year. The time is UTC (GMT) and is always in 24-hour format.

2 The IP address of the remote client/browser. This will be the remote client's DNS hostname if DNS reverse lookup is enabled (not recommended).

3 The server hostname for the request. If the HTTP request contains a Host: header, and that hostname is a configured identity, the name in that header is used. Otherwise, the hostname configured for the IP address on which the request was received is used.

4 The authentication realm, if present in the request. NOTE: Presence of this field does not imply that the requested object was access-controlled. This is taken from the string sent by the browser in the Authorization: header field, and decoded by the server.

5 The authentication username, if present in the request. NOTE: Presence of this field does not imply that the requested object was access-controlled. This is taken from the string sent by the browser in the Authorization: header field, and decoded by the server.

6 The HTTP method of the request (e.g., GET or POST).

7 The path portion of the HTTP request URL. This may or may not contain the URL query string (if present in the request), depending on a WebSite server configuration setting.

8 The complete referring URL, if present in the request. Most of the time, if this is present, it is the complete URL of the document that contained the link that generated this request. This is the string sent by the browser in the Referer: header field.

9 The email address of the client/browser user. This is the string sent by the browser in the From: header field. This field is not currently generated by any known browser, due to privacy concerns.

10 A string describing the client/browser software and version. This is the string sent by the browser in the User-Agent: header field.

11 The numeric status code of the request, for example 200 for OK.

12 The number of actual content bytes transferred in the response. This does not count HTTP response header bytes.

13 The time, in milliseconds, between the arrival of this request and the time it was logged. This includes not only the processing time, but also the time it took to receive any content data provided with the request (e.g., form data), transmit the response to the client, close the TCP connection, and clean up thereafter. Due to buffering within the TCP/IP kernel, this value may be optimistic: several kilobytes of data can remain buffered beyond the time the server closes the connection.
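
The fixed date/time format and the numeric fields can be converted to native types for analysis. The Python sketch below is one way to do this, not a reference implementation. It assumes that fields 11 through 13 are always logged as plain decimal integers, which the description above implies but does not state. The function name and the sample entry are hypothetical.

```python
from datetime import datetime, timezone


def parse_wlf_entry(line):
    """Return (received_at, status, content_bytes, elapsed_ms) for one WLF 1.1 line."""
    fields = line.rstrip("\r\n").split("\t")
    if len(fields) != 13:
        raise ValueError("expected 13 fields, got %d" % len(fields))

    # Field 1: MM/DD/YYYY HH:MM:SS, month first, 24-hour clock, UTC.
    received_at = datetime.strptime(fields[0], "%m/%d/%Y %H:%M:%S")
    received_at = received_at.replace(tzinfo=timezone.utc)

    # Fields 11-13: assumed here to be plain decimal integers.
    status = int(fields[10])
    content_bytes = int(fields[11])
    elapsed_ms = int(fields[12])
    return received_at, status, content_bytes, elapsed_ms


# A made-up entry; 03/04/1998 is March 4 under the month-first ordering.
sample = "\t".join([
    "03/04/1998 09:05:00", "192.0.2.10", "www.example.com", "", "",
    "POST", "/cgi-bin/form", "", "", "ExampleBrowser/1.0", "200", "512", "120",
])
received_at, status, content_bytes, elapsed_ms = parse_wlf_entry(sample)
print(received_at.isoformat(), status, content_bytes, elapsed_ms)
```

Attaching the UTC time zone explicitly keeps later conversions to local time unambiguous. Because the other fields are logged verbatim from request headers, a production parser would want to treat them as untrusted text.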