
Sys Admin Magazine > Archives > 2002 > July

Parsing and Summarizing a Logfile

Randal L. Schwartz

I recently put www.stonehenge.com behind a caching reverse-proxy, and rather than switch technologies, I'm using another instance of a stripped-down Apache server to do the job. But what kind of job is it doing? How many of my hits are being cached and delivered by the lightweight front servers, instead of going all the way through to the heavy mod_perl_and_everything_else backend servers?

Luckily, I have included the caching information in the access log file, thanks to the CustomLog and LogFormat directives:

LogFormat "[virt=%v] %h %l %u %t \"%r\" %>s %b \"%{Referer}i\" \
   \"%{User-Agent}i\" \"%{X-Cache}o\"" combined

CustomLog var/log/access_log combined

I have added a virtual host entry (for tracking) to the front of the line, and the X-Cache header of the response to the end of the line. Of course, doing so means my access log is not in a standard format any more, so I can't use off-the-shelf tools for log analysis. That's okay, because I'm pretty good at writing my own data-reduction tools. A typical output line looks like this:

[virt=www.stonehenge.com] 192.168.42.69 - - [10/May/2002:01:51:50 \
  -0700] "GET /merlyn/UnixReview/ HTTP/1.0" 200 101324 "-" \
  "Mozilla/4.0 (compatible; MSIE 5.0; Windows NT; DigExt)" "MISS \
  from www.stonehenge.com"
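A line in this format can be taken apart with a single regular expression. The following is a minimal sketch in Python, not the tool described in this article. The names LOG_LINE and parse_line and the group names are hypothetical, chosen for illustration, and the sample is the line above joined back into one physical line. It assumes no field contains an embedded double quote, which a production parser would have to handle.

```python
import re

# One named group per field of the custom LogFormat shown above.
LOG_LINE = re.compile(
    r'\[virt=(?P<virt>[^\]]*)\] '
    r'(?P<host>\S+) (?P<ident>\S+) (?P<user>\S+) '
    r'\[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) (?P<bytes>\S+) '
    r'"(?P<referer>[^"]*)" '
    r'"(?P<agent>[^"]*)" '
    r'"(?P<xcache>[^"]*)"\s*$'
)

def parse_line(line):
    """Return a dict of fields, or None if the line does not match."""
    m = LOG_LINE.match(line)
    return m.groupdict() if m else None

# The example line from the article, unwrapped.
sample = (
    '[virt=www.stonehenge.com] 192.168.42.69 - - '
    '[10/May/2002:01:51:50 -0700] '
    '"GET /merlyn/UnixReview/ HTTP/1.0" 200 101324 "-" '
    '"Mozilla/4.0 (compatible; MSIE 5.0; Windows NT; DigExt)" '
    '"MISS from www.stonehenge.com"'
)

fields = parse_line(sample)
print(fields["time"], fields["status"], fields["xcache"])
```

Returning None for a non-matching line lets a caller count or skip malformed entries instead of stopping on the first one.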

For my analysis, I wanted to see how many of those X-Cache fields began with HIT or MISS, indicating that the mod_proxy module had gone all the way to the backend server, and had either gotten a good cacheable hit or had to regenerate the content. I also wanted the data summarized on an hour-by-hour basis, in a CSV-style file so I could pull it into my favorite spreadsheet to do graphs and formulas.
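The hour-by-hour reduction described here might be sketched as follows. This is an illustrative Python outline under stated assumptions, not the program developed in this article. The names STAMP_AND_XCACHE, summarize, and write_csv are hypothetical, as are the CSV column names. The demo lines are invented test data in the log format above, and hours are taken as logged, ignoring the timezone offset.

```python
import csv
import io
import re
from collections import defaultdict

# Capture the date and hour from the timestamp, and the last quoted
# field on the line, which is the X-Cache value in this log format.
STAMP_AND_XCACHE = re.compile(
    r'\[(\d{2})/(\w{3})/(\d{4}):(\d{2}):\d{2}:\d{2} [+-]\d{4}\].*"([^"]*)"\s*$'
)

def summarize(lines):
    """Count HIT, MISS, and other X-Cache values per (date, hour)."""
    counts = defaultdict(lambda: {"HIT": 0, "MISS": 0, "OTHER": 0})
    for line in lines:
        m = STAMP_AND_XCACHE.search(line)
        if not m:
            continue  # skip lines that do not fit the format
        day, mon, year, hour, xcache = m.groups()
        words = xcache.split()
        kind = words[0] if words and words[0] in ("HIT", "MISS") else "OTHER"
        counts[(f"{day}/{mon}/{year}", hour)][kind] += 1
    return counts

def write_csv(counts, out):
    """Write one CSV row per hour, in the order hours appear in the log."""
    writer = csv.writer(out)
    writer.writerow(["date", "hour", "hit", "miss", "other"])
    for (date, hour), c in counts.items():
        writer.writerow([date, hour, c["HIT"], c["MISS"], c["OTHER"]])

# Invented demo data (not from a real log).
TEMPLATE = ('[virt=www.stonehenge.com] 192.168.42.69 - - '
            '[10/May/2002:{t} -0700] "GET / HTTP/1.0" 200 1234 "-" "-" "{x}"')
demo = [
    TEMPLATE.format(t="01:51:50", x="MISS from www.stonehenge.com"),
    TEMPLATE.format(t="01:59:02", x="HIT from www.stonehenge.com"),
    TEMPLATE.format(t="02:03:17", x="HIT from www.stonehenge.com"),
    TEMPLATE.format(t="02:04:00", x="-"),
]

buf = io.StringIO()
write_csv(summarize(demo), buf)
print(buf.getvalue(), end="")
```

Keeping an OTHER column makes requests with no X-Cache header visible in the spreadsheet instead of silently dropping them from the totals.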



