Logfile splitting

One logfile
Usually, an httpd server logs all requests in a single log file.
On an NCSA or Apache server, the log is named access_log.
On a CERN server, the log is named httpd-log.
In the config.pl file, you have to set : $zip = 0

As traffic increases, most providers now support compressed logfiles to save disk space. Some use daily compressed logfiles, others use monthly compressed logfiles. W3Perl is able to cope with both.


Multi logfiles
In the configuration file, select the logfile name format you are using.

The log filename string can be whatever you want, built from the following fields :

  • prefixlog is the constant string in your logfile name (ex : access_log),
  • day is a two-digit number (from 01 to 31),
  • smallday is a number without leading zero (from 1 to 31),
  • month is a two-digit number (from 01 to 12),
  • smallmonth is a number without leading zero (from 1 to 12),
  • lettermonth is a three-letter string with the first letter in upper case (from Jan to Dec),
  • year is a four-digit number (ex : 1998),
  • smallyear is a two-digit number (ex : 98),
  • rotate is the Apache rotation index number (ex : 4),
  • and suffix is the compression extension used on your system (ex : gz or zip).

Examples of supported filename strings :

  • access.log.2.gz : %prefixlog.%smallmonth
  • access_log.1998Mar.gz : %prefixlog.%year%lettermonth
  • 1998.03.10.raw.zip : %year.%month.%day.%prefixlog
  • log.03-12-1998.gz : %prefixlog.%day-%month-%year
  • in.9904 : %prefixlog.%smallyear%month
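
The mapping from a format string to a filename can be sketched in a few lines of Python. This is only an illustration of the fields listed above, not W3Perl's own code: the function name expand_log_name and its parameters are hypothetical, and it assumes (as the examples suggest) that the compression suffix is appended after the expanded format string.

```python
import re
from datetime import date

# English month abbreviations, independent of the system locale.
MONTH_ABBR = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
              "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]


def expand_log_name(template, prefixlog, when, rotate=None, suffix=""):
    """Expand %field placeholders in template (hypothetical helper).

    when is the date the logfile covers, rotate the rotation index (if any),
    suffix the compression extension without the leading dot.
    """
    fields = {
        "prefixlog": prefixlog,
        "day": f"{when.day:02d}",
        "smallday": str(when.day),
        "month": f"{when.month:02d}",
        "smallmonth": str(when.month),
        "lettermonth": MONTH_ABBR[when.month - 1],
        "year": f"{when.year:04d}",
        "smallyear": f"{when.year % 100:02d}",
        "rotate": "" if rotate is None else str(rotate),
    }
    # Try the longest field names first so a short name never shadows a long one.
    names = sorted(fields, key=len, reverse=True)
    pattern = re.compile("%(" + "|".join(names) + ")")
    name = pattern.sub(lambda m: fields[m.group(1)], template)
    # Assumption: the suffix is not part of the format string and is added last.
    if suffix:
        name += "." + suffix
    return name


print(expand_log_name("%prefixlog.%year%lettermonth", "access_log",
                      date(1998, 3, 1), suffix="gz"))
```

Called with the format strings from the list above, this sketch reproduces the example names, for instance in.9904 for %prefixlog.%smallyear%month with prefix "in" and a date in April 1999.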

Filename format and example files for each logging scheme :

  • One file : %prefixlog
    access_log
  • One file compressed : %prefixlog
    access_log.gz
  • Apache rotation : %prefixlog.%rotate
    access.log.4.gz, access.log.3.gz, access.log.2.gz, access.log.1, access.log
  • Daily : %prefixlog.%year%month%day
    access_log.19990303, access_log.19990304, access_log.19990305, access_log.19990306, access_log
  • Daily compressed : %prefixlog.%lettermonth-%day-%year
    access.Apr-22-1999.zip, access.Apr-23-1999.zip, access.Apr-24-1999.zip, access.Apr-25-1999.zip, access
  • Monthly : %prefixlog.%year%month
    access_log.199903, access_log.199904, access_log.199905, access_log.199906, access_log
  • Monthly compressed : %prefixlog.%smallyear-%lettermonth
    access_log.99-Oct.gz, access_log.99-Nov.gz, access_log.99-Dec.gz, access_log.00-Jan.gz, access_log


Compressed files
A number of ISPs provide logfiles in a compressed format to save space.
Use $zip = 1 to tell W3Perl you are using compressed logfiles.
If $zipcut is set to 1 in the configuration file, the package will search for monthly compressed logfiles; if it is set to 2, daily logfiles will be used. Setting this value to 3 means you are using Apache rotation logfiles.


Save space
If you are using one big logfile and want to save disk space, you can use squeezelog to cut and compress the log file each month. The compressed file is around 90% smaller than the original, saving a lot of disk space.
It replaces the rotate-log utility provided with the httpd server. If you are running IIS, using a monthly logfile would be my recommendation.
W3Perl will also run faster, as it will only have to scan the current month's logfile.
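
To make the monthly cut-and-compress idea concrete, here is a minimal Python sketch. It is not squeezelog itself, whose internals are not described here. The function name squeeze_previous_month is hypothetical, and it assumes the archive is named after the previous month in the %prefixlog.%year%month format with a gz suffix, and that the job runs on the first day of the month.

```python
import gzip
import shutil
from datetime import date, timedelta
from pathlib import Path


def squeeze_previous_month(log_path, today):
    """Compress log_path into <name>.<YYYYMM>.gz and empty the original.

    Hypothetical helper: today is the date the job runs on, so the data
    in the logfile is attributed to the month before it.
    """
    log_path = Path(log_path)
    # Last day of the previous month, whatever day of the month today is.
    last_month = today.replace(day=1) - timedelta(days=1)
    archive = log_path.with_name(f"{log_path.name}.{last_month:%Y%m}.gz")
    # Stream the log into the gzip archive without loading it into memory.
    with open(log_path, "rb") as src, gzip.open(archive, "wb") as dst:
        shutil.copyfileobj(src, dst)
    # Truncate in place so the logfile keeps its name and location.
    with open(log_path, "w"):
        pass
    return archive
```

For example, squeeze_previous_month("access_log", date(1999, 4, 1)) would produce access_log.199903.gz next to an emptied access_log. Requests logged between the copy and the truncation would be lost in this sketch, so a real tool needs to handle that window more carefully.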

You need to edit the squeezelog file and configure it for your own system. The program should be added to your crontab and executed once on the first day of each month.

Example :
01 00 1 * * /usr/local/bin/perl /norfolk/www-data/w3perl/squeezelog

(If the logfiles are owned by root, you should ask your system administrator to install squeezelog.)