AWStats was useful for integrating access logs from multiple servers
This is Yamada from the Systems Department.
This time, I'll show how to combine Apache access logs that are spread across multiple web servers into a single file.
How it started
The other day, I had to analyze the access logs of a certain site, and it
turned out that the site ran on multiple load-balanced web servers.
Naturally, each server keeps its own access log.
The site has been around for a long time and doesn't have nice tools like fluentd.
Since the request was somewhat urgent, I wanted a quick way to merge the logs chronologically into one file.
And there it was. A good tool.
"logresolvemerge.pl"
This is a script bundled with "awstats", an access analysis tool that I haven't seen much of lately.
Nowadays "Google Analytics" is the famous name in access analysis, but
I remember that a while ago "awstats" was used as a matter of course. (Source: Yamada's own research)
So what can this script do?
This runs logresolvemerge in command line to open one or several
server log files to merge them (sorted on date) and/or to make a reverse
DNS lookup (if asked). The result log file is sent on standard output.
In other words, logresolvemerge works on the command line to
merge one or several web server log files (sorted by date) and
(optionally) perform a reverse DNS lookup. The merged log is written to standard output.
Source: AWStats logfile analyzer 7.5 Documentation
Oh! This is exactly what I needed!
Usage:
  logresolvemerge.pl [options] file
      (a single input file)
  logresolvemerge.pl [options] file1 ... filen
      (date sorting and merging of multiple files)
  logresolvemerge.pl [options] *.*
      (sort and merge all matching logs by date)
  perl logresolvemerge.pl [options] *.* > newfile
      (sort and merge all logs in the directory by date and write to newfile)
Options:
  -dnslookup       make a reverse DNS lookup on IP addresses
  -dnslookup=n     same with n parallel threads instead of serial requests
  -dnscache=file   make DNS lookup from cache file first before network lookup
  -showsteps       print on stderr benchmark information every 8192 lines
  -addfilenum      if used with several files, file number can be added in first
                   field of output file.
  -addfilename     if used with several files, file name can be added in first
                   field of output file. This can be used to add a cluster id
                   when log files come from several load balanced computers.
  -stoponfirsteof  Stop processing when any logfile reaches end-of-file.
  -printfields     For IIS or W3C logs, prints the latest field header for the
                   current log file when switching between log file entries so
                   that the parser can automatically determine which fields are
                   available.
As cautions, the documentation states that "there is no guarantee," "sorting is not exact," and "it is not a tool for sorting a single file."
It's a free tool, so don't be upset if you run into problems.
Let's try it
First, installation
Fortunately, "awstats" was already installed on this server, but
if it is not installed on yours, you can fetch it with the commands below.
cd /usr/src/
wget http://prdownloads.sourceforge.net/awstats/awstats-7.3.tar.gz
tar zxvf awstats-7.3.tar.gz
If you would rather install with yum, use the command below. (To be honest, if you only want logresolvemerge.pl, I think the source archive alone is enough.)
yum install --enablerepo=epel awstats
Now let's actually use the tool.
Log integration
The working directory here is /var/tmp, but please change it as appropriate.
*When awstats is installed from source as above, its directory is "/usr/src/awstats-7.3".
For now, let's assume that the necessary access logs are placed in the working directory.
Here we go! Access log integration!
cd /var/tmp/
perl /usr/src/awstats-7.3/tools/logresolvemerge.pl web01-access_log web02-access_log > merged-access_log
less merged-access_log
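Before trusting the merged file, a quick sanity check doesn't hurt. The sketch below compares line counts, assuming that logresolvemerge.pl writes every input line exactly once. I haven't confirmed that assumption from the documentation, so treat a mismatch as a reason to look closer, not as proof that the merge failed. The function name check_merged is my own and is not part of awstats.

```shell
#!/bin/sh
# check_merged MERGED SOURCE...
# Hypothetical helper (not part of awstats): succeeds when MERGED has
# exactly as many lines as all SOURCE files put together.
# Assumption: the merge tool emits each input line exactly once.
check_merged() {
    merged=$1
    shift
    # Read through stdin so wc prints only the number; $(( )) also
    # strips the padding that some wc implementations add.
    expected=$(( $(cat "$@" | wc -l) ))
    actual=$(( $(wc -l < "$merged") ))
    if [ "$expected" -eq "$actual" ]; then
        echo "OK: $actual lines"
    else
        echo "MISMATCH: expected $expected lines, got $actual" >&2
        return 1
    fi
}

# Example, run in the working directory:
#   check_merged merged-access_log web01-access_log web02-access_log
```

This only checks that nothing was lost or duplicated. It says nothing about whether the lines ended up in the right order, so a look through the file with less is still worthwhile.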
Oh my god!
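Once the logs are merged, you can no longer tell which server handled which request. The usage text lists an -addfilename option for exactly this case, so here is a minimal sketch that uses it. The LRM variable and the merge_tagged function are names I made up for illustration, and the default path assumes the source install described above. I'm going only by the usage text for the behaviour: the file name should appear as the first field of each output line.

```shell
#!/bin/sh
# Path to the script; the default assumes the source install from this
# article. Override LRM if awstats lives somewhere else.
LRM=${LRM:-/usr/src/awstats-7.3/tools/logresolvemerge.pl}

# merge_tagged OUTPUT LOG...
# Hypothetical wrapper: merges the given logs by date and passes
# -addfilename so that each line is prefixed with its source file name
# (behaviour as described in the tool's usage text).
merge_tagged() {
    out=$1
    shift
    perl "$LRM" -addfilename "$@" > "$out"
}

# Example, run in the working directory:
#   merge_tagged merged-tagged-access_log web01-access_log web02-access_log
```

Keep in mind that the extra leading field changes the line format, so any tool that expects a plain Apache access log will need to be told about it. For that reason it may be safest to write the tagged version to a separate file, as in the example.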
By the way, there seem to be various other tools under the awstats tools directory.
maillogconvert.pl … Converts postfix, sendmail, and qmail logs into human-readable format
urlaliasbuilder.pl … Generates a URL alias file from a URL list file
Please give them a try if you're interested!