The Importance of Using zgrep
No, I have not read the Oscar Wilde play, but one grave, serious Java programmer said in response to my cubicle ranting, “Hey, what is zgrep?” It was a shock to me not to find this Linux command on Wikipedia.org. In short, zgrep runs grep on gzip-compressed files without requiring you to decompress them first. The command is important to me because the serious Linux admins who run my servers at hosting companies ‘ship’ my Web access logs in a compressed (gzipped) format.
Some of these Linux admins work for hosting companies where the policy is not to hand out these logs on a day-by-day basis. A promising alternative is to receive your Web traffic logs once a month. My current hosting service allows me to specify that my monthly logs should be ‘shipped’ to a local directory under my home shell account. After being abused by the practices of DreamHost.com, I naturally did not assume that what I configure in a Web host control panel would actually happen. Access to the daily, accruing traffic data is, however, available before month end. So it seemed a prudent defensive move to write a script to break this accruing data into daily logs (in gzip format) just in case my monthly logs did not show up (DreamHost.com, by the way, does this for you automatically).
A rough sketch of this script is available at SonghaySystem.com. Before I embarked on this journey into the land of bash, I had to find out whether my problem was actually a recognized problem for other people. Yes: the Apache Foundation has a Perl-based tool called split-logfile. My savage ignorance of Perl led me to look at bash in “Read File line by line with Shell Script,” which turned out to be hopelessly slow for large Web logs. Then some person named “hans” at kriyayoga.com introduced me to zgrep in “Split logfiles using grep/zgrep.” His “hard-coded” examples for one file had to be made generic with parameters for multiple files. This need led me to using printf in a bash
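For readers who want to see the general idea, here is a minimal sketch of splitting one gzipped monthly log into daily gzipped logs with zgrep and printf. It is not the script posted at SonghaySystem.com, and not hans's original example either. The function name split_log_by_day, its parameters, and the output file naming are all made up for illustration, and it assumes the usual Apache log timestamp of the form [05/Mar/2008:10:15:32 -0800].

```bash
#!/usr/bin/env bash
# Hypothetical helper (not from the original script): split one gzipped
# monthly access log into daily gzipped logs.
# Assumption: each line carries an Apache-style timestamp such as
# [05/Mar/2008:10:15:32 -0800].
split_log_by_day() {
    local src="$1"      # gzipped (possibly still accruing) monthly log
    local month="$2"    # three-letter month as it appears in the log, e.g. Mar
    local year="$3"     # four-digit year, e.g. 2008
    local outdir="$4"   # directory that receives the daily files
    local d day out

    mkdir -p "$outdir"
    for ((d = 1; d <= 31; d++)); do
        # printf zero-pads the day so that 5 becomes 05, as in the log
        day=$(printf '%02d' "$d")
        out="$outdir/access-$year-$month-$day.log"
        # zgrep searches the compressed file directly; -F treats the
        # pattern as a fixed string, so the leading [ is taken literally
        if zgrep -F "[$day/$month/$year:" "$src" > "$out"; then
            gzip -f "$out"      # keep the daily log in gzip format
        else
            rm -f "$out"        # no traffic for that day: leave no empty file
        fi
    done
}
```

Calling it as split_log_by_day access.log.gz Mar 2008 daily would leave files such as daily/access-2008-Mar-05.log.gz, one for each day that has entries. Each zgrep call decompresses the whole file again, so this makes up to 31 passes over the log; that is a trade of speed for simplicity.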
So it turned out that my monthly logs turned up just fine and I did not need to use this stuff at all…