You can’t be your own ISP without having some pretty bar graphs and pie charts to impress your friends with how well you’re doing. And no one does them better than The Webalizer. So today we want to add Webalizer to our ISP-In-A-Box, and henceforth you’ll have daily statistics for your web site that you can review and analyze ad nauseam. These include summaries of hits, files, pages, and kilobytes for each hour of the day, each day of the week, each URL on your web site, and each entry and exit page of your site. Plus you get listings of the top referrers to your pages, the top search strings, the top user agents, and totals by Apache response code. Not bad for just installing a free piece of software. Right?
Well, not so fast! Webalizer, as it turns out, is one of thousands of little Unix gems sitting out there that are virtually worthless in the current Mac world unless you have a fairly good grasp of Unix, because no one has taken the time lately to actually make them work and document what it takes. One would think with all the resources that Apple pours into hardware and software development (not to mention publicity), they could hire just one person to comb through applications (like Webalizer) and clean up the installation routines to keep them up to date with the shipping version of their OS. Alas, we don’t live in a perfect world, do we? The bottom line is that if you simply download the Webalizer package, which incidentally claims to have a Mac OS X installer, it won’t work.
So let’s be the good citizens that we are and at least put the pieces together so that it’s usable with Mac OS X v10.3, aka Panther. I’m anything but a Unix guru, so you’ll probably want to read the comments to this article (from some real experts) that will tell you all the shortcuts I could have taken if only I had known what I was doing. As they say, you get what you pay for. But you never know. Some energetic whiz kid may come along and read what we’ve done and decide to automate the whole process with a script. That would be great, at least until Mac OS X v10.4 is released. Then we’re back to square one again. See what I mean about having an Apple employee do it?
Here’s our plan of attack with Webalizer. We’re going to download the Webalizer package and then manually put the pieces where they should go to make things work smoothly. We’ll build a directory off of our main web site to house the daily Webalizer web pages. I’ll provide you a cleaned up configuration file to download and drop in the /etc folder on your server so Webalizer can find it. The config file just tells Webalizer where we’ve put stuff. Then we’ll clean out the old Apache log file and tweak the Apache web server config file to output more detailed logs so that Webalizer can paint pretty pictures for you. After restarting the web server, you’ll have a new Apache log file to support Webalizer. Finally, we’ll introduce crontab and try out our Webmin program from last week to schedule Webalizer to update its data once a day. Then you’ll be able to go to http://localhost/webalizer or your Internet address and look at all the statistical information about your web site whenever you wish.
Prerequisites. Beginning with this chapter, we’ll list the other ISP-In-A-Box projects you must complete before starting this one. For the Webalizer project, you first must enable the Apache Web Server and access at least one web site on your local machine. This was all covered in our first ISP-In-A-Box installment. You’ll also need to install and activate Webmin to complete the optional crontab portion of this tutorial.
Obtaining The Webalizer. We’re going to be using Webalizer 2.01-10, which is the current stable version of the software. It’s available from a number of sources. The easiest is probably MacUpdate, but it’s also available for Mac OS X on the Webalizer web site. This software is packaged as a tarball so, once you download it to your desktop, it should decompress into a folder named webalizer-2.01-10-macosx. You also need to download my customized version of the Webalizer config file. Just Control-Click here and Save the Linked File to your desktop as webalizer.conf. Once the download completes, drag it into the Webalizer installation folder to keep things tidy. Now drag the Webalizer installation folder to your Applications folder. We’ll work with it from there. Do not run either of the installation scripts!
For those that don’t trust their mother (much less their teacher), here’s what I did with the config file. I started with the sample.conf file which is in the Webalizer download folder. However, it had the wrong Mac location for the Apache log file (which is what Webalizer uses to prepare its charts and data), and we needed a customized web site location to house the Webalizer web pages, so I’ve plugged that in as well. If you’d like to look for yourself, open the file with TextEdit, not WorldText. For now, don’t change anything else in the config file, or you’re on your own.
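If you would rather check those two settings from Terminal than open the file in an editor, something like the following should work. LogFile and OutputDir are the Webalizer directives for the log file location and the web page directory. The function name show_webalizer_paths is made up for this sketch, and it assumes the config file sits in the installation folder as described above.

```shell
# Hypothetical helper: print the active LogFile and OutputDir lines
# from a Webalizer config file (defaults to ./webalizer.conf).
show_webalizer_paths() {
    conf="${1:-webalizer.conf}"
    # Only match uncommented directives at the start of a line.
    grep -E '^(LogFile|OutputDir)[[:space:]]' "$conf"
}

# Example (path taken from the steps above):
#   show_webalizer_paths /Applications/webalizer-2.01-10-macosx/webalizer.conf
```

The two lines it prints should point at your Apache access log and at the folder where the Webalizer pages will live.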
Apache Housekeeping. As mentioned, we have to do a couple of things with the Apache web server to get the most out of The Webalizer. We’re going to modify the log file format so that we get more informative statistics. Then we’re going to delete the current log file (actually we’ll rename it so you don’t get too nervous). And finally we will restart the Apache web server, which will build us a new log file with the proper format for The Webalizer.
Open a Terminal window by going to your Applications/Utilities folder and clicking Terminal.
Switch to root user access: sudo su
Provide your admin password if prompted.
Now let’s move to the directory where the Apache configuration file is stored: cd /etc/httpd
Let’s make a copy of our config file just in case something goes wrong: cp httpd.conf httpd.conf.save
Then you could copy it back if you need to.
Now let’s edit the config file: pico httpd.conf
Be careful here! Let’s first find where we need to make our logfile format change: press Ctrl-W, type logformat, and then press enter.
Now press the down-arrow key exactly 12 times. You should be at the beginning of a line which reads: CustomLog "/private/var/log/httpd/access_log" common
Insert a pound sign at the beginning of this line by pressing #.
Now press the down-arrow key exactly 13 times. You should be at the beginning of a line which reads: #CustomLog "/private/var/log/httpd/access_log" combined
Delete the pound sign at the beginning of this line by pressing Ctrl-D. The # sign should disappear.
Now save your changes: Ctrl-X, Y, and press enter.
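For the curious, the same two edits could be scripted instead of counting arrow-key presses. This is only a sketch, not what we did above: it assumes the two CustomLog lines look exactly like the ones quoted in the steps, and the function name switch_to_combined is invented for illustration. It makes the same httpd.conf.save backup first.

```shell
# Hypothetical helper: comment out the "common" CustomLog line and
# uncomment the "combined" one in the given Apache config file.
switch_to_combined() {
    conf="$1"
    # Keep a backup, just as in the manual steps.
    cp "$conf" "$conf.save" || return 1
    # Read from the backup and write the edited copy over the original.
    sed -e 's|^CustomLog \(.*\) common$|#CustomLog \1 common|' \
        -e 's|^#CustomLog \(.*\) combined$|CustomLog \1 combined|' \
        "$conf.save" > "$conf"
}

# Example (as root):
#   switch_to_combined /etc/httpd/httpd.conf
```

If your config file has been customized so that the lines differ, the sed patterns simply won’t match and the file comes out unchanged, so it is worth looking at the result before restarting Apache.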
We’ve configured Apache to generate log entries in the new format, but we still have a log file in the old format. So let’s rename it.
Move to the Apache log file directory: cd /var/log/httpd
Now rename the log file: mv access_log access_log.save
To generate a new empty log file in the new format, we need to restart Apache. Click on the Apple icon in the upper-left corner of your screen, choose System Preferences, and click on the Sharing icon. Uncheck the check box beside Personal Web Sharing and wait for your web server to shut down. Now check the check box beside Personal Web Sharing to restart Apache. Command-Q closes System Preferences. That wasn’t so bad, was it?
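Here is the rename step wrapped up as a small function, in case you end up doing this more than once. The function name set_aside_log is hypothetical. The commented apachectl line is an alternative I’m assuming you could use in place of the System Preferences dance; apachectl is the control script that ships with Apache, but it is not what the steps above use.

```shell
# Hypothetical helper: rename access_log to access_log.save in the
# given log directory (defaults to the Apache log directory used above).
set_aside_log() {
    log_dir="${1:-/var/log/httpd}"
    if [ ! -f "$log_dir/access_log" ]; then
        echo "no access_log found in $log_dir" >&2
        return 1
    fi
    mv "$log_dir/access_log" "$log_dir/access_log.save"
}

# Example (as root):
#   set_aside_log
#   apachectl restart    # assumed alternative to toggling Personal Web Sharing
```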
Installing The Webalizer. Now we’re ready to install our Webalizer application. All we need to do is copy the application files to their permanent home and put the Webalizer config file in a place where Webalizer can find it when it runs. Last but not least, we need to create a directory to store our Webalizer web pages which the program will generate each day.
You should still have a Terminal session with root access open. If not, open one again using the instructions above. Now let’s move to the directory where our installation files are stored: cd /Applications/webalizer-2.01-10-macosx
There are only three files we need to copy to get Webalizer going:
To make sure everything works, first open a web browser and go to http://localhost. This will create an entry in your Apache log file.
Now run Webalizer once in a Terminal window: sudo /usr/local/bin/webalizer
Switch back to your web browser and go to http://localhost/webalizer/. Wasn’t that easy!
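If the referrer and user agent tables on that page come up empty, the log may still be in the old format. One rough way to check from Terminal is to count the quoted fields in the newest log entry: a combined entry has three (the request, the referrer, and the user agent), while a common entry has only one. The function name looks_combined is made up, and this is a quick heuristic rather than a real log parser.

```shell
# Hypothetical check: succeed if the last line of an Apache log
# has at least three double-quoted fields, as combined format does.
looks_combined() {
    log="${1:-/var/log/httpd/access_log}"
    # Splitting on double quotes gives 7 or more pieces for 3 quoted fields.
    tail -n 1 "$log" | awk -F'"' '{ ok = (NF >= 7) } END { exit (ok ? 0 : 1) }'
}

# Example:
#   looks_combined && echo "combined format" || echo "still common format"
```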
You can manually run Webalizer as we just did whenever you want to, or you can put an entry in your cron file and have your Mac run it automatically each day. We need to learn about cron files for some future projects anyway so let’s automate the process so your Webalizer statistics are generated once each day.
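To give you an idea of what a cron entry looks like before we let Webmin build one, here is a small sketch that prints a daily entry for a given hour and minute. It assumes the per-user crontab layout (minute, hour, day of month, month, day of week, then the command) and the /usr/local/bin/webalizer path we just ran by hand. The function name daily_webalizer_cron is invented for illustration.

```shell
# Hypothetical helper: print a crontab line that runs Webalizer once a day.
# Usage: daily_webalizer_cron HOUR MINUTE   (24-hour clock)
daily_webalizer_cron() {
    hour="$1"
    minute="$2"
    # Both values must be plain non-negative numbers.
    for n in "$hour" "$minute"; do
        case "$n" in
            ''|*[!0-9]*) echo "hour and minute must be numbers" >&2; return 1 ;;
        esac
    done
    if [ "$hour" -gt 23 ] || [ "$minute" -gt 59 ]; then
        echo "hour must be 0-23 and minute 0-59" >&2
        return 1
    fi
    # Crontab fields: minute hour day-of-month month day-of-week command
    printf '%s %s * * * /usr/local/bin/webalizer\n' "$minute" "$hour"
}

# Example: daily_webalizer_cron 2 30
```

The three asterisks mean every day of the month, every month, and every day of the week, which matches leaving All selected in the Webmin form.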
First start up Webmin if it’s not already running on your server: sudo /etc/webmin/start. Then open Webmin with your web browser: http://localhost:10000. Now choose System, Scheduled Cron Jobs and then click Create a New Scheduled Cron Job. The form shown above will display. Fill in the form with the values in italics:
Now look at the bottom section of the form and click on a minute and an hour using a 24-hour clock to designate when Webalizer should be run. Leave All selected for the Days, Weeks, and Months options. You might want to select a time a few minutes from now just to be sure everything works properly. Then you can adjust the time later by clicking on this cron job in the System, Scheduled Cron Jobs web page of Webmin. Once you have chosen a minute and hour, click the Create button to activate the Webalizer cron job. Now access your http://localhost web site several times. Then you can check your Webalizer web site after the time passes to be sure it updated the page hits from your last visits. That’s it for today. Enjoy!
Is there any way of password-protecting webalizer? In case we don’t want anyone that is not authorized to view the statistics.
[WM: Password-protecting web directories is pretty straight-forward. We covered it in our Web Sites 101 tutorial. Good luck.]
Finally had a chance to try password-protection, it works great!
Thanks for this great series.
GREAT series. Everything works perfectly!
How to install and configure webalizer to work on a server with many virtual domains?
Thanks for the series. I’m gradually moving away from my Windows XP system to OS X. One of the things that’s hard for me to get my head around is the BSD underpinnings. Apple hides it pretty well behind its user interface.
Occasionally it’s still necessary to get under the hood.
How about a tutorial on setting up SQUID?