Official dbWebLog Home

Description

dbWebLog is going to be the successor to pgLOGd. It is currently in development and as soon as a beta is ready it will be posted here.

dbWebLog is a totally new version that uses threads to achieve the same asynchronous functionality that the Postgres C-API offers. I'm doing this for a couple of reasons: first, to simplify the code a little; second, to support more databases. I would like the program to be useful to more people, but I'm afraid the masses are using things like MySQL and Oracle...

The major changes I'm making in the new version are:

  1. Threaded. This will allow me to support more backends via pluggable, configurable modules. It would also allow more than one backend to be written to at the same time. Finally, it keeps the server asynchronous and fast.

  2. Writer Modules. This goes hand-in-hand with the threaded operation and would allow multiple backends to be written to simultaneously. It would allow, for example, seamless switching of backend databases, or writing to flat files if necessary for testing. The initial "writer modules" would be PG, MySQL, Oracle, and Flat File. I don't have a MySQL or Oracle server running, so help in this area would be really nice once I get the writer layout established.

  3. Move away from the FIFO. I like the FIFO idea, but it really didn't work out for a couple of reasons. First, I didn't realize that a write to a FIFO that is larger than the system's PIPE_BUF limit is not atomic. (POSIX only guarantees that PIPE_BUF is at least 512 bytes; some systems allow more, Linux for example uses 4096.) I blindly ignored this in pgLOGd, so it is possible for log entries to interleave and thus be thrown out. Second, the FIFO must be created and opened for reading before Apache is started, which meant pgLOGd had to be started first. This makes start-up a pain. Apache also requires pgLOGd to be running all the time and will suffer degraded performance if pgLOGd crashes or is stopped for some reason.

    In dbWebLog I'm moving to Apache's ability to write to an external program via the pipe symbol in the CustomLog directive. I didn't do this initially because I noticed that a copy of the external program was spawned for each log, in addition to a controlling shell for that spawned program. Two additional processes for each log file were not something I wanted on my server, let alone on a very busy server with many virtual hosts. What I failed to realize was that if each CustomLog is set to log to the same file (or is piped to the same program), then only a single instance is spawned! This is not typically something you would do when logging to a file, since all the entries would be crammed into that single file (and it does not work well with cronolog either, which is what I based my tests on). But logging to something like dbWebLog works great, since the database takes care of sorting out the entries. Also, Apache will take care of starting the external program, and it will restart the program if an instance is not running when it tries to log something. Since the pipe option is just like writing to a file, the FIFO PIPE_BUF limit is removed and the writes are totally atomic.

  4. Configurable logging. Probably the single biggest request I've had was to support such-and-such a logging option. This will be a pretty big task, and it would have to be created in such a way that it did not really affect performance. This may have to be put on the to-do list.

  5. New parsing function. This will probably become modular to work with the configurable logging. At the very least the current parser should be revisited. I didn't know that Apache now escapes certain characters, such as double quotes, which are used as delimiters for some of the fields. Again, this is something I didn't realize when I wrote the parser (or maybe Apache didn't do the escaping until recently).

  6. Better support of signals like HUP to totally reread the config file and reconfigure internally without shutting down.

  7. Support for using syslog instead of a flat file for dbWebLog's own log file (not to be confused with the log entries coming from Apache).

  8. Logging the error logs to a database. Would probably consist of a virtual host field, timestamp, and entry. Error logging format is so random that you really can't do much more. But at least it would put all the logs in a single place.
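
The threaded writer-module idea from items 1 and 2 can be sketched roughly as follows. This is only an illustration in Python, not dbWebLog code: the names `Writer`, `ListWriter`, and `dispatch` are hypothetical, and the in-memory `ListWriter` stands in for a real PG, MySQL, Oracle, or flat-file backend. Each writer owns a queue and a thread, so a slow backend does not hold up the reader or the other writers.

```python
import queue
import threading


class Writer(threading.Thread):
    """Hypothetical writer module: one thread and one queue per backend."""

    def __init__(self, name):
        super().__init__(name=name, daemon=True)
        self.entries = queue.Queue()

    def write(self, entry):
        # Each backend (PG, MySQL, flat file, ...) would supply its own write().
        raise NotImplementedError

    def run(self):
        while True:
            entry = self.entries.get()
            if entry is None:  # sentinel: no more entries, stop this writer
                break
            self.write(entry)


class ListWriter(Writer):
    """Stand-in backend that keeps entries in memory instead of a database."""

    def __init__(self, name):
        super().__init__(name)
        self.stored = []

    def write(self, entry):
        self.stored.append(entry)


def dispatch(entries, writers):
    """Hand every entry to every writer, then wait for the writers to drain."""
    for w in writers:
        w.start()
    for entry in entries:
        for w in writers:
            w.entries.put(entry)  # unbounded queue, so the reader never blocks
    for w in writers:
        w.entries.put(None)
    for w in writers:
        w.join()
```

Calling `dispatch(lines, [ListWriter("pg"), ListWriter("flatfile")])` delivers the same entries, in order, to both stand-in backends. Switching backends then amounts to changing the list of writers.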

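The escaping issue from item 5 can be illustrated with a small field splitter. This is a sketch in Python, not the dbWebLog parser: the function name `split_log_line` is hypothetical, and it assumes a space-separated format along the lines of Apache's combined log format, with quoted fields, a bracketed timestamp, and `\"` and `\\` as the escapes for a quote and a backslash inside quoted fields. Other escape sequences are left untouched.

```python
def split_log_line(line):
    """Split one log line into fields, honoring quotes, brackets, and
    backslash-escaped quotes inside quoted fields."""
    line = line.rstrip("\n")
    fields = []
    i, n = 0, len(line)
    while i < n:
        ch = line[i]
        if ch == " ":
            i += 1  # field separator
        elif ch == '"':
            i += 1  # skip the opening quote
            buf = []
            while i < n and line[i] != '"':
                if line[i] == "\\" and i + 1 < n and line[i + 1] in '"\\':
                    i += 1  # drop the backslash, keep the escaped character
                buf.append(line[i])
                i += 1
            i += 1  # skip the closing quote
            fields.append("".join(buf))
        elif ch == "[":
            end = line.find("]", i)
            if end == -1:
                end = n
            fields.append(line[i + 1:end])  # timestamp without the brackets
            i = end + 1
        else:
            end = line.find(" ", i)
            if end == -1:
                end = n
            fields.append(line[i:end])
            i = end + 1
    return fields
```

A parser that simply splits on double quotes would cut a user-agent string such as `"My \"quoted\" agent"` into pieces; the loop above keeps it as the single field `My "quoted" agent`.
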
Here is a list of stuff that I'd like to add, but that would be pretty big projects in their own right:

  1. A stats package to work against the database. I was actually considering writing something GUI based for this and maybe releasing it as a commercial product. It does not have to be commercial, but as a starving self-employed programmer, sometimes I have to charge for my work. I want to keep the server components Open Source for sure, and this part seemed like a good place to maybe generate a little revenue. The stats package would be a huge project in its own right, but it could also be Open Source, I suppose, if people were to step up to make it happen.

  2. Server side support. This would provide admins a complete solution to allow their customers to log in and look at their log files, compiled stats, real-time stats, etc. I figured this part would be written in PHP (my server side scripting language of choice).

There are probably more areas that could be broken out, but I can't think of them at the moment.

So, that's the bulk of it for now. Any feedback is welcome.