Apache config and .htaccess - logging


These are primarily notes
It won't be complete in any sense.
It exists to contain fragments of useful information.


This article/section is a stub — probably a pile of half-sorted notes, is not well-checked so may have incorrect bits. (Feel free to ignore, fix, or tell me)

See the mod_log_config documentation for the meaning of those percent-fields.

The "Common Log Format" (CLF) used by apache (which seems to call it common), and imitated by others, is:

%h %l %u %t \"%r\" %>s %b

...which looks like: - - [17/Apr/2008:12:23:32 +0200] "GET /foo.txt HTTP/1.1" 200 117

The CLF-with-VirtualHost consists of the vhost name followed by the CLF fields:

%v %h %l %u %t \"%r\" %>s %b

...which looks like:

www.example.com - - [17/Apr/2008:12:23:32 +0200] "GET /foo.txt HTTP/1.1" 200 117

The (NCSA) 'extended/combined log format' (apache seems to call it combined) is basic CLF plus two extra header values at the end, referer and user-agent:

%h %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-agent}i\"

...which looks like (line-broken for readability) - - [17/Apr/2008:12:23:32 +0200] "GET /foo.txt HTTP/1.1" 200 117 
     "http://www.example.com/start.html" "Opera/9.20 (Windows NT 6.0; U; en)"

You also see that in virtualhost form:

%v %h %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-agent}i\"

...which looks like (line-broken for readability):

www.example.com - - [17/Apr/2008:12:23:32 +0200] "GET /foo.txt HTTP/1.1" 200 117 
     "http://www.example.com/start.html" "Opera/9.20 (Windows NT 6.0; U; en)"

There are other additions, usually appended to the end of one of the above.
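
As a sketch of how such formats are typically given names: LogFormat takes a format string plus a nickname, and CustomLog can then refer to that nickname. Nicknames are arbitrary labels. common and combined are the conventional ones, while vhost_clf_example, vhost_combined_example and combined_timed are made up for this example. The trailing %D (time taken to serve the request, in microseconds) illustrates an addition appended to combined.

```apache
# The four formats discussed above, each with a nickname.
LogFormat "%h %l %u %t \"%r\" %>s %b" common
LogFormat "%v %h %l %u %t \"%r\" %>s %b" vhost_clf_example
LogFormat "%h %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-agent}i\"" combined
LogFormat "%v %h %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-agent}i\"" vhost_combined_example

# combined plus one extra field at the end (hypothetical nickname).
LogFormat "%h %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-agent}i\" %D" combined_timed

# Refer to a format by its nickname.
CustomLog logs/access_log combined_timed
```

Appending fields at the end, rather than inserting them in the middle, means tools that only understand the standard prefix have a better chance of still parsing the line.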

Other notes:

  • I've seen some odd variations, e.g. logging the vhost name instead of the client host name/IP - probably a contortion to make some analysis program happy (one that only understood CLF)
  • When you use virtual hosts and log into a single log file, you don't want to use CLF (or combined) because you can't tell the vhosts apart
    • You can log each vhost into a distinctly named log file, or make logging prepend the vhost name
      • (you could even use basic scripting to convert between many CLF logs and one big vhost log, if you have the vhost names handy)
    • What you choose is mostly about convenience. Would you split the logs anyway? Do you keep them for automated analysis?
    • There are two constraints you may wish to consider (particularly when running many vhosts, or just a busy site):
      • Apache doesn't seem to like logs >2GB. You can use logrotate or something similar to avoid that, but for extremely busy sites, a little splitting up can be good.
      • the host system's (configured) limit of file descriptors per process
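
The two approaches from the list above might look roughly like this. It is only a sketch: the log file names and the vhost_clf_example nickname are assumptions made for illustration.

```apache
LogFormat "%h %l %u %t \"%r\" %>s %b" common
LogFormat "%v %h %l %u %t \"%r\" %>s %b" vhost_clf_example

# Option 1: a distinctly named log file per vhost, in plain CLF.
# Costs one open file descriptor per vhost log.
<VirtualHost *:80>
    ServerName www.example.com
    CustomLog logs/www.example.com-access_log common
</VirtualHost>

# Option 2: a single shared file at server level, with the vhost name
# prepended via %v. Vhosts without their own CustomLog end up here.
CustomLog logs/access_log vhost_clf_example
```

Option 1 keeps each file smaller and directly usable by CLF-only tools, while option 2 keeps the number of open files constant regardless of how many vhosts there are.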

Separated logs/statistics, optional filtering

Typical logging is done via mod_log_config

Most people use CustomLog:

  • CustomLog is usually easier to use than LogFormat+TransferLog
    • specifies where to log (file or pipe)
    • specifies the format to use (nickname from LogFormat, or a literal format string)

(You can get the same functionality via LogFormat plus TransferLog, but that's more verbose.)
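
To make the comparison concrete, here is a sketch of both spellings of the same thing. Use one or the other, since having both active would write each request to the file twice.

```apache
# One directive: target and format together.
CustomLog logs/access_log "%h %l %u %t \"%r\" %>s %b"

# The more verbose pair: a LogFormat without a nickname sets the default
# format, which a later TransferLog then uses.
# LogFormat "%h %l %u %t \"%r\" %>s %b"
# TransferLog logs/access_log
```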

On multiple logs

You could log different things in different places, for example:

# main log in common log format  
CustomLog logs/access_log common
# e.g. for an easy pie chart of user agents
CustomLog logs/agent_log "%{User-agent}i"
#(not so useful while debugging, though, since you don't know what visits these were)

On multiple logs and vhosts

Specifying any access logging (CustomLog or TransferLog) within a vhost replaces the global setting for requests to that vhost.

Relying on the global setting alone is only useful when all vhosts should log in exactly the same way (or as a fallback for vhosts that have no logging of their own).

Otherwise you'll want to specify the complete logging behaviour in each vhost. (For ease of management, file includes can be pretty useful.)
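
A sketch of the include approach, assuming a hypothetical shared file conf/vhost-logging.conf and a second, made-up vhost name. Each vhost states its own access log and pulls in the rest from the shared file.

```apache
# Main config: define the format once.
LogFormat "%h %l %u %t \"%r\" %>s %b" common

# conf/vhost-logging.conf (hypothetical) would hold whatever every vhost
# should log besides its own access log, for example:
#   CustomLog logs/agent_log "%{User-agent}i"

<VirtualHost *:80>
    ServerName www.example.com
    CustomLog logs/www.example.com-access_log common
    Include conf/vhost-logging.conf
</VirtualHost>

<VirtualHost *:80>
    ServerName other.example.com
    CustomLog logs/other.example.com-access_log common
    Include conf/vhost-logging.conf
</VirtualHost>
```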

On (multiple logs and) filtering

You can selectively filter things to log.

For example, you can use blacklist-style logic to avoid logging local use:

SetEnvIf Remote_Addr "127\.0\.0\.1" dontlog
CustomLog logs/access_log common env=!dontlog

More interesting are uses like:

SetEnvIf Request_URI \.gif$ gif-image
CustomLog gif.log common env=gif-image
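
Combining the positive and negated form of the same variable splits traffic over two logs without overlap. This is a minimal sketch, and the log file names are arbitrary.

```apache
LogFormat "%h %l %u %t \"%r\" %>s %b" common

# Mark requests whose URI ends in .gif
SetEnvIf Request_URI "\.gif$" gif-image

# Marked requests go to one log...
CustomLog logs/gif-requests.log common env=gif-image
# ...and everything else goes to another.
CustomLog logs/nongif-requests.log common env=!gif-image
```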

Don't log specific requests


You can use the same sort of filtering (see previous section) to mark specific requests (on the apache side), then override the default logging with custom logging (if you haven't already), and add that mark as a condition.

For example:

SetEnvIf Request_URI "^/api/count$" dontlog
SetEnvIf Remote_Addr "127\.0\.0\.1" dontlog


CustomLog logs/access_log common env=!dontlog


Don't log at all

On pipes

Apache can pipe into a program instead of writing to a file.

Its main use is to log to another target, e.g. a database, or centralized/unified logging server, or some custom tool. I've used it to count traffic per vhost towards munin.

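The program you specify

  • is started by the parent apache process, typically once per piped-log directive (not once per apache child)
    • so all children write into the same pipe; you get multiple instances when the same program is named in several directives (e.g. in several vhosts), and then you need to avoid races
    • is restarted if it stops/crashes (for reliability)
  • inherits the userid of that parent process
    • which is significant in that this may be root
  • in apache 2.2 and earlier, will run via a shell (/bin/sh -c) if you use a single pipe, and without one when specifying a double pipe. In apache 2.4 both forms run without a shell, and you use |$ to ask for one
    • shell-less may be a little more predictable/cleaner around restarts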
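
A sketch of piping to a custom tool, along the lines of the per-vhost traffic counting mentioned above. The program path /usr/local/bin/vhost-traffic-counter and the nickname vhost_bytes_example are hypothetical. The program would receive one line per request on its standard input.

```apache
# Log only what the tool needs: vhost name and response size.
# %B logs 0 rather than - when no body was sent, which is easier to sum.
LogFormat "%v %B" vhost_bytes_example

# Hypothetical program, reading log lines from stdin.
CustomLog "|/usr/local/bin/vhost-traffic-counter" vhost_bytes_example
```

Since the program may run as root, it is worth keeping it small and making sure it, and the directory it lives in, are not writable by other users.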

Example uses:

  • apache does not rotate logs by default. It does come with a utility to do this:
CustomLog "|/usr/local/apache/bin/rotatelogs /var/log/access_log 86400" common
# or (basically equivalent)
CustomLog "||/usr/local/apache/bin/rotatelogs /var/log/access_log 86400" common
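
Two variations on the above, as a sketch using the same paths. rotatelogs expands strftime-style fields in the file name, and it accepts a size limit in place of the rotation time.

```apache
LogFormat "%h %l %u %t \"%r\" %>s %b" common

# Rotate daily (86400 seconds), with the date in the file name.
CustomLog "|/usr/local/apache/bin/rotatelogs /var/log/access_log.%Y-%m-%d 86400" common

# Alternative: rotate by size rather than time, here at 100 megabytes.
# CustomLog "|/usr/local/apache/bin/rotatelogs /var/log/access_log 100M" common
```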
