Workaround for broken varnishncsa logging due to shell mishandling of spaces in LOG_FORMAT variables

For security, troubleshooting and client billing reasons went to turn on Varnish Cache logging today.

On Debian (should be the same in Ubuntu), with varnish 3.0.4 packages installed from repo.varnish-cache.org, it's all done in /etc/default/varnishncsa.

Uncommented VARNISHNCSA_ENABLED=1, added a custom LOG_FORMAT line with fields I wanted my logs to contain, using the varnishncsa manual page as my guide. Started up varnishncsa with sudo service varnishncsa start, only to find just the first field from my custom log format logged (remote host, or %h, in my case).

This is an issue previously reported by others, see varnishncsa -F flag custom log format and varnishncsa -F question, for instance. The issue basically has to do with how the shell interprets variables with spaces.

Struggled for a while looking for the right sequence of escapes and variable substitutions, eventually giving in and modifying the init script directly (/etc/init.d/varnishncsa). That felt sub-optimal (having to re-apply the modification during package updates, etc).

Then realized that I'm going through all this trouble just to preserve damned spaces in logs. Those are a pain to work with anyway, especially when some log value is empty, one ends up with an extra space to deal with, making log analysis harder.

Since I don't have to deal with historical logs for compatibility, this is an opportunity to setup logging exactly how I want it -- to make my log parsing and analysis life easier.

With that in mind, I quickly settled on the pipe as the field delimiter, and that worked beautifully, making it easy to parse logs and run reports.

Here's an example log entry as it appears in the log:

000.000.000.000|-|2013-10-21|19:24:21|-0700|GET|example.com|/foo/bar|?query-string|200|1805|http://example.com/baz|Mozilla/5.0|0.000040531|0|0.000101|hit|4824091.254|backend-host|HIT (22261)

Here's the same entry with fields split and numbered for illustration:

awk -F'|' '$7="example.com" && $7=="/foo/bar" && 9!="" {print}' /var/log/varnish/varnishncsa.log | tail -n1 |  tr '|' '\n' | cat -n
     1	000.000.000.000
     2	-
     3	2013-10-21
     4	19:24:21
     5	-0700
     6	GET
     7	example.com
     8	/foo/bar
     9	?query-string
    10	200
    11	1805
    12	http://example.com/baz
    13	Mozilla/5.0
    14	0.000040531
    15	0
    16	0.000101
    17	hit
    18	4824091.254
    19	backend-host
    20	HIT (22261)

Below is my /etc/default/varnishncsa config that produces the above log output. At the bottom, I put a little legend identifying what each format variable means to refer to when log parsing.

# Configuration file for varnishncsa
#
# Note: If systemd is installed, this file is obsolete and ignored.  You will
# need to copy /lib/systemd/system/varnishncsa.service to /etc/systemd/system/
# and edit that file.
#
# Uncomment this to enable logging for varnish.  Please make sure you have
# enough disk space for significant amounts of log data.  To disable logging,
# set the variable to "0", "no", or leave it unset.
#
# NCSA log format, to be used by HTTP log analyzers
VARNISHNCSA_ENABLED=1
 
LOG_FORMAT="%h|%u|%{%Y-%m-%d}t|%{%H:%M:%S}t|%{%z}t|%m|%{Host}i|%U|%q|%s|%b|%{Referer}i|%{User-agent}i|%{Varnish:time_firstbyte}x|%T|%D|%{Varnish:handling}x|%{X-FE-Varnish-Obj-TTL}o|%{X-FE-Varnish-Backend}o|%{X-FE-Varnish-Obj-Stat}o"
 
DAEMON_OPTS="-a -w ${LOGFILE} -D -P ${PIDFILE} -F ${LOG_FORMAT}"
 
#--------------------------------------------------------
#                   LOG_FORMAT legend:
#--------------------------------------------------------
#
# Check with: tail -n1 varnishncsa.log | tr '|' '\n' | cat -n
#
# 1 %h
#   Remote host. Should be IP of the client. In VCL this is set
#   from X-Forwarded-For, if not coming from LB.
# 
# 2 %u
#   Remote auth user.
# 
# 3 %{%Y-%m-%d}t
#   date in YYYY-MM-DD format.
# 
# 4 %{%H:%M:%S}t
#   time in HH:MM:SS format.
# 
# 5 %{%z}t
#   timezone, e.g. -0700.
# 
# 6 %m
#   Request method.
#
# 7 %{Host}i
#   Host header supplied in HTTP request.
# 
# 8 %U
#   The request URL without any query string.
# 
# 9 %q
#   The query string, if supplied.
# 
# 10 %s
#   Status sent to the client.
# 
# 11 %b
#   Size of response in bytes, excluding HTTP headers.
# 
# 12 %{Referer}i
#   The contents of request header.
# 
# 13 %{User-agent}i
#   The contents of request header.
# 
# 14 %{Varnish:time_firstbyte}x
#   Time from when the request processing starts until the
#   first byte is sent to the client.
# 
# 15 %T
#   Time taken to serve the request, in seconds.
# 
# 16 %D
#   Time taken to serve the request, in microseconds.
# 
# 17 %{Varnish:handling}x
#   How the request was handled, whether it was a cache hit, 
#   miss, pass, pipe or error.
# 
# 18 %{X-FE-Varnish-Obj-TTL}o
#   Custom header showing remaining TTL of the object.
# 
# 19 %{X-FE-Varnish-Backend}o
#   Custom header showing backend for the object.
# 
# 20 %{X-FE-Varnish-Obj-Stat}o
#   Custom header showing whether request resulted in cache HIT
#   or MISS, if HIT, the number of HITs thus far for the object.

1 Comment

  • 1. M. Omer Golgeli replies at 19th March 2014, 5:32 am :

    Thanks,

    Now I figured put why my daemon wasn’t working.
    Also, I want to mention that we’re not just limited to %h, you can log either %{X-Forwarded-For}i OR %{CF-Connecting-IP}i by custom log format as well. Might come handy to those who’re beind the CloudFlare network… Though I had to disable it since I can’t do anything with the IPv6 addresses in my in logs.

    For me, the logs are handy since I’m planning on using Fail2Ban.
    Anyway,

    Thanks again.

Leave a comment

NOTE: Enclose quotes in <blockquote></blockquote>. Enclose code in <pre lang="LANG"></pre> (where LANG is one of these).