Common Log Format

March 2004 | Fredrik Lundh

Here’s a simple regular expression that parses server log files in the Common Log Format.

import re

# raw string, so the \[ and \] escapes reach the regex engine unchanged
p = re.compile(
    r'([^ ]*) ([^ ]*) ([^ ]*) \[([^]]*)\] "([^"]*)" ([^ ]*) ([^ ]*)'
    )

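for line in file: # file is an open log file
    # strip the newline, or it ends up in the last field (size)
    m = p.match(line.rstrip("\n"))
    if not m:
        continue # skip lines that don't match the format
    host, ignore, user, date, request, status, size = m.groups()
    ...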
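
To see what the groups contain, here is a minimal sketch that runs the same pattern over a single line. The sample line and the parse_clf helper are made up for illustration; they are not part of the original recipe. The helper strips the trailing newline before matching, since the last group would otherwise capture it.

```python
import re

p = re.compile(
    r'([^ ]*) ([^ ]*) ([^ ]*) \[([^]]*)\] "([^"]*)" ([^ ]*) ([^ ]*)'
    )

# hypothetical sample line, for illustration only
sample = ('192.0.2.1 - alice [10/Oct/2000:13:55:36 -0700] '
          '"GET /index.html HTTP/1.0" 200 2326\n')

def parse_clf(line):
    """Return the seven fields as a tuple of strings, or None on no match."""
    m = p.match(line.rstrip("\n"))
    if not m:
        return None
    return m.groups()

fields = parse_clf(sample)
if fields:
    host, ignore, user, date, request, status, size = fields
    print(host, user, status, size)
```

All seven fields come back as strings. The second field (the remote identity) is usually just "-", which is why the recipe calls it ignore.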

Here’s a variation that parses the Extended Common Log Format, which adds referrer and user-agent fields at the end of each line. (Apache calls this the Combined Log Format.)

p = re.compile(
    r'([^ ]*) ([^ ]*) ([^ ]*) \[([^]]*)\] "([^"]*)" ([^ ]*) ([^ ]*)'
    r' "([^"]*)" "([^"]*)"' # extensions
    )

for line in file: # file is an open log file
    m = p.match(line)
    if not m:
        continue # skip lines that don't match the format
    host, ignore, user, date, request, status, size,\
        referer, agent = m.groups()
    ...
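
The groups are plain strings, so most scripts need a conversion step. The following is a rough sketch of one way to do that, not part of the original recipe. The parse_record helper, the dictionary keys, and the sample line are hypothetical. It assumes the usual dd/Mon/yyyy:hh:mm:ss +zzzz timestamp layout, a numeric status field, and English month names (the %b directive is locale dependent).

```python
import re
from datetime import datetime

p = re.compile(
    r'([^ ]*) ([^ ]*) ([^ ]*) \[([^]]*)\] "([^"]*)" ([^ ]*) ([^ ]*)'
    r' "([^"]*)" "([^"]*)"' # extensions
    )

def parse_record(line):
    """Parse one extended-format line into a dict, or return None."""
    m = p.match(line)
    if not m:
        return None
    (host, ignore, user, date, request, status, size,
        referer, agent) = m.groups()
    # the request is normally "METHOD path PROTOCOL", but may be malformed
    parts = request.split()
    well_formed = len(parts) == 3
    return {
        "host": host,
        "user": None if user == "-" else user,
        # assumption: timestamp looks like 10/Oct/2000:13:55:36 -0700
        "time": datetime.strptime(date, "%d/%b/%Y:%H:%M:%S %z"),
        "method": parts[0] if well_formed else None,
        "path": parts[1] if well_formed else None,
        "status": int(status),
        # a size of "-" means no bytes were sent
        "size": 0 if size == "-" else int(size),
        "referer": referer,
        "agent": agent,
    }

# hypothetical sample line, for illustration only
sample = ('192.0.2.1 - - [10/Oct/2000:13:55:36 -0700] '
          '"GET /index.html HTTP/1.0" 200 - '
          '"http://www.example.com/start.html" "ExampleBrowser/1.0"')

rec = parse_record(sample)
print(rec["time"].isoformat(), rec["method"], rec["path"], rec["size"])
```

Lines with an unexpected timestamp or a non-numeric status raise ValueError here; wrap the call in try/except if the log may contain such lines.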