Regular Expressions Cookbook, 2nd Edition
by Steven Levithan
Published by
O'Reilly Media, Inc., 2012
| Regex options: ^ and $ match at line breaks |
| Regex flavors: .NET, Java 7, XRegExp, PCRE 7, Perl 5.10, Ruby 1.9 |
^(?P<client>\S+)●\S+●(?P<userid>\S+)●\[(?P<datetime>[^\]]+)\]↵
●"(?P<method>[A-Z]+)●(?P<request>[^●"]+)?●HTTP/[0-9.]+"↵
●(?P<status>[0-9]{3})●(?P<size>[0-9]+|-)●"(?P<referrer>[^"]*)"↵
●"(?P<useragent>[^"]*)"
| Regex options: ^ and $ match at line breaks |
| Regex flavors: PCRE 4, Perl 5.10, Python |
^(\S+)●\S+●(\S+)●\[([^\]]+)\]●"([A-Z]+)●([^●"]+)?●HTTP/[0-9.]+"↵
●([0-9]{3})●([0-9]+|-)●"([^"]*)"●"([^"]*)"
| Regex options: ^ and $ match at line breaks |
| Regex flavors: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby |
The Combined Log Format is the same as the Common Log Format, but
with two extra fields added at the end of each entry. The first extra
field is the referring URL, and the second is the user agent. Both
appear as double-quoted strings, which we can match with ‹"[^"]*"›.
We put a capturing group around the ‹[^"]*› so that we can retrieve
the referrer or user agent without the enclosing quotes.
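
As a minimal sketch of how the named-capture version might be used in Python, the following applies the regex to a single log entry and reads the referrer and user agent from their groups. Each ● in the printed regex is written as a literal space here. The names COMBINED_LOG, sample, and fields, and the sample log entry itself, are assumptions made up for illustration.

```python
import re

# The named-capture regex from this recipe, with each ● written as a space.
# re.MULTILINE makes ^ match at line breaks, as the regex options require.
COMBINED_LOG = re.compile(
    r'^(?P<client>\S+) \S+ (?P<userid>\S+) \[(?P<datetime>[^\]]+)\]'
    r' "(?P<method>[A-Z]+) (?P<request>[^ "]+)? HTTP/[0-9.]+"'
    r' (?P<status>[0-9]{3}) (?P<size>[0-9]+|-) "(?P<referrer>[^"]*)"'
    r' "(?P<useragent>[^"]*)"',
    re.MULTILINE,
)

# Illustrative log entry (made up for this example, not real traffic).
sample = (
    '127.0.0.1 - frank [10/Oct/2000:13:55:36 -0700] '
    '"GET /apache_pb.gif HTTP/1.0" 200 2326 '
    '"http://www.example.com/start.html" '
    '"Mozilla/4.08 [en] (Win98; I ;Nav)"'
)

match = COMBINED_LOG.search(sample)
# groupdict() maps each group name to the text it captured.
fields = match.groupdict() if match else {}

print(fields.get("referrer"))
print(fields.get("useragent"))
```

Because the capturing groups sit inside the double quotes, the referrer and useragent values come back without the enclosing quotes.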