Regular Expressions Cookbook, 2nd Edition
by Steven Levithan
Published by O'Reilly Media, Inc., 2012
| Regex options: Free-spacing, case insensitive |
| Regex flavors: .NET, Java 7, XRegExp, PCRE 7, Perl 5.10, Ruby 1.9 |
\b(?:(?P<dec>[1-9][0-9]*) | (?P<oct>0[0-7]*) | 0x(?P<hex>[0-9A-F]+) | 0b(?P<bin>[01]+) )(?P<L>L)?\b
| Regex options: Free-spacing, case insensitive |
| Regex flavors: PCRE 4, Perl 5.10, Python |
\b(?:([1-9][0-9]*)|(0[0-7]*)|0x([0-9A-F]+)|0b([01]+))(L)?\b
| Regex options: Case insensitive |
| Regex flavors: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby |
This regular expression is essentially the combination of the solutions presented in Recipe 6.5 (decimal), Recipe 6.4 (octal), Recipe 6.2 (hexadecimal), and Recipe 6.3 (binary). The digit zero all by itself can be read as either a decimal or an octal number. The distinction does not matter, because the value is zero either way, so we removed the alternative for the number zero from the part of the regex that matches decimal numbers. A lone zero is therefore matched by the octal alternative.
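To make the handling of a lone zero concrete, here is a minimal Python sketch built on the numbered-group version of the regex. The helper name `literal_kind` and the labels it returns are illustrative assumptions, not part of the recipe.

```python
import re

# Numbered-group version of the regex, compiled case-insensitively.
NUMBERED = re.compile(
    r"\b(?:([1-9][0-9]*)|(0[0-7]*)|0x([0-9A-F]+)|0b([01]+))(L)?\b",
    re.IGNORECASE,
)

# Labels for groups 1 through 4, in the order the alternatives appear.
KINDS = ("decimal", "octal", "hexadecimal", "binary")

def literal_kind(text):
    """Hypothetical helper: report which alternative matched the whole string."""
    m = NUMBERED.fullmatch(text)
    if m is None:
        return None
    # Only one of the first four groups participates in any given match.
    for kind, digits in zip(KINDS, m.groups()[:4]):
        if digits is not None:
            return kind
    return None

print(literal_kind("0"))    # octal: the decimal alternative cannot start with 0
print(literal_kind("10"))   # decimal
print(literal_kind("09"))   # None: 9 is not an octal digit
```

Because the decimal alternative requires a leading digit from 1 to 9, a bare `0` can only be picked up by the octal alternative, which is harmless since both readings yield the same value.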
We used a single noncapturing group around the four alternatives to make sure that the word boundaries and the suffix L are applied to the regex as a whole, rather than to just the first and last alternative. Named capturing groups make the regex easier to read and make it easier to convert the matched number from text into an actual number in procedural code. JavaScript and Ruby 1.8 do not support named capture. For these languages, you can use the alternative solution with five numbered capturing groups.
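As a rough illustration of converting the matched text into an actual number, the following Python sketch uses the named-group version of the regex. The function name `parse_int_literal`, the `BASES` table, and the choice to return a `(value, has_suffix)` pair are assumptions made for this example.

```python
import re

# Named-group version of the regex in free-spacing, case-insensitive mode.
INT_LITERAL = re.compile(
    r"""
    \b(?:
        (?P<dec>[1-9][0-9]*)
      | (?P<oct>0[0-7]*)
      | 0x(?P<hex>[0-9A-F]+)
      | 0b(?P<bin>[01]+)
    )(?P<L>L)?\b
    """,
    re.VERBOSE | re.IGNORECASE,
)

# Radix to use for each named group (hypothetical lookup table).
BASES = {"dec": 10, "oct": 8, "hex": 16, "bin": 2}

def parse_int_literal(text):
    """Return (value, has_L_suffix) for a string that is one integer literal."""
    m = INT_LITERAL.fullmatch(text)
    if m is None:
        raise ValueError(f"not an integer literal: {text!r}")
    for name, base in BASES.items():
        digits = m.group(name)
        if digits is not None:
            # The 0x and 0b prefixes sit outside the groups, so the captured
            # digits can be handed to int() with the matching base directly.
            return int(digits, base), m.group("L") is not None
    raise AssertionError("unreachable: one alternative always participates")

print(parse_int_literal("42"))      # (42, False)
print(parse_int_literal("017"))     # (15, False)
print(parse_int_literal("0x1F"))    # (31, False)
print(parse_int_literal("0b101L"))  # (5, True)
```

The group name tells the code which radix applies, so no further inspection of the matched text is needed. With the numbered-group version, the same lookup would be keyed on group numbers 1 through 4 instead.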
Chapter 6 has all the details on matching integer and floating-point numbers with regular expressions. In addition to the techniques explained there, this recipe uses named capture (Recipe 2.11) and free-spacing (Recipe 2.18).