Message94056
> That's why I wrote 'without checking if they are in range(256)'; the
fact that this regex matches invalid digits was not relevant in my
example (and it's usually easier to convert the digits to int and check
if 0 <= digits <= 255). :)
No! You also have to check the number of digits: values below 100 may
have at most 2 digits, and values below 10 only 1 digit.
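To make the difference concrete, here is a minimal sketch comparing the two approaches. It assumes that "checking the number of digits" means rejecting leading zeros such as `001`; the names `OCTET`, `loose_is_ipv4` and `strict_is_ipv4` are made up for this illustration and are not part of any library.

```python
import re

# Assumed octet pattern: 0-255, written without leading zeros.
OCTET = r"(?:25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[1-9][0-9]|[0-9])"
STRICT_IPV4 = re.compile(r"(?:" + OCTET + r"\.){3}" + OCTET)


def loose_is_ipv4(text):
    """Split on dots, convert to int, and only check the numeric range."""
    parts = text.split(".")
    return len(parts) == 4 and all(
        p.isdecimal() and 0 <= int(p) <= 255 for p in parts
    )


def strict_is_ipv4(text):
    """Let the regexp check both the range and the number of digits."""
    return STRICT_IPV4.fullmatch(text) is not None


# The range check alone accepts padded octets; the regexp does not.
print(loose_is_ipv4("192.168.001.010"))
print(strict_is_ipv4("192.168.001.010"))
```

The range check accepts `192.168.001.010` because `int("001")` is 1, while the strict pattern rejects it.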
And when processing web log files, for example, or when parsing wiki
pages or emails in which you want to autodetect ONLY valid IP addresses
within some contexts, in order to transform them to another form (for
example, converting them to links, or differentiating 'anonymous' users
in wiki pages from registered named users), you need to match these IP
addresses correctly. In addition, these files will often contain many
other occurrences that you don't want to transform: you only want some
of them, in the specific contexts given by the regexp. For this reason,
your suggestion will often not work as expected.
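As a rough illustration of matching only valid addresses in context, the sketch below rewrites IPv4 addresses found in free text while leaving look-alikes untouched. The `[[User:...]]` link format, the lookaround rules, and the names `IPV4_IN_TEXT` and `linkify_ips` are assumptions chosen for this example, not something taken from a real wiki engine.

```python
import re

# Assumed octet pattern: 0-255, written without leading zeros.
OCTET = r"(?:25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[1-9][0-9]|[0-9])"
IPV4 = r"(?:" + OCTET + r"\.){3}" + OCTET

IPV4_IN_TEXT = re.compile(
    r"(?<![0-9])(?<![0-9]\.)"   # not preceded by a digit or by "digit."
    + IPV4
    + r"(?![0-9])(?!\.[0-9])"   # not followed by a digit or by ".digit"
)


def linkify_ips(text):
    """Wrap each valid, stand-alone IPv4 address in a hypothetical wiki link."""
    return IPV4_IN_TEXT.sub(r"[[User:\g<0>]]", text)


print(linkify_ips("anon 10.0.0.1 edited; see version 1.2.3.4.5"))
```

Here `10.0.0.1` is rewritten, but the dotted version number `1.2.3.4.5` and out-of-range strings such as `999.1.1.1` are left alone, because the pattern itself carries the context rules.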
The real need is to match things exactly, within their context, and to
capture all occurrences of each capturing group.
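For reference, this small sketch shows the current behaviour of the standard `re` module that motivates the request: when a capturing group sits inside a repetition, only its last match is kept. The loose `\d{1,3}` octets and the helper names are simplifications picked for this illustration.

```python
import re

# One capturing group repeated three times, plus a final group.
REPEATED = re.compile(r"(?:(\d{1,3})\.){3}(\d{1,3})")

# The same pattern written out by hand, one group per octet.
EXPANDED = re.compile(r"(\d{1,3})\.(\d{1,3})\.(\d{1,3})\.(\d{1,3})")


def repeated_groups(text):
    """Groups seen when the octet group is inside a repetition."""
    m = REPEATED.fullmatch(text)
    return m.groups() if m else None


def expanded_groups(text):
    """Groups seen when every octet has its own group."""
    m = EXPANDED.fullmatch(text)
    return m.groups() if m else None


print(repeated_groups("192.168.0.1"))   # ('0', '1')
print(expanded_groups("192.168.0.1"))   # ('192', '168', '0', '1')
```

Writing the groups out by hand works for a fixed count of four, but it is not possible when the number of repetitions is not known in advance.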
I gave the IPv4 regexp only as a simple example to show the need, but
there are of course much more complex cases, and it is exactly for those
cases that I would like the extension: using alternate code with partial
matches and extra split() operations gives code that becomes tricky and
is most often bogus. Only the original regexp is precise enough to parse
the content correctly, find only the matches we want, and capture all
the groups that we really want, in a single operation, at near-zero cost
(and without complicating the rest of the code that uses it).
| Date | User | Action | Args |
| 2009-10-14 23:52:12 | verdy_p | set | recipients: + verdy_p, ezio.melotti, r.david.murray |
| 2009-10-14 23:52:12 | verdy_p | set | messageid: <[email protected]> |
| 2009-10-14 23:52:10 | verdy_p | link | issue7132 messages |
| 2009-10-14 23:52:10 | verdy_p | create | |