echo: rberrypi
to: DAN CROSS
from: MARTIN GREGORIE
date: 2020-03-19 17:42:00
subject: Re: Regexes and C

On Thu, 19 Mar 2020 15:19:36 +0000, Dan Cross wrote:

>
> What do you mean "doesn't provide any way to anchor a regex to either
> end of a string"?  That's what the `^` and `$` metacharacters in the
> regex are for, and they're fully supported by the library.
>
Just that:

My original regex was

"[a-zA-Z0-9][.a-zA-Z0-9_-]*@[a-zA-Z0-9][a-zA-Z0-9.]*[a-zA-Z0-9]*"

and matched a string containing "a bc@d.e", so I changed it to

"^[a-zA-Z0-9][.a-zA-Z0-9_-]*@[a-zA-Z0-9][a-zA-Z0-9.]*[a-zA-Z0-9]*$"

and it *still* matched that string. So I reread regex(7) and this time
noticed:

'^' (matching the null string at the beginning of a line),
'$' (matching the null string at the end of a line)

Which, by its discussion of lines, seems to imply that regcomp/regexec
think strings passed in as shell parameters are somehow different from
strings that have been filled by reading lines from a file.

>>This does the trick, but no thanks to the man pages regex(3), which
>>describes the C functions, and regex(7), which describes the regex
>>syntax.
>>Both are poorly formatted, hard to read, and seem to have omitted useful
>>information, such as the inability to specify anchor points in
>>strings that DO NOT contain newlines.
>
> Could you clarify what you mean?  '$' will match the empty string at the
> end of a line, '^' matches the empty string at the beginning of a line.
>
Exactly so. But they don't match the ends of a string that was passed in
as a command-line parameter.
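
For anyone wanting to check this in isolation from the shell, here is a
minimal sketch of a test helper, assuming the POSIX <regex.h> interface.
The name re_matches() and its extra_cflags parameter are made up for
illustration and are not part of any library. As far as I can tell from
the POSIX description of regcomp/regexec, with eflags of 0 (neither
REG_NOTBOL nor REG_NOTEOL) '^' and '$' are meant to anchor to the two
ends of the string handed to regexec(), and REG_NEWLINE is what makes
them also match around embedded newlines, so a helper like this should
help separate library behaviour from quoting or argument-splitting
problems.

```c
#include <assert.h>
#include <regex.h>
#include <stddef.h>

/*
 * Hypothetical helper: returns 1 if pattern matches s, 0 if it does not,
 * and -1 if the pattern fails to compile.
 * extra_cflags is OR-ed into the regcomp() flags, e.g. 0 or REG_NEWLINE.
 */
int re_matches(const char *pattern, const char *s, int extra_cflags)
{
    regex_t re;
    int rc;

    assert(pattern != NULL && s != NULL);

    /* REG_NOSUB: only a yes/no answer is wanted, no match offsets. */
    if (regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB | extra_cflags) != 0)
        return -1;

    /* eflags = 0: the string start counts as beginning of line and the
     * string end as end of line. */
    rc = regexec(&re, s, 0, NULL, 0);
    regfree(&re);

    return rc == 0 ? 1 : 0;
}
```

Calling it from main() as re_matches(regex, argv[1], 0), with the
argument quoted on the command line so that it arrives as a single
argv[1], gives a direct check of the anchored and unanchored forms
against the same string.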

> As far as other libraries, if you can link against C++ code, the RE2
> library is very nice.
>
I tried getting into C++ years ago when it first became common (think
Borland C++) and hated it, found Bjarne Stroustrup's C++ book far below
the standard set by K&R and finally gave it up when I found all too much
C++ code was in fact just ANSI C with // comment delimiters.

Java beats the crap out of it, IMO anyway.

> You'd want something that covers the POSIX interfaces.
>
Quite possibly, though I'm constantly surprised by how useful and
relevant it still is. This is about the first time it hasn't come up with
the goods, though that says at least as much about how stable the C
standard library's APIs are.

Would you care to recommend a POSIX book that's as good as the SVR4 one
was in its time?


--
Martin    | martin at
Gregorie  | gregorie dot org

--- SoupGate-Win32 v1.05
* Origin: Agency HUB, Dunedin - New Zealand | FidoUsenet Gateway (3:770/3)

SOURCE: echomail via QWK@docsplace.org
