echo: rberrypi
to: A. DUMAS
from: MARTIN GREGORIE
date: 2020-03-19 14:12:00
subject: Re: Regexes and C

On Thu, 19 Mar 2020 14:29:35 +0100, A. Dumas wrote:

> More or less impossible. E.g. apparently you didn't think that + is a
> valid character, which it is (in the part before the @).
>
The sources I consulted said the only permitted nonalphanumerics in the
usernames are period, hyphen and underscore, just as the only
nonalphanumeric in the domain is the period.

> Also, domains
> (and usernames) can be UTF8. Best way is: try to deliver, check reply.
>
Fair point - I should have said that I want to use this as a filter to
prevent cross-site scripting attacks, i.e. to prevent the From address
being used as an attack vector.

Another annoyance with regcomp/regexec is that the common :alnum:
abbreviation is *only* recognised if it occupies the whole set of
alternates, i.e. [:alnum:] works, [.:alnum:_-] doesn't.
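
A possible explanation, offered as a sketch rather than a diagnosis of the
code in question: POSIX defines [:alnum:] as a class name that is only
meaningful inside an enclosing bracket expression, so the usual spelling is
[[:alnum:]], and further characters go between the outer brackets, as in
[[:alnum:]._-]. In addition, "[." at the start of a bracket expression opens
a collating symbol, which may be why [.:alnum:_-] is rejected. The function
name is_plausible_address and the pattern below are assumptions made for
illustration; the pattern follows the character rules stated above (period,
hyphen and underscore in the username, period in the domain), not the full
RFC grammar.

```c
#include <regex.h>
#include <stddef.h>

/*
 * Hypothetical helper: returns 1 if s matches the restricted address
 * pattern, 0 if it does not, and -1 if the pattern fails to compile.
 */
int is_plausible_address(const char *s)
{
    /* The class name sits inside the outer brackets; '-' is last so it
       is taken literally rather than as a range operator. */
    static const char *pattern = "^[[:alnum:]._-]+@[[:alnum:].]+$";
    regex_t re;
    int rc;

    if (regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
        return -1;

    rc = regexec(&re, s, 0, NULL, 0);
    regfree(&re);
    return rc == 0 ? 1 : 0;
}
```

Called as is_plausible_address(from_header), this rejects anything containing
angle brackets or quotes. It also rejects a '+' in the username, so the
character set would need widening if such addresses must be accepted.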

All in all this looks like something that would be better done without
using C regexes. IOW, either as a rather messy string comparison game or
in Java using its pattern matching classes.
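
For comparison, the string comparison route need not be very messy if the
goal is only to whitelist characters. The following is a minimal sketch under
the same assumed rules (alphanumerics plus period, hyphen and underscore
before the @, alphanumerics plus period after it); the name address_chars_ok
is hypothetical and the function makes no attempt at full address validation.

```c
#include <ctype.h>
#include <string.h>

/*
 * Hypothetical helper: returns 1 if every character of s is on the
 * whitelist and s has exactly one '@' with text on both sides,
 * otherwise 0.
 */
int address_chars_ok(const char *s)
{
    const char *at = strchr(s, '@');
    const char *p;

    /* Need an '@' that is neither the first nor the last character. */
    if (at == NULL || at == s || at[1] == '\0')
        return 0;

    /* Username: alphanumerics, period, underscore, hyphen. */
    for (p = s; p < at; p++)
        if (!isalnum((unsigned char)*p) && strchr("._-", *p) == NULL)
            return 0;

    /* Domain: alphanumerics and period; a second '@' fails here. */
    for (p = at + 1; *p != '\0'; p++)
        if (!isalnum((unsigned char)*p) && *p != '.')
            return 0;

    return 1;
}
```

Because it only checks characters one at a time, this version has no
dependency on the regex library and is easy to extend with extra permitted
characters.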


--
Martin    | martin at
Gregorie  | gregorie dot org

--- SoupGate-Win32 v1.05
* Origin: Agency HUB, Dunedin - New Zealand | FidoUsenet Gateway (3:770/3)

SOURCE: echomail via QWK@docsplace.org
