On Thu, 19 Mar 2020 14:29:35 +0100, A. Dumas wrote:
> More or less impossible. E.g. apparently you didn't think that + is a
> valid character, which it is (in the part before the @).
>
The sources I consulted said the only permitted nonalphanumerics in the
usernames are period, hyphen and underscore, just as the only
nonalphanumeric in the domain is the period.
> Also, domains
> (and usernames) can be UTF8. Best way is: try to deliver, check reply.
>
Fair point - I should have said that I want to use this as a filter to
prevent cross-site scripting attacks, i.e. to stop the From address
being used as an attack vector.
Another annoyance with regcomp/regexec is that the common [:alnum:]
class name is *only* recognised inside an enclosing bracket expression,
i.e. [[:alnum:]] and [.[:alnum:]_-] work, [.:alnum:_-] doesn't.
All in all this looks like something that would be better done without
using C regexes. IOW, either as a rather messy string comparison game or
in Java using its pattern matching classes.
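
For comparison, here is a minimal sketch of what the regcomp/regexec
version might look like. It assumes the character rules quoted above
(alphanumerics plus period, hyphen and underscore before the @,
alphanumerics plus period after it) and the default "C" locale. The
function name is_plain_address is made up for illustration. It is a
whitelist filter only, not a check that the address is valid or
deliverable.

```c
#include <regex.h>
#include <stddef.h>

/* Hypothetical helper, not from any library: returns 1 if addr contains
 * only the characters assumed to be permitted above, 0 otherwise. */
int is_plain_address(const char *addr)
{
    /* [:alnum:] is written inside an enclosing bracket expression, and
     * the literal '-' goes last so it is not read as a range. */
    static const char pattern[] = "^[[:alnum:]._-]+@[[:alnum:].]+$";
    regex_t re;
    int ok;

    if (regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
        return 0;   /* treat a compile failure as "reject" */

    /* REG_NOSUB was given, so no match array is needed. */
    ok = (regexec(&re, addr, 0, NULL, 0) == 0);
    regfree(&re);
    return ok;
}
```

With this pattern an address containing +, <, > or quotes is rejected
outright, which is stricter than the mail standards but errs on the
safe side for a filter.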
--
Martin | martin at
Gregorie | gregorie dot org