On Thu, 19 Mar 2020 14:24:42 +0000, Roger Bell_West wrote:
> On 2020-03-19, Martin Gregorie wrote:
>>So, can any of you do better, i.e. write a regex that CAN validate the
>>syntax of an e-mail address in terms of its structure and the set of
>>permitted characters on the username and domain parts (the permitted
>>character sets are not the same).
>
> No; email addresses cannot be syntactically validated by regexp alone.
>
OK, I'm starting to see that, so it looks like my current strategy of
negating a bracket expression that lists all the characters that can
legitimately appear in an e-mail address is about as far as I can go.

Doing this in either C or Java should be OK, since I'm only looking to
stop From: headers being used as attack vectors on a bash script. AFAICR
Bash only accepts ASCII, so any message whose From: address contains
anything that isn't ASCII alphanumeric, '@', hyphen, underscore or period
can be binned.
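
For what it's worth, here is a rough sketch of that check in Java. The
class and method names (FromAddressFilter, isSafe) are made up purely for
illustration, and it assumes the bare address has already been pulled out
of the From: header. It makes no attempt to validate the address
structure; it only rejects anything outside the allowed character set.

```java
import java.util.regex.Pattern;

// Hypothetical helper: names are illustrative, not from any existing code.
final class FromAddressFilter {

    // Negated bracket expression: matches any single character that is NOT
    // an ASCII letter, digit, '@', underscore, period or hyphen.
    // The hyphen is placed last so it is taken literally.
    private static final Pattern FORBIDDEN =
            Pattern.compile("[^A-Za-z0-9@_.-]");

    // Returns true only if the address is non-empty and contains no
    // forbidden character. This is a character filter, not a syntax check.
    static boolean isSafe(String address) {
        if (address == null || address.isEmpty()) {
            return false;
        }
        return !FORBIDDEN.matcher(address).find();
    }

    public static void main(String[] args) {
        System.out.println(isSafe("martin@example.org"));       // true
        System.out.println(isSafe("x@example.org; rm -rf ~"));  // false
    }
}
```

Anything for which isSafe() returns false gets binned before the address
goes anywhere near the shell. Using find() on the negated set means
spaces, quotes, semicolons, backticks and newlines are all caught by the
same test.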
--
Martin | martin at
Gregorie | gregorie dot org