comp.lang.ruby

Email Address Regex [was Re: silly regex question]

Jacob Fugal

1/3/2006 11:42:00 PM

On 1/3/06, Dan Kohn <dan@dankohn.com> wrote:
> Here's a rails example for validating email addresses.
>
> validates_format_of :login, :with => /
> ^[-^!$#%&'*+\/=?`{|}~.\w]+
> @[a-zA-Z0-9]([-a-zA-Z0-9]*[a-zA-Z0-9])*
> (\.[a-zA-Z0-9]([-a-zA-Z0-9]*[a-zA-Z0-9])*)+$/x,
> :message => "must be a valid email address",
> :on => :create

Be careful with email validation via regex, it's harder than you might
think[1][2]:

/^([a-zA-Z0-9&_?\/`!|#*$^%=~{}+'-]+|"([\x00-\x0C\x0E-\x21\x23-\x5B\x5D-\x7F]|\\[\x00-\x7F])*")(\.([a-zA-Z0-9&_?\/`!|#*$^%=~{}+'-]+|"([\x00-\x0C\x0E-\x21\x23-\x5B\x5D-\x7F]|\\[\x00-\x7F])*"))*@([a-zA-Z0-9&_?\/`!|#*$^%=~{}+'-]+|\[([\x00-\x0C\x0E-\x5A\x5E-\x7F]|\\[\x00-\x7F])*\])(\.([a-zA-Z0-9&_?\/`!|#*$^%=~{}+'-]+|\[([\x00-\x0C\x0E-\x5A\x5E-\x7F]|\\[\x00-\x7F])*\]))*$/

Jacob Fugal

[1] From http://phantom.byu.edu/pipermail/uug-list/2004-January/0...
[2] That regex needs some serious /x treatment, which I didn't know
about at the time it was written.
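
As an illustration of the /x treatment mentioned in footnote [2], here is one way the same pattern might be assembled from named pieces. This is only a sketch, not the original author's code: the variable names follow the RFC 822 grammar, the constant name ADDR_SPEC is made up for this example, the atom character class is the same set of characters reordered with the # escaped, and \A and \z are swapped in for ^ and $ so the pattern anchors to the whole string rather than to a single line.

```ruby
# Building blocks, named after the RFC 822 grammar tokens.
atom           = /[a-zA-Z0-9!\#$%&'*+\/=?^_`{|}~-]+/       # no specials, space or controls
qtext          = /[\x00-\x0C\x0E-\x21\x23-\x5B\x5D-\x7F]/  # anything but CR, '"' and '\'
dtext          = /[\x00-\x0C\x0E-\x5A\x5E-\x7F]/           # anything but CR, '[', '\' and ']'
quoted_pair    = /\\[\x00-\x7F]/                           # backslash plus any ASCII char
quoted_string  = /"(?:#{qtext}|#{quoted_pair})*"/
domain_literal = /\[(?:#{dtext}|#{quoted_pair})*\]/
word           = /(?:#{atom}|#{quoted_string})/
sub_domain     = /(?:#{atom}|#{domain_literal})/

# ADDR_SPEC is a hypothetical name; /x lets the final pattern carry
# whitespace and comments.
ADDR_SPEC = /
  \A
  #{word} (?: \. #{word} )*              # local-part
  @
  #{sub_domain} (?: \. #{sub_domain} )*  # domain
  \z
/x

ADDR_SPEC.match?("john@doe.com")            # => true
ADDR_SPEC.match?('"john doe"@example.com')  # => true (quoted local part)
ADDR_SPEC.match?("user@[192.168.0.1]")      # => true (domain literal)
ADDR_SPEC.match?("john doe@example.com")    # => false (bare space)
```

Interpolating one Regexp into another wraps it in a (?-mix:...) group, so the character classes in the pieces are not affected by the /x flag on the outer pattern.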


28 Answers

Tim Fletcher

1/4/2006 10:19:00 AM

http://tfletcher.com/lib...

(doesn't look quite as messy :)

Andreas S.

1/4/2006 12:48:00 PM

Jacob Fugal wrote:
> On 1/3/06, Dan Kohn <dan@dankohn.com> wrote:
>> Here's a rails example for validating email addresses.
>>
>> validates_format_of :login, :with => /
>> ^[-^!$#%&'*+\/=?`{|}~.\w]+
>> @[a-zA-Z0-9]([-a-zA-Z0-9]*[a-zA-Z0-9])*
>> (\.[a-zA-Z0-9]([-a-zA-Z0-9]*[a-zA-Z0-9])*)+$/x,
>> :message => "must be a valid email address",
>> :on => :create
>
> Be careful with email validation via regex, it's harder than you might
> think[1][2]:
>
> /^([a-zA-Z0-9&_?\/`!|#*$^%=~{}+'-]+|"([\x00-\x0C\x0E-\x21\x23-\x5B\x5D-\x7F]|\\[\x00-\x7F])*")(\.([a-zA-Z0-9&_?\/`!|#*$^%=~{}+'-]+|"([\x00-\x0C\x0E-\x21\x23-\x5B\x5D-\x7F]|\\[\x00-\x7F])*"))*@([a-zA-Z0-9&_?\/`!|#*$^%=~{}+'-]+|\[([\x00-\x0C\x0E-\x5A\x5E-\x7F]|\\[\x00-\x7F])*\])(\.([a-zA-Z0-9&_?\/`!|#*$^%=~{}+'-]+|\[([\x00-\x0C\x0E-\x5A\x5E-\x7F]|\\[\x00-\x7F])*\]))*$/

It is trivial to create a formally correct address that makes absolutely
no sense, so what's the point of doing such a complicated and
error-prone validation?

--
Posted via http://www.ruby-....


Matthew Smillie

1/4/2006 1:15:00 PM

On Jan 4, 2006, at 12:47, Andreas S. wrote:

> Jacob Fugal wrote:
>> On 1/3/06, Dan Kohn <dan@dankohn.com> wrote:
>>> Here's a rails example for validating email addresses.
>>>
>>> validates_format_of :login, :with => /
>>> ^[-^!$#%&'*+\/=?`{|}~.\w]+
>>> @[a-zA-Z0-9]([-a-zA-Z0-9]*[a-zA-Z0-9])*
>>> (\.[a-zA-Z0-9]([-a-zA-Z0-9]*[a-zA-Z0-9])*)+$/x,
>>> :message => "must be a valid email address",
>>> :on => :create
>>
>> Be careful with email validation via regex, it's harder than you might
>> think[1][2]:
>>
>> /^([a-zA-Z0-9&_?\/`!|#*$^%=~{}+'-]+|"([\x00-\x0C\x0E-\x21\x23-\x5B\x5D-\x7F]|\\[\x00-\x7F])*")(\.([a-zA-Z0-9&_?\/`!|#*$^%=~{}+'-]+|"([\x00-\x0C\x0E-\x21\x23-\x5B\x5D-\x7F]|\\[\x00-\x7F])*"))*@([a-zA-Z0-9&_?\/`!|#*$^%=~{}+'-]+|\[([\x00-\x0C\x0E-\x5A\x5E-\x7F]|\\[\x00-\x7F])*\])(\.([a-zA-Z0-9&_?\/`!|#*$^%=~{}+'-]+|\[([\x00-\x0C\x0E-\x5A\x5E-\x7F]|\\[\x00-\x7F])*\]))*$/
>
> It is trivial to create a formally correct address that makes absolutely
> no sense, so what's the point of doing such a complicated and
> error-prone validation?

Job security? I mean, without pointer arithmetic and its associated
mysteries (negative array indices were a personal favourite), we need
something to keep us gainfully employed!

matthew smillie.


Tim Fletcher

1/4/2006 2:06:00 PM

By "error prone" do you mean that it won't detect addresses that don't
exist?

Is it not still better to catch some errors than none at all?

Jacob Fugal

1/4/2006 5:01:00 PM

On 1/4/06, Tim Fletcher <twoggle@gmail.com> wrote:
> http://tfletcher.com/lib...
>
> (doesn't look quite as messy :)

Yeah, as I said in the footnote, the regex I posted needed some
readability treatment. Yours looks pretty nice, and exactly equivalent
except for a typo in quoted_pair:

- quoted_pair = '\\x5c\\x00-\\x7f'
+ quoted_pair = '\\x5c[\\x00-\\x7f]'

Jacob Fugal
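
To see why the brackets matter, the two strings from the diff can be compiled on their own. This is a small sketch: only the two quoted_pair values come from the diff above, while the names buggy, fixed and sample are invented here.

```ruby
# Without brackets the pattern is a literal four-character sequence:
# backslash, NUL, '-', DEL.
buggy = Regexp.new('\\x5c\\x00-\\x7f')

# With brackets it is a backslash followed by any one ASCII character,
# which is what RFC 822 calls a quoted-pair.
fixed = Regexp.new('\\x5c[\\x00-\\x7f]')

sample = '\\"'  # two characters: a backslash and a double quote

fixed.match?(sample)  # => true
buggy.match?(sample)  # => false
```

The unbracketed version still compiles without complaint, which is probably why the typo is easy to miss.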


Jacob Fugal

1/4/2006 5:12:00 PM

On 1/4/06, dblack@wobblini.net <dblack@wobblini.net> wrote:
> See also: http://www.ex-parrot.com/~pdw/Mail-RFC822-Ad...

Yeah, I've seen that one as well. My regex is only meant to match the
definition of an 'addr-spec' token (described as "global" or "simple"
address) in section 6.1 of the RFC822 grammar, as opposed to a
'mailbox' or 'address'. I figure people aren't going to type the "John
Doe <john@doe.com>" format into a form, nor named lists ('group' token
in the grammar).

Jacob Fugal


Andreas S.

1/4/2006 5:18:00 PM

Tim Fletcher wrote:
> By "error prone" do you mean that it won't detect addresses that don't
> exist?

No, I mean that it might declare some addresses invalid although they
aren't.



Jacob Fugal

1/4/2006 5:24:00 PM

On 1/4/06, Andreas S. <f@andreas-s.net> wrote:
> Tim Fletcher wrote:
> > By "error prone" do you mean that it won't detect addresses that don't
> > exist?
>
> No, I mean that it might declare some addresses invalid although they
> aren't.

You'll see from my comments in the original post[1] and in my reply to
David Black in the other thread[2] that this regex is indeed compliant
with a single, non-named address as defined by the RFC[3].

Jacob Fugal

[1] http://phantom.byu.edu/pipermail/uug-list/2004-January/0...
[2] [ruby-talk:174081]
[3] http://www.faqs.org/rfcs/r...


Andreas S.

1/4/2006 5:57:00 PM

Jacob Fugal wrote:
> On 1/4/06, Andreas S. <f@andreas-s.net> wrote:
>> Tim Fletcher wrote:
>> > By "error prone" do you mean that it won't detect addresses that don't
>> > exist?
>>
>> No, I mean that it might declare some addresses invalid although they
>> aren't.
>
> You'll see from my comments in the original post[1] and in my reply to
> David Black in the other thread[2] that this regex is indeed compliant
> with a single, non-named address as defined by the RFC[3].

Possibly. Still, I prefer a simple solution over a complicated one. What
type of errors do you hope to catch with this huge regex? Typing errors?
Deliberately entered rubbish? The regex accepts just about anything with
a "@", e.g. "$@$".
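
A quick way to check this claim is to test against a cut-down version of the grammar. The sketch below assumes a simplified pattern that allows only unquoted atoms on both sides of the "@" (no quoted strings, no domain literals); the name ATOMS_ONLY and the sample inputs are made up for illustration.

```ruby
# One RFC 822 "atom": printable ASCII minus specials, space and controls.
atom = /[a-zA-Z0-9!\#$%&'*+\/=?^_`{|}~-]+/

# Dot-separated atoms, "@", dot-separated atoms (a simplification).
ATOMS_ONLY = /\A#{atom}(?:\.#{atom})*@#{atom}(?:\.#{atom})*\z/

ATOMS_ONLY.match?("$@$")               # => true
ATOMS_ONLY.match?("!@#")               # => true
ATOMS_ONLY.match?("user@example.com")  # => true
ATOMS_ONLY.match?("no-at-sign")        # => false
```

Since "$" is a legal atom character, "$@$" is one atom, an "@", and another atom, so it passes even this restricted form.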



Jacob Fugal

1/4/2006 6:19:00 PM

On 1/4/06, Andreas S. <f@andreas-s.net> wrote:
> Jacob Fugal wrote:
> > On 1/4/06, Andreas S. <f@andreas-s.net> wrote:
> >> Tim Fletcher wrote:
> >> > By "error prone" do you mean that it won't detect addresses that don't
> >> > exist?
> >>
> >> No, I mean that it might declare some addresses invalid although they
> >> aren't.
> >
> > You'll see from my comments in the original post[1] and in my reply to
> > David Black in the other thread[2] that this regex is indeed compliant
> > with a single, non-named address as defined by the RFC[3].
>
> Possibly. Still, I prefer a simple solution over a complicated one. What
> type of errors do you hope to catch with this huge regex? Typing errors?
> Deliberately entered rubbish? The regex accepts just about anything with
> a "@", e.g. "$@$".

Not possibly. Guaranteed. It's compliant with the portions of the RFC I mentioned.

Still, I'll concede it doesn't prevent rubbish from being entered. The
domain of valid email addresses is much larger than the domain of
*actual* email addresses. I'm not claiming that this regex should even
be used for form validation. I dislike email validation, period. My
intent in first writing the regex two years ago and bringing it up
again now is mostly:

1) To show off my regex-fu
2) To demonstrate the inadequacy of simplistic regexes for email validation
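
A rough illustration of point 2, using the Rails-style pattern quoted at the top of the thread. The constant name SIMPLE is made up here and the # is backslash-escaped defensively; otherwise the pattern is as posted. The sample addresses are hypothetical, chosen because RFC 822 allows a quoted local part, a domain literal, and a single-label domain.

```ruby
# The simpler pattern from the start of the thread.
SIMPLE = /
  ^[-^!$\#%&'*+\/=?`{|}~.\w]+
  @[a-zA-Z0-9]([-a-zA-Z0-9]*[a-zA-Z0-9])*
  (\.[a-zA-Z0-9]([-a-zA-Z0-9]*[a-zA-Z0-9])*)+$/x

SIMPLE.match?("dan@dankohn.com")         # => true

# Each of these fits the RFC 822 addr-spec grammar but is rejected:
SIMPLE.match?('"john doe"@example.com')  # => false (quoted local part)
SIMPLE.match?("user@[192.168.0.1]")      # => false (domain literal)
SIMPLE.match?("user@localhost")          # => false (no dot in the domain)
```

Whether rejecting such addresses matters in practice is a separate question; the sketch only shows that the simpler pattern and the grammar disagree.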