comp.lang.ruby

newb: Rails character encoding and validation

john

12/15/2006 7:47:00 PM

I'm putting together a basic Rails application and writing my first
unit tests for it. It occurred to me that the user 'name' field might
need to contain foreign characters (like é, â, ì, ø, etc.), but two
problems have popped up. Firstly, I can't dig up a good reference for a
suitable regular expression for validating the field.
At the moment, I'm using:
validates_format_of :name, :with => /^[-' a-zA-Z]+$/
but this isn't going to allow the foreign characters, so the test fails.

The second problem is the error message I get when I run the unit test.
My test sets the name to José, but the failure message when I
run the script shows JosÜ.
It looks like the character encoding of my editor isn't the same as the
character encoding that Rails is using.

So, a) any clues as to what is going on? And b) is there a consistent
way of dealing with foreign characters for validation purposes?

Many thanks in advance!

Mark.
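
The garbled name in the question is the usual symptom of bytes written in one encoding and read back in another. Below is a minimal sketch of that effect, assuming a modern Ruby (1.9 or later) where strings carry an encoding. The helper names are hypothetical, and the UTF-8/ISO-8859-1 pair is only an illustration, not necessarily the pair behind the JosÜ output above.

```ruby
# Hypothetical helpers for illustration; not part of Rails or the original post.

# Show the raw bytes of a string as hex, e.g. "4A 6F".
def hex_bytes(text)
  text.bytes.map { |b| format("%02X", b) }.join(" ")
end

# Reinterpret the bytes of a UTF-8 string as ISO-8859-1, then transcode
# the result back to UTF-8 so the misread characters can be displayed.
def misread_as_latin1(text)
  text.dup.force_encoding("ISO-8859-1").encode("UTF-8")
end

name = "Jos\u00E9"            # "José", written with an escape so the
                              # file's own encoding does not matter
puts hex_bytes(name)          # 4A 6F 73 C3 A9
puts misread_as_latin1(name)  # JosÃ©
```

The two bytes that encode é in UTF-8 show up as two separate characters once they are read as a single-byte encoding. Keeping the editor, the test files, and the terminal on the same encoding avoids the mismatch.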
8 Answers

David Vallner

12/16/2006 8:10:00 PM

Luciano Ramalho wrote:
> It is not really viable to validade a name field with a regex if you
> are willing to accept Unicode characters. The only reasonable
> validation is to check whether the field is empty.
>

It is so viable. Just not using [a-zA-Z].

Character classes are your friend.

David Vallner

12/16/2006 8:15:00 PM

David Vallner wrote:
> Luciano Ramalho wrote:
>> It is not really viable to validade a name field with a regex if you
>> are willing to accept Unicode characters. The only reasonable
>> validation is to check whether the field is empty.
>>
>
> It is so viable. Just not using [a-zA-Z].
>
> Character classes are your friend.
>

For clarification: I am unsure how well Ruby's regexp engine
handles Unicode "extended Latin" characters. A trivial test using $KCODE
= 'u', require 'jcode', and iconv failed for me, but that could be me
getting the code pages wrong. My point above is only that nothing
prevents a regexp engine that properly supports Unicode and
character classes from validating non-ASCII text.

David Vallner
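
As a rough illustration of the character-class approach, here is a minimal sketch assuming a Ruby newer than the one discussed in this thread (2.4 or later), where the regexp engine supports Unicode properties such as \p{L} on UTF-8 strings. NAME_FORMAT, LEGACY_FORMAT, and valid_name? are hypothetical names, and the set of allowed punctuation is just the one from the original question.

```ruby
# Hypothetical pattern: letters (\p{L}) and combining marks (\p{M}) from
# any script, plus space, apostrophe, and hyphen.
# \A and \z anchor the whole string; ^ and $ only anchor a single line.
NAME_FORMAT = /\A[\p{L}\p{M}' -]+\z/

# The pattern from the original question, kept for comparison.
LEGACY_FORMAT = /^[-' a-zA-Z]+$/

def valid_name?(name)
  NAME_FORMAT.match?(name)
end

puts valid_name?("Jos\u00E9")                   # true
puts valid_name?("O'Brien-Smith")               # true
puts valid_name?("R2D2")                        # false (digits)
puts valid_name?("Mark\n<script>")              # false
puts LEGACY_FORMAT.match?("Mark\n<script>")     # true: ^ and $ match per line
```

In a Rails model the same pattern could presumably be passed as the :with option of validates_format_of. As the later replies point out, whether validating names this strictly is worth the effort is a separate question.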

ramalho@gmail.com

12/16/2006 8:32:00 PM

On 12/16/06, David Vallner <david@vallner.net> wrote:
> It is so viable. Just not using [a-zA-Z].
>
> Character classes are your friend.

You mean, using the Unicode database?

Yes, I know that is possible. My point was that it is not worthwhile
(that's why I wrote "not viable" instead of "impossible"; sorry if I
was not clear: English is not my first language).

Besides all sort of letter-like characters, ideograms and so on, a
person's name may contain hyphens, apostrophes and who-knows-what
other characters.

Remember Prince's name when he used to be called "the artist formerly
known as Prince"? [1]

I just do not think the effort of trying to validate a name is
"economically viable", except to verify that it contains something
other than blanks.

BTW, which would be a safe way to know whether a Unicode string
contains something other than blanks? Because AFAIK unicode has many
other blank characters besides the old ASCII ones. Can a Ruby regex
cope with that?

Cheers,

Luciano

[1] http://en.wikipedia.org/wi...(musician)

David Vallner

12/16/2006 8:47:00 PM

Luciano Ramalho wrote:
> I just do not think the effort of trying to validate a name is
> "economically viable", except to verify that it contains something
> other than blanks.
>

That is true. It doesn't have anything to do with Unicode however, as I
think your post implied.

Speaking of which, I wonder whether any database out there holds a
record for someone with a retroflex click in their name, and whether
it's stored as an exclamation mark or as U+01C3 ;P

> BTW, which would be a safe way to know whether a Unicode string
> contains something other than blanks? Because AFAIK unicode has many
> other blank characters besides the old ASCII ones. Can a Ruby regex
> cope with that?
>

It Should Be Able To.

I think at least Oniguruma can do this sort of "industrial-strength"
processing; no idea about the current engine.

Speaking of which, is there an Oniguruma backport (?) for Ruby 1.8 that
you could use as an add-on regexp engine? (I think currently you can use
it as a drop-in replacement if you build Ruby from source; I was thinking
of a more orthogonal way of using the Shiny Features, where "orthogonal"
really means "from a binary gem".)

David Vallner
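
For the "something other than blanks" check, here is a minimal sketch assuming a current Ruby and UTF-8 strings. In current Ruby, \s and \S cover only ASCII whitespace, while the POSIX bracket [[:space:]] is Unicode-aware. contains_non_blank? is a hypothetical helper name.

```ruby
# Hypothetical helper: true if the text has at least one character
# that is not whitespace.
def contains_non_blank?(text)
  # [[:space:]] also covers Unicode spaces such as U+00A0 (no-break space)
  # and U+3000 (ideographic space); \s would only cover ASCII whitespace.
  text.match?(/[^[:space:]]/)
end

spaces_only = "\u00A0\u3000 \t"

puts contains_non_blank?(spaces_only)     # false
puts contains_non_blank?("\u00A0Ana")     # true
puts spaces_only.match?(/\S/)             # true: \S counts U+00A0 as non-blank
```

The last line shows why the ASCII-only \S is not enough for this check: a string made entirely of Unicode spaces still passes it.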

cloud dreamer

2/20/2012 8:28:00 PM

On 20/02/2012 4:33 PM, Patty Winter wrote:
> In article<jhu2kb$r3i$9@news.albasani.net>,
> Jim G.<jimgysin@geemail.com> wrote:
>
>
> [snip snip]
>
>
>> But you guys just keep doing what you're doing. Eventually, everyone
>> who's not a sock will have you killfiled, which will eliminate the
>> "bother" of dealing with real people and free up *all* of your time to
>> play with the socks.
>
> Which is exactly why Suzeeq and Barry have been in my killfile for
> a long time. And Erilar is about to go in, too. She's posted multiple
> followups to Seamus just in the past couple of days.
>
> How to get into a killfile:
>
> * Be a troll, or...
> * Keep responding to a troll

* Or try to help those who can't identify Seamus by making a simple post
to say "Seamus alert."

* EE

..


--
We must change the way we live
Or the climate will do it for us

suzeeq

2/20/2012 8:34:00 PM

Patty Winter wrote:
> In article <jhu2kb$r3i$9@news.albasani.net>,
> Jim G. <jimgysin@geemail.com> wrote:
>
>
> [snip snip]
>
>
>> But you guys just keep doing what you're doing. Eventually, everyone
>> who's not a sock will have you killfiled, which will eliminate the
>> "bother" of dealing with real people and free up *all* of your time to
>> play with the socks.
>
> Which is exactly why Suzeeq and Barry have been in my killfile for
> a long time. And Erilar is about to go in, too. She's posted multiple
> followups to Seamus just in the past couple of days.
>
> How to get into a killfile:
>
> * Be a troll, or...
> * Keep responding to a troll

Too bad you're so restrictive on people you'll read.

erilar

2/20/2012 10:46:00 PM

In article <4f42a70b$0$12028$742ec2ed@news.sonic.net>,
Patty Winter <patty1@wintertime.com> wrote:

> In article <jhu2kb$r3i$9@news.albasani.net>,
> Jim G. <jimgysin@geemail.com> wrote:
>
>
> [snip snip]
>
>
> >But you guys just keep doing what you're doing. Eventually, everyone
> >who's not a sock will have you killfiled, which will eliminate the
> >"bother" of dealing with real people and free up *all* of your time to
> >play with the socks.
>
> Which is exactly why Suzeeq and Barry have been in my killfile for
> a long time. And Erilar is about to go in, too. She's posted multiple
> followups to Seamus just in the past couple of days.
>
> How to get into a killfile:
>
I've been posting to other people--as you are here 8-)

--
Erilar, biblioholic medievalist


trotsky

2/21/2012 12:39:00 AM

On 2/20/12 2:03 PM, Patty Winter wrote:
> In article<jhu2kb$r3i$9@news.albasani.net>,
> Jim G.<jimgysin@geemail.com> wrote:
>
>
> [snip snip]
>
>
>> But you guys just keep doing what you're doing. Eventually, everyone
>> who's not a sock will have you killfiled, which will eliminate the
>> "bother" of dealing with real people and free up *all* of your time to
>> play with the socks.
>
> Which is exactly why Suzeeq and Barry have been in my killfile for
> a long time. And Erilar is about to go in, too. She's posted multiple
> followups to Seamus just in the past couple of days.
>
> How to get into a killfile:
>
> * Be a troll, or...
> * Keep responding to a troll


Sadly, though, none of this bullshit is TV related.