Robert Klemme
2/13/2007 9:40:00 PM
On 13.02.2007 21:19, Ian Macdonald wrote:
> Hello,
>
> Can anyone explain this to me?
>
> $ echo $LANG
> nl_NL
> $ irb -f
> irb(main):001:0> foo = "préférées"
> => "pr\351f\351r\351es"
> irb(main):002:0> foo =~ /[^[:alnum:]]/
> => nil
> irb(main):003:0> foo =~ /\W/
> => 2
>
> First question: Why does the final statement return 2 instead of nil?
> All characters in foo are alphabetic characters in this locale.
>
> Then:
>
> $ echo $LANG
> nl_NL
> $ cat ./foo
> #!/usr/bin/ruby -w
>
> foo = "préférées"
> p foo =~ /[^[:alnum:]]/
> p foo =~ /\W/
> $ ./foo
> 2
> 2
>
> Huh?
>
> Second question: Why does the first regex match now return 2 instead of
> nil?
>
> To my way of thinking, both statements should always return nil, whether
> or not they are typed into irb or run in a stand-alone script. At the
> very least, both statements should return the same answer, regardless of
> the context.
>
> What am I missing here?
Maybe some initialization at IRB startup leads to a changed locale
inside IRB (although -f should already skip ~/.irbrc). Or your IRB
belongs to a different Ruby version than /usr/bin/ruby on that system.
Other than that, I guess you have wandered into the wide and wild
country of i18n, where many strange things can be found. Maybe \w and \W
treat only the ASCII characters [A-Za-z0-9_] as word characters, in
which case \W would match the first "é" at index 2.
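
A quick way to test both guesses might look like the sketch below. Run it once inside irb and once as a stand-alone script, then compare the output. The helper name first_matches is made up for illustration, and I am assuming RbConfig is available on your installation. If \W really is ASCII-only, the first two results should always agree; the [[:alnum:]] result is the one I would expect to vary with Ruby version and locale.

```ruby
# Sketch only: run inside irb and as a script, then compare the output.
require 'rbconfig'

# Hypothetical helper: index of the first match for each pattern (nil if none).
def first_matches(str)
  {
    :backslash_W    => (str =~ /\W/),              # Ruby's notion of "non-word"
    :explicit_ascii => (str =~ /[^A-Za-z0-9_]/),   # explicit ASCII-only class
    :posix_alnum    => (str =~ /[^[:alnum:]]/)     # POSIX bracket class
  }
end

# Which interpreter is running, and what does the environment say?
puts "ruby version: #{RUBY_VERSION}"
puts "interpreter:  " + File.join(RbConfig::CONFIG['bindir'],
                                  RbConfig::CONFIG['ruby_install_name'])
puts "LANG=#{ENV['LANG'].inspect} LC_ALL=#{ENV['LC_ALL'].inspect}"

p first_matches("préférées")
```

If the version or interpreter path differs between the two runs, that points at the second guess; if only :posix_alnum differs, the locale is the more likely culprit.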
Kind regards
robert