Horacio Sanson
12/26/2005 2:52:00 AM
I found some documentation about this. Thanks.
Just one question: it seems I can do two different things to allow
Regexps to handle multibyte Shift_JIS strings. One is to set the $KCODE
global variable to "sjis" and the other is to use the "s" modifier when
constructing the regular expression.
The question is: should I use only one of the two methods, or should I
use the "s" modifier even if I set $KCODE to "sjis"?
My testing tells me that setting the $KCODE global variable alone is
enough to get Shift_JIS strings and Regexps to work correctly, but I just
want to make sure.
Thanks,
Horacio
On Monday 26 December 2005 10:29, Horacio Sanson wrote:
> Thanks a lot... this seems to work ok.
>
> Where can I find documentation about this $KCODE global variable and the
> "s" after each regexp? What exactly does the s mean?
>
> Do I have to put it only on regexps with Japanese characters, or on any
> regexp? I tried both and saw no difference.
>
> When using Regexp.new to construct the regular expression, how can I add
> the s option to it?
>
> Sorry for so many questions, but I can't seem to find any docs about these
> options.
>
>
> Horacio
>
> On Wednesday 21 December 2005 21:48, Yukihiro Matsumoto wrote:
> > Hi,
> >
> > In message "Re: Multibyte regexps..."
> >     on Wed, 21 Dec 2005 18:59:59 +0900, Horacio Sanson <hsanson@moegi.waseda.jp> writes:
> >
> > |I am having some issues with regular expressions when working with
> > |Japanese strings.
> > |
> > |Using ruby-1.8.3 on Windows XP Home (Japanese version) I have this test:
> > |
> > |irb(main):271:0> s = "?"
> > |=> "\212\223"
> > |irb(main):272:0> l = "?"
> > |=> "\215s"
> > |irb(main):273:0> l =~ /s/
> > |=> 1
> > |irb(main):274:0> puts "#{$`}<<#{$&}>>#{$'}"
> > |E<s>>
> > |=> nil
> > |irb(main):275:0> "#{$`}<<#{$&}>>#{$'}"
> > |=> "\215<<s>>"
> > |irb(main):276:0> s =~ /l/
> > |=> nil
> >
> > The encoding seems to be Shift_JIS. You have to specify the encoding
> > before you do regular expression matching. Put s after every
> > regular expression.
> >
> > $KCODE="sjis" # to make p work right
> > p s = "?"
> > p l = "?"
> > p l =~ /s/s
> > puts "#{$`}<<#{$&}>>#{$'}"
> > p "#{$`}<<#{$&}>>#{$'}"
> > p s =~ /l/s
> >
> > matz.
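
A minimal sketch of why the plain /s/ matched at index 1 in the irb session quoted above: the second byte of "\215s" is 0x73, the ASCII code for "s", so a byte-wise match finds it inside the two-byte character. Note the assumption: this sketch targets Ruby 1.9 or later, where $KCODE no longer has any effect and each string carries its own encoding, so it is not the 1.8 setup used in the thread. The names raw, sjis, byte_match and char_match are made up for illustration.

```ruby
# The two bytes from the thread: 0x8D followed by 0x73 (ASCII "s").
# Assumption: Ruby 1.9+, where the encoding is attached to the string itself.
raw  = "\x8Ds".b                               # same bytes, treated as raw binary
sjis = "\x8Ds".b.force_encoding("Shift_JIS")   # same bytes, tagged as Shift_JIS

# Byte-wise matching finds the trail byte of the two-byte character.
byte_match = (raw =~ /s/)    # => 1

# Character-wise matching sees a single two-byte character and no "s".
char_match = (sjis =~ /s/)   # => nil

puts "valid Shift_JIS: #{sjis.valid_encoding?}"
puts "characters: #{sjis.length}, bytes: #{sjis.bytesize}"
puts "binary match index: #{byte_match.inspect}"
puts "Shift_JIS match index: #{char_match.inspect}"
```

The first match mirrors what irb did when no encoding was in effect. The second roughly mirrors what the s option or $KCODE = "sjis" is meant to achieve on 1.8: the matcher steps through the string one character at a time instead of one byte at a time.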