comp.lang.ruby

Reading Unicode characters?

dare ruby

3/10/2008 3:37:00 AM

Hi friends,

Could anyone help me write a method that reads all the Unicode
characters supported in Ruby, or do the same using regular expressions?

Thanks in advance,


Regards,
Jose Martin
--
Posted via http://www.ruby-....

11 Answers

7stud --

3/10/2008 3:00:00 PM

dare ruby wrote:
> Hi friends,
>
> Could anyone help me write a method that reads all the Unicode
> characters supported in Ruby, or do the same using regular expressions?
>
> Thanks in advance,
>
>
> Regards,
> Jose Martin

Ruby does not support unicode.

James Gray

3/10/2008 3:14:00 PM

On Mar 10, 2008, at 10:00 AM, 7stud -- wrote:

> dare ruby wrote:
>> Hi friends,
>>
>> Could anyone help me write a method that reads all the Unicode
>> characters supported in Ruby, or do the same using regular expressions?
>>
>> Thanks in advance,
>>
>>
>> Regards,
>> Jose Martin
>
> Ruby does not support unicode.

Really?

$ ruby -KU -r jcode -e 'p "Résumé".jsize'
6

James Edward Gray II

dare ruby

3/11/2008 3:16:00 AM

Is there any possibility of using regular expressions or writing my own
methods for Unicode characters?




>>
>> Ruby does not support unicode.
>
> Really?
>
> $ ruby -KU -r jcode -e 'p "Résumé".jsize'
> 6
>
> James Edward Gray II


7stud --

3/11/2008 3:30:00 AM

James Gray wrote:
> On Mar 10, 2008, at 10:00 AM, 7stud -- wrote:
>
>>> Jose Martin
>>
>> Ruby does not support unicode.
>
> Really?
>
> $ ruby -KU -r jcode -e 'p "Résumé".jsize'
> 6
>
> James Edward Gray II

How does that prove that Ruby supports Unicode? Where are there any
Unicode characters in your string?

Lionel Bouton

3/11/2008 10:03:00 AM

7stud -- wrote:
> James Gray wrote:
>> [...]
>> $ ruby -KU -r jcode -e 'p "Résumé".jsize'
>> 6
>>
>> James Edward Gray II
>
> How does that prove that Ruby supports Unicode? Where are there any
> Unicode characters in your string?

1/ There's a difference between code points and characters, so speaking of
Unicode "characters" is confusing at best.

2/ "Supporting Unicode" is probably meaningless (which Unicode encoding,
by the way?). Building UTF-8 applications in Ruby is perfectly doable
thanks to jcode, regex UTF-8 support, and so on. I know: among other
things, it's what I built my company on.

The example above obviously assumes a UTF-8 locale in the terminal where
you type it.
For more detail, try size instead of jsize in the same example and
read jcode's rdoc.
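
To make the code point versus character distinction concrete, here is a minimal sketch. It is not the Ruby 1.8 / jcode setup discussed in this thread: it assumes Ruby 2.5 or later (for `grapheme_clusters`; `unicode_normalize` needs 2.2 or later), and the variable names are illustrative only.

```ruby
# Assumption: Ruby 2.5+ (not the 1.8 + jcode environment from the thread).
# The \u escapes make the strings UTF-8 regardless of the source encoding.
composed   = "\u00E9"    # "é" stored as a single code point
decomposed = "e\u0301"   # "e" followed by COMBINING ACUTE ACCENT

composed_points   = composed.codepoints.length            # 1
decomposed_points = decomposed.codepoints.length          # 2
visible_chars     = decomposed.grapheme_clusters.length   # 1

# NFC normalization folds the two-code-point form into the single one.
same_after_nfc = decomposed.unicode_normalize(:nfc) == composed

puts "code points: #{composed_points} vs #{decomposed_points}"
puts "user-perceived characters in the decomposed form: #{visible_chars}"
puts "equal after NFC normalization: #{same_after_nfc}"
```

Both strings display as the same accented letter, yet they differ in code point count, which is why "number of characters" needs a definition before it can be counted.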

Lionel

James Gray

3/11/2008 12:49:00 PM

On Mar 10, 2008, at 10:29 PM, 7stud -- wrote:

> James Gray wrote:
>> On Mar 10, 2008, at 10:00 AM, 7stud -- wrote:
>>
>>>> Jose Martin
>>>
>>> Ruby does not support unicode.
>>
>> Really?
>>
>> $ ruby -KU -r jcode -e 'p "Résumé".jsize'
>> 6
>>
>> James Edward Gray II
>
> How does that prove that Ruby supports Unicode?

If the code were not character-aware, it would have returned a count of
the bytes in the String (more than six), as String#size does, for example.

> Where are there any unicode characters in your string?

I entered the accented e characters in UTF-8; that's why you see the
-KU switch, which tells Ruby the encoding.
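
The byte count versus character count difference can also be seen without jcode. The following is only a sketch under a stated assumption: Ruby 1.9 or later, where every String carries an encoding, so neither `-KU` nor `jsize` is involved. The variable names are illustrative.

```ruby
# Assumption: Ruby 1.9+ (strings know their encoding; jcode is not needed).
# \u escapes keep the literal UTF-8 whatever the source file encoding is.
word = "R\u00E9sum\u00E9"    # "Résumé"

byte_count = word.bytesize   # bytes in the UTF-8 representation
char_count = word.length     # characters (code points)

puts "bytes: #{byte_count}, characters: #{char_count}"
# prints "bytes: 8, characters: 6"
```

Each accented e takes two bytes in UTF-8, which accounts for the two extra bytes.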

James Edward Gray II


Todd Benson

3/11/2008 4:35:00 PM

On Tue, Mar 11, 2008 at 7:49 AM, James Gray <james@grayproductions.net> wrote:
>
> If the code were not character-aware, it would have returned a count of
> the bytes in the String (more than six), as String#size does, for example.
>
>
> > Where are there any unicode characters in your string?
>
> I entered the accented e characters in UTF-8; that's why you see the
> -KU switch, which tells Ruby the encoding.
>
> James Edward Gray II

I think this may have been discussed before, but -KU doesn't work for
me on Windows XP. I get an unterminated string error with the
"Résumé" UTF-8 encoded string. I can only assume that the parser is
still interpreting the string as one byte per character. Anyone have
any ideas?

Todd

Jimmy Kofler

3/11/2008 4:50:00 PM

> Todd Benson wrote:
> On Tue, Mar 11, 2008 at 7:49 AM, James Gray <james@grayproductions.net>
> wrote:
>> James Edward Gray II
> I think this may have been discussed before, but -KU doesn't work for
> me on Windows XP. I get an unterminated string error with the
> "Résumé" UTF-8 encoded string. I can only assume that the parser is
> still interpreting the string as one byte per character. Anyone have
> any ideas?
>
> Todd

Maybe try a regex-based UTF-8 hack (Ruby 1.8.6) like here:
http://snippets.dzone.com/posts...
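
The link above is truncated, so the following is not that snippet. It is only a rough sketch of the general regex-based idea, assuming Ruby 1.9 or later (the `\p{L}` property is an assumption about the regex engine available there); the `u` flag asks for UTF-8 matching, and the variable names are illustrative.

```ruby
# Assumption: Ruby 1.9+; this is not the code behind the truncated link.
text = "R\u00E9sum\u00E9 123"      # "Résumé 123", written with \u escapes

# /./mu matches one whole UTF-8 character at a time (m lets . match newlines).
chars = text.scan(/./mu)

# \p{L} matches any Unicode letter, accented or not.
letters = text.scan(/\p{L}/u)

puts "#{chars.length} characters, #{letters.length} of them letters"
puts letters.join
```

Scanning with a UTF-8-aware pattern yields whole characters instead of single bytes, which is the behaviour the original question asks for.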

Cheers,
jk

Todd Benson

3/11/2008 5:14:00 PM

On Tue, Mar 11, 2008 at 11:49 AM, Jimmy Kofler <koflerjim@mailinator.com> wrote:

>
> Maybe try a regex-based UTF-8 hack (Ruby 1.8.6) like here:
> http://snippets.dzone.com/posts...
>
> Cheers,
> jk

Thanks for the pointer!

Todd

7stud --

3/11/2008 11:47:00 PM

James Gray wrote:
> On Mar 10, 2008, at 10:29 PM, 7stud -- wrote:
>
>>> 6
>>>
>>> James Edward Gray II
>>
>> How does that prove that Ruby supports Unicode?
>
> If the code were not character-aware, it would have returned a count of
> the bytes in the String (more than six), as String#size does, for example.
>
>> Where are there any unicode characters in your string?
>
> I entered the accented e characters in UTF-8; that's why you see the
> -KU switch, which tells Ruby the encoding.
>
> James Edward Gray II

Ahh, I see. You think UTF-8 is unicode. And apparently you think that
when you enter a UTF-8 character in a post that everyone will see the
character you entered.