comp.lang.ruby

ruby, irb and iconv with translit

Stefan Schmiedl

3/3/2007 5:33:00 PM

Greetings.

I've been trying to find out why sorting a list of German names failed on
both my local Gentoo box and my remote Debian server. Can somebody please
explain the following in simple words?

$ ruby -v
ruby 1.8.5 (2006-12-25 patchlevel 12) [x86_64-linux]
$ irb -v
irb 0.9.5(05/04/13)


$ cat test.rb | irb
>> $KCODE='u'
=> "u"
>> require "iconv"
=> true
>> $conv = Iconv.new("ASCII//TRANSLIT", "UTF-8")
=> #<Iconv:0x2ace62d57090>
>> $arg = "ärger"
=> "ärger"
>> $asc = $conv.iconv($arg)
=> "?rger"
>> puts $asc.size
5

ok, translit fails. This might be a bug somewhere,
but then why does the following work, where I called
iconv interactively, but with the same string?

$ irb -r test.rb
5
>> watch_this = $conv.iconv($arg)
=> "aerger"
>> watch_this.size
=> 6


Thanks,
s.
2 Answers

Stefan Schmiedl

3/3/2007 5:48:00 PM


On Sat, 03 Mar 2007 18:33:18 +0100, Stefan Schmiedl wrote:

> Greetings.
>
> I've been trying to find out why sorting a list of German names failed
> on both my local Gentoo box and my remote Debian server. Can somebody
> please explain the following in simple words?
>
> $ ruby -v
> ruby 1.8.5 (2006-12-25 patchlevel 12) [x86_64-linux]
> $ irb -v
> irb 0.9.5(05/04/13)
>
>
> $ cat test.rb | irb
>>> $KCODE='u'
> => "u"
>>> require "iconv"
> => true
>>> $conv = Iconv.new("ASCII//TRANSLIT", "UTF-8")
> => #<Iconv:0x2ace62d57090>
>>> $arg = "ärger"
> => "ärger"
>>> $asc = $conv.iconv($arg)
> => "?rger"
>>> puts $asc.size
> 5
>
> ok, translit fails. This might be a bug somewhere, but then why does the
> following work, where I called iconv interactively, but with the same
> string?
>
> $ irb -r test.rb
> 5
>>> watch_this = $conv.iconv($arg)
> => "aerger"
>>> watch_this.size
> => 6
>

Another facet of the problem:

$ irb -r test
5
$ irb
>> require "test"
6
=> true

I'd really like to know what irb does to make iconv behave...
s.

Stefan Schmiedl

3/3/2007 7:47:00 PM


On Sat, 03 Mar 2007 18:47:44 +0100, Stefan Schmiedl wrote:

>>
>> I've been trying to find out why sorting a list of German names failed
>> on both my local Gentoo box and my remote Debian server. Can somebody
>> please explain the following in simple words?
>>

The folks at #ruby-de helped me out with some brain waves: the
problem originates with the locale settings. To wit:

$ echo Ärger | LC_CTYPE=de iconv -f utf8 -t ascii//translit
?rger
$ echo Ärger | LC_CTYPE=de_DE iconv -f utf8 -t ascii//translit
AErger

But while you can get the iconv tool to behave by setting LC_CTYPE,
there's no such luck with ruby:

$ LC_CTYPE=C ruby test.rb
"?rger"
$ LC_CTYPE=de_DE ruby test.rb
"?rger"
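
A plausible reading, which the thread does not confirm, is that the ruby 1.8 interpreter never applies the environment's locale itself, so LC_CTYPE has no effect on the Iconv extension, whereas the iconv tool does apply it. One way around that is to delegate to the tool from Ruby. The sketch below assumes a current Ruby (Open3.capture2 is not available in 1.8.5) and an iconv executable on the PATH; the names locale_env, ICONV_COMMAND and translit_via_cli are made up for this illustration.

```ruby
require "open3"

# Hypothetical helper: environment for the child process.
# LC_ALL is unset (a nil value removes the variable) because it would
# otherwise take precedence over LC_CTYPE.
def locale_env(lc_ctype)
  { "LC_ALL" => nil, "LC_CTYPE" => lc_ctype }
end

# Same conversion as in the shell session above.
ICONV_COMMAND = ["iconv", "-f", "UTF-8", "-t", "ASCII//TRANSLIT"].freeze

# Pipes text through the iconv tool under the given LC_CTYPE.
# Returns nil if iconv is missing or exits with an error status.
def translit_via_cli(text, lc_ctype)
  out, status = Open3.capture2(locale_env(lc_ctype), *ICONV_COMMAND, stdin_data: text)
  status.success? ? out.chomp : nil
rescue Errno::ENOENT
  nil
end

if __FILE__ == $PROGRAM_NAME
  # Compare the C locale with de_DE; the result depends on which
  # locales are installed on the machine.
  %w[C de_DE].each do |locale|
    p [locale, translit_via_cli("Ärger", locale)]
  end
end
```

Spawning a process per string is slow, so for a long list of names it would be better to pipe the whole list through a single call.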

For the record, test.rb (saved as utf8) looks like:

$KCODE='u'
require "iconv"
$conv = Iconv.new("ASCII//TRANSLIT", "UTF-8")
p($conv.iconv("Ärger"))
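
Since the actual goal was sorting German names, a locale-independent workaround is to skip iconv and build the sort key in Ruby. This is only a sketch under assumptions: the mapping table is the common ae/oe/ue/ss expansion chosen for this example (not something taken from glibc or from the thread), the helper names are hypothetical, and String#gsub with a hash argument needs Ruby 1.9 or later.

```ruby
# Assumed transliteration table; extend it if the data contains other
# accented characters.
GERMAN_TRANSLIT = {
  "ä" => "ae", "ö" => "oe", "ü" => "ue",
  "Ä" => "Ae", "Ö" => "Oe", "Ü" => "Ue",
  "ß" => "ss"
}.freeze

# Builds a lowercase ASCII key so that "Ärger" sorts as "aerger".
def german_sort_key(name)
  name.gsub(/[äöüÄÖÜß]/, GERMAN_TRANSLIT).downcase
end

# Sorts by the transliterated key and returns the original strings.
def sort_german(names)
  names.sort_by { |name| german_sort_key(name) }
end

if __FILE__ == $PROGRAM_NAME
  p sort_german(["Zimmer", "Ärger", "Adler", "Özil", "Otto"])
end
```

Names whose keys are identical (for example "Müller" and "Mueller") may come out in either order, because sort_by is not guaranteed to be stable.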

s.