comp.lang.ruby
Help with Iconv needed
Marcus Strube
11/29/2007 12:51:00 PM
Can someone tell me what it is that I'm getting wrong here with "iconv"?
I either get "IllegalSequence" or "äöüß" are not encoded properly when
using Iconv.conv while it looks good using backticks. ("IllegalSequence"
right now with the second. "ÄÖü" with the first anytime...)
require 'rss/1.0'; require 'rss/2.0'; require 'open-uri'; require "iconv"
#source = "http://www.sueddeutsche.de/app/service/rss/alles/rss...
source = "http://www.welt.de/vermischtes/?service...
content = ""; open(source) { |s| content = s.read }
rss = RSS::Parser.parse(content, false)
rss.items.each do |item|
  converted = `'#{item.title}' | iconv -c -f ISO-8859-1 -t UTF8`
  puts(Iconv.conv('ISO-8859-1', 'UTF-8', item.title)); puts " "
end
--
Posted via http://www.ruby-...
1 Answer
MonkeeSage
11/30/2007
On Nov 29, 6:50 am, Marcus Strube <marcus.str...@gmx.net> wrote:
> Can someone tell me what it is that I'm getting wrong here with "iconv"?
> I either get "IllegalSequence" or "äöüß" are not encoded properly when
> using Iconv.conv while it looks good using backticks. ("IllegalSequence
> right now with the second. ÄÖü with the first anytime...)
>
> require 'rss/1.0'; require 'rss/2.0'; require 'open-uri'; require "iconv"
>
> #source = "http://www.sueddeutsche.de/app/service/rss/alles/rss...
> source = "http://www.welt.de/vermischtes/?service...
>
> content = ""; open(source) { |s| content = s.read }
> rss = RSS::Parser.parse(content, false)
>
> rss.items.each do |item|
>   converted = `'#{item.title}' | iconv -c -f ISO-8859-1 -t UTF8`
>   puts(Iconv.conv('ISO-8859-1', 'UTF-8', item.title)); puts " "
> end
> --
> Posted via http://www.ruby-...
Not sure about the error, but I see two issues. First, this is an
error...
`'#{item.title}' | iconv -c -f ISO-8859-1 -t UTF8`
I think you meant to echo the value to the pipe...
`echo -n '#{item.title}' | iconv -c -f ISO-8859-1 -t UTF8`
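Worth noting as well: the two lines in the loop do not perform the same conversion. Iconv.conv takes the target encoding first, as in Iconv.conv(to, from, str), so Iconv.conv('ISO-8859-1', 'UTF-8', item.title) goes from UTF-8 to ISO-8859-1, the reverse of the command line. Here is a minimal sketch of the two directions. It uses String#encode (Ruby 1.9 and later) as a stand-in, since iconv is no longer bundled with newer Rubies, and the sample string is made up rather than taken from the feed:

```ruby
# Hypothetical stand-in for item.title: "Düsseldorf" as ISO-8859-1 bytes
# ("\xFC" is u-umlaut in Latin-1). .b marks the string as raw bytes.
latin1_title = "D\xFCsseldorf".b

# Same direction as `iconv -f ISO-8859-1 -t UTF8`:
# String#encode takes (destination, source), like Iconv.conv(to, from, str).
to_utf8 = latin1_title.encode("UTF-8", "ISO-8859-1")

# Reverse direction, as in Iconv.conv('ISO-8859-1', 'UTF-8', ...):
# the input is read as UTF-8, and a lone 0xFC byte is not valid UTF-8.
reverse_error = begin
  latin1_title.encode("ISO-8859-1", "UTF-8")
  nil
rescue Encoding::InvalidByteSequenceError => e
  e.class
end

puts to_utf8
puts reverse_error.inspect
```

If the input really is ISO-8859-1, the reverse direction fails on the first non-ASCII byte. That could be the counterpart of the IllegalSequence error from Iconv, though this is a guess from the symptoms.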
Second, ISO-8859-1 to UTF-8 doesn't appear to be the right conversion
for this feed. The following string...
Düsseldorf: Prominentengedrängel bei der Bambi-Verleihung
...is encoded as...
"D\303\203\302\274sseldorf: Prominentengedr\303\203\302\244ngel bei der Bambi-Verleihung"
...by iconv from the command prompt. But it should be...
"D\303\274sseldorf: Prominentengedr\303\244ngel bei der Bambi-Verleihung"
I'm not good with encodings and utf-8, so I can't tell you the
problem. I just know "umlaut u" should be 0xc3bc (\303\274), but it's
not doing that.
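For what it's worth, that byte pattern is what double encoding would produce. If the feed is already UTF-8 (an assumption; nothing in the thread confirms it), treating its bytes as ISO-8859-1 and converting to UTF-8 again turns \303\274 into \303\203\302\274. A minimal sketch, using String#force_encoding and String#encode (Ruby 1.9 and later) instead of iconv, with illustrative variable names:

```ruby
# "D\u00FCsseldorf" is "Düsseldorf" as UTF-8; u-umlaut is the two bytes 0xC3 0xBC.
utf8 = "D\u00FCsseldorf"

# Relabel the UTF-8 bytes as ISO-8859-1 (no bytes change), then convert to UTF-8.
# Each of the two bytes is now treated as its own Latin-1 character.
double = utf8.dup.force_encoding("ISO-8859-1").encode("UTF-8")

# Undo it: convert back to ISO-8859-1 and relabel the result as UTF-8.
repaired = double.encode("ISO-8859-1").force_encoding("UTF-8")

# Print the bytes in octal so they can be compared with the escapes above.
puts utf8.bytes[1, 2].map { |b| format("%o", b) }.join(" ")
puts double.bytes[1, 4].map { |b| format("%o", b) }.join(" ")
puts repaired == utf8
```

This prints 303 274, then 303 203 302 274, then true. If that is indeed what is happening, the titles need no conversion at all before being printed on a UTF-8 terminal.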
Regards,
Jordan