Wilson Bilkovich
12/21/2005 3:55:00 PM
On 12/21/05, Paul Duncan <pabs@pablotron.org> wrote:
> * Andreas S. (f@andreas-s.net) wrote:
> [snipped]
> > I dislike that Iconv raises an exception when it finds characters it can
> > not convert. I would prefer if it could be made to ignore invalid
> > characters and just try to make the best of the text.
>
> Seconded, Thirded, and Quadrupled.
>
> Iconv needs an "as close as I could get with transliteration and ignoring
> invalid characters" mode.
>
> We're doing something comparable in Raggle by trapping the exception and
> stripping out the invalid character. Obviously this doesn't work
> properly for multibyte characters, and you won't be able to use a lookup
> table for arbitrary source encodings, but it's a start.
>
<snip interesting code>
What if String just had a couple of new methods on it:
String#transcode(from_encoding, to_encoding)
..and
String#transcode!(from_encoding, to_encoding)
..and the "modifies receiver" version returned true or false,
depending on whether it managed to convert every character?
Then you could do:
unless some_string.transcode!('Shift-JIS', 'UTF-8')
  puts "Some characters got mangle-fied!"
end
Is that a mess? I kinda like it, at first glance.
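
To make the idea a bit more concrete, here is a rough sketch of how the two methods might behave. It is only an illustration under some assumptions: it uses String#encode (available in Ruby 1.9 and later) rather than Iconv, the method names transcode and transcode! are just the hypothetical ones proposed above, and replacing anything unconvertible with "?" is my guess at what "make the best of the text" would mean.

```ruby
# Hypothetical sketch: String#transcode and String#transcode! are NOT part of
# Ruby. They are built here on String#encode (Ruby 1.9+), not on Iconv.
class String
  # Returns a new string converted from from_encoding to to_encoding.
  # Bytes or characters that cannot be converted become "?" (an assumption).
  def transcode(from_encoding, to_encoding)
    dup.force_encoding(from_encoding)
       .encode(to_encoding, invalid: :replace, undef: :replace, replace: '?')
  end

  # Converts the receiver in place. Returns true if the strict conversion
  # succeeded, false if the lossy fallback had to be used.
  # Limitation: when both encodings are the same, String#encode does no
  # validation, so this sketch reports true without checking the bytes.
  def transcode!(from_encoding, to_encoding)
    source = dup.force_encoding(from_encoding)
    begin
      converted = source.encode(to_encoding) # strict: raises on bad input
      clean = true
    rescue Encoding::InvalidByteSequenceError,
           Encoding::UndefinedConversionError
      converted = source.encode(to_encoding,
                                invalid: :replace, undef: :replace,
                                replace: '?')
      clean = false
    end
    replace(converted) # swaps in the new contents and encoding
    clean
  end
end

text = String.new("caf\u00E9")
unless text.transcode!('UTF-8', 'US-ASCII')
  puts "Some characters got mangle-fied!"
end
```

One caveat on names: as far as I know, String#encode spells the encoding 'Shift_JIS' with an underscore, so the 'Shift-JIS' spelling from the example above may need adjusting if you try it with this sketch.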