Mike Durham
10/23/2006 2:21:00 AM
Wilson Bilkovich wrote:
> On 10/22/06, Mike Durham <mdurham@people.net.au> wrote:
>> James Edward Gray II wrote:
>> > On Oct 22, 2006, at 7:30 AM, ilhamik wrote:
>> >
>> >> Thanks Peter, it works fine.
>> >
>> > You missed Tim Bray's RubyConf talk. According to him, we should never
>> > be using the case-changing methods. "Just don't do it!" ;)
>> >
>> > James Edward Gray II
>> >
>> Why not? What reason did he give?
>
> The problem is that proper upcasing and downcasing of characters is
> locale-dependent, not just encoding or language-dependent.
>
> As examples, he mentioned that the uppercase version of accented
> characters varies from area to area in France. Also, in Turkish,
> there are four different cases of 'i', not just two, and which is
> correct depends on the jurisdiction.
> Determining the locale in a correct way is really, really hard. Tim
> Bray says it's basically impossible. Also, all of these rules make
> any decent upcase/downcase function ruinously slow.
>
> He shared a story about the original version of XML. At the time, it
> was case-insensitive. The very first XML library was running horribly
> slow. After profiling, they found that it was spending 90% of its
> time in the Java downcase routine. After that, XML was made
> case-sensitive.
>
Thanks Wilson, that explains everything. I'd never thought about
problems like that.
Cheers, Mike
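
For anyone who wants to see the Turkish 'i' issue concretely, here is a minimal sketch. It assumes Ruby 2.4 or later, where String#upcase and String#downcase accept a :turkic option; the Ruby versions current when this thread was written have no such option. The method name case_variants_of_i is made up for illustration.

```ruby
# A sketch only: assumes Ruby 2.4+ (the :turkic option does not exist in
# older Rubies). The method name is hypothetical, chosen for this example.
def case_variants_of_i
  {
    default_upcase:   "i".upcase,             # plain Latin capital I
    turkic_upcase:    "i".upcase(:turkic),    # capital I with dot above (U+0130)
    default_downcase: "I".downcase,           # plain Latin small i
    turkic_downcase:  "I".downcase(:turkic)   # dotless small i (U+0131)
  }
end

case_variants_of_i.each do |label, str|
  # Print the code points so the difference is visible even in a terminal
  # font that renders the dotted and dotless letters alike.
  code_points = str.codepoints.map { |cp| format("U+%04X", cp) }.join(" ")
  puts "#{label}: #{str} (#{code_points})"
end
```

The same input letter maps to a different result depending on which rules are requested, and nothing in the string itself says which rules are the right ones. That choice has to come from the caller, which is the locale problem Wilson describes.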