Oliver Cromm
3/6/2006 4:18:00 AM
rio4ruby wrote:
> James Edward Gray II wrote:
>> On Feb 28, 2006, at 7:59 PM, Oliver Cromm wrote:
>>
>>> The speed difference looks too extreme to me:
>>>
>>>
>>> caps = []
>>> File.open('caps_u8.dic').each {|line| caps << line.split(';')[0]}
>>>
>>> => 1.8 seconds
>>
>> Here you are rolling your own split.
>>
>>> require 'rio'
>>> caps = rio('caps_u8.dic').csv(";").columns(0)[].flatten
>>> p caps
>>>
>>> => 50.9 seconds
>>
>
> This is a false comparison. The speedy code will not properly parse
> many CSV files.
I didn't claim they are equivalent in general, but for the purpose at
hand they are. And in this case I wouldn't have cared if one version
took 5 times as long, but more than 25 times (50.9 s vs. 1.8 s) is not
practical. A speed difference like that would easily justify, say,
15 more minutes of programming, in which I could cover a lot of cases.
> For example, the following is a legal line from a CSV file:
>
> "Field 1","Hello, World", "Field 3"
I doubt that a split(/\",\s*\"/) (plus necessary adjustments) would be
much slower.
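
A minimal sketch of that idea, assuming every field is double-quoted
and no field contains an embedded quote or a line break. The name
split_quoted is made up for illustration, and the "necessary
adjustments" here are just trimming the outer quotes:

```ruby
# Hypothetical helper, not a general CSV parser: it only handles lines
# where every field is wrapped in double quotes and contains no
# embedded double quotes.
def split_quoted(line)
  # Split on the quote-comma-quote boundary between two fields.
  fields = line.strip.split(/",\s*"/)
  return fields if fields.empty?

  # The split leaves the opening quote on the first field and the
  # closing quote on the last one, so remove them.
  fields[0]  = fields[0].sub(/\A"/, '')
  fields[-1] = fields[-1].sub(/"\z/, '')
  fields
end

p split_quoted('"Field 1","Hello, World", "Field 3"')[0]  # prints "Field 1"
```

The comma inside "Hello, World" is left alone because the pattern only
matches a comma that sits between two quote characters.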
--
Oliver C.
The main demographic target group of "Focus" is men from the
upper middle class between 40 and 65 (IQ, not age).
Andreas Kabel in de.etc.sprache.deutsch