James Gray
1/10/2008 1:45:00 PM
On Jan 10, 2008, at 2:15 AM, Nicko Kaltner wrote:
> I've found a bug in the 1.9.0 CSV parser. I've got a script and data
> that effectively breaks it, but it is 567kb. Is that too large for
> this list?
Probably, but you are welcome to email it to me privately. I maintain
the CSV library in Ruby 1.9.
> The Ruby instance takes 100% of CPU while processing this file, and
> I had to stop it after 5 minutes.
I'm 95% sure this is an issue of your CSV file being invalid. Because
of the way the format works, a simple file like:

"…10 Gigs of data without another quote…

can only be detected as invalid by reading the entire thing. I've
considered putting limits in the library to control how far it would
read ahead, but those would break some valid data. Then the problem
becomes: do I make the limits default to on? That's the only way they
would have helped here, but it would break some otherwise valid code.
It's a tough problem to solve correctly.
James Edward Gray II