
comp.lang.ruby

Serious YAML bug

Frantisek Fuka

6/4/2006 12:56:00 AM

Hello,

I am using YAML to store around 100KB of strings (just simple arrays
and hashes), and I found that the "unpacking" of YAML data stops after
4096 (or so) characters of the YAML file without raising any errors: the
parsed data just ends in the middle of a string. It has nothing
to do with disk operations; it happens even when I read the whole YAML
file into a string and then call "YAML.parse(longstring)". Maybe it
has something to do with the fact that my strings contain many
non-ASCII characters?

Here is the file (zipped) that gives me this problem:
http://fuxoft.cz/tes...

I am using the latest Ruby included in the Dapper Drake Linux distro, and
the YAML::Syck version is "0.60".

12 Answers

James Herdman

6/4/2006 4:42:00 AM

What version of Ruby comes with Dapper? I installed it on a system the
other day and found that I needed to upgrade. I believe Dapper comes
with 1.8.2, which appears to be somewhat problematic. The process for
apt-getting Ruby 1.8.4 is a little tricky, and long enough that I can't
remember it off the top of my head.

Can you post your code?

James H


Frantisek Fuka

6/4/2006 9:13:00 AM

Version is: ruby 1.8.4 (2005-12-24) [i486-linux]

The code is rather simple:

str=File.open(fname).read
puts str #prints 100KB of data
data=YAML.parse(str)
puts data.emit #prints 4KB of data

However, using "data = YAML.load(File.open(fname))" (which I used
originally) produces the same error.


Robert Klemme

6/4/2006 10:10:00 AM

Frantisek Fuka wrote:
> Version is: ruby 1.8.4 (2005-12-24) [i486-linux]
>
> The code is rather simple:
>
> str=File.open(fname).read

Note that you do not close the file descriptor properly, which may cause
problems if you access that file later in the same process. Better to
use

str = File.read fname

> puts str #prints 100KB of data
> data=YAML.parse(str)

You probably used the wrong method:

irb(main):030:0> YAML.parse( "foo".to_yaml )
=> #<YAML::Syck::Scalar:0x488b388>
irb(main):031:0> YAML.load( "foo".to_yaml )
=> "foo"

> puts data.emit #prints 4KB of data
>
> However, using "data = YAML.load(File.open(fname))" (which I used
> originally) produces the same error.

Again, here you do not close the file handle properly. When reading
from a file you can also use

data = YAML.load_file fname
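
A minimal sketch of the difference between these three calls. It assumes a current Ruby, where the yaml library is backed by Psych rather than Syck, so the class printed for the parsed tree will differ from the irb output above. The sample data and file name are made up for illustration:

```ruby
require 'yaml'
require 'tmpdir'

# Hypothetical sample data; string keys keep it loadable under
# Psych's stricter defaults.
sample = { 'subtitles' => ['first line', 'second line'] }
text = sample.to_yaml

# YAML.parse returns a node tree describing the document,
# not the Ruby data itself.
tree = YAML.parse(text)

# YAML.load returns the plain Ruby objects.
data = YAML.load(text)

# YAML.load_file opens, reads and closes the file for you.
from_file = Dir.mktmpdir do |dir|
  path = File.join(dir, 'sample.yaml')
  File.write(path, text)
  YAML.load_file(path)
end

puts tree.class
puts data.inspect
puts from_file.inspect
```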

Cheers

robert

ts

6/4/2006 10:25:00 AM

>>>>> "F" == Frantisek Fuka <fuxoft@gmail.com> writes:

F> Version is: ruby 1.8.4 (2005-12-24) [i486-linux]

It's fixed in CVS

svg% ./ruby -v ~/b.rb
ruby 1.8.4 (2005-12-24) [i686-linux]
:subtitles
"WARNER BROS. uv\303\241d\303\255"
"Ud\304\233lala jsem chybu. V\303\255m, \305\276"
svg%

svg% ./ruby -v ~/b.rb
ruby 1.8.4 (2006-06-02) [i686-linux]
:subtitles
"WARNER BROS. uv\303\241d\303\255"
""
:saved_at
Fri May 26 19:25:06 CEST 2006
:custom_splits
{"Po \304\215lov\304\233ku, kter\303\275 mi p\305\231ipomn\304\233l p\303\241t\303\275 listopad."=>"Po \304\215lov\304\233ku, kter\303\275 mi\np\305\231ipomn\304\233l p\303\241t\303\275 listopad.", "A mn\304\233 se nest\303\275sk\303\241 po ideji, ale po \304\215lov\304\233ku."=>"A mn\304\233 se nest\303\275sk\303\241\npo ideji, ale po \304\215lov\304\233ku."}
svg%


--

Guy Decoux

Ross Bamford

6/4/2006 10:34:00 AM

On Sun, 04 Jun 2006 11:09:58 +0100, Robert Klemme <bob.news@gmx.net> wrote:

> Frantisek Fuka wrote:
>> Version is: ruby 1.8.4 (2005-12-24) [i486-linux]
>> The code is rather simple:
>> str=File.open(fname).read
>
> Again, here you do not close the file handle properly. When reading
> from a file you can also use
>
> data = YAML.load_file fname
>

Hmm, the behaviour I'm seeing here would suggest this to be a bug. With
load_file, sometimes it works, sometimes it doesn't. Loading the file in
Ruby and passing it in always fails.

$ ruby -v
ruby 1.8.4 (2005-12-24) [i686-linux]

$ irb -ryaml
YAML::Syck::VERSION
# => "0.60"

YAML::load(File.read('test.yaml'))
ArgumentError: syntax error on line 50, col 57: `C3\xBD, Denisi."
- ""
- "Z\xC3\xA1\xC5\x99n\xC3\xB'
from /usr/local/lib/ruby/1.8/yaml.rb:133:in `load'
from /usr/local/lib/ruby/1.8/yaml.rb:133:in `load'
from (irb):3

YAML::load_file('test.yaml')
ArgumentError: syntax error on line 50, col 57: `C3\xBD, Denisi."
- ""
- "Z\xC3\xA1\xC5\x99n\xC3\xB'
from /usr/local/lib/ruby/1.8/yaml.rb:133:in `load'
from /usr/local/lib/ruby/1.8/yaml.rb:133:in `load'
from /usr/local/lib/ruby/1.8/yaml.rb:144:in `load_file'
from /usr/local/lib/ruby/1.8/yaml.rb:143:in `load_file'
from (irb):5

YAML::load_file('test.yaml')
# => {:subtitles=>[ ... 49 elements ... ]}

YAML::load_file('test.yaml')
ArgumentError: syntax error on line 50, col 57: `C3\xBD, Denisi."
- ""
- "Z\xC3\xA1\xC5\x99n\xC3\xB'
from /usr/local/lib/ruby/1.8/yaml.rb:133:in `load'
from /usr/local/lib/ruby/1.8/yaml.rb:133:in `load'
from /usr/local/lib/ruby/1.8/yaml.rb:144:in `load_file'
from /usr/local/lib/ruby/1.8/yaml.rb:143:in `load_file'
from (irb):7

YAML::load_file('test.yaml')
# => {:subtitles=>[ ... 49 elements ... ]}

YAML::load_file('test.yaml')
# => {:subtitles=>[ ... 49 elements ... ]}

YAML::load_file('test.yaml')
# => {:subtitles=>[ ... 49 elements ... ]}
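
For anyone who wants to check whether their own build shows the same intermittent behaviour, here is a rough sketch that loads one file repeatedly and tallies how each attempt ends. The helper name `tally_loads` and the attempt count are made up for illustration; pass the path to the YAML file as the first argument:

```ruby
require 'yaml'

# Load the same file `attempts` times and count how each attempt ends:
# 'ok' for a successful load, otherwise the exception class name.
def tally_loads(path, attempts)
  results = Hash.new(0)
  attempts.times do
    begin
      YAML.load_file(path)
      results['ok'] += 1
    rescue StandardError => e
      results[e.class.name] += 1
    end
  end
  results
end

# Only runs when a file name is given on the command line.
p tally_loads(ARGV[0], 20) if ARGV[0]
```

A healthy build should report only 'ok'. Note that this only catches loads that raise; it does not detect the silent truncation from the original report.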

--
Ross Bamford - rosco@roscopeco.remove.co.uk

Frantisek Fuka

6/4/2006 10:59:00 AM

ts wrote:
> It's fixed in cvs

Glad to hear that. How exactly does the updating/packaging work? When
can I expect this to be fixed in Ubuntu? Or can I use some sort of
workaround meanwhile? I plan to use this code on several Ubuntu
machines.

ts

6/4/2006 11:08:00 AM

>>>>> "F" == Frantisek Fuka <fuxoft@gmail.com> writes:

F> Glad to hear that. How exactly does the updating/packaging work? When
F> can I expect this to be fixed in Ubuntu? Or can I use some sort of
F> workaround meanwhile? I plan to use this code on several Ubuntu
F> machines.

Well, you can install it from CVS:

cvs -d :pserver:anonymous@cvs.ruby-lang.org:/src login

Just press RETURN when it asks for a password, then

cvs -z4 -d :pserver:anonymous@cvs.ruby-lang.org:/src co -r ruby_1_8 ruby

(without -r ruby_1_8, it will retrieve ruby 1.9)

cd ruby
autoconf
./configure
make
sudo make install

It will be installed in /usr/local/bin, or you can run

./configure --prefix=some_dir

if you want to put it in some_dir.



--

Guy Decoux

Frantisek Fuka

6/4/2006 11:21:00 AM

Thanks. The thing is, the code has to be used on several machines that
don't have a development environment on them (i.e. cannot compile) by
people who don't have admin rights there. I guess I'll have to wait
until it's automatically updated by Ubuntu. In the meantime, is there
some way I can get around this bug in my Ruby code? Maybe by redefining
some method in YAML?


ts

6/4/2006 11:56:00 AM

>>>>> "F" == Frantisek Fuka <fuxoft@gmail.com> writes:

F> Thanks. The thing is, the code has to be used on several machines that
F> don't have development environment on them (e.g. cannot compile) by
F> people who don't have admin rights there. I guess I'll have to wait
F> until it's automatically updated by Ubuntu. In the meantime - is there
F> some way I can get around this bug in my Ruby code? Maybe redefining
F> some method in Yaml??

Well, YAML::load_file calls a C method. Perhaps the problem is in syck (the
C extension); I've not looked at it.


--

Guy Decoux

Francis Hwang

6/5/2006 6:58:00 PM

YAML in Ruby relies on Syck, which is pretty remarkable but not yet, in
my experience, stable enough to hold large amounts of data, or data
that changes frequently. And _why's page on Syck (
http://whytheluckystiff... ) points out that Unicode support is
also not very good yet.

If you need to stick with YAML, I'd recommend getting Ruby from CVS
head; it's supposed to contain a newer version of Syck. If you can get
away with it, though, I'd suggest using Marshal instead. It's less cool
than YAML, but pretty solid otherwise.
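
A minimal sketch of the Marshal route, assuming data shaped like the subtitles hash in this thread. The sample strings and the file name are made up for illustration, and File.binwrite / File.binread assume a reasonably recent Ruby:

```ruby
require 'tmpdir'

# Hypothetical data, similar in shape to the subtitles file.
data = {
  :subtitles => ['WARNER BROS. uvádí', 'second line'],
  :custom_splits => { 'one long line' => "one long\nline" }
}

restored = Dir.mktmpdir do |dir|
  path = File.join(dir, 'subtitles.dat')

  # Marshal output is binary, so write and read it in binary mode.
  File.binwrite(path, Marshal.dump(data))
  Marshal.load(File.binread(path))
end

puts restored == data
```

The trade-off is that the file is no longer human-readable, the format is specific to Ruby, and Marshal.load should only be used on data you trust.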

Francis Hwang
http://f...

