Asp Forum - Marshal's handling of floats

Brian Palmer

7/9/2006 4:32:00 AM

I was thinking about writing a patch to modify how Marshal handles
floats. Right now it dumps them using sprintf(3) and stores the
resulting string in the Marshal stream. I'd like to see it handle
floats the same way that Array#pack does:

[400.53].pack('g').length == 4
[400.53].pack('G').length == 8

while

Marshal.dump(400.53).length - 3 == 22
(and is slower, to boot)

I want to make sure, though, that this would be an acceptable patch.
I can't think why it would be OK for Array#pack to work this way and
not Marshal, but is there any particular reason why it can't be done?
Obviously it would break backwards compatibility with older Marshal
dumps, but I don't think they're often used for long-term storage,
are they?
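
A minimal sketch for reproducing the size comparison above. The Marshal byte count depends on the Ruby version (the 22 bytes quoted here come from the 2006 interpreter), so the printed Marshal size is illustrative only; the variable names are made up for this example.

```ruby
value = 400.53

single = [value].pack('g')    # IEEE 754 single precision, big-endian
double = [value].pack('G')    # IEEE 754 double precision, big-endian
dumped = Marshal.dump(value)  # 2-byte version header, 'f' type byte, then the payload

puts "pack('g'):    #{single.bytesize} bytes"
puts "pack('G'):    #{double.bytesize} bytes"
puts "Marshal.dump: #{dumped.bytesize} bytes total"

# Marshal still round-trips the value exactly, whatever its size.
puts "round trip ok: #{Marshal.load(dumped) == value}"
```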

-- Brian Palmer

13 Answers

Ara.T.Howard

7/9/2006 5:40:00 AM

Brian Palmer

7/9/2006 7:22:00 AM

Hi Ara,

The 'g' and 'G' flags for Array#pack/String#unpack are in network
byte order, so they're in a platform-independent format, as far as I
know. I actually tested this by packing a couple thousand floats on
my Mac, sending them in UDP packets and unpacking them on my AMD64
desktop; they all came across correctly.
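
A small sketch of the byte-order point, assuming an IEEE 754 host: 'G' always writes big-endian (network order), 'E' always writes little-endian, and 'D' uses whatever the host CPU uses. Only 'D' would differ between a PowerPC Mac and an AMD64 box.

```ruby
value = 400.53

big    = [value].pack('G')  # big-endian ("network order") double
little = [value].pack('E')  # little-endian double
native = [value].pack('D')  # host byte order, not portable on the wire

puts "G: #{big.unpack('H*').first}"
puts "E: #{little.unpack('H*').first}"

# The two fixed-order encodings are byte-for-byte mirrors of each other.
puts "mirrored:      #{big == little.reverse}"
# Unpacking with the same directive restores the exact double on any host.
puts "round trip ok: #{big.unpack('G').first == value}"
```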

-- Brian

On Jul 8, 2006, at 11:40 PM, ara.t.howard@noaa.gov wrote:

> On Sun, 9 Jul 2006, Brian Palmer wrote:
>
>> I was thinking about writing a patch to modify how Marshal handles
>> floats, right now it dumps them using sprintf(3) and stores the
>> resulting string in the Marshal stream. I'd like to see it handle
>> floats the same way that Array#pack does:
>>
>> [400.53].pack('g').length == 4
>> [400.53].pack('G').length == 8
>>
>> while
>>
>> Marshal.dump(400.53).length - 3 == 22
>> (and is slower, to boot)
>>
>> I want to make sure, though, that this would be an acceptable
>> patch. I can't think why it would be OK for Array#pack to work
>> this way and not Marshal, but is there any particular reason why
>> it can't be done? Obviously it would break backwards compatibility
>> with older Marshal dumps, but I don't think they're often used for
>> long-term storage, are they?
>>
>> -- Brian Palmer
>
> i've never tried to use marshaled data across a big and little
> endian machine - but this would break it. consider drb: if you had
> a mac and a linux box talking on the wire you might see
>
> harp:~ > ruby -e' puts [1.44417819733316e-41].pack("g").reverse.unpack("g")[0].to_i '
> 42
>
> which could be confusing. then again maybe i'm overlooking something.
>
> cheers.
>
> -a
> --
> suffering increases your inner strength. also, the wishing for
> suffering
> makes the suffering disappear.
> - h.h. the 14th dali lama
>

Yukihiro Matsumoto

7/9/2006 11:52:00 PM

Hi,

In message "Re: Marshal's handling of floats"
on Sun, 9 Jul 2006 13:31:59 +0900, Brian Palmer <rubytalk@brian.codekitchen.net> writes:

|I was thinking about writing a patch to modify how Marshal handles
|floats, right now it dumps them using sprintf(3) and stores the
|resulting string in the Marshal stream. I'd like to see it handle
|floats the same way that Array#pack does:
|
|[400.53].pack('g').length == 4
|[400.53].pack('G').length == 8
|
|while
|
|Marshal.dump(400.53).length - 3 == 22
|(and is slower, to boot)
|
|I want to make sure, though, that this would be an acceptable patch.

There are issues:

* pack('g') would not work on non-IEEE floating machines.
* changing marshal format in incompatible way causes a lot of
troubles, so that it should be avoided if possible.

I think we can merge it for 1.9 (if we address IEEE754 issue).

matz.

Brian Palmer

7/10/2006 12:13:00 AM

On Jul 9, 2006, at 5:52 PM, Yukihiro Matsumoto wrote:

> Hi,
>
> In message "Re: Marshal's handling of floats"
> on Sun, 9 Jul 2006 13:31:59 +0900, Brian Palmer
> <rubytalk@brian.codekitchen.net> writes:
>
> |I was thinking about writing a patch to modify how Marshal handles
> |floats, right now it dumps them using sprintf(3) and stores the
> |resulting string in the Marshal stream. I'd like to see it handle
> |floats the same way that Array#pack does:
> |
> |[400.53].pack('g').length == 4
> |[400.53].pack('G').length == 8
> |
> |while
> |
> |Marshal.dump(400.53).length - 3 == 22
> |(and is slower, to boot)
> |
> |I want to make sure, though, that this would be an acceptable patch.
>
> There are issues:
>
> * pack('g') would not work on non-IEEE floating machines.
> * changing marshal format in incompatible way causes a lot of
> troubles, so that it should be avoided if possible.
>
> I think we can merge it for 1.9 (if we address IEEE754 issue).
>
> matz.
>

Yes, that's a biggie. I didn't realize that ruby compiled on non-IEEE
machines, but it makes sense now that I think about it. I think this
is out of my league; I suppose it would require integrating a
floating-point emulation library into ruby on such platforms, and
having that library handle the packing/marshaling, or even using that
library to back all Float objects on such platforms. I think that for
my purposes it makes more sense to just write a separate Marshal-type
extension library, since I only plan to target IA32, IA64 and Apple
G4/G5.

Thanks for the response!

-- Brian

Nobuyoshi Nakada

7/10/2006 1:13:00 AM

Hi,

At Mon, 10 Jul 2006 09:12:32 +0900,
Brian Palmer wrote in [ruby-talk:201018]:
> Yes, that's a biggie. I didn't realize that ruby compiled on non-IEEE
> machines, but it makes sense now that I think about it. I think this
> is out of my league, I suppose it would require integrating a
> floating-point emulation library into ruby on such platforms, and
> having that library handle the packing/marshaling, or even using that
> library to back all Float objects on such platforms. I think that for
> my purposes it makes more sense to just write a separate Marshal-type
> extension library, since I only plan to target IA32, IA64 and Apple
> G4/G5.

What's the reason for your proposal?

If precision is the issue, I'd rather propose to represent
floating points in hexadecimal format.

--
Nobu Nakada

Brian Palmer

7/10/2006 1:57:00 AM

Hey,

On Jul 9, 2006, at 7:13 PM, nobu@ruby-lang.org wrote:

> Hi,
>
> At Mon, 10 Jul 2006 09:12:32 +0900,
> Brian Palmer wrote in [ruby-talk:201018]:
>> Yes, that's a biggie. I didn't realize that ruby compiled on non-IEEE
>> machines, but it makes sense now that I think about it. I think this
>> is out of my league, I suppose it would require integrating a
>> floating-point emulation library into ruby on such platforms, and
>> having that library handle the packing/marshaling, or even using that
>> library to back all Float objects on such platforms. I think that for
>> my purposes it makes more sense to just write a separate Marshal-type
>> extension library, since I only plan to target IA32, IA64 and Apple
>> G4/G5.
>
> What's the reason of your proposal?
>
> If it is for precision issue, rather I'd suppose to represent
> floating points in hexadecimal format.
>
> --
> Nobu Nakada
>

Actually, I'm not terribly concerned with precision, but rather with
speed and size. The application I'm building needs to communicate
data over a wireless network every 100 milliseconds, and it needs to
use as little bandwidth as possible. I was considering just using the
Marshal methods, but Marshal's floating-point representation killed
that idea, though for other Ruby built-in data types it seems to be
quite good, even truncating small integers to bytes and shorts. I've
decided that I only need a couple decimal places of accuracy, though,
so I'm just going to multiply each float by 100 and then convert it
to an int.
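
The scaling approach described above could be sketched roughly as follows. The helper names and the choice of a signed 32-bit big-endian field ('l>') are assumptions for illustration, not something stated in the thread; with a scale of 100 that field limits values to roughly +/-21 million.

```ruby
# Hypothetical helpers: keep two decimal places by scaling to an integer.
def encode_fixed(values, scale = 100)
  # round (rather than truncate) so 400.53 * 100 = 40052.99... still becomes 40053
  values.map { |v| (v * scale).round }.pack('l>*')
end

def decode_fixed(bytes, scale = 100)
  bytes.unpack('l>*').map { |i| i / scale.to_f }
end

readings = [400.53, -12.3456, 0.0]
packet   = encode_fixed(readings)

puts "#{packet.bytesize} bytes for #{readings.size} values"
p decode_fixed(packet)
```

Anything beyond the second decimal place is lost, so this only suits data where that precision is known to be enough.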

Out of curiosity, what do you mean by 'represent floating points in
hexadecimal format'? I've never heard of that before.

-- Brian Palmer

M. Edward (Ed) Borasky

7/10/2006 2:26:00 AM

Yukihiro Matsumoto wrote:
> There are issues:
>
> * pack('g') would not work on non-IEEE floating machines.
> * changing marshal format in incompatible way causes a lot of
> troubles, so that it should be avoided if possible.
>
> I think we can merge it for 1.9 (if we address IEEE754 issue).
>
> matz.
>
A couple of questions:

1. How does XML handle floats?
2. How does YAML handle floats?
3. What is the format of a float when dumped using Marshal?

My recommendation would be to dump floats as the hexadecimal
representation of IEEE 64-bit formatted numbers. This is "almost
universal" and occupies only 16 bytes. The alternative, dumping them in
decimal in some "scientific" notation, takes more bytes and loses small,
but noticeable, accuracy. Moreover, it does not capture IEEE's "Inf" and
"NaN" values, which are very much part of the semantics and syntax of
modern numeric processing.

Non-IEEE architectures are very much the exception rather than the rule,
and they can be expected to dump IEEE hex and read IEEE hex as a penalty
for not adopting the standard. :)
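
One way to picture the 16-character hex form described above, assuming an IEEE 754 host; the helper names are hypothetical. It writes the 64 bits in big-endian order as hex digits, which also preserves Inf and NaN.

```ruby
# Hypothetical helpers: IEEE 754 double <-> 16 hex digits.
def float_to_hex(value)
  [value].pack('G').unpack('H*').first
end

def hex_to_float(hex)
  [hex].pack('H*').unpack('G').first
end

puts float_to_hex(1.0)              # 3ff0000000000000
puts float_to_hex(Float::INFINITY)  # 7ff0000000000000
puts float_to_hex(400.53).length    # 16

puts hex_to_float(float_to_hex(400.53)) == 400.53
puts hex_to_float(float_to_hex(Float::NAN)).nan?
```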

--
M. Edward (Ed) Borasky

http://linuxcapacitypl...

Joel VanderWerf

7/10/2006 3:43:00 AM

Brian Palmer wrote:
> Actually, I'm not terribly concerned with precision, but rather with
> speed and size. The application I'm building needs to communicate data
> over a wireless network every 100 milliseconds, and it needs to use as
> little bandwidth as possible. I was considering just using the Marshal
> methods, but Marshal's floating-point representation killed that idea,
> though for other Ruby built-in data types it seems to be quite good,
> even truncating small integers to bytes and shorts. I've decided that I
> only need a couple decimal places of accuracy, though, so I'm just going
> to multiply each float by 100 and then convert it to an int.

If you don't need to send arbitrary (marshallable) ruby objects, but
only fairly well-defined struct-like packets, then you might find this
helpful. It's built on top of pack/unpack, but it has a "dsl" flavor and
makes it easier to work with odd-length bit fields, for example:

http://redshift.sourceforge.net/b...

(I wrote this for purposes like yours: to send data over a constrained
wireless network, and also to communicate easily with non-ruby code, so
Marshal was not an option.)
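
For comparison, a fixed-layout packet can also be sketched with plain pack/unpack. This is not the API of the library linked above; the field names and layout are made up for illustration.

```ruby
# Hypothetical layout: 16-bit sequence number, 8-bit flags, two single floats.
# 'n' = unsigned 16-bit big-endian, 'C' = unsigned 8-bit, 'g' = big-endian single.
def build_packet(seq, flags, x, y)
  [seq, flags, x, y].pack('nCgg')
end

def parse_packet(bytes)
  seq, flags, x, y = bytes.unpack('nCgg')
  { seq: seq, flags: flags, x: x, y: y }
end

# 1.5 and -2.25 are exactly representable in single precision,
# so they survive the trip unchanged; most decimals would be rounded.
packet = build_packet(42, 0b101, 1.5, -2.25)
puts "#{packet.bytesize} bytes"
p parse_packet(packet)
```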

--
vjoel : Joel VanderWerf : path berkeley edu : 510 665 3407

Yukihiro Matsumoto

7/10/2006 5:10:00 AM

Hi,

In message "Re: Marshal's handling of floats"
on Mon, 10 Jul 2006 11:26:25 +0900, "M. Edward (Ed) Borasky" <znmeb@cesmail.net> writes:

|1. How does XML handle floats?
|2. How does YAML handle floats?
|3. What is the format of a float when dumped using Marshal?

Currently all use human-readable decimal string representation.

matz.

Nobuyoshi Nakada

7/10/2006 6:43:00 AM

Hi,

At Mon, 10 Jul 2006 10:56:55 +0900,
Brian Palmer wrote in [ruby-talk:201027]:
> Out of curiosity, what do you mean by 'represent floating points in
> hexadecimal format'? I've never heard of that before.

Introduced in C99, for example 0xaaaa.ccccP+10.
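
A short sketch of that notation from the Ruby side, assuming a Ruby recent enough that format accepts '%a' and Kernel#Float accepts hexadecimal floats. The mantissa is written in hex and the exponent after 'p' is a power of two, so no decimal rounding is involved.

```ruby
value = 400.53

text = format('%a', value)  # C99-style hexadecimal float string
puts text

# Kernel#Float parses the hex form back; String#to_f does not.
puts Float(text) == value

# 0x1.8 is 1.5 in hex, and p+1 multiplies by 2**1.
puts Float('0x1.8p+1')
```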

--
Nobu Nakada
