comp.lang.ruby

[RCR] digest/from_io

Michael Neumann

10/14/2004 3:54:00 PM

Hi,

It would be nice to have this in standard Ruby, as calculating the
digest of a file is a common task.

Example of usage:

require 'digest/sha1'
require 'digest/from_io'

File.open('/tmp/x') {|f|
  Digest::SHA1.from_io(f)
}

Implementation:

# file: digest/from_io.rb
class Digest::Base
  def self.from_io(io, block_size=8*1024)
    digest = new
    while data = io.read(block_size)
      digest.update(data)
    end
    digest
  end
end
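
As a quick sanity check of the chunked approach, here is a minimal sketch that runs the same method against an in-memory StringIO instead of a file and compares the result with a one-shot digest of the same data. The payload and its size are assumptions chosen only so that the read loop runs more than once.

```ruby
require 'digest/sha1'
require 'stringio'

# Same approach as the proposal: feed the digest in fixed-size chunks.
class Digest::Base
  def self.from_io(io, block_size = 8 * 1024)
    digest = new
    while (data = io.read(block_size))
      digest.update(data)
    end
    digest
  end
end

payload = 'x' * 20_000 # illustrative data spanning several 8 KiB blocks
chunked = Digest::SHA1.from_io(StringIO.new(payload))

# The chunked digest should match hashing the whole string at once.
puts chunked.hexdigest == Digest::SHA1.hexdigest(payload)
```

Anything that responds to #read(length) and returns nil at end of input should work the same way.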

Another addition would be the raw_digest method (which of course could
be better implemented in C):

require 'enumerator'
class Digest::Base
  def raw_digest
    hexdigest.to_enum(:scan, /../).map {|byte| byte.to_i(16).chr}.join
  end
  alias rawdigest raw_digest
end

Gavin, feel free to add it to extlib/addlib if you think it's worthwhile.

matz, do you think we can add from_io to stdlib?

Regards,

Michael


4 Answers

Gavin Sinclair

10/15/2004 12:48:00 AM

On Friday, October 15, 2004, 1:53:37 AM, Michael wrote:

> Gavin, feel free to add it to extlib/addlib if you think it's worthwhile.

Yeah, sounds like a good idea.

Gavin



nobu.nokada

10/15/2004 3:03:00 AM

Hi,

At Fri, 15 Oct 2004 00:53:37 +0900,
Michael Neumann wrote in [ruby-talk:116637]:
> Implementation:
>
> # file: digest/from_io.rb
> class Digest::Base
>   def self.from_io(io, block_size=8*1024)
>     digest = new
>     while data = io.read(block_size)
>       digest.update(data)
>     end
>     digest
>   end
> end

Another implementation could be:

def Digest::Base.from(src)
  digest = new
  src.each(&digest.method(:update))
  digest
end

This requires an #each method instead of #read. Which do you think
is better?

> Another addition would be the raw_digest method (which of course could
> be better implemented in C):
>
> require 'enumerator'
> class Digest::Base
>   def raw_digest
>     hexdigest.to_enum(:scan, /../).map {|byte| byte.to_i(16).chr}.join
>   end
>   alias rawdigest raw_digest
> end

It is equivalent to Digest::Base#digest.
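
A minimal sketch of that equivalence, assuming only the standard digest library: converting the hex string back into bytes gives the same string that #digest returns. The input 'hello' is an arbitrary example value.

```ruby
require 'digest/sha1'

d = Digest::SHA1.new
d.update('hello') # arbitrary example input

# Packing the hex string back into raw bytes reproduces #digest.
raw_from_hex = [d.hexdigest].pack('H*')
puts raw_from_hex == d.digest

# SHA-1 produces a 20-byte (160-bit) raw digest.
puts d.digest.bytesize
```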

--
Nobu Nakada


Michael Neumann

10/16/2004 10:21:00 AM

On Fri, Oct 15, 2004 at 12:03:17PM +0900, Nobuyoshi Nakada wrote:
> Hi,
>
> At Fri, 15 Oct 2004 00:53:37 +0900,
> Michael Neumann wrote in [ruby-talk:116637]:
> > Implementation:
> >
> > # file: digest/from_io.rb
> > class Digest::Base
> >   def self.from_io(io, block_size=8*1024)
> >     digest = new
> >     while data = io.read(block_size)
> >       digest.update(data)
> >     end
> >     digest
> >   end
> > end
>
> Another implementation could be:
>
> def Digest::Base.from(src)
>   digest = new
>   src.each(&digest.method(:update))
>   digest
> end
>
> This requires an #each method instead of #read. Which do you think
> is better?

What if #each does not yield strings? Does #update work for all Ruby
objects? Personally I like #from_io more, as the way it works is more
natural.

What if #from took more arguments, like this:

Digest.from(io, :each_chunk, blk_sz = 10000, bytes = 1_000_000)
Digest.from(io, :each_line)

This would be a far more general solution, and just as simple to implement.
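
A rough sketch of how such a generalized method might look. It is written as a free-standing helper called digest_from, which is a hypothetical name and not part of the digest library. The :each_chunk iterator from the example above does not exist on IO, so the sketch uses :each_line and :each_char, which do.

```ruby
require 'digest/sha1'
require 'stringio'

# Hypothetical helper, not part of the digest library: iterate over src
# with the named method and feed every yielded chunk to a new digest.
def digest_from(digest_class, src, iterator = :each, *args)
  digest = digest_class.new
  src.public_send(iterator, *args) { |chunk| digest.update(chunk) }
  digest
end

text = "first line\nsecond line\n" # illustrative input
by_line = digest_from(Digest::SHA1, StringIO.new(text), :each_line)
by_char = digest_from(Digest::SHA1, StringIO.new(text), :each_char)

# How the input is split does not matter, as long as every byte is fed
# to the digest in order.
puts by_line.hexdigest == by_char.hexdigest
```

Extra arguments are passed through to the iterator, so something like digest_from(Digest::SHA1, io, :each_line, 4096) would cap each chunk at 4096 bytes.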

> > Another addition would be the raw_digest method (which of course could
> > be better implemented in C):
> >
> > require 'enumerator'
> > class Digest::Base
> >   def raw_digest
> >     hexdigest.to_enum(:scan, /../).map {|byte| byte.to_i(16).chr}.join
> >   end
> >   alias rawdigest raw_digest
> > end
>
> It is equivalent to Digest::Base#digest.

Oh, thanks.

Regards,

Michael


Austin Ziegler

10/16/2004 2:14:00 PM

On Sat, 16 Oct 2004 19:20:58 +0900, Michael Neumann <mneumann@ntecs.de> wrote:
> On Fri, Oct 15, 2004 at 12:03:17PM +0900, Nobuyoshi Nakada wrote:
> > Hi,
> >
> > At Fri, 15 Oct 2004 00:53:37 +0900,
> > Michael Neumann wrote in [ruby-talk:116637]:
> > > Implementation:
> > >
> > > # file: digest/from_io.rb
> > > class Digest::Base
> > >   def self.from_io(io, block_size=8*1024)
> > >     digest = new
> > >     while data = io.read(block_size)
> > >       digest.update(data)
> > >     end
> > >     digest
> > >   end
> > > end
> >
> > Another implementation could be:
> >
> > def Digest::Base.from(src)
> >   digest = new
> >   src.each(&digest.method(:update))
> >   digest
> > end
> >
> > This requires an #each method instead of #read. Which do you think
> > is better?
> What if #each does not yield strings? Does #update work for all Ruby
> objects? Personally I like #from_io more, as the way it works is more
> natural.

#update should work on all Digest::Base subclasses. The
src.each(&digest.method(:update)) line is roughly equivalent to:

src.each { |data| digest.update(data) }

Digest#update methods should work with individual bytes or strings,
shouldn't they?

It also depends on what Digest#update does to the implied data value;
if it calls #to_s or #to_str, then arrays of strings or other values
could be dealt with very easily. To me, that would be as or more
useful than limiting it to IO objects; I might want to generate a
digest from the result of IO::readlines (an array of strings).
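
To make the array case concrete, here is a minimal sketch of the #each-based variant written as a free-standing helper (digest_from_each is a hypothetical name). The second half rests on an assumption about current CRuby rather than anything stated in this thread: as far as I know, #update there accepts strings or objects that respond to #to_str, and raises TypeError for anything else.

```ruby
require 'digest/sha1'

# Hypothetical helper mirroring the #each-based proposal: any object
# whose #each yields strings can serve as the source.
def digest_from_each(digest_class, src)
  digest = digest_class.new
  src.each { |data| digest.update(data) }
  digest
end

lines = ["alpha\n", "beta\n"] # shaped like the result of IO.readlines
puts digest_from_each(Digest::SHA1, lines).hexdigest ==
     Digest::SHA1.hexdigest(lines.join)

# Assumption: non-string elements are rejected rather than converted
# with #to_s.
begin
  digest_from_each(Digest::SHA1, [1, 2, 3])
rescue TypeError => e
  puts "TypeError: #{e.message}"
end
```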

-austin
--
Austin Ziegler * halostatue@gmail.com
* Alternate: austin@halostatue.ca
: as of this email, I have [ 5 ] Gmail invitations