Asp Forum - subclass of hash, that times last read

Boris Schmid

5/28/2008 6:29:00 PM

Hi all. I'm rather unfamiliar with the subclass / superclass principle,
but I just tried to write one. The result is below here, and should be a
hash-like class, with the exception that it stores the last-accessed
time (the purpose of this being that I can more easily delete entries
from the hash that haven't been used in a while - I'm using a hash for
caching a costly function, but over time the hash grows big and slow,
with a lot of old entries in it.. I'm trying out different ways of
pruning the hash, and this seemed to be an interesting option).

The below code seems to work fine, but it would be a comfort to me to
know if I'm introducing several hairy bug-prone things if I add this to
my code, or that this is the relatively safe and correct way to do it.

class Myhash < Hash
alias :old_write :[]=
def []=(key,value)
old_write(key,[value,Time.now])
end

alias :old_read :[]
def [](key)
value = old_read(key)
old_write(key,[value[0],Time.now]) if value.respond_to?(:[])
old_read(key)
end
end

hash = Myhash.new
1000.times {hash[rand] = rand}
sleep 7
1000.times {hash[rand] = rand}
hash.delete_if {|k,v| Time.now - v[1] > 5}

Which should delete the 1000 entries from the hash that were created in
the first loop.
--
Posted via http://www.ruby-....

2 Answers

David Masover

5/28/2008 9:42:00 PM

On Wednesday 28 May 2008 13:28:58 Boris Schmid wrote:
> Hi all. I'm rather unfamiliar with the subclass / superclass principle,
> but I just tried to write one. The result is below here, and should be a
> hash-like class, with the exception that it stores the last-accessed
> time (the purpose of this being that I can more easily delete entries
> from the hash that haven't been used in a while - I'm using a hash for
> caching a costly function, but over time the hash grows big and slow,
> with a lot of old entries in it.. I'm trying out different ways of
> pruning the hash, and this seemed to be an interesting option).
>
> The below code seems to work fine, but it would be a comfort to me to
> know if I'm introducing several hairy bug-prone things if I add this to
> my code, or that this is the relatively safe and correct way to do it.

The only bug I've found so far is that it won't work if the user provides a
default value for the hash.

But allow me to nitpick a bit:

> class Myhash < Hash
> alias :old_write :[]=
> def []=(key,value)
> old_write(key,[value,Time.now])
> end

I would avoid aliasing things out of the way. It pollutes the namespace -- you
now have an old_write method, which is really only used by your one writer
method.

And I think a conventional way to do that is to choose names that reflect what
you're doing to change it -- for example, here, you'd choose something
like :write_without_myhash. (Imagine someone inherits from the stack, and you
both do the exact same thing, and define a method called old_write. You'll
create an infinite loop.)

Since you're in a subclass anyway, though, you get this for free:

def []=(key,value)
super(key,[value,Time.now])
end

> alias :old_read :[]
> def [](key)
> value = old_read(key)
> old_write(key,[value[0],Time.now]) if value.respond_to?(:[])
> old_read(key)
> end
> end

The same can be done here, reducing some code duplication:

def [](key)
value = super(key)
value[1] = Time.now if self.include? key
value
end

If that makes sense, you can skip the part where I take five times as long to
explain it...

I'm setting value[1] to Time.now. This will stay, because it's still the same
array object -- I don't have to change anything in the hash.

I'm checking for self.include?, because this will tell me if that particular
key has been assigned yet. If it hasn't, we're dealing with a default value,
and I'm not sure quite what you want to do -- probably store it and assign a
timestamp anyway? If so:

def [](key)
value = super(key)
if self.include? key
value[1] = Time.now
else
self[key] = value
value = self[key]
end
value
end

And since you already have the result of super(key) in a value, I'm returning
that value, rather than looking it up again, when I an. Because it's still
the same array, it will have the new timestamp, which is what your old code
did.

> hash.delete_if {|k,v| Time.now - v[1] > 5}

You might consider putting this in your class. For example:

def delete_before(time)
delete_if { |k,v| v.last < time }
end
hash.delete_before(Time.now - 5)

Maybe even:

def delete_older_than(seconds)
delete_before(Time.now - seconds)
end
hash.delete_older_than(5)

Dave Bass

5/29/2008 11:20:00 AM

My approach would be quite different. I'd use files to store the values
("costly functions"), with the name of the file being determined by the
key. (A sort of mini-database I suppose.)

It's very easy to determine when a file was last accessed, so you could
have a separate process running in the background looking for files to
delete. When I did this, I used a cron job, but other approaches are
possible of course.

File accesses are going to be slower than hash lookups, but whether this
is important or not depends on your application. And if some keys are
used very frequently, the associated files should get cached in memory
by the OS, so they should be quick to read.

Of course you could wrap this in a class called MyHash. ;-)

--
Posted via http://www.ruby-....

comp.lang.ruby

subclass of hash, that times last read

Boris Schmid

David Masover

Dave Bass

x Login to ForumsZone