Ryan Davis
2/9/2007 12:51:00 AM
On Feb 7, 2007, at 9:21 AM, Mark Alexander Friedgan wrote:
> We've been struggling with this problem for months. We use TupleSpace
> to implement a distributed processing framework, and periodically, if
> the number of objects in the TupleBag gets large (not ridiculous, but
> 50,000 or so), the TupleSpace begins to take 100% CPU and is
> effectively neutered and must be restarted. We've been trying to come
> up with a better implementation of TupleBag but have not had much luck
> so far. Has anyone else done this?
What's happening is that you've populated the hash enough to saturate
its bins, so you're now hitting a lot of hash collisions. You can
either improve the hash method on the tuples (possibly... it really
depends on what you're storing), or switch the underlying data
structure (a balanced tree may suit you better at large tuple
populations).
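To illustrate the first option, here's a minimal sketch of giving a
tuple object a better hash method. The Job class and its fields are
made up for the example; the point is just that #hash should mix all
the distinguishing fields (and #eql? must agree with it) so keys
spread across bins instead of colliding:

```ruby
# Hypothetical tuple wrapper -- not part of Rinda itself.
class Job
  attr_reader :type, :id

  def initialize(type, id)
    @type = type
    @id   = id
  end

  # Combine every distinguishing field so keys spread across
  # hash bins instead of all landing in the same bucket.
  def hash
    [type, id].hash
  end

  # #eql? must be consistent with #hash for Hash lookups to work.
  def eql?(other)
    other.is_a?(Job) && type == other.type && id == other.id
  end
end
```

If the default hash is degenerate for your tuples (e.g. everything
hashes on one mostly-constant field), a change like this alone can
restore O(1)-ish lookups.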
Or, if you're using patterns a la Gelernter, you might be able to
pre-partition your spaces into multiple bags based on activities or
specialists (or job or task type). Sounds like you've probably
already tried that, though.
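For what the pre-partitioning idea looks like in code, here's a rough
sketch using stock Rinda: one TupleSpace per job type, so each
internal TupleBag stays small. The :render type and the helper names
are illustrative, not anything from Rinda:

```ruby
require 'rinda/tuplespace'

# One space per job type; lazily created on first use.
# (Example partitioning scheme -- pick keys that match your workload.)
SPACES = Hash.new { |h, type| h[type] = Rinda::TupleSpace.new }

def write_job(type, payload)
  SPACES[type].write([type, payload])
end

def take_job(type)
  # nil in a Rinda template matches any value in that position.
  SPACES[type].take([type, nil])
end
```

Workers that only handle one task type then block on their own space,
and a slow bag for one type no longer drags down the others.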