Asp Forum - Accessing ruby objects across VMs

Piyush Ranjan

4/20/2009 10:13:00 PM

I have one Ruby instance running for a website. It does heavy polling against
in-memory data, and I now need to run a few more instances. All my data is
stored in a MySQL DB. Right now, whenever an entry is made to this DB (which
contains only a single table), I cache the data in memory, and that serves my
purpose. Now that traffic is increasing I need to start more Ruby instances.
Is there any way to store this data in a single in-memory location and access
it from all the other VMs?

A few options come to mind:
1. Storing it in memcache
2. Storing it in localmemcache
The problem with both of these options is that I need to store heavily
nested objects. If I do that in memcache, I would have to use something
like Marshal dump and load, which is a little slow for my liking.
Another option would be to keep separate copies in all VMs and refresh them
with an HTTP GET call to each VM whenever the data changes in the DB.

Now the question is: is there a way to access the data in a shared manner
between all the VMs? Or am I missing something very obvious here? Or is it
not doable?

Piyush Ranjan

4 Answers

Joel VanderWerf

4/20/2009 11:59:00 PM

AFAIK there's no way for two VMs to share live, writable object storage.

Looks like the choice is between deserializing on every read (opts 1 and
2) vs. deserializing on every write (your "other option"). So if reads
outnumber writes, as they often do, the other option might be better.
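
To make that trade-off concrete, here is a minimal sketch. It assumes a plain Hash standing in for memcache, and the poll record and key names are made up for illustration; nothing here is from the original poster's code.

```ruby
# A made-up nested record, standing in for the poster's cached data.
poll = { id: 1, question: 'Ship it?', votes: { 'yes' => 3, 'no' => 1 } }

# Options 1 and 2: the shared store holds bytes, so every read pays
# for Marshal.load. (A Hash stands in for memcache here.)
byte_store = {}
byte_store['poll:1'] = Marshal.dump(poll)          # serialize on write
read_from_store = Marshal.load(byte_store['poll:1']) # deserialize on EVERY read

# The "other option": each process keeps a live object and only
# deserializes when a change notification arrives.
local_cache = {}
incoming_update = Marshal.dump(poll)                # what a writer would send
local_cache['poll:1'] = Marshal.load(incoming_update) # deserialize once per write
read_from_local = local_cache['poll:1']             # reads are a plain Hash lookup
```

With many reads per write, the second approach does far fewer Marshal.load calls, at the cost of one copy of the data per process.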

You could use drb (which does marshal over sockets) instead of http for
this. Probably faster, and certainly easier to use. Use the
drbunix:/path/to/socket protocol instead of tcp, and it will be even
faster (maybe 50%, in my experience).
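
A rough sketch of the drbunix setup, under some assumptions: the SharedCache class, its put/get methods, and the socket path are hypothetical names chosen for this example. Both halves are in one script only to keep it self-contained. In that case DRb can dispatch the call locally, so run the client half in a second process to actually exercise the socket.

```ruby
require 'drb/drb'
require 'drb/unix'   # adds the drbunix: protocol
require 'tmpdir'

# Hypothetical holder for the cached rows; a Mutex guards concurrent callers.
class SharedCache
  def initialize
    @data = {}
    @lock = Mutex.new
  end

  def put(key, value)
    @lock.synchronize { @data[key] = value }
    true
  end

  def get(key)
    @lock.synchronize { @data[key] }
  end
end

# Assumed socket location; any writable path works.
socket_path = File.join(Dir.mktmpdir, 'cache.sock')
uri = "drbunix:#{socket_path}"

# Server half: the process that owns the data.
DRb.start_service(uri, SharedCache.new)

# Client half: normally a different Ruby process using the same URI.
cache = DRbObject.new_with_uri(uri)
cache.put(:poll_1, { 'votes' => { 'yes' => 3, 'no' => 1 } })
result = cache.get(:poll_1)

DRb.stop_service
```

Arguments and return values that cross the socket are marshalled, so the nested Hash arrives in the client as a copy.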

YMMV, and standard cautions against premature optimization apply....

--
vjoel : Joel VanderWerf : path berkeley edu : 510 665 3407

Piyush Ranjan

4/21/2009 12:12:00 AM

Thanks Joel.
DRB seems a better option but I think Marshal load dump seems easier even
though there is a performance degradation.
Optimization is premature but I have to pull it off because there is limited
hardware available to me for this project.
Thanks
Piyush

Joel VanderWerf

4/21/2009 12:55:00 AM

Piyush Ranjan wrote:
> Thanks Joel.
> DRB seems a better option but I think Marshal load dump seems easier even
> though there is a performance degradation.

Marshal.load and Marshal.dump _are_ what drb is doing.

Robert Klemme

4/21/2009 7:27:00 AM

2009/4/21 Piyush Ranjan <piyush.pr@gmail.com>:

> DRB seems a better option but I think Marshal load dump seems easier even
> though there is a performance degradation.

I am not sure what you mean by this. As Joel pointed out, DRb does use
Marshal. You just need to decide which objects are remotely accessible
and which should be sent over the wire.
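
A small sketch of that distinction, assuming two made-up classes (Snapshot and LiveCounter). As far as I know, including DRb::DRbUndumped makes an object refuse to be marshalled, and DRb then hands callers a remote reference instead of a copy. The code below only checks the Marshal side of that, without starting a server.

```ruby
require 'drb/drb'

# Plain data: marshalled and copied to the caller.
Snapshot = Struct.new(:rows)

# Stays in the owning process; remote callers would get a proxy to it.
class LiveCounter
  include DRb::DRbUndumped

  def initialize
    @count = 0
  end

  def increment
    @count += 1
  end
end

# A Snapshot survives a dump/load round trip as an independent copy.
original = Snapshot.new([{ 'id' => 1 }])
copy = Marshal.load(Marshal.dump(original))

# A DRbUndumped object cannot be dumped at all.
undumpable = begin
  Marshal.dump(LiveCounter.new)
  false
rescue TypeError
  true
end
```

Copies are cheap to read but go stale; remote references are always current but every method call is a round trip.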

If you have a heavily concurrent application with frequent updates you
must watch out for all sorts of consistency issues. It may turn out
that it is better to use the database for maintaining integrity and do
per-process caching of the relevant data from the DB.
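
One way per-process caching could look, as a sketch only: ProcessLocalCache and its methods are hypothetical, the loader block stands in for the real MySQL query, and the time-based expiry is my assumption rather than anything suggested in the thread.

```ruby
# Hypothetical per-process cache: the database stays the source of truth,
# and each process reloads its copy after ttl_seconds or on invalidate!.
class ProcessLocalCache
  def initialize(ttl_seconds, &loader)
    @ttl = ttl_seconds
    @loader = loader
    @lock = Mutex.new
    @loaded_at = nil
    @data = nil
  end

  # Returns the cached data, reloading it first if it is missing or stale.
  def fetch
    @lock.synchronize do
      now = Process.clock_gettime(Process::CLOCK_MONOTONIC)
      if @loaded_at.nil? || now - @loaded_at >= @ttl
        @data = @loader.call
        @loaded_at = now
      end
      @data
    end
  end

  # Forces the next fetch to reload, e.g. after a change notification.
  def invalidate!
    @lock.synchronize { @loaded_at = nil }
  end
end

loads = 0
cache = ProcessLocalCache.new(60) do
  loads += 1
  { 'poll:1' => { 'yes' => 3, 'no' => 1 } } # stand-in for a SELECT on the table
end

first = cache.fetch    # loads from the "database"
second = cache.fetch   # served from memory
cache.invalidate!
third = cache.fetch    # reloads
```

The expiry bounds how stale a process can get without any cross-process messaging; invalidate! is there for the case where a writer does notify the other processes.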

Kind regards

robert

--
remember.guy do |as, often| as.you_can - without end
http://blog.rubybestprac...
