Asp Forum - BackgrounDRb Theory

subsume@gmail.com

5/27/2007 6:15:00 PM

I'm making a site which allows subscribers to scrape their unnamed
social network profile for changes regularly using BackgrounDRb. I can:

A) Create 1 worker which will update all subscriber profiles at once. I
like this because it's simple. I don't like this because it will create a
server-intensive traffic spike.

B) Instantiate a new worker which will run every 24 hours after a person
signs up. I like this because it eliminates the spike but still updates
everyone. But I'm not sure what effect having a worker running for each
user will have on server memory.

C) Other?

Comments please.

--
Posted via http://www.ruby-....

8 Answers

brabuhr

5/27/2007 7:01:00 PM

On 5/27/07, Sy Ys <subsume@gmail.com> wrote:
> I'm making a site which allows subscribers to scrape their unnamed
> social network profile for changes regularly using BackgrounDRb. I can:
>
> A) Create 1 worker which will update all subscriber profiles at once. I
> like this because it's simple. I don't like this because it will create a
> server-intensive traffic spike.
>
> B) Instantiate a new worker which will run every 24 hours after a person
> signs up. I like this because it eliminates the spike but still updates
> everyone. But I'm not sure what effect having a worker running for each
> user will have on server memory.
>
> C) Other?

C) Create a pool of n workers which will pull jobs from a queue (maybe
also look at something like Ruby Queue[1]).

[1] http://codeforpeople.com/lib/ruby/rq...
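
A minimal sketch of the pool idea, for illustration only. It uses plain Ruby threads and the standard Queue class as a stand-in; it is not BackgrounDRb or rq code, and update_profile, run_pool, and the pool size of 3 are hypothetical names and values chosen for the example.

```ruby
require 'thread'

# Hypothetical stand-in for scraping one subscriber's profile.
def update_profile(subscriber_id)
  "updated #{subscriber_id}"
end

# Push every job through a fixed number of workers instead of
# starting one worker per subscriber.
def run_pool(subscriber_ids, pool_size)
  jobs = Queue.new
  subscriber_ids.each { |id| jobs << id }
  pool_size.times { jobs << :stop }   # one stop marker per worker

  results = Queue.new
  workers = Array.new(pool_size) do
    Thread.new do
      # Each worker keeps pulling jobs until it sees a stop marker.
      while (id = jobs.pop) != :stop
        results << update_profile(id)
      end
    end
  end
  workers.each(&:join)

  # Drain the results queue into a plain array.
  Array.new(results.size) { results.pop }
end

done = run_pool((1..10).to_a, 3)
puts done.size
```

The number of workers stays fixed no matter how many subscribers sign up, so memory use depends on the pool size rather than on the user count.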

Ezra Zygmuntowicz

5/27/2007 8:52:00 PM

On May 27, 2007, at 12:01 PM, brabuhr@gmail.com wrote:

> On 5/27/07, Sy Ys <subsume@gmail.com> wrote:
>> I'm making a site which allows subscribers to scrape their unnamed
>> social network profile for changes regularly using BackgrounDRb. I
>> can:
>>
>> A) Create 1 worker which will update all subscriber profiles at
>> once. I
>> like this because it's simple. I don't like this because it will
>> create a
>> server-intensive traffic spike.
>>
>> B) Instantiate a new worker which will run every 24 hours after a
>> person
>> signs up. I like this because it eliminates the spike but still
>> updates
>> everyone. But I'm not sure what effect having a worker running
>> for each
>> user will have on server memory.
>>
>> C) Other?
>
> C) Create a pool of n workers which will pull jobs from a queue (maybe
> also look at something like Ruby Queue[1]).
>
> [1] http://codeforpeople.com/lib/ruby/rq...
>

Yeah I have the best luck with backgroundrb when I run a set number
of immortal workers that just loop and pull jobs from a queue.

Cheers-
-- Ezra Zygmuntowicz
-- Lead Rails Evangelist
-- ez@engineyard.com
-- Engine Yard, Serious Rails Hosting
-- (866) 518-YARD (9273)

subsume@gmail.com

5/27/2007 8:56:00 PM

Ezra Zygmuntowicz wrote:
> Yeah I have the best luck with backgroundrb when I run a set number
> of immortal workers that just loop and pull jobs from a queue.

How is this different from B? Why would rq be necessary for this?


Ezra Zygmuntowicz

5/27/2007 9:14:00 PM

On May 27, 2007, at 1:56 PM, Sy Ys wrote:

> Ezra Zygmuntowicz wrote:
>> Yeah I have the best luck with backgroundrb when I run a set number
>> of immortal workers that just loop and pull jobs from a queue.
>
> How is this different from B? Why would rq be necessary for this?
>
> --
> Posted via http://www.ruby-....
>

> B) Instantiate a new worker which will run every 24 hours after a person
> signs up. I like this because it eliminates the spike but still updates
> everyone. But I'm not sure what effect having a worker running for each
> user will have on server memory.

You definitely don't want one worker running for each user. Each
worker is a new process and takes up resources. Try it first with
just one worker that gets started on bdrb server start and loops,
works on jobs in the queue, then sleeps and does it again.

Here is a simple example. You may want to add pending and executed
flags to the queue items:

class PublishWorker < BackgrounDRb::Worker::Base
  def do_work(args={})
    loop do
      # fetch up to 20 queued items per pass
      urls_to_publish = PublishQueue.find(:all, :limit => 20)
      urls_to_publish.each do |url|
        # code here to work with urls
      end
      # pause between passes so the queue is not polled in a tight loop
      sleep args[:sleep]
    end
  end
end

PublishWorker.register
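
One possible shape for the pending and executed flags mentioned above, as a sketch only. It keeps the queue items in memory instead of in a database table, and QueueItem, claim_batch, and finish are hypothetical names that are not part of BackgrounDRb or of the PublishQueue model.

```ruby
# Hypothetical queue item; in a Rails app this would likely be a table row.
QueueItem = Struct.new(:url, :pending, :executed)

# Take up to `limit` items that are neither claimed nor finished,
# and mark them pending so the next pass skips them.
def claim_batch(items, limit)
  batch = items.reject { |item| item.pending || item.executed }.first(limit)
  batch.each { |item| item.pending = true }
  batch
end

# Record that an item has been processed.
def finish(item)
  item.pending = false
  item.executed = true
end

items = %w[a b c].map { |name| QueueItem.new("http://example.com/#{name}", false, false) }

first_pass = claim_batch(items, 2)
first_pass.each { |item| finish(item) }
second_pass = claim_batch(items, 2)
```

With flags like these, each pass of the loop only picks up items that have not been handled yet, so the same URL is not processed again on the next wake-up.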

Cheers-

-- Ezra Zygmuntowicz
-- Lead Rails Evangelist
-- ez@engineyard.com
-- Engine Yard, Serious Rails Hosting
-- (866) 518-YARD (9273)

ara.t.howard

5/27/2007 9:24:00 PM

On May 27, 2007, at 2:56 PM, Sy Ys wrote:

> Ezra Zygmuntowicz wrote:
>> Yeah I have the best luck with backgroundrb when I run a set number
>> of immortal workers that just loop and pull jobs from a queue.
>
> How is this different from B? Why would rq be necessary for this?
>

it wouldn't be. i'll be bundling rq for rails in the next week or
two. one advantage (i think) is that rq is durable across machine
reboots and also allows commandline interaction. it's a different
beast than backgroundrb though.

fyi.

-a
--
we can deny everything, except that we have the possibility of being
better. simply reflect on that.
h.h. the 14th dalai lama

brabuhr

5/28/2007 1:14:00 AM

On 5/27/07, ara.t.howard <ara.t.howard@gmail.com> wrote:
> On May 27, 2007, at 2:56 PM, Sy Ys wrote:
> > Ezra Zygmuntowicz wrote:
> >> Yeah I have the best luck with backgroundrb when I run a set number
> >> of immortal workers that just loop and pull jobs from a queue.
> >
> > How is this different from B? Why would rq be necessary for this?
> >
>
> it wouldn't be. i'll be bundling rq for rails in the next week or
> two. one advantage (i think) is that rq is durable across machine
> reboots and also allows commandline interaction. it's a different
> beast than backgroundrb though.

For the first proof of concept for the project I am currently working on,
I took the easy route of spawning a backgroundrb worker for each task
request. That, of course, ran into issues as the number of jobs grew
(in that case, running on a 1.? GHz/1GB laptop, it was after a couple of
dozen tasks). For the prototype I am currently building, I am using rq
for both of those reasons and to simply scale across several machines
(currently, I am farming the tasks out to 6 machines).

subsume@gmail.com

5/28/2007 5:34:00 PM

> Ezra Zygmuntowicz wrote:
> ...worker is a new process and takes up resources. Try it first with
> just one worker that gets started on bdrb server start and loops,
> works on jobs in the queue, then sleeps and does it again.

Two things:

-I went through your tutorial and documentation but I couldn't figure
out how to start workers on load. I thought it was via
background_schedules.yml but on start it didn't run.

-I'm not understanding the purpose for sleep in this case. Is the worker
doing something after it has elapsed that requires sleep?


subsume@gmail.com

5/28/2007 7:06:00 PM

> Two things:
>
> -I went through your tutorial and documentation but I couldn't figure
> out how to start workers on load. I thought it was via
> background_schedules.yml but on start it didn't run.

Elaboration:

|#background_schedules.yml
|update_f:
| :class: :update_frogs_worker
| :job_key: :schedule_test
| :worker_method: :dailyUpdate
| :worker_method_args: "x"
| :trigger_type: :cron_trigger
| :trigger_args: "1 * * * * * *"

|#lib/workers/update_frogs_worker.rb
|require 'open-uri'
|require 'rubygems'
|require 'net/http'
|require 'pp'
|class UpdateFrogsWorker < BackgrounDRb::Worker::RailsBase
|  def do_work(args)
|  end
|  def dailyUpdate(args)
|    @@frogs = Array.new
|    @sub = Subscriber.find(4) #rails model
|    @@frogID = @sub.domainID
|    getFrogs getBody(@@frogsURL + @@frogID)
|    oldFrogs = Array.new
|    curFrogs = Array.new # was appended to below without being initialized
|    # assumes getFrogs fills @@frogs (getFrogs and getBody are not shown)
|    logger.info('Got ' + @@frogs.size.to_s + ' frogs')
|    @@frogs.each{|f| oldFrogs << f[0]
|      logger.info('pushing ' + f[0].to_s + ' into oldFrogs array')
|    }
|    @sub.frogs.each{|f| curFrogs << f.id
|      logger.info('pushing ' + f.domainID.to_s + ' into curFrogs array')
|    }
|    diff = oldFrogs - curFrogs
|    diff.each{|d| logger.info("Diff:" + d.to_s)} # was logging.info
|  end
|end
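
The comparison at the end of dailyUpdate relies on Ruby's array difference. A small standalone sketch of that step, with diff_ids as a hypothetical helper name and made-up ids:

```ruby
# Return the ids that appeared and the ids that disappeared
# between two snapshots of a profile.
def diff_ids(previous_ids, current_ids)
  {
    added:   current_ids - previous_ids,
    removed: previous_ids - current_ids
  }
end

changes = diff_ids([1, 2, 3], [2, 3, 4])
puts changes.inspect
```

Array#- only reports elements of the left-hand array that are missing from the right-hand one, so detecting both new and dropped entries takes a subtraction in each direction.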


comp.lang.ruby
