comp.lang.ruby

Odd result when attempting to use Mechanize in parallel with Threads

Richard Conroy

12/7/2006 2:48:00 PM

I wrote a simple tool to iterate over a network to try and find web servers
running on specific ports. We have a lot of devices & software with a
web UI, and I thought that this would be a handy way to find them,
and even tell what they are.

I thought this would be a handy coding project too, and a good way to
cut my teeth on Ruby threads, and build up some usage with
Mechanize.

BTW I am running this on *Windows XP*.

However my code is quite obviously executing this serially. Is there
something obviously wrong with my code below? (results after
code snippet). I am aware this could make my machine choke from
thread overkill, but I wanted to get it working in parallel first.
Perhaps Mechanize instances have some shared elements?

============================
require 'mechanize'

threads = Array.new
puts "sweep of 153.200.72.* segment http ports"
(1..254).each do |ran|
  threads << Thread.new(ran) { |r|
    agent = WWW::Mechanize.new
    agent.user_agent_alias = 'Windows Mozilla'
    ports = [80,8080]
    ports.each do |p|
      begin
        page = agent.get("http://153.200.72."+r.to_s+":&qu...)
        puts "153.200.72."+r.to_s+":"+p.to_s+" - "+page.title
      rescue
        puts "153.200.72."+r.to_s+":"+p.to_s+" - NOTHING"
      end
    end
  }
  threads.each { |aThread| aThread.join }
end
============================

153.200.72.10:80 - NOTHING
153.200.72.10:8080 - NOTHING
153.200.72.11:80 - NOTHING
153.200.72.11:8080 - NOTHING
153.200.72.12:80 - NOTHING
153.200.72.12:8080 - NOTHING
153.200.72.13:80 - NOTHING
153.200.72.13:8080 - NOTHING
153.200.72.14:80 - NOTHING
153.200.72.14:8080 - NOTHING
153.200.72.15:80 - NOTHING
153.200.72.15:8080 - NOTHING
153.200.72.16:80 - NOTHING
153.200.72.16:8080 - NOTHING
153.200.72.17:80 - NOTHING
153.200.72.17:8080 - NOTHING

3 Answers

Ara.T.Howard

12/7/2006 3:19:00 PM

Richard Conroy

12/7/2006 3:35:00 PM

On 12/7/06, ara.t.howard@noaa.gov <ara.t.howard@noaa.gov> wrote:
> threads.each { |aThread| aThread.join } # THIS MUST BE OUTSIDE THE LOOP!

<homer>*d'Oh</homer>

> fyi. starting a thread, and then immediately joining it is the same as not
> using a thread at all!

Ah yes, cutting & pasting a line too high ....
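
For reference, a minimal sketch of the corrected shape, with the join moved out of the loop. The `probe` method is a hypothetical stand-in for the Mechanize request (it only sleeps and returns a line of text), and the range is cut down to 20 hosts, so this runs without a network. It is an illustration of the fix, not the original tool.

```ruby
require 'thread'

# Hypothetical stand-in for the HTTP request: sleeps briefly, then reports.
def probe(host, port)
  sleep 0.05
  "#{host}:#{port} - NOTHING"
end

results = Queue.new   # thread-safe collection point for the output lines
threads = []

(1..20).each do |r|
  # Start every thread first; nothing is joined inside this loop.
  t = Thread.new(r) do |n|
    [80, 8080].each { |p| results << probe("153.200.72.#{n}", p) }
  end
  threads << t
end

# Join only after all of the threads have been started.
threads.each { |aThread| aThread.join }

lines = []
lines << results.pop until results.empty?
puts lines.sort
```

Collecting the lines in a Queue and printing them after the join is a design choice for the sketch; it keeps output from different threads from interleaving.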

> another fyi - threads and io (even socket io) are a deadly combination on
> windows. run this on linux/mac if possible.

Has to be windows, but this isn't mission critical code - just a
development tool that may eventually post the results to a wiki or
something. I can break this up a bit so it doesn't kill my laptop later.

> regards.

Thanks. I knew it had to be a WTF.

Richard Conroy

12/12/2006 6:22:00 PM

On 12/12/06, Shiwei Zhang <shiwei.zhang@oracle.com> wrote:
> Hi, Richard,
>
> Actually, in Ruby, calling ".new" is all it takes to make threads
> run in parallel rather than serially. I think it can meet your
> requirement; please see the program <multithreads_ProbingHttp.rb> I post at
> the end of this mail, plus the running results.
> First, please note the following points: 1) The method ".new"
> means "Creates and runs a new thread to execute the instructions given
> in block". 2) The method ".join" means "The calling thread will suspend
> execution and run the called thread. Does not return until the called
> thread exits or until limit seconds have passed".
> ".new" doesn't only mean "creates"; it means both "creates" and
> "runs". So ".new" can make child threads run in parallel. And ".join"
> needs to wait for the exit of the called thread, so it gives you the
> illusion that the threads are running serially, but in fact ".join" just
> wraps up the threads. It is inappropriate to say that ".join"
> makes threads run in parallel or serially. We can say ".join" is
> serially waiting for the exits of threads that might already be running
> in parallel. :-) :-)
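
The point in the quoted message can be checked with a small sketch: the block passed to Thread.new starts running without any call to join, and join only waits for it to finish. The names `started` and `worker` are made up for this illustration.

```ruby
require 'thread'

started = Queue.new

# Thread.new creates the thread and starts running its block right away.
worker = Thread.new do
  started << :running   # signal that the block is executing
  :done                 # the block's last value becomes the thread's value
end

# This pop blocks until the worker has pushed its signal.
# Note that join has not been called at this point.
signal = started.pop

# join only waits for the thread to exit; value returns the block's result.
worker.join
outcome = worker.value

puts "signal=#{signal} outcome=#{outcome}"
```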

This is what I noticed. When I join up 5 threads at a time, the output jumps
up in batches of 5. This does slow down the algorithm, especially
if there are a lot of positive results - most of these threads are
waiting for the http connection to time out.

But I run this thing at night anyway.

As an aside, I have had difficulty getting more than ~ 5 joined threads
to work at all in windows.
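
One possible way around the batch effect, while still keeping to about 5 threads, is a fixed pool of workers pulling hosts from a shared Queue, so a worker that finishes early picks up the next host instead of waiting for the slowest member of its batch. This is only a sketch under stated assumptions: `probe` is a hypothetical stand-in that sleeps instead of making an HTTP request, `worker_count` and the 20-host range are illustrative, and it has not been tried on Windows.

```ruby
require 'thread'

# Hypothetical stand-in for one probe; every fifth host is slow,
# to imitate a connection that has to time out.
def probe(host)
  sleep(host % 5 == 0 ? 0.1 : 0.01)
  "host #{host} checked"
end

jobs = Queue.new
(1..20).each { |h| jobs << h }

results = Queue.new
worker_count = 5      # assumed limit, matching the ~5 threads mentioned above

workers = Array.new(worker_count) do
  Thread.new do
    loop do
      host = begin
        jobs.pop(true)    # non-blocking pop; raises ThreadError when empty
      rescue ThreadError
        break             # no work left, so this worker finishes
      end
      results << probe(host)
    end
  end
end

# All workers are started before any of them is joined.
workers.each { |w| w.join }

checked = []
checked << results.pop until results.empty?
puts "#{checked.size} hosts checked"
```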