comp.lang.ruby

how can i find lingering file descriptors?

Brad Volz

9/13/2008 6:46:00 AM

Hello,

I seem to have an issue with file descriptors that aren't being closed
when I attempt to put some parallelization into one of my scripts. I
am trying to make use of the forkoff gem, but I guess that I am not
using it correctly.

If it's useful, here is what the code looks like currently:

def measure_n(start_time, stop_time, direction, pattern, records)
  counters = [ :flows, :packets, :octets ]

  # single process
  stats = Hash.new
  counters.each { |c| stats[c] = 0 }

  puts "starting single process execution"
  single_start = Time.now.to_f
  pattern.each do |p|
    r = self.measure_1(start_time, stop_time, direction,
                       "#{direction} AS #{p.to_s}", records)
    counters.each { |c| stats[c] += r[c] }
  end
  single_stop = Time.now.to_f
  puts stats.inspect
  puts "single exeution time: #{single_stop - single_start}"

  # multiple processes
  stats = Hash.new
  counters.each { |c| stats[c] = 0 }

  puts "starting multi process execution"
  multi_start = Time.now.to_f
  asn_stats = pattern.forkoff! :processes => 4 do |asn|
    a = Netflow::Nfdump.new
    a.measure_1(start_time, stop_time, direction,
                "#{direction} AS #{asn}", records)
  end
  asn_stats.each do |asn|
    counters.each { |c| stats[c] += asn[c] }
  end

  multi_stop = Time.now.to_f
  puts stats.inspect
  puts "multi execution time: #{multi_stop - multi_start}"

  return stats
end


The part that I find really odd is that I can replace the block that
I am attempting to parallelize with a simple:

puts "#{asn}"

And the script dies in the same place, at the 255th element in the
array, which corresponds well to the number of file descriptors that I
can use:

bradv:bvolz:$ ulimit -a | grep files
open files (-n) 256

Since I see this problem with a simple 'puts' does that mean that the
issue is not in my code, and perhaps lies elsewhere? Or have I
misunderstood how to make use of the forkoff gem? In either case, how
can I figure out what these open file descriptors are?
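
The limit that `ulimit` reports can also be read from inside Ruby, which may be handy for logging it next to a failure. A minimal sketch, assuming a POSIX system where `Process.getrlimit` is available; `fd_limits` is a hypothetical helper name, not part of forkoff:

```ruby
# Hypothetical helper: returns the [soft, hard] limits on the number of
# open file descriptors for the current process.
def fd_limits
  Process.getrlimit(Process::RLIMIT_NOFILE)
end

soft, hard = fd_limits
puts "open files: soft=#{soft} hard=#{hard}"
```

The soft value is the one a process actually hits with `Errno::EMFILE`.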

Thanks,

Brad





5 Answers

ara.t.howard

9/13/2008 5:45:00 PM



On Sep 13, 2008, at 12:45 AM, Brad Volz wrote:

> Hello,
>
> I seem to have an issue with file descriptors that aren't being
> closed when I attempt to put some parallelization into one of my
> scripts.

you can list them all with something like

limit = 8192
files = Array.new(limit){|i| IO.for_fd(i) rescue nil}.compact.map{|io| io.fileno}
p files
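
On Ruby 1.9 and later, an IO object created by `IO.for_fd` closes its descriptor when it is finalized unless autoclose is turned off, so a probe like the one above can itself close descriptors once the garbage collector runs. A hedged variant for newer Rubies, assuming the `autoclose: false` option is available; `open_fds` is a hypothetical helper name:

```ruby
# Hypothetical helper: returns the descriptor numbers below `limit` that
# are currently open in this process.
def open_fds(limit = 1024)
  (0...limit).select do |fd|
    begin
      # autoclose: false keeps the wrapper from closing the descriptor
      # when the temporary IO object is garbage collected.
      IO.for_fd(fd, autoclose: false)
      true
    rescue StandardError
      # Errno::EBADF for closed descriptors; some Ruby versions raise
      # ArgumentError for descriptors the VM reserves for itself.
      false
    end
  end
end

p open_fds
```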


> I am trying to make use of the forkoff gem, but I guess that I am
> not using it correctly.
>
> If it's useful, here is what the code looks like currently:
>
> def measure_n ( start_time, stop_time, direction, pattern,
> records )
> counters = [ :flows, :packets, :octets ]
>
> # single process
> stats = Hash.new
> counters.each { |c| stats[c] = 0 }
>
> puts "starting single process execution"
> single_start = Time.now.to_f
> pattern.each do |p|
> r = self.measure_1( start_time, stop_time, direction,
> "#{direction} AS #{p.to_s}", records)
> counters.each { |c| stats[c] += r[c] }
> end
> single_stop = Time.now.to_f
> puts stats.inspect
> puts "single exeution time: #{single_stop - single_start}"
>
> # multiple processes
> stats = Hash.new
> counters.each { |c| stats[c] = 0 }
>
> puts "starting multi process execution"
> multi_start = Time.now.to_f
> asn_stats = pattern.forkoff! :processes => 4 do |asn|
> a = Netflow::Nfdump.new

this probably opens a file or pipe

>
> a.measure_1(start_time,stop_time,direction,"#{direction} AS
> #{asn}",records)

this too possibly

>
> end
> asn_stats.each do |asn|
> counters.each { |c| stats[c] += asn[c] }
> end
>
> multi_stop = Time.now.to_f
> puts stats.inspect
> puts "multi execution time: #{multi_stop - multi_start}"
>
> return stats
> end
>
>
> The part that I find really odd, is that I can replace the block
> that I am attempting to parallelize with a simple:-
>
> puts "#{asn}"
>
> And the script dies in the same place -- at the 255'th element in
> the array, which corresponds well to the number of file descriptors
> that I can use:-
>
> bradv:bvolz:$ ulimit -a | grep files
> open files (-n) 256
>
> Since I see this problem with a simple 'puts' does that mean that
> the issue is not in my code, and perhaps lies elsewhere? Or have I
> misunderstood how to make use of the forkoff gem? In either case,
> how can I figure out what these open file descriptors are?
>
> Thanks,
>
> Brad
>

you are using the gem properly. you should attempt to track down the
open files with something like this in the forkoff block

ios = Array.new(8192){|i| IO.for_fd(i) rescue nil}.compact
filenos = ios.map{|io| io.fileno}
paths = ios.map{|io| io.path rescue nil}

STDERR.puts "child : #{ Process.pid }"
STDERR.puts "filenos : #{ filenos.inspect }"
STDERR.puts "paths : #{ paths.inspect }"
STDIN.gets

i just released version 0.0.4 of forkoff. should not affect your
issue, but might want to install anyhow (just pushed to rubyforge -
might have to wait for it to propagate or grab the gem from there manually)

cheers.




a @ http://codeforp...
--
we can deny everything, except that we have the possibility of being
better. simply reflect on that.
h.h. the 14th dalai lama




Brad Volz

9/14/2008 12:11:00 AM


Thanks for the reply.

I'm going to trim the original message a bit.

On Sep 13, 2008, at 10:44 AM, ara.t.howard wrote:

> On Sep 13, 2008, at 12:45 AM, Brad Volz wrote:
>
>> Hello,
>>
>> I seem to have an issue with file descriptors that aren't being
>> closed when I attempt to put some parallelization into one of my
>> scripts.
>
> you can list them all with something like
>
> limit = 8192
> files = Array.new(limit){|i| IO.for_fd(i) rescue nil}.compact.map{|
> io| io.fileno}
> p files

Excellent. Thanks.

>> # multiple processes
>> stats = Hash.new
>> counters.each { |c| stats[c] = 0 }
>>
>> puts "starting multi process execution"
>> multi_start = Time.now.to_f
>> asn_stats = pattern.forkoff! :processes => 4 do |asn|
>> a = Netflow::Nfdump.new
>
> this probably opens a file or pipe
>
>>
>> a.measure_1(start_time,stop_time,direction,"#{direction} AS
>> #{asn}",records)
>
> this too possibly

Yes. It calls an external program to collect the actual data from the
datastore. Having said that, I don't know that I can confidently say
that the issue is in that particular block of code, as I can remove it
and put in place something like: puts "hello !" and still
experience the same problem.

> you are using the gem properly. you should attempt to track down
> the open files with something like this in the forkoff block
>
> ios = Array.new(8192){|i| IO.for_fd(i) rescue nil}.compact
> filenos = ios.map{|io| io.fileno}
> paths = ios.map{|io| io.path rescue nil}
>
> STDERR.puts "child : #{ Process.pid }"
> STDERR.puts "filenos : #{ filenos.inspect }"
> STDERR.puts "paths : #{ paths.inspect }"
> STDIN.gets
>
> i just releases version 0.0.4 of forkoff. should not affect your
> issue, but might want to install anyhow (just pushed to rubyforge -
> might have to wait for it propagate or grab the gem from there
> manually)

I'll try this again later with the new forkoff, but I currently have
0.0.1.

Thanks for the test block. I ran it via -

require 'rubygems'
require 'forkoff'

(0..255).forkoff do |f|

ios = Array.new(8192){|i| IO.for_fd(i) rescue nil}.compact
filenos = ios.map{|io| io.fileno}
paths = ios.map{|io| io.path rescue nil}

STDERR.puts "child : #{ Process.pid }"
STDERR.puts "filenos : #{ filenos.inspect }"
STDERR.puts "paths : #{ paths.inspect }"
STDIN.gets

end

I won't paste in the entire output, but here is the first and last
child for comparison --

bradv:bvolz:$ ruby forkoff-test.rb
child : 47036
filenos : [0, 1, 2, 4]
paths : [nil, nil, nil, nil]

child : 47288
filenos : [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16,
17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33,
34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50,
51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67,
68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84,
85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 100, 101,
102, 103, 104, 105, 106, 107, 108, 109, 110, 111, 112, 113, 114, 115,
116, 117, 118, 119, 120, 121, 122, 123, 124, 125, 126, 127, 128, 129,
130, 131, 132, 133, 134, 135, 136, 137, 138, 139, 140, 141, 142, 143,
144, 145, 146, 147, 148, 149, 150, 151, 152, 153, 154, 155, 156, 157,
158, 159, 160, 161, 162, 163, 164, 165, 166, 167, 168, 169, 170, 171,
172, 173, 174, 175, 176, 177, 178, 179, 180, 181, 182, 183, 184, 185,
186, 187, 188, 189, 190, 191, 192, 193, 194, 195, 196, 197, 198, 199,
200, 201, 202, 203, 204, 205, 206, 207, 208, 209, 210, 211, 212, 213,
214, 215, 216, 217, 218, 219, 220, 221, 222, 223, 224, 225, 226, 227,
228, 229, 230, 231, 232, 233, 234, 235, 236, 237, 238, 239, 240, 241,
242, 243, 244, 245, 246, 247, 248, 249, 250, 251, 252, 253, 255]
paths : [nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil,
nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil,
nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil,
nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil,
nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil,
nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil,
nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil,
nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil,
nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil,
nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil,
nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil,
nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil,
nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil,
nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil,
nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil,
nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil,
nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil,
nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil,
nil, nil, nil, nil, nil]

/opt/local/lib/ruby/gems/1.8/gems/forkoff-0.0.1/lib/forkoff.rb:42:in
`pipe': Too many open files (Errno::EMFILE)
from /opt/local/lib/ruby/gems/1.8/gems/forkoff-0.0.1/lib/forkoff.rb:
42:in `forkoff'
from /opt/local/lib/ruby/gems/1.8/gems/forkoff-0.0.1/lib/forkoff.rb:
37:in `loop'
from /opt/local/lib/ruby/gems/1.8/gems/forkoff-0.0.1/lib/forkoff.rb:
37:in `forkoff'
from /opt/local/lib/ruby/gems/1.8/gems/forkoff-0.0.1/lib/forkoff.rb:
34:in `initialize'
from /opt/local/lib/ruby/gems/1.8/gems/forkoff-0.0.1/lib/forkoff.rb:
34:in `new'
from /opt/local/lib/ruby/gems/1.8/gems/forkoff-0.0.1/lib/forkoff.rb:
34:in `forkoff'
from /opt/local/lib/ruby/gems/1.8/gems/forkoff-0.0.1/lib/forkoff.rb:
32:in `times'
from /opt/local/lib/ruby/gems/1.8/gems/forkoff-0.0.1/lib/forkoff.rb:
32:in `forkoff'
from forkoff-test.rb:5
bradv:bvolz:$ /opt/local/lib/ruby/gems/1.8/gems/forkoff-0.0.1/lib/
forkoff.rb:53:in `write': Broken pipe (Errno::EPIPE)
from /opt/local/lib/ruby/gems/1.8/gems/forkoff-0.0.1/lib/forkoff.rb:
53:in `forkoff'
from /opt/local/lib/ruby/gems/1.8/gems/forkoff-0.0.1/lib/forkoff.rb:
37:in `loop'
from /opt/local/lib/ruby/gems/1.8/gems/forkoff-0.0.1/lib/forkoff.rb:
37:in `forkoff'
from /opt/local/lib/ruby/gems/1.8/gems/forkoff-0.0.1/lib/forkoff.rb:
34:in `initialize'
from /opt/local/lib/ruby/gems/1.8/gems/forkoff-0.0.1/lib/forkoff.rb:
34:in `new'
from /opt/local/lib/ruby/gems/1.8/gems/forkoff-0.0.1/lib/forkoff.rb:
34:in `forkoff'
from /opt/local/lib/ruby/gems/1.8/gems/forkoff-0.0.1/lib/forkoff.rb:
32:in `times'
from /opt/local/lib/ruby/gems/1.8/gems/forkoff-0.0.1/lib/forkoff.rb:
32:in `forkoff'
from forkoff-test.rb:5

If it's useful to know, I am using Ruby from MacPorts on OS X 10.5:

bradv:bvolz:$ ruby -v
ruby 1.8.7 (2008-08-11 patchlevel 72) [i686-darwin9]

My current workaround is to use threadify instead of forkoff which is
working beautifully.

bradv:bvolz:$ ruby netflow.rb
starting single process execution
{:flows=>176915, :packets=>90580480, :octets=>81664177152}
single exeution time: 138.792369842529
starting threadify execution
{:flows=>176915, :packets=>90580480, :octets=>81664177152}
threadify execution time: 85.1209449768066

cheers!

Brad


Brad Volz

9/14/2008 12:26:00 AM



On Sep 13, 2008, at 10:44 AM, ara.t.howard wrote:
>
> i just releases version 0.0.4 of forkoff. should not affect your
> issue, but might want to install anyhow (just pushed to rubyforge -
> might have to wait for it propagate or grab the gem from there
> manually)

I just tested 0.0.4, and it seems to work just as well as threadify at
keeping the file descriptors under control.

Using the same test code produces -

bradv:bvolz:$ ruby forkoff-test.rb
child : 47335
filenos : [0, 1, 2, 4]
paths : [nil, nil, nil, nil]

that was the first pass, and here is the last --

child : 47600
filenos : [0, 1, 2, 3, 5]
paths : [nil, nil, nil, nil, nil]
bradv:bvolz:$

Many thanks for your help and for releasing these gems to the public.

cheers!

Brad


ara.t.howard

9/14/2008 5:41:00 AM



On Sep 13, 2008, at 6:25 PM, Brad Volz wrote:

> I just tested 0.0.4, and it seems to work just as well as threadify
> at keeping the file descriptors under control.


great!

guess i left some pipes lying around in the initial version ;-(

say - which of the two, threadify and forkoff, was fastest for your
use case?
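
The kind of leak described above (one pipe per child, with the parent's ends left open) can be illustrated with a small fork-and-pipe helper. This is only a sketch of the general pattern, not forkoff's actual implementation; `fork_map` is a hypothetical name, it runs one child at a time for simplicity, and it assumes a platform where `fork` is available:

```ruby
# Hypothetical helper: runs the block in a child process for each item
# and returns the results, closing every pipe end it opens.
def fork_map(items)
  items.map do |item|
    reader, writer = IO.pipe
    pid = fork do
      reader.close                              # child only writes
      writer.write(Marshal.dump(yield(item)))
      writer.close
      exit!(0)                                  # skip at_exit handlers inherited from the parent
    end
    writer.close                                # parent only reads; skipping this leaks a descriptor per item
    begin
      Marshal.load(reader.read)
    ensure
      reader.close                              # skipping this leaks the other end
      Process.wait(pid)                         # reap the child
    end
  end
end

results = fork_map(1..300) { |n| n * 2 }
puts results.size
```

Because both ends are closed in the parent on every iteration, the number of items can exceed the `ulimit -n` value without running into `Errno::EMFILE`.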






Brad Volz

9/14/2008 7:48:00 AM



On Sep 13, 2008, at 10:40 PM, ara.t.howard wrote:

>
> On Sep 13, 2008, at 6:25 PM, Brad Volz wrote:
>
>> I just tested 0.0.4, and it seems to work just as well as threadify
>> at keeping the file descriptors under control.
>
>
> great!
>
> guess i left some pipes lying around in the initial version ;-(
>
> say - which of the two, threadify and forkoff, was fastest for your
> use case?

It looks like threadify is.

The following test attempts to be "fair" by specifying 4 threads for
threadify, and 4 processes for forkoff since their defaults are
different.

bradv:bvolz:$ !!
ruby netflow.rb
2391 items in Enumerable
starting threadify execution
{:flows=>150148, :packets=>76875776, :octets=>68717894144}
threadify execution time: 108.343915224075
starting forkoff execution
{:flows=>150148, :packets=>76875776, :octets=>68717894144}
forkoff execution time: 143.863088130951
bradv:bvolz:$

In my case, I'm not doing the calculations in ruby code. I'm calling
an external program to fetch the data for me, so in both cases there
is an external process that can be scheduled by the operating system
onto whatever processor has cycles. Presumably, if that were not the
case, and I was doing a lot of cpu intensive work in Ruby itself, the
forkoff version would be faster by providing 'n' GIL instances instead
of just one.
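
The reasoning above, that threads are enough when the real work happens in an external program, can be sketched with a small worker pool. This is not threadify's implementation; `threaded_map` is a hypothetical helper, the `echo` command stands in for the external data-fetching program, and `Queue#close` assumes Ruby 2.3 or later:

```ruby
require "thread"

# Hypothetical helper: maps the block over items using a fixed number of
# worker threads, preserving the input order in the result.
def threaded_map(items, threads: 4)
  items = items.to_a
  results = Array.new(items.size)
  queue = Queue.new
  items.each_with_index { |item, index| queue << [item, index] }
  queue.close # pop returns nil once the closed queue is drained

  workers = Array.new(threads) do
    Thread.new do
      while (pair = queue.pop)
        item, index = pair
        results[index] = yield(item)
      end
    end
  end
  workers.each(&:join)
  results
end

# Each thread mostly waits on its external command, so the commands
# themselves can run in parallel.
outputs = threaded_map(%w[10 20 30]) { |asn| `echo AS #{asn}`.strip }
p outputs
```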

cheers!

Brad