
Jesús Antonio Sánchez A.

6/9/2005 8:56:00 PM

Hi, I have a question. When I compiled ruby-1.8.2
I noticed that it generated ri docs, but the docs
for some modules, like soap and getoptlong, were
missing. What can I do to generate those docs
without doing it manually?

Thanks.






12 Answers

Eric Hodel

6/9/2005 9:19:00 PM


On 09 Jun 2005, at 13:55, Jesús Antonio Sánchez A. wrote:

> Hi, I have a question. When I compiled ruby-1.8.2
> I noticed that it generated ri docs, but the docs
> for some modules, like soap and getoptlong, were
> missing. What can I do to generate those docs
> without doing it manually?

Neither of these libraries is usefully RDoc'd. You would get little
more than the API listing from the RDoc...

rdoc --ri --op /path/to/ri/data/dir /path/to/lib/ruby/1.8/getoptlong.rb
rdoc --ri --op /path/to/ri/data/dir /path/to/lib/ruby/1.8/soap

Will do it, I think.
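
A quick way to sanity-check the result afterwards (assuming the --op
directory above is one that ri already searches) is just:

ri GetoptLong

If the class description and method list come back, the ri data was
generated.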

--
Eric Hodel - drbrain@segment7.net - http://se...
FEC2 57F1 D465 EB15 5D6E 7C11 332A 551C 796C 9F04



Jesús Antonio Sánchez A.

6/9/2005 10:05:00 PM


But if, for example, you run rdoc --op html-docs inside
the Ruby source directory, it is supposed to search all
subdirectories and process all *.rb and all *.c files in
search of data to display, but it only creates documents
for some modules, not all. For example, you don't get
the docs for the net/*.rb modules, and if you check the
source code it seems that those modules are RDoc-enabled.

Thanks.


--- Eric Hodel <drbrain@segment7.net> wrote:

> On 09 Jun 2005, at 13:55, Jesús Antonio
> Sánchez A. wrote:
>
> > Hi, I have a question. When I compiled ruby-1.8.2
> > I noticed that it generated ri docs, but the docs
> > for some modules, like soap and getoptlong, were
> > missing. What can I do to generate those docs
> > without doing it manually?
>
> Neither of these libraries is usefully RDoc'd. You
> would get little more than the API listing from the RDoc...
>
> rdoc --ri --op /path/to/ri/data/dir /path/to/lib/ruby/1.8/getoptlong.rb
> rdoc --ri --op /path/to/ri/data/dir /path/to/lib/ruby/1.8/soap
>
> Will do it, I think.
>
> --
> Eric Hodel - drbrain@segment7.net - http://se...
> FEC2 57F1 D465 EB15 5D6E 7C11 332A 551C 796C 9F04




Eric Hodel

6/9/2005 10:16:00 PM


On 09 Jun 2005, at 15:05, Jesús Antonio Sánchez A. wrote:

> But if, for example, you run rdoc --op html-docs inside
> the Ruby source directory, it is supposed to search all
> subdirectories and process all *.rb and all *.c files in
> search of data to display, but it only creates documents
> for some modules, not all.

No, it reads .document files to learn what to RDoc. I believe
if .document doesn't exist, it defaults to grabbing everything.

> For example, you don't get the docs for the net/*.rb modules, and
> if you check the source code it seems that those modules are
> RDoc-enabled.

$ grep net lib/.document
$
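
So adding the missing entries to lib/.document and regenerating
should pull them in; a rough sketch (the exact paths and .document
layout here are assumptions, not checked against the 1.8.2 tree):

$ echo net >> lib/.document
$ rdoc --ri --op /path/to/ri/data/dir lib

RDoc would then descend into lib/net and pick up the *.rb files there.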

--
Eric Hodel - drbrain@segment7.net - http://se...
FEC2 57F1 D465 EB15 5D6E 7C11 332A 551C 796C 9F04



Jesús Antonio Sánchez A.

6/9/2005 10:23:00 PM


OHHHHHH, Thanks, :)

--- Eric Hodel <drbrain@segment7.net> wrote:

> On 09 Jun 2005, at 15:05, Jesús Antonio
> Sánchez A. wrote:
>
> > But if, for example, you run rdoc --op html-docs inside
> > the Ruby source directory, it is supposed to search all
> > subdirectories and process all *.rb and all *.c files in
> > search of data to display, but it only creates documents
> > for some modules, not all.
>
> No, it reads .document files to learn what to RDoc.
> I believe if .document doesn't exist, it defaults to
> grabbing everything.
>
> > For example, you don't get the docs for the net/*.rb
> > modules, and if you check the source code it seems
> > that those modules are RDoc-enabled.
>
> $ grep net lib/.document
> $
>
> --
> Eric Hodel - drbrain@segment7.net -
> http://se...
> FEC2 57F1 D465 EB15 5D6E 7C11 332A 551C 796C 9F04




Xeno Campanoli

6/11/2005 3:00:00 PM


I coded the following:

#!/usr/bin/ruby -w
# start of validateMenuSpecs.rb - A ruby program
#

require "net/http"

host = "http://www.pragmaticprogrammer...
print "trace 1 uri: #{host}\n"
httpH = Net::HTTP.new(host,80)
fspec = "/index.html"
print "trace 5 fspec: #{fspec}\n"
response = httpH.get(fspec)
print "trace 7\n"
unless response.code == 200
  print "Probe of #{host} resulted in response code #{response.code}.\n"
end
print "trace 9\n"
----------------
and I get:

xeno@linux:~/study/data> ./try6.rb
trace 1 uri: http://www.pragmaticprog...
trace 5 fspec: /index.html
/usr/lib/ruby/1.8/net/protocol.rb:83:in `initialize': getaddrinfo: Name or service not known (SocketError)
from /usr/lib/ruby/1.8/net/protocol.rb:83:in `new'
from /usr/lib/ruby/1.8/net/protocol.rb:83:in `connect'
from /usr/lib/ruby/1.8/net/protocol.rb:82:in `timeout'
from /usr/lib/ruby/1.8/timeout.rb:55:in `timeout'
from /usr/lib/ruby/1.8/net/protocol.rb:82:in `connect'
from /usr/lib/ruby/1.8/net/protocol.rb:64:in `initialize'
from /usr/lib/ruby/1.8/net/http.rb:430:in `open'
from /usr/lib/ruby/1.8/net/http.rb:430:in `do_start'
from /usr/lib/ruby/1.8/net/http.rb:419:in `start'
from /usr/lib/ruby/1.8/net/http.rb:824:in `request'
from /usr/lib/ruby/1.8/net/http.rb:618:in `get'
from ./try6.rb:12
xeno@linux:~/study/data>

I'm nosing around in protocol.rb and http.rb right now. Presumably this
is some simple problem, hopefully mine and not the ruby supporting code,
but I can't see it just yet.

xc

--
Xeno Campanoli, xeno@eskimo.com, http://www.eskimo...
Pride before justice equals destabilization.
Power before truth equals destruction.
Profit before environment equals death.



a slow loris with poison elbows

6/11/2005 3:24:00 PM


Xeno Campanoli <xeno@eskimo.com> writes:
> host = "http://www.pragmaticprogrammer.com"

Shouldn't it be:
host = "www.pragmaticprogrammer.com"

-Loris
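
For reference, a minimal corrected sketch of the probe along those
lines (assuming the truncated host above really was
www.pragmaticprogrammer.com; note also that Net::HTTPResponse#code
returns the status as a String, so compare against "200"):

require "net/http"

host  = "www.pragmaticprogrammer.com"   # host name only, no "http://" scheme
fspec = "/index.html"

httpH = Net::HTTP.new(host, 80)
response = httpH.get(fspec)

unless response.code == "200"           # code is a String such as "200"
  print "Probe of http://#{host}#{fspec} returned response code #{response.code}.\n"
end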

Xeno Campanoli

6/11/2005 3:39:00 PM


a slow loris with poison elbows wrote:

> Xeno Campanoli <xeno@eskimo.com> writes:
> > host = "http://www.pragmaticprogrammer.com"
>
> Shouldn't it be:
> host = "www.pragmaticprogrammer.com"
>
> -Loris

Yeah, that was it. Sorry. I guess it was too early on Saturday morning.
xc


--
Xeno Campanoli, xeno@eskimo.com, http://www.eskimo...
Pride before justice equals destabilization.
Power before truth equals destruction.
Profit before environment equals death.



Xeno Campanoli

6/13/2005 1:31:00 AM


I've tried both the net/http and the open-uri packages now, and the
latter is getting me a little closer, especially with the assistance of
timeouts and almost universal exception rescues. I now have it narrowed
down to 12 or so exception failures, which are probably real problems,
and a bunch of 404s, and these latter are mostly or completely all
there. I suspect there is an index.html thing that just won't be seen
by these methods, or perhaps a bunch of them are configured to use
somedarnthing.html instead of index.html, or perhaps the server likes to
see some headers. Anyway, I wonder whether typically at this point
people just go out and make their own crawlers from scratch (that's what
I did a few years back with Perl, LWP being much more of a hindrance
than a help), or if there are addons or other things that will make
these efforts less ugly that I am just not seeing. Is there some
standard Ruby thing that:

1) will deliver acceptable headers and such to retrieve stuff behind
some of these 404 sites? (See the open-uri sketch after this post.)
2) is there a cleaner or more standardized method for just giving me
results no matter what, and not blowing up on some HTTP return codes,
so that I can treat all failures orthogonally rather than constructing
my own external rescue handlers?
3) should I just be building my own crawler at this point, and ignoring
the above packages as they might be for more casual users?
4) (off topic) does anyone have any etiquette recommendations for what
I am doing so I don't irk any netadmins, or others, needlessly?

Thanks, as usual, for any feedback, and please forgive any questions
badly or inappropriately asked.
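
On 1), one detail that sometimes matters: open-uri will send extra
request headers if you hand it a hash of options, which can help with
servers that reject bare library requests. A minimal sketch (the agent
string and URL here are made up):

require "open-uri"

url = "http://www.example.com/index.html"
page = open(url, "User-Agent" => "xeno-link-checker/0.1") { |f| f.read }
print "#{url}: #{page.length} bytes\n"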

--
Xeno Campanoli, xeno@eskimo.com, http://www.eskimo...
Pride before justice equals destabilization.
Power before truth equals destruction.
Profit before environment equals death.



Eric Hodel

6/13/2005 2:28:00 AM


On 12 Jun 2005, at 18:31, Xeno Campanoli wrote:

> I've tried both the net/http and the open-uri packages now, and the
> latter is getting me a little closer, especially with assistance of
> timeouts and almost universal exception rescues. I now have it
> narrowed down to 12 or so exception failures, which are probably
> real problems, and a bunch of 404s, and these latter are mostly or
> completely all there. I suspect there is an index.html thing that
> just won't be seen by these methods, or perhaps a bunch of them
> configure to use somedarnthing.html instead of index.html, or
> perhaps the server likes to see some headers. Anyway, I wonder
> whether typically at this point people just go out and make their
> own crawlers from scratch (that's what I did a few years back with
> Perl, LWP being much more of a hindrance than a help), or if there
> are addons or other things that will make these efforts less ugly
> that I am just not seeing. Is there some standard ruby thing that:
>
> 1) will deliver acceptable headers and such to retrieve stuff
> behind some of these 404 sites?
> 2) is there a cleaner or more standardized method for
> just giving me results no matter what and not blowing up in cases
> of some HTTP return codes so that I can just treat all failures
> orthogonally rather than constructing my own external rescue handlers?
> 3) should I just be building my own crawler at this point, and
> ignoring these above packages as they might be for more casual users?

I have found a good deal of success with http-access2, but I still
had to do the things you don't want to do (external error handlers).
I don't think there is a good generic way of handling that.

> 4) (off topic) does anyone have any etiquette recommendations
> for what I am doing so I don't irk any netadmins, or others,
> needlessly?

obey /robots.txt

don't crawl too fast (faster than 2 requests/sec is probably too fast)

don't re-crawl too often
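
A minimal sketch of a fetch loop along those lines, using open-uri as
in your earlier attempts (the URL list and the one-second spacing are
made up for illustration; checking /robots.txt first is left out):

require "open-uri"

urls = ["http://www.example.com/index.html"]   # whatever list you are checking
urls.each do |url|
  begin
    page = open(url) { |f| f.read }
    print "#{url}: OK, #{page.length} bytes\n"
  rescue OpenURI::HTTPError => e
    print "#{url}: #{e.message}\n"              # e.g. "404 Not Found"
  rescue SocketError, Timeout::Error => e
    print "#{url}: #{e.class}: #{e.message}\n"
  end
  sleep 1                                       # stay well under 2 requests/sec
end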

--
Eric Hodel - drbrain@segment7.net - http://se...
FEC2 57F1 D465 EB15 5D6E 7C11 332A 551C 796C 9F04



Gene Tani

6/13/2005 2:51:00 AM


Go look at O'Reilly's "Spidering Hacks", which I would quote pieces of,
but somebody borrowed it.
