Gabriele Marrone
11/3/2006 2:49:00 PM
On 03 Nov 2006, at 15:20, z wrote:
> I'm trying to write a script to read a list of URLs, get the HTTP
> response headers and <title> (if there is a page there) from each
> URL, and output to a CSV file in this format:
> URL, header, <title>
>
> I've started with something like this, using Lynx to get the headers.
> The part that doesn't seem to work is this:
> `lynx -dump -head "#{line}"` -- it doesn't want to put the url into
> the #{line} within the backticks.
>
> How do you insert a variable from Ruby into the shell command? I'm
> ordering 4 Ruby books by mail today... I haven't seen anything like
> this in the ones that I've browsed though.
>
>
> Here is a larger section of the script:
>
> print "Enter the location of the input file: "
> infile = gets.chomp
>
> # open file
> File.open(infile, "r") do |f|
>   # get HTTP headers with Lynx
>   output = f.each_line { |line| `lynx -dump -head "#{line}" | grep "HTTP"` }
>   # puts output to CSV file
> # TODO
If you really want to use an external program, you could use
something like open("|program") in order to get an IO object
connected to its output.
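
A minimal sketch of the pipe approach, in case it helps. It uses IO.popen instead of open("|program"); that is a swapped-in but closely related call, which also hands back an IO object connected to the program's output. The names command_output and status_line are made up for illustration, and the lynx flags are simply the ones from your script. Note that each_line yields every line with its trailing newline still attached, so the sketch strips the URL before using it; that newline is a likely reason your backtick command misbehaves, though I have not tested it against your script.

```ruby
# Hypothetical helper: run an external program and return its output lines.
# The command is passed as an array, so no shell is involved and the
# arguments need no quoting.
def command_output(*argv)
  IO.popen(argv) { |io| io.readlines.map(&:chomp) }
end

# Hypothetical helper: the first line of the lynx header dump that mentions
# HTTP. Assumes lynx is installed and on the PATH.
def status_line(url)
  # strip removes the trailing newline that each_line leaves on the URL
  command_output("lynx", "-dump", "-head", url.strip).grep(/HTTP/).first
end
```

With this, the loop body would become something like status_line(line) for each line of the input file.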
Anyway, I think the best way to do this is with Net::HTTP
( http://phrogz.net/ProgrammingRuby/lib_network.html#NetHTTP ). Give it
a look; you might find it useful :)
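
Here is a rough sketch of what the Net::HTTP route could look like, under a few assumptions: the input lines are full URLs starting with http:// or https://, redirects are not followed, and a simple regular expression is good enough to pick out the <title>. The helper names extract_title and fetch_row are invented for this example.

```ruby
require "net/http"
require "uri"

# Hypothetical helper: text of the first <title> element, or nil if none.
# A regex is a shortcut here, not a real HTML parser.
def extract_title(html)
  match = html.to_s.match(/<title[^>]*>(.*?)<\/title>/im)
  match ? match[1].strip.gsub(/\s+/, " ") : nil
end

# Hypothetical helper: fetch one URL and return [url, status line, title].
# Any failure (bad URL, no such host, timeout) is reported in the row
# instead of aborting the whole run.
def fetch_row(url)
  response = Net::HTTP.get_response(URI.parse(url))
  status = "HTTP/#{response.http_version} #{response.code} #{response.message}"
  [url, status, extract_title(response.body)]
rescue StandardError => e
  [url, "error: #{e.class}", nil]
end
```

Each row has the three fields you listed (URL, header, <title>), so you could call fetch_row(line.strip) for every line of the input file and write the rows out with Ruby's standard csv library.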