Asp Forum - downloading web page as HTML accessed through WATIR

michael

10/19/2006 5:03:00 AM

Is there any way to download a page accessed through Watir as an HTML
page?

for instance,

require 'watir'

ie = Watir::IE.start("http://www.yahoo...)

The code above opens www.yahoo.com. What should we do to save
this page as yahoo.htm?

Any suggestions or hints would be deeply appreciated.

michael

3 Answers

lrlebron@gmail.com

10/22/2006 12:17:00 AM

ie.html is what you are looking for

aFile = File.new("yahoo.htm" , "w")
aFile << ie.html
aFile.close

and if you want to view the file then

ie.goto("yahoo.htm")
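The file-writing step above can also be written with the block form of File.open, which closes the file even if an exception is raised partway through. This is only a minimal sketch: save_html is a hypothetical helper name, not part of Watir, and it takes any HTML string, so with Watir you would pass it ie.html.

```ruby
# Hypothetical helper (not part of Watir): write an HTML string to +path+.
# The block form of File.open closes the file automatically.
def save_html(html, path)
  File.open(path, 'w') { |file| file << html }
  File.expand_path(path) # absolute path of the saved file
end

# Assumed usage with a running Watir::IE instance named ie:
#   save_html(ie.html, 'yahoo.htm')
```

Returning the absolute path is a design choice for this sketch, so the caller knows exactly where the file landed.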

David Vallner

10/22/2006 1:19:00 PM

If this is all you need, you might as well use Net::HTTP or open-uri.
This should have lower overhead, since you're not instantiating an IE
control.

require 'open-uri'

open('http://www.yahoo...) { |html|
  open('yahoo.html', 'w') { |out|
    out.print(html.read)
  }
}
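
For readers on newer Rubies: since Ruby 3.0, a bare open no longer accepts URLs, and open-uri is used through URI.open instead. The following is a rough sketch of the same idea under that assumption. download_page and its opener parameter are hypothetical names introduced here (the parameter only exists so the fetch can be stubbed without a network connection), and the URL in the usage comment is a placeholder.

```ruby
require 'open-uri'

# Hypothetical helper: fetch +url+ and write the response body to +path+.
# +opener+ defaults to URI.open from open-uri; it is an assumption of this
# sketch, added so the network call can be replaced in tests.
def download_page(url, path, opener: URI.method(:open))
  opener.call(url) do |remote|
    # 'wb' avoids newline translation on Windows.
    File.open(path, 'wb') { |out| out.print(remote.read) }
  end
  path
end

# Assumed usage (placeholder URL):
#   download_page('http://www.example.com/', 'example.html')
```

Like the original snippet, this saves the HTML as the server sent it, without starting a browser.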

David Vallner

Chris McMahon

10/23/2006 11:53:00 PM

lrlebron@gmail.com wrote:
> ie.html is what you are looking for
>
> aFile = File.new("yahoo.htm" , "w")
> aFile << ie.html
> aFile.close
>
> and if you want to view the file then
>
> ie.goto("yahoo.htm")

Be very careful here, and make sure you understand what you are doing.
Watir does *not* see the HTML source of the page; Watir only sees the DOM
in Internet Explorer. If the HTML is missing a "</p>", for instance, Watir
can't see that. The #html method will *always* yield valid HTML, because
it is interpreting the DOM, regardless of how awful the original HTML
may or may not be.

comp.lang.ruby
