[lnkForumImage]
TotalShareware - Download Free Software

Confronta i prezzi di migliaia di prodotti.
Asp Forum
 Home | Login | Register | Search 


 

Forums >

comp.lang.ruby

Entering data into Excel, in specific iterated rows/columns

reed.adam@gmail.com

8/9/2006 10:38:00 PM

I have a script that spiders a page looking for elements. In this
specific case, I'm looking for links (innerText and href).

I have this script working so far to output data to a text file, but
I'd like to output it to an Excel file for organization. I'm getting
stuck however trying to think of the correct method to put all of the
link.innerText values into A1, A2, A3, etc and the corresponding
link.href into B1, B2, etc.

Here is the code I have so far (with the normal text file code in place
of the excel code I have yet to think of):

ef grab_links(url)
$ie.goto(url)
excel = WIN32OLE.new("excel.application")
excel.visible = true
workbook = excel.workbooks.add
workbook.saveas("#{Project}_links.xls")
workbook.close

linkfile.puts "\n#{url}"
$ie.links.each{|link|
linkfile.puts("#{link.innerText} - #{link.href}") unless
link.innerText.empty?
}

linkfile.puts "\n\n"
end

One big file with it formatted like this would be fine, or even a
different worksheet for each URL would be great. I'm not really
concered with that. Can you guys give me hand?

Thanks!

9 Answers

reed.adam@gmail.com

8/9/2006 10:40:00 PM

0

Shoot, pardon the "ef" (def) and the excel.close in there, those are
mistakes.

reed.adam@gmail.com wrote:
> I have a script that spiders a page looking for elements. In this
> specific case, I'm looking for links (innerText and href).
>
> I have this script working so far to output data to a text file, but
> I'd like to output it to an Excel file for organization. I'm getting
> stuck however trying to think of the correct method to put all of the
> link.innerText values into A1, A2, A3, etc and the corresponding
> link.href into B1, B2, etc.
>
> Here is the code I have so far (with the normal text file code in place
> of the excel code I have yet to think of):
>
> ef grab_links(url)
> $ie.goto(url)
> excel = WIN32OLE.new("excel.application")
> excel.visible = true
> workbook = excel.workbooks.add
> workbook.saveas("#{Project}_links.xls")
> workbook.close
>
> linkfile.puts "\n#{url}"
> $ie.links.each{|link|
> linkfile.puts("#{link.innerText} - #{link.href}") unless
> link.innerText.empty?
> }
>
> linkfile.puts "\n\n"
> end
>
> One big file with it formatted like this would be fine, or even a
> different worksheet for each URL would be great. I'm not really
> concered with that. Can you guys give me hand?
>
> Thanks!

Sam Smoot

8/9/2006 11:04:00 PM

0

I'd probably recommend FasterCSV instead. You can use Excel through
WIN32OLE (google both terms for tutorials on how), but it's much Much
MUCH simpler to just use FasterCSV.

$ gem install fastercsv
$ irb
>> items = []
>> items << [ 'text1', 'link1' ]
>> items << [ 'text2', 'link2' ]
>> require 'faster_csv'
>> FasterCSV::open('links.csv', 'wb+') do |csv|
>> items.each { |item| csv << item }
>> end
>> exit
$ type links.csv
text1, link1
text2, link2

(It will handle quoting, escaping, etc for you).

You can open a CSV file with Excel just by double-clicking it usually
(Excel should be the default extension handler for CSV files unless
you've registered something else to the extension), and you can save
the file in Excel with "Save As..." if you need a true Excel file with
all the formatting options (CSV files can't remember column widths for
example if you want the widths auto-expanded to the size of the
content).

Even if you still wanted to go with a real Excel file, I'd probably
still use FasterCSV to generate an intermediary file since it's much
easier to do than messing with Excel.Application, Excel.Workbook,
Excel.Worksheet, Excel.Range, etc. Then, you can just instantiate an
Excel.Application, open the CSV as a workbook, and do a "Save As..."
programmatically. With a lot less code, and a lot less debugging effort
than trying to use Excel directly for everything. For example, if
anything goes wrong in your process while working with an
Excel.Application, the Excel process gets orphaned. So you have to wrap
the entire process in a rescue basically so you can make sure and
"ensure" "excel.Quit()" gets called.

If you want to do it the hard way though, then there's an excellent
starter at RubyGarden:
http://wiki.rubygarden.org/Ruby/page/show/Scri...

Good luck!

reed.adam@gmail.com

8/10/2006 4:52:00 AM

0

Thanks Sam, I appreciate the reply. That tutorial is what I have been
using, but nobody really talks about handling an unknown amount of
variable pairs.

Where exactly in the FasterCSV code are you stating, "first part",
"second part". Would it be the same as just saying "#{firstitem},
#{seconditem}\n"?

That would give me:

"first link name, first link url
second link name, second link url"

Would that type of comma-separated file work for Excel output?

Thanks again.

Rafael G.

8/10/2006 8:34:00 AM

0



reed.adam@gmail.com wrote:
>
> I have a script that spiders a page looking for elements. In this
> specific case, I'm looking for links (innerText and href).
>
> I have this script working so far to output data to a text file, but
> I'd like to output it to an Excel file for organization. I'm getting
> stuck however trying to think of the correct method to put all of the
> link.innerText values into A1, A2, A3, etc and the corresponding
> link.href into B1, B2, etc.
>
> Here is the code I have so far (with the normal text file code in place
> of the excel code I have yet to think of):
>
> ef grab_links(url)
> $ie.goto(url)
> excel = WIN32OLE.new("excel.application")
> excel.visible = true
> workbook = excel.workbooks.add
> workbook.saveas("#{Project}_links.xls")
> workbook.close
>
> linkfile.puts "\n#{url}"
> $ie.links.each{|link|
> linkfile.puts("#{link.innerText} - #{link.href}") unless
> link.innerText.empty?
> }
>
> linkfile.puts "\n\n"
> end
>
> One big file with it formatted like this would be fine, or even a
> different worksheet for each URL would be great. I'm not really
> concered with that. Can you guys give me hand?
>
> Thanks!
>
>
>
>

Try with Spreadsheet::Excel

It let you create worksheets, write in a defined column - row, ...

Regards!

--
View this message in context: http://www.nabble.com/Entering-data-into-Excel%2C-in-specific-iterated-rows-columns-tf2081687.htm...
Sent from the ruby-talk forum at Nabble.com.


David Mullet

8/10/2006 11:31:00 AM

0

reed.adam@gmail.com wrote:
> I have a script that spiders a page looking for elements. In this
> specific case, I'm looking for links (innerText and href).
>
> I have this script working so far to output data to a text file, but
> I'd like to output it to an Excel file for organization. I'm getting
> stuck however trying to think of the correct method to put all of the
> link.innerText values into A1, A2, A3, etc and the corresponding
> link.href into B1, B2, etc.
>
> Here is the code I have so far (with the normal text file code in place
> of the excel code I have yet to think of):
>
> ef grab_links(url)
> $ie.goto(url)
> excel = WIN32OLE.new("excel.application")
> excel.visible = true
> workbook = excel.workbooks.add
> workbook.saveas("#{Project}_links.xls")
> workbook.close
>
> linkfile.puts "\n#{url}"
> $ie.links.each{|link|
> linkfile.puts("#{link.innerText} - #{link.href}") unless
> link.innerText.empty?
> }
>
> linkfile.puts "\n\n"
> end
>
> One big file with it formatted like this would be fine, or even a
> different worksheet for each URL would be great. I'm not really
> concered with that. Can you guys give me hand?
>
> Thanks!

xl = WIN32OLE.new("Excel.Application")
xl.Visible = 1
wb = xl.Workbooks.Add
ws = wb.Worksheets(1)
row = 0
$ie.links.each do |link|
next if link.innerText.empty?
row = row + 1
ws.Cells(row, 1).Value = link.innerText
ws.Cells(row, 2).Value = link.href
end

[ Not tested, and posted before I've had my morning tea... ]

Mully


--
Posted via http://www.ruby-....

reed.adam@gmail.com

8/10/2006 1:35:00 PM

0

Wow David, that's dedication -- coding before tea? Were you at least
wearing a helmet or other protection?

I really do appreciate the reply, that's _exactly_ what I was looking
for. Even if it happens to not fully work, that's the logic I was
looking for so the commands should quickly follow.

Thanks again --


David Mullet wrote:

>
> xl = WIN32OLE.new("Excel.Application")
> xl.Visible = 1
> wb = xl.Workbooks.Add
> ws = wb.Worksheets(1)
> row = 0
> $ie.links.each do |link|
> next if link.innerText.empty?
> row = row + 1
> ws.Cells(row, 1).Value = link.innerText
> ws.Cells(row, 2).Value = link.href
> end
>
> [ Not tested, and posted before I've had my morning tea... ]
>
> Mully
>
>
> --
> Posted via http://www.ruby-....

Sam Smoot

8/10/2006 6:44:00 PM

0

> Where exactly in the FasterCSV code are you stating, "first part",
> "second part". Would it be the same as just saying "#{firstitem},
> #{seconditem}\n"?

The "csv << [ 'first part', 'second part' ]"

The csv object passed into the block by FasterCSV::open just expects an
array of values. It handles the escaping of those values internally.
Yes, it's fully compatible with Excel, produces a lot smaller files, is
faster, simpler, and works with pretty much any text editor if you
don't have Excel handy.

I used to try to do everything with Excel Automation, but then it
dawned on me... unless you want colored headers, custom fonts, multiple
sheets, etc, there just isn't good reason I should be targetting Excel
and excluding everyone that doesn't have it, when I can just go with
the simpler CSV and it's available to everyone then (including Excel
users of course).

That's just my rationale tho'. Sometimes people do want the colored
headers, multiple sheets, etc. ;-)

x1

8/11/2006 12:29:00 AM

0

can you not use the worksheet.Range method?

ie
require 'win32ole'
excel = WIN32OLE::new('excel.Application')
excel.visible = true
workbook = excel.workbooks.add
worksheet = workbook.Worksheets(1)
i = 1
"a".upto("z") {|s|
worksheet.Range("A#{i.to_s}:B#{i.to_s}").value = [s, s.next]
i+=1
}


On 8/10/06, Sam Smoot <ssmoot@gmail.com> wrote:
> > Where exactly in the FasterCSV code are you stating, "first part",
> > "second part". Would it be the same as just saying "#{firstitem},
> > #{seconditem}\n"?
>
> The "csv << [ 'first part', 'second part' ]"
>
> The csv object passed into the block by FasterCSV::open just expects an
> array of values. It handles the escaping of those values internally.
> Yes, it's fully compatible with Excel, produces a lot smaller files, is
> faster, simpler, and works with pretty much any text editor if you
> don't have Excel handy.
>
> I used to try to do everything with Excel Automation, but then it
> dawned on me... unless you want colored headers, custom fonts, multiple
> sheets, etc, there just isn't good reason I should be targetting Excel
> and excluding everyone that doesn't have it, when I can just go with
> the simpler CSV and it's available to everyone then (including Excel
> users of course).
>
> That's just my rationale tho'. Sometimes people do want the colored
> headers, multiple sheets, etc. ;-)
>
>
>

Sam Smoot

8/11/2006 4:53:00 AM

0


x1 wrote:
> can you not use the worksheet.Range method?

That works too. I just appreciate a good 1 to ~3 liner. :-) FasterCSV
does that. Excel Automation doesn't.