[lnkForumImage]
TotalShareware - Download Free Software

Confronta i prezzi di migliaia di prodotti.
Asp Forum
 Home | Login | Register | Search 


 

Forums >

comp.lang.ruby

1st program -- how would you improve this?

Esmail Bonakdarian

10/5/2006 7:51:00 PM

Hi,

I just wrote my first Ruby program and I would be very interested in
some feedback as to style and how to write things more in the
Ruby-way.

I know I could use more methods and/or possibly classes, and that
will happen, but for now I would like to hear what betrays my
programming style coming from other languages.

Thanks,
Esmail



Briefly, this tiny program grabs a HTML for a random date within the
last 10 years from the Astro Picture of the Day (APOD) site
(http://antwrp.gsfc.nasa.gov/apod/archi...).

It then parses the HTML file to extract the name of the larger .jpg or
..gif file (there is usually a smaller version too), generates the
correct URL for the image, and then fires up eog (a Linux program to
display images) to fetch and display the image.

Works .. but it's not pretty, at least in terms of lack of methods
etc. Other ways to improve this?


----------------

#!/usr/bin/ruby
require 'net/http'

# method found on-line, slightly modified

def random_date(years_back=5)
year = Time.now.year - rand(years_back) - 1
month = rand(12) + 1
day = rand(31) + 1
year = year.to_s.gsub(/20|19/,'')
sprintf("%02s%02d%02d", year,month,day)
end

r_date=random_date(10) # go back 10 years

puts r_date

#
# get the html file from the apod site
#

text=Net::HTTP.start( 'antwrp.gsfc.nasa.gov', 80 ) do |http|
http.get( "/apod/ap#{r_date}.html" ).body
end

#
# get the HTML and process contents line by line
#

found_title = false
found_image = false

text.each { |line|

#
# find the title of the current pic
#

if !found_title && line =~ /<TITLE>/i
title = $'

puts "----------------------------------------------"
printf("%s\n", title.strip)
puts "----------------------------------------------\n\n"

found_title = true
end



#
# now start searching for the image name
#

if !found_image && line =~ /\.(jpg|gif) *\">/i

if line.include? "jpg" # is there a way to case do insensitve comparison here?
image = $`+".jpg"
else
image = $`+".gif"
end

url="http://apod.nasa.gov/a...
image=image.gsub(/<a +href="/i,'')
fetch_url=url+image

printf("%s\n", fetch_url)

found_image = true
system("eog " + fetch_url + " >& /dev/null &")
end

}
9 Answers

Sander Land

10/5/2006 9:32:00 PM

0

On 10/5/06, Esmail Bonakdarian <ebonakDUH@hotmail.com> wrote:
> Hi,
>
> I just wrote my first Ruby program and I would be very interested in
> some feedback as to style and how to write things more in the
> Ruby-way.
>
> I know I could use more methods and/or possibly classes, and that
> will happen, but for now I would like to hear what betrays my
> programming style coming from other languages.
>
> Thanks,
> Esmail
> ...
> Works .. but it's not pretty, at least in terms of lack of methods
> etc. Other ways to improve this?

Hi Esmail,

Here's my shot at rewriting your program.
Note that this is simply how I would write it, not everything changed
is necessarily an 'improvement'.

require 'open-uri'
def random_date(years_back=5)
Time.at(Time.now - rand(years_back * 365 * 24 * 3600)).strftime('%y%m%d')
end

r_date=random_date(10)
puts r_date

open("http://antwrp.gsfc.nasa.go...{r_date}.html") {|f|
text = f.read
if text =~ /<TITLE>([^<]*)/i
puts "----------------------------------------------"
puts $1.strip
puts "----------------------------------------------\n\n"
end

if text =~ /href=\"([^\"]+(jpg|gif))\">/i
image = $1
fetch_url = "http://apod.nasa.gov/a... + image
puts fetch_url
# system(...)
end
}

Comments:

- your random_date generated invalid dates sometimes, 01/30/02 and
such. Using timestamps for everything is a bit more stable (Although
slightly off if lots of leap years are in the range).
- puts does to_s automatically, so printf("%s\n",..) is never needed
- reading the text per line was meant as an optimization? This is
probably not necessary.
- $' / $` and such are rarely used in Ruby, try to avoid them when the
more readable $1 or String#scan will do.
- more methods / classes are not really needed for such a small program.

David Vallner

10/5/2006 9:50:00 PM

0

I'd probably have used OpenURI and then

text = open('http://www.example...).read

which I find a little more readable.

Oh, and probably also a HTML parser like Hpricot to make getting the
title tag more foolproof.

I also strongly disapprove of the Perl-ish use of global variables that
get set as a side-effect of a regex match. Notably because that's a)
relying on a side-effect of a function on b) a global variable. But
coding everything explicitly is a personal pet peeve of mine.

David Vallner

Timothy Goddard

10/5/2006 10:00:00 PM

0

Esmail Bonakdarian wrote:
> Hi,
>
> I just wrote my first Ruby program and I would be very interested in
> some feedback as to style and how to write things more in the
> Ruby-way.
>
> I know I could use more methods and/or possibly classes, and that
> will happen, but for now I would like to hear what betrays my
> programming style coming from other languages.
>
> Thanks,
> Esmail
>
>
>
> Briefly, this tiny program grabs a HTML for a random date within the
> last 10 years from the Astro Picture of the Day (APOD) site
> (http://antwrp.gsfc.nasa.gov/apod/archi...).
>
> It then parses the HTML file to extract the name of the larger .jpg or
> .gif file (there is usually a smaller version too), generates the
> correct URL for the image, and then fires up eog (a Linux program to
> display images) to fetch and display the image.
>
> Works .. but it's not pretty, at least in terms of lack of methods
> etc. Other ways to improve this?
>
<<snip>>

You could use Hpricot to do this much more easily and reliably. I've
whipped up an example of using it for this. Note also that your random
dat function could produce a date in the future up to the end of the
current year. Error checking when using any of the network libraries is
also a must.

For a first program that's pretty good though! You obviously have the
hang of using Ruby, you just need to pick up on a few of the libraries
out there and how best to use them.

Here's how I would do it:

# Remove the rubygems require if you manually installed hpricot
require 'rubygems'
require 'hpricot'
require 'net/http'

class Time
def self.random(years_back = 10)
# Set start and end times
end_time = Time.now
start_time = Time.mktime(end_time.year - 10, end_time.month,
end_time.day, 0, 0,0,0)

Time.at(start_time + rand(end_time - start_time))
end
end

# Select a date
r_date = Time.random.strftime("%y%m%d")
puts r_date

# Retrieve the page
response = Net::HTTP.get_response("antwrp.gsfc.nasa.gov",
"/apod/ap#{r_date}.html")
unless (200..299) === response.code.to_i
puts "There was an error retrieving the document"
puts "Response code: #{response.code}"
exit
end

# Parse the document
doc = Hpricot(response.body)

# Extract title
title = (doc / "title").first.inner_html
puts "Title: #{title}"

# Extact all images
images = (doc/"img")
if images.length == 0
puts "Could not extract the image from the document."
exit
end

# Assume the first image is the one we want.
image = images.first["src"]

puts "Running eog (whatever that is) to load image #{image}."

# Execute the program. Note that it depends on the image path being
relative.
system("eog http://apod.nasa....{image}")

Gary Wright

10/5/2006 10:51:00 PM

0


On Oct 5, 2006, at 5:49 PM, David Vallner wrote:
> I also strongly disapprove of the Perl-ish use of global variables
> that
> get set as a side-effect of a regex match. Notably because that's a)
> relying on a side-effect of a function on b) a global variable. But
> coding everything explicitly is a personal pet peeve of mine.

Just because they start with a $ doesn't mean they are global. I
believe
$1, $2 and so on are actually per thread local variables.

Gary Wright




Esmail Bonakdarian

10/5/2006 11:35:00 PM

0

David Vallner wrote:
> I'd probably have used OpenURI and then
>
> text = open('http://www.example...).read
> which I find a little more readable.

ah .. cool .. I got what I had before from some sample
code I found either in a book or on line. Lots of ways
to do things, hence my query, and goal to find out :-)

> Oh, and probably also a HTML parser like Hpricot to make getting the
> title tag more foolproof.

Some one else had mentioned Hpricot .. but that's an add-on, right?
Ie not part of the standard Ruby distribution? My goal is to write
code that will run pretty much everywhere w/o special requirements,
but I will look into Hpricot since it seems very useful.

> I also strongly disapprove of the Perl-ish use of global variables that
> get set as a side-effect of a regex match. Notably because that's a)
> relying on a side-effect of a function on b) a global variable. But
> coding everything explicitly is a personal pet peeve of mine.

Oh, I'm all for explicitly coding too .. (ie don't rely on default
initial values for variables, always explicitly initialize them myself
- good habit in case you work with a language that doesn't do that for
you, like C. or use parens etc.)

So, by globals, are you referring to the $` and $' ?

MonkeeSage

10/5/2006 11:41:00 PM

0

David Vallner wrote:
> I also strongly disapprove of the Perl-ish use of global variables that
> get set as a side-effect of a regex match. Notably because that's a)
> relying on a side-effect of a function on b) a global variable. But
> coding everything explicitly is a personal pet peeve of mine.

Hi David,

Ruby embraces the side-effect. This doesn't mean unpredictability as
many functional programmers assume; it means that the conditions which
*always cause the same side-effect* should be incorporated into the
control flow, leading to the same output from the same input, always.

s = 'tree'
if s =~ /(.ree)/
p $1
end
# => "tree"
if s =~ /(free)/
p $1
end
p $1 # => nil

$1 will *always* be non-nil after a grouped match, and nil after a
non-match or an ungrouped match. So the side-effect is fully
predictable.

Regarding global variables; I don't see anything wrong with them *in
concept*; granted you don't always need them, and you can explicitly
pass around a local variable (or use an object attribute like an
instance variable) to get the same effect in most cases, but that is
the same thing *conceptually* as using the global scope to implicity
pass the value. Also, granted that you should not clutter up the global
namespace with a bunch of junk, especially not vast quantities of
large, short-lived data; but the result of a regexp match which is
released after the next match attempt doesn't seem to be in that
category.

But those issues aside, I also favor being explicit for the sake of
clarity.

m = /(.ree)/.match('tree')
if m and m[1]
p m[1]
end

But even then, $1 still gets set. ;)

Regards,
Jordan

David Vallner

10/5/2006 11:46:00 PM

0

gwtmp01@mac.com wrote:
> Just because they start with a $ doesn't mean they are global. I believe
> $1, $2 and so on are actually per thread local variables.
>

There's still the readability problem with that. You're communicating
information in the program via an implicitly assumed point instead of an
explicitly declared one.

David Vallner

Esmail Bonakdarian

10/5/2006 11:49:00 PM

0

Sander Land wrote:
>
> Here's my shot at rewriting your program.
> Note that this is simply how I would write it, not everything changed
> is necessarily an 'improvement'.

Hi

I do appreciate the feedback!

> require 'open-uri'
> def random_date(years_back=5)
> Time.at(Time.now - rand(years_back * 365 * 24 * 3600)).strftime('%y%m%d')
> end

ah yes, I thought about that after I posted this. Didn't know about
strftime, hence my C sprintf standby :-) I may slightly adjust this
to start with a date in 1995 and work forward to the present.

> r_date=random_date(10)
> puts r_date
>
> open("http://antwrp.gsfc.nasa.go...{r_date}.html") {|f|
> text = f.read
> if text =~ /<TITLE>([^<]*)/i
> puts "----------------------------------------------"
> puts $1.strip
> puts "----------------------------------------------\n\n"
> end
>
> if text =~ /href=\"([^\"]+(jpg|gif))\">/i
> image = $1
> fetch_url = "http://apod.nasa.gov/a... + image
> puts fetch_url
> # system(...)
> end
> }

nice ...

> Comments:
>
> - your random_date generated invalid dates sometimes, 01/30/02 and
> such. Using timestamps for everything is a bit more stable (Although
> slightly off if lots of leap years are in the range).
> - puts does to_s automatically, so printf("%s\n",..) is never needed
> - reading the text per line was meant as an optimization? This is
> probably not necessary.

no, not really, just thought the string matching would work
better with line-by-line.

> - $' / $` and such are rarely used in Ruby, try to avoid them when the
> more readable $1 or String#scan will do.
> - more methods / classes are not really needed for such a small program.

Thanks for taking time to make these suggestions/comments, they are
very useful and just exactly what I was looking for.

Esmail

Esmail Bonakdarian

10/6/2006

0

Timothy Goddard wrote:
>
> You could use Hpricot to do this much more easily and reliably. I've
> whipped up an example of using it for this. Note also that your random
> dat function could produce a date in the future up to the end of the
> current year. Error checking when using any of the network libraries is
> also a must.

Agreed .. more error checking needs to be done .. as you mention
below, I need to browse the libraries and find out what is available
and use it :-)

> For a first program that's pretty good though! You obviously have the
> hang of using Ruby, you just need to pick up on a few of the libraries
> out there and how best to use them.

I've been programming for quite a while, but each language has
its own idiomatic ways for doing things ... which is why this
feedback is so valuable.

> Here's how I would do it:
>
> # Remove the rubygems require if you manually installed hpricot
> require 'rubygems'
> require 'hpricot'
> require 'net/http'
>
> class Time
> def self.random(years_back = 10)
> # Set start and end times
> end_time = Time.now
> start_time = Time.mktime(end_time.year - 10, end_time.month,
> end_time.day, 0, 0,0,0)
>
> Time.at(start_time + rand(end_time - start_time))
> end
> end
>
> # Select a date
> r_date = Time.random.strftime("%y%m%d")
> puts r_date
>
> # Retrieve the page
> response = Net::HTTP.get_response("antwrp.gsfc.nasa.gov",
> "/apod/ap#{r_date}.html")
> unless (200..299) === response.code.to_i
> puts "There was an error retrieving the document"
> puts "Response code: #{response.code}"
> exit
> end
>
> # Parse the document
> doc = Hpricot(response.body)
>
> # Extract title
> title = (doc / "title").first.inner_html
> puts "Title: #{title}"
>
> # Extact all images
> images = (doc/"img")
> if images.length == 0
> puts "Could not extract the image from the document."
> exit
> end
>
> # Assume the first image is the one we want.
> image = images.first["src"]
>
> puts "Running eog (whatever that is) to load image #{image}."
>
> # Execute the program. Note that it depends on the image path being
> relative.
> system("eog http://apod.nasa....{image}")
>

Thank you!

Esmail