comp.lang.ruby

Code duplication

Arun Kumar

4/6/2009 1:51:00 PM

Hi all,
The following code extracts the HTML contents of a website. It includes
handling for URL redirects and for BadRequest errors.

# get the HTTP response from 'uri'
response = Net::HTTP.get_response(uri)
case response
# if the URL redirects, fetch the contents of the redirected URL
when Net::HTTPRedirection
  uri = URI.parse(response['Location'])
  response = Net::HTTP.get_response(uri)
# in case of a bad request error
when Net::HTTPBadRequest
  http = Net::HTTP.start(uri.host, uri.port)
  # get the HTML data by setting the path to '/' and sending a user agent
  response = http.get("/", "User-Agent" => "Mozilla/4.0 (compatible; MSIE 5.5; Windows NT 5.0)")
end

data = response.body

My tutor says that there is duplication in the above code, i.e. the code
for reading the HTML is specified twice without any purpose, and that it
should be removed. I have no idea where the mistake is. I'm a newbie to
Ruby and I don't understand the problem or where things went wrong. Can
anyone please help me find the mistake?

Thanks in advance.

Regards
Arun
--
Posted via http://www.ruby-....

7 Answers

Loga Ganesan

4/6/2009 2:27:00 PM

What is the purpose of the statement below?

  response = http.get("/", "User-Agent" => "Mozilla/4.0 (compatible; MSIE 5.5; Windows NT 5.0)")

Since you have already obtained the response object using get_response,
why is it needed?

Arun Kumar

4/6/2009 2:39:00 PM

Hi,
Thanks for the reply. If there is a bad request error, I have to connect
to the host and port explicitly and then fetch the data. For example, if I
try to fetch the HTML contents of youtube.com, I get a bad request error.
So I used Net::HTTP.start() and then passed the path and user agent to
retrieve the contents, storing the result in response. I don't think there
is any other way. If I remove that part, I'm not able to read the HTML.

Thanks
Arun
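
One possible reading of the tutor's remark, offered as an assumption rather than something stated in the thread: if the User-Agent header is sent with the very first request, the separate BadRequest branch may become unnecessary, because there is then only one place where the page is fetched. A minimal sketch follows. The names BROWSER_HEADERS, build_get_request and fetch_page are hypothetical, and whether any particular site accepts the request is not guaranteed.

```ruby
require 'net/http'
require 'uri'

# Assumed header set; the User-Agent string is the one used in the thread.
BROWSER_HEADERS = { "User-Agent" => "Mozilla/4.0 (compatible; MSIE 5.5; Windows NT 5.0)" }

# Hypothetical helper: builds a GET request that always carries the header.
# request_uri yields "/" when the URI has an empty path.
def build_get_request(uri)
  Net::HTTP::Get.new(uri.request_uri, BROWSER_HEADERS)
end

# Hypothetical helper: sends the request over a connection that is closed
# when the block exits. Plain HTTP only, and no redirect handling here.
def fetch_page(uri)
  Net::HTTP.start(uri.host, uri.port) do |connection|
    connection.request(build_get_request(uri))
  end
end
```

With this shape, fetch_page(uri).body would replace both the get_response call and the http.get call in the original code.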

Eleanor McHugh

4/6/2009 3:15:00 PM

On 6 Apr 2009, at 14:50, Arun Kumar wrote:
> My tutor says that there is duplication in the above code, i.e. the code
> for reading the HTML is specified twice without any purpose, and that it
> should be removed. I have no idea where the mistake is. I'm a newbie to
> Ruby and I don't understand the problem or where things went wrong. Can
> anyone please help me find the mistake?


There are probably better solutions, but the following illustrates the
point your tutor is making:

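require 'net/http'
require 'uri'

MOZILLA_HEADER = { "User-Agent" => "Mozilla/4.0 (compatible; MSIE 5.5; Windows NT 5.0)" }

def get_http_response uri, max_redirects = 0
  Net::HTTP.start(uri.host, uri.port) do |connection|
    # an empty path is not a valid request target, so fall back to "/"
    path = uri.path.empty? ? "/" : uri.path
    response = connection.get(path, MOZILLA_HEADER)
    case response
    when Net::HTTPRedirection
      if max_redirects > 0 then
        get_http_response URI.parse(response['Location']), (max_redirects - 1)
      else
        raise "Too many redirects"
      end
    when Net::HTTPBadRequest
      if path == "/" then
        # the site root was already tried, so return the error as it is
        response
      else
        # retry once against the site root
        get_http_response URI.parse("http://#{uri.host}:#{uri.port}/"), max_redirects
      end
    else
      # any other response (including success) is passed back unchanged
      response
    end
  end
end

data = get_http_response(my_uri, 3).body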

See how get_http_response is recursive in the case of an erroneous
response? This minimises the actual HTTP interaction code as well as
elegantly handling redirects. Whilst this could result in many more
http connections being used, it also makes them clear up after
themselves which is always good.
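
For comparison, the same bounded-redirect idea can also be written as a loop, so that only one request is in flight at a time. This is a sketch under assumptions, not the approach given above: fetch_following_redirects is a hypothetical name, the block passed to it is assumed to perform the actual request and return a Net::HTTPResponse, and the Location header is assumed to hold an absolute URL.

```ruby
require 'net/http'
require 'uri'

# Hypothetical helper: follows at most max_redirects redirects in a loop.
# The block performs the request for the given URI and returns the response.
def fetch_following_redirects(uri, max_redirects = 3)
  # one initial request plus up to max_redirects follow-up requests
  (max_redirects + 1).times do
    response = yield(uri)
    # anything that is not a redirect is handed straight back to the caller
    return response unless response.is_a?(Net::HTTPRedirection)
    uri = URI.parse(response['Location'])
  end
  raise "Too many redirects"
end

# Possible usage (performs real network access, so it is left commented out):
#   data = fetch_following_redirects(my_uri) { |u| Net::HTTP.get_response(u) }.body
```

Passing the request in as a block keeps the redirect bookkeeping separate from the HTTP call, which also makes the loop easy to exercise without a network.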


Ellie

Eleanor McHugh
Games With Brains
http://slides.games-with-...
----
raise ArgumentError unless @reality.responds_to? :reason



Arun Kumar

4/7/2009 4:32:00 AM

Thanks Ellie. You gave me a clue not only for solving the code
duplication but also for handling the redirects. Thanks a lot.
Regards
Arun

Eleanor McHugh

4/7/2009 11:06:00 AM

On 7 Apr 2009, at 05:31, Arun Kumar wrote:
> Thanks Ellie. You gave me a clue not only for solving the code
> duplication but also for handling the redirects. Thanks a lot.

My pleasure :)


Ellie




Arun Kumar

4/7/2009 11:19:00 AM

Hi Ellie,
Thank you once again for your reply. It helped me a lot. Now I have a
couple of questions for you.
1) How can I specify the redirect limit without declaring it inside a
method? Is it possible?
2) By including a redirect limit, will the URL redirection code be as
effective as it can be, or should I make some additions to the code to
handle redirection effectively?

Thanks
Arun

Eleanor McHugh

4/7/2009 12:31:00 PM

On 7 Apr 2009, at 12:18, Arun Kumar wrote:
> Hi Ellie,
> Thank you once again for your reply. It helped me a lot. Now I have a
> couple of questions for you.
> 1) How can I specify the redirect limit without declaring it inside a
> method? Is it possible?

The redirect limit isn't declared inside the method but as one of the
parameters of the method, which is what allows recursive execution as
each redirect is received. You'll note that I provided an initial value
as part of the initial function call:

  data = get_http_response(my_uri, 3).body

but in a real-world program you would either specify a constant and use that:

  MAXIMUM_REDIRECTS = 3
  data = get_http_response(my_uri, MAXIMUM_REDIRECTS).body

or else wrap everything together into an object where this value would
be either an instance or class variable depending on your intent.
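
As a rough illustration of the object-based option, the sketch below keeps the limit in an instance variable with a class-level default. It is an assumption-laden example rather than code from this thread: PageFetcher, DEFAULT_MAX_REDIRECTS, HEADERS, fetch and request are all hypothetical names, only plain HTTP is handled, and the Location header is assumed to hold an absolute URL.

```ruby
require 'net/http'
require 'uri'

# Hypothetical class: wraps the fetching logic and its redirect limit.
class PageFetcher
  # class-level default, used when no limit is given to new
  DEFAULT_MAX_REDIRECTS = 3
  HEADERS = { "User-Agent" => "Mozilla/4.0 (compatible; MSIE 5.5; Windows NT 5.0)" }

  attr_reader :max_redirects

  def initialize(max_redirects = DEFAULT_MAX_REDIRECTS)
    @max_redirects = max_redirects
  end

  # Fetches uri, following redirects until the per-instance limit is used up.
  def fetch(uri, remaining = max_redirects)
    response = request(uri)
    return response unless response.is_a?(Net::HTTPRedirection)
    raise "Too many redirects" if remaining <= 0
    fetch(URI.parse(response['Location']), remaining - 1)
  end

  private

  # Performs one GET; the connection is closed when the block exits.
  def request(uri)
    Net::HTTP.start(uri.host, uri.port) do |connection|
      connection.get(uri.request_uri, HEADERS)
    end
  end
end
```

A caller could then write PageFetcher.new.fetch(my_uri).body to use the default, or PageFetcher.new(5) to allow more redirects for one particular fetcher.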

> 2) By including a redirect limit, will the URL redirection code be as
> effective as it can be, or should I make some additions to the code to
> handle redirection effectively?

I can't really answer that question without knowing more about the
real-world problem you're trying to solve. However in general I'd say
that whenever you have a recursive problem like this it's sensible to
ensure that it's throttled to prevent resource exhaustion. For a very
graphic example of why this is important - especially with network
applications - read up on the Morris Worm :)


Ellie
