Horacio Sanson
9/21/2006 6:07:00 AM
Thanks for your answer, but I found a way to fix this problem.
A look at the Mechanize code reveals that each page loaded is stored in a
history hash inside the Mechanize object. This means that as long as the
Mechanize object exists, the pages in its history are never released.
The solution? Simply set the history_max value to something more sensible
than infinite.
############################
agent = WWW::Mechanize.new
agent.history_max = 10
############################
And that's it... no more memory-hungry Mechanize.
I noticed that setting this value to zero causes some problems when submitting
forms, so don't set it to zero. Even a value of one seems to work OK.
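To illustrate why capping the history helps, here is a toy model in plain
Ruby. It is only a sketch and not Mechanize's actual implementation: the
BoundedHistory class and its methods are names I made up for this example.
The idea is simply that once the oldest pages are dropped from the history,
nothing references them any more and the garbage collector can reclaim them.

```ruby
# Toy model of a capped page history. This is NOT Mechanize's real code;
# BoundedHistory and its methods are hypothetical names for illustration.
class BoundedHistory
  attr_reader :pages

  def initialize(max_size = nil)
    @max_size = max_size # nil means "unlimited"
    @pages = []
  end

  # Remember a page, dropping the oldest entries once the cap is exceeded.
  def push(page)
    @pages << page
    @pages.shift while @max_size && @pages.size > @max_size
    self
  end

  def size
    @pages.size
  end
end

unbounded = BoundedHistory.new
bounded = BoundedHistory.new(10)

100.times do |i|
  body = "<html>page #{i}</html>"
  unbounded.push(body)
  bounded.push(body)
end

puts unbounded.size # every page is still referenced
puts bounded.size   # only the most recent pages remain
```

With the cap in place the history keeps only the newest entries, which is
the behaviour I am relying on when I set history_max above.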
Hope this helps,
Horacio
On Thursday 21 September 2006 14:03, John Labovitz wrote:
> On Sep 20, 2006, at 9:24pm, Horacio Sanson wrote:
> > In my script I cannot remove the WWW::Mechanize object since this page
> > in particular is a form and requires cookie state information to be
> > able to access the pages I need to download.
>
> What if you save the cookies out to a file?
> WWW::Mechanize::CookieJar has #save_as and #load methods to save and
> restore cookies.
>
> > Is there a way to tell the Mechanize object to delete the pages
> > already downloaded?
>
> I actually ran into a similar issue recently; your diagnosis explains
> why my program used too much memory.
>
> You might try the following (assuming "browser" is your
> WWW::Mechanize object):
>
> browser.page.content.replace "" # that's an empty string
> browser.page.root.children = []
>
> That should clear both the original text, and the parsed HTML. I'm
> not sure whether this would get rid of all the references, but at
> least it should help.
>
> --John