comp.lang.ruby

RSS/Atom feed consuming lib?

Marcus Bristav

10/18/2006 8:12:00 AM

I have a customer (we build their intranet with Rails) that subscribes
to a number of news feeds. They want to make these feeds available on
their intranet so we need to fetch, parse and publish these feeds
(like an internal web-based feed reader).

Are there any good Ruby libs for this that preferably support RSS 0.92,
RSS 2.0 and Atom (Atom is more of a nice-to-have than a need-to-have...) and
expose them through a nice common (for all formats) API?

I looked at RAA and Rubyforge but didn't find anything that really
piqued my interest (although I might have missed something)

/Marcus

6 Answers

Jochen Schalanda

10/18/2006 2:34:00 PM


Marcus Bristav <marcus.bristav@gmail.com> wrote:
> Are there any good Ruby libs for this that preferably support RSS 0.92,
> RSS 2.0 and Atom (Atom is more of a nice-to-have than a need-to-have...) and
> expose them through a nice common (for all formats) API?

Yes, syndication[1] and FeedTools[2] should be two of the better libraries.

[1]: http://rubyforge.org/projects/sy...
[2]: http://sporkmonger.com/projects/...

HTH,
Jochen

Marcus Bristav

10/18/2006 5:35:00 PM


Thanks for the tips! I've tried feedtools and it seems to work nicely :)

Out of curiosity: Why couldn't you use feedtools?

/Marcus

Gustav - Railist

10/18/2006 6:43:00 PM



> Marcus Bristav <marcus.bristav@gmail.com> wrote:
>
>> Are there any good Ruby libs for this that preferably support RSS 0.92,
>> RSS 2.0 and Atom (Atom is more of a nice-to-have than a need-to-have...) and
>> expose them through a nice common (for all formats) API?
>>
If you're planning to go through FeedBurner, you can check out the plugin at
http://combustible.rubyforg...

It's still very young, but perhaps you'll find it useful.

Gustav

--
about me:
My greatest achievement was when all the other
kids just learnt to count from 1 to 10,
i was counting (0..9)

- gustav.paul


Andy Smith

10/18/2006 9:23:00 PM


Marcus Bristav wrote:
> I have a customer (we build their intranet with Rails) that subscribes
> to a number of news feeds. They want to make these feeds available on
> their intranet so we need to fetch, parse and publish these feeds
> (like an internal web-based feed reader).
>
> Are there any good Ruby libs for this that preferably support RSS 0.92,
> RSS 2.0 and Atom (Atom is more of a nice-to-have than a need-to-have...) and
> expose them through a nice common (for all formats) API?
>
> I looked at RAA and Rubyforge but didn't find anything that really
> piqued my interest (although I might have missed something)
>
> /Marcus
>
>

You may be interested in feed-normalizer; something I pieced together to
wrap a few different Atom/RSS parsers. It outputs a normalized
object graph to represent a feed, regardless of the underlying feed format.

It currently wraps the Ruby RSS parser and Lucas Carlson's SimpleRSS,
but it can be easily extended to support more parsers. Patches welcome.

http://feed-normalizer.ruby...
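
To illustrate what a "normalized object graph" can look like, here is a minimal sketch. It is not feed-normalizer's actual API: `Feed`, `Entry`, `from_rss` and `from_atom` are hypothetical names, and the input hashes are assumed stand-ins for whatever the underlying parsers return.

```ruby
# Hypothetical sketch, NOT feed-normalizer's real classes.
# One format-independent shape for any feed.
Entry = Struct.new(:title, :url, :published)
Feed  = Struct.new(:title, :url, :items)

# Map an RSS-style parse result (assumed to be a plain Hash) to a Feed.
def from_rss(rss)
  items = rss[:items].map do |item|
    Entry.new(item[:title], item[:link], item[:pubDate])
  end
  Feed.new(rss[:title], rss[:link], items)
end

# Map an Atom-style parse result (also an assumed Hash) to the same Feed.
def from_atom(atom)
  items = atom[:entries].map do |entry|
    Entry.new(entry[:title], entry[:link_href], entry[:updated])
  end
  Feed.new(atom[:title], atom[:link_href], items)
end

feed = from_atom({
  title: "Example Atom feed",
  link_href: "http://example.com/",
  entries: [
    { title: "Hello", link_href: "http://example.com/1",
      updated: "2006-10-18T08:12:00Z" }
  ]
})

# Callers read the same fields whether the source was RSS or Atom.
feed.items.each { |entry| puts "#{entry.title} <#{entry.url}>" }
```

The point of the design is that code consuming `Feed` never needs to know which format or parser produced it.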

Hope that helps.

Andy

Ray Chen

10/19/2006 9:07:00 AM


I am also working on a performance app that requires feed parsing. The
two that I have tried are feedtools and syndication. First I tried
feedtools for RSS and Atom, but that was too slow, so I switched to
syndication for both RSS and Atom. I found syndication to break on a
high percentage of Atom sites, so in the end, I sent RSS to syndication
and Atom to feedtools and took the corresponding perf hit for Atom
feeds.
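
The routing described above can be sketched roughly as follows. The format sniffing is a crude regex on the root element, and the two handlers are placeholders: in the setup described they would call syndication and feedtools, whose APIs are not shown here. `feed_format`, `parse_feed` and `HANDLERS` are hypothetical names.

```ruby
# Guess the feed format from the first element name in the document.
# The XML declaration, comments and DOCTYPE start with "<?" or "<!",
# so they are skipped. This is a sketch, not a robust XML sniffer.
def feed_format(xml)
  root = xml[/<([A-Za-z_][\w:.-]*)/, 1]
  case root
  when "feed"           then :atom
  when "rss", "rdf:RDF" then :rss
  else :unknown
  end
end

# Placeholder handlers (assumptions): swap in the real library calls.
HANDLERS = {
  rss:  ->(xml) { [:fast_rss_parser, xml.length] },
  atom: ->(xml) { [:slow_atom_parser, xml.length] }
}

# Send each document to the handler registered for its format.
def parse_feed(xml, handlers = HANDLERS)
  handler = handlers.fetch(feed_format(xml)) do
    raise ArgumentError, "unrecognized feed format"
  end
  handler.call(xml)
end

parse_feed('<?xml version="1.0"?><rss version="2.0"></rss>')
parse_feed('<feed xmlns="http://www.w3.org/2005/Atom"></feed>')
```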

I find this approach to be decently robust, but not very elegant. I am
going through > 10k feeds a day of all varieties.

Can someone comment on the robustness of Ruby RSS Parser and Lucas
Carlson's SimpleRSS? I am curious about Andy's feed normalizer.

HTH,
Ray


--
Posted via http://www.ruby-....

Andy Smith

10/19/2006 11:59:00 PM


Ray Chen wrote:
> I am also working on a performance app that requires feed parsing.

As previously mentioned, feed-normalizer aims to produce a 'Feed' object
that is independent of the underlying format. This means it will use
each parser (in a user-defined order) until it gets back a successful
parse and a usable object with which to interface.

What this also means is that the *primary* goal of feed-normalizer is to
produce the aforementioned Feed object graph. This might mean hitting
three parsers before it gets that result. So performance isn't really a
consideration.

Of course, you could change the order of parsing so that feed-normalizer
uses the fastest parser first, and so on. feed-normalizer currently uses
most strict to most liberal as its default order. Right now, this just
happens to be fastest parser first, too :)
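
The "try each parser in order" idea can be sketched in a few lines. This is not feed-normalizer's code: `parse_with_fallback`, `NoParserSucceeded` and the two toy parsers are hypothetical, and the parsers only extract a title so that the example stays self-contained.

```ruby
# Raised when every parser in the chain has failed.
class NoParserSucceeded < StandardError; end

# Try each [name, parser] pair in order and return the first usable
# result. A parser signals failure by raising or by returning nil.
def parse_with_fallback(xml, parsers)
  parsers.each do |name, parser|
    begin
      result = parser.call(xml)
      return [name, result] unless result.nil?
    rescue StandardError
      next # this parser gave up; move on to the next one
    end
  end
  raise NoParserSucceeded, "tried: #{parsers.map(&:first).join(', ')}"
end

# Toy stand-ins, not real libraries. The strict one insists on a
# closing </title>; the liberal one takes whatever follows <title>.
STRICT  = ->(xml) { xml[%r{<title>(.*?)</title>}m, 1] || raise(ArgumentError, "malformed") }
LIBERAL = ->(xml) { xml[/<title>([^<]*)/, 1] }

# Most strict first, most liberal last.
DEFAULT_ORDER = [[:strict, STRICT], [:liberal, LIBERAL]]

p parse_with_fallback("<title>Good feed</title>", DEFAULT_ORDER)
p parse_with_fallback("<title>Broken feed", DEFAULT_ORDER)
```

Changing the order is just a matter of passing a differently ordered array, for example `DEFAULT_ORDER.reverse` to try the liberal parser first.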

> The two that I have tried are feedtools and syndication. First I tried
> feedtools for RSS and Atom, but that was too slow, so I switched to
> syndication for both RSS and Atom. I found syndication to break on a
> high percentage of Atom sites, so in the end, I sent RSS to syndication
> and Atom to feedtools and took the corresponding perf hit for Atom
> feeds.

In this case you could create a wrapper for feed-normalizer that
interfaces both syndication and feedtools, and tell feed-normalizer
which one to use first. I assume you'll probably encounter more RSS than
Atom.

>
> I find this approach to be decently robust, but not very elegant. I am
> going through > 10k feeds a day of all varieties.
>
> Can someone comment on the robustness of Ruby RSS Parser and Lucas
> Carlson's SimpleRSS? I am curious about Andy's feed normalizer.
>

I personally have found Ruby's RSS library to be very good at handling
RSS feeds that aren't broken :) What that means is the results should be
predictable, but the chance of a good parse may be lower.

SimpleRSS, on the other hand, is uber-liberal: if the feed even remotely
resembles an RSS or Atom document, you'll probably get a pretty good
result back, though there are sometimes small errors.

Bob Aman did an overview of both parsers, somewhere on sporkmonger.com.

Back to performance again; I did some rudimentary benchmarks[1] of both
Ruby's RSS as well as SimpleRSS. I think the results of this benchmark
really make the point for SimpleRSS being a great 'backup' parser to
have when nothing else will parse an ill-formed feed.
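
The linked benchmark is not reproduced here. As a rough illustration of how such a comparison can be set up, the sketch below times any set of parser callables on the same input. `time_parser` and `compare_parsers` are hypothetical helpers, and the two lambdas are toy stand-ins rather than the real parsers.

```ruby
# Run a callable `runs` times on the same input; return elapsed seconds.
def time_parser(parser, input, runs = 1_000)
  start = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  runs.times { parser.call(input) }
  Process.clock_gettime(Process::CLOCK_MONOTONIC) - start
end

# Time every parser in a { name => callable } hash, fastest first.
def compare_parsers(parsers, input, runs = 1_000)
  parsers.map { |name, parser| [name, time_parser(parser, input, runs)] }
         .sort_by { |_, seconds| seconds }
end

# Toy stand-ins (assumptions): replace with calls into real libraries.
toy_parsers = {
  regex_scan: ->(xml) { xml.scan(%r{<title>(.*?)</title>}m).flatten },
  split_scan: ->(xml) { xml.split("<title>").drop(1).map { |s| s.split("</title>").first } }
}

sample = "<rss><channel>" + "<item><title>Post</title></item>" * 50 + "</channel></rss>"

compare_parsers(toy_parsers, sample, 200).each do |name, seconds|
  puts format("%-12s %.4fs", name, seconds)
end
```

Timings from a loop like this vary from run to run and from machine to machine, so they are only useful for relative comparisons on the same input.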

And of course, I'm always looking for patches and new parser wrappers
for feed-normalizer.

> HTH,
> Ray
>
>

Hope that helps.

Andy

[1]
http://blog.andyis.textdriven.com/articles/2006/03/28/parsers-i...