Garold L Johnson
9/29/2007 8:15:00 AM
Booker Bense wrote:
> I'm working on rewriting my email filtering hacks and I need a
> library that can parse unix mbox file format where the embedded
> '\nFrom ' in a message are not quoted. (i.e. one that knows about the
> Header Content-Length ).
>
> I've been using the rmail library in
>
> rubymail-0.17
>
> plus my own hack to parse the mbox file. ( rubymail's parser
> crashes when given such a file. ) I think I have a fix for
> this, but I thought I would check if there are any more recent
> projects that can do this.
>
> I poked around RubyForge and found nothing useful. I am aware of
> TMail, but I'm not sure it solves this problem either. Basically,
> I need a Ruby version of formail.
>
> If there is a "standard" email handling package, I'd take a look
> and see about making it do what I want. At this point all the
> email handling packages seem like abandonware...
>
> _ Booker C. Bense
>
>
Did you ever find a solution? I am trying to parse Thunderbird mail
files, which are supposed to be in standard mbox format, but RubyMail
crashes on them routinely.
Since the parser doesn't track line numbers, it is often difficult to
locate the input that triggers the problem.
Not all messages have a Content-Length header, and I haven't worked out
the pattern for which ones do.
I don't want to start from scratch, as this format is clearly
non-trivial. I have looked at some Perl modules, and most of them appear
to have the same problem of not handling unquoted "From " lines.
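
For reference, here is a minimal Ruby sketch of the kind of splitter
being discussed. It is not RubyMail's or TMail's code, and it has not
been run against real Thunderbird files. The names split_mbox and
MboxMessage are made up for illustration. The assumptions: a
Content-Length header, when present, counts body bytes; it is trusted
only if it lands on a blank line, a "From " separator, or end of file;
otherwise the code falls back to scanning for the next "From " at the
start of a line. Each message also records the line number of its
separator, so a failure can be located in the file.

```ruby
# Hypothetical helper types and names; a sketch, not a library API.
MboxMessage = Struct.new(:line, :from_line, :headers, :body)

def split_mbox(data)
  data = data.b                # binary copy, so offsets are byte offsets
  messages = []
  pos = 0
  line_no = 1
  loop do
    # Skip blank lines between messages, keeping the line count in step.
    while data.getbyte(pos) == 10
      pos += 1
      line_no += 1
    end
    break if pos >= data.bytesize

    unless data[pos, 5] == "From "
      raise ArgumentError, "line #{line_no}: expected 'From ' separator"
    end

    eol = data.index("\n", pos) || data.bytesize
    from_line = data[pos...eol]
    header_start = eol + 1
    header_end = data.index("\n\n", header_start) || data.bytesize
    headers = data[header_start...header_end].to_s
    body_start = [header_end + 2, data.bytesize].min

    # Assumption: Content-Length is the body size in bytes.
    length = headers[/^Content-Length:[ \t]*(\d+)/i, 1]&.to_i
    candidate = length && body_start + length
    if candidate && candidate <= data.bytesize &&
       data.match?(/\G\n*(?:From |\z)/, candidate)
      body_end = candidate     # the header's claim checks out
    else
      # Fallback: next "From " at the start of a line (may split a
      # message whose body contains an unquoted "From " line).
      nxt = data.index("\nFrom ", body_start - 1)
      body_end = nxt ? nxt + 1 : data.bytesize
    end

    messages << MboxMessage.new(line_no, from_line, headers,
                                data[body_start...body_end].to_s)
    line_no += data[pos...body_end].count("\n")
    pos = body_end
  end
  messages
end
```

Something like split_mbox(File.binread(path)) would then return one
struct per message. Messages without a usable Content-Length still go
through the fallback, so an unquoted "From " line in such a body would
still be split incorrectly.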
--
Thanks, Garold (Gary) L. Johnson