James Gray
10/29/2007 3:19:00 AM
On Oct 28, 2007, at 10:00 PM, Nobuyoshi Nakada wrote:
> Hi,
>
> At Mon, 29 Oct 2007 06:20:48 +0900,
> James Edward Gray II wrote in [ruby-talk:276334]:
>> To solve this, we want to enhance the gateway to convert
>> multipart/alternative messages into something we can legally post to
>> Usenet. I have two thoughts on this strategy:
>>
>> 1. If possible, we should gather all text/plain portions of an email
>> and post those with a content-type of text/plain
>
> Rather I want it to be done by FML itself on ruby-lang.org.
Excellent. Are there any plans to make that happen?
I'm trying to get it in the gateway so we can stop having this
discussion. ;) But if there are plans to have the list itself do
it, that's great.
>> 2. If that fails, we can just post the original body but force the
>> content-type to text/plain for maximum compatibility
>
> I do it locally by `w3m -dump -T text/html`.
Yes, I assume we could use lynx/links to similar effect. My strategy
wasn't as clever, but I thought by swapping the content type we would
at least get the content, though it would have some noise.
>> The outstanding issue is how to handle character sets for the
>> constructed message. You'll see in the code below that I just pull
>> the charset param from the original message, but after looking at a
>> few messages, I realize that this doesn't make sense. For example,
>> here are the relevant portions of a recent post that wasn't gated
>> correctly:
>>
>> Content-Type: multipart/alternative; boundary=Apple-Mail-18-445454026
>>
>> --Apple-Mail-18-445454026
>> Content-Transfer-Encoding: 7bit
>> Content-Type: text/plain;
>> charset=US-ASCII;
>> delsp=yes;
>> format=flowed
>>
>> As you can see, the overall email doesn't have a charset but each
>> text portion can. If we are going to merge these parts, what's the
>> best strategy for handling the charset?
>
> "alternative" means each body actually has the same content,
> so, in theory, you can and should select one of them.
> Merging them all is the wrong behavior.
Now you know why I asked for help. I know so little about email
rules. Thanks for explaining this.
This is good news because it greatly simplifies the process.
Do you know if multipart content can be nested? For example, could a
single part of a multipart message itself be multipart? The design
of TMail seems to support this, but again it's easier if that's not
the case.
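
For what it's worth, MIME does permit nesting (a multipart/mixed message can carry a multipart/alternative part), so the selection probably has to recurse. Here is a minimal sketch of picking a single text/plain alternative and falling back to the original body when there is none. `Part`, `first_plain_part`, and `gateway_body` are hypothetical names for illustration only; this does not use TMail's actual API.

```ruby
# Hypothetical stand-in for a parsed MIME part (not TMail's real class).
Part = Struct.new(:content_type, :body, :parts) do
  def multipart?
    content_type.start_with?("multipart/")
  end
end

# Return the first text/plain leaf, searching nested multipart parts
# depth-first. Returns nil when no text/plain part exists.
def first_plain_part(part)
  return part if part.content_type == "text/plain"
  return nil unless part.multipart?

  (part.parts || []).each do |child|
    found = first_plain_part(child)
    return found if found
  end
  nil
end

# Body to post: the chosen text/plain alternative, or the original
# body as a fallback (to be posted with a forced text/plain type).
def gateway_body(message)
  plain = first_plain_part(message)
  plain ? plain.body : message.body
end
```

Selecting one part rather than merging also settles the charset question, since the posted message can simply reuse the charset declared on the part that was chosen.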
> I suspect you mean multipart/relative.
I wasn't even aware of that format, to be honest. I knew of
multipart/mixed (which our Usenet host will allow) and
multipart/alternative. What is the purpose of multipart/relative?
>> I thought of trying to convert them all to UTF-8 with Iconv, but I'm
>> not sure what to do if a type doesn't declare a charset or when Iconv
>> chokes on what is declared? Please share your opinions.
>
> Should be defaulted to US-ASCII.
Do you mean that US-ASCII is the charset when one is not specified?
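
If that is the rule, the conversion step could look roughly like the sketch below. It treats a missing or unrecognized charset as US-ASCII and replaces bytes that are invalid in the declared charset instead of raising. It uses Ruby's built-in String#encode rather than Iconv, and `to_utf8` and `DEFAULT_CHARSET` are hypothetical names, so take it as an assumption-laden illustration rather than the gateway's code.

```ruby
# Assumed default when a text part declares no charset.
DEFAULT_CHARSET = "US-ASCII"

# Convert a raw body to UTF-8 using the declared charset.
# Unknown charset names fall back to US-ASCII; bytes that are invalid
# or unconvertible are replaced rather than raising an error.
def to_utf8(body, charset = nil)
  encoding = begin
    Encoding.find(charset || DEFAULT_CHARSET)
  rescue ArgumentError
    Encoding::US_ASCII
  end

  body.dup.force_encoding(encoding)
      .encode(Encoding::UTF_8, invalid: :replace, undef: :replace)
end
```

A charset that Ruby recognizes but cannot transcode would still raise, so a real gateway would likely want one more rescue around the encode call.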
Thanks for all the information.
James Edward Gray II