Great Circle Associates List-Managers
(June 2003)
 


Subject: Re: standards for iso encoding subject lines?
From: Nick Simicich <njs @ scifi . squawk . com>
Date: Fri, 06 Jun 2003 05:12:11 -0400
To: List Managers <list-managers @ greatcircle . com>
In-reply-to: <Pine.GSO.4.50.0306050906370.27747-100000@tcp.com>
References: <3EDF6194.AC6D4B12@mnjazz.com>

At 09:07 AM 2003-06-05 -0700, James Lick wrote:

>On Thu, 5 Jun 2003, Al Iverson wrote:
> > Subject: automatische =?iso-8859-1?Q?R=FCckantwort_?=
>
>RFCs 2047 and 2231.

Just as a point:  This is a really poorly thought out RFC.  You might want 
to decode those in your MTA or mailing list manager before forwarding them 
to your subscribers.  You *can't* safely do so.  I have considered doing 
this in demime, several times.  The best scheme I can come up with is to 
convert everything that is not a letter, number, or known punctuation mark 
to a space.
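
A rough sketch of that scheme in Python, for illustration only.  This is 
not demime code: the function name sanitize_subject and the particular 
punctuation whitelist are assumptions made up for this example, and it 
leans on the standard library's email.header.decode_header to do the 
RFC 2047 decoding.

```python
import re
from email.header import decode_header

# Assumed whitelist: letters and digits (in any script), the space, and a
# handful of punctuation marks.  Everything else, including CR and LF,
# is replaced by a space.
_UNSAFE = re.compile(r"[^\w .,:;!?()'\-]")

def sanitize_subject(raw_value):
    """Decode RFC 2047 encoded-words, then blank out anything not whitelisted."""
    pieces = []
    for data, charset in decode_header(raw_value):
        if isinstance(data, bytes):
            try:
                data = data.decode(charset or "ascii", errors="replace")
            except LookupError:
                # Unknown charset label: fall back to ASCII with replacement.
                data = data.decode("ascii", errors="replace")
        pieces.append(data)
    cleaned = _UNSAFE.sub(" ", "".join(pieces))
    # Collapse the runs of spaces left behind by the substitution.
    return " ".join(cleaned.split())

print(sanitize_subject("automatische =?iso-8859-1?Q?R=FCckantwort_?="))
```

Any line ends hidden inside the encoding come out as spaces, so the 
result can never grow extra header lines.  The cost is that whatever is 
not on the whitelist is lost.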

For example:  It is completely legitimate for someone to encode a sequence 
of multiple line end characters in the header, and to have that sequence 
shielded by the encoding.  They could do this to force compliant MUAs to 
"set the subject off" for emphasis, by encoding

Subject: line-end line-end Actual Subject Line-end Line-end

This might mean that the subject of the mail would be displayed with blank 
lines before and after.

If this were decoded in transit, of course, the subject would become 
blank, since the first blank line would now delimit the header, and the 
"actual subject" would move into the body.  They could do nefarious things 
as well -- since the decoding could also create new headers, possibly after 
you have elided some headers.  This is a real bag of worms.

The subject might or might not be displayed where it landed in the 
body... or the content headers might become part of the body, so that the 
MIME decoding of the entire letter is hosed, depending on whether the 
subject line comes before or after the critical headers.
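
To make the failure concrete, here is a small Python sketch.  The message 
text, the addresses, and the function naive_decode_in_transit are all 
hypothetical; the function stands in for a careless list manager that 
swaps the encoded subject for its decoded text, and it assumes the 
Subject: header is the first line of the message.

```python
from email import message_from_string
from email.header import decode_header

# Hypothetical message: the encoded-word hides CR LF pairs around the text.
raw_message = (
    "Subject: =?iso-8859-1?Q?=0D=0A=0D=0AActual_Subject=0D=0A=0D=0A?=\r\n"
    "From: subscriber@example.org\r\n"
    "Content-Type: text/plain\r\n"
    "\r\n"
    "Body text.\r\n"
)

def naive_decode_in_transit(message_text):
    """Unsafe on purpose: paste the decoded subject back into the header."""
    first_line, sep, rest = message_text.partition("\r\n")
    name, _, value = first_line.partition(": ")
    decoded = "".join(
        data.decode(charset or "ascii") if isinstance(data, bytes) else data
        for data, charset in decode_header(value)
    )
    return name + ": " + decoded + sep + rest

before = message_from_string(raw_message)
after = message_from_string(naive_decode_in_transit(raw_message))

print(repr(before["From"]))   # the From: header is still a header
print(repr(after["From"]))    # no From: header is left after decoding
print(after.get_payload())    # the old headers now sit in the body
```

After the rewrite the parser stops at the first blank line, so From: and 
Content-Type: are no longer headers at all; they show up as body text 
along with the "actual subject".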

As a worst case:  Suppose you only allow subscribers to post.  If you do 
the check before this decoding, someone could put their encoded subject 
ahead of the From: header and then chop the From: header off, pushing it 
into the usually-not-displayed no-man's-land following the header, and 
causing a new From: header, not the one you checked, to appear in the 
headers...

Or they could even convert a text/plain to a text/html.
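
Both tricks can be shown in one Python sketch.  Everything here is made up 
for illustration: the addresses, the message, and naive_decode_subject, 
which plays the part of a list manager that checks From: first and 
decodes the subject afterwards.  It assumes the Subject: header is the 
first line of the message.

```python
from email import message_from_string
from email.header import decode_header

# Hypothetical posting: the subject hides a From:, a Content-Type:, and a
# blank line, all shielded by the encoding.
raw_message = (
    "Subject: =?iso-8859-1?Q?Hi=0D=0AFrom:_someone-else@example.net"
    "=0D=0AContent-Type:_text/html=0D=0A=0D=0A?=\r\n"
    "From: subscriber@example.org\r\n"
    "Content-Type: text/plain\r\n"
    "\r\n"
    "<b>Body</b>\r\n"
)

def naive_decode_subject(message_text):
    """Unsafe on purpose: replace the encoded subject with its decoded text."""
    first_line, sep, rest = message_text.partition("\r\n")
    name, _, value = first_line.partition(": ")
    decoded = "".join(
        data.decode(charset or "ascii") if isinstance(data, bytes) else data
        for data, charset in decode_header(value)
    )
    return name + ": " + decoded + sep + rest

checked = message_from_string(raw_message)
delivered = message_from_string(naive_decode_subject(raw_message))

# What the subscriber check saw, versus what the subscribers receive.
print(checked["From"], checked.get_content_type())
print(delivered["From"], delivered.get_content_type())
```

The check passes on subscriber@example.org, but the message that goes out 
carries the smuggled From: and is labelled text/html; the headers that 
were checked end up in the body.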

The standard itself says that it is impossible for anything other than the 
end user's MUA to safely decode these headers, and then it can only decode 
them for the purpose of display; the original form should be retained and 
used for header parsing.
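
In Python terms, that split between parsing and display might look like 
the sketch below.  The message and the helper display_form are 
hypothetical; the point is only that the raw header value is what gets 
parsed and forwarded, while the decoded text is a throwaway copy for the 
screen.

```python
from email import message_from_string
from email.header import decode_header

# Hypothetical incoming message, using the subject from this thread.
raw_message = (
    "From: subscriber@example.org\r\n"
    "Subject: automatische =?iso-8859-1?Q?R=FCckantwort_?=\r\n"
    "\r\n"
    "Body text.\r\n"
)

def display_form(raw_value):
    """Decode encoded-words for display only; control characters become spaces."""
    text = "".join(
        data.decode(charset or "ascii", errors="replace")
        if isinstance(data, bytes) else data
        for data, charset in decode_header(raw_value)
    )
    return "".join(ch if ch.isprintable() else " " for ch in text)

msg = message_from_string(raw_message)
raw_subject = msg["Subject"]        # parse and forward this form
shown = display_form(raw_subject)   # show this form, never write it back

forwarded = msg.as_string()         # still carries the encoded subject
print(shown)
```

Nothing decoded is ever written back into the message, so hidden line 
ends cannot turn into real ones on the way through.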

Like I said, this is a real mess.  No one thought about retaining a plain 
text section, since it seems that the attitude of many of the people who 
write the MIME standards is that people should be forced to upgrade to the 
latest and greatest bit of mail display software, or get left in the 
dust.  Or that maybe these augmented headers should be hidden somewhere.

--
"Forgive him, for he believes that the customs of his tribe are the laws of 
nature!"
  -- George Bernard Shaw (1856-1950)
Nick Simicich - njs@scifi.squawk.com 

