From majordomo-users-owner@greatcircle.com Mon Mar 12 12:18:32 2007
X-Original-To: majordomo-users@greatcircle.com
X-Greylist: delayed 1273 seconds by postgrey-1.24 at mycroft; Mon, 12 Mar 2007 12:18:31 PDT
Received: from smtp-out.neti.ee (smtp-out.neti.ee [194.126.126.44])
	by mycroft.greatcircle.com (Postfix) with ESMTP id 19F872900DB
	for ; Mon, 12 Mar 2007 12:18:30 -0700 (PDT)
Received: from smtp-out.neti.ee (relay8.neti.ee [88.196.174.139])
	by HOT-Bounce1.estpak.ee (Postfix) with ESMTP id 1D21256854A
	for ; Mon, 12 Mar 2007 20:57:18 +0200 (EET)
X-Virus-Scanned: by amavisd-new-2.4.3 (20060930) (Debian) at neti.ee
Received: from Relayhost3.neti.ee (unknown [88.196.174.169])
	by MXR-8.estpak.ee (Postfix) with ESMTP id 8639CEE9CA
	for ; Mon, 12 Mar 2007 20:57:12 +0200 (EET)
Received: from [88.196.104.30] (88-196-104-30-dsl.trt.estpak.ee [88.196.104.30])
	by Relayhost3.neti.ee (Postfix) with ESMTP id 4D2A67847
	for ; Mon, 12 Mar 2007 20:57:09 +0200 (EET)
Message-ID: <45F5A287.6070802@raad.tartu.ee>
Date: Mon, 12 Mar 2007 20:57:11 +0200
From: Toomas Aas
User-Agent: Thunderbird 1.5.0.9 (X11/20070304)
MIME-Version: 1.0
To: majordomo-users@greatcircle.com
Subject: html-stripper-0.1 patch and problems with MIME multipart messages
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 8bit
X-Archive-Number: 200703/1
X-Sequence-Number: 5615

Hello!

I seem to have a problem with the html-stripper-0.1 patch and some MIME
multipart messages.

Let's say the original message looks something like this:

==========================================================
From: sender
Date: Mon, 12 Mar 2007 19:49:30 +0200
To: test-l@mydomain.com
Subject: =?utf-8?q?p=C3=A4iste=20test?=
MIME-Version: 1.0
Content-Type: multipart/alternative; boundary="--------=BoundaryPyDog1173721770.89------"
Message-Id: <20070312174934.4E0517B43C@mh3-4.hot.ee>

----------=BoundaryPyDog1173721770.89------
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable

Mis paistab kirja p=C3=A4istes? (katse nr 1)

----------=BoundaryPyDog1173721770.89------
Content-Type: text/html; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable

Mis paistab kirja p=C3=A4istes? (katse nr 1)

----------=BoundaryPyDog1173721770.89--------
==========================================================

So it's a multipart/alternative message whose text/plain part specifies
charset UTF-8. After passing through Majordomo with html-stripper enabled
and html_policy set to 'strip', the message becomes something like this:

==========================================================
From:
Date: Mon, 12 Mar 2007 19:49:30 +0200
To: test-l@mydomain.com
Subject: =?utf-8?q?p=C3=A4iste=20test?=
MIME-Version: 1.0
Content-Type: text/plain
Message-Id: <20070312175506.88428207DB@mh3-5.hot.ee>
Sender: owner-test-l@post.raad.tartu.ee
Precedence: bulk

Mis paistab kirja päistes? (katse nr 1)
==========================================================

As you can see, the Content-Type in the message headers has changed from
multipart/alternative to text/plain, but the charset info has been
completely lost. As a result, the message body contains raw 8-bit
characters with no declared charset, so the MUA does not display them
correctly. If I set the list's html_policy to 'pass', the message passes
through with all its parts intact and is displayed correctly to users.
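
For illustration, here is a minimal sketch of the behaviour one would expect when a multipart/alternative message is reduced to its text/plain part: the part's own Content-Type (including the charset parameter) and Content-Transfer-Encoding are carried up into the top-level headers, and the body is left in its encoded form. This is not the html-stripper code (Majordomo and the patch are Perl); it uses Python's standard email package only to show the idea. The function name strip_to_plain, the shortened boundary, the sender address, and the HTML markup in the sample are assumptions made up for this example.

```python
from email import message_from_string

# Sample modelled on the message above. The boundary, the sender address
# and the HTML markup are placeholders invented for this sketch.
RAW = """\
From: sender@example.com
To: test-l@mydomain.com
Subject: =?utf-8?q?p=C3=A4iste=20test?=
MIME-Version: 1.0
Content-Type: multipart/alternative; boundary="BOUNDARY"

--BOUNDARY
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable

Mis paistab kirja p=C3=A4istes? (katse nr 1)
--BOUNDARY
Content-Type: text/html; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable

<html><body>Mis paistab kirja p=C3=A4istes? (katse nr 1)</body></html>
--BOUNDARY--
"""


def strip_to_plain(msg):
    """Collapse a multipart message to its first text/plain part.

    The part's Content-Type and Content-Transfer-Encoding headers replace
    the top-level ones, so the charset information is not lost.
    """
    if not msg.is_multipart():
        return msg

    # Find the first text/plain part; leave the message alone if none exists.
    for part in msg.walk():
        if part.get_content_type() == "text/plain":
            break
    else:
        return msg

    # Replace the top-level MIME headers with the ones from the chosen part.
    for name in ("Content-Type", "Content-Transfer-Encoding"):
        del msg[name]
        if part[name] is not None:
            msg[name] = part[name]

    # Keep the payload exactly as transmitted (still quoted-printable).
    msg.set_payload(part.get_payload())
    return msg


if __name__ == "__main__":
    print(strip_to_plain(message_from_string(RAW)).as_string())
```

With the headers carried over like this, the stripped message still declares charset="UTF-8" and quoted-printable encoding, so a mail client has what it needs to decode the body. Whether the patch can be changed to behave this way is a separate question that the sketch does not answer.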

I tried to look at the code of the html-stripper-0.1 patch, but I'm not a
programmer, so the only thing I got was a headache ;) Maybe someone has a
fix?

Thanks in advance,
--
Toomas Aas