Channel: ImapX 2

New Post: SUBJECT, FROM, TO, CC, BCC and ATTACHMENT NAME fields are NOT decoded CORRECTLY

Hi PsyAfter,

The classic MIME specifications require message headers to be US-ASCII; nothing else is allowed. To work within that restriction, RFC 2047 defines rules for encoding non-ASCII text so that it can be used within the headers of a message. If you look at the raw source of many of your international emails, you'll probably see things like this in your headers:
=?iso-8859-8-i?b?<base64 blob>?=
It appears that the message you are having issues with does not follow the specifications for this. The reason for disallowing arbitrary 8-bit text in headers is that there's no reliable way for the client (library, in this case) to figure out what the character encoding is.
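To make the encoded-word format concrete, here is a minimal sketch of decoding such a header value. It is written in Python purely for illustration and uses the standard library's `email.header.decode_header`; the helper name `decode_rfc2047` and the choice to fall back to ISO-8859-1 for unknown charset labels are my own assumptions, not anything ImapX or MimeKit does.

```python
from email.header import decode_header


def decode_rfc2047(value: str) -> str:
    """Decode a header value that may contain RFC 2047 encoded-words.

    Hypothetical helper for illustration only.
    """
    parts = []
    for chunk, charset in decode_header(value):
        if isinstance(chunk, bytes):
            try:
                # Unencoded runs come back with charset None; treat them as ASCII.
                parts.append(chunk.decode(charset or "ascii", errors="replace"))
            except LookupError:
                # Charset label unknown to Python: assumed fallback to ISO-8859-1,
                # which can decode any byte sequence.
                parts.append(chunk.decode("iso-8859-1"))
        else:
            # No encoded-words at all: decode_header returns the original str.
            parts.append(chunk)
    return "".join(parts)
```

For instance, `decode_rfc2047("=?iso-8859-1?q?Caf=E9?=")` yields `Café`. A header that carries raw 8-bit bytes instead of encoded-words gives a decoder like this nothing to work with, which is the situation described above.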

A library I've written, MimeKit, deals with this kind of situation by first checking whether the 8-bit text in the headers is valid UTF-8 and, if so, converting it into a C# string using System.Text.Encoding.UTF8. If it is not valid UTF-8, it falls back to a user-supplied charset (ParserOptions.CharsetEncoding). If the headers do not fit the user-supplied charset either, it falls back to ISO-8859-1. Later, if the user so desires, he/she can locate the Header in the MimeMessage.Headers list and try to decode the header using a different System.Text.Encoding.
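The fallback order described above can be sketched roughly as follows. This is not MimeKit's code (MimeKit is a C# library); it is a small Python illustration of the same idea, and the function name `decode_raw_header` and its `user_charset` parameter are hypothetical stand-ins for the user-supplied charset option.

```python
def decode_raw_header(raw: bytes, user_charset: str = "windows-1252") -> str:
    """Decode unencoded 8-bit header bytes using a three-step fallback.

    Hypothetical helper: UTF-8 first, then the caller's charset, then ISO-8859-1.
    """
    # Step 1: strict UTF-8. Text in a legacy 8-bit charset rarely happens to be
    # valid UTF-8, so a successful decode is a strong hint.
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        pass

    # Step 2: the charset the caller expects for their mail (an assumption
    # supplied by the user, not something detected from the message).
    try:
        return raw.decode(user_charset)
    except (UnicodeDecodeError, LookupError):
        pass

    # Step 3: ISO-8859-1 maps every byte to a character, so this never fails,
    # although the result may not be what the sender intended.
    return raw.decode("iso-8859-1")
```

A caller would want to keep the raw bytes around as well, so that the header can be decoded again later with a different charset if the first guess turns out to be wrong.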

ImapX could probably use a similar approach if it doesn't already have a charset fallback option (I haven't looked at the code in ImapX in a while and don't recall if it already has such an option).

Hopefully my explanation is useful to both you and Pavel. If either of you has any questions, feel free to poke me and I will hopefully be able to answer them. My email address is listed on my GitHub page (I think Pavel already knows my email address, as we've emailed back and forth a few times already).

-- Jeff



Note: A relatively new addition to the specifications makes it possible to send non-ASCII text in headers, but only if it is in UTF-8. As far as I'm aware, however, very few servers support this yet, so even if the headers are in UTF-8, it is unlikely (though possible) that the client constructed them validly under that extension.
