invalid byte sequence for encoding "UTF8" from header
Hi all,

I'm getting
PGSQL: query failed: ERROR: invalid byte sequence for encoding "UTF8": 0xfc
while trying to insert a $h_subject.
Main config contains
headers_charset = utf-8
and exim has been built with
HAVE_ICONV=yes
ldd shows
libiconv.so.3 => /usr/local/lib/libiconv.so.3 (0x2817f000)
Is this a bug in headers_charset?
May illegal utf-8 in a header be forwarded to $h?

Axel
---
PGP-Key:29E99DD6 ☀ +49 151 2300 9283 ☀ computing @ chaos claudius


--
## List details at https://lists.exim.org/mailman/listinfo/exim-dev Exim details at http://www.exim.org/ ##
Re: invalid byte sequence for encoding "UTF8" from header
On 2012-01-29 at 16:58 +0100, Axel Rau wrote:
> I'm getting
> PGSQL: query failed: ERROR: invalid byte sequence for encoding "UTF8": 0xfc
> while trying to insert a $h_subject.
> Main config contains
> headers_charset = utf-8
> and exim has been built with
> HAVE_ICONV=yes
> ldd shows
> libiconv.so.3 => /usr/local/lib/libiconv.so.3 (0x2817f000)
> Is this a bug in headers_charset?
> May illegal utf-8 in a header be forwarded to $h?

Yes. In practice, headers can contain arbitrary binary data; they're
supposed to be constrained to ASCII, with MIME used for encoding other
data, but that can't be relied upon.

The "headers_charset" option only affects MIME decoding of RFC 2047
constructs; if the construct is =?KOI8-RU?Q?...?= then that "..." will
be decoded to KOI8-RU bytes, then translated to headers_charset if
possible. If there is a translation error (for example, a conversion
that iconv does not support) then the data is included verbatim.
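
To make the behaviour described above concrete, here is a minimal
Python sketch of that kind of decoding. It is not Exim's code: the
function name decode_encoded_words is hypothetical, Python's codecs
stand in for iconv, and returning the decoded but untranslated bytes on
a conversion failure is an assumption about what "verbatim" means here.

```python
import base64
import binascii
import re

# Matches one RFC 2047 encoded-word: =?charset?encoding?payload?=
ENCODED_WORD = re.compile(rb"=\?([^?\s]+)\?([QqBb])\?([^?\s]*)\?=")


def decode_encoded_words(raw: bytes, target: str = "utf-8") -> bytes:
    """Decode encoded-words in a raw header and translate them to `target`.

    Bytes outside encoded-words are never inspected or converted.
    """

    def replace(match):
        charset = match.group(1).decode("ascii", "replace")
        try:
            if match.group(2).upper() == b"B":
                data = base64.b64decode(match.group(3), validate=True)
            else:
                # header=True makes "_" decode to a space, as in Q encoding
                data = binascii.a2b_qp(match.group(3), header=True)
        except binascii.Error:
            return match.group(0)  # malformed payload: leave the word as-is
        try:
            return data.decode(charset).encode(target)
        except (LookupError, UnicodeError):
            # Assumed fallback: unknown charset or bad bytes, keep raw data
            return data

    return ENCODED_WORD.sub(replace, raw)
```

With this sketch, b"Subject: =?ISO-8859-1?Q?Gr=FC=DFe?=" comes out as
valid UTF-8, while b"Subject: Gr\xfc\xdfe" (a raw 0xfc with no MIME
encoding) passes through unchanged, which is the kind of input that
would produce the PostgreSQL error quoted above.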

There's no support for coercing all raw binary data encountered into the
charset; MIME is assumed to be used for non-ASCII.

Proposals for a better system of handling errors are appreciated, as are
proposals for how to efficiently deal with systems that insert raw
binary data into headers.
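
As one illustration of what such a policy could look like outside
Exim (for example, in a script that sits between the MTA and the
database), the Python sketch below accepts bytes that are already valid
UTF-8 and otherwise reinterprets them in a fallback charset. The name
to_valid_utf8 and the choice of ISO-8859-1 as the fallback are
assumptions for the example, not an existing Exim feature.

```python
def to_valid_utf8(raw: bytes, fallback: str = "iso-8859-1") -> str:
    """Return text that can always be encoded as valid UTF-8.

    Valid UTF-8 input is kept as-is; anything else is reinterpreted in
    the fallback charset. ISO-8859-1 maps every byte to a character, so
    the fallback decode cannot fail, though it may guess wrongly.
    """
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        return raw.decode(fallback)


# The byte 0xfc from the error message is not valid UTF-8 on its own,
# but it is "ü" in ISO-8859-1.
subject = to_valid_utf8(b"Gr\xfc\xdfe")
safe_bytes = subject.encode("utf-8")  # now safe for a UTF8 database
```

The trade-off is that a wrong fallback guess yields mojibake rather
than an error, so whether this is acceptable depends on how the stored
value is used.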
--
https://twitter.com/syscomet