Aug 1, 2012, 12:43 AM
Post #5 of 5
If I create an HTML file with the name æøå.html, and access it through Apache, the access log says "GET /%C3%A6%C3%B8%C3%A5.html". It seems to URL-encode it or something. If I then delete the HTML file and try to access it, the error log says "File does not exist: /var/www/\xc3\xa6\xc3\xb8\xc3\xa5.html". Not at all readable either... Maybe it's Apache's fault.
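A minimal sketch, assuming Python 3 is at hand, for checking what those two log entries actually contain. The helper names are made up for illustration. Both entries appear to carry the same UTF-8 bytes for æøå, just escaped in two different ways, which would suggest the file name itself is intact and only the log representation is escaped.

```python
from urllib.parse import unquote

# Path as it appears in the Apache access log (percent-encoded UTF-8 bytes).
access_log_path = "/%C3%A6%C3%B8%C3%A5.html"

# Path as it appears in the Apache error log (literal backslash-x escapes).
error_log_path = r"/var/www/\xc3\xa6\xc3\xb8\xc3\xa5.html"


def decode_access_log(path: str) -> str:
    """Undo percent-encoding, interpreting the bytes as UTF-8."""
    return unquote(path, encoding="utf-8")


def decode_error_log(path: str) -> str:
    """Turn literal \\xNN escapes back into bytes, then decode as UTF-8."""
    # unicode_escape maps each \xNN to the code point U+00NN; re-encoding
    # as latin-1 recovers the original byte values one to one.
    raw = path.encode("latin-1").decode("unicode_escape").encode("latin-1")
    return raw.decode("utf-8")


if __name__ == "__main__":
    print(decode_access_log(access_log_path))
    print(decode_error_log(error_log_path))
```

If both calls print a path ending in æøå.html, the bytes in the logs are valid UTF-8 and the unreadable part is only the escaping.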
I don't know how I can capture the raw e-mails that RT sends out for dashboard subscriptions. They are sent by RT through Postfix on the local server. Please tell me if you know how.
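Once a raw message is available, its Subject: header can be compared against the encoded form described in the quoted reply below. This is only a sketch using Python's standard `email.header` module. The function names are hypothetical, and the Norwegian subject text is my assumption of what the dashboard subject is meant to say.

```python
from email.header import Header, decode_header, make_header

# Encoded example header value taken from the quoted reply in this thread.
raw_subject = "=?UTF-8?B?4pyIVEhSRUUgQ29vbCBEZWFscyBGcm9tIEFtZXJpY2FuIEFpcmxpbmVz?="

# Assumed intended subject of the dashboard mail (illustrative only).
intended_subject = "Alle nye og åpne saker jeg eier"


def decode_subject(raw: str) -> str:
    """Decode an RFC 2047 encoded-word header value into readable text."""
    return str(make_header(decode_header(raw)))


def encode_subject(text: str) -> str:
    """Encode non-ASCII header text as RFC 2047 encoded-words in UTF-8."""
    return Header(text, charset="utf-8").encode()


if __name__ == "__main__":
    print(decode_subject(raw_subject))
    print(encode_subject(intended_subject))
```

A raw Subject: line that contains a bare "?" or raw 8-bit bytes instead of an `=?...?=` encoded-word would point to the header not being encoded before the mail was handed to Postfix.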
> Date: Tue, 31 Jul 2012 12:09:57 -0400
> From: firstname.lastname@example.org
> To: email@example.com
> Subject: Re: [rt-users] Charset for logs
> On Mon, Jul 23, 2012 at 09:36:27AM +0200, Ole Jon Bjørkum wrote:
> > RT is installed from the Ubuntu repository, and the installation seems to log to
> > /var/log/syslog and /var/log/apache2/error.log. However, I just discovered that it is only the
> > Apache log that has charset problems. The syslog shows all characters correctly. Also, the
> > Apache log logs in GMT while the syslog logs in the correct timezone, but I guess that is how
> > it's supposed to be.
> RT prints logs in GMT. When those pass through syslog, syslog will add
> an additional timestamp. Apache, however, keeps the RT timestamps.
> Is it just RT's messages in the apache logs that are corrupt, or is
> something as simple as a request to /Test/latin1pagename.html
> corrupted in the access/error log? RT should be pushing out UTF-8 but
> I'm not sure if RT is doing something wrong or if apache is corrupting
> > I'm not quite sure what you mean by raw subject line.
> > This is what shows up in Outlook's internet headers: Alle nye og ?pne saker
> > This is how the subject line looks in Outlook: Alle nye og **pne saker jeg eier
> > The question mark should be the character "å", so the word should be "åpne".
> > The message body uses the correct charset (I can see that UTF-8 is specified in the HTML).
> I mean the raw on-disk header. Subject: lines are encoded if they
> contain UTF-8, so something like this:
> Subject: =?UTF-8?B?4pyIVEhSRUUgQ29vbCBEZWFscyBGcm9tIEFtZXJpY2FuIEFpcmxpbmVz?=
> If you have an email that is consistently corrupted when passing
> through RT, if you can capture a raw version of the email (so not the
> .msg file from Outlook, but something caught further upstream, before
> it gets to rt-mailgate preferably) please zip it up and send it into
> the RT bug tracker, along with your System Configuration page which
> contains a ton of information such as perl module versions, some of
> which are known-bad.
> > > Date: Fri, 20 Jul 2012 08:17:45 -0700
> > > From: firstname.lastname@example.org
> > > To: email@example.com
> > > Subject: Re: [rt-users] Charset for logs
> > >
> > > On Fri, Jul 20, 2012 at 09:24:22AM +0200, Ole Jon Bjørkum wrote:
> > > > Ever since we started to use RT (before 3.8.7, now 4.0.4), it doesn't seem to use the
> > > > correct charset for logging. All Norwegian characters (æøå) become: **. I can see this
> > > > because we have scrips that contain Norwegian characters, and every time a scrip is
> > > > launched, it is logged to the Apache log.
> > >
> > > How are you logging, Syslog, Screen, File? RT has several different ways
> > > to log and it's impossible to test without knowing.
> > >
> > > > Today I also noticed that if I subscribe to a dashboard with
> > > > Norwegian characters in its name, the subject of the email sent out also has this problem
> > > > (** instead of æ, ø or å). The email body, however, has the correct charset. There are no
> > > > charset problems in the web UI. How can this be fixed?
> > >
> > > Please provide a raw Subject: line so we can see what's going on.