[perl #38456] The :crlf PerlIO layer doesn't like the :encoding layer.
# New Ticket Created by ciaran@tnauk.org.uk
# Please include the string: [perl #38456]
# in the subject line of all future correspondence about this issue.
# <URL: https://rt.perl.org/rt3/Ticket/Display.html?id=38456 >


This is a bug report for perl from ciaran@tnauk.org.uk,
generated with the help of perlbug 1.35 running under perl v5.8.7.


-----------------------------------------------------------------
[Please enter your report here]

(note: This report was compiled with perlbug, then sent manually with
Thunderbird as perlbug had trouble. However, I've made sure that any
long lines that matter are NOT wrapped.)

There seems to be a problem in the way that the ":crlf" PerlIO layer is
implemented. I'm not entirely sure where the problem lies in :crlf, but
it doesn't play nicely with ":encoding". I have observed this behaviour in:

* Perl v5.8.7, Cygwin (prepackaged)
* Perl v5.8.7, Gentoo Linux (dev-lang/perl-5.8.7-r3)
* Perl v5.9.1, Cygwin (compiled)

Here is a test case:

===CUT HERE===
#!/usr/bin/perl -w

open(FILE, ">:encoding(UTF-16):crlf", "test-file");
print FILE "Test \x{A3}45!\n";
print FILE "Test!\n";
close(FILE);
===CUT HERE===

This should generate a valid UTF-16 file containing
"Test £45!\r\nTest!\r\n", where \r and \n represent \015 and \012
respectively, and £ represents the character U+00A3, the British pound
symbol.
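
For reference, the expected byte sequence can be computed without going
through any PerlIO layer. This is only a minimal sketch: it assumes the
Encode module's "UTF-16" encoding (which prepends a big-endian byte order
mark when encoding), and the variable names are illustrative, not part of
the test case above.

```perl
use strict;
use warnings;
use Encode qw(encode);

# The text the test case is expected to produce, with each "\n"
# already expanded to "\r\n" by hand.
my $text = "Test \x{A3}45!\r\nTest!\r\n";

# Encode's "UTF-16" emits a byte order mark followed by the code units.
my $expected = encode("UTF-16", $text);

# Dump the result as 16-bit big-endian code units, one group per unit.
my $hex = join " ", map { sprintf "%04x", $_ } unpack "n*", $expected;
print "$hex\n";
```

Comparing this dump against a hex dump of test-file shows where the two
diverge.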

On a standard version of Perl, what actually happens is that I get a
message:

> Malformed UTF-8 character (unexpected continuation byte 0xa3, with no
> preceding start byte) in null operation at ./utf16-test.pl line 6.

and the pound sign is replaced with a null character. Using a literal
(not interpolated) UTF-8 character sequence for U+00A3 (i.e. C2 A3) in
the file seems to work fine. As noted above, while I am submitting this
report from Perl v5.8.7, I have compiled Perl v5.9.1 on Cygwin and
observed the same behaviour. The output of "perl5.9.1 -V" is included at
the end of the user-modifiable section.

One workaround seems to be to install PerlIO::eol from CPAN and replace
the ":crlf" layer with ":eol(CRLF)". This works correctly, and test-file
contains what it should.
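
A second possible workaround, which does not need a CPAN module, is to
leave out the line-ending layer altogether and expand "\n" by hand before
printing. The following is a sketch under that assumption only: the
helper name write_utf16_crlf is hypothetical, and the code writes to a
temporary file rather than test-file.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical helper (not part of the original report): write lines as
# UTF-16 with CRLF line endings without using the :crlf layer.
sub write_utf16_crlf {
    my ($path, @lines) = @_;

    # :raw removes any default line-ending translation, so the only
    # transformation left on the handle is the UTF-16 encoding.
    open(my $fh, ">:raw:encoding(UTF-16)", $path)
        or die "Cannot open $path: $!";
    for my $line (@lines) {
        (my $out = $line) =~ s/\n/\r\n/g;   # expand LF to CRLF on a copy
        print {$fh} $out;
    }
    close($fh) or die "Cannot close $path: $!";
}

my (undef, $path) = tempfile(UNLINK => 1);
write_utf16_crlf($path, "Test \x{A3}45!\n", "Test!\n");

# Read the file back as raw bytes to check what was actually written.
open(my $in, "<:raw", $path) or die "Cannot open $path: $!";
my $bytes = do { local $/; <$in> };
close($in);
printf "%d bytes written\n", length $bytes;
```

Because the "\r" is inserted before the text reaches the encoding layer,
it is encoded as a full UTF-16 code unit like every other character.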

I have not mentioned ActiveState Perl in this report as I assume it is
not your responsibility. (For the record, ActiveState Perl v5.8.7 seems
to treat the layers as if they were reversed, so the inserted CR doesn't
get encoded, and thus corrupts the UTF-16 file by inserting a
single-byte character instead of a double-byte character.)

Thank you. Following is the output of "perl5.9.1 -V":

===CUT HERE===
ciaran@IT-16:~/utf8$ perl5.9.1 -V
Summary of my perl5 (revision 5 version 9 subversion 1) configuration:
Platform:
osname=cygwin, osvers=1.5.19(0.15042), archname=cygwin
uname='cygwin_nt-5.1 it-16 1.5.19(0.15042) 2006-01-20 13:28 i686
cygwin '
config_args='-Dmksymlinks'
hint=recommended, useposix=true, d_sigaction=define
usethreads=undef useithreads=undef usemultiplicity=undef
useperlio=define d_sfio=undef uselargefiles=define usesocks=undef
use64bitint=undef use64bitall=undef uselongdouble=undef
usemymalloc=y, bincompat5005=undef
Compiler:
cc='gcc', ccflags ='-DPERL_USE_SAFE_PUTENV -fno-strict-aliasing',
optimize='-O2',
cppflags='-DPERL_USE_SAFE_PUTENV -fno-strict-aliasing'
ccversion='', gccversion='3.4.4 (cygming special) (gdc 0.12, using
dmd 0.125)', gccosandvers=''
intsize=4, longsize=4, ptrsize=4, doublesize=8, byteorder=1234
d_longlong=define, longlongsize=8, d_longdbl=define, longdblsize=12
ivtype='long', ivsize=4, nvtype='double', nvsize=8, Off_t='off_t',
lseeksize=8
alignbytes=8, prototype=define
Linker and Libraries:
ld='ld2', ldflags =' -L/usr/local/lib'
libpth=/usr/local/lib /usr/lib /lib
libs=-lgdbm -lcrypt -lgdbm_compat
perllibs=-lcrypt -lgdbm_compat
libc=/usr/lib/libc.a, so=dll, useshrplib=true, libperl=libperl.a
gnulibc_version=''
Dynamic Linking:
dlsrc=dl_dlopen.xs, dlext=dll, d_dlsymun=undef, ccdlflags=' '
cccdlflags=' ', lddlflags=' -L/usr/local/lib'


Characteristics of this binary (from libperl):
Compile-time options: USE_LARGE_FILES
Built under cygwin
Compiled at Feb 7 2006 09:58:18
%ENV:
PERL5LIB="C:/HATS/scripts/lib"
CYGWIN=""
@INC:
C
/HATS/scripts/lib
/usr/local/lib/perl5/5.9.1/cygwin
/usr/local/lib/perl5/5.9.1
/usr/local/lib/perl5/site_perl/5.9.1/cygwin
/usr/local/lib/perl5/site_perl/5.9.1
/usr/local/lib/perl5/site_perl
.
===CUT HERE===

[Please do not change anything below this line]
-----------------------------------------------------------------
---
Flags:
category=core
severity=medium
---
Site configuration information for perl v5.8.7:

Configured by gerrit at Fri Dec 30 02:40:15 2005.

Summary of my perl5 (revision 5 version 8 subversion 7) configuration:
Platform:
osname=cygwin, osvers=1.5.18(0.13242),
archname=cygwin-thread-multi-64int
uname='cygwin_nt-5.1 inspiron 1.5.18(0.13242) 2005-07-02 20:30 i686
unknown unknown cygwin '
config_args='-de -Dmksymlinks -Duse64bitint -Dusethreads
-Uusemymalloc -Doptimize=-O3 -Dman3ext=3pm -Dusesitecustomize'
hint=recommended, useposix=true, d_sigaction=define
usethreads=define use5005threads=undef useithreads=define
usemultiplicity=define
useperlio=define d_sfio=undef uselargefiles=define usesocks=undef
use64bitint=define use64bitall=undef uselongdouble=undef
usemymalloc=n, bincompat5005=undef
Compiler:
cc='gcc', ccflags ='-DPERL_USE_SAFE_PUTENV -fno-strict-aliasing
-pipe -I/usr/local/include',
optimize='-O3',
cppflags='-DPERL_USE_SAFE_PUTENV -fno-strict-aliasing -pipe
-I/usr/local/include'
ccversion='', gccversion='3.4.4 (cygming special) (gdc 0.12, using
dmd 0.125)', gccosandvers=''
intsize=4, longsize=4, ptrsize=4, doublesize=8, byteorder=12345678
d_longlong=define, longlongsize=8, d_longdbl=define, longdblsize=12
ivtype='long long', ivsize=8, nvtype='double', nvsize=8,
Off_t='off_t', lseeksize=8
alignbytes=8, prototype=define
Linker and Libraries:
ld='ld2', ldflags =' -s -L/usr/local/lib'
libpth=/usr/local/lib /lib /usr/lib
libs=-lgdbm -ldb -lcrypt -lgdbm_compat
perllibs=-lcrypt -lgdbm_compat
libc=/usr/lib/libc.a, so=dll, useshrplib=true, libperl=libperl.a
gnulibc_version=''
Dynamic Linking:
dlsrc=dl_dlopen.xs, dlext=dll, d_dlsymun=undef, ccdlflags=' -s'
cccdlflags=' ', lddlflags=' -s -L/usr/local/lib'

Locally applied patches:
SPRINTF0 - fixes for sprintf formatting issues - CVE-2005-3962

---
@INC for perl v5.8.7:
C
/HATS/scripts/lib
/usr/lib/perl5/5.8/cygwin
/usr/lib/perl5/5.8
/usr/lib/perl5/site_perl/5.8/cygwin
/usr/lib/perl5/site_perl/5.8
/usr/lib/perl5/site_perl/5.8/cygwin
/usr/lib/perl5/site_perl/5.8
/usr/lib/perl5/vendor_perl/5.8/cygwin
/usr/lib/perl5/vendor_perl/5.8
/usr/lib/perl5/vendor_perl/5.8/cygwin
/usr/lib/perl5/vendor_perl/5.8
.

---
Environment for perl v5.8.7:
HOME=/home/ciaran
LANG=en_GB
LANGUAGE (unset)
LD_LIBRARY_PATH (unset)
LOGDIR (unset)

PATH=/home/ciaran/bin:/usr/bin:/usr/local/bin:/bin:/usr/X11R6/bin:/cygdrive/c/Perl/bin/:/cygdrive/c/PHP/:/cygdrive/c/swig:/cygdrive/c/gnuwin/gnuwin32/bin:/cygdrive/c/WINDOWS/system32:/cygdrive/c/WINDOWS:/cygdrive/c/WINDOWS/System32/Wbem:/cygdrive/c/PROGRA~1/ULTRAE~1:/cygdrive/c/Program
Files/MySQL/MySQL Server 5.0/bin:/cygdrive/c/Program
Files/Subversion/bin:/cygdrive/c/Program Files/Common
Files/GTK/2.0/bin:/cygdrive/c/Program Files/ActiveState Komodo
3.1/:/cygdrive/c/binpath:/usr/bin
PERL5LIB=C:/HATS/scripts/lib
PERL_BADLANG (unset)
SHELL (unset)

Re: [perl #38456] The :crlf PerlIO layer doesn't like the :encoding layer.
On Fri, Jan 28, 2011 at 6:43 AM, Father Chrysostomos via RT
<perlbug-followup@perl.org> wrote:
> On Thu Jan 20 15:04:30 2011, LeonT wrote:
>> I've attached a possible fix for this bug. The patch is relative to my
>> patches for #82484 but it doesn't depend on it.
>>
>> Currently, if a «:crlf» layer is given it will first try to (re)enable
>> any crlf layer it can find or else push itself on the stack. This can
>> lead to data corruption. In this patch I've changed it to only check the
>> topmost layer.
>>
>> Testing it would be very welcome, in particular on Windows (not my
>> habitat).
>>
>> Leon
>
> Thank you. I have just applied this as 7826b36. Let’s see whether it
> breaks Windows. :-)

Comment and documentation updates with regard to the previous patch.
No code changes.

Leon
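
The layer stack that the quoted patch description refers to can be
inspected with the built-in PerlIO::get_layers function. The following is
a minimal sketch that opens a temporary file with the same layer string
as the test case; the exact list printed is not shown here because it
depends on the perl version and platform.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

my (undef, $path) = tempfile(UNLINK => 1);

# Same layer specification as in the original test case.
open(my $fh, ">:encoding(UTF-16):crlf", $path)
    or die "Cannot open $path: $!";

# output => 1 asks for the output side of the handle; the layers are
# returned in order from the bottom of the stack to the top.
my @layers = PerlIO::get_layers($fh, output => 1);
print join(" ", @layers), "\n";

close($fh) or die "Cannot close $path: $!";
```

Whether a crlf entry sits above or below the encoding entry in that list
indicates where the line-ending translation is applied relative to the
encoding.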