Mailing List Archive

[perl #37102] Perl regexp state gets clobbered (includes a fix)
# New Ticket Created by Geoff Mottram
# Please include the string: [perl #37102]
# in the subject line of all future correspondence about this issue.
# <URL: https://rt.perl.org/rt3/Ticket/Display.html?id=37102 >


The following problem has been observed under versions of Perl after 5.8.4
on Linux and Windows including the most recent stable version (5.8.7). The
script that experienced the problem is too complex to include and requires
a connection to a database server. A patch with a fix is provided at the
end of this email.

The following regex was causing either an Out of Memory condition or a
segmentation fault after two iterations:

if ($a =~ m/^([A-Za-z]+)(.*)/)

It was caused by the PL_regeol variable of regexec.c getting overwritten
when in the "av_store()" call on line 4370 of regexec.c (see gdb backtrace
below):

if (a)
sw = *a;
else if (si && doinit) {
sw = swash_init("utf8", "", si, 1, 0);
(void)av_store(av, 1, sw);
}

When av_store is called, the previous positional parameter is
de-referenced and its DESTROY method is called. I could not figure out
what object was being destroyed but the gdb backtrace will probably ring a
few bells for you. The script was working with UTF-8 content and wide
characters, which I suspect was related to the problem. In any case, the
DESTROY method that was called included a regular expression that would
trash the state of the regular expression engine a few methods up the stack.

My solution is given in a patch at the very end of this email. Please
excuse the comment with my name but it will allow me to find my fix in my
copy of Perl should I need to in the future.

I added the following call to sv_clear() in sv.c:

save_re_context();

just prior to the call to the object's DESTROY method. This allows regular
expressions to be used in the DESTROY method of whatever object is being
used as a positional variable in a regular expression (a Unicode string?).

I agree with the comments above the save_re_context() function that this
is a really kludgey way for the regular expression engine to be written.
The engine should store its state in a stack or allocated structure that
is passed to its subroutines and not use global variables. I am afraid
that there may be other sections in Perl that will also require calls to
save_re_context() that have not been discovered yet.

Thanks for getting my fix in to the Perl distribution.

Sincerely,

Geoff Mottram
geoff@gate.net

--------------Output of perl -V ------------

Summary of my perl5 (revision 5 version 8 subversion 7) configuration:
Platform:
osname=linux, osvers=2.4.21-27.0.2.elsmp, archname=i686-linux
uname='linux classificationweb.net 2.4.21-27.0.2.elsmp #1 smp wed jan
12 23:35:44 est 2005 i686 i686 i386 gnulinux '
config_args='-de'
hint=previous, useposix=true, d_sigaction=define
usethreads=undef use5005threads=undef useithreads=undef
usemultiplicity=undef
useperlio=define d_sfio=undef uselargefiles=define usesocks=undef
use64bitint=undef use64bitall=undef uselongdouble=undef
usemymalloc=n, bincompat5005=undef
Compiler:
cc='cc', ccflags ='-fno-strict-aliasing -pipe -I/usr/local/include
-D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64 -I/usr/include/gdbm -DDEBUGGING',
optimize='-g',
cppflags='-fno-strict-aliasing -pipe -I/usr/local/include
-I/usr/include/gdbm -fno-strict-aliasing -pipe -I/usr/local/include
-D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64 -I/usr/include/gdbm
-fno-strict-aliasing -pipe -I/usr/local/include -D_LARGEFILE_SOURCE
-D_FILE_OFFSET_BITS=64 -I/usr/include/gdbm -DDEBUGGING
-fno-strict-aliasing -pipe -I/usr/local/include -D_LARGEFILE_SOURCE
-D_FILE_OFFSET_BITS=64 -I/usr/include/gdbm -DDEBUGGING'
ccversion='', gccversion='3.2.3 20030502 (Red Hat Linux 3.2.3-49)',
gccosandvers=''
intsize=4, longsize=4, ptrsize=4, doublesize=8, byteorder=1234
d_longlong=define, longlongsize=8, d_longdbl=define, longdblsize=12
ivtype='long', ivsize=4, nvtype='double', nvsize=8, Off_t='off_t',
lseeksize=8
alignbytes=4, prototype=define
Linker and Libraries:
ld='cc', ldflags =' -L/usr/local/lib'
libpth=/usr/local/lib /lib /usr/lib
libs=-lnsl -lgdbm -ldb -ldl -lm -lcrypt -lutil -lc
perllibs=-lnsl -ldl -lm -lcrypt -lutil -lc
libc=/lib/libc-2.3.2.so, so=so, useshrplib=false, libperl=libperl.a
gnulibc_version='2.3.2'
Dynamic Linking:
dlsrc=dl_dlopen.xs, dlext=so, d_dlsymun=undef, ccdlflags='-Wl,-E'
cccdlflags='-fpic', lddlflags='-shared -L/usr/local/lib'


Characteristics of this binary (from libperl):
Compile-time options: DEBUGGING USE_LARGE_FILES
Built under linux
Compiled at Sep 7 2005 06:10:19
@INC:
/usr/local/lib/perl5/5.8.7/i686-linux
/usr/local/lib/perl5/5.8.7
/usr/local/lib/perl5/site_perl/5.8.7/i686-linux
/usr/local/lib/perl5/site_perl/5.8.7
/usr/local/lib/perl5/site_perl
.
--------------------------------------------

------------Output of perbug -d -------------

---
Flags:
category=
severity=
---
Site configuration information for perl v5.8.7:

Configured by minaret at Wed Sep 7 06:08:36 EDT 2005.

Summary of my perl5 (revision 5 version 8 subversion 7) configuration:
Platform:
osname=linux, osvers=2.4.21-27.0.2.elsmp, archname=i686-linux
uname='linux classificationweb.net 2.4.21-27.0.2.elsmp #1 smp wed jan
12 23:35:44 est 2005 i686 i686 i386 gnulinux '
config_args='-de'
hint=previous, useposix=true, d_sigaction=define
usethreads=undef use5005threads=undef useithreads=undef
usemultiplicity=undef
useperlio=define d_sfio=undef uselargefiles=define usesocks=undef
use64bitint=undef use64bitall=undef uselongdouble=undef
usemymalloc=n, bincompat5005=undef
Compiler:
cc='cc', ccflags ='-fno-strict-aliasing -pipe -I/usr/local/include
-D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64 -I/usr/include/gdbm -DDEBUGGING',
optimize='-g',
cppflags='-fno-strict-aliasing -pipe -I/usr/local/include
-I/usr/include/gdbm -fno-strict-aliasing -pipe -I/usr/local/include
-D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64 -I/usr/include/gdbm
-fno-strict-aliasing -pipe -I/usr/local/include -D_LARGEFILE_SOURCE
-D_FILE_OFFSET_BITS=64 -I/usr/include/gdbm -DDEBUGGING
-fno-strict-aliasing -pipe -I/usr/local/include -D_LARGEFILE_SOURCE
-D_FILE_OFFSET_BITS=64 -I/usr/include/gdbm -DDEBUGGING'
ccversion='', gccversion='3.2.3 20030502 (Red Hat Linux 3.2.3-49)',
gccosandvers=''
intsize=4, longsize=4, ptrsize=4, doublesize=8, byteorder=1234
d_longlong=define, longlongsize=8, d_longdbl=define, longdblsize=12
ivtype='long', ivsize=4, nvtype='double', nvsize=8, Off_t='off_t',
lseeksize=8
alignbytes=4, prototype=define
Linker and Libraries:
ld='cc', ldflags =' -L/usr/local/lib'
libpth=/usr/local/lib /lib /usr/lib
libs=-lnsl -lgdbm -ldb -ldl -lm -lcrypt -lutil -lc
perllibs=-lnsl -ldl -lm -lcrypt -lutil -lc
libc=/lib/libc-2.3.2.so, so=so, useshrplib=false, libperl=libperl.a
gnulibc_version='2.3.2'
Dynamic Linking:
dlsrc=dl_dlopen.xs, dlext=so, d_dlsymun=undef, ccdlflags='-Wl,-E'
cccdlflags='-fpic', lddlflags='-shared -L/usr/local/lib'

Locally applied patches:


---
@INC for perl v5.8.7:
/usr/local/lib/perl5/5.8.7/i686-linux
/usr/local/lib/perl5/5.8.7
/usr/local/lib/perl5/site_perl/5.8.7/i686-linux
/usr/local/lib/perl5/site_perl/5.8.7
/usr/local/lib/perl5/site_perl
.

---
Environment for perl v5.8.7:
HOME=/minaret
LANG=en_US.UTF-8
LANGUAGE (unset)
LD_LIBRARY_PATH (unset)
LOGDIR (unset)

PATH=/usr/kerberos/bin:/usr/local/bin:/bin:/usr/bin:/usr/X11R6/bin:/usr/kerberos/bin:/usr/local/bin:/bin:/usr/bin:/usr/X11R6/bin/bin:/usr/local/java/jdk1.5.0_04/bin:/usr/local/java/jdk1.5.0_04/jre/bin:/minaret/m/bin:/minaret/m/util:/minaret/bin
PERL_BADLANG (unset)
SHELL=/bin/bash

----------------------------------------------

----- gdb backtrace showing how PL_regeol gets clobbered by
Perl_re_intuit_start-------------------------

#0 Perl_re_intuit_start (prog=0x90e5070, sv=0x8fda2a0, strpos=0x9727d28
"utf8::DESTROY", strend=0x9727d35 "", flags=2, data=0x0)
at regexec.c:454
#1 0x080ca09f in Perl_pp_match () at pp_hot.c:1328
#2 0x080b288d in Perl_runops_debug () at dump.c:1452
#3 0x08063509 in S_call_body (myop=0xbfffa050, is_eval=0) at perl.c:2364
#4 0x0806318f in Perl_call_sv (sv=0x95d17b0, flags=150) at perl.c:2282
#5 0x080d9de2 in Perl_sv_clear (sv=0x972d270) at sv.c:5110
#6 0x080da66c in Perl_sv_free (sv=0x972d270) at sv.c:5356
#7 0x080da24c in Perl_sv_clear (sv=0x978cf10) at sv.c:5208
#8 0x080da66c in Perl_sv_free (sv=0x978cf10) at sv.c:5356
#9 0x080c3800 in Perl_av_store (av=0x95509a0, key=1, val=0x9702280) at
av.c:335
#10 0x0812e348 in Perl_regclass_swash (node=0x9552c84, doinit=1 '\001',
listsvp=0x0, altsvp=0xbfffa27c) at regexec.c:4370
#11 0x0812e551 in S_reginclass (n=0x9552c84, p=0x9760c01 "3279.K86",
lenp=0x0, do_utf8=1 '\001') at regexec.c:4420
#12 0x0812d438 in S_regrepeat (p=0x9552c84, max=2147483647) at regexec.c:4112
#13 0x0812c4ae in S_regmatch (prog=0x9552c74) at regexec.c:3774
#14 0x08126f7c in S_regtry (prog=0x9552c30, startpos=0x9760c00
"B3279.K86") at regexec.c:2204
#15 0x08125a61 in Perl_regexec_flags (prog=0x9552c30, stringarg=0x9760c00
"B3279.K86", strend=0x9760c09 "", strbeg=0x9760c00 "B3279.K86",
minend=0, sv=0x95507fc, data=0x0, flags=3) at regexec.c:1753
#16 0x080ca120 in Perl_pp_match () at pp_hot.c:1340
#17 0x080b288d in Perl_runops_debug () at dump.c:1452
#18 0x08062878 in S_run_body (oldscope=1) at perl.c:1995
#19 0x08062431 in perl_run (my_perl=0x8fda008) at perl.c:1919
#20 0x0805e81b in main (argc=20, argv=0xbfffa914, env=0xbfffa968) at
perlmain.c:98
(gdb) p *prog
$111 = {startp = 0x90dfb48, endp = 0x90e4e88, regstclass = 0x0, substrs =
0x90e1c88, precomp = 0x90e5018 "^threads::new$", data = 0x0,
subbeg = 0x0, offsets = 0x90e5030, sublen = 0, refcnt = 1, minlen = 12,
prelen = 14, nparens = 0, lastparen = 0, lastcloseparen = 0,
reganch = 137363457, program = {{flags = 156 '\234', type = 0 '\0',
next_off = 0}}}

---------------- Patch for sv.c to fix problem ---------------

--- sv.c.orig 2005-05-27 06:38:11.000000000 -0400
+++ sv.c 2005-09-07 19:31:19.000000000 -0400
@@ -5102,6 +5102,8 @@
SV* tmpref = newRV(sv);
SvREADONLY_on(tmpref); /* DESTROY() could be naughty */
ENTER;
+ /* Next line added by Geoff Mottram 9/7/05. */
+ save_re_context();
PUSHSTACKi(PERLSI_DESTROY);
EXTEND(SP, 2);
PUSHMARK(SP);

---------------------------------------------------------------
Re: [perl #37102] Perl regexp state gets clobbered (includes a fix) [ In reply to ]
On 9/8/05, via RT Geoff Mottram <perlbug-followup@perl.org> wrote:
> The following regex was causing either an Out of Memory condition or a
> segmentation fault after two iterations:
>
> if ($a =~ m/^([A-Za-z]+)(.*)/)
>
> It was caused by the PL_regeol variable of regexec.c getting overwritten
> when in the "av_store()" call on line 4370 of regexec.c (see gdb backtrace
> below):
>
> if (a)
> sw = *a;
> else if (si && doinit) {
> sw = swash_init("utf8", "", si, 1, 0);
> (void)av_store(av, 1, sw);
> }
>
> When av_store is called, the previous positional parameter is
> de-referenced and its DESTROY method is called. I could not figure out
> what object was being destroyed but the gdb backtrace will probably ring a
> few bells for you. The script was working with UTF-8 content and wide
> characters, which I suspect was related to the problem. In any case, the
> DESTROY method that was called included a regular expression that would
> trash the state of the regular expression engine a few methods up the stack.

Given that your bug happens inside the debugger, I wonder whether the
following patch isn't more appropriate :


==== //depot/perl/pp_hot.c#415 - /opt/bleadperl/p4/perl/pp_hot.c ====
--- /home/rafael/tmp/tmp.14639.0 2005-10-09 18:59:52.901594208 +0200
+++ /opt/bleadperl/p4/perl/pp_hot.c 2005-10-09 18:57:32.849885304 +0200
@@ -2657,6 +2657,7 @@ PP(pp_entersub)
if (CvASSERTION(cv) && PL_DBassertion)
sv_setiv(PL_DBassertion, 1);

+ save_re_context();
cv = get_db_sub(&sv, cv);
if (!cv || (!CvXSUB(cv) && !CvSTART(cv)))
DIE(aTHX_ "No DB::sub routine defined");
End of Patch.

However, I can't reproduce your problem, so I'm not sure if it fixes it.
Re: [perl #37102] Perl regexp state gets clobbered (includes a fix) [ In reply to ]
On 10/9/05, Rafael Garcia-Suarez <rgarciasuarez@gmail.com> wrote:
> Given that your bug happens inside the debugger, I wonder whether the
> following patch isn't more appropriate :
[...snipped...]

Meanwhile I applied this workaround :

Change 25721 by rgs@marais on 2005/10/09 16:42:26

Don't use a regexp in DB::sub().
This should work around perlbug #37102.

Affected files ...

... //depot/perl/lib/DB.pm#6 edit

Differences ...

==== //depot/perl/lib/DB.pm#6 (text) ====

@@ -63,8 +63,7 @@
push(@stack, $DB::single);
$DB::single &= 1;
$DB::single |= 4 if $#stack == $deep;
-# print $DB::sub, "\n";
- if ($DB::sub =~ /(?:^|::)DESTROY$/ or not defined wantarray) {
+ if ($DB::sub eq 'DESTROY' or substr($DB::sub, -9) eq '::DESTROY' or
not defined wantarray) {
&$DB::sub;
$DB::single |= pop(@stack);
$DB::ret = undef;
End of patch.