NWCLARK TPF grant report #35
[Hours] [Activity]
2012/04/30 Monday
7.75 Makefile.SH miniperl rule unification
0.25 Pod::PerlDoc
0.50 RT #33159

2012/05/01 Tuesday
1.00 AIX bisect
1.00 ID 20010309.008 (#6002)
0.50 ID 20010918.001 (#7698)
0.25 ID 20011126.145 (#7937)
0.50 RT #17711
0.25 RT #29437
0.25 RT #32331
3.75 reading/responding to list mail

2012/05/02 Wednesday
0.25 AIX bisect
0.50 RT #108286
6.25 reading/responding to list mail

2012/05/03 Thursday
0.25 ID 20010309.008 (#6002)
0.25 RT #112732
0.25 RT #18049
0.50 RT #78224
0.25 process, scalability, mentoring
3.75 reading/responding to list mail
0.25 smoke-me/trim-superfluous-Makefile

2012/05/04 Friday
0.75 HP-UX 32 bit -DDEBUGGING failure
0.25 ID 20011126.145 (#7937)
1.75 RT #108286
0.75 process, scalability, mentoring
0.75 reading/responding to list mail

2012/05/05 Saturday
0.25 HP-UX 32 bit -DDEBUGGING failure

2012/05/06 Sunday
2.50 mktables memory usage

Which I calculate is 35.50 hours

I started the week with some more simplification of Makefile.SH, made
possible by last week's changes. As AIX now uses the regular miniperl, the
AIX-specific section of the Makefile starts to look much more similar to
the OS/2-specific section. With some refactoring to avoid needing a
command-line parameter to specify the platform to build for, and by passing
in a -DPERL_DLL on AIX which does nothing, the two converge even further.
At that point it was a small matter of using Makefile macros to encode the
differing dependencies and filenames, after which the two can share code,
and Makefile.SH gets simpler. It's a lot better than it was, although
there's still more to do that will make it simpler still.*
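
The macro technique can be sketched roughly like so (DLL_DEF and PLATFORM
are illustrative names, not the actual Makefile.SH macros):

```makefile
# Hypothetical sketch: platform differences live in macro definitions,
# so a single rule body can be shared between platforms. DLL_DEF and
# PLATFORM are made-up names for illustration.
DLL_DEF = -DPERL_DLL        # a harmless no-op on platforms that ignore it
PLATFORM = aix

perl.exp: miniperl$(EXE_EXT)
	./miniperl$(EXE_EXT) makedef.pl PLATFORM=$(PLATFORM) $(DLL_DEF) > $@
```

Once both platforms' rules are instances of the same pattern, the
duplicated sections can collapse into one.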

Recently James E Keenan, Brian Fraser and Darin McBride have been going
through RT looking for old stalled bugs related to old versions of Perl
on obsolete versions of operating systems, and seeing whether they are still
reproducible on current versions. If the problem isn't reproducible, it's
not always obvious whether the bug was actually fixed, or merely that the
symptom was hidden. This matters if the symptom was revealing a buffer
overflow or similar security issue, as we'd like to find these before
the blackhats do. Hence I've been investigating some of these, to try to get
a better idea of whether we're about to throw away our only easy clue to a
still-present bug.

One of these was RT #6002, reported back in 2001 in the old system as ID
20010309.008. In this case, the problem was that glob of a long filename
would fail with a SEGV. Current versions of perl on current AIX don't SEGV,
but did we fix it, did IBM, or is it still lurking? In this case, it turned
out that I could replicate the SEGV by building 5.6.0 on current AIX. At
which point, I have a test case, so start up git bisect, and the answer
should pop out within an hour. Only it doesn't, because it turns out that
git bisect gets stuck in a tarpit of "skip"s because some intermediate
blead version doesn't build. So this means a digression into bisecting the
cause of the build failure, and then patching Porting/ to
be able to build the relevant intermediate blead versions, so that it can
then find the true cause. This might seem like a lot of work for something
used only once, but it tends not to be: each such fix makes it progressively
easier to bisect the next problem without hitting snags, and until you have
it you don't realise how powerful a tool automated bisection is. It's
a massive time saver.

But as to the original bug and the cause of its demise: it turned out to be
interesting, and completely not what I expected:

commit 61d42ce43847d6cea183d4f40e2921e53606f13f
Author: Jarkko Hietaniemi <>
Date: Wed Jun 13 02:23:16 2001 +0000

New AIX dynaloading code from Jens-Uwe Mager.
Does break binary compatibility.

p4raw-id: //depot/perl@10554

The SEGV (due to an illegal instruction) goes away once perl switched to using
dlopen() for dynamic linking on AIX. So my hunch that this bug was worth
digging into was right, but not for the reason I'd guessed.

Another piece of fun this week was determining why Merijn's HP-UX smoker
wasn't able to build with certain configuration options. The output summary
grid looked like this, which is most strange:

O = OK F = Failure(s), extended report at the bottom
X = Failure(s) under TEST but not under harness
? = still running or test results not (yet) available
Build failures during: - = unknown or N/A
c = Configure, m = make, M = make (after miniperl), t = make test-prep

v5.15.9-270-g5a0c7e9 Configuration (common) none
----------- ---------------------------------------------------------
O O O m - -
O O O O O O -Duse64bitall
O O O m - - -Duseithreads
O O O O O O -Duseithreads -Duse64bitall
| | | | | +- LC_ALL = univ.utf8 -DDEBUGGING
| | | | +--- PERLIO = perlio -DDEBUGGING
| | | +----- PERLIO = stdio -DDEBUGGING
| | +------- LC_ALL = univ.utf8
| +--------- PERLIO = perlio
+----------- PERLIO = stdio

As the key says, 'O' is OK. It's what we want. 'm' is very bad - it means
that it couldn't even build miniperl, let alone build extensions or run any
tests. But what is strange is that ./Configure ... will fail, but the same
options plus -Duse64bitall will work just fine. And this is replicated with
ithreads - default fails badly, but use 64 bit IVs and pointers and it
works. Usually it's the other way round - the default configuration works
because it is "simplest", and attempting something more complex, such as
64 bit support, ithreads, or a shared perl library, hits a problem.

As it turns out, the key is that the failing ./Configure ... invocation
contains -DDEBUGGING. The -DDEBUGGING parameter to Configure causes it to add
-DDEBUGGING to the C compiler flags, and to *add* -g to the optimiser
settings (without removing anything else there). So on HP-UX, with HP's
compiler that changes the optimiser setting from '+O2 +Onolimit' to
'+O2 +Onolimit -g'. Which, it seems, the compiler doesn't accept for
building 32 bit object code (the default) but does in 64 bit. Crazy thing.

Except that, astoundingly, it's not even that simple. The original error
message was actually "Can't handle preprocessed file". It turns out that
this detail is important. The build is using ccache to speed things up, so
ccache is invoking the pre-processor only, not the main compiler, to create
a hash key to look up in its cache of objects. However, on a cache miss,
ccache doesn't run the pre-processor again - to save time by avoiding
repeating work, it compiles the already pre-processed source. And the
distinction between invoking the pre-processor and then compiling, versus
compiling straight from the original source, turns out to be key:

$ echo 'int i;' >bonkers.c
$ cc -c -g +O2 bonkers.c
$ cc -E -g +O2 bonkers.c >bonkers.i
$ cc -c -g +O2 bonkers.i
cc: error 1414: Can't handle preprocessed file "bonkers.i" if -g and -O specified.
$ cat bonkers.i
# 1 "bonkers.c"
int i;

$ cc -c -g +O2 +DD64 bonkers.c
$ cc -E -g +O2 +DD64 bonkers.c >bonkers.i
$ cc -c -g +O2 +DD64 bonkers.i
$ cat bonkers.i
# 1 "bonkers.c"
int i;

No, it's not just a crazy compiler, it's insane! It handles -g +O2 just fine
normally, but in 32 bit mode it refuses to accept pre-processed input,
whereas in 64 bit mode it accepts it.

If HP think that this isn't a bug, I'd love to know what their excuse is.

Nicholas Clark

* specifically,
i) auditing all the places that invoke ./miniperl -Ilib ... to see where
-I command line options and BEGIN blocks that set @INC are now redundant
thanks to,
ii) auditing the various Makefile macros used to run miniperl and perl, and
how they use $(LDLIBPTH) and $(RUN) to set dynamic loading and cope with
cross-compilation. Which is part of why Jess Robinson's grant to work on
cross-compilation will have knock-on benefits to general maintainability.
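
The idea behind those two macros can be sketched as follows (the values and
the script name are illustrative; the real definitions come from Configure
and the platform hints):

```makefile
# $(RUN) prefixes every invocation of a freshly built binary: empty for a
# native build, a wrapper (an emulator, or a shell out to the target
# machine) when cross-compiling. $(LDLIBPTH) points the runtime linker at
# the uninstalled shared libperl in the build directory.
RUN =
LDLIBPTH = LD_LIBRARY_PATH=`pwd`

# some_build_script.pl is a hypothetical stand-in for the various build
# scripts that the Makefile runs with the freshly built miniperl.
some_target: miniperl
	$(LDLIBPTH) $(RUN) ./miniperl -Ilib some_build_script.pl
```

Keeping every such invocation behind the same pair of macros is what lets a
cross-compiling build redirect them all in one place.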