What could cause this SEGV

Paul Smith paul@mad-scientist.net
Tue Jun 12 13:22:00 GMT 2018


I have a strange situation and I'm looking for advice as to how to
proceed further.  Or if I should just give up.

I have an encapsulated build toolchain on GNU/Linux which uses GCC
8.1.0, binutils 2.30 (using gold), and a Red Hat Enterprise Linux 6.5
sysroot (libc etc.).  I can successfully compile code with this
toolchain and have it run on all sorts of systems from RHEL 6.5 and
above (which is most every GNU/Linux distribution released in the last
4 years or so).

However, even though this toolchain is completely encapsulated (that
is, when I look at both the preprocessor output and verbose linker
output I don't see any reference to any file outside of the toolchain,
other than the ones I'm compiling of course), when I create a
particular shared object on one system it dumps core when used but when
I create it on another system it does not dump core and works properly.

The problem happens during the link operation (so maybe this isn't the
right list).  If I copy the object files from the "working" system to
the "failing" system, and just re-link, then the resulting .so still
segfaults.  The encapsulation is identical (copied from one system to
the other).  I've run md5sum on all the object files, local libraries,
and system libraries mentioned in the linker verbose output and
compared them and they're identical before linking.
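
The checksum comparison described above can also be scripted so that
it is repeatable on both systems.  The following is only a minimal
sketch in Python: the function names md5_manifest and diff_manifests
are hypothetical, and the list of paths is assumed to be collected by
hand from the linker's verbose output.

```python
import hashlib


def md5_manifest(paths):
    """Return a dict mapping each path to the MD5 hex digest of its contents."""
    manifest = {}
    for p in paths:
        h = hashlib.md5()
        with open(p, "rb") as f:
            # Read in 1 MiB chunks so large static libraries are not
            # loaded into memory all at once.
            for chunk in iter(lambda: f.read(1 << 20), b""):
                h.update(chunk)
        manifest[str(p)] = h.hexdigest()
    return manifest


def diff_manifests(a, b):
    """Return the sorted paths whose digests differ, or that exist on one side only."""
    return sorted(k for k in set(a) | set(b) if a.get(k) != b.get(k))
```

Running md5_manifest on each system and passing the two results to
diff_manifests should give an empty list if the link inputs really are
identical.  This assumes the files live at the same paths on both
systems.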

I've compared the output of -Wl,--verbose and aside from addresses the
output is the same: no different flags etc.  The compile and link use
-mtune=generic -march=x86-64.

Also the working .so runs correctly on all the systems, and the failing
.so fails on all the systems (even the one it was created on).
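
Since both variants of the .so exist side by side, one possible next
step is to locate where the two files differ at the byte level.  Below
is a rough Python sketch; the function name differing_ranges and the
gap parameter are assumptions made for illustration.  Because the
linker output already differs in addresses, the result may well be
noisy.

```python
def differing_ranges(a, b, gap=16):
    """Return half-open (start, end) byte ranges where a and b differ.

    Differences separated by fewer than `gap` identical bytes are merged
    into one range.  If the inputs have different lengths, the tail of
    the longer one is reported as a final range.
    """
    ranges = []
    n = min(len(a), len(b))
    for i in range(n):
        if a[i] != b[i]:
            if ranges and i - ranges[-1][1] < gap:
                # Close enough to the previous difference: extend it.
                ranges[-1][1] = i + 1
            else:
                ranges.append([i, i + 1])
    if len(a) != len(b):
        ranges.append([n, max(len(a), len(b))])
    return [tuple(r) for r in ranges]
```

The inputs would be the contents of the two files read in binary mode.
The reported offsets could then be matched against the section offsets
printed by readelf -S to see which sections are affected.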

FWIW, the failing .so is a Python 2.7 module which is part of the
PyCrypto package: it's _fastmath.so which links in a static GNU MPIR
3.0.0 library and is dlopen()'d by Python.  When it fails it dumps core
in the __gmpz_init() function.  Unfortunately I have nowhere near
sufficient assembly fu to investigate much more than that.

Is there more I could do to track this down and understand it?  Should
I just proceed with the "working" _fastmath.so and call it good?  It
seems like there's some hole in my encapsulation that I can't see; I
would prefer to find and patch it if possible...


