backtrace a segfault

Janne Blomqvist blomqvist.janne@gmail.com
Fri Apr 10 06:06:00 GMT 2015


On Thu, Apr 9, 2015 at 8:48 PM, Toon Moene <toon@moene.org> wrote:
> On 04/09/2015 09:06 AM, Patrick Begou wrote:
>
>> Hi,
>>
>> I'm working on a large parallel Fortran application which (sometimes)
>> gives a segfault. When this error occurs I would like to backtrace
>> the call stack to know where it takes place, but I'm unable to get this
>> information, only a list of memory addresses. I've built a small
>> test case (with an error in an array dimension creating a segmentation
>> fault in a subroutine) to investigate gfortran/gcc options.
>>
>> With gcc version 4.8.2 using options "-g -fbacktrace -gdwarf-3" I get
>> ./plante
>> Program received signal SIGSEGV: Segmentation fault - invalid memory
>> reference.
>> Backtrace for this error:
>> #0  0x7F99F71A9AC7
>> #1  0x7F99F71AA0CE
>> #2  0x7F99F67A9B2F
>> Segmentation fault
>>
>> but addr2line -e ./plante 0x7F99F71AA0CE
>> returns: ??:0
>>
>> What have I missed ?

Depending on which version of binutils you have, addr2line may have
problems understanding the DWARF-3 debug format. Try with -gdwarf-2.


> Hard to say.  I have the same problem with a (far smaller) program of our
> weather forecasting suite. Compiled with gfortran 4.9 and linked against the
> OpenMPI libraries, I get this:
>
> Program received signal SIGSEGV: Segmentation fault - invalid memory
> reference.
>
> Backtrace for this error:
> #0  0x2ADE9148E407
> #1  0x2ADE9148EA1E
> #2  0x2ADE91F1C17F
> #3  0x5EB24E in update_desc_ at update_desc.F90:55
> #4  0x5E97D9 in swapoutdb_ at swapoutdb.F90:16 (discriminator 4)
> #5  0x40B259 in bator at Bator.F90:368 (discriminator 2)
> --------------------------------------------------------------------------
> mpirun noticed that process rank 0 with PID 27638 on node super.moene.org
> exited on signal 11 (Segmentation fault).
> --------------------------------------------------------------------------
>
> which looks reasonable to me.  Perhaps the earlier addresses are simply
> within the OpenMPI library routines.  The error most certainly isn't there,
> but in what you passed as arguments to them.

Typically the first few stack frames come from the backtrace-printing
routines in libgfortran. Another problem is that addr2line can't
resolve addresses from shared libraries (incl. libgfortran.so);
linking statically is a workaround.
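
To illustrate the shared-library point: the addresses printed in the
backtrace are runtime addresses, while addr2line expects an address
relative to the file it is given. Below is a rough Python sketch of the
translation, assuming you have the process's memory map
(/proc/<pid>/maps on Linux) captured while it was still running. The
map contents, paths and function names are made up for illustration,
and the sketch assumes each shared object was linked at base address 0
(the usual case) and that the main executable is not
position-independent.

```python
import re

# Matches "#N  0xADDR" optionally followed by " in <symbol info>".
FRAME_RE = re.compile(r"#(\d+)\s+0x([0-9A-Fa-f]+)(?:\s+in\s+(.*))?")

# Hypothetical /proc/<pid>/maps excerpt; real contents will differ.
MAPS = """\
00400000-00600000 r-xp 00000000 08:01 111 /home/user/plante
7f99f6775000-7f99f6930000 r-xp 00000000 08:01 333 /lib64/libc.so.6
7f99f7190000-7f99f72a0000 r-xp 00000000 08:01 222 /usr/lib64/libgfortran.so.3
"""


def parse_backtrace(text):
    """Return (frame_number, address, symbol_info_or_None) per frame line."""
    frames = []
    for line in text.splitlines():
        m = FRAME_RE.search(line)
        if m:
            frames.append((int(m.group(1)), int(m.group(2), 16), m.group(3)))
    return frames


def parse_maps(text):
    """Return (start, end, path) for every file-backed mapping."""
    regions = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 6:
            continue  # anonymous mapping, no backing file
        start, end = (int(x, 16) for x in fields[0].split("-"))
        regions.append((start, end, fields[5]))
    return regions


def addr2line_args(addr, regions):
    """Return (path, address to give addr2line), or None if unmapped."""
    path = next((p for s, e, p in regions if s <= addr < e), None)
    if path is None:
        return None
    if ".so" not in path:
        return path, addr  # non-PIE executable: use the address as-is
    # Shared object: subtract the load base (lowest mapping of that file).
    base = min(s for s, _, p in regions if p == path)
    return path, addr - base


if __name__ == "__main__":
    trace = "#1  0x7F99F71AA0CE\n#2  0x7F99F67A9B2F\n"
    regions = parse_maps(MAPS)
    for number, addr, info in parse_backtrace(trace):
        hit = addr2line_args(addr, regions)
        if info is None and hit is not None:
            print("frame #%d: addr2line -f -e %s 0x%x" % (number, hit[0], hit[1]))
```

With a statically linked executable none of this is needed, since
every frame then lies in the one file passed to addr2line.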



-- 
Janne Blomqvist


