This is the mail archive of the gcc-help@gcc.gnu.org mailing list for the GCC project.



Re: Pointer parameter in function optimized out - incorrectly


On 08/15/2015 02:08 PM, Avi Kivity wrote:
On 08/15/2015 10:19 PM, James Kuyper wrote:
The program where I ran into this problem is too large to justify
posting it, so I don't expect answers that identify precisely what is
going wrong - I'm just looking for advice as to what to look for. If I
don't find the problem soon, I'll try shrinking the program down to a
point where it can be posted.

SD_start_time is a double object with automatic storage duration defined
in process_a_granule() whose address is passed to process_a_scan(),
where that pointer parameter is called scan_time, and then to
compute_SD_start_time(), where the pointer is called SD_start_time.

At optimization level 2, as indicated by the following output from gdb,
scan_time is optimized out of the function call interface for
process_a_scan(). The address for SD_start_time is simply
passed directly to compute_SD_start_time(). The problem is that it is
sometimes the incorrect address. In this particular case, 33 scans were
processed without a hitch, and then it failed for the 34th:

Program received signal SIGSEGV, Segmentation fault.
0x00000000004129d4 in compute_SD_start_time (pkt_header=0x7fffffffd870,
     SD_start_time=0x1fff88b30) at compute_SD_start_time.c:109
109      *SD_start_time = pkt_header->pkt_TAI_time -
global_time_offset_array[index];
(gdb) print SD_start_time
$1 = (PGSt_double *) 0x1fff88b30
(gdb) print *SD_start_time
Cannot access memory at address 0x1fff88b30
(gdb) up
#1  0x000000000041a01c in process_a_scan (scan_number=<value optimized
out>,
     pkt=<value optimized out>, scan_rate=<value optimized out>,
     scan_time=<value optimized out>, L1A_scan=<value optimized out>,
     scan_meta=0x7ffffff88a10, eng_data=0x7ffffff88b80,
     failed_pkts=0x7fffffffd908, pkt_header=0x7fffffffd870,
     scan_pixel=0x7ffffff85b20, L0_file=0x7fffffce617c) at
process_a_scan.c:420
up
420        compute_SD_start_time (pkt_header, scan_time);
(gdb) up
#2  0x0000000000419627 in process_a_granule (L0_file=0,
     gran_start_time=386024405, gran_end_time=386024705,
     pcf_config=0x7fffffffd100, eng_data=0x7ffffff88b80,
     pkt_header=<value optimized out>, pkt=<value optimized out>,
     failed_pkts=0x7fffffffd908) at process_a_granule.c:276
276             L1A_status = process_a_scan(&prev_scan_number, pkt,
(gdb) print SD_start_time
$2 = 386024455.18252099
(gdb) print &SD_start_time
$3 = (PGSt_double *) 0x7ffffff88b30
(gdb) l 275
...
276             L1A_status = process_a_scan(&prev_scan_number, pkt,
277                  &pcf_config->scan_rate, &SD_start_time,
278                  &scan_data, &scan_metadata, eng_data, failed_pkts,
279                  pkt_header, &pixel_quality_data, &L0_file);
(gdb) down
#1  0x000000000041a01c in process_a_scan (scan_number=<value optimized
out>,
     pkt=<value optimized out>, scan_rate=<value optimized out>,
     scan_time=<value optimized out>, L1A_scan=<value optimized out>,
     scan_meta=0x7ffffff88a10, eng_data=0x7ffffff88b80,
     failed_pkts=0x7fffffffd908, pkt_header=0x7fffffffd870,
     scan_pixel=0x7ffffff85b20, L0_file=0x7fffffce617c) at
process_a_scan.c:420

So, &SD_start_time in process_a_granule() is 0x7ffffff88b30, but
SD_start_time in compute_SD_start_time() is 0x1fff88b30.

I'm pretty impressed by this optimization, since all three functions are
defined in different translation units, so it could only be performed at
link time. However, no matter how impressive it is, if it doesn't
produce the right results, it's no good. The overwhelming majority of
the time it works as intended, so the problem must be triggered by input
data that is unusual in some way. What kinds of things could I do to
figure out why the wrong address is sometimes (but not always) sent to
compute_SD_start_time()?
If the incorrect pointer were being transmitted by my own code, it would
be easy to figure this out; but the optimizer removed the code I wrote
to pass the pointer, and used some alternative method of its own
choosing, and I don't know how to track that down.

The low 32 bits of those pointers are identical (0xfff88b30), so I
expect that something overwrote the most significant 32 bits of the
pointer (with the value 1).  I recommend you recompile with the various
sanitizers enabled to see what could have caused this.  gcc is probably
innocent here.

Right.  Valgrind, sanitizers, etc.

Also note that you can't trust the debugger: <optimized out> can have lots of meanings, and the debugger can also give you bogus information when the optimizers are enabled. Use gdb as a guide, but when something looks weird, verify the actual machine state by looking directly at the relevant registers, memory locations, etc.

Jeff


