This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Re: [Patch] Fix PR 60040
- From: Senthil Kumar Selvaraj <senthil_kumar dot selvaraj at atmel dot com>
- To: Bernd Schmidt <bschmidt at redhat dot com>
- Cc: Senthil Kumar Selvaraj <senthil_kumar dot selvaraj at atmel dot com>, "gcc-patches at gcc dot gnu dot org" <gcc-patches at gcc dot gnu dot org>, <uweigand at de dot ibm dot com>, <gnu at amylaar dot uk>
- Date: Thu, 28 Apr 2016 12:27:38 +0530
- Subject: Re: [Patch] Fix PR 60040
- Authentication-results: sourceware.org; auth=none
- References: <87h9fd1vqf dot fsf at atmel dot com> <5706B950 dot 3090003 at redhat dot com> <8737qnrq4i dot fsf at atmel dot com> <571E30FF dot 4020907 at redhat dot com>
Bernd Schmidt writes:
> On 04/15/2016 02:52 PM, Senthil Kumar Selvaraj wrote:
>>
>> For both testcases in the PR, reload fails to take into account that
>> FP-SP elimination can no longer be performed, and tries to find reload
>> regs for an rtx generated when FP-SP elimination was valid.
>>
>> 1. reload initializes elim table with FP->SP elimination enabled.
>> 2. alter_reg for a pseudo allocates a stack slot for the pseudo, and sets
>> reg_equiv_memory_loc to frame_pointer_rtx plus offset. It also sets
>> something_was_spilled to true.
>> 3. The main reload loop starts, and it resets something_was_spilled to false.
>> 4. reload calls eliminate_regs for the pseudo and sets reg_equiv_address to
>> (mem(SP + offset)).
>> 5. calculate_needs_all_insns pushes a reload for SP (for the AVR target,
>> SP cannot be a pointer reg - it needs to be reloaded into X Y or Z regs).
>> 6. update_eliminables_and_spill calls targetm.frame_pointer_required,
>> which returns true. That causes can_eliminate for FP->SP to be reset
>> to zero, and FP to be added to bad_spill_regs_global. For the AVR,
>> FP is Y, one of the 3 pointer regs. reload also notes that something
>> has changed, and that the loop needs to run again.
>> 7. reload still calls select_reload_regs, and find_regs fails to find a
>> pointer reg to reload SP, which is unnecessary as FP->SP elimination
>> had been disabled anyway in (6).
>>
>> IOW, reload fails to find pointer regs for an RTL expression that was
>> created when FP->SP elimination was true, even after it turns out that
>> the elimination can't be done after all. The patch tries to detect that
>> - if it knows the loop is going to run again, it silences the failure.
>>
>> Also note that at a different point in the loop, the reload loop starts
>> over if something_was_spilled (line 982-986). If set outside the reload
>> loop by alter_reg, it gets reset at (3) - not sure why. I'd think a
>> "continue" after update_eliminables_and_spill (line 1019-1022) would
>> also work - haven't tested it though.
>
> That's what I was going to ask next. I think if that works for you, I
> think that's an approach that would make more sense.
>
> However, it looks like after this call, and the other one in the same
> loop, should probably be calling finish_spills. It looks like an
> oversight in the existing code that this doesn't happen for the existing
> continue. Please try adding it in both places.
Tried that out - that works too. Bootstrap and regression tests on
x86_64 went fine, and there were no regressions for the avr target as
well - the extra three passes that showed up with the initial patch show
up now too.
I couldn't put a continue before select_reload_regs though.
save_call_clobbered_regs sets up a few things that are cleaned up in
delete_caller_save_insns a little later in the loop. So I only exclude
select_reload_regs - finish_spills and cleanup will happen in both cases.
Does this patch look ok? If yes, could you commit it for me please? I
don't have commit access.
Regards
Senthil
gcc/ChangeLog
2016-04-28 Senthil Kumar Selvaraj <senthil_kumar.selvaraj@atmel.com>
PR target/60040
* reload1.c (reload): Call finish_spills before
restarting reload loop. Skip select_reload_regs
if update_eliminables_and_spill returns true.
gcc/testsuite/ChangeLog
2016-04-28 Sebastian Huber <sebastian.huber@embedded-brains.de>
Matthijs Kooijman <matthijs@stdin.nl>
Senthil Kumar Selvaraj <senthil_kumar.selvaraj@atmel.com>
PR target/60040
* gcc.target/avr/pr60040-1.c: New.
* gcc.target/avr/pr60040-2.c: New.
diff --git gcc/reload1.c gcc/reload1.c
index c2800f8..d6fcece 100644
--- gcc/reload1.c
+++ gcc/reload1.c
@@ -981,7 +981,8 @@ reload (rtx_insn *first, int global)
/* If we allocated another stack slot, redo elimination bookkeeping. */
if (something_was_spilled || starting_frame_size != get_frame_size ())
{
- update_eliminables_and_spill ();
+ if (update_eliminables_and_spill ())
+ finish_spills (global);
continue;
}
@@ -1021,10 +1022,12 @@ reload (rtx_insn *first, int global)
did_spill = 1;
something_changed = 1;
}
-
- select_reload_regs ();
- if (failure)
- goto failed;
+ else
+ {
+ select_reload_regs ();
+ if (failure)
+ goto failed;
+ }
if (insns_need_reload != 0 || did_spill)
something_changed |= finish_spills (global);
diff --git gcc/testsuite/ChangeLog gcc/testsuite/ChangeLog
index 65fe0d0..bbaa3a2 100644
--- gcc/testsuite/ChangeLog
+++ gcc/testsuite/ChangeLog
@@ -1,7 +1,3 @@
-2016-04-26 Bernd Schmidt <bschmidt@redhat.com>
-
- * gcc.target/i386/lzcnt-1.c: Allow a different lzcntw output register.
-
2016-04-25 Richard Biener <rguenther@suse.de>
PR tree-optimization/70780
diff --git gcc/testsuite/gcc.target/avr/pr60040-1.c gcc/testsuite/gcc.target/avr/pr60040-1.c
new file mode 100644
index 0000000..4fe296b
--- /dev/null
+++ gcc/testsuite/gcc.target/avr/pr60040-1.c
@@ -0,0 +1,29 @@
+/* { dg-do compile } */
+/* { dg-options "-Os" } */
+
+float dhistory[10];
+float test;
+
+float getSlope(float history[]) {
+ float sumx = 0;
+ float sumy = 0;
+ float sumxy = 0;
+ float sumxsq = 0;
+ float rate = 0;
+ int n = 10;
+
+ int i;
+ for (i=1; i< 11; i++) {
+ sumx = sumx + i;
+ sumy = sumy + history[i-1];
+ sumy = sumy + history[i-1];
+ sumxsq = sumxsq + (i*i);
+ }
+
+ rate = sumy+sumx+sumxsq;
+ return rate;
+}
+
+void loop() {
+ test = getSlope(dhistory);
+}
diff --git gcc/testsuite/gcc.target/avr/pr60040-2.c gcc/testsuite/gcc.target/avr/pr60040-2.c
new file mode 100644
index 0000000..c40d49f
--- /dev/null
+++ gcc/testsuite/gcc.target/avr/pr60040-2.c
@@ -0,0 +1,112 @@
+/* { dg-do compile } */
+/* { dg-options "-O2" } */
+
+typedef unsigned char __uint8_t;
+typedef short unsigned int __uint16_t;
+typedef long unsigned int __uint32_t;
+typedef __uint8_t uint8_t ;
+typedef __uint16_t uint16_t ;
+typedef __uint32_t uint32_t ;
+typedef __builtin_va_list __gnuc_va_list;
+typedef __gnuc_va_list va_list;
+typedef enum rtems_blkdev_request_op {
+ RTEMS_BLKDEV_REQ_READ,
+} rtems_fdisk_segment_desc;
+typedef struct rtems_fdisk_driver_handlers
+{
+ int (*blank) (const rtems_fdisk_segment_desc* sd,
+ uint32_t device,
+ uint32_t segment,
+ uint32_t offset,
+ uint32_t size);
+} rtems_fdisk_driver_handlers;
+typedef struct rtems_fdisk_page_desc
+{
+ uint16_t flags;
+ uint32_t block;
+} rtems_fdisk_page_desc;
+typedef struct rtems_fdisk_segment_ctl
+{
+ rtems_fdisk_page_desc* page_descriptors;
+ uint32_t pages_active;
+} rtems_fdisk_segment_ctl;
+typedef struct rtems_fdisk_segment_ctl_queue
+{
+} rtems_fdisk_segment_ctl_queue;
+typedef struct rtems_fdisk_device_ctl
+{
+ uint32_t flags;
+ uint8_t* copy_buffer;
+} rtems_flashdisk;
+
+extern void rtems_fdisk_error (const char *, ...);
+extern int rtems_fdisk_seg_write(const rtems_flashdisk*,
+ rtems_fdisk_segment_ctl*,
+ uint32_t,
+ const rtems_fdisk_page_desc* page_desc,
+ uint32_t);
+
+void rtems_fdisk_printf (const rtems_flashdisk* fd, const char *format, ...)
+{
+ {
+ va_list args;
+ __builtin_va_start(args,format);
+ }
+}
+static int
+rtems_fdisk_seg_blank_check (const rtems_flashdisk* fd,
+ rtems_fdisk_segment_ctl* sc,
+ uint32_t offset,
+ uint32_t size)
+{
+ uint32_t device;
+ uint32_t segment;
+ const rtems_fdisk_segment_desc* sd;
+ const rtems_fdisk_driver_handlers* ops;
+ return ops->blank (sd, device, segment, offset, size);
+}
+static int
+rtems_fdisk_seg_write_page_desc (const rtems_flashdisk* fd,
+ rtems_fdisk_segment_ctl* sc,
+ uint32_t page,
+ const rtems_fdisk_page_desc* page_desc)
+{
+ uint32_t offset = page * sizeof (rtems_fdisk_page_desc);
+ if ((fd->flags & (1 << 3)))
+ {
+ int ret = rtems_fdisk_seg_blank_check (fd, sc,
+ offset,
+ sizeof (rtems_fdisk_page_desc));
+ }
+ return rtems_fdisk_seg_write (fd, sc, offset,
+ page_desc, sizeof (rtems_fdisk_page_desc));
+}
+void
+rtems_fdisk_recycle_segment (rtems_flashdisk* fd,
+ rtems_fdisk_segment_ctl* ssc,
+ rtems_fdisk_segment_ctl* dsc,
+ uint32_t *pages)
+{
+ int ret;
+ uint32_t spage;
+ uint32_t used = 0;
+ uint32_t active = 0;
+ {
+ rtems_fdisk_page_desc* spd = &ssc->page_descriptors[spage];
+ {
+ rtems_fdisk_page_desc* dpd;
+ uint32_t dpage;
+ dpd = &dsc->page_descriptors[dpage];
+ *dpd = *spd;
+ ret = rtems_fdisk_seg_write_page_desc (fd,
+ dsc,
+ dpage, dpd);
+ }
+ }
+ rtems_fdisk_printf (fd, "ssc end: %d-%d: p=%ld, a=%ld, u=%ld",
+ pages, active, used);
+ {
+ rtems_fdisk_error ("compacting: ssc pages not 0: %d",
+ ssc->pages_active);
+ }
+}
diff --git gcc/testsuite/gcc.target/i386/lzcnt-1.c gcc/testsuite/gcc.target/i386/lzcnt-1.c
index c7054d1..f6240d1b 100644
--- gcc/testsuite/gcc.target/i386/lzcnt-1.c
+++ gcc/testsuite/gcc.target/i386/lzcnt-1.c
@@ -1,6 +1,6 @@
/* { dg-do compile } */
/* { dg-options "-O2 -mlzcnt " } */
-/* { dg-final { scan-assembler "lzcntw\[^\\n]*(%|)\[ad\]\[xi\]" } } */
+/* { dg-final { scan-assembler "lzcntw\[^\\n]*(%|)ax" } } */
#include <x86intrin.h>
diff --git libcilkrts/ChangeLog libcilkrts/ChangeLog
index 8fada8a..ed26a3a 100644
--- libcilkrts/ChangeLog
+++ libcilkrts/ChangeLog
@@ -1,9 +1,3 @@
-2016-04-26 Rainer Orth <ro@CeBiTec.Uni-Bielefeld.DE>
-
- PR target/60290
- * Makefile.am (GENERAL_FLAGS): Add -funwind-tables.
- * Makefile.in: Regenerate.
-
2015-11-09 Igor Zamyatin <igor.zamyatin@intel.com>
PR target/66326
diff --git libcilkrts/Makefile.am libcilkrts/Makefile.am
index 4f944dd..70538a2 100644
--- libcilkrts/Makefile.am
+++ libcilkrts/Makefile.am
@@ -43,9 +43,6 @@ GENERAL_FLAGS = -I$(top_srcdir)/include -I$(top_srcdir)/runtime -I$(top_srcdir)/
# Enable Intel Cilk Plus extension
GENERAL_FLAGS += -fcilkplus
-# Always generate unwind tables
-GENERAL_FLAGS += -funwind-tables
-
AM_CFLAGS = $(XCFLAGS) $(GENERAL_FLAGS) -std=c99
AM_CPPFLAGS = $(GENERAL_FLAGS)
AM_LDFLAGS = $(XLDFLAGS)
diff --git libcilkrts/Makefile.in libcilkrts/Makefile.in
index a25d1c6..629aa6a 100644
--- libcilkrts/Makefile.in
+++ libcilkrts/Makefile.in
@@ -371,11 +371,9 @@ ACLOCAL_AMFLAGS = -I .. -I ../config
# GENERAL_FLAGS += -D_Cilk_spawn="" -D_Cilk_sync="" -D_Cilk_for=for
# Enable Intel Cilk Plus extension
-
-# Always generate unwind tables
GENERAL_FLAGS = -I$(top_srcdir)/include -I$(top_srcdir)/runtime \
-I$(top_srcdir)/runtime/config/$(config_dir) \
- -DIN_CILK_RUNTIME=1 -fcilkplus -funwind-tables
+ -DIN_CILK_RUNTIME=1 -fcilkplus
AM_CFLAGS = $(XCFLAGS) $(GENERAL_FLAGS) -std=c99
AM_CPPFLAGS = $(GENERAL_FLAGS)
AM_LDFLAGS = $(XLDFLAGS)