User account creation filtered due to spam.

Bug 65342 - [5/6/7/8 Regression] FAIL: gfortran.dg/intrinsic_(un)?pack_1.f90 -O1 execution test on powerpc-apple-darwin9 after r210201
Summary: [5/6/7/8 Regression] FAIL: gfortran.dg/intrinsic_(un)?pack_1.f90 -O1 execu...
Status: NEW
Alias: None
Product: gcc
Classification: Unclassified
Component: target (show other bugs)
Version: 5.0
: P4 normal
Target Milestone: 5.5
Assignee: Not yet assigned to anyone
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2015-03-07 10:01 UTC by Dominique d'Humieres
Modified: 2016-06-03 10:06 UTC (History)
4 users (show)

See Also:
Host: powerpc-apple-darwin9
Target: powerpc-apple-darwin9
Build: powerpc-apple-darwin9
Known to work:
Known to fail:
Last reconfirmed: 2015-03-07 00:00:00


Attachments
workaround (480 bytes, patch)
2015-03-10 12:44 UTC, Alan Modra
Details | Diff
Patch for discussion (2.86 KB, patch)
2015-03-16 09:38 UTC, Iain Sandoe
Details | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Dominique d'Humieres 2015-03-07 10:01:52 UTC
On powerpc-apple-darwin9 I see the following failures with -m64 after r210201

FAIL: gfortran.dg/intrinsic_pack_1.f90   -O1  execution test
FAIL: gfortran.dg/intrinsic_pack_1.f90   -O2  execution test
FAIL: gfortran.dg/intrinsic_unpack_1.f90   -O1  execution test
FAIL: gfortran.dg/intrinsic_unpack_1.f90   -O2  execution test
FAIL: gfortran.dg/intrinsic_unpack_1.f90   -O3 -fomit-frame-pointer  execution test
FAIL: gfortran.dg/intrinsic_unpack_1.f90   -O3 -fomit-frame-pointer -funroll-loops  execution test
FAIL: gfortran.dg/intrinsic_unpack_1.f90   -O3 -fomit-frame-pointer -funroll-all-loops -finline-functions  execution test
FAIL: gfortran.dg/intrinsic_unpack_1.f90   -O3 -g  execution test

The failures occur only in the following reduced codes

program intrinsic_unpack
   implicit none
   integer(kind=2), dimension(3, 3) :: a2, b2

   logical, dimension(3, 3) :: mask
   character(len=500) line1, line2
   integer i

   mask = reshape ((/.false.,.true.,.false.,.true.,.false.,.false.,&
                    &.false.,.false.,.true./), (/3, 3/));
   a2 = reshape ((/1, 0, 0, 0, 1, 0, 0, 0, 1/), (/3, 3/));
   b2 = unpack ((/2_2, 3_2, 4_2/), mask, a2)
   print *, b2
   if (any (b2 .ne. reshape ((/1, 2, 0, 3, 1, 0, 0, 0, 4/), (/3, 3/)))) &
      call abort

end program

program main
  implicit none
  integer :: i
  integer(kind=2), dimension(3,3) :: i2
  integer(kind=2), dimension(9) :: vi2
  integer(kind=2), dimension(9) :: ri2

  vi2 = (/(i+10,i=1,9)/)
  i2 = reshape((/1_2, -1_2, 2_2, -2_2, 3_2, -3_2, 4_2, -4_2, 5_2/), shape(i2))
  ri2 = pack(i2,i2>0,vi2)
  if (any(ri2 /= (/1_2, 2_2, 3_2, 4_2, 5_2, 16_2, 17_2, 18_2, 19_2/))) &
       & call abort

end program main

all the other checks in the original tests succeed.

The difference between the assembly generated by r210200 (-) and r210201 (+) is

--- intrinsic_unpack_1_red.s	2015-03-07 10:11:12.000000000 +0100
+++ intrinsic_unpack_1_red_b.s	2015-03-07 10:10:46.000000000 +0100
@@ -26,18 +26,14 @@ L1$pb:
 	std r10,172(r1)
 	lwz r2,32(r2)
 	stw r2,180(r1)
-	addis r9,r31,ha16(_A.1.1600-L1$pb)
-	la r2,lo16(_A.1.1600-L1$pb)(r9)
-	lwz r7,lo16(_A.1.1600-L1$pb)(r9)
-	lwz r8,4(r2)
-	lwz r10,8(r2)
-	lwz r9,12(r2)
-	stw r7,112(r1)
-	stw r8,116(r1)
-	stw r10,120(r1)
-	stw r9,124(r1)
-	lhz r2,16(r2)
-	sth r2,128(r1)
+	addis r2,r31,ha16(_A.1.1600-L1$pb)
+	la r9,lo16(_A.1.1600-L1$pb)(r2)
+	ld r2,lo16(_A.1.1600-L1$pb)(r2)
+	ld r10,8(r9)
+	lhz r9,16(r9)
+	std r2,112(r1)
+	std r10,120(r1)
+	sth r9,128(r1)
 	li r25,138
 	std r25,752(r1)
 	li r30,1
Comment 1 Alan Modra 2015-03-07 13:27:22 UTC
Confirmed.  The problem occurs in fwprop1 where instructions corresponding to the following assembly
	addis r2,r31,ha16(_A.1.1600-L1$pb)
	la r9,lo16(_A.1.1600-L1$pb)(r2)
	ld r2,0(r9)
are combined to
	addis r2,r31,ha16(_A.1.1600-L1$pb)
	la r9,lo16(_A.1.1600-L1$pb)(r2)
	ld r2,lo16(_A.1.1600-L1$pb)(r2)
ie. the offset is propagated into the memory load.  This ought to give you an error at assembly or link time.  If not, you have a bad assembler or linker..

movdi_low is the culprit, I think.  It should require a suitably aligned offset (operand 2).
Comment 2 Dominique d'Humieres 2015-03-07 13:40:00 UTC
> Confirmed.  The problem occurs in fwprop1 where instructions corresponding to the
> following assembly
>	addis r2,r31,ha16(_A.1.1600-L1$pb)
>	la r9,lo16(_A.1.1600-L1$pb)(r2)
>	ld r2,0(r9)
> are combined to
>	addis r2,r31,ha16(_A.1.1600-L1$pb)
>	la r9,lo16(_A.1.1600-L1$pb)(r2)
>	ld r2,lo16(_A.1.1600-L1$pb)(r2)
> ie. the offset is propagated into the memory load.  This ought to give you
> an error at assembly or link time.

No error at assembly or link time.

> If not, you have a bad assembler or linker..

Well, we'll have to live with them!-(EOL target).

> movdi_low is the culprit, I think.  It should require a suitably aligned
> offset (operand 2).

How?
Comment 3 Iain Sandoe 2015-03-07 16:55:19 UTC
(In reply to Dominique d'Humieres from comment #2)
> > Confirmed.  The problem occurs in fwprop1 where instructions corresponding to the
> > following assembly
> >	addis r2,r31,ha16(_A.1.1600-L1$pb)
> >	la r9,lo16(_A.1.1600-L1$pb)(r2)
> >	ld r2,0(r9)
> > are combined to
> >	addis r2,r31,ha16(_A.1.1600-L1$pb)
> >	la r9,lo16(_A.1.1600-L1$pb)(r2)
> >	ld r2,lo16(_A.1.1600-L1$pb)(r2)
> > ie. the offset is propagated into the memory load.  This ought to give you
> > an error at assembly or link time.
> 
> No error at assembly or link time.

> > If not, you have a bad assembler or linker..
> 
> Well, we'll have to live with them!-(EOL target).

The system as is based on gas-1.38 and doesn't warn for this :-(
ld64 is slightly better and does catch a few more cases (where they resolve at link-time).

.. there's some hope for my WIP GAS port and an updated ld64 (yeah, I know it's taking a long time for these to appear) ..

> > movdi_low is the culprit, I think.  It should require a suitably aligned
> > offset (operand 2).
> 
> How?

In the mdynamic-no-pic case, the literal value should work the same as for linux.

In the case of PIC, I suspect we need to look through the uspecs that wrap mach-o PIC offsets and try to determine if the alignment of the referenced symbol is guaranteed to be "enough".  The alignment of the picbase will always be >= 4.

Some time ago I had a WIP patch to try and deal with this … but it's bit-rotted so needs a significant re-visit.
Comment 4 Alan Modra 2015-03-10 08:22:51 UTC
Here's another failing powerpc-darwin testcase due to movdi_low (movdf_low_di and their store counterparts have the same problem of course).

/* -m64 -O1 -S -fno-pic  */
struct {
  char c;
  long l;
} __attribute__ ((__packed__)) x;

long get_x (void)
{
  return x.l;
}
Comment 5 Iain Sandoe 2015-03-10 08:44:22 UTC
(In reply to Alan Modra from comment #4)
> Here's another failing powerpc-darwin testcase due to movdi_low
> (movdf_low_di and their store counterparts have the same problem of course).
> 
> /* -m64 -O1 -S -fno-pic  */

for Darwin, -mdynamic-no-pic ^ would be a real-life User case.

> struct {
>   char c;
>   long l;
> } __attribute__ ((__packed__)) x;
> 
> long get_x (void)
> {
>   return x.l;
> }

Thanks, Alan.
I took a quick look over the weekend (out of the office last week).

So, do you think I need to split things up so that the d-mode and ds-mode insn fragments can have different match conditions? - or is the issue that these patterns have no reload constraints (in which case I could at least investigate adding Y to the relevant cases)?
Comment 6 Alan Modra 2015-03-10 12:44:35 UTC
Created attachment 35001 [details]
workaround

You might like to consider this patch that effectively reverts r210201 for Darwin.  This should cure the regression.
Comment 7 Iain Sandoe 2015-03-10 12:59:21 UTC
(In reply to Alan Modra from comment #6)
> Created attachment 35001 [details]
> workaround
> 
> You might like to consider this patch that effectively reverts r210201 for
> Darwin.  This should cure the regression.

Yeah. I did something similar as a fall-back .. but (unless we run out of time) I'd rather fix the underlying problem - since it occurs in other places too.

[if we do run out of time, then I'm fine with applying the fallback].
Comment 8 Iain Sandoe 2015-03-10 13:13:02 UTC
BTW, is:

(define_insn "movdi_low_st"
  [(set (mem:DI (lo_sum:DI (match_operand:DI 1 "gpc_reg_operand" "b,b,b")
                           (match_operand 2 "" "Y,,")))
	(match_operand:DI 0 "gpc_reg_operand" "r,r,*!d"))]
  "TARGET_MACHO && TARGET_64BIT"
  "*
{
  switch (which_alternative)
    {
      case 0:
	return \"std %0,lo16(%2)(%1)\";
      case 1:
	{
	  output_asm_insn (\"la %1,lo16(%2)(%1)\", operands);

^^^^^
permitted? (i.e. modifying %1, which is an input operand)
	  return (\"std %0,0(%1)\");
	}
      case 2:
	return \"stfd %0,lo16(%2)(%1)\";
      default:
	gcc_unreachable ();
    }
}"
  [(set_attr "type" "store")
   (set_attr "length" "4")])
Comment 9 Alan Modra 2015-03-10 13:33:00 UTC
As far as fixing the real underlying problem goes, I'm not so familiar with the darwin support that I can state with certainty that you need to fix movdi_low and friends.

It might help to explain why powerpc64-linux -mcmodel=medium code doesn't hit the intrinsic_unpack problem even though the resultant code looks very similar to darwin code.  We don't see a similar propagation of offset into memory load because legitimate_addres_p() says that isn't valid.  It isn't valid because legitimate_constant_pool_address_p (also handles toc-relative medium model addresses) checks alignment.  So fwprop and combine don't create the problem memory loads in the first place.  However, if we did see such a memory load, reload would fix it for us due to the predicate and constraints in movdi_internal64.  Note that having legitimate_address_p() prevent these addresses does mean we lose some optimization opportunites, but on the other hand, if we allowed everything early we'd end up with lots of reloads and probably worse code.

You may be able to do something similar in legitimate_address_p for darwin.
Comment 10 Alan Modra 2015-03-10 13:36:53 UTC
> permitted? (i.e. modifying %1, which is an input operand)

Yes.  You're outputting assembly, practically anything goes.
Comment 11 Iain Sandoe 2015-03-16 09:38:44 UTC
Created attachment 35039 [details]
Patch for discussion

OK so this is a frustrating area to debug.  One can see the problem easily enough - but finding where it's introduced ... 

Question:
 Is there a reason that we don't have a constraint for DS-mode operands?
 Since there are so few constraint letters left, I have not attempted to add one - but curious about the reason.

The patch is intended to do the following:
 1. record that macho-pic indirections are always sufficiently aligned that a ds-mode offset is applicable (they reside in special sections with guaranteed alignment) - the picbase is also guaranteed to be aligned to 32b since it's a code label.
 - changes to config/darwin.{c,h}

 2. I've added a new predicate (put it in darwin.md, for now since I'm just testing).
 - the predicate tries to look through the various cases to determine when a ds-mode operand is safe.

 3. This is the bit that is in generic code and needs review:

ISTM that the Y constraint code needs an additional check - at least for Darwin - if *only* for darwin, then I'm happy to conditionalise it on TARGET_MACHO.  However, it's not obvious to me that no other target could be affected.

What can happen is that the offset to the mem returns NULL_RTX (no offset) but there's actually quite a complex address expression - and that might not resolve to a sufficiently aligned object.  Presently, the code just assumes that no offset means it's OK.


--- a/gcc/config/rs6000/rs6000.c
+++ b/gcc/config/rs6000/rs6000.c
@@ -6378,11 +6378,20 @@ mem_operand_gpr (rtx op, machine_mode mode)
 {
   unsigned HOST_WIDE_INT offset;
   int extra;
+  unsigned align = MEM_ALIGN (op);
   rtx addr = XEXP (op, 0);
-
   op = address_offset (addr);
   if (op == NULL_RTX)
-    return true;
+    {
+      /* No offset:
+         A naked reg is OK - any alignment works.  */
+      if (REG_P (addr))
+        return true ;
+      /* Otherwise, we assume there will be a relocation, and we can't
+        guarantee it will fit unless the object referred to is sufficiently
+        aligned.  */
+      return align >= 32;
+    }


NOTE: there are some debug changes in the patch - this is not intended for submission until point 3 is reviewed.
Comment 12 Alan Modra 2015-03-16 12:02:33 UTC
We won't want that mem_operand_gpr change for Linux or AIX as we do the alignment checking of more complex expressions in legitimate_address_p.
Do take heed to the comment about accepting odd rtl generated by reload..
Comment 13 Iain Sandoe 2015-03-16 12:09:06 UTC
(In reply to Alan Modra from comment #12)
> We won't want that mem_operand_gpr change for Linux or AIX as we do the
> alignment checking of more complex expressions in legitimate_address_p.
> Do take heed to the comment about accepting odd rtl generated by reload..

Hmm.. OK.
FWIW, I did try to track through legitimate_address_p to see if i could identify where the problem was being introduced (without much luck). Will have to look again.
Comment 14 Dominique d'Humieres 2015-03-25 13:26:54 UTC
Results with the patch in comment 11 at

https://gcc.gnu.org/ml/gcc-testresults/2015-03/msg02484.html

Note that the patch in comment 6 also fixes this PR. It is probably the best short term solution. Could it be committed for 5.1?
Comment 15 Iain Sandoe 2015-04-09 10:13:36 UTC
(In reply to Dominique d'Humieres from comment #14)
> Results with the patch in comment 11 at
> 
> https://gcc.gnu.org/ml/gcc-testresults/2015-03/msg02484.html
> 
> Note that the patch in comment 6 also fixes this PR. It is probably the best
> short term solution. Could it be committed for 5.1?

I concur, there's no way I'm going to get any more cycles on this before GCC5 RC1 is built (probably in the next day or two).
 
Will try to find some time to address it properly (since there is a real underlying issue) for 6/5.2.
Comment 16 Dominique d'Humieres 2015-04-09 10:41:40 UTC
> Note that the patch in comment 6 also fixes this PR. It is probably the best
> short term solution. Could it be committed for 5.1?

Results with this patch at https://gcc.gnu.org/ml/gcc-testresults/2015-04/msg00483.html. The failures are also present without the patch.
Comment 17 Jakub Jelinek 2015-04-22 11:58:46 UTC
GCC 5.1 has been released.
Comment 18 Richard Biener 2015-07-16 09:13:26 UTC
GCC 5.2 is being released, adjusting target milestone to 5.3.
Comment 19 Richard Biener 2015-12-04 10:46:41 UTC
GCC 5.3 is being released, adjusting target milestone.
Comment 20 Richard Biener 2016-06-03 10:06:25 UTC
GCC 5.4 is being released, adjusting target milestone.