This is the mail archive of the gcc-bugs@gcc.gnu.org mailing list for the GCC project.


[Bug tree-optimization/67530] New: Failure to eliminate dead code produced by vector lowering


https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67530

            Bug ID: 67530
           Summary: Failure to eliminate dead code produced by vector
                    lowering
           Product: gcc
           Version: 6.0
            Status: UNCONFIRMED
          Keywords: missed-optimization
          Severity: normal
          Priority: P3
         Component: tree-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: wschmidt at gcc dot gnu.org
                CC: bergner at gcc dot gnu.org, rguenth at gcc dot gnu.org
  Target Milestone: ---
              Host: powerpc64le-unknown-linux-gnu
            Target: powerpc64le-unknown-linux-gnu
             Build: powerpc64le-unknown-linux-gnu

Test case gcc.dg/fold-compare-7.c is as follows:

/* { dg-do compile } */
/* { dg-options "-O2" } */

typedef float vecf __attribute__((vector_size(8*sizeof(float))));

long f(vecf *f1, vecf *f2){
  return ((*f1 == *f2) < 0)[2];
}

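The initial GIMPLE is simple (compiled with -O2).  The "< 0" is already gone,
since each element of a vector comparison result is either -1 or 0, so testing
it for "< 0" yields the same value: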

f (vecf * f1, vecf * f2)
{
  long int D.2301;
  vector(8) int D.2299;
  vector(8) float D.2302;
  vector(8) float D.2303;
  vector(8) int D.2304;
  int D.2305;

  D.2302 = *f1;
  D.2303 = *f2;
  D.2304 = D.2302 == D.2303;
  D.2299 = D.2304;
  D.2305 = BIT_FIELD_REF <D.2299, 32, 64>;
  D.2301 = (long int) D.2305;
  return D.2301;
}

However, vector lowering expands the single vector comparison into eight
element-wise comparisons, each fed by two BIT_FIELD_REF extractions.  Only the
comparison for element 2 is actually used.  The optimized tree dump shows:

  <bb 2>:
  _3 = *f1_2(D);
  _5 = *f2_4(D);
  _9 = BIT_FIELD_REF <_3, 32, 0>;
  _10 = BIT_FIELD_REF <_5, 32, 0>;
  _11 = _9 == _10 ? -1 : 0;
  _12 = BIT_FIELD_REF <_3, 32, 32>;
  _13 = BIT_FIELD_REF <_5, 32, 32>;
  _14 = _12 == _13 ? -1 : 0;
  _15 = BIT_FIELD_REF <_3, 32, 64>;
  _16 = BIT_FIELD_REF <_5, 32, 64>;
  _17 = _15 == _16 ? -1 : 0;
  _18 = BIT_FIELD_REF <_3, 32, 96>;
  _19 = BIT_FIELD_REF <_5, 32, 96>;
  _20 = _18 == _19 ? -1 : 0;
  _21 = BIT_FIELD_REF <_3, 32, 128>;
  _22 = BIT_FIELD_REF <_5, 32, 128>;
  _23 = _21 == _22 ? -1 : 0;
  _24 = BIT_FIELD_REF <_3, 32, 160>;
  _25 = BIT_FIELD_REF <_5, 32, 160>;
  _26 = _24 == _25 ? -1 : 0;
  _27 = BIT_FIELD_REF <_3, 32, 192>;
  _28 = BIT_FIELD_REF <_5, 32, 192>;
  _29 = _27 == _28 ? -1 : 0;
  _30 = BIT_FIELD_REF <_3, 32, 224>;
  _31 = BIT_FIELD_REF <_5, 32, 224>;
  _32 = _30 == _31 ? -1 : 0;
  _6 = {_11, _14, _17, _20, _23, _26, _29, _32};
  _7 = _17;
  _8 = (long int) _17;
  return _8;

Note that the only statements in here that matter are:

  _3 = *f1_2(D);
  _5 = *f2_4(D);
  _15 = BIT_FIELD_REF <_3, 32, 64>;
  _16 = BIT_FIELD_REF <_5, 32, 64>;
  _17 = _15 == _16 ? -1 : 0;
  _8 = (long int) _17;
  return _8;
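
In source terms, those seven statements amount to comparing lane 2 alone.  As a
rough sketch (f_lane2 is a hypothetical name used only for illustration, it is
not part of the test case), the function could be hand-reduced as follows,
using GCC's vector subscripting extension:

```c
typedef float vecf __attribute__((vector_size(8*sizeof(float))));

/* The function from gcc.dg/fold-compare-7.c, unchanged. */
long f(vecf *f1, vecf *f2){
  return ((*f1 == *f2) < 0)[2];
}

/* Hypothetical hand-reduced form: extract lane 2 of each operand and
   compare only that pair, yielding -1 (equal) or 0 (not equal).  */
long f_lane2(vecf *f1, vecf *f2){
  return (*f1)[2] == (*f2)[2] ? -1 : 0;
}
```

Both functions should return the same value for any pair of inputs, so ideally
the code generated for f would be no worse than what is generated for f_lane2.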

We end up generating really horrible code for this, since all eight lanes are
extracted and compared even though only lane 2 is used.  The middle end should
be able to detect that _6 is dead and clean up everything that feeds only _6.
We don't even get rid of the dead copy _7 = _17.

I haven't looked at this carefully, but presumably this is due to the late
running of pass_lower_vector.  Perhaps running DCE again would be appropriate
if pass_lower_vector makes any changes?
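
To illustrate what a DCE run after lowering would be expected to find, here is
a toy liveness marker over a model of the block above.  This is only a sketch
and not GCC's DCE implementation: struct stmt, build_lowered_block and
mark_live are made-up names, the SSA numbers are copied from the dump, and it
assumes the two loads are non-volatile so that only the return is inherently
live.

```c
#include <stdbool.h>

enum { MAX_USES = 8, MAX_STMTS = 32, MAX_NAMES = 64 };

/* Toy statement: one SSA definition (or -1 for the return) plus its uses. */
struct stmt {
  int def;
  int uses[MAX_USES];
  int nuses;
  bool is_root;   /* inherently live; here only the return statement */
};

/* Model the lowered block from the optimized dump; returns statement count. */
static int build_lowered_block(struct stmt *s)
{
  int n = 0;
  s[n++] = (struct stmt){ .def = 3 };                 /* _3 = *f1 */
  s[n++] = (struct stmt){ .def = 5 };                 /* _5 = *f2 */
  for (int lane = 0; lane < 8; lane++) {
    int a = 9 + 3 * lane, b = a + 1, c = a + 2;
    s[n++] = (struct stmt){ .def = a, .uses = { 3 }, .nuses = 1 };
    s[n++] = (struct stmt){ .def = b, .uses = { 5 }, .nuses = 1 };
    s[n++] = (struct stmt){ .def = c, .uses = { a, b }, .nuses = 2 };
  }
  s[n++] = (struct stmt){ .def = 6, .nuses = 8,       /* _6 = {...} */
                          .uses = { 11, 14, 17, 20, 23, 26, 29, 32 } };
  s[n++] = (struct stmt){ .def = 7, .uses = { 17 }, .nuses = 1 };
  s[n++] = (struct stmt){ .def = 8, .uses = { 17 }, .nuses = 1 };
  s[n++] = (struct stmt){ .def = -1, .uses = { 8 }, .nuses = 1,
                          .is_root = true };          /* return _8 */
  return n;
}

/* Walk backwards: a statement is live if it is a root or if its definition
   is needed by a later live statement.  One pass suffices here because every
   definition precedes its uses in this single block.  Returns live count.  */
static int mark_live(const struct stmt *s, int n, bool *live)
{
  bool needed[MAX_NAMES] = { false };
  int count = 0;
  for (int i = n - 1; i >= 0; i--) {
    live[i] = s[i].is_root || (s[i].def >= 0 && needed[s[i].def]);
    if (!live[i])
      continue;
    count++;
    for (int u = 0; u < s[i].nuses; u++)
      needed[s[i].uses[u]] = true;
  }
  return count;
}
```

On this model, of the 30 statements only the seven listed earlier are marked
live; _6, _7 and the other seven lanes' extractions and comparisons are all
unreachable from the return.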

