This is the mail archive of the gcc-bugs@gcc.gnu.org mailing list for the GCC project.


[Bug middle-end/55653] New: Unnecessary initialization of vector register


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=55653

             Bug #: 55653
           Summary: Unnecessary initialization of vector register
    Classification: Unclassified
           Product: gcc
           Version: 4.8.0
            Status: UNCONFIRMED
          Severity: enhancement
          Priority: P3
         Component: middle-end
        AssignedTo: unassigned@gcc.gnu.org
        ReportedBy: josh.m.conner@gmail.com


When every lane of a vector register is initialized, the register is first
set to zero and then each lane is written individually, so the zeroing is
redundant and results in extra code.

Specifically, I'm looking at the aarch64 target, with the following source:

void
fmla_loop (double * restrict result, double * restrict mul1,
       double mul2, int size)
{
  int i;

  for (i = 0; i < size; i++)
    result[i] = result[i] + mul1[i] * mul2;
}

Compiled with:

aarch64-linux-gnu-gcc -std=c99 -O3 -ftree-vectorize -S -o test.s test.c

The resultant code to initialize a vector register with two instances of mul2
is:

  adr     x3, .LC0
  ld1     {v3.2d}, [x3]
  ins     v3.d[0], v0.d[0]
  ins     v3.d[1], v0.d[0]
...
.LC0:
  .word   0
  .word   0
  .word   0
  .word   0

The first two instructions (which load zero into the vector register) are
unnecessary, as is the space for .LC0.
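
For reference, the value being built is just mul2 duplicated across both
lanes. Below is a minimal sketch of that operation written directly with the
GCC vector extension, which may serve as a smaller reproducer. The names v2df,
splat_double and fmla_v2df are hypothetical (they are not part of the testcase
above), and I have not confirmed that this form goes through the same path in
store_constructor.

```c
/* Two-lane double vector via the GCC vector extension.  The names
   v2df, splat_double and fmla_v2df are hypothetical.  */
typedef double v2df __attribute__ ((vector_size (16)));

/* Build a vector with both lanes set to X.  Every lane is written, so
   zeroing the destination first should not be required.  */
v2df
splat_double (double x)
{
  v2df v = { x, x };
  return v;
}

/* One vector step of the loop body: result + mul1 * mul2.  */
v2df
fmla_v2df (v2df result, v2df mul1, double mul2)
{
  return result + mul1 * splat_double (mul2);
}
```

Compiling this with the same options as above and looking at the code for
splat_double should show whether the zero load is still emitted.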

Note that this initialization is being performed here in store_constructor:

        /* Inform later passes that the old value is dead.  */
        if (!cleared && !vector && REG_P (target))
          emit_move_insn (target, CONST0_RTX (GET_MODE (target)));

right after another check that determines whether the vector needs to be
cleared out (and concludes that it doesn't).

Instead of the emit_move_insn, that code used to be:

       emit_insn (gen_rtx_CLOBBER (VOIDmode, target));

This was changed in r101169, with the comment:

  "The expr.c change elides an extra move that's creeped in since we
changed clobbered values to get new registers in reload."

(see full checkin text here:
http://gcc.gnu.org/ml/gcc-patches/2005-06/msg01584.html)

It's not clear to me whether this can be changed back, or if later passes
should be recognizing this initialization as redundant, or whether we need a
new expand pattern to match vector fill (vector duplicate).  At any rate, the
code is certainly not ideal as it stands.
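
To illustrate the last option, here is a stand-alone model of the check a
vector-fill pattern would need: are all lanes of the constructor the same
operand? This is a sketch only. It is plain C rather than GCC internals, the
function name is hypothetical, and operands are modelled as integer ids (for
example pseudo register numbers), which is an assumption made for the example.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model, not GCC code.  Return true if all N lanes refer
   to the same operand id, in which case the constructor could be
   expanded as a single duplicate instead of N lane inserts.  */
bool
all_lanes_same_operand (const int *operand_ids, size_t n)
{
  for (size_t i = 1; i < n; i++)
    if (operand_ids[i] != operand_ids[0])
      return false;

  /* An empty constructor is not a duplicate of anything.  */
  return n > 0;
}
```

For the testcase above both lanes come from the same scalar (mul2), so a check
of this kind would succeed and neither the zeroing nor the two ins
instructions would be needed.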

Thanks!

