[Bug target/83850] New: [8 Regression] Spills on vector extract, gcc.target/i386/pr80846-1.c FAILs

Mon Jan 15 10:16:00 GMT 2018

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83850

            Bug ID: 83850
           Summary: [8 Regression] Spills on vector extract,
                    gcc.target/i386/pr80846-1.c FAILs
           Product: gcc
           Version: 8.0
            Status: UNCONFIRMED
          Keywords: missed-optimization, ra
          Severity: normal
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: rguenth at gcc dot gnu.org
  Target Milestone: ---
            Target: x86_64-*-*

This regressed between the development of the PR80846 patch and the point
where it landed on trunk.  We now spill the accumulator inside the loop:

    .L2:
        vmovdqa64       (%esp), %zmm3
        vpaddd  (%eax), %zmm3, %zmm2
        addl    $64, %eax
        vmovdqa64       %zmm2, (%esp)
        cmpl    %eax, %edx
        jne     .L2

which is because of the extraction of the lower/upper half with

  _20 = BIT_FIELD_REF <vect_sum_11.4_6, 256, 0>;
  _13 = BIT_FIELD_REF <vect_sum_11.4_6, 256, 256>;
  _18 = _13 + _20;

which, when TERed into the add, looks like

(insn 16 15 18 4 (set (reg:V8SI 107)
        (plus:V8SI (subreg:V8SI (reg:V16SI 94 [ vect_sum_11.4 ]) 32)
            (subreg:V8SI (reg:V16SI 94 [ vect_sum_11.4 ]) 0))) 3004 {*addv8si3}
     (expr_list:REG_DEAD (reg:V16SI 94 [ vect_sum_11.4 ])
        (nil)))

before IRA/LRA, but LRA then ends up spilling (reg:V16SI 94) to the stack.
That defeats the optimization intended by the PR80846 fix.
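
For reference, a minimal C sketch of the pattern involved. This is not the
actual gcc.target/i386/pr80846-1.c source: sum_ints is just a plain reduction
loop of the kind the vectorizer handles, and add_halves models by hand the
first epilogue step quoted above (low 256 bits plus high 256 bits of the
512-bit accumulator). All names (sum_ints, add_halves, sum_via_halves, v16si,
v8si) are made up for illustration, and the vector types assume the GNU C
vector extensions.

```c
#include <assert.h>
#include <string.h>

/* GNU C vector extension types (GCC/Clang); names are illustrative only. */
typedef int v16si __attribute__((vector_size(64)));
typedef int v8si  __attribute__((vector_size(32)));

/* Scalar reduction loop of the kind the vectorizer turns into a
   V16SI accumulator plus a reduction epilogue. */
int sum_ints(const int *arr, int n)
{
    int sum = 0;
    for (int i = 0; i < n; i++)
        sum += arr[i];
    return sum;
}

/* Hand-written model of the first epilogue step:
   lo = BIT_FIELD_REF <acc, 256, 0>, hi = BIT_FIELD_REF <acc, 256, 256>,
   result = hi + lo. */
static void add_halves(const v16si *acc, v8si *out)
{
    v8si lo, hi;
    memcpy(&lo, (const char *)acc, sizeof lo);              /* bits 0..255   */
    memcpy(&hi, (const char *)acc + sizeof lo, sizeof hi);  /* bits 256..511 */
    *out = hi + lo;
}

/* Same reduction written with an explicit 16-lane accumulator, then the
   halves are added and the remaining 8 lanes summed one by one. */
int sum_via_halves(const int *arr, int n)
{
    assert(n % 16 == 0);            /* sketch only handles whole vectors */
    v16si acc = { 0 };
    for (int i = 0; i < n; i += 16) {
        v16si chunk;
        memcpy(&chunk, arr + i, sizeof chunk);
        acc += chunk;               /* the add that should stay in a register */
    }
    v8si half;
    add_halves(&acc, &half);
    int total = 0;
    for (int i = 0; i < 8; i++)
        total += half[i];
    return total;
}
```

Compiling something along these lines with -O3 -mavx512f (and -m32, judging
by the %esp/%eax operands in the asm above; the exact options are an
assumption) and looking at the inner loop should show whether the accumulator
stays in a zmm register or gets reloaded from the stack on each iteration.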