PATCH: PR target/59379: [4.9 Regression] gomp_init_num_threads is compiled into an infinite loop with --with-arch=corei7 --with-cpu=slm

H.J. Lu hjl.tools@gmail.com
Sun Jan 19 14:30:00 GMT 2014


On Sun, Jan 19, 2014 at 6:24 AM, Uros Bizjak <ubizjak@gmail.com> wrote:
> On Sun, Jan 19, 2014 at 3:20 PM, H.J. Lu <hjl.tools@gmail.com> wrote:
>
>>>> For LEA operation with SImode_address_operand, which zero-extends SImode
>>>> to DImode, ix86_split_lea_for_addr turns
>>>>
>>>> (set (reg:DI) ...)
>>>>
>>>> into
>>>>
>>>> (set (reg:SI) ...)
>>>>
>>>> We need to do
>>>>
>>>> (set (reg:DI) (zero_extend:DI (reg:SI)))
>>>>
>>>> at the end. If the LEA operation is
>>>>
>>>> (set (reg:DI) (zero_extend:DI (reg:SI)))
>>>
>>> ree pass should remove these. However, we can just emit zero-extend
>>> insn at the end of sequence, and ree (which is located after
>>> post-reload split) should handle it:
>>>
>>> --cut here--
>>> Index: config/i386/i386.md
>>> ===================================================================
>>> --- config/i386/i386.md (revision 206753)
>>> +++ config/i386/i386.md (working copy)
>>> @@ -5428,12 +5428,17 @@
>>>    operands[0] = SET_DEST (pat);
>>>    operands[1] = SET_SRC (pat);
>>>
>>> -  /* Emit all operations in SImode for zero-extended addresses.  Recall
>>> -     that x86_64 inheretly zero-extends SImode operations to DImode.  */
>>> +  /* Emit all operations in SImode for zero-extended addresses.  */
>>>    if (SImode_address_operand (operands[1], VOIDmode))
>>>      mode = SImode;
>>>
>>>    ix86_split_lea_for_addr (curr_insn, operands, mode);
>>> +
>>> +  /* Zero-extend return register to DImode for zero-extended addresses.  */
>>> +  if (mode != <MODE>mode)
>>> +    emit_insn (gen_zero_extendsidi2
>>> +              (operands[0], gen_lowpart ((mode), operands[0])));
>>> +
>>>    DONE;
>>>  }
>>>    [(set_attr "type" "lea")
>>> --cut here--
>>>
>>> The patch was tested with a testcase from Comment #9 of the PR using
>>> "-O -march=corei7 -mtune=slm", and the resulting binary runs without
>>> problems.
>>>
>>
>> Yes, the resulting GCC works correctly.  However, we generate
>> extra
>>
>> (set (reg:DI) (zero_extend:DI (reg:SI)))
>>
>> It is because we generate
>>
>> (set (reg:SI) (reg:SI))
>> (set (reg:DI) (zero_extend:DI (reg:SI)))
>>
>> REE pass doesn't know
>>
>> (set (reg:SI) (reg:SI))
>>
>> has an implicit ZERO_EXTEND.  Here is a testcase:
>
> This is the correct sequence, and the REE pass should be improved to
> handle this situation.
>
> Note that the problem was that we assumed SImode operations
> (including moves) have an implicit DImode zero-extend, but in fact we
> haven't communicated this to the compiler in a proper way.
>
> So, I propose we go with my patch and file an enhancement PR for the REE pass.
>

That is fine with me.  Please install it on all affected branches
and close the PR.  I will open a new PR for REE pass.
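
For reference, here is a rough C sketch of the semantics discussed in
this thread. It is only a model and not GCC code: the function names and
the way the "register" is represented are made up for illustration. The
first function gives the value a zero-extended LEA should leave in the
DImode destination. The second models what the RTL describes when the
split emits only SImode sets with no explicit zero_extend: the low 32
bits are defined, and nothing is said about the upper 32 bits, so the
model keeps whatever was in the destination before.

```c
#include <stdint.h>

/* Hypothetical model: value of a zero-extended LEA,
   (set (reg:DI) (zero_extend:DI (address computed in SImode))).  */
uint64_t
lea_zext (uint64_t base, uint64_t index, uint32_t scale, int32_t disp)
{
  /* The address arithmetic is done in 32 bits and wraps modulo 2^32.  */
  uint32_t low = (uint32_t) base + (uint32_t) index * scale
		 + (uint32_t) disp;

  /* Explicit zero-extension to 64 bits.  */
  return (uint64_t) low;
}

/* Hypothetical model: the same computation expressed only as SImode
   sets.  The upper half of the destination is not described, so this
   model leaves the old upper bits in place.  */
uint64_t
lea_si_only (uint64_t old_dest, uint64_t base, uint64_t index,
	     uint32_t scale, int32_t disp)
{
  uint32_t low = (uint32_t) base + (uint32_t) index * scale
		 + (uint32_t) disp;

  return (old_dest & 0xffffffff00000000ULL) | low;
}
```

In this model the two functions agree only when the upper half of the
old destination happens to be zero, which is why the split needs the
explicit zero_extend at the end.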

Thanks.


-- 
H.J.


