This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

[dataflow]: PATCH COMMITTED Re: Dataflow miscompiles soft-fp/floatuntitf.c on x86_64


This patch by ian taylor fixes a regression on the x86-64.   The problem
appears to be explicit on the dataflow branch and latent on the trunk.

the patch has been bootstrapped and tested on x86-64, x86-32, ppc, and
ia-64.

Kenny

Ian Lance Taylor wrote:
> Uros Bizjak <ubizjak@gmail.com> writes:
>
>   
>> Latest dataflow branch segfaults for:
>>
>> FAIL: gcc.dg/torture/fp-int-convert-float128-timode.c  -O0  execution test
>> FAIL: gcc.dg/torture/fp-int-convert-float128-timode.c  -O1  execution test
>> FAIL: gcc.dg/torture/fp-int-convert-float128-timode.c  -O2  execution test
>> FAIL: gcc.dg/torture/fp-int-convert-float128-timode.c  -O3
>> -fomit-frame-pointer  execution test
>> FAIL: gcc.dg/torture/fp-int-convert-float128-timode.c  -O3 -g
>> execution test
>> FAIL: gcc.dg/torture/fp-int-convert-float128-timode.c  -Os  execution test
>>
>> (The tests were run on FC6, x86_64-pc-linux-gnu).
>>
>> This is due to segfault in soft-fp/floatuntitf.c at __floatuntitf+202,
>> where %rsp is not aligned to 16 bytes for movdqa:
>>
>> 0x0000000000402c10 <__floatuntitf+192>: mov    %ax,0xfffffffffffffffe(%rsp)
>> 0x0000000000402c15 <__floatuntitf+197>: andb
>> $0x7f,0xffffffffffffffff(%rsp)
>> 0x0000000000402c1a <__floatuntitf+202>: movdqa
>> 0xfffffffffffffff0(%rsp),%xmm0   <<<
>> 0x0000000000402c20 <__floatuntitf+208>: pop    %rbx
>> 0x0000000000402c21 <__floatuntitf+209>: pop    %rbp
>> 0x0000000000402c22 <__floatuntitf+210>: retq  Uros.
>>     
>
>
> Seongbae tracked this down.  It's an interesting problem.
>
> In the .lreg dump we see this:
>
> ;; Register 79 in 5.
>
> ...
>
> (insn:HI 4 3 5 2 ../../../gccDBaseline/libgcc/../gcc/config/soft-fp/floatuntitf.c:35 (set (reg:TI 81 [ i ])
>         (subreg:TI (reg:DI 79 [ i ]) 0)) 87 {*movti_rex64} (expr_list:REG_DEAD (reg:DI 79 [ i ])
>         (nil)))
>
> Note that paradoxical subeg applied to reg 79.  Note that pseudo reg
> 79 is put in hard reg 5 (%rdi).
>
> When reload starts, it calls mark_home_live to call
> df_set_regs_ever_live for each hard register being used.  Since reg 79
> is DImode, and has been put in hard reg 5 this marks hard reg 5 as
> live (this is x86_64 so hard reg 5 is DImode).
>
> reload proceeds as usual.  At the end, reload calls
> cleanup_subreg_operands.  That function fixes up any subregs of hard
> registers to refer directly to the register in question.  In this
> case, though, it turns the paradoxical subreg into a TImode reference
> to hard reg 5, giving us this in the .greg dump:
>
> (insn:HI 4 7 5 2 ../../../gccDBaseline/libgcc/../gcc/config/soft-fp/floatuntitf.c:35 (set (reg:TI 37 r8 [orig:81 i ] [81])
>         (reg:TI 5 di [orig:79 i ] [79])) 87 {*movti_rex64} (nil))
>
> So now we have a TImode reference to hard reg 5, which means hard regs
> 5 and 6.  But when we set regs_ever_live, we only marked 5.  As it
> happens, reg 6 was not otherwise referenced in the function.  And, as
> it happens, reg 6 is %rbp, which is callee saved.
>
> All of the above happens on both mainline and dataflow branch.
>
> On mainline we simply carry on, with a reference to hard reg 6 but
> with that bit clear in regs_ever_live.  Since this is paradoxical
> subreg, hard reg 6 is never actually changed, and indeed all the
> references to it drop out as well.  So mainline is fine.
>
> Dataflow branch, however, recomputes regs_ever_live after reload is
> finished.  When it does this, it sees the TImode reference to reg 5,
> and therefore marks both reg 5 and 6 as live.
>
> But it only does this after reload is finished.  The effect is that
> the value of regs_ever_live is not the same during and after reload.
> reg 6 is clear in regs_ever_live during reload, and set after reload.
>
> This is bad.  It's bad because the x86_64 backend uses regs_ever_live
> when determining the frame layout, which is done when computing
> register elimination during reload.  And the x86_64 backend uses
> regs_ever_live again when deciding which registers to save and restore
> in the prologue and epilogue, after reload.  When the frame layout
> computation changes, it means that the stack usage changes, and the
> register elimination computations are incorrect.
>
> Fortunately the fix turns out to be relatively simple.  I have
> appended the untested patch.
>
> Note that this means that dataflow branch will generate worse code for
> this test case: it will save %rbp unnecessarily.
>
>
> In general it is risky to change regs_ever_live after reload is
> complete.  Changing whether a callee saved register is live will tend
> to break the assumptions made by register elimination and prologue and
> epilogue threading.  I think it would be appropriate to add some
> asserts to detect any case where this happens, and either fix it or
> make sure it is OK.
>
> Ian
>
>
> 2007-05-24  Ian Lance Taylor  <iant@google.com>
>
> 	* reload1.c (mark_home_live_1): New static function, broken out of
> 	mark_home_live.
> 	(mark_home_live): Call mark_home_live_1.
> 	(scan_paradoxical_subregs): Call mark_home_live_1.
>
>
> Index: gcc/reload1.c
> ===================================================================
> --- gcc/reload1.c	(revision 125036)
> +++ gcc/reload1.c	(working copy)
> @@ -2141,21 +2141,31 @@ alter_reg (int i, int from_reg)
>      }
>  }
>  
> -/* Mark the slots in regs_ever_live for the hard regs
> -   used by pseudo-reg number REGNO.  */
> +/* Mark the slots in regs_ever_live for the hard regs used by
> +   pseudo-reg number REGNO, accessed in MODE.  */
>  
> -void
> -mark_home_live (int regno)
> +static void
> +mark_home_live_1 (int regno, enum machine_mode mode)
>  {
>    int i, lim;
>  
>    i = reg_renumber[regno];
>    if (i < 0)
>      return;
> -  lim = i + hard_regno_nregs[i][PSEUDO_REGNO_MODE (regno)];
> +  lim = i + hard_regno_nregs[i][mode];
>    while (i < lim)
>      df_set_regs_ever_live(i++, true);
>  }
> +
> +/* Mark the slots in regs_ever_live for the hard regs
> +   used by pseudo-reg number REGNO.  */
> +
> +void
> +mark_home_live (int regno)
> +{
> +  if (reg_renumber[regno] >= 0)
> +    mark_home_live_1 (regno, PSEUDO_REGNO_MODE (regno));
> +}
>  
>  /* This function handles the tracking of elimination offsets around branches.
>  
> @@ -3905,8 +3915,11 @@ scan_paradoxical_subregs (rtx x)
>        if (REG_P (SUBREG_REG (x))
>  	  && (GET_MODE_SIZE (GET_MODE (x))
>  	      > reg_max_ref_width[REGNO (SUBREG_REG (x))]))
> -	reg_max_ref_width[REGNO (SUBREG_REG (x))]
> -	  = GET_MODE_SIZE (GET_MODE (x));
> +	{
> +	  reg_max_ref_width[REGNO (SUBREG_REG (x))]
> +	    = GET_MODE_SIZE (GET_MODE (x));
> +	  mark_home_live_1 (REGNO (SUBREG_REG (x)), GET_MODE (x));
> +	}
>        return;
>  
>      default:
>   


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]