This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Re: [PATCH] Support signbit, signbitf and signbitl as GCC builtins
- From: Ulrich Weigand <weigand at i1 dot informatik dot uni-erlangen dot de>
- To: roger at eyesopen dot com (Roger Sayle)
- Cc: weigand at i1 dot informatik dot uni-erlangen dot de (Ulrich Weigand), rth at redhat dot com, gcc-patches at gcc dot gnu dot org
- Date: Thu, 5 Feb 2004 03:13:23 +0100 (CET)
Roger Sayle wrote:
> Perhaps this is a misunderstanding on my part. My understanding is that
> the SUBREG rtx can potentially be used to extract any suitable aligned
> consecutive range of bits, as represented in one mode, from a larger/wider
> mode.
In rtl.texi, the first sentence on subregs is:
"subreg expressions are used to refer to a register in a machine mode other
than its natural one, or to refer to one register of a multi-part reg that
actually refers to several registers."
The first alternative allows you to access any *low* part of a register
(e.g. (subreg:QI (reg:SI) 3) or (subreg:HI (reg:SI) 2)), while the second
alternative allows you to access any word of a multi-word value.
In fact, the code allows you even to combine the two by accessing a low
part of one word of a multi-word value in a single subreg, e.g.
(subreg:QI (reg:DI) 3) on a big-endian 32-bit target.
There are various places in the code where just this condition is enforced,
most notably subreg_offset_representable_p (rtlanal.c), but see also the
sanity checks in gen_realpart / gen_imagpart (emit-rtl.c).
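
To make the rule concrete, here is a rough C sketch of the restriction as
I described it above, for a target where both bytes and words are
big-endian. It is only a simplified model: the function name and its
parameters are made up for illustration, and it is not the logic of
subreg_offset_representable_p itself.

```c
#include <stdbool.h>

/* Hypothetical helper, not a GCC function.  Sizes and offsets are in
   bytes; the target is assumed to be big-endian in both byte and word
   order.  Returns true if (subreg:OUTER (reg:INNER) offset) is either
   the low part of a register, a whole word of a multi-word value, or
   the low part of one word of a multi-word value.  */
bool
subreg_offset_ok_big_endian (unsigned inner_size, unsigned outer_size,
                             unsigned word_size, unsigned offset)
{
  /* The subreg must lie entirely inside the inner value.  */
  if (outer_size == 0 || outer_size > inner_size
      || offset > inner_size - outer_size)
    return false;

  /* Value fits in one word: only its low part is accessible, which on
     a big-endian target sits at the highest byte offsets.  */
  if (inner_size <= word_size)
    return offset == inner_size - outer_size;

  /* Multi-word value, word-sized or larger access: whole words only.  */
  if (outer_size >= word_size)
    return offset % word_size == 0 && outer_size % word_size == 0;

  /* Multi-word value, narrower access: low part of one word.  */
  return offset % word_size == word_size - outer_size;
}
```

Under this model (subreg:QI (reg:SI) 3), (subreg:HI (reg:SI) 2) and
(subreg:QI (reg:DI) 3) with 4-byte words are all accepted, while
(subreg:SI (reg:DF) 0) with 8-byte words is rejected.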
In any case I'm certain this restriction applies to subregs of hard
registers. However, I'm not sure (and the documentation isn't really
clear either) whether this restriction applies to subregs of pseudos
as well. The middle-end didn't appear to be generating subregs
violating this restriction up to now, however.
> For example, (subreg:QI (reg:DF ...) 0) and (subreg:QI (reg:DF ...) 5)
> are perfectly valid ways of retrieving/reinterpreting parts of an FP
> register. My belief was/is that reload was supposed to spill the register
> to memory if the necessary outermode & offset can't be accessed directly.
If this is indeed the case, then we'd have a reload bug. However, I
don't think this ever worked, but it didn't matter since the middle-end
wouldn't generate such subregs ...
> Of course, I may be very mistaken. If this isn't the case, then I agree
> my patch may be incorrect. Can you provide more details of how the
> generated code is incorrect?
Well, the 'test' routine of builtins-32.c generates the following code
before reload (24.lreg):
(insn 3 18 11 0 (set (reg/v:DF 41 [ x ])
(reg:DF 16 %f0 [ x ])) 65 {*movdf_64} (nil)
(expr_list:REG_DEAD (reg:DF 16 %f0 [ x ])
(nil)))
(insn 11 3 13 0 (set (reg:SI 43)
(subreg:SI (reg/v:DF 41 [ x ]) 0)) 54 {*movsi_zarch} (insn_list 3 (nil))
(expr_list:REG_DEAD (reg/v:DF 41 [ x ])
(nil)))
(insn 13 11 21 0 (set (reg:DI 44)
(sign_extend:DI (reg:SI 43))) 91 {*extendsidi2} (insn_list 11 (nil))
(expr_list:REG_DEAD (reg:SI 43)
(nil)))
(insn 21 13 24 0 (parallel [
(set (reg/i:DI 2 %r2 [ <result> ])
(and:DI (reg:DI 44)
(const_int -2147483648 [0xffffffff80000000])))
(clobber (reg:CC 33 %cc))
]) 202 {anddi3} (insn_list 13 (nil))
(expr_list:REG_DEAD (reg:DI 44)
(expr_list:REG_UNUSED (reg:CC 33 %cc)
(nil))))
and reload makes of that:
Reloads for insn # 3
Reload 0: reload_in (DF) = (reg:DF 16 %f0 [ x ])
GENERAL_REGS, RELOAD_FOR_INPUT (opnum = 1)
reload_in_reg: (reg:DF 16 %f0 [ x ])
reload_reg_rtx: (reg/v:DF 1 %r1 [orig:41 x ] [41])
Reloads for insn # 11
Reload 0: reload_in (SI) = (reg:SI 1 %r1)
GENERAL_REGS, RELOAD_FOR_INPUT (opnum = 1), can't combine
reload_in_reg: (subreg:SI (reg/v:DF 1 %r1 [orig:41 x ] [41]) 0)
reload_reg_rtx: (reg:SI 3 %r3)
(insn 30 18 31 0 (set (mem:DF (plus:DI (reg/f:DI 15 %r15)
(const_int 160 [0xa0])) [0 S8 A8])
(reg:DF 16 %f0 [ x ])) 65 {*movdf_64} (nil)
(nil))
(insn 31 30 3 0 (set (reg/v:DF 1 %r1 [orig:41 x ] [41])
(mem:DF (plus:DI (reg/f:DI 15 %r15)
(const_int 160 [0xa0])) [0 S8 A8])) 65 {*movdf_64} (nil)
(nil))
(insn 3 31 32 0 (set (reg/v:DF 1 %r1 [orig:41 x ] [41])
(reg/v:DF 1 %r1 [orig:41 x ] [41])) 65 {*movdf_64} (nil)
(nil))
(insn 32 3 11 0 (set (reg:SI 3 %r3)
(reg:SI 1 %r1)) 54 {*movsi_zarch} (nil)
(nil))
(insn 11 32 13 0 (set (reg:SI 2 %r2 [43])
(reg:SI 3 %r3)) 54 {*movsi_zarch} (insn_list 3 (nil))
(nil))
(insn 13 11 21 0 (set (reg:DI 2 %r2 [44])
(sign_extend:DI (reg:SI 2 %r2 [43]))) 91 {*extendsidi2} (insn_list 11 (nil))
(nil))
(insn 21 13 24 0 (parallel [
(set (reg/i:DI 2 %r2 [ <result> ])
(and:DI (reg:DI 2 %r2 [44])
(mem/u/f:DI (symbol_ref/u:DI ("*.LC0") [flags 0x2]) [4 S8 A64])))
(clobber (reg:CC 33 %cc))
]) 202 {anddi3} (insn_list 13 (nil))
(nil))
Note how insn 32 is broken? It simply reinterprets reg %r1 in SImode,
and hence loads the *low* half of the register into %r3, whereas
(subreg:SI (reg:DF) 0) on this big-endian target denotes the *high* half.
Doing it right would in fact require either a shift (which reload never
does) or else *another* secondary memory slot.
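
The effect can be reproduced in plain C. The following is only a sketch
of the arithmetic, not of anything reload does: it assumes an IEEE 754
64-bit double, and all function names are made up for illustration. The
64-bit integer stands for the contents of %r1 after insn 31.

```c
#include <stdint.h>
#include <string.h>

/* Bit image of a double as a 64-bit register value (assumes an
   8-byte IEEE 754 double).  */
static uint64_t
double_bits (double x)
{
  uint64_t bits;
  memcpy (&bits, &x, sizeof bits);
  return bits;
}

/* What (subreg:SI (reg:DF) 0) means on a big-endian target: bytes
   0..3, i.e. the high half, which holds the sign bit.  */
uint32_t
subreg_si_high_half (double x)
{
  return (uint32_t) (double_bits (x) >> 32);
}

/* What insn 32 actually does: view the register in SImode, which
   yields the low half.  */
uint32_t
reg_si_low_half (double x)
{
  return (uint32_t) double_bits (x);
}

/* Model of insns 13 and 21: sign-extend SImode to DImode, then AND
   with 0xffffffff80000000.  */
uint64_t
sign_mask (uint32_t si)
{
  uint64_t ext = si;
  if (si & UINT32_C (0x80000000))
    ext |= UINT64_C (0xffffffff00000000);
  return ext & UINT64_C (0xffffffff80000000);
}
```

For x = -1.0 the register image is 0xbff0000000000000, so
sign_mask (subreg_si_high_half (x)) is nonzero as intended, while
sign_mask (reg_si_low_half (x)) is zero and the sign is lost.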
Bye,
Ulrich
--
Dr. Ulrich Weigand
weigand@informatik.uni-erlangen.de