This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Re: [PATCH v2] combine: Improve change_zero_ext, call simplify_set afterwards.
On Thu, Dec 22, 2016 at 12:00:37PM +0100, Georg-Johann Lay wrote:
> On 21.12.2016 18:46, Segher Boessenkool wrote:
> >On Wed, Dec 21, 2016 at 01:58:18PM +0100, Georg-Johann Lay wrote:
> >>$ avr-gcc
> >>/gnu/gcc.gnu.org/trunk/gcc/testsuite/gcc.c-torture/compile/pr26833.c -S
> >>-O1 -mmcu=avr4 -S -v
> >>
> >>/gnu/gcc.gnu.org/trunk/gcc/testsuite/gcc.c-torture/compile/pr26833.c: In
> >>function 'yasm_lc3b__parse_insn':
> >>/gnu/gcc.gnu.org/trunk/gcc/testsuite/gcc.c-torture/compile/pr26833.c:19:1:
> >>error: insn does not satisfy its constraints:
> >> }
> >> ^
> >>(jump_insn 58 98 59 9 (set (pc)
> >> (if_then_else (eq (and:HI (reg:HI 31 r31)
> >> (const_int 1 [0x1]))
> >> (const_int 0 [0]))
> >> (label_ref 70)
> >> (pc)))
> >>"/gnu/gcc.gnu.org/trunk/gcc/testsuite/gcc.c-torture/compile/pr26833.c":11
> >>415 {*sbrx_and_branchhi}
> >> (int_list:REG_BR_PROB 375 (nil))
> >> -> 70)
> >
> >
> >>Combine comes up with the following insn:
> >>(jump_insn 58 57 59 7 (set (pc)
> >> (if_then_else (eq (and:HI (subreg:HI (mem:QI (reg/v/f:HI 75 [
> >>operands ]) [1 *operands_17(D)+0 S1 A8]) 0)
> >> (const_int 1 [0x1]))
> >> (const_int 0 [0]))
> >> (label_ref 70)
> >> (pc)))
> >>"/home/georg/gnu/gcc.gnu.org/trunk/gcc/testsuite/gcc.c-torture/compile/pr26833.c":11
> >>415 {*sbrx_and_branchhi}
> >> (int_list:REG_BR_PROB 375 (nil))
> >> -> 70)
> >>
> >>which cannot be correct because avr.md::*sbrx_and_branchhi reads:
> >>
> >>(define_insn "*sbrx_and_branch<mode>"
> >> [(set (pc)
> >> (if_then_else
> >> (match_operator 0 "eqne_operator"
> >> [(and:QISI
> >> (match_operand:QISI 1 "register_operand" "r")
> >> (match_operand:QISI 2 "single_one_operand" "n"))
> >> (const_int 0)])
> >> (label_ref (match_operand 3 "" ""))
> >> (pc)))]
> >> "" { ... } ...)
> >>
> >>Hence we have a memory operand (subreg of mem)) where only a register is
> >>allowed. Reg alloc then reloads the mem:QI into R31, but R31 is the
> >>last hard reg, i.e. R31 cannot hold HImode.
> >
> >If you don't have instruction scheduling subregs of mem are allowed (and
> >are counted as registers). Combine asks recog, and it think this is fine.
> >
> >Why does reload use r31 here? Why does it think that is valid for HImode?
>
> I can't tell you, all I'm seeing bunch of new FAILs after r243578.
> For a simpler test case like the following
>
> unsigned u;
> int volatile v;
>
> void btest1 (void)
> {
> if (u & 1)
> v = 0;
> v = 0;
> }
>
> the combine pattern is also used but the register pressure is smaller.
> After quite some spills, reload then comes up with
>
> (insn 18 7 8 2 (set (reg:QI 24 r24)
> (mem/c:QI (symbol_ref:HI ("u") <var_decl 0x7f73ef303900 u>)
> [1 u+0 S1 A8])) "testbit.c":6 56 {movqi_insn}
> (nil))
> (jump_insn 8 18 9 2 (set (pc)
> (if_then_else (eq (and:HI (reg:HI 24 r24)
> (const_int 1 [0x1]))
> (const_int 0 [0]))
> (label_ref 11)
> (pc))) "testbit.c":6 415 {*sbrx_and_branchhi}
> (int_list:REG_BR_PROB 4600 (nil))
> -> 11)
>
> i.e. the paradoxical subreg could be resolved somehow and R25 is
> uninitialized. It this the purpose of the combine change to come
> up with uninitialized regs because it is known that just the
> initialized parts are used by the code?
If a
(zero_extract (foo) (const_int) (const_int)
does not match in combine, it calls change_zero_ext to replace it
with
(and (lshiftrt (foo) (const_int)) (const_int)
(The zero_extract expression is generated by combine for insn
55+56->57.) Prior to the patch, this was done only if the
zero_extract has the same mode as "foo". With the patch, it also
deals with mode expanding zero_extracts, e.g.
(zero_extract:DI (foo:SI) (const_int) (const_int))
->
(and:DI (subreg:DI (lshiftrt:SI (const_int) 0) (const_int))
This is what combine has done:
Trying 55, 56 -> 57:
...
Failed to match this instruction:
(set (reg:HI 78)
(zero_extract:HI (mem:QI (reg/v/f:HI 75 [ operands ]) [1
*operands_17(D)+0 S1 A8])
(const_int 1 [0x1])
(const_int 0 [0])))
Successfully matched this instruction:
(set (reg:HI 78)
(and:HI (subreg:HI (mem:QI (reg/v/f:HI 75 [ operands ]) [1
*operands_17(D)+0 S1 A8]) 0)
(const_int 1 [0x1])))
Successfully matched this instruction:
(set (cc0)
(compare (reg:HI 78)
(const_int 0 [0])))
allowing combination of insns 55, 56 and 57
Later it combinded insn 57 -> 58 to the expression that has
triggered the bad register allocation. This looks all correct to
me.
> Even if subregs are allowed, paradoxical subregs look really odd here.
>
> The ICE'ing test case, reload comes up with
>
> (insn 97 57 98 9 (set (reg:HI 30 r30)
> (reg/v/f:HI 20 r20 [orig:75 operands ] [75])) "pr26833.c":11
> 68 {*movhi}
> (nil))
> (insn 98 97 58 9 (set (reg:QI 31 r31)
> (mem:QI (reg:HI 30 r30) [1 *operands_17(D)+0 S1 A8]))
> "pr26833.c":11 56 {movqi_insn}
> (nil))
> (jump_insn 58 98 59 9 (set (pc)
> (if_then_else (eq (and:HI (reg:HI 31 r31)
> (const_int 1 [0x1]))
> (const_int 0 [0]))
> (label_ref 70)
> (pc))) "pr26833.c":11 415 {*sbrx_and_branchhi}
> (int_list:REG_BR_PROB 375 (nil))
> -> 70)
>
> insn 98 is valid but overrides parts of HI:30 set by insn 97.
>
> As it appears, the QI memory is loaded to R31 which is valid, and
> then reload drops in that register as replacement for the paradoxical
> subreg and comes up with HI:31 in insn 58.
>
> Actually a zero-extend would be needed, does it?
>
> As far as I know, reload generates only a very limited subset of insns,
> namely additions to compute frame addresses and mov insns. But
> reload is not supposed to generate extends or other arithmetic.
change_zero_ext rewrites any zero_extend/zero_extract. The
resulting expression doesn't need any extension because it's coded
into the AND constant:
(and:HI (...) (const_int 1 [0x1]))
> If the purpose of the change to combine.c was to optimize code,
> then it would be preferred to avoid HI altogether and just use QImode.
> avr.md has a similar insn for QI, so that would really give a
> reasonable match.
The purpose of the change_zero_ext function is: If combine cannot
match an expression that contains a zero_{extend,extract}, then
call the function to rephrase such subexpression in a more generic
form using logical operations and try to find a match for them.
The patch adds a rule to rewrite mode expanding zero_extracts,
which helps with some s390 patterns. Before the patch, combine
doesn't find a match for 55+56->57. With the patch it combines
55+56->57->58, eventually leading to the fault.
Please correct me if I'm wrong, but I see no bugs in the combined
expressions.
I think the patch only triggers something that is already broken,
but well hidden. Somehow the register allocator seems to miss
some important information about
(subreg:HI (mem:QI (reg/v/f:HI 75 ...
and looks for a register that is suitable for QImode while it
actually needs an HImode register. Maybe there's some place in
common code or the avr backend) where HARD_REGNO_MODE_OK should be
used but isn't?
Ciao
Dominik ^_^ ^_^
--
Dominik Vogt
IBM Germany