This is the mail archive of the gcc-bugs@gcc.gnu.org mailing list for the GCC project.
[Bug rtl-optimization/78041] Wrong code on ARMv7 with -mthumb -mfpu=neon-fp16 -O0
- From: "rearnsha at gcc dot gnu.org" <gcc-bugzilla at gcc dot gnu dot org>
- To: gcc-bugs at gcc dot gnu dot org
- Date: Thu, 20 Oct 2016 13:05:28 +0000
- Subject: [Bug rtl-optimization/78041] Wrong code on ARMv7 with -mthumb -mfpu=neon-fp16 -O0
- Auto-submitted: auto-generated
- References: <bug-78041-4@http.gcc.gnu.org/bugzilla/>
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78041
--- Comment #6 from Richard Earnshaw <rearnsha at gcc dot gnu.org> ---
(In reply to Bernd Edlinger from comment #5)
> (In reply to Wilco from comment #4)
> > However dealing with partial overlaps is complex so maybe the best option
> > would be to add alternatives to <shift>di3_neon to either allow full overlap
> > "r 0 X X X" or no overlap "&r r X X X". The shift code works with full
> > overlap.
>
> That sounds like a good idea.
>
> Then this condition in <shift>di3_neon could go away too:
>
> && (!reg_overlap_mentioned_p (operands[0], operands[1])
> || REGNO (operands[0]) == REGNO (operands[1])))

Note that we don't want to restrict complete overlaps, only partial overlaps.
Restricting complete overlaps leads to a significant increase in register
pressure and a lot of redundant copying.
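
To make the distinction between complete and partial overlap concrete, here is
a minimal sketch in plain C. It is not GCC code: it assumes a DImode value
occupies two consecutive 32-bit core registers starting at a given register
number, and the names DI_NREGS, classify_overlap and overlap_allowed are
hypothetical, introduced only for illustration. It models the intent of the
quoted condition: accept no overlap or complete overlap, reject partial overlap.

```c
#include <stdbool.h>

/* Assumption: a 64-bit (DImode) value lives in two consecutive
   32-bit registers, regno and regno + 1.  */
enum { DI_NREGS = 2 };

typedef enum { OVERLAP_NONE, OVERLAP_PARTIAL, OVERLAP_FULL } overlap_kind;

/* Classify how the destination pair overlaps the source pair.
   Hypothetical helper, not part of GCC.  */
static overlap_kind
classify_overlap (unsigned dest_regno, unsigned src_regno)
{
  if (dest_regno == src_regno)
    return OVERLAP_FULL;        /* same pair, e.g. r0:r1 and r0:r1 */
  if (dest_regno + DI_NREGS <= src_regno
      || src_regno + DI_NREGS <= dest_regno)
    return OVERLAP_NONE;        /* disjoint pairs, e.g. r0:r1 and r2:r3 */
  return OVERLAP_PARTIAL;       /* shared register, e.g. r1:r2 and r0:r1 */
}

/* Mirrors the shape of the quoted condition: no overlap, or the
   same starting register, is fine; anything else is rejected.  */
static bool
overlap_allowed (unsigned dest_regno, unsigned src_regno)
{
  return classify_overlap (dest_regno, src_regno) != OVERLAP_PARTIAL;
}
```

With this model, a destination in r0:r1 and a source in r0:r1 is allowed, as
is a source in r2:r3, while a destination in r1:r2 with a source in r0:r1 is
the partial case that needs to be excluded.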