This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Re: [PATCH][combine][RFC] Don't transform sign and zero extends inside mults
- From: Segher Boessenkool <segher at kernel dot crashing dot org>
- To: Uros Bizjak <ubizjak at gmail dot com>
- Cc: Jeff Law <law at redhat dot com>, Kyrill Tkachov <kyrylo dot tkachov at arm dot com>, gcc Patches <gcc-patches at gcc dot gnu dot org>
- Date: Tue, 10 Nov 2015 13:53:03 -0600
- Subject: Re: [PATCH][combine][RFC] Don't transform sign and zero extends inside mults
- Authentication-results: sourceware.org; auth=none
- References: <56376FFF dot 3070008 at arm dot com> <20151104235015 dot GA13203 at gate dot crashing dot org> <563B4516 dot 5090001 at arm dot com> <20151106005636 dot GA31412 at gate dot crashing dot org> <563CB6DE dot 7070106 at arm dot com> <563D1824 dot 8000607 at redhat dot com> <20151106220008 dot GA19110 at gate dot crashing dot org> <20151108205806 dot GA641 at gate dot crashing dot org> <CAFULd4bcHBD5fTg-hvTkoUMh8O5FWJgW1Me9vG9L3iiefaPvMQ at mail dot gmail dot com> <20151109095132 dot GA13304 at gate dot crashing dot org>
On Mon, Nov 09, 2015 at 03:51:32AM -0600, Segher Boessenkool wrote:
> > From the original patch submission, it looks that this patch would
> > also benefit x86_32.
>
> Yes, that is what I thought too.
>
> > Regarding the above code size increase - do you perhaps have a
> > testcase, to see what causes the difference?
>
> I could extract some. It happens quite rarely on usual code.
>
> > It isn't necessary due to
> > the patch, but perhaps some loads are moved to the insn and aren't
> > CSE'd anymore.
I don't have a small testcase yet.

What causes the degradation is that sometimes we end up with imul reg,reg
instead of imul mem,reg.  In the normal case we already have imul mem,reg
after expand, so the patch doesn't change anything in the normal case.
Even if expand didn't do it, fwprop would, I think.

It also isn't LRA that is doing it; the MEMs in question are not on the
stack.  Maybe, as you say, some CSE pass.
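
For illustration only: the sketch below is not a testcase extracted from
the build, and the function names are made up.  It just shows the two
source shapes being contrasted above, assuming an x86 target.  Which form
of imul the compiler actually picks depends on the GCC version, the
options, and the surrounding code.

```c
#include <stdint.h>

/* Hypothetical example, not taken from the thread.
   The multiplicand is loaded from memory that is not on the stack and
   is used only by the multiply, so an x86 compiler may fold the load
   into the multiply (the imul mem,reg form). */
int32_t scale_by_table(const int32_t *table, int32_t x)
{
    return table[0] * x;
}

/* Here the loaded value has a second use, so it usually has to live in
   a register anyway, and the multiply may end up as imul reg,reg. */
int32_t scale_and_add(const int32_t *table, int32_t x)
{
    int32_t t = table[0];
    return t * x + t;
}
```

Comparing the assembly for the two functions (for example with -O2 -m32 -S)
with and without the patch is one way to look for the difference; no
particular output is claimed here.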

For x86_64, which has many more registers than i386, often a peephole
fires that turns a mov reg,reg ; imul mem,reg into a mov mem,reg ;
imul reg,reg, which makes the generated machine code identical with
or without the patch (tested on a Linux build, 12MB text).
The i386 size regression is 0.01% btw (comparable to the gains for
other targets).
Segher