This is the mail archive of the
mailing list for the GCC project.
Re: PR63633: May middle-end come up width hard regs for insn expanders?
- From: Vladimir Makarov <vmakarov at redhat dot com>
- To: Georg-Johann Lay <avr at gjlay dot de>, GCC Development <gcc at gcc dot gnu dot org>
- Cc: Jeff Law <law at redhat dot com>, Jakub Jelinek <jakub at redhat dot com>, Denis Chertykov <chertykov at gmail dot com>, Senthil Kumar Selvaraj <senthil_kumar dot selvaraj at atmel dot com>
- Date: Wed, 22 Apr 2015 12:30:07 -0400
- Subject: Re: PR63633: May middle-end come up width hard regs for insn expanders?
- Authentication-results: sourceware.org; auth=none
- References: <544A5C14 dot 5030607 at gjlay dot de> <544A7CFA dot 700 at redhat dot com> <20141024162919 dot GL10376 at tucnak dot redhat dot com> <544A7EE5 dot 1040303 at redhat dot com> <544A984D dot 8080804 at gjlay dot de> <20141024182950 dot GQ10376 at tucnak dot redhat dot com> <5530D95F dot 3070603 at gjlay dot de> <55355D76 dot 9060500 at redhat dot com> <5537B2A0 dot 20208 at gjlay dot de>
On 22/04/15 10:39 AM, Georg-Johann Lay wrote:
Thanks for providing the patch and test. Sometimes it is hard even for
me to say how the RA will behave in complicated cases without
investigating the actual code. After looking at IRA dumps, I think you will
Attached is a C test program which produces fine results with
$ avr-gcc -S -O2 -mmcu=atmega8
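For readers without the attachment, the following is a minimal sketch of
the kind of C code that exercises the multiplication patterns discussed
here. It is not the attached test program; the function names mul32 and
mul16_widen are hypothetical and chosen only for illustration. Compiling
it with the command above and reading the generated assembly should show
how the operands are brought into the registers the multiply expects.

```c
#include <stdint.h>

/* Hypothetical example, NOT the test program attached to the mail.
   A plain 32-bit multiplication; on AVR this is the kind of operation
   that goes through the mulsi3 pattern mentioned in the thread.  */
uint32_t mul32(uint32_t a, uint32_t b)
{
    return a * b;
}

/* Hypothetical example of a widening multiply: both 16-bit operands
   are sign-extended before the multiplication, which is one of the
   "including sign/zero extension of operands" cases.  */
int32_t mul16_widen(int16_t a, int16_t b)
{
    return (int32_t) a * (int32_t) b;
}
```

Such small functions are only a starting point: the concern raised in
this thread is about "ordinary" code with more register pressure around
the multiplication, not about isolated calls like these.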
Also attached is a corresponding patch against the trunk avr backend that
illustrates the transition from clobbers to hard-regs-by-constraint.
I don't actually remember when I first tried this; sometime around
when 4.8 was in stage 1 or so.
If my recollection is right, the problem was not that small test
programs with mulsi3 produced large code, but that "ordinary" code
could get much worse. I had the impression it was because of the bunch
of new, rarely used / rarely useful register classes, and that IRA's
cost computation got confused, or at least much less accurate, compared
with the usual register classes (only 10 classes of GENERAL_REG).
The attached patch adds 27 new register classes, and to transform all
insns even more classes might be needed: 8-bit, 16-bit and 24-bit
multiplications including sign/zero extension of operands, fixed-point
functions from 8...32 bit, divmod, builtins implementations, support
functions for address spaces, ...
The insns which are using this all have the following properties in
common:
- Only 1 constraint alternative
- Register allocation is uniquely determined, i.e. the reg allocator has
  no choice what register to pick for what operand (except for
  commutative constraints with '%' which give exactly 2 solutions).
The patch avoids clobbers or scratches altogether. The only insns
in which a register other than the output is affected are transformed
from single_set to parallels in split1. The 2nd set describes setting
(reg:HI 26) to a useless value. The insn is not expanded as a
parallel, because insn combine won't use parallels for combinations.
Is there a chance that register allocation gets worse just because so
many register classes are added?
When you use hard regs, pseudos that are moved to the hard regs get
preferences for those hard regs from the moves, so the chance that the
pseudos get the hard regs and the moves are eliminated is high.
When you use constraints naming a single-hard-reg class, the operand
pseudo gets the same hard reg preference. IRA has code for this. So the
final result will be quite analogous.
The only difference is that the RA (more accurately, the ira-costs.c
code) will be a bit slower as there are more reg classes.
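
The equivalence described above can be illustrated with a toy model. The
sketch below is not IRA code and does not reflect its data structures;
every name in it (struct pseudo, note_move_to_hard_reg,
note_single_reg_class, assign, copies_needed) is hypothetical. It only
assumes that both a move into a hard reg and a single-register class
constraint end up as the same kind of preference on the pseudo, and that
a free preferred register is taken when available.

```c
#define NUM_HARD_REGS 32
#define NO_REG (-1)

/* Toy pseudo register: an optional preference and the final assignment.  */
struct pseudo {
    int preferred;   /* preferred hard reg, or NO_REG */
    int assigned;    /* hard reg chosen by assign (), or NO_REG */
};

void init_pseudo(struct pseudo *p)
{
    p->preferred = NO_REG;
    p->assigned = NO_REG;
}

/* Case 1: the expander emitted an explicit move "hard_reg := pseudo".  */
void note_move_to_hard_reg(struct pseudo *p, int hard_reg)
{
    p->preferred = hard_reg;
}

/* Case 2: the operand constraint names a class with exactly one register.  */
void note_single_reg_class(struct pseudo *p, int only_reg)
{
    p->preferred = only_reg;
}

/* Take the preferred register if it is free, else the first free one.  */
int assign(struct pseudo *p, const int busy[NUM_HARD_REGS])
{
    p->assigned = NO_REG;
    if (p->preferred != NO_REG && !busy[p->preferred]) {
        p->assigned = p->preferred;
    } else {
        for (int r = 0; r < NUM_HARD_REGS; r++) {
            if (!busy[r]) {
                p->assigned = r;
                break;
            }
        }
    }
    return p->assigned;
}

/* Copies needed to get the value into required_reg: 0 if already there.  */
int copies_needed(const struct pseudo *p, int required_reg)
{
    return p->assigned == required_reg ? 0 : 1;
}
```

In this model both ways of expressing the requirement lead to the same
assignment, and a copy only remains when the preferred register is
already occupied. The cost of having more classes, which the toy ignores
entirely, is the extra work in the cost computation.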