[PATCH v2] combine: Improve change_zero_ext, call simplify_set afterwards.

Segher Boessenkool segher@kernel.crashing.org
Fri Dec 23 17:47:00 GMT 2016


On Fri, Dec 23, 2016 at 05:54:01PM +0100, Georg-Johann Lay wrote:
> >The purpose of the combine change is to write widening extracts in a
> >more general form, so that backends for processors that can do such
> >more general things do not have to write hundreds (literally) of extra
> >patterns for all the cases that could be written as zero_extract.
> 
> One problem is that not all expressions are canonicalized and combine
> might come up with many different kinds of representations for the
> same action.  One common example is inserting one bit.  This might
> be represented as (set (zero_extract)) or as set with masking and
> shifting around or as (set (if_then_else)), with different
> representations if the sign bit is involved or if the source
> bit position is the same or lower or higher than the destination's
> bit position.

Yeah.
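
To make the one-bit insert case concrete, here is a minimal C sketch
(not RTL, and not taken from any back end) of two of the forms
mentioned above: mask-and-shift, and a conditional in the style of
if_then_else.  The function names and the 32-bit word size are
assumptions made for illustration only.  Both functions compute the
same result, which is why having several non-canonical spellings of
it costs a back end extra patterns.

```c
#include <assert.h>
#include <stdint.h>

/* Form 1: extract the source bit, clear the destination bit, OR it in.
   This corresponds loosely to the "masking and shifting" spelling.  */
uint32_t insert_bit_mask(uint32_t dest, uint32_t src,
                         unsigned src_pos, unsigned dst_pos)
{
    assert(src_pos < 32 && dst_pos < 32);  /* avoid undefined shifts */
    uint32_t bit = (src >> src_pos) & UINT32_C(1);
    return (dest & ~(UINT32_C(1) << dst_pos)) | (bit << dst_pos);
}

/* Form 2: test the source bit, then set or clear the destination bit.
   This corresponds loosely to the if_then_else spelling.  */
uint32_t insert_bit_cond(uint32_t dest, uint32_t src,
                         unsigned src_pos, unsigned dst_pos)
{
    assert(src_pos < 32 && dst_pos < 32);  /* avoid undefined shifts */
    uint32_t mask = UINT32_C(1) << dst_pos;
    return (src & (UINT32_C(1) << src_pos)) ? (dest | mask)
                                            : (dest & ~mask);
}
```

The two agree for every input, including when the source bit is the
sign bit (src_pos == 31) and when the source position is below, equal
to, or above the destination position.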

> In a private back end I had the same problem that I didn't want to
> support dozens of combine patterns and added a new hook that allows
> the back end to canonicalize expressions synthesized by combine.

The problem with this is that you create new RTL for every insn fed
to recog (so, a lot more insns than there are in your program).  That
isn't very nice (combine tries very hard to create as little garbage
as it can).

> This runs right before recog(_for_combine) and can replace single_set
> by equivalent ones.  This also simplifies porting the back end
> because just one place in combine has to be touched and not hundreds
> of places in combine.c.

I don't understand what you mean here, what hundreds of places in
combine?  What is "porting", here?

> Moreover, different targets might come up
> with different, conflicting preferences so that a one-fits-all
> solution in combine.c doesn't always exist anyway.

Yeah.

> >>>>Actually a zero-extend would be needed, wouldn't it?
> >
> >The AND clears the top bits already.
> 
> OK, I didn't know that the register allocator analyses that parts of
> a value (high part in this case) are effectively unused and skips
> the extension.  Cool feature.

It was a paradoxical subreg before, the RA just changes that to hard
regs?
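
On the point that the AND already clears the top bits: the arithmetic
can be checked with a small C sketch.  This only illustrates why an
explicit zero-extend is redundant after such an AND; it is not the
RTL from the patch, and the function names and the 8-bit/32-bit
widths are assumptions.

```c
#include <stdint.h>

/* AND in the wide mode with a mask that fits in the narrow mode.  */
uint32_t and_only(uint32_t wide, uint8_t mask)
{
    return wide & mask;   /* bits 8..31 of the result are already zero */
}

/* The same AND followed by an explicit truncate and zero-extend.  */
uint32_t and_then_zero_extend(uint32_t wide, uint8_t mask)
{
    uint8_t narrow = (uint8_t)(wide & mask);  /* truncate to 8 bits */
    return (uint32_t)narrow;                  /* zero-extend back    */
}
```

Because the mask has no bits set above bit 7, both functions return
the same value for every input, so the extension adds nothing.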


Segher


