This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: fix opt/8634


> Date: Thu, 10 Apr 2003 11:47:49 +0100
> From: Richard Earnshaw <rearnsha at arm dot com>

> > Combine is probably the most tricky phase, since it really does look
> > at the RTL.  The idea for that would be to pre-generate (when building
> > GCC) a bunch of rules of the general form 'if an insn of type 23 has
> > its first operand as the second operand of a later insn of type 175,
> > they can be combined into an insn 177 with its first operand a MEM of
> > the second operand of the type-23 insn...'.  Then, combine just checks
> > to see if the conditions of any rule is met for each insn.  No need
> > for simplify_rtx, no need for insn recognition; just a match against a
> > table.  (Of course, simplify_rtx doesn't go away; it's still needed to
> > build the table from the .md files, but all that happens while the
> > compiler is being built, not at run-time.)
> 
> However, much of combine's power comes from the fact that the 
> transformations it does depend on the values (or ranges) that particular 
> operands may have.  For example, it will transform
> 
> 	t1 := x << 28
> 	y := (unsigned) t1 >> 28
> 
> into
> 
> 	y := x & 0xf

This would require some care, but it's not a fundamental problem;
indeed, this particular case can be handled by a rule of the form I
described, that is if you have a left-shift insn followed by a
right-shift insn and two of their parameters match, you can produce an
AND insn.

> it may then do other transformations that are based on knowledge that just 
> some bits in y have known values.  For example, it can infer that since 
> we've just done the above, a subsequent y & 0xff would be redundant.

This is really a different optimisation, and would be treated
separately.  Ideally, you'd have a complete extended value-range
handler, that can take an insn and a state, and determine whether to
simplify the insn (to a different insn) based on information in the
state (like known-zero bits, or known-constant values, or ranges, or
whatever).

> Finally, the table would be huge -- many machine descriptions have several 
> hundred patterns, some of which are match_operator based -- and combine 
> works with up to 3 insns at a time; it can also do transformations that 
> collapse 3 insns into 2 by applying various splitting heuristics.

It'd be large, but not huge; I hope that each insn would only have a
few entries in the table.  The powerpc machine description has
thousands (not hundreds) of patterns...

> However, I don't think that would kill the whole idea.  Combine would just 
> have to work with the generic patterns as well as the data from the insn.  
> For example, the above would be done from
> 
> (insn ... 88 (reg:SI t1) (reg:SI x) (const_int 28)) // lshift
> (insn ... 89 (reg:SI y) (reg:SI t1) (const_int 28)) // lshiftrt
> 
> plus the patterns
> 
> (pattern 88 (set:SI (op 0) (lshift:SI (op 1) (op 2))))
> (pattern 89 (set:SI (op 0) (lshiftrt:SI (op 1) (op 2))))
> 
> and would (eventually) generate a temporary object of the form
> 
> (set:SI (reg:SI y) (and:SI (reg:SI x) (const_int 15)))
> 
> This would then match
> 
> (pattern 101 (set:SI (op 0) (and:SI (op 1) (op 2))))
> 
> So we would extract that as the insn
> 
> (insn ... 101 (reg:SI y) (reg:SI x) (const_int 15))
> 
> The great things about this would be that all temporary memory used could 
> be easily reclaimed after each combine attempt, giving good cache locality.

Yes, this would certainly work.  In fact, it may be that you can
create the RTL faster than you could load it from main memory into
cache.

-- 
- Geoffrey Keating <geoffk at geoffk dot org>


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]