This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.



Re: [PATCH: PR/40314] extract common address for memory access to fields of large struct


Thank you, Adam. Your method can solve part of the problem, just as
you described, but my pass can expose more opportunities than CSE and
GCSE find on their own:

1. Because it lacks target-specific information, CSE may make an
unprofitable transformation. Suppose we have two byte-sized memory
accesses at addresses (base+400) and (base+440). CSE may transform them to
     t1 = base + 400
     // t1 is used as address 1
     ...
     t2 = t1 + 40
     // t2 is used as address 2 to load a byte

   In Thumb mode, however, a byte load can only use an offset in the
range [0, 31]. The offset 40 is out of range, so this transformation
does not help.

2. In other cases CSE may miss an optimization because it can only
reuse a value that is already available. One simple example is:

   t1 = base + 404
   // t1 is used as address 1
   ...
   t2 = base + 400
   // t2 is used as address 2

   Address 2 is 4 bytes smaller than address 1 and is used later than
address 1. In Thumb mode a memory offset can't be negative, so t2 can't
be expressed as t1 - 4 inside the memory access. The current CSE
framework can't deal with this situation.

3. CSE works on superblocks only. I hope this optimization can also be
done globally.
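
To make points 1 and 2 concrete, here is a small standalone C sketch.
It is not code from the patch and not GCC internals: the helper name
assign_anchors and the greedy grouping are assumptions made only for
illustration. Each group of offsets shares one anchor, chosen as the
smallest offset in the group, so every remaining offset is non-negative
and within the target's limit (31 for Thumb byte loads, as above).

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Comparison callback for qsort: ascending order of long values. */
static int cmp_long(const void *a, const void *b)
{
    long x = *(const long *)a;
    long y = *(const long *)b;
    return (x > y) - (x < y);
}

/* Hypothetical helper (illustration only, not part of GCC).
   Sorts offs[0..n-1] in place, then walks them in ascending order.
   An offset joins the current group if its distance from the group's
   anchor is at most max_off; otherwise it starts a new group and
   becomes that group's anchor.  anchors[i] receives the anchor used
   for offs[i], so offs[i] - anchors[i] is always in [0, max_off].
   Returns the number of groups, i.e. the number of anchor registers. */
static size_t assign_anchors(long *offs, long *anchors, size_t n,
                             long max_off)
{
    size_t groups = 0;

    assert(max_off >= 0);
    qsort(offs, n, sizeof offs[0], cmp_long);

    for (size_t i = 0; i < n; i++) {
        if (i == 0 || offs[i] - anchors[i - 1] > max_off) {
            anchors[i] = offs[i];       /* start a new group */
            groups++;
        } else {
            anchors[i] = anchors[i - 1]; /* reuse the current anchor */
        }
    }
    return groups;
}
```

With max_off = 31, the offsets {400, 440} from point 1 end up in two
groups, so no sharing is attempted where it cannot pay off. The offsets
{404, 400} from point 2 end up in one group anchored at 400, with
in-range offsets 0 and 4, regardless of which access comes first.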

thanks
Carrot

On Sat, Jun 13, 2009 at 10:27 AM, Adam Nemet<anemet@caviumnetworks.com> wrote:
> Steven Bosscher <stevenb.gcc@gmail.com> writes:
>
>> On Wed, Jun 3, 2009 at 8:59 AM, Carrot Wei<carrot@google.com> wrote:
>>> I've tested CSiBE, nearly no changes to code size and compile time. It
>>> seems there is no large structure in CSiBE to trigger this
>>> optimization. For mcf from CPU SPEC 2006, this optimization can reduce
>>> about 7% static instructions.
>>
>> Alright, counts as an improvement worth pursuing.
>>
>> What I was wondering though -- isn't this just a special case of Adam
>> Nemet's "constant anchor" additions to CSE?
>
> No, but I think the existing related-value optimization in CSE might be
> able to handle this.  You just need to avoid propagating the addition
> into the mem.  I.e. instead of:
>
> (set r200 400)                     # 400 is offset of field1
> (set r201 (mem (plus r100 r200)))  # r100 contains struct base
> ...
> (set r300 404)                     # 404 is offset of field2
> (set r301 (mem (plus r100 r300)))  # r100 contains struct base
>
> start with:
>
> (set r200 400)
> (set t1 (plus r100 r200))
> (set r201 (mem t1))
> ...
> (set r300 404)
> (set t2 (plus r100 r300))
> (set r301 (mem t2))
>
> then t2 should be expressed as t1 + 4 as a related value
> (hopefully!) and then fwprop can now propagate into both mems.
>
> Adam
>

