This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [PATCH] Exploiting dual mode operation, implementation.


Hello,
First of all thank you for your quick response and sorry for my 
late response, I was on a couple of weeks leave for exams.

> First, I'm unhappy to see that when you say "sign-extend", you really
> do mean SIGN_EXTEND and not sign extension in general, also covering
> the ZERO_EXTEND case. 
> Second, I'm unhappy to see that you're hard-coding SI and DImode as
> your source and destination modes. 
I'm working on these two topics.
Would it be sufficient if I deal with ZERO_EXTEND and any pair of modes 
as long as there is no mixture in the same web?
In other words, Is it good enough to handle only webs that has one type 
of extension (SIGN_EXTEND or ZERO_EXTEND and only one pair of modes for 
source and dest of the extension)?
Dealing with webs that contains a mixture of modes and extensions type 
is much more difficult and I suspect that such webs are rare.

> > def1 + se:
> > set ((reg:SI 10) (..def1rhs..))
> > set ((reg:DI 100) (sign_extend:DI (reg:SI 10)))
> ...
> > set ((subreg:SI (reg:DI 200)) (..def1rhs..))
> > set ((reg:DI 100) (sign_extend:DI ((subreg:SI (reg:DI 200)))))
> > set ((reg:SI 10) (subreg:SI (reg:DI 200)))
> 
> I'm a bit confused by this choice of replacement.  Why not just
> 
>   (set (reg:SI 10) (def1rhs))
>   (set (reg:DI 100) (sign_extend:DI (reg:SI 10)))
> 
> what's the point of reg 200?
Currently the gcc creates a temporary register for each sign extended 
definition. This means that in the same web you can find:
def1 + se:
  set ((reg:SI 10) (..def1rhs..))
  set ((reg:DI 100) (sign_extend:DI (reg:SI 10)))
def2 + se:
  set ((reg:SI 20) (..def2rhs..))
  set ((reg:DI 100) (sign_extend:DI (reg:SI 20)))

In order to make the gcse/lcm work I need that the sign-extension 
source will be of the same register. This is why I need a new temporary 
register that will replace all the local temporary registers of the 
current web.
The trivial way is to create (reg:SI 30) for instance and replace every 
instance of reg 10 and 20 with it.
A more sophisticated way is to create a new temporary register with the 
mode of the destination and use a subreg of it. 
In the example above I've used: (subreg:SI (reg:DI 200))
This makes the transformation much more confusing but it improves the 
way that the register allocator handles this pattern. 
At the end of the optimization all the sign extensions in the web will 
have the same source and destination and in many cases will be 
allocated to the same real register. 
This may save the need for an additional real register.

> why you don't simply generate the code you want directly and try to 
> recognize it. 
I guess this can be done easily when trying to combine the definitions.

> >       /* We need to merge two hash tables.  */
> >       use_s = xmalloc (sizeof (*use_s));
> >       use_s->use_insn = new_insn;
> >       use_s->hash = htab_create (10, hash_descriptor_se_insn,
> >              eq_descriptor_se_insn, NULL);
> >       htab_traverse (((struct see_use_s *) (stn_new->value))->hash,
> >            copy_to_htab_se_insn, (PTR) use_s->hash);
> >       htab_traverse (((struct see_use_s *) (stn_old->value))->hash,
> >            copy_to_htab_se_insn, (PTR) use_s->hash);
> > 
> >       splay_tree_remove (use_ds , INSN_UID (old_insn));
> >       splay_tree_remove (use_ds , INSN_UID (new_insn));
> >       splay_tree_insert (use_ds , INSN_UID (new_insn),
> >           (splay_tree_value) use_s);
> 
> Why are you not simply insertting into the stn_new table?
The new_insn was changed and it is no longer equal to the insn in 
stn_new that has the same INSN_UID.
You are right, this could be done more efficiently by simply insertting 
into the stn_new table but also updating use_s->use_insn to hold the 
new updated insn.

> 
> >   if (!*first || !*second)
> >     *first = NOT_RELEVANT;
> >   else if ((*first == SIGN_EXTENSION_DEF)
> >       || (*second == SIGN_EXTENSION_DEF))
> >     *first = SIGN_EXTENSION_DEF;
> >   else
> >     *first = CONST_DEF;
> 
> Why is (0 union 0) NOT_RELEVANT?  Because you don't know how to 
> manipulate NOT_RELEVANT, or because it's actually not relevant?
> I might think in the later case you'd want EXTENSION_DEF.  Many
> of the extension cases will be free; we should take advantage
> of that in the case the extension is partially unused.
I'm afraid that I didn't understand what you meant here.
This optimization works on webs. Only web that all its definitions are 
sign extended is considered as relevant.
Otherwise, it has a definition that its high part contains relevant 
information and therefore a sign extension instruction couldn't be 
generated before the uses.

> > see_try_combine (rtx I1, rtx I2, rtx I3, int pattern_type)
> 
> This hackery with combine is atrocious.
I'll work on this.

> via the web.  I see no reason you couldn't plug that into LCM and read
> back the results.
I'll try that too.


Thanks again,
Leehod.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]