This is the mail archive of the
gcc-bugs@gcc.gnu.org
mailing list for the GCC project.
[Bug tree-optimization/34043] Missed optimization causing extra loads and stores when using x86_64 builtin function together with aggregate types.
- From: "jsjodin at gcc dot gnu dot org" <gcc-bugzilla at gcc dot gnu dot org>
- To: gcc-bugs at gcc dot gnu dot org
- Date: 13 Nov 2007 17:07:16 -0000
- Subject: [Bug tree-optimization/34043] Missed optimization causing extra loads and stores when using x86_64 builtin function together with aggregate types.
- References: <bug-34043-14442@http.gcc.gnu.org/bugzilla/>
- Reply-to: gcc-bugzilla at gcc dot gnu dot org
------- Comment #5 from jsjodin at gcc dot gnu dot org 2007-11-13 17:07 -------
(In reply to comment #4)
> Related to PR 33790 (and most likely fixed by it). There is another issue with
> that bug relating to not deleting the extra store.
>
Indeed, the extra load disappeared with the patch applied. As expected, the
store did not get deleted. I looked at the differences between the good and bad
cases. Compiling the good case gives the following sequence before the fre pass
(note: src and dst are unions):
src.f = D.9650_45;
D.9630_31 = src.f;
D.9655_46 = __builtin_ia32_addps (D.9630_31, D.9630_31);
dst.f = D.9655_46;
D.9632_33 = dst.f;
After fre, the temps have been propagated and have replaced the loads from src.f and dst.f:
src.f = D.9650_45;
D.9630_31 = D.9650_45;
D.9655_46 = __builtin_ia32_addps (D.9630_31, D.9630_31);
dst.f = D.9655_46;
D.9632_33 = D.9655_46;
The extra stores to src.f are eliminated in dce.
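The test case itself is not quoted in this comment. As a rough sketch only, source
along the following lines could produce the good-case sequence above. The names
v4sf, v4si, vec_u, add_ps and good_case are assumptions made for illustration,
and the fallback to a plain vector add on compilers without the builtin is also
an assumption, not part of the original report.

```c
#include <assert.h>

/* GCC vector extension types; the typedef names are assumptions. */
typedef float v4sf __attribute__((vector_size(16)));
typedef int   v4si __attribute__((vector_size(16)));

/* Hypothetical union standing in for the type of src and dst. */
union vec_u {
    v4sf f;
    v4si i;
};

/* Both members are 128 bits wide, so the union is one vector in size. */
static_assert(sizeof(union vec_u) == 16, "union should be one 128-bit vector");

/* Wrapper so the sketch also builds where the x86 builtin is unavailable. */
static v4sf add_ps(v4sf a, v4sf b)
{
#if defined(__SSE__) && defined(__GNUC__) && !defined(__clang__)
    return __builtin_ia32_addps(a, b);
#else
    return a + b;   /* generic vector add as a stand-in */
#endif
}

/* Good case: every store and load goes through the same member, .f. */
v4sf good_case(v4sf x)
{
    union vec_u src, dst;
    src.f = x;                      /* src.f = D.9650_45           */
    dst.f = add_ps(src.f, src.f);   /* load src.f, store dst.f     */
    return dst.f;                   /* D.9632_33 = dst.f           */
}
```

Because the stored and loaded member is the same in each pair, fre can replace
each load with the value that was just stored.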
The bad case has the following code before and after fre:
src.i = D.9651_44;
D.9630_31 = src.f;
D.9655_45 = __builtin_ia32_addps (D.9630_31, D.9630_31);
dst.f = D.9655_45;
D.9632_33 = dst.i;
Since src.i and src.f are probably not considered to be the same, the
propagation does not work. It might be possible to handle this case if one
considers the size of the data being written to and read from the unions.
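For illustration, the bad case could come from source like the sketch below,
where the store goes through one union member and the load through the other.
Again, v4sf, v4si, vec_u, add_ps and bad_case are hypothetical names, and the
compile-time size check only illustrates the kind of condition suggested above;
it is not how fre is implemented.

```c
#include <assert.h>

/* GCC vector extension types; the typedef names are assumptions. */
typedef float v4sf __attribute__((vector_size(16)));
typedef int   v4si __attribute__((vector_size(16)));

/* Hypothetical union standing in for the type of src and dst. */
union vec_u {
    v4sf f;
    v4si i;
};

/* The two members cover exactly the same bytes, which is the property a
   size-based check could rely on when matching a store to a later load. */
static_assert(sizeof(v4sf) == sizeof(v4si), "members must have equal size");

/* Wrapper so the sketch also builds where the x86 builtin is unavailable. */
static v4sf add_ps(v4sf a, v4sf b)
{
#if defined(__SSE__) && defined(__GNUC__) && !defined(__clang__)
    return __builtin_ia32_addps(a, b);
#else
    return a + b;   /* generic vector add as a stand-in */
#endif
}

/* Bad case: stores and loads use different members of the same union. */
v4si bad_case(v4si x)
{
    union vec_u src, dst;
    src.i = x;                      /* src.i = D.9651_44           */
    dst.f = add_ps(src.f, src.f);   /* load src.f, store dst.f     */
    return dst.i;                   /* D.9632_33 = dst.i           */
}
```

The result is the same bit pattern as in the good case, only reinterpreted, so
the loads are redundant in principle even though the member names differ.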
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=34043