This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.
Optimization Option Question
- From: "Tangnianyao (ICT)" <tangnianyao at huawei dot com>
- To: "gcc at gcc dot gnu dot org" <gcc at gcc dot gnu dot org>
- Date: Wed, 19 Dec 2018 08:10:46 +0000
- Subject: Optimization Option Question
I am comparing compiler optimizations between the arm64 and Intel x86-64 platforms, using g++ (version 4.9.4).
We compile the following C++ code:
uint32 Witness::getEntityVolatileDataUpdateFlags(Entity* otherEntity)
{
    uint32 flags = UPDATE_FLAG_NULL;
    if (otherEntity->controlledBy() && pEntity_->id() == otherEntity->controlledBy()->id())
    ...
The function begins with successive memory loads, where each later load depends on the result of the earlier one.
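For anyone who wants to reproduce this, here is a minimal self-contained sketch of the pattern. The Entity and Witness definitions, the flag values, and the return paths are all assumptions made up for illustration; they are not the real project code. The only point is the chain of dependent loads at function entry.

```cpp
#include <cassert>

// Hypothetical stand-ins: the real types and flag values are not shown in this mail.
typedef unsigned int uint32;
const uint32 UPDATE_FLAG_NULL  = 0;
const uint32 UPDATE_FLAG_OTHER = 1;  // assumed placeholder for "some other flag"

class Entity {
public:
    Entity(int id, Entity* controller) : id_(id), controlledBy_(controller) {}
    int id() const { return id_; }
    Entity* controlledBy() const { return controlledBy_; }
private:
    int id_;
    Entity* controlledBy_;
};

class Witness {
public:
    explicit Witness(Entity* e) : pEntity_(e) {}
    uint32 getEntityVolatileDataUpdateFlags(Entity* otherEntity);
private:
    Entity* pEntity_;
};

uint32 Witness::getEntityVolatileDataUpdateFlags(Entity* otherEntity)
{
    uint32 flags = UPDATE_FLAG_NULL;
    // Load 1: otherEntity->controlledBy_ (a pointer).
    // Load 2: the id of that controller, which cannot start until load 1 returns.
    if (otherEntity->controlledBy() && pEntity_->id() == otherEntity->controlledBy()->id())
        return flags;
    return UPDATE_FLAG_OTHER;  // assumed: the real function continues here
}

// Small sanity check of the sketch's behaviour.
void checkSketch()
{
    Entity self(1, 0);
    Entity other(2, &self);
    Witness w(&self);
    assert(w.getEntityVolatileDataUpdateFlags(&other) == UPDATE_FLAG_NULL);
}
```

Compiling a file like this with g++ -S on each platform lets the generated function entry sequences be compared directly.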
In the code generated on the Intel x86-64 platform, we find that g++ places one of the load instructions ahead of the stack push instructions at the function entry. On an out-of-order processor, this can hide some of the time spent waiting for the earlier load to finish. We use the optimization options -O1, -fpartial-inlining, -fschedule-insns2, and -ftree-pre.
On the arm64 platform, we compile the same code with the same optimization options and do not see a similar result: no load instruction is placed before the stack push instructions. We do not get that result with -O2, -O3, or -Ofast on arm64 either.
Does g++ have a similar optimization on arm64?
If yes, which optimization options can I use?