This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.

Optimization Option Question

From: "Tangnianyao (ICT)" <tangnianyao at huawei dot com>
To: "gcc at gcc dot gnu dot org" <gcc at gcc dot gnu dot org>
Date: Wed, 19 Dec 2018 08:10:46 +0000
Subject: Optimization Option Question

Greetings All,
I am comparing compiler optimizations between the arm64 and Intel x86-64 platforms, using g++ (version 4.9.4).

We compile the following C++ code:

uint32 Witness::getEntityVolatileDataUpdateFlags(Entity* otherEntity)
{
    uint32 flags = UPDATE_FLAG_NULL;

    if (otherEntity->controlledBy() && pEntity_->id() == otherEntity->controlledBy()->id())
        return flags;

    ...
}

The function begins with successive memory loads, where each later load depends on the result of the earlier one.
When compiling for Intel x86-64, we find that g++ places one of the load instructions ahead of the stack push instructions at function entry. On an out-of-order processor, this can hide some of the time spent waiting for the earlier load to finish. We use the optimization options -O1, -fpartial-inlining, -fschedule-insns2, and -ftree-pre.
On arm64, we compile the same code with the same options and do not see a similar result: no load instruction is placed before the stack push instructions. We also do not get that result when compiling with -O2, -O3, or -Ofast on arm64.
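
For anyone who wants to reproduce this, a reduced test case might look like the sketch below. It is not our real code: the Entity and Witness layouts, the UPDATE_FLAG_OTHER constant, and the noinline attribute are assumptions added only so that the file compiles on its own and the function keeps its own entry sequence. What it preserves is the dependent chain of loads (otherEntity->controller_, then controller->id_).

```cpp
#include <stdint.h>

typedef uint32_t uint32;

// Hypothetical stand-ins for the real classes; the layouts are assumptions.
struct Entity {
    int32_t id_;
    Entity* controller_;  // null when the entity is not controlled by anyone

    int32_t id() const { return id_; }
    Entity* controlledBy() const { return controller_; }
};

const uint32 UPDATE_FLAG_NULL  = 0;
const uint32 UPDATE_FLAG_OTHER = 1;  // hypothetical; stands in for the elided logic

struct Witness {
    Entity* pEntity_;
    uint32 getEntityVolatileDataUpdateFlags(Entity* otherEntity);
};

// noinline (an assumption, not in the original) keeps this as a standalone
// function so that its entry code is visible in the generated assembly.
__attribute__((noinline))
uint32 Witness::getEntityVolatileDataUpdateFlags(Entity* otherEntity)
{
    uint32 flags = UPDATE_FLAG_NULL;

    // First load: otherEntity->controller_.
    Entity* controller = otherEntity->controlledBy();

    // Second load depends on the first: controller->id_.
    if (controller && pEntity_->id() == controller->id())
        return flags;

    // Placeholder for the part of the function elided above.
    flags |= UPDATE_FLAG_OTHER;
    return flags;
}
```

Saved as, say, sketch.cpp (a hypothetical file name), it can be compiled with g++ -O1 -fpartial-inlining -fschedule-insns2 -ftree-pre -S sketch.cpp on each platform and the position of the first load relative to the stack pushes compared in the resulting .s files.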

Does g++ have a similar optimization on arm64?
If so, which optimization options should I use to enable it?

Thanks,
-Nianyao Tang

Follow-Ups:
- Re: Optimization Option Question
  - From: David Brown