This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.



Re: Comment on closed PR target/9757: Gcc should use swp instruction in ARM targets


> Dear maintainers,
> 
> I would like to comment on the closed PR target/9757: Gcc should use swp 
> instruction in ARM targets
> http://gcc.gnu.org/cgi-bin/gnatsweb.pl?cmd=view%20audit-trail&database=gcc&pr=9757
> 
> It was closed because of the following:
> 
>      1) It's very slow on some processors, since it forces an external 
> bus access even if the data is already in the cache.
> 
> My argument: if we optimize for size, speed is a trade-off. We could use a 
> dedicated switch such as -fuse-swp or as part of the -Os option.

It still doesn't help if it's not safe.  See below.

> 
>      2) Its behaviour is not defined if access is made to an MMU managed 
> page that is non-cacheable/bufferable.
> 
> However, -mcpu and -mtune can be used to specify the ARM processor. If 
> that processor hasn't got an MMU, there is no such problem and this could 
> also be used to enable or disable the generation of swp.

Knowing the CPU type doesn't mean that you know enough about the memory 
system to safely use the instruction at any arbitrary address.

Anyway, that's not what -mcpu and -mtune mean.  -mcpu is purely a synonym 
for -march=<arch-of-cpu> -mtune=<cpu>.
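
To make the synonym concrete, here is a minimal sketch in C of the expansion. It is not GCC's driver code: the helper name expand_mcpu and the two-entry table are hypothetical and purely illustrative. Note that nothing in the expansion says anything about the memory system.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Illustrative table only: two well-known cores and the architecture
   each one implements.  A real compiler's table is far larger. */
struct cpu_entry {
    const char *cpu;
    const char *arch;
};

static const struct cpu_entry cpu_table[] = {
    { "arm7tdmi", "armv4t"  },
    { "xscale",   "armv5te" },
};

/* Hypothetical helper: writes the "-march=... -mtune=..." pair that
   -mcpu=<cpu> stands for.  Returns 0 on success, -1 if the CPU is not
   in the table or the buffer is too small. */
static int expand_mcpu(const char *cpu, char *out, size_t out_size)
{
    assert(cpu != NULL && out != NULL);
    for (size_t i = 0; i < sizeof cpu_table / sizeof cpu_table[0]; i++) {
        if (strcmp(cpu, cpu_table[i].cpu) == 0) {
            int n = snprintf(out, out_size, "-march=%s -mtune=%s",
                             cpu_table[i].arch, cpu);
            return (n >= 0 && (size_t)n < out_size) ? 0 : -1;
        }
    }
    return -1;
}
```

Both outputs describe an instruction set and a tuning target only; neither carries any information about caches, buffers or an MMU.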

The architecture is purely a list of those instructions which may legally 
be used; since we don't know enough about the memory system from the 
architecture we can't include swp in the list.  The tuning affects which 
instructions we select from within the list for best performance (and 
should probably be ignored when optimizing for space).  Further, the 
architecture information is considered by the compiler to be a set of 
strict super-sets (so as you select higher architecture variants the 
number of instructions available increases).  If you select a suitably 
low-numbered architecture then your code will run on any processor 
supporting that architecture or later.  As noted, this would not be 
possible if SWP/SWPB were to be used.
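
The strict super-set model above can be sketched as follows. This is a simplified illustration in C, not the compiler's actual data structures: the enum, the struct and the function insn_available are all hypothetical names, and only three architecture levels are modelled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified, ordered architecture levels (illustrative subset).
   A higher value is assumed to include every instruction of the
   lower ones. */
enum arch_level { ARCH_V3, ARCH_V4, ARCH_V5, ARCH_LEVEL_COUNT };

/* Hypothetical instruction descriptor: the lowest architecture level
   that provides the instruction. */
struct insn {
    const char *name;
    enum arch_level min_arch;
};

/* Under the super-set model, an instruction may legally be used when
   the selected architecture is at least the level that introduced it. */
static bool insn_available(const struct insn *i, enum arch_level target)
{
    assert(i != NULL);
    assert(target >= ARCH_V3 && target < ARCH_LEVEL_COUNT);
    return target >= i->min_arch;
}
```

An instruction whose safety depends on the memory system rather than on the architecture level has no sensible min_arch value in such a table, which is why SWP/SWPB do not fit the model.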

The best way of supporting this, if it is really wanted, is to create two 
new compiler builtins, __builtin_arm_swp() and __builtin_arm_swpb(), which 
expand to swp and swpb instructions.  Then a user can use these when 
SWP's semantics are really what is wanted.
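
Until such builtins exist, a user who knows the memory system is safe could get a similar effect with a small inline-asm wrapper. The following is only a sketch under stated assumptions: swap_word and the opt-in macro USE_ARM_SWP are hypothetical names, the ARM path assumes ARM (not Thumb) state on a core where swp is still available, and the fallback uses the __atomic_exchange_n builtin, which needs a reasonably recent GCC or Clang.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical wrapper: atomically stores `value` at `addr` and
   returns the previous contents.  The swp path is used only when the
   user defines USE_ARM_SWP (an assumed, user-chosen macro), i.e. when
   the user has taken responsibility for the memory system. */
static inline unsigned int swap_word(volatile unsigned int *addr,
                                     unsigned int value)
{
    assert(addr != NULL);
#if defined(USE_ARM_SWP) && defined(__arm__) && !defined(__thumb__)
    unsigned int old;
    /* Early-clobber output so `old` never shares a register with the
       inputs; "memory" tells the compiler *addr is modified. */
    __asm__ __volatile__("swp %0, %1, [%2]"
                         : "=&r"(old)
                         : "r"(value), "r"(addr)
                         : "memory");
    return old;
#else
    /* Portable fallback; the compiler picks whatever exchange
       sequence the target supports. */
    return __atomic_exchange_n(addr, value, __ATOMIC_SEQ_CST);
#endif
}
```

The point of the opt-in is the same as for the proposed builtins: the decision to use swp is made explicitly by the user, never implicitly by the compiler.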

R.

