[PATCH, ARM] attribute target (thumb,arm)

Christian Bruel christian.bruel@st.com
Thu Oct 9 11:36:00 GMT 2014


On 10/08/2014 06:56 PM, Ramana Radhakrishnan wrote:
> Hi Christian,

<< snipped agreed stuf >>
> 3) about inlining
>    I dislike inlining different modes, From a conceptual use, a user
> might want to switch mode only when changing a function's hotness.
> Usually inlining a cold function into a hot one is not what the user
> explicitly required when setting different mode attributes for them,
> __attribute__((thumb)) should not imply coldness or hotness. Inlining 
> between cold and hot functions should be done based on profile feedback. 
> The choice of compiling in "Thumb1" state for coldness is a separate one 
> because that's where the choice needs to be made.

Ideally yes. but I think that a user motivation to use target attribute
(("thumb") is to reduce code size even in the cases where PFO is not
available (libraries, kernel or user application build spec packaged
without profile data). And there are cases where static probabilities
are not enough and that a user wants it own control with gprof or oprofile.
But in this case, we could point to the __attribute__ (("cold")) on the
function ? That would probably be the best workaround to propose if we
recommend this

But here is another scenario: Using of attribute (("arm")) for exception
entry points is indeed not related to hotness. But consider a typical
thumb binary with an entry point in arm compiled in C (ex handler, a
kernel...). Today due to the file boundary the thumb part is not inlined
into the arm point. (Using -flto is not possible because the whole
gimple would be thumb).

Now, using attribute (("target")) for the functions others than the
entry point, with your approach they would all be inlined (assuming the
cost allow this) and we would end up with a arm binary instead of a
thumb binary...

But there are still 3 points  :

- At least 2 other target (i386, Powerpc) that support attribute_target
disable inlining between modes that are not subsets. I like to think
about homogeneity between targets and I find odd to have different
inlining rules...

- Scanning the function body to check for ASM_INPUT does not look very
elegant (if this matters) because the asm could well be unrelated

The only case when it will always be a win to inline thumb into arm is
when the cost of the inlined body is less than a BX instruction (but
still, with branch prediction this cost is pondered).


>
>> The compiler would take a decision that is not what the user wrote. And
>> in addition if you consider the few instructions to modify R15 to switch
>> state that would end up with more code executed in the critical path,
>> voiding a possible size of speed gain.
> I do not expect there to be any additional instructions needed to switch 
> state. If function x is inlined into function y the state would be lost 
> and the state would be in terms of the state of function x.
Yes, indeed. I was in a LCM/mode-switching thinking mode when writing
this. In this case the mode is inherited.

> Obviously if the user doesn't want inlining - the user would add 
> attributes to disable inlining. You do have extensions such as 
> __attribute__((noinline)) and __attribute__((never_inline)) to give the 
> user that control and those bits need to be used in addition.

Those attributes are overkill. They would disable inlining between
caller-callee of a same mode. This is not what we want

>
> The attribute then purely reflects then the output instruction state of 
> the function if a copy of it's body is laid out separately in the output.
>
> IMHO, the heuristics for inlining should be the best judge of when 
> functions should be inlined between one and another and we shouldn't be 
> second guessing that in the backend
>
> If there is a copy of the function to be put out by the compiler, only 
> then should we choose this based on the state of the "target" i.e. arm 
> or thumb.
>
Yes,

So to summarize, we can:

  1) don't inline between different modes. Same behavior with other
targets. Solves asm case
  2) always inline unless the function contains asm statements. ( I
reject adding a new compilation switch)
  3) always inline. But recommend the use of attribute ((noinline)) to
handle asm or attribute ((cold,hot)) in the absence of profile datas

I obviously prefer 1) safe and  homogenous, 3) is the worse as it
requires additional user action (poor user). 2) is less worse.

Thanks for supporting me ::)

Christian



More information about the Gcc-patches mailing list