This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.



Re: Optimization breaks inline asm code w/ptrs


On 8/12/2017 10:14 PM, Andrew Pinski wrote:
On Sat, Aug 12, 2017 at 10:08 PM, Andrew Pinski <pinskia@gmail.com> wrote:
On Sat, Aug 12, 2017 at 9:21 PM, David Wohlferd <dw@limegreensocks.com> wrote:
Environment:
gcc 6.1
compiling for 64bit i386
optimizations: -O2

Consider this simple bit of code (from
https://stackoverflow.com/a/45656087/2189500):

#include <stdio.h>

int getStringLength(const char *pStr){

     int len;

     __asm__  (
         "repne scasb\n\t"
         "not %%ecx\n\t"
         "dec %%ecx"
         :"=c" (len), "+D"(pStr)
         :"c"(-1), "a"(0)
     );

     return len;
}

int main()
{
    char buff[50] = "hello world";
    int a = getStringLength(buff);
    printf("%s: %d\n", buff, a);
}

This code works as expected and prints out 11.  Yay.

However, if you add "buff[4] = 0;" before the call to getStringLength, it
STILL prints out 11 (when optimizations are enabled), when it should print
4.

I would expect this kind of behavior if the asm were in 'main.' But it has
always been my understanding that function calls performed an implicit
memory clobber.  The fact that this clobber goes away during inlining means
that code can stop working any time the compiler makes a different decision
about whether or not to inline a function.  Ouch.

And before somebody asks:  Adding "+m"(pStr) does not help.
But does adding "+m"(*pStr) help?


"+m"(pStr) just says that the pStr variable itself changes, not what it points to.

Using "+m"(*pStr) gives a compile error (read-only location used as output).

Using "m"(*pStr) as an (unused) input parameter has no effect.

Using "m"(*pStr) as a used input parameter can work.

But let's not get off track here. The purpose of this question isn't "how do I make this code work?" I can already give you at least 3 different work-arounds. The goal here is to understand how function parameters can (safely) be passed to inline asm.

I should ask why are you using inline-asm for this?  strlen will have
the best optimized version for your processor anyways.

I don't actually care about this particular example. It's just a question I was answering for a user on SO (see link above) when someone pointed out the subtle, but more serious problem. So it just serves as a 'minimal' example to illustrate the issue I'm asking about, which is:

The result is that (apparently) you can NEVER safely pass a buffer pointer
to inline asm without using the memory clobber.  If this is true, I don't
believe it is widely known.
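For reference, here is a minimal sketch of the memory-clobber variant mentioned above. The function name getStringLengthClobber is made up for illustration, and the non-x86 fallback loop is my own addition so the snippet compiles elsewhere. The only change to the asm from the original example is the added "memory" clobber, which tells the compiler the asm may read or write arbitrary memory, so pending stores such as "buff[4] = 0;" have to be completed before the asm runs.

```c
/* Sketch only: the original example plus a "memory" clobber.
 * The name getStringLengthClobber is hypothetical. */
static inline int getStringLengthClobber(const char *pStr)
{
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
    int len;

    __asm__ (
        "repne scasb\n\t"
        "not %%ecx\n\t"
        "dec %%ecx"
        : "=c" (len), "+D" (pStr)
        : "c" (-1), "a" (0)
        : "memory"   /* asm may access memory the operands do not describe */
    );

    return len;
#else
    /* Assumption: portable fallback for non-x86 targets, not in the original. */
    int len = 0;

    while (pStr[len] != '\0')
        len++;
    return len;
#endif
}
```

Calling this from the main() above after adding "buff[4] = 0;" should give 4 whether or not the call is inlined, at the cost of the compiler treating all memory as potentially touched by the asm.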

This seems like a "by definition" thing. By definition when writing a function, you should assume one of these two things:

1) It is perfectly reasonable for a function to assume that on entry all pointers to memory that the function can access have been clobbered, and thus those pointers can be safely passed to inline asm.

2) It is never (never never never) reasonable for a function to assume on entry (or indeed at any time) that a pointer to memory can be used to read that memory via inline asm unless either a memory constraint is used or a memory clobber is included.

Observation suggests that despite my expectations, #1 is false, which implies #2 is correct. I don't know that this is generally understood. Passing function parameters (including pointers) to inline asm is not uncommon. If #2 is the rule, I expect I'm not the first person to break it.
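If #2 is the rule, the "memory constraint" option can be sketched as follows. This assumes the idiom of casting the pointer to a pointer-to-array of unknown size, so that the "m" operand covers the whole string rather than the single char that "m"(*pStr) describes. The function name getStringLengthMemInput and the non-x86 fallback are made up for illustration. The operand is never referenced in the asm template; it exists only to tell the compiler that the asm reads the memory pStr points to.

```c
/* Sketch only: the original example plus a dummy memory input operand.
 * The name getStringLengthMemInput is hypothetical. */
static inline int getStringLengthMemInput(const char *pStr)
{
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
    int len;

    __asm__ (
        "repne scasb\n\t"
        "not %%ecx\n\t"
        "dec %%ecx"
        : "=c" (len), "+D" (pStr)
        : "m" (*(const char (*)[]) pStr),  /* asm reads the string at pStr */
          "c" (-1), "a" (0)
    );

    return len;
#else
    /* Assumption: portable fallback for non-x86 targets, not in the original. */
    int len = 0;

    while (pStr[len] != '\0')
        len++;
    return len;
#endif
}
```

Compared with a full "memory" clobber, this only tells the compiler about the one buffer the asm reads, so unrelated values can stay cached in registers across the asm.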

Next question: Is this by design? Just because it does behave this way doesn't mean that it should.

As I say, I expected that calling a function always does an implied clobber (at least for escaped pointers if not a complete clobber). And indeed, if getStringLength isn't inlined (i.e., via the noinline attribute), the code always works as expected. If there is an implicit clobber for noinline, why wouldn't there be one for inline? Wouldn't this be something inherent in the definition of "declaring and calling a function?"

And perhaps more significantly: This exact same code compiled with -m32 instead of -m64 works correctly (not sure about other architectures). Could this behavior be an unintended consequence of some overly aggressive optimization related to 64bit inlining?

How SHOULD this work?

Given how 'heavy' memory clobbers are, I would hope that only pointers that
have 'escaped' the function would get flushed before a function call.  But
not flushing *anything* seems very bad.

dw

