This is the mail archive of the gcc-bugs@gcc.gnu.org mailing list for the GCC project.



[Bug target/56199] strcpy/strcat builtins for constant strings generates suboptimal code.


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56199

Ondrej Bilka <neleai at seznam dot cz> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|RESOLVED                    |UNCONFIRMED
         Resolution|INVALID                     |

--- Comment #4 from Ondrej Bilka <neleai at seznam dot cz> 2013-02-04 15:19:22 UTC ---
> It should be faster if the string is not in the cache.  Which of course it is
> for your testcase (because you have an artificial loop here).

That is also the expected case, because you did the expansion: it should be on
a hot path, where the string will be in cache. Otherwise, not doing the
expansion at all is faster.

Since you mention cache behaviour: that also includes the instruction cache,
and the current implementation is quite hostile to the instruction cache (see
another benchmark).
Cases where 

> So the benchmark does not show that the transform is bad but instead it shows
> that if repeatedly initializing sth from the same (large) constants then it's
> profitable to use a smaller instruction encoding.
One property of a benchmark is minimality. I could write a benchmark with
strcpy called at five places with different strings and more complex control
flow, if that is your point.
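
A minimal sketch of what such a multi-call-site benchmark might look like. This
is not the benchmark attached to the bug: the names copy_at_site and
run_copies, the strings, and the buffer size are all hypothetical, and the
timing harness is left out. It only shows the shape: several strcpy calls, each
with a different constant string, reached through a switch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define BUF_SIZE 64   /* hypothetical; large enough for every string below */

/* Five call sites, each copying a different constant string. With the
   builtin enabled, GCC may expand each strcpy separately at its call site.
   noinline keeps the call sites from being folded into the caller. */
__attribute__((noinline))
size_t copy_at_site(char *buf, int site)
{
    assert(buf != NULL);
    switch (site) {
    case 0: strcpy(buf, "alpha"); break;
    case 1: strcpy(buf, "bravo bravo"); break;
    case 2: strcpy(buf, "charlie charlie"); break;
    case 3: strcpy(buf, "delta delta delta"); break;
    case 4: strcpy(buf, "echo"); break;
    default: buf[0] = '\0'; break;
    }
    return strlen(buf);
}

/* Visits every call site `iters` times and returns the total number of
   bytes copied, so the result depends on the copies. */
size_t run_copies(size_t iters)
{
    char buf[BUF_SIZE];
    size_t total = 0;
    for (size_t i = 0; i < iters; i++)
        for (int site = 0; site < 5; site++)
            total += copy_at_site(buf, site);
    return total;
}
```

Building this once normally and once with -fno-builtin-strcpy, then timing
run_copies in each build, would be one way to compare the expanded code with
the library call.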

>  But of course that's again
> likely only true when 'cpy' is not inlined - in which case we cannot
> distinguish the cases.
Please explain.

And for strings larger than 128 bytes you inline a repne strcpy variant that is
slower than calling strcpy.

