Bug 89804 - optimization opportunity: move variable from stack to register
Summary: optimization opportunity: move variable from stack to register
Status: NEW
Alias: None
Product: gcc
Classification: Unclassified
Component: tree-optimization
Version: 8.3.0
Importance: P3 enhancement
Target Milestone: ---
Assignee: Not yet assigned to anyone
URL:
Keywords: missed-optimization
Depends on:
Blocks:
 
Reported: 2019-03-24 06:36 UTC by Eugene
Modified: 2021-08-16 01:25 UTC

See Also:
Host:
Target:
Build:
Known to work:
Known to fail:
Last reconfirmed: 2021-08-15 00:00:00


Description Eugene 2019-03-24 06:36:49 UTC
The description is essentially here: https://godbolt.org/z/vULPAZ

As far as I understand from https://stackoverflow.com/questions/55314885/compiler-optimization-move-variable-from-stack-to-register, a compiler is allowed to optimize away the stack variable here and keep it in a register.
Comment 1 Andrew Pinski 2019-03-24 21:18:33 UTC
Here is the testcase (and removed use of the headers):
typedef unsigned long long uint64_t;

uint64_t uint5korr(const unsigned char *p)
{
  uint64_t result = 0;
  __builtin_memcpy(&result, p, 5);
  return result;
}
---- CUT ---
Comment 2 Andrew Pinski 2019-03-24 21:21:17 UTC
(In reply to Andrew Pinski from comment #1)
> Here is the testcase (and removed use of the headers):
> typedef unsigned long long uint64_t;
> 
> uint64_t uint5korr(const unsigned char *p)
> {
>   uint64_t result = 0;
>   __builtin_memcpy(&result, p, 5);
>   return result;
> }
> ---- CUT ---

The memcpy could be changed into a BIT_INSERT_EXPR, which will then be optimized correctly.
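
To illustrate the idea at the source level, here is a rough sketch of what a bit-insert formulation of the testcase could look like. It assumes a little-endian target; `bit_insert` and `uint5korr_insert` are hypothetical names used only for this illustration, and the actual GIMPLE representation may well differ.

```c
#include <stdint.h>
#include <string.h>

/* Reference version: the testcase from comment #1, using standard headers. */
uint64_t uint5korr_memcpy(const unsigned char *p)
{
  uint64_t result = 0;
  memcpy(&result, p, 5);
  return result;
}

/* Hypothetical helper: replace `width` bits of `dst`, starting at bit
   `pos`, with the low `width` bits of `val`.  */
static uint64_t bit_insert(uint64_t dst, uint64_t val,
                           unsigned pos, unsigned width)
{
  uint64_t field = width < 64 ? (UINT64_C(1) << width) - 1 : ~UINT64_C(0);
  uint64_t mask = field << pos;
  return (dst & ~mask) | ((val << pos) & mask);
}

/* Sketch (little-endian assumption): a 4-byte load plus a 1-byte load,
   each inserted into a value that never needs a stack slot.  */
uint64_t uint5korr_insert(const unsigned char *p)
{
  uint32_t lo;
  memcpy(&lo, p, 4);          /* power-of-two sized copy */
  uint8_t hi = p[4];

  uint64_t result = 0;
  result = bit_insert(result, lo, 0, 32);
  result = bit_insert(result, hi, 32, 8);
  return result;
}
```

On a little-endian target both functions should return the same value for any input; the second one just spells out the register-only form by hand.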
Comment 3 Richard Biener 2019-03-25 09:12:09 UTC
The current decision is to limit GIMPLE-level memcpy "inlining" to single moves
of power-of-two (mode-precision) size and leave the rest to RTL expansion.
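
As a small illustration of the distinction (the function names `copy4` and `copy5` are made up for this example), the first copy below has a power-of-two size matching an integer mode, so it is the kind of memcpy that can be turned into a single move at the GIMPLE level; the second is the 5-byte case from this report, which is left to RTL expansion. Comparing the output of `-O2 -fdump-tree-optimized` for the two is one way to check what a given GCC version actually does.

```c
#include <stdint.h>
#include <string.h>

/* 4 bytes: power-of-two size, matches a 32-bit integer mode. */
uint32_t copy4(const unsigned char *p)
{
  uint32_t v;
  memcpy(&v, p, 4);
  return v;
}

/* 5 bytes: not a power of two, same shape as the testcase in comment #1. */
uint64_t copy5(const unsigned char *p)
{
  uint64_t v = 0;
  memcpy(&v, p, 5);
  return v;
}
```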
Comment 4 Segher Boessenkool 2019-04-22 12:28:54 UTC
That doesn't sound too hard to fix, no?

Expand should expand and not do all kinds of other things.  Also, doing this
optimisation in RTL is much harder to do than in gimple, I think.