Bug 70646 - [4.9/5/6 Regression] Corrupt truncated function
Summary: [4.9/5/6 Regression] Corrupt truncated function
Status: RESOLVED FIXED
Alias: None
Product: gcc
Classification: Unclassified
Component: ipa (show other bugs)
Version: 5.3.1
: P2 normal
Target Milestone: 4.9.4
Assignee: Martin Jambor
URL:
Keywords: wrong-code
Depends on:
Blocks:
 
Reported: 2016-04-13 15:05 UTC by Josh Poimboeuf
Modified: 2016-10-13 14:28 UTC (History)
8 users (show)

See Also:
Host:
Target:
Build:
Known to work: 4.8.5, 7.0
Known to fail: 4.9.3, 5.3.0, 6.1.0
Last reconfirmed: 2016-04-14 00:00:00


Attachments
qla_attr.i.gz (442.88 KB, application/x-gzip)
2016-04-13 15:05 UTC, Josh Poimboeuf
Details
Linux kernel config (19.93 KB, text/plain)
2016-04-13 15:06 UTC, Josh Poimboeuf
Details
Fix storing access size to conditions (5.13 KB, patch)
2016-04-20 16:45 UTC, Martin Jambor
Details | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Josh Poimboeuf 2016-04-13 15:05:36 UTC
Created attachment 38255 [details]
qla_attr.i.gz

The linux kernel has a new tool named "objtool" which follows all possible code paths for every .o file, looking for abnormalities.  In one rare case it has discovered a corrupt truncated function.

From the disassembly of drivers/scsi/qla2xxx/qla_attr.o:

0000000000002f53 <qla2x00_get_host_fabric_name>:
    2f53:       55                      push   %rbp
    2f54:       48 89 e5                mov    %rsp,%rbp

0000000000002f57 <qla2x00_get_fc_host_stats>:
    2f57:       55                      push   %rbp
    2f58:       b9 e8 00 00 00          mov    $0xe8,%ecx
    2f5d:       48 89 e5                mov    %rsp,%rbp
...

Note that qla2x00_get_host_fabric_name() is inexplicably truncated after
setting up the frame pointer.  It falls through to the next function, which is
very bad.

I can recreate it with gcc 5.3.1 or gcc 6.0 on the upstream Linux kernel at tag v4.6-rc3.

The call chain which appears to trigger the problem is:

qla2x00_get_host_fabric_name()
  wwn_to_u64()
    get_unaligned_be64()
      be64_to_cpup()
        __be64_to_cpup()

It occurs with the combination of the following two recent Linux commits:

- bc27fb68aaad ("include/uapi/linux/byteorder, swab: force inlining of some byteswap operations")
- ef3fb2422ffe ("scsi: fc: use get/put_unaligned64 for wwn access")


The gzipped .i file is attached.  I'll also attach the kernel .config file.
Comment 1 Josh Poimboeuf 2016-04-13 15:06:17 UTC
Created attachment 38256 [details]
Linux kernel config
Comment 2 Josh Poimboeuf 2016-04-13 15:07:40 UTC
$ gcc -Wp,-MD,drivers/scsi/qla2xxx/.qla_attr.o.d  -nostdinc -isystem /usr/lib/gcc/x86_64-redhat-linux/5.3.1/include -I./arch/x86/include -Iarch/x86/include/generated/uapi -Iarch/x86/include/generated  -Iinclude -I./arch/x86/include/uapi -Iarch/x86/include/generated/uapi -I./include/uapi -Iinclude/generated/uapi -include ./include/linux/kconfig.h -D__KERNEL__ -Wall -Wundef -Wstrict-prototypes -Wno-trigraphs -fno-strict-aliasing -fno-common -Werror-implicit-function-declaration -Wno-format-security -std=gnu89 -mno-sse -mno-mmx -mno-sse2 -mno-3dnow -mno-avx -m64 -falign-jumps=1 -falign-loops=1 -mno-80387 -mno-fp-ret-in-387 -mpreferred-stack-boundary=3 -mskip-rax-setup -mtune=generic -mno-red-zone -mcmodel=kernel -funit-at-a-time -maccumulate-outgoing-args -DCONFIG_X86_X32_ABI -DCONFIG_AS_CFI=1 -DCONFIG_AS_CFI_SIGNAL_FRAME=1 -DCONFIG_AS_CFI_SECTIONS=1 -DCONFIG_AS_FXSAVEQ=1 -DCONFIG_AS_SSSE3=1 -DCONFIG_AS_CRC32=1 -DCONFIG_AS_AVX=1 -DCONFIG_AS_AVX2=1 -DCONFIG_AS_SHA1_NI=1 -DCONFIG_AS_SHA256_NI=1 -pipe -Wno-sign-compare -fno-asynchronous-unwind-tables -fno-delete-null-pointer-checks -Os -Wno-maybe-uninitialized --param=allow-store-data-races=0 -Wframe-larger-than=2048 -fno-stack-protector -Wno-unused-but-set-variable -fno-omit-frame-pointer -fno-optimize-sibling-calls -fno-var-tracking-assignments -fno-inline-functions-called-once -Wdeclaration-after-statement -Wno-pointer-sign -fno-strict-overflow -fconserve-stack -Werror=implicit-int -Werror=strict-prototypes -Werror=date-time -Werror=incompatible-pointer-types -DCC_HAVE_ASM_GOTO    -D"KBUILD_STR(s)=#s" -D"KBUILD_BASENAME=KBUILD_STR(qla_attr)"  -D"KBUILD_MODNAME=KBUILD_STR(qla2xxx)" -c -o drivers/scsi/qla2xxx/qla_attr.o drivers/scsi/qla2xxx/qla_attr.c

$ gcc -v
Using built-in specs.
COLLECT_GCC=/usr/bin/gcc
COLLECT_LTO_WRAPPER=/usr/libexec/gcc/x86_64-redhat-linux/5.3.1/lto-wrapper
Target: x86_64-redhat-linux
Configured with: ../configure --enable-bootstrap --enable-languages=c,c++,objc,obj-c++,fortran,ada,go,lto --prefix=/usr --mandir=/usr/share/man --infodir=/usr/share/info --with-bugurl=http://bugzilla.redhat.com/bugzilla --enable-shared --enable-threads=posix --enable-checking=release --enable-multilib --with-system-zlib --enable-__cxa_atexit --disable-libunwind-exceptions --enable-gnu-unique-object --enable-linker-build-id --with-linker-hash-style=gnu --enable-plugin --enable-initfini-array --disable-libgcj --with-default-libstdcxx-abi=gcc4-compatible --with-isl --enable-libmpx --enable-gnu-indirect-function --with-tune=generic --with-arch_32=i686 --build=x86_64-redhat-linux
Thread model: posix
gcc version 5.3.1 20151207 (Red Hat 5.3.1-2) (GCC)
Comment 3 Denis Vlasenko 2016-04-13 19:14:17 UTC
I can reproduce it with:

$ gcc -v
Using built-in specs.
COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=/usr/libexec/gcc/x86_64-redhat-linux/5.3.1/lto-wrapper
Target: x86_64-redhat-linux
Configured with: ../configure --enable-bootstrap --enable-languages=c,c++,objc,obj-c++,fortran,ada,go,lto --prefix=/usr --mandir=/usr/share/man --infodir=/usr/share/info --with-bugurl=http://bugzilla.redhat.com/bugzilla --enable-shared --enable-threads=posix --enable-checking=release --enable-multilib --with-system-zlib --enable-__cxa_atexit --disable-libunwind-exceptions --enable-gnu-unique-object --enable-linker-build-id --with-linker-hash-style=gnu --enable-plugin --enable-initfini-array --disable-libgcj --with-isl --enable-libmpx --enable-gnu-indirect-function --with-tune=generic --with-arch_32=i686 --build=x86_64-redhat-linux
Thread model: posix
gcc version 5.3.1 20160406 (Red Hat 5.3.1-6) (GCC) 

No fancy compiler flags are necessary to thigger it.

Without "-fno-omit-frame-pointer", function loses its two remaining insns, I see an empty body:

	.type	qla2x00_get_host_fabric_name, @function
qla2x00_get_host_fabric_name:
.LFB4504:
	.cfi_startproc
	.cfi_endproc
.LFE4504:
	.size	qla2x00_get_host_fabric_name, .-qla2x00_get_host_fabric_name

Simple "gcc -Os qla_attr.i.c -S" would do.

gcc -O2 produces a normally-looking function.
Comment 4 Denis Vlasenko 2016-04-13 19:35:32 UTC
Shorter reproducer:

typedef __signed__ char __s8;
typedef unsigned char __u8;
typedef __signed__ short __s16;
typedef unsigned short __u16;
typedef __signed__ int __s32;
typedef unsigned int __u32;
__extension__ typedef __signed__ long long __s64;
__extension__ typedef unsigned long long __u64;
typedef signed char s8;
typedef unsigned char u8;
typedef signed short s16;
typedef unsigned short u16;
typedef signed int s32;
typedef unsigned int u32;
typedef signed long long s64;
typedef unsigned long long u64;
typedef __u64 __be64;
static inline __attribute__((no_instrument_function)) __attribute__((__const__)) __u64 __fswab64(__u64 val)
{
 return __builtin_bswap64(val);
}
static inline __attribute__((no_instrument_function)) __attribute__((always_inline)) __u64 __swab64p(const __u64 *p)
{
 return (__builtin_constant_p((__u64)(*p)) ? ((__u64)( (((__u64)(*p) & (__u64)0x00000000000000ffULL) << 56) | (((__u64)(*p) & (__u64)0x000000000000ff00ULL) << 40) | (((__u64)(*p) & (__u64)0x0000000000ff0000ULL) << 24) | (((__u64)(*p) & (__u64)0x00000000ff000000ULL) << 8) | (((__u64)(*p) & (__u64)0x000000ff00000000ULL) >> 8) | (((__u64)(*p) & (__u64)0x0000ff0000000000ULL) >> 24) | (((__u64)(*p) & (__u64)0x00ff000000000000ULL) >> 40) | (((__u64)(*p) & (__u64)0xff00000000000000ULL) >> 56))) : __fswab64(*p));
}
static inline __attribute__((no_instrument_function)) __attribute__((always_inline)) __u64 __be64_to_cpup(const __be64 *p)
{
 return __swab64p((__u64 *)p);
}
static inline __attribute__((no_instrument_function)) __attribute__((always_inline)) u64 get_unaligned_be64(const void *p)
{
 return __be64_to_cpup((__be64 *)p);
}
static inline __attribute__((no_instrument_function)) u64 wwn_to_u64(u8 *wwn)
{
 return get_unaligned_be64(wwn);
}

struct Scsi_Host {
 unsigned long base;
 unsigned long io_port;
 unsigned char n_io_port;
 unsigned char dma_channel;
 unsigned int irq;
 void *shost_data;
 unsigned long hostdata[0]
  __attribute__ ((aligned (sizeof(unsigned long))));
};
static inline __attribute__((no_instrument_function)) void *shost_priv(struct Scsi_Host *shost)
{
 return (void *)shost->hostdata;
}
typedef struct scsi_qla_host {
 u8 fabric_node_name[8];
 u32 device_flags;
} scsi_qla_host_t;
struct fc_host_attrs {
 u64 node_name;
 u64 port_name;
 u64 permanent_port_name;
 u32 supported_classes;
 u8 supported_fc4s[32];
 u32 supported_speeds;
 u32 maxframe_size;
 u16 max_npiv_vports;
 char serial_number[80];
 char manufacturer[80];
 char model[256];
 char model_description[256];
 char hardware_version[64];
 char driver_version[64];
 char firmware_version[64];
 char optionrom_version[64];
 u32 port_id;
 u8 active_fc4s[32];
 u32 speed;
 u64 fabric_name;
};

static void
qla2x00_get_host_fabric_name(struct Scsi_Host *shost)
{
 scsi_qla_host_t *vha = shost_priv(shost);
 u8 node_name[8] = { 0xFF, 0xFF, 0xFF, 0xFF,
  0xFF, 0xFF, 0xFF, 0xFF};
 u64 fabric_name = wwn_to_u64(node_name);

 if (vha->device_flags & 0x1)
  fabric_name = wwn_to_u64(vha->fabric_node_name);

 (((struct fc_host_attrs *)(shost)->shost_data)->fabric_name) = fabric_name;
}

void *get_host_fabric_name = qla2x00_get_host_fabric_name;
Comment 5 Denis Vlasenko 2016-04-13 19:46:19 UTC
Even smaller reproducer.

Bug disappears if "__attribute__((always_inline))" is removed everywhere.


typedef unsigned char u8;
typedef unsigned int u32;
typedef unsigned long long u64;
static inline __attribute__((__const__)) u64 __fswab64(u64 val)
{
 return __builtin_bswap64(val);
}
static inline __attribute__((always_inline)) u64 __swab64p(const u64 *p)
{
 return (__builtin_constant_p((u64)(*p)) ? ((u64)( (((u64)(*p) & (u64)0x00000000000000ffULL) << 56) | (((u64)(*p) & (u64)0x000000000000ff00ULL) << 40) | (((u64)(*p) & (u64)0x0000000000ff0000ULL) << 24) | (((u64)(*p) & (u64)0x00000000ff000000ULL) << 8) | (((u64)(*p) & (u64)0x000000ff00000000ULL) >> 8) | (((u64)(*p) & (u64)0x0000ff0000000000ULL) >> 24) | (((u64)(*p) & (u64)0x00ff000000000000ULL) >> 40) | (((u64)(*p) & (u64)0xff00000000000000ULL) >> 56))) : __fswab64(*p));
}
static inline __attribute__((always_inline)) u64 __be64_to_cpup(const u64 *p)
{
 return __swab64p((u64 *)p);
}
static inline __attribute__((always_inline)) u64 get_unaligned_be64(const void *p)
{
 return __be64_to_cpup((u64 *)p);
}
static inline u64 wwn_to_u64(u8 *wwn)
{
 return get_unaligned_be64(wwn);
}

struct Scsi_Host {
 void *shost_data;
 unsigned long hostdata[0];
};
static inline void *shost_priv(struct Scsi_Host *shost)
{
 return (void *)shost->hostdata;
}
typedef struct scsi_qla_host {
 u8 fabric_node_name[8];
 u32 device_flags;
} scsi_qla_host_t;
struct fc_host_attrs {
 u64 fabric_name;
};

static void
qla2x00_get_host_fabric_name(struct Scsi_Host *shost)
{
 scsi_qla_host_t *vha = shost_priv(shost);
 u8 node_name[8] = { 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF};
 u64 fabric_name = wwn_to_u64(node_name);
 if (vha->device_flags & 0x1)
  fabric_name = wwn_to_u64(vha->fabric_node_name);
 (((struct fc_host_attrs *)(shost)->shost_data)->fabric_name) = fabric_name;
}

void *get_host_fabric_name = qla2x00_get_host_fabric_name;
Comment 6 Denis Vlasenko 2016-04-13 19:57:49 UTC
I can collapse the chain of inlines down to this and still see the bug. Removing "__attribute__((always_inline))", or merging __swab64p() and wwn_to_u64(), makes bug disappear.


typedef unsigned char u8;
typedef unsigned int u32;
typedef unsigned long long u64;
static inline __attribute__((always_inline)) u64 __swab64p(const u64 *p)
{
 return (__builtin_constant_p((u64)(*p)) ? ((u64)( (((u64)(*p) & (u64)0x00000000000000ffULL) << 56) | (((u64)(*p) & (u64)0x000000000000ff00ULL) << 40) | (((u64)(*p) & (u64)0x0000000000ff0000ULL) << 24) | (((u64)(*p) & (u64)0x00000000ff000000ULL) << 8) | (((u64)(*p) & (u64)0x000000ff00000000ULL) >> 8) | (((u64)(*p) & (u64)0x0000ff0000000000ULL) >> 24) | (((u64)(*p) & (u64)0x00ff000000000000ULL) >> 40) | (((u64)(*p) & (u64)0xff00000000000000ULL) >> 56))) : __builtin_bswap64(*p));
}
static inline u64 wwn_to_u64(void *wwn)
{
 return __swab64p(wwn);
}

struct Scsi_Host {
 void *shost_data;
 unsigned long hostdata[0];
};
static inline void *shost_priv(struct Scsi_Host *shost)
{
 return (void *)shost->hostdata;
}
typedef struct scsi_qla_host {
 u8 fabric_node_name[8];
 u32 device_flags;
} scsi_qla_host_t;
struct fc_host_attrs {
 u64 fabric_name;
};

static void
qla2x00_get_host_fabric_name(struct Scsi_Host *shost)
{
 scsi_qla_host_t *vha = shost_priv(shost);
 u8 node_name[8] = { 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF};
 u64 fabric_name = wwn_to_u64(node_name);
 if (vha->device_flags & 0x1)
  fabric_name = wwn_to_u64(vha->fabric_node_name);
 (((struct fc_host_attrs *)(shost)->shost_data)->fabric_name) = fabric_name;
}

void *get_host_fabric_name = qla2x00_get_host_fabric_name;
Comment 7 mednafen 2016-04-13 22:23:29 UTC
Following code aborts on x86_64 4.9.2 and 5.3.0 at -O2, at least:

#pragma GCC optimize("no-unit-at-a-time")

typedef unsigned char u8;
typedef unsigned long long u64;

static inline __attribute__((always_inline)) u64 __swab64p(const u64 *p)
{
 return (__builtin_constant_p((u64)(*p)) ? ((u64)( (((u64)(*p) & (u64)0x00000000000000ffULL) << 56) | (((u64)(*p) & (u64)0x000000000000ff00ULL) << 40) | (((u64)(*p) & (u64)0x0000000000ff0000ULL) << 24) | (((u64)(*p) & (u64)0x00000000ff000000ULL) << 8) | (((u64)(*p) & (u64)0x000000ff00000000ULL) >> 8) | (((u64)(*p) & (u64)0x0000ff0000000000ULL) >> 24) | (((u64)(*p) & (u64)0x00ff000000000000ULL) >> 40) | (((u64)(*p) & (u64)0xff00000000000000ULL) >> 56))) : __builtin_bswap64(*p));
}

static inline u64 wwn_to_u64(void *wwn)
{
 return __swab64p(wwn);
}

void __attribute__((noinline,noclone)) broken(u64* shost)
{
 u8 node_name[8] = { 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF};
 *shost = wwn_to_u64(node_name);
}

void __attribute__((noinline,noclone)) dummy(void)
{
 __builtin_abort();
}

int main(int argc, char* argv[])
{
 u64 v;

 broken(&v);

 if(v != (u64)-1)
  __builtin_abort();

 return 0;
}
Comment 8 Richard Biener 2016-04-14 08:45:31 UTC
Confirmed.

;; Function broken (broken, funcdef_no=2, decl_uid=1764, cgraph_uid=2, symbol_order=2) (executed once)

__attribute__((noclone, noinline))
broken (u64 * shost)
{
  <bb 2>:
  __builtin_unreachable ();

}

after inlining we see:

  <bb 2>:
  node_name[0] = 255;
  node_name[1] = 255;
  node_name[2] = 255;
  node_name[3] = 255;
  node_name[4] = 255;
  node_name[5] = 255;
  node_name[6] = 255;
  node_name[7] = 255;
  _14 = MEM[(const u64 *)&node_name];
  _15 = __builtin_constant_p (_14);
  if (_15 != 0)
    goto <bb 3>;
  else
    goto <bb 4>;

...

  <bb 4>:
  iftmp.0_37 = _39(D);
  __builtin_unreachable ();

which seems to be introduced by IPA CP:

__builtin_unreachable/9 (__builtin_unreachable) @0x7f1563274cf0
  Type: function
  Visibility: external public
  References:
  Referring:
  Availability: not_available
  First run: 0
  Function flags:
  Called by: wwn_to_u64.constprop.0/8
  Calls:
wwn_to_u64.constprop.0/8 (wwn_to_u64.constprop) @0x7f1563274000
  Type: function definition analyzed
  Visibility:
  References:
  Referring:
  Function wwn_to_u64.constprop/8 is inline copy in broken/2
  Clone of wwn_to_u64/1
  Availability: local
  First run: 0
  Function flags: local
  Called by: broken/2 (inlined) (1.00 per call)
  Calls: __builtin_constant_p/5 (1.00 per call) __builtin_unreachable/9
Comment 9 Richard Biener 2016-04-14 08:46:39 UTC
Note that the testcase is strictly invalid as it has 8-byte aligned node_name[]
passed to wwn_to_u64 which ends up performing a 64bit load on it via __swab64p.
Comment 10 Richard Biener 2016-04-14 08:48:50 UTC
Fixing that with

void __attribute__((noinline,noclone)) broken(u64* shost)
{
  u8 node_name[8] __attribute__((aligned(__alignof(u64)))) = { 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF};
  *shost = wwn_to_u64(node_name);
}

still shows broken code.  Note __swab64p is a quite stupid implementation
assuming GCC can't constant-fold __builtin_bswap64.
Comment 11 Richard Biener 2016-04-14 08:58:49 UTC
Happens via

(gdb) bt
#0  redirect_to_unreachable (e=0x7ffff6891820)
    at /space/rguenther/src/svn/trunk3/gcc/ipa-inline-analysis.c:735
#1  0x0000000000bc9325 in edge_set_predicate (e=0x7ffff6891820, 
    predicate=0x7fffffffc980)
    at /space/rguenther/src/svn/trunk3/gcc/ipa-inline-analysis.c:768
#2  0x0000000000bd1df7 in remap_edge_summaries (inlined_edge=0x7ffff68912d8, 
    node=<cgraph_node* 0x7ffff69b1000 "wwn_to_u64.constprop">, 
    info=0x7ffff6994ba0, callee_info=0x7ffff6994c60, operand_map=..., 
    offset_map=..., possible_truths=0, toplev_predicate=0x7fffffffca30)
    at /space/rguenther/src/svn/trunk3/gcc/ipa-inline-analysis.c:3500
#3  0x0000000000bd2603 in inline_merge_summary (edge=0x7ffff68912d8)
    at /space/rguenther/src/svn/trunk3/gcc/ipa-inline-analysis.c:3647
#4  0x00000000017f2b84 in inline_call (e=0x7ffff68912d8, update_original=true, 
Python Exception <class 'gdb.error'> There is no member or method named m_vecpfx.: 
    new_edges=0x7fffffffd9d0, overall_size=0x25c63e0 <_ZL12overall_size>, 
    update_overall_summary=true, callee_removed=0x0)
    at /space/rguenther/src/svn/trunk3/gcc/ipa-inline-transform.c:370
#5  0x00000000017e8953 in inline_small_functions ()
    at /space/rguenther/src/svn/trunk3/gcc/ipa-inline.c:2022
#6  0x00000000017e9ca7 in ipa_inline ()

so somehow it get's confused:

 BB 2 predicate:(true)
  _3 = MEM[(const u64 *)wwn_2(D)];
                freq:1.00 size:  1 time:  1
                50% will be eliminated by inlining
                Accounting size:0.50, time:0.50 on new predicate:(op0[ref offset: 0] changed) && (not inlined)
                Accounting size:0.50, time:0.50 on new predicate:(op0[ref offset: 0] changed)
  _4 = __builtin_constant_p (_3);
                freq:1.00 size:  0 time:  0
  if (_4 != 0)
                freq:1.00 size:  2 time:  2
                Accounting size:2.00, time:2.00 on predicate:(op0[ref offset: 0] changed)

 BB 4 predicate:(op0[ref offset: 0] not constant)
  iftmp.0_26 = __builtin_bswap64 (_3);
                freq:0.61 size:  1 time:  1

 BB 3 predicate:(true)
  _5 = _3 << 56;

(BB 3 predicate shouldn't be true)

Honza?  This seems to be somewhat fragile (redirecting things to unreachable
but _not_ changing the actual predicates in the IL).  Claiming the
predicate is constant true is also a bit bogus (as can be seen in following
optimization).
Comment 12 H.J. Lu 2016-04-14 11:10:48 UTC
It was caused by r208831.
Comment 13 Josh Poimboeuf 2016-04-14 18:57:38 UTC
So if I understand correctly, some reachable code is incorrectly getting marked unreachable and then getting discarded.

Interestingly, the function's epilogue (frame pointer restore) and return instruction are also getting discarded.  Can you tell if that will always be the case when this bug triggers?

If so, that should make it possible for objtool to reliably detect the bug.
Comment 14 Richard Biener 2016-04-15 11:52:33 UTC
(In reply to Josh Poimboeuf from comment #13)
> So if I understand correctly, some reachable code is incorrectly getting
> marked unreachable and then getting discarded.
> 
> Interestingly, the function's epilogue (frame pointer restore) and return
> instruction are also getting discarded.  Can you tell if that will always be
> the case when this bug triggers?

I don't think we can rely on that.  The path could simply fall thru to
a random instruction which is still inside the function.  Say, if it was

  if (x)
    ...
  else
    ...
  foo ();

then the arm marked unreachable would simply disappear.

> If so, that should make it possible for objtool to reliably detect the bug.
Comment 15 Jakub Jelinek 2016-04-15 13:15:55 UTC
Even with the #c10 change it looks to be aliasing violation, does it work fine if you compile with -fno-strict-aliasing or use may_alias on the type through which it is read?
Comment 16 Jan Hubicka 2016-04-15 13:31:33 UTC
> Honza?  This seems to be somewhat fragile (redirecting things to unreachable
> but _not_ changing the actual predicates in the IL).  Claiming the
> predicate is constant true is also a bit bogus (as can be seen in following
> optimization).

At WPA stage inliner can not update IL, so it simply redirects to builtin_unreachable if predicate is FALSE and expects scalar passes to be monotonously smarter than IPA analysis. 

BB3 predicate true is conservatively correct: there is no predicate to represent that something is not constant because we assume scalar optimizers to possibly be stronger than us and decide better.

Here we manage to assume that builtin_constant_p is true and thus we conclude the following BB is unreachable:
 BB 4 predicate:(op0[ref offset: 0] not constant)                               
  iftmp.0_26 = __builtin_bswap64 (_3);                                          
                freq:0.61 size:  1 time:  1
This is because ipa-prop analyzes the values in node_name:
                                     
  Jump functions of caller  broken/2:                                           
    callsite  broken/2 -> wwn_to_u64/1 :                                        
       param 0: UNKNOWN                                                         
         Aggregate passed by reference:                                         
           offset: 0, cst: 255                                                  
           offset: 8, cst: 255                                                  
           offset: 16, cst: 255                                                 
           offset: 24, cst: 255                                                 
           offset: 32, cst: 255                                                 
           offset: 40, cst: 255                                                 
           offset: 48, cst: 255                                                 
           offset: 56, cst: 255                                                 

but later we fold:

  <bb 2>:                                                                       
  node_name[0] = 255;                                                           
  node_name[1] = 255;                                                           
  node_name[2] = 255;                                                           
  node_name[3] = 255;                                                           
  node_name[4] = 255;                                                           
  node_name[5] = 255;                                                           
  node_name[6] = 255;                                                           
  node_name[7] = 255;                                                           
  _14 = MEM[(const u64 *)&node_name];                                           
  _15 = __builtin_constant_p (_14);                                             

into false (which is stupid, but conservatively correct modulo the assumption about IPA passes being consistently weakter than local passes).

Making IPA passes to fold builtin_constant_p call into true is technically possible, if I add builtin_true and redirect to it, but it would not be very good if the local passes can not fold above (as the wrong arm is taken).

So can we possibly fix the local passes?

Honza
Comment 17 Josh Poimboeuf 2016-04-15 13:36:52 UTC
(In reply to Richard Biener from comment #14)
> (In reply to Josh Poimboeuf from comment #13)
> > Interestingly, the function's epilogue (frame pointer restore) and return
> > instruction are also getting discarded.  Can you tell if that will always be
> > the case when this bug triggers?
> 
> I don't think we can rely on that.  The path could simply fall thru to
> a random instruction which is still inside the function.  Say, if it was
> 
>   if (x)
>     ...
>   else
>     ...
>   foo ();
> 
> then the arm marked unreachable would simply disappear.

I tried doing that by putting the offending wwn_to_u64() call in an if statement, but then the bug disappeared.

In fact, adding *any* code before the call seems to make the bug disappear.
Comment 18 Jan Hubicka 2016-04-15 13:50:09 UTC
Jakub: There is indeed aliasing issue, but with -fno-strict-aliasing the bug is the same.

Apparently this is ipa-prop bug, because ipa-prop does not track size of accesses and thus it does not know there is a mismatch. The value produced is thus not INT_MAX as intended but 255. This is Martin's area.
Comment 19 Jan Hubicka 2016-04-15 13:51:40 UTC
Josh: This is limitation of ipa-prop tracking. It very easily gives up on determinging constantness of aggregate parameter. Hope Martin will fix it next stage1. WIP patches was done few releases back but not merged so far.
Comment 20 Josh Poimboeuf 2016-04-15 14:04:51 UTC
Thanks very much to everyone who has looked into this so far.  It would be very helpful to get answers to the following questions, so we can understand the impact to the kernel:

1) Is there a reliable way to avoid the bug, either in code or with a gcc flag?

2) Is there a reliable way to detect the bug by looking at the object code?
Comment 21 rguenther@suse.de 2016-04-15 14:45:55 UTC
On April 15, 2016 3:31:33 PM GMT+02:00, "hubicka at gcc dot gnu.org" <gcc-bugzilla@gcc.gnu.org> wrote:
>https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70646
>
>--- Comment #16 from Jan Hubicka <hubicka at gcc dot gnu.org> ---
>> Honza?  This seems to be somewhat fragile (redirecting things to
>unreachable
>> but _not_ changing the actual predicates in the IL).  Claiming the
>> predicate is constant true is also a bit bogus (as can be seen in
>following
>> optimization).
>
>At WPA stage inliner can not update IL, so it simply redirects to
>builtin_unreachable if predicate is FALSE and expects scalar passes to
>be
>monotonously smarter than IPA analysis. 
>
>BB3 predicate true is conservatively correct: there is no predicate to
>represent that something is not constant because we assume scalar
>optimizers to
>possibly be stronger than us and decide better.
>
>Here we manage to assume that builtin_constant_p is true and thus we
>conclude
>the following BB is unreachable:
>BB 4 predicate:(op0[ref offset: 0] not constant)                       
>       
>iftmp.0_26 = __builtin_bswap64 (_3);                                   
>      
>                freq:0.61 size:  1 time:  1
>This is because ipa-prop analyzes the values in node_name:
>
>Jump functions of caller  broken/2:                                    
>      
>callsite  broken/2 -> wwn_to_u64/1 :                                   
>    
>param 0: UNKNOWN                                                       
> 
>Aggregate passed by reference:                                         
>  offset: 0, cst: 255                                                  
>  offset: 8, cst: 255                                                  
>  offset: 16, cst: 255                                                 
>  offset: 24, cst: 255                                                 
>  offset: 32, cst: 255                                                 
>  offset: 40, cst: 255                                                 
>  offset: 48, cst: 255                                                 
>  offset: 56, cst: 255                                                 
>
>but later we fold:
>
><bb 2>:                                                                
>      
>node_name[0] = 255;                                                    
>      
>node_name[1] = 255;                                                    
>      
>node_name[2] = 255;                                                    
>      
>node_name[3] = 255;                                                    
>      
>node_name[4] = 255;                                                    
>      
>node_name[5] = 255;                                                    
>      
>node_name[6] = 255;                                                    
>      
>node_name[7] = 255;                                                    
>      
>_14 = MEM[(const u64 *)&node_name];                                    
>      
>_15 = __builtin_constant_p (_14);                                      
>      
>
>into false (which is stupid, but conservatively correct modulo the
>assumption
>about IPA passes being consistently weakter than local passes).
>
>Making IPA passes to fold builtin_constant_p call into true is
>technically
>possible, if I add builtin_true and redirect to it, but it would not be
>very
>good if the local passes can not fold above (as the wrong arm is
>taken).
>
>So can we possibly fix the local passes?You can't rely on this happening, consider -fno-foo.

So my original local hack was to not do this built-in constant stuff for is_refp

Richard.

>Honza
Comment 22 Martin Jambor 2016-04-15 19:26:28 UTC
(In reply to Jan Hubicka from comment #18)
> Jakub: There is indeed aliasing issue, but with -fno-strict-aliasing the bug
> is the same.
> 
> Apparently this is ipa-prop bug, because ipa-prop does not track size of
> accesses and thus it does not know there is a mismatch. The value produced
> is thus not INT_MAX as intended but 255. This is Martin's area.

ipa-prop does not track accesses or their sizes, inlining predicate
conditions do.

Anyway, I checked the two places where access sizes need to be checked
(i.e. ipcp_modif_dom_walker::before_dom_children in ipa-prop.c and
evaluate_conditions_for_known_args in ipa-inline-analysis.c) and they
actually are checked, with one exception causing this bug.

The exception is evaluating IS_NOT_CONSTANT and CHANGED types of
conditions in evaluate_conditions_for_known_args, which does not check
access size for them because the size is not even tracked for them
(for normal conditions, size is the size of the stored value to
compare with).

I suppose the easiest fix is to overload the value field to store the
size of the access for these two codes and then add the missing check.
Comment 23 Martin Jambor 2016-04-15 20:27:04 UTC
(In reply to Josh Poimboeuf from comment #20)
> Thanks very much to everyone who has looked into this so far.  It would be
> very helpful to get answers to the following questions, so we can understand
> the impact to the kernel:
> 
> 1) Is there a reliable way to avoid the bug, either in code or with a gcc
> flag?

If you mean in general, then unfortunately no, except for disabling
inlining altogether and removing all always_inline's.  In addition to
the main bug, I found out that I check --param ipa-max-agg-items only
after incrementing it, so even setting that to zero does not help.
I'll prepare a patch for that too.

If you want to avoid it at a particular place where you know it
already occurs, then it depends on how the code looks like.  Would for
the following work for you?

  void __attribute__((noinline,noclone)) broken(u64* shost)
  {
   u8 node_name[8];
   memset (&node_name, 0xFF, sizeof (node_name));
   *shost = wwn_to_u64(node_name);
  }

> 
> 2) Is there a reliable way to detect the bug by looking at the object code?

I cannot think of any.
Comment 24 Josh Poimboeuf 2016-04-15 21:00:25 UTC
(In reply to Martin Jambor from comment #23)
> (In reply to Josh Poimboeuf from comment #20)
> > Thanks very much to everyone who has looked into this so far.  It would be
> > very helpful to get answers to the following questions, so we can understand
> > the impact to the kernel:
> > 
> > 1) Is there a reliable way to avoid the bug, either in code or with a gcc
> > flag?
> 
> If you mean in general, then unfortunately no, except for disabling
> inlining altogether and removing all always_inline's.  In addition to
> the main bug, I found out that I check --param ipa-max-agg-items only
> after incrementing it, so even setting that to zero does not help.
> I'll prepare a patch for that too.

Yes, I'm looking for a general way to either prevent or try to detect potential other cases of the bug throughout the entire kernel.

Can it only occur with the use of __builtin_constant_p(exp) by an inline function (where exp is a constant)?
Comment 25 Martin Jambor 2016-04-15 21:33:06 UTC
(In reply to Martin Jambor from comment #22)
> I suppose the easiest fix is to overload the value field to store the
> size of the access for these two codes and then add the missing check.

OK, so the IS_NOT_CONSTANT case is relatively easy, the following
(untested) patch fixes this PR:

diff --git a/gcc/ipa-inline-analysis.c b/gcc/ipa-inline-analysis.c
index f8ca825..6901cfb 100644
--- a/gcc/ipa-inline-analysis.c
+++ b/gcc/ipa-inline-analysis.c
@@ -867,8 +867,14 @@ evaluate_conditions_for_known_args (struct cgraph_node *node,
 	  clause |= 1 << (i + predicate_first_dynamic_condition);
 	  continue;
 	}
-      if (c->code == IS_NOT_CONSTANT || c->code == CHANGED)
+      if (c->code == CHANGED)
 	continue;
+      if (c->code == IS_NOT_CONSTANT)
+	{
+	  if (!operand_equal_p (c->val, TYPE_SIZE (TREE_TYPE (val)), 0))
+	    clause |= 1 << (i + predicate_first_dynamic_condition);
+	  continue;
+	}
 
       if (operand_equal_p (TYPE_SIZE (TREE_TYPE (c->val)),
 			   TYPE_SIZE (TREE_TYPE (val)), 0))
@@ -1807,8 +1813,9 @@ set_cond_stmt_execution_predicate (struct ipa_func_body_info *fbi,
     return;
   FOR_EACH_EDGE (e, ei, bb->succs) if (e->flags & EDGE_FALSE_VALUE)
     {
+      tree opsize = unshare_expr_without_location (TYPE_SIZE (TREE_TYPE (op2)));
       struct predicate p = add_condition (summary, index, &aggpos,
-					  IS_NOT_CONSTANT, NULL_TREE);
+					  IS_NOT_CONSTANT, opsize);
       e->aux = edge_predicate_pool.allocate ();
       *(struct predicate *) e->aux = p;
     }

On the other hand, the CHANGED part is more difficult because an
unchanged value is represented with an error_mark_node which of course
does not encode the size of the unchanged value.  So if we wanted to
handle it, we'd need to encode it differently.  However, I am not
convinced it is necessary because, unless I'm forgetting about
something, it is only used as inlining heuristics and can never lead
to any optimizations and thus neither to miscompilations.  Am I right?

In any event, I'd like to talk to Honza bout this before going any
further.
Comment 26 Martin Jambor 2016-04-15 21:58:39 UTC
(In reply to Josh Poimboeuf from comment #24)
> 
> Yes, I'm looking for a general way to either prevent or try to detect
> potential other cases of the bug throughout the entire kernel.
> 
> Can it only occur with the use of __builtin_constant_p(exp) by an inline
> function (where exp is a constant)?

Yes and no.  Yes, the problematic inlining analysis predicate
condition type IS_NOT_CONSTANT is only created for conditions
containing a call to __builtin_constant_p.  But no, a function does
not have to be explicitly marked inline to be an inline candidate.

This bug can occur when an inlineable function containing a call to
__builtin_constant_p, which checks a parameter or a value it
references and a (possibly indirect) caller of the function actually
passes a constant, but stores it using a type of a different size.

In your testcase, the array initialization is a series of byte stores
but the "load" in __swab64p is 64-bit.  Note that gcc cannot detect
that the parameter in your testcase was actually a compile-time
constant.  IPA level does by mistake and removes the slow path which
is later selected for real (this can be improved if deemed worthwhile
but IPA analysis still needs to check the size to be correct).
Comment 27 rguenther@suse.de 2016-04-16 05:46:14 UTC
On April 15, 2016 11:58:39 PM GMT+02:00, "jamborm at gcc dot gnu.org" <gcc-bugzilla@gcc.gnu.org> wrote:
>https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70646
>
>--- Comment #26 from Martin Jambor <jamborm at gcc dot gnu.org> ---
>(In reply to Josh Poimboeuf from comment #24)
>> 
>> Yes, I'm looking for a general way to either prevent or try to detect
>> potential other cases of the bug throughout the entire kernel.
>> 
>> Can it only occur with the use of __builtin_constant_p(exp) by an
>inline
>> function (where exp is a constant)?
>
>Yes and no.  Yes, the problematic inlining analysis predicate
>condition type IS_NOT_CONSTANT is only created for conditions
>containing a call to __builtin_constant_p.  But no, a function does
>not have to be explicitly marked inline to be an inline candidate.
>
>This bug can occur when an inlineable function containing a call to
>__builtin_constant_p, which checks a parameter or a value it
>references and a (possibly indirect) caller of the function actually
>passes a constant, but stores it using a type of a different size.
>
>In your testcase, the array initialization is a series of byte stores
>but the "load" in __swab64p is 64-bit.  Note that gcc cannot detect
>that the parameter in your testcase was actually a compile-time
>constant.  IPA level does by mistake and removes the slow path which
>is later selected for real (this can be improved if deemed worthwhile
>but IPA analysis still needs to check the size to be correct).

I was arguing that if IPA proves a condition to true/false then it should adjust it that way in modification phase.
This would have prevented the bug as well (eventually using the 'wrong' path).
At least it would have avoided hitting a unreachable path at runtime which can be very tricky to debug.
Comment 28 Martin Jambor 2016-04-16 07:48:16 UTC
(In reply to rguenther@suse.de from comment #27)
> I was arguing that if IPA proves a condition to true/false then it
> should adjust it that way in modification phase.

The thing is that it does not prove it correctly.  If we modify the
testcase so that only the byte at offset zero is constant and others
are not, inlining would still think the whole eight-byte access is to
a known constant.

Redirecting to some builtin_true or builtin_false would have made
debugging easier, I agree.
Comment 29 Martin Jambor 2016-04-20 16:45:22 UTC
Created attachment 38316 [details]
Fix storing access size to conditions

Honza asked me to also come up with a version of the patch where we
store access size to the condition (as a HOST_WIDE_INT) and use that
for access size comparisons, so that we avoid any potential confusion
(for example if the loaded value is run through a NOP_EXPR after
loading but before the described use).

So this is it.  It survives both regular and LTO bootstrap and testing
on an x86_64-linux.
Comment 30 rguenther@suse.de 2016-04-21 11:07:32 UTC
On Wed, 20 Apr 2016, jamborm at gcc dot gnu.org wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70646
> 
> --- Comment #29 from Martin Jambor <jamborm at gcc dot gnu.org> ---
> Created attachment 38316 [details]
>   --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=38316&action=edit
> Fix storing access size to conditions
> 
> Honza asked me to also come up with a version of the patch where we
> store access size to the condition (as a HOST_WIDE_INT) and use that
> for access size comparisons, so that we avoid any potential confusion
> (for example if the loaded value is run through a NOP_EXPR after
> loading but before the described use).
> 
> So this is it.  It survives both regular and LTO bootstrap and testing
> on an x86_64-linux.

Any reason it's not unsigned HOST_WIDE_INT size?
Comment 31 Martin Jambor 2016-04-21 11:26:11 UTC
(In reply to rguenther@suse.de from comment #30)
> 
> Any reason it's not unsigned HOST_WIDE_INT size?

The only reason is to use the same type in which
get_ref_base_and_extent returns size.
Comment 32 Martin Jambor 2016-05-18 13:04:55 UTC
Author: jamborm
Date: Wed May 18 13:04:23 2016
New Revision: 236389

URL: https://gcc.gnu.org/viewcvs?rev=236389&root=gcc&view=rev
Log:
[PR 70646] Store size to inlining predicate conditions

2016-05-18  Martin Jambor  <mjambor@suse.cz>

	PR ipa/70646
	* ipa-inline.h (condition): New field size.
	* ipa-inline-analysis.c (add_condition): New parameter SIZE, use it
	for comaprison and store it into the new condition.
	(evaluate_conditions_for_known_args): Use condition size to check
	access sizes for all but CHANGED conditions.
	(unmodified_parm_1): New parameter size_p, store access size into it.
	(unmodified_parm): Likewise.
	(unmodified_parm_or_parm_agg_item): Likewise.
	(eliminated_by_inlining_prob): Pass NULL to unmodified_parm as size_p.
	(set_cond_stmt_execution_predicate): Extract access sizes and store
	them to conditions.
	(set_switch_stmt_execution_predicate): Likewise.
	(will_be_nonconstant_expr_predicate): Likewise.
	(will_be_nonconstant_predicate): Likewise.
	(inline_read_section): Stream condition size.
	(inline_write_summary): Likewise.

testsuite/
	* gcc.dg/ipa/pr70646.c: New test.


Added:
    trunk/gcc/testsuite/gcc.dg/ipa/pr70646.c
Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/ipa-inline-analysis.c
    trunk/gcc/ipa-inline.h
    trunk/gcc/testsuite/ChangeLog
Comment 33 Martin Jambor 2016-05-18 13:06:56 UTC
Author: jamborm
Date: Wed May 18 13:06:24 2016
New Revision: 236390

URL: https://gcc.gnu.org/viewcvs?rev=236390&root=gcc&view=rev
Log:
Respect --param ipa-max-agg-items=0

2016-05-18  Martin Jambor  <mjambor@suse.cz>

	PR ipa/70646
	* ipa-prop.c (determine_locally_known_aggregate_parts): Bail out early
	if parameter PARAM_IPA_MAX_AGG_ITEMS is zero.


Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/ipa-prop.c
Comment 34 Martin Jambor 2016-05-19 14:57:07 UTC
Author: jamborm
Date: Thu May 19 14:56:35 2016
New Revision: 236462

URL: https://gcc.gnu.org/viewcvs?rev=236462&root=gcc&view=rev
Log:
[PR 70646] Store size to inlining predicate conditions

2016-05-19  Martin Jambor  <mjambor@suse.cz>

	PR ipa/70646
	* ipa-inline.h (condition): New field size.
	* ipa-inline-analysis.c (add_condition): New parameter SIZE, use it
	for comaprison and store it into the new condition.
	(evaluate_conditions_for_known_args): Use condition size to check
	access sizes for all but CHANGED conditions.
	(unmodified_parm_1): New parameter size_p, store access size into it.
	(unmodified_parm): Likewise.
	(unmodified_parm_or_parm_agg_item): Likewise.
	(eliminated_by_inlining_prob): Pass NULL to unmodified_parm as size_p.
	(set_cond_stmt_execution_predicate): Extract access sizes and store
	them to conditions.
	(set_switch_stmt_execution_predicate): Likewise.
	(will_be_nonconstant_expr_predicate): Likewise.
	(will_be_nonconstant_predicate): Likewise.
	(inline_read_section): Stream condition size.
	(inline_write_summary): Likewise.
	* lto-streamer.h (LTO_minor_version): Bump.

testsuite/
	* gcc.dg/ipa/pr70646.c: New test.


Added:
    branches/gcc-6-branch/gcc/testsuite/gcc.dg/ipa/pr70646.c
Modified:
    branches/gcc-6-branch/gcc/ChangeLog
    branches/gcc-6-branch/gcc/ipa-inline-analysis.c
    branches/gcc-6-branch/gcc/ipa-inline.h
    branches/gcc-6-branch/gcc/lto-streamer.h
    branches/gcc-6-branch/gcc/testsuite/ChangeLog
Comment 35 Martin Jambor 2016-05-19 15:00:45 UTC
Author: jamborm
Date: Thu May 19 15:00:12 2016
New Revision: 236463

URL: https://gcc.gnu.org/viewcvs?rev=236463&root=gcc&view=rev
Log:
Respect --param ipa-max-agg-items=0

2016-05-19  Martin Jambor  <mjambor@suse.cz>

	PR ipa/70646
	* ipa-prop.c (determine_locally_known_aggregate_parts): Bail out early
	if parameter PARAM_IPA_MAX_AGG_ITEMS is zero.


Modified:
    branches/gcc-6-branch/gcc/ChangeLog
    branches/gcc-6-branch/gcc/ipa-prop.c
Comment 36 Martin Jambor 2016-05-19 15:09:53 UTC
Author: jamborm
Date: Thu May 19 15:09:21 2016
New Revision: 236473

URL: https://gcc.gnu.org/viewcvs?rev=236473&root=gcc&view=rev
Log:
[PR 70646] Store size to inlining predicate conditions

2016-05-19  Martin Jambor  <mjambor@suse.cz>

	PR ipa/70646
	* ipa-inline.h (condition): New field size.
	* ipa-inline-analysis.c (add_condition): New parameter SIZE, use it
	for comaprison and store it into the new condition.
	(evaluate_conditions_for_known_args): Use condition size to check
	access sizes for all but CHANGED conditions.
	(unmodified_parm_1): New parameter size_p, store access size into it.
	(unmodified_parm): Likewise.
	(unmodified_parm_or_parm_agg_item): Likewise.
	(eliminated_by_inlining_prob): Pass NULL to unmodified_parm as size_p.
	(set_cond_stmt_execution_predicate): Extract access sizes and store
	them to conditions.
	(set_switch_stmt_execution_predicate): Likewise.
	(will_be_nonconstant_expr_predicate): Likewise.
	(will_be_nonconstant_predicate): Likewise.
	(inline_read_section): Stream condition size.
	(inline_write_summary): Likewise.
	* lto-streamer.h (LTO_minor_version): Bump.

testsuite/
	* gcc.dg/ipa/pr70646.c: New test.


Added:
    branches/gcc-5-branch/gcc/testsuite/gcc.dg/ipa/pr70646.c
Modified:
    branches/gcc-5-branch/gcc/ChangeLog
    branches/gcc-5-branch/gcc/ipa-inline-analysis.c
    branches/gcc-5-branch/gcc/ipa-inline.h
    branches/gcc-5-branch/gcc/lto-streamer.h
    branches/gcc-5-branch/gcc/testsuite/ChangeLog
Comment 37 Martin Jambor 2016-05-19 15:11:21 UTC
Author: jamborm
Date: Thu May 19 15:10:50 2016
New Revision: 236474

URL: https://gcc.gnu.org/viewcvs?rev=236474&root=gcc&view=rev
Log:
Respect --param ipa-max-agg-items=0

2016-05-19  Martin Jambor  <mjambor@suse.cz>

	PR ipa/70646
	* ipa-prop.c (determine_locally_known_aggregate_parts): Bail out early
	if parameter PARAM_IPA_MAX_AGG_ITEMS is zero.


Modified:
    branches/gcc-5-branch/gcc/ChangeLog
    branches/gcc-5-branch/gcc/ipa-prop.c
Comment 38 Martin Jambor 2016-05-19 15:17:42 UTC
Author: jamborm
Date: Thu May 19 15:17:10 2016
New Revision: 236475

URL: https://gcc.gnu.org/viewcvs?rev=236475&root=gcc&view=rev
Log:
[PR 70646] Store size to inlining predicate conditions

2016-05-19  Martin Jambor  <mjambor@suse.cz>

	PR ipa/70646
	* ipa-inline.h (condition): New field size.
	* ipa-inline-analysis.c (add_condition): New parameter SIZE, use it
	for comaprison and store it into the new condition.
	(evaluate_conditions_for_known_args): Use condition size to check
	access sizes for all but CHANGED conditions.
	(unmodified_parm_1): New parameter size_p, store access size into it.
	(unmodified_parm): Likewise.
	(unmodified_parm_or_parm_agg_item): Likewise.
	(eliminated_by_inlining_prob): Pass NULL to unmodified_parm as size_p.
	(set_cond_stmt_execution_predicate): Extract access sizes and store
	them to conditions.
	(set_switch_stmt_execution_predicate): Likewise.
	(will_be_nonconstant_expr_predicate): Likewise.
	(will_be_nonconstant_predicate): Likewise.
	(inline_read_section): Stream condition size.
	(inline_write_summary): Likewise.
	* lto-streamer.h (LTO_minor_version): Bump.
	* ipa-prop.c (ipa_load_from_parm_agg): Added size_p parameter, pass it
	to ipa_load_from_parm_agg_1.
	* ipa-prop.h (ipa_load_from_parm_agg): Update declaration.

testsuite/
	* gcc.dg/ipa/pr70646.c: New test.


Added:
    branches/gcc-4_9-branch/gcc/testsuite/gcc.dg/ipa/pr70646.c
Modified:
    branches/gcc-4_9-branch/gcc/ChangeLog
    branches/gcc-4_9-branch/gcc/ipa-inline-analysis.c
    branches/gcc-4_9-branch/gcc/ipa-inline.h
    branches/gcc-4_9-branch/gcc/ipa-prop.c
    branches/gcc-4_9-branch/gcc/ipa-prop.h
    branches/gcc-4_9-branch/gcc/lto-streamer.h
    branches/gcc-4_9-branch/gcc/testsuite/ChangeLog
Comment 39 Martin Jambor 2016-05-19 15:20:30 UTC
Author: jamborm
Date: Thu May 19 15:19:59 2016
New Revision: 236476

URL: https://gcc.gnu.org/viewcvs?rev=236476&root=gcc&view=rev
Log:
Respect --param ipa-max-agg-items=0

2016-05-19  Martin Jambor  <mjambor@suse.cz>

	PR ipa/70646
	* ipa-prop.c (determine_locally_known_aggregate_parts): Bail out early
	if parameter PARAM_IPA_MAX_AGG_ITEMS is zero.


Modified:
    branches/gcc-4_9-branch/gcc/ChangeLog
    branches/gcc-4_9-branch/gcc/ipa-prop.c
Comment 40 Martin Jambor 2016-05-19 15:22:00 UTC
Fixed on all branches.