This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.



Re: [PATCH] PR target/68991: Add vector_memory_operand and "Bm" constraint


On Fri, Jan 15, 2016 at 6:11 AM, Jakub Jelinek <jakub@redhat.com> wrote:
> On Fri, Jan 15, 2016 at 01:36:40PM +0100, Richard Biener wrote:
>> >> My patches only change SSE patterns without ssememalign
>> >> attribute, which defaults to
>> >>
>> >> (define_attr "ssememalign" "" (const_int 0))
>> >
>> > The patch is OK for mainline.
>> >
>> > (subst.md changes can IMO be considered obvious.)
>>
>> This change (r232087 or r232088) is responsible for a drop
>> of 482.sphinx3 on AMD Fam15 (bulldozer) from score 33 to 18.
>>
>> See http://gcc.opensuse.org/SPEC/CFP/sb-megrez-head-64-2006/recent.html
>
> Yeah, it seems to make a significant difference on code generated with
> -mavx, e.g. in cmn.c with
> -Ofast -quiet -march=bdver2 -mmmx -mno-3dnow -msse -msse2 -msse3 -mssse3 -msse4a -mcx16 -msahf -mno-movbe -maes -mno-sha -mpclmul -mpopcnt -mabm -mlwp -mfma -mfma4 -mxop -mbmi -mno-bmi2 -mtbm -mavx -mno-avx2 -msse4.2 -msse4.1 -mlzcnt -mno-rtm -mno-hle -mno-rdrnd -mf16c -mno-fsgsbase -mno-rdseed -mprfchw -mno-adx -mfxsr -mxsave -mno-xsaveopt -mno-avx512f -mno-avx512er -mno-avx512cd -mno-avx512pf -mno-prefetchwt1 -mno-clflushopt -mno-xsavec -mno-xsaves -mno-avx512dq -mno-avx512bw -mno-avx512vl -mno-avx512ifma -mno-avx512vbmi -mno-clwb -mno-pcommit -mno-mwaitx -mno-clzero -mno-pku --param l1-cache-size=16 --param l1-cache-line-size=64 --param l2-cache-size=2048 -mtune=bdver2
> Reduced testcase:
>
> -Ofast -mavx -mno-avx2 -mtune=bdver2
>
> float *a, *b;
> int c, d, e, f;
> void
> foo (void)
> {
>   for (; c; c++)
>     a[c] = 0;
>   if (!d)
>     for (; c < f; c++)
>       b[c] = (double) e / b[c];
> }
>
> r232086 vs. r232088 gives.  I don't see significant differences before IRA,
> IRA seems to have some cost differences (strange), but the same dispositions,
> and LRA ends up with all the differences.
>

That may be due to the difference between define_memory_constraint and
define_constraint.  LRA doesn't consider a register for a define_constraint
if memory is true.
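
To make the distinction concrete, here is a rough sketch of the two
definition forms in machine-description syntax.  The constraint names
"Bx" and "By" and both docstrings are made up for illustration; they are
not the definitions from the patch.  Only the predicate name
vector_memory_operand comes from the patch subject.  Per the GCC
internals documentation, define_memory_constraint tells reload that it
can make an operand match by rewriting it to (mem (reg X)) with X a base
register, while a plain define_constraint carries no such promise.

```
;; Hypothetical names "Bx" and "By", for illustration only.

;; Memory constraint: the register allocator may assume it can satisfy
;; this by reloading the address into a base register.
(define_memory_constraint "Bx"
  "Illustrative memory constraint."
  (match_operand 0 "memory_operand"))

;; Plain constraint: matched by the predicate alone, with no
;; address-reload guarantee given to the register allocator.
(define_constraint "By"
  "Illustrative plain constraint using the vector memory predicate."
  (match_operand 0 "vector_memory_operand"))
```

If that is the cause, the alternatives LRA picks for operands using the
new constraint would differ from those it picked with "m", which would
fit the differences showing up only after LRA.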

-- 
H.J.

