This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.
Re: [PATCH] Improve { x, x + 3, x + 6, x + 9 } expansion (take 2)
- From: Richard Biener <rguenther at suse dot de>
- To: Jakub Jelinek <jakub at redhat dot com>
- Cc: Richard Henderson <rth at redhat dot com>, Uros Bizjak <ubizjak at gmail dot com>, gcc-patches at gcc dot gnu dot org
- Date: Thu, 21 Nov 2013 12:37:01 +0100 (CET)
On Thu, 21 Nov 2013, Jakub Jelinek wrote:
> On Thu, Nov 21, 2013 at 12:18:45PM +0100, Richard Biener wrote:
> > > Bootstrap/regtest pending, ok at least for this for the start and can be
> > > improved later on?
> >
> > Ok, this should catch most of the vectorizer cases.
> >
> > Zero could also be handled for PLUS_EXPR, likewise one for MULT_EXPR.
> > I think for induction it's common to have { base, base + 1, base + 2, ...
>
> Of course I handle base for PLUS_EXPR (i.e. zero addend), what I meant
> is that for MULT_EXPR, you can actually not have any base at all for
> a subset of the elements, just constant 0, because when you multiply
> arbitrary base with 0, you get 0.
>
> > why loop here? Do you want to catch base + 1 + 2? I think that's
> > hiding a missed optimization elsewhere for no good reason.
>
> I had that in the patch first, unfortunately it is a pass ordering issue.
> stmp_var_.25_67 = x_27 + 3;
> stmp_var_.25_68 = stmp_var_.25_67 + 3;
> stmp_var_.25_69 = stmp_var_.25_68 + 3;
> stmp_var_.25_70 = stmp_var_.25_69 + 3;
> stmp_var_.25_71 = stmp_var_.25_70 + 3;
> stmp_var_.25_72 = stmp_var_.25_71 + 3;
> stmp_var_.25_73 = stmp_var_.25_72 + 3;
> vect_cst_.26_74 = {x_27, stmp_var_.25_67, stmp_var_.25_68, stmp_var_.25_69, stmp_var_.25_70, stmp_var_.25_71, stmp_var_.25_72, stmp_var_.25_73};
> is exactly what I see in the last veclower pass, because there is no
> forwprop between vect pass and veclower. So, do you want to schedule
> another forwprop before veclower? Moving veclower later sounds bad,
> we really need the stuff created by veclower cleaned up too.
Oh, indeed. Bah. That case makes the whole stuff quadratic, too ;)
For
typedef int vLARGEsi __attribute__((vector_size(1024*1024)));
(we seem to ICE with vector_size(1024*1024*1024) in stor-layout.c - heh)
Or do we split up the IL into vectors which have a mode before
optimizing the constructors like above?
That said, I'm fine with the patch as-is - we can look at some
really-large-vectors testcases as followup (I'd expect we have
other issues with them ...)
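
To illustrate the quadratic concern, here is a rough sketch in plain C.
It is not GCC code: struct def, resolve_offset and resolve_all are
hypothetical names invented for this example. It models the dump quoted
above, where each constructor element is defined as the previous one
plus a constant, and counts how many chain steps it takes to express
every element as base + offset when each element is walked back to the
base independently.

```c
#include <stddef.h>

/* Hypothetical model of one SSA definition: value = defs[prev] + addend.
   prev == -1 marks the base (x_27 in the quoted dump).  */
struct def {
    int prev;      /* index of the operand's definition, or -1 for the base */
    long addend;   /* constant added to the operand */
};

/* Walk from definition i back to the base, summing the addends.
   Stores the total offset in *offset and returns the number of steps.  */
static size_t resolve_offset(const struct def *defs, int i, long *offset)
{
    size_t steps = 0;
    long sum = 0;

    while (defs[i].prev != -1) {
        sum += defs[i].addend;
        i = defs[i].prev;
        steps++;
    }
    *offset = sum;
    return steps;
}

/* Resolve all n constructor elements independently; returns total steps.  */
static size_t resolve_all(const struct def *defs, size_t n, long *offsets)
{
    size_t total = 0;

    for (size_t i = 0; i < n; i++)
        total += resolve_offset(defs, (int) i, &offsets[i]);
    return total;
}
```

For the eight-element chain in the quoted dump (base, then seven
definitions each adding 3 to the previous one), the last element
resolves to base + 21 and the walks add up to 0 + 1 + ... + 7 = 28
steps, i.e. n * (n - 1) / 2 for n elements. A variant that reused the
offset already computed for the previous element would be linear, but
whether that fits the actual patch is not something this sketch claims.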
Richard.