Bug 55001 - Handle VEC_COND_EXPR better in tree-vect-generic.c
Summary: Handle VEC_COND_EXPR better in tree-vect-generic.c
Status: NEW
Alias: None
Product: gcc
Classification: Unclassified
Component: tree-optimization
Version: 4.8.0
Importance: P3 enhancement
Target Milestone: ---
Assignee: Not yet assigned to anyone
URL:
Keywords: missed-optimization
Duplicates: 104501
Depends on:
Blocks: genvector 55167
Reported: 2012-10-20 18:38 UTC by Marc Glisse
Modified: 2024-06-28 12:43 UTC (History)
CC List: 2 users

See Also:
Host:
Target:
Build:
Known to work:
Known to fail:
Last reconfirmed: 2012-10-22 00:00:00


Attachments
Old patch (1.35 KB, patch)
2012-10-20 18:38 UTC, Marc Glisse

Description Marc Glisse 2012-10-20 18:38:31 UTC
Created attachment 28497
Old patch

Hello,

The code in tree-vect-generic.c that handles vector operations not provided by the target does not handle VEC_COND_EXPR. That didn't matter while the vectorizer was the only producer, but front ends are going to produce them as well any day now.

Attaching the patch I was using when experimenting, but IIRC it wasn't in a state for submission, and its assumption that the first argument can't be an SSA_NAME or a constant is now wrong.
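
For readers unfamiliar with the operation, here is a minimal sketch of what a fully scalarized comparison-driven vector conditional computes, written with GCC's vector extensions. The helper name select_lt_scalarized and the demo function are hypothetical and only model the semantics; this is not the code in tree-vect-generic.c or in the attached patch.

```c
#include <assert.h>

/* GCC vector extension type: four 32-bit ints. */
typedef int v4si __attribute__((vector_size(16)));

/* Hypothetical helper, not a GCC internal: computes *r = *a < *b ? *x : *y
   one lane at a time, i.e. the fully scalarized form of the conditional. */
void select_lt_scalarized(const v4si *a, const v4si *b,
                          const v4si *x, const v4si *y, v4si *r)
{
  v4si out = {0, 0, 0, 0};
  for (int i = 0; i < 4; i++)
    out[i] = (*a)[i] < (*b)[i] ? (*x)[i] : (*y)[i];
  *r = out;
}

/* Usage: lanes where a < b take the value from x, the others from y. */
void demo_select_lt_scalarized(void)
{
  v4si a = {1, 5, 3, 7}, b = {2, 4, 3, 8};
  v4si x = {10, 11, 12, 13}, y = {20, 21, 22, 23}, r;
  select_lt_scalarized(&a, &b, &x, &y, &r);
  assert(r[0] == 10 && r[1] == 21 && r[2] == 22 && r[3] == 13);
}
```

Vectors are passed by pointer, as in a typical testcase, to avoid ABI warnings on targets without vector registers.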
Comment 1 Richard Biener 2012-10-22 09:05:42 UTC
Confirmed.
Comment 2 Marc Glisse 2012-11-01 23:39:48 UTC
Author: glisse
Date: Thu Nov  1 23:39:44 2012
New Revision: 193077

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=193077
Log:
2012-11-01  Marc Glisse  <marc.glisse@inria.fr>

	PR middle-end/55001

gcc/
	* tree-vect-generic.c (expand_vector_condition): New function.
	(expand_vector_operations_1): Call it.

testsuite/
	* g++.dg/ext/vector19.C: Remove target restrictions.
	* gcc.dg/fold-compare-7.c: New testcase.


Added:
    trunk/gcc/testsuite/gcc.dg/fold-compare-7.c   (with props)
Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/testsuite/ChangeLog
    trunk/gcc/testsuite/g++.dg/ext/vector19.C
    trunk/gcc/tree-vect-generic.c

Propchange: trunk/gcc/testsuite/gcc.dg/fold-compare-7.c
            ('svn:eol-style' added)

Propchange: trunk/gcc/testsuite/gcc.dg/fold-compare-7.c
            ('svn:keywords' added)
Comment 3 Marc Glisse 2012-11-01 23:43:38 UTC
When vcond is not handled for large vector types, the code goes straight to scalars. It should first try small vectors, as is done for other operations.
Comment 4 Hans-Peter Nilsson 2012-11-27 23:29:08 UTC
Can this be considered fixed?  I don't see it failing anymore for cris-* (since at least r193085) or mmix-* (r193808), neither port having vector support.
Comment 5 Marc Glisse 2012-11-27 23:52:58 UTC
(In reply to comment #4)
> Can this be considered fixed?

Not completely. It doesn't fail anymore (so I marked PR55167 as fixed), but for architectures that support vectors of 2 elements, it will split a vector of 4 elements into 4 scalars instead of 2 subvectors, so this needs improving. The patch was whatever I managed to write quickly to stop the ICEs.

Now if you prefer, I could open another bug for this enhancement and we could close this one.
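
To illustrate the enhancement meant here, a rough sketch in C with GCC's vector extensions of lowering a 4-element conditional into 2 subvectors rather than 4 scalars. The names select_lt_piecewise and demo_select_lt_piecewise are hypothetical, and the v2si compare and bitwise select merely stand in for operations the target is assumed to support; the real lowering would happen on GIMPLE inside tree-vect-generic.c.

```c
#include <assert.h>
#include <string.h>

typedef int v4si __attribute__((vector_size(16)));
typedef int v2si __attribute__((vector_size(8)));

/* Hypothetical helper: *r = *a < *b ? *x : *y, lowered to two v2si pieces
   instead of four scalar selects. */
void select_lt_piecewise(const v4si *a, const v4si *b,
                         const v4si *x, const v4si *y, v4si *r)
{
  v2si ah[2], bh[2], xh[2], yh[2], rh[2];

  /* Split each 4-element operand into its low and high halves. */
  memcpy(ah, a, sizeof ah);
  memcpy(bh, b, sizeof bh);
  memcpy(xh, x, sizeof xh);
  memcpy(yh, y, sizeof yh);

  for (int h = 0; h < 2; h++) {
    v2si mask = ah[h] < bh[h];   /* each lane is -1 if true, 0 if false */
    rh[h] = (xh[h] & mask) | (yh[h] & ~mask);
  }

  /* Reassemble the two halves into the 4-element result. */
  memcpy(r, rh, sizeof rh);
}

/* Usage: the piecewise result matches the lane-by-lane definition. */
void demo_select_lt_piecewise(void)
{
  v4si a = {1, 5, 3, 7}, b = {2, 4, 3, 8};
  v4si x = {10, 11, 12, 13}, y = {20, 21, 22, 23}, r;
  select_lt_piecewise(&a, &b, &x, &y, &r);
  for (int i = 0; i < 4; i++)
    assert(r[i] == (a[i] < b[i] ? x[i] : y[i]));
}
```

The point of the sketch is only the shape of the result: two subvector operations per statement instead of four scalar ones.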
Comment 6 Hans-Peter Nilsson 2012-11-28 00:13:00 UTC
(In reply to comment #5)
> (In reply to comment #4)
> > Can this be considered fixed?
> Not completely. It doesn't fail anymore (so I marked PR55167 as fixed),

> Now if you prefer, I could open another bug for this enhancement and we could
> close this one.

Sorry, no, I was actually mixing those PRs up, following up on a note I had for the other PR.  Thanks for staying on it.
Comment 7 Richard Biener 2022-02-14 07:14:23 UTC
*** Bug 104501 has been marked as a duplicate of this bug. ***
Comment 8 Richard Biener 2024-06-28 12:38:05 UTC
Testcase:

typedef int v32si __attribute__((vector_size(128)));

void foo (v32si *a, v32si *b, v32si *c)
{
  *c = *a < *b
       ? (v32si){-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,
                 -1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1}
       : (v32si){0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
                 0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0};
}

This also shows that we initially create a signed-boolean:32 mask temporary
even with AVX512VL, which we'd need to change to signed-boolean:1 when
we want to lower to two V16SImode vectors.
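
As a loose user-level model of the two mask forms (GCC's internal signed-boolean:N types cannot be written in C), the sketch below contrasts a mask with one 32-bit lane per element against a mask with one bit per element, and splits the latter into the two 16-bit halves that two 16-element vectors would use. pack_mask and demo_mask_forms are hypothetical names, and the bit-per-lane layout is an assumption made for illustration.

```c
#include <assert.h>
#include <stdint.h>

typedef int v32si __attribute__((vector_size(128)));

/* Hypothetical helper: pack a wide mask (each 32-bit lane is -1 or 0)
   into one bit per lane, with lane i stored in bit i. */
uint32_t pack_mask(const v32si *wide)
{
  uint32_t bits = 0;
  for (int i = 0; i < 32; i++)
    if ((*wide)[i] != 0)
      bits |= (uint32_t)1 << i;
  return bits;
}

/* Usage: compare two 32-element vectors and view the result both ways. */
void demo_mask_forms(void)
{
  v32si a = {0}, b = {0};
  for (int i = 0; i < 32; i++) {
    a[i] = i;
    b[i] = 16;
  }

  v32si wide = a < b;                    /* lanes 0..15 are -1, 16..31 are 0 */
  uint32_t bits = pack_mask(&wide);      /* one bit per lane */
  uint16_t lo = (uint16_t)(bits & 0xffffu); /* mask for the low 16 lanes */
  uint16_t hi = (uint16_t)(bits >> 16);     /* mask for the high 16 lanes */

  assert(wide[0] == -1 && wide[31] == 0);
  assert(bits == 0x0000ffffu);
  assert(lo == 0xffffu && hi == 0);
}
```

With the bit-per-lane form, splitting the mask for the two halves is a shift and a truncation, whereas the wide form has to be split like any other 32-element vector.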
Comment 9 Richard Biener 2024-06-28 12:43:47 UTC
Note it's going to be _way_ easier to tackle once we have gotten rid of
vcond{,u,eq}, since then the compares and the condition can be lowered separately.