21998 – (cond ? result1 : result2) is vectorized, where equivalent if-syntax isn't (store)

Status:	NEW

Alias:	None

Product:	gcc
Classification:	Unclassified
Component:	tree-optimization
Version:	4.1.0

Importance:	P2 enhancement
Target Milestone:	---
Assignee:	Not yet assigned to anyone

URL:
Keywords:	missed-optimization

Depends on:
Blocks:	vectorizer

Reported:	2005-06-10 13:21 UTC by Stefaan De Roeck
Modified:	2023-08-04 20:36 UTC
CC List:	4 users

See Also:
Host:
Target:
Build:
Known to work:
Known to fail:
Last reconfirmed:	2006-02-05 21:14:34

Description Stefaan De Roeck 2005-06-10 13:21:59 UTC

The following two procedures are functionally equivalent, but the first (more
complicated) syntax is vectorized though the second isn't.

typedef int __attribute ((aligned (16))) aint;
void test(aint * __restrict a1, int const v1, int const v2) {
	for (int i=0; i<640; ++i)
		a1[i] = (a1[i] == v1 ? v2 : a1[i]);
}
void test2(aint * __restrict a1, int const v1, int const v2) {
	for (int i=0; i<640; ++i)
		if (a1[i] == v1) a1[i] = v2;
}

vecttest.cpp:7: note: === vect_analyze_loop_form ===
vecttest.cpp:7: note: not vectorized: too many BBs in loop.
vecttest.cpp:6: note: bad loop form.
vecttest.cpp:6: note: vectorized 0 loops in function.

Using built-in specs.
Target: i686-pc-linux-gnu
Configured with: /esat/alexandria1/sderoeck/src/gcc/main/configure
--prefix=/esat/olympia/install --program-suffix=-cvs --enable-languages=c,c++ :
(reconfigured) /esat/alexandria1/sderoeck/src/gcc/main/configure
--prefix=/esat/olympia/install --program-suffix=-cvs --enable-languages=c,c++ :
(reconfigured) /esat/alexandria1/sderoeck/src/gcc/main/configure
--prefix=/esat/olympia/install --program-suffix=-cvs --enable-languages=c,c++
--no-create --no-recursion : (reconfigured)
/esat/alexandria1/sderoeck/src/gcc/main/configure --prefix=/esat/olympia/install
--program-suffix=-cvs --enable-languages=c,c++ --no-create --no-recursion
Thread model: posix
gcc version 4.1.0 20050610 (experimental)
 /esat/olympia/install/libexec/gcc/i686-pc-linux-gnu/4.1.0/cc1plus -quiet -v
-D_GNU_SOURCE vecttest.cpp -quiet -dumpbase vecttest.cpp -march=pentium4
-auxbase-strip vecttest-gcc.s -O9 -version -fverbose-asm -ftree-vectorize
-fdump-tree-vect-details -fdump-tree-vect-stats -o vecttest-gcc.s
-- cut --
GNU C++ version 4.1.0 20050610 (experimental) (i686-pc-linux-gnu)
        compiled by GNU C version 3.4.4 (Gentoo 3.4.4, ssp-3.4.4-1.0, pie-8.7.8).
GGC heuristics: --param ggc-min-expand=30 --param ggc-min-heapsize=4096
Compiler executable checksum: 3cb76b13917ca148a15d77c9a1fb678d

Comment 1 Andrew Pinski 2005-06-10 13:26:41 UTC

They are not equivalent to GCC: the first always stores, while the second has a conditional store.

Comment 2 Andrew Pinski 2005-06-19 14:24:25 UTC

Confirmed.

Comment 3 Richard Biener 2012-07-13 08:54:14 UTC

Link to vectorizer missed-optimization meta-bug.

Comment 4 Steven Bosscher 2012-07-13 11:04:13 UTC

(In reply to comment #1)
> They are not equivalent to GCC, the first always stores, the second has a
> conditional store.

Just to clarify, 7 years later: To GCC the two procedures are not equivalent.

In the first procedure,
 a1[i] = (a1[i] == v1 ? v2 : a1[i]);

expands as:

  if (a1[i] == v1)
    a1[i] = v2;
  else
    a1[i] = a1[i];

while the second procedure expands just as-is:
  if (a1[i] == v1)
    a1[i] = v2;

In the first case there will always be a store to a1[i]; in the second case there is a store only when the comparison succeeds. Introducing new stores is not allowed, because doing so could introduce data races; see http://gcc.gnu.org/wiki/Atomic/GCCMM/DataRaces.

I'm not sure how GCC should transform the second procedure to allow the loop to be vectorized.
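
The difference in store behaviour can be made visible with a small sketch. This is only an illustration, not GCC's internal representation: CountedInt, store, ternary_form, if_form, total_stores and compare_forms are hypothetical names introduced here, and the store counter stands in for "a write to memory that another thread could observe".

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: an int slot that records how many times it was written.
struct CountedInt {
    int value;
    std::size_t stores;
};

static void store(CountedInt& slot, int v) {
    slot.value = v;
    ++slot.stores;
}

// Ternary form: every iteration writes a1[i], even when the value is unchanged.
void ternary_form(std::vector<CountedInt>& a1, int v1, int v2) {
    for (CountedInt& x : a1)
        store(x, x.value == v1 ? v2 : x.value);
}

// If form: a1[i] is written only when the comparison succeeds.
void if_form(std::vector<CountedInt>& a1, int v1, int v2) {
    for (CountedInt& x : a1)
        if (x.value == v1) store(x, v2);
}

// Sum of the per-element store counters.
std::size_t total_stores(const std::vector<CountedInt>& a) {
    std::size_t n = 0;
    for (const CountedInt& x : a) n += x.stores;
    return n;
}

// Usage: both forms leave the same values but issue different numbers of stores.
void compare_forms() {
    std::vector<CountedInt> a{{1, 0}, {2, 0}, {1, 0}, {3, 0}};
    std::vector<CountedInt> b = a;
    ternary_form(a, 1, 9);
    if_form(b, 1, 9);
    for (std::size_t i = 0; i < a.size(); ++i)
        assert(a[i].value == b[i].value);
    assert(total_stores(a) == 4);  // one store per element
    assert(total_stores(b) == 2);  // only the two matching elements
}
```

In a single thread the final values agree, which is why the two procedures look equivalent in the source. The extra stores in the ternary form are what a compiler would have to invent in order to turn the if form into the ternary form.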

Comment 5 Richard Biener 2012-07-13 11:28:45 UTC

We have two related flags here, -ftree-loop-if-convert-stores, and
--param allow-store-data-races.  We can adjust the former to honor the
latter if specified and then eventually vectorize this, too.

Comment 6 Richard Biener 2013-03-27 11:32:19 UTC

Note that the concern is also that a1 may be mapped to a read-only segment,
so introducing a store data-race may trap.  That's probably out of the C99
language standard's scope, but the middle-end has to care about this
possibility.

Comment 7 Andrew Pinski 2023-08-04 20:36:50 UTC

We can vectorize test2 using masked stores.
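
As a rough scalar model of that idea (not the code GCC emits, and not real vector intrinsics), the loop in test2 can be processed a fixed number of lanes at a time, with a per-lane mask deciding which elements are written. kLanes, masked_store and test2_masked are hypothetical names, and the lane count of 4 is an assumption chosen for illustration.

```cpp
#include <cassert>
#include <cstddef>

constexpr std::size_t kLanes = 4;  // assumed vector width: four 32-bit ints

// Scalar model of a masked vector store: only lanes with mask[l] set are
// written. Returns the number of lanes actually written.
std::size_t masked_store(int* dst, const int (&src)[kLanes],
                         const bool (&mask)[kLanes]) {
    std::size_t written = 0;
    for (std::size_t l = 0; l < kLanes; ++l) {
        if (mask[l]) {
            dst[l] = src[l];
            ++written;
        }
    }
    return written;
}

// test2 processed kLanes elements at a time.
// Assumption: n is a multiple of kLanes (as 640 is in the report).
std::size_t test2_masked(int* a1, std::size_t n, int v1, int v2) {
    assert(n % kLanes == 0);
    std::size_t written = 0;
    for (std::size_t i = 0; i < n; i += kLanes) {
        int splat[kLanes];
        bool mask[kLanes];
        for (std::size_t l = 0; l < kLanes; ++l) {
            mask[l] = (a1[i + l] == v1);  // models the vector compare
            splat[l] = v2;                // models broadcasting v2
        }
        written += masked_store(a1 + i, splat, mask);
    }
    return written;
}
```

Because lanes with a clear mask bit are never written, this form introduces no stores that the original if-syntax did not perform, which is the property the earlier comments require.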