AVX generic mode tuning discussion.

Jagasia, Harsha harsha.jagasia@amd.com
Mon Oct 31 21:21:00 GMT 2011


> > > We would like to propose changing AVX generic mode tuning to generate
> > > 128-bit AVX instead of 256-bit AVX.
> >
> > You indicate a 3% reduction on bulldozer with avx256.
> > How does avx128 compare to -mno-avx -msse4.2?
> 
> We see these % differences going from SSE42 to AVX128 to AVX256 on
> Bulldozer with "-mtune=generic -Ofast".
> (Positive is improvement, negative is degradation)
> 
> Bulldozer:
>                 AVX128/SSE42    AVX256/AVX128
> 410.bwaves         -1.4%           -1.4%
> 416.gamess         -1.1%            0.0%
> 433.milc            0.5%           -2.4%
> 434.zeusmp          9.7%           -2.1%
> 435.gromacs         5.1%            0.5%
> 436.cactusADM       8.2%          -23.8%
> 437.leslie3d        8.1%            0.4%
> 444.namd            3.6%            0.0%
> 447.dealII         -1.4%           -0.4%
> 450.soplex         -0.4%           -0.4%
> 453.povray          0.0%           -1.5%
> 454.calculix       15.7%           -8.3%
> 459.GemsFDTD        4.9%            1.4%
> 465.tonto           1.3%           -0.6%
> 470.lbm             0.9%            0.3%
> 481.wrf             7.3%           -3.6%
> 482.sphinx3         5.0%           -9.8%
> SPECFP              3.8%           -3.2%
> 
> > Will the next AMD generation have a usable avx256?
> > I'm not keen on the idea of generic mode being tuned
> > for a single processor revision that maybe shouldn't
> > actually be using avx at all.
> 
> We see a substantial gain in several SPECFP benchmarks going from SSE42
> to AVX128 on Bulldozer.
> IMHO, accomplishing even a 5% gain in an individual benchmark takes a
> hardware company several man months.
> The loss with AVX256 for Bulldozer is much more significant than the
> gain for SandyBridge.
> While the general trend in the industry is a move toward AVX256, for
> now we would be disadvantaging Bulldozer with this choice.
> 
> We have several customers who use -mtune=generic, which is the default
> unless a user explicitly overrides it, e.g. with -mtune=native. They are the
> ones who want to experiment with latest ISA using gcc, but want to keep
> their ISA selection and tuning agnostic on x86/64. IMHO, it is with
> these customers in mind that generic was introduced in the first place.

Since stage 1 closure is around the corner, I wanted to ping and see whether the maintainers have reached a decision on this one.
On Bulldozer in generic mode, AVX-128 improves SPECFP by 3.8% over SSE42, and AVX-256 then loses 3.2% relative to AVX-128, which wipes out nearly all of that gain.
Until x86/64 converges on AVX-256, we would like to propose that generic generate AVX-128 by default, with users overriding to AVX-256 manually where it is known to benefit performance.
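
To make the "wipes out the gain" arithmetic concrete, here is a rough sketch that chains the two columns of the table above into a net AVX256-vs-SSE42 figure for a few rows. It assumes the second column is measured relative to AVX128 and that the two percentages compound multiplicatively; the helper name compound() is only for illustration and is not part of any tool.

```python
def compound(first_pct, second_pct):
    """Net percent change after two successive percent changes (assumed multiplicative)."""
    return ((1 + first_pct / 100) * (1 + second_pct / 100) - 1) * 100


# (AVX128 vs SSE42, AVX256 vs AVX128) percentages copied from the Bulldozer table.
steps = {
    "436.cactusADM": (8.2, -23.8),
    "454.calculix": (15.7, -8.3),
    "482.sphinx3": (5.0, -9.8),
    "SPECFP": (3.8, -3.2),
}

# Net change going straight from SSE42 to AVX256, per row.
net = {name: compound(a, b) for name, (a, b) in steps.items()}

for name, pct in net.items():
    print(f"{name:15s} AVX256 vs SSE42: {pct:+.1f}%")
```

Under those assumptions the SPECFP row nets out to roughly +0.5% for AVX256 over SSE42, against +3.8% for AVX128 over SSE42.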

Thanks,
Harsha


