[PATCH] Use the middle-end boolean_type_node

Wed Jan 4 09:51:00 GMT 2017

On Tue, Jan 3, 2017 at 9:20 PM, FX <fxcoudert@gmail.com> wrote:
>> The regression with 8 bit boolean types surfaced with the z10 machine. The ABI is much older. Since
>> we cannot change it anymore we should rather talk to the hardware guys to add the instruction we
>> need. So for now we probably have to live with the regression in the Fortran cases since as I
>> understand it your change fixes an actual problem.
>
> As far as I understand (and Jane will correct me if I am wrong),

Sure, allow me to correct you: It's Janne, with two n in the middle. :)

> the patch does not fix anything in particular. The idea was that, by transitioning from having all boolean expressions from “int” to “bool” (in C terms), and thus from 32-bit to 8-bit on “typical” targets, the optimizer might be able to emit more compact code.

My motivations were:

- As you say, hopefully more compact code.

- Fixing the co-array intrinsics which pass arguments as
boolean_type_node and on the library side the corresponding arguments
are C _Bool's. I think this has worked previously accidentally, as
arguments are passed in registers anyway, and stack slots are 8
bytes(?) on x86-64. But of course, we shouldn't let this excuse us
from actually fixing it.

- Redefining an ABI type specified by the middle end is bad form, IMHO.

> I am not sure this was tested.
>
> So: maybe it is a case of "Profile. Don't speculate.”

I don't have access to spec2k6, but I just checked the polyhedron 2011
benchmark suite. Target x86_64-pc-linux-gnu, compile options -O3
-funroll-loops -ffast-math -ftree-loop-linear -march=native (native ==
amdfam10 here). The line counts of the .s files (~number of
instructions) are (working directory is current trunk,
gf_trunk_boolean_type_node is trunk with my patch reverted):

  6856 ac.s
  6856 ../gf_trunk_boolean_type_node/ac.s
 235206 aermod.s
 235337 ../gf_trunk_boolean_type_node/aermod.s
 22310 air.s
 22477 ../gf_trunk_boolean_type_node/air.s
 15221 capacita.s
 15252 ../gf_trunk_boolean_type_node/capacita.s
  4905 channel2.s
  4859 ../gf_trunk_boolean_type_node/channel2.s
  34269 doduc.s
  33610 ../gf_trunk_boolean_type_node/doduc.s
 11963 fatigue2.s
 11959 ../gf_trunk_boolean_type_node/fatigue2.s
 17369 gas_dyn2.s
 17304 ../gf_trunk_boolean_type_node/gas_dyn2.s
  31098 induct2.s
  31098 ../gf_trunk_boolean_type_node/induct2.s
  8366 linpk.s
  8367 ../gf_trunk_boolean_type_node/linpk.s
 15139 mdbx.s
 15155 ../gf_trunk_boolean_type_node/mdbx.s
  5302 mp_prop_design.s
  5314 ../gf_trunk_boolean_type_node/mp_prop_design.s
 17420 nf.s
 17457 ../gf_trunk_boolean_type_node/nf.s
 20804 protein.s
 20713 ../gf_trunk_boolean_type_node/protein.s
  52476 rnflow.s
  52449 ../gf_trunk_boolean_type_node/rnflow.s
  37339 test_fpu2.s
  37377 ../gf_trunk_boolean_type_node/test_fpu2.s
  8466 tfft2.s
  8448 ../gf_trunk_boolean_type_node/tfft2.s

So 7 of 17 benchmarks are slightly bigger after the patch (at least
for channel2 it seems due to trunk unrolling a loop which didn't
happen with the patch reverted, haven't checked others), 8 benchmarks
are slightly smaller after the patch, and 2 are equally long. Runtime
differences seem to be a wash, though I didn't test that very
carefully.

-- 
Janne Blomqvist