This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [wide-int] doloop fixes


On 11/02/2013 01:51 PM, Richard Sandiford wrote:
Kenneth Zadeck <zadeck@naturalbridge.com> writes:
On 11/02/2013 11:31 AM, Richard Sandiford wrote:
Kenneth Zadeck <zadeck@naturalbridge.com> writes:
On 11/02/2013 10:47 AM, Richard Sandiford wrote:
Kenneth Zadeck <zadeck@naturalbridge.com> writes:
On 11/02/2013 07:06 AM, Richard Sandiford wrote:
The first part of this is a simple type mismatch --
get_max_loop_iterations
returns a widest_int aka max_wide_int -- and I'd have installed it as
obvious.  The second part isn't as obvious though.  The old code stored
the maximum iterations as:

      if (!get_max_loop_iterations (loop, &iter)
          || !iter.fits_shwi ())
        iterations_max = const0_rtx;
      else
        iterations_max = GEN_INT (iter.to_shwi ());

and the new code uses:

      if (!get_max_loop_iterations (loop, &iter)
          || !wi::fits_shwi_p (iter))
        iterations_max = const0_rtx;
      else
        iterations_max = immed_wide_int_const (iter, mode);

which includes an extra canonicalisation.  I agree it would be good to do
that in principle, but I'm not sure it copes correctly with the case where
the loop iterates 1 << GET_MODE_PRECISION (mode) times.  Plus I think the
real fix would be to avoid the host dependence altogether, i.e. get rid
of the fits_shwi_p too.

As it stands, this breaks bfin-elf's pattern, which has:

      /* Due to limitations in the hardware (an initial loop count of 0
         does not loop 2^32 times) we must avoid to generate a hardware
         loops when we cannot rule out this case.  */
      if (!flag_unsafe_loop_optimizations
          && (unsigned HOST_WIDE_INT) INTVAL (operands[2]) >= 0xFFFFFFFF)
        FAIL;

With the sign-extending conversion, this now triggers more often than
it was supposed to.

Since the old "GEN_INT (iter.to_shwi ());" works verbatim in wide-int too,
and since we still use that form in the doloop_begin code, I think we should
just back the immed_wide_int_const out.

Tested on powerpc64-linux-gnu and x86_64-linux-gnu.  It fixes some
unwanted testsuite changes in bfin-elf.  OK to install?
I dislike this a lot.     I think that this is a badly written pattern
in the bfin port if it depends on the host size.   As machines get
bigger, (they may not be getting faster but they are still getting
bigger) having traps like this at the portable level will bite us.
But this isn't at the portable level, it's backend-specific code.
It knows that it's dealing with a 32-bit counter.
yes, but the patch is dumbing down the portable part of the compiler.
in truth this code should really be iterations_max =
immed_wide_int_const (iter, mode) with no tests at all.
But like I say, that won't cope with loops that iterate
1 << GET_MODE_PRECISION (mode) times, even if that fits
in a signed HWI.  (That case could happen if the hardware
supports it and if the counter starts out as 0.)

The mainline code is self-consistent in that the number of
iterations is passed as a positive signed HWI, using 0 if the
value isn't known or is greater than HOST_WIDE_INT_MAX.
In the current interface, the CONST_INT doesn't really have a mode.
It's just used as a convenient way of passing a HWI to an expander.

I'm not saying that it's the ideal interface.  But if we want to change it,
let's do that separately, and test it on targets that have doloops.
Until then, the wide-int branch can provide the current interface
just as easily as mainline.
we do not have a predicate at the rtl level like we do at the tree level
where we can ask if the value can be represented in a mode without
loosing any information.   But the proper thing to do here is to check
to see if iter fits in the mode, if it does, then generate the
immed_wide_int_const and if not generate the 0.   In this way it does
the right thing for loops that iterate 1<<GET_MODE_PRECISION (mode),
i.e. it will produce a 0.   And that will be correct even if the mode is
SI or TI.
Well, like I say, I don't see the current rtx has having a mode,
even conceptually.  The CONST_INT is just a convenient way of passing
an iteration count to a function that only accepts rtxes as arguments.
If we could, we'd be passing the widest_int directly as a pointer or
reference.

But since I'm just repeating what I said above, I should probably bow
out of this one.
you point seems to be that this is a place where the "assumption" that
rtxes are signed bites us.
if mode happens to be SImode but the variable was unsigned and large,
then we will sign extend this and get a pessimistic answer for this
pattern.
Right.

on the other hand, if we do what you suggest, then we are
forever limited by the host precision.

I see your point, but i do not know what to do.
Note that what I'm suggesting is simply restoring the mainline "else" code,
rather than something new.  wide-int doesn't change which of the if-else
arms are chosen, and if we keep the else arm as-is, existing code will
be happy.

"Forever" is a bit pessimistic (though probably true).  We're only
limited until somehow who cares about wide counters comes along and
changes things.  Or if you feel particularly motivated, you could do
it after the merge.  It just doesn't seem like something that needs to
be tackled here and now.

I mean, the current code is limited to CONST_INT -- and the doloop_end
pattern is specifically documented as taking a CONST_INT -- even though
we already have CONST_DOUBLE for wider types.  So the current set of rtxes
would already allow the pattern to pass wider iteration counts if necessary,
it's just that no port has needed them yet.

Thanks,
Richard
you are correct, that for now it needs to be "as was".


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]