This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.



[PATCH, SMS 1/3] Support closing_branch_deps (second try)


Hello,

The attached patch enhances SMS to support targets whose doloop part is
not decoupled from the rest of the loop's instructions, as SMS currently
requires. On such targets the branch cannot be placed wherever we want
(as is currently done), because it must honor dependencies. We therefore
schedule the branch instruction together with the rest of the loop's
instructions and, at the end of the scheduling procedure, rotate it into
row ii-1 so that it is the last instruction in the iteration.

The attached patch changes the current implementation to always schedule
the branch in order to support the above case.

As explained in http://gcc.gnu.org/ml/gcc-patches/2011-05/msg00250.html,
always scheduling the branch might affect code size, because SC can
increase by 1, which means adding instructions from at most one
iteration to the prologue and epilogue.  Also, ii might increase by one
due to resource constraints, namely the unavailability of free slots in
which to schedule the branch.
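
To illustrate the rotation to row ii-1 and its possible effect on SC,
here is a minimal standalone sketch.  It is not the code from the patch:
the helper names (floor_div, row_of, rotate_branch_to_last_row,
stage_count) are hypothetical, schedule times are kept in a plain int
array, and stages are assumed to be aligned to multiples of ii.

```c
#include <assert.h>

/* Floor division: rounds toward negative infinity, so cycles below
   zero still map to a well-defined stage.  */
static int floor_div(int a, int b)
{
    int q = a / b;
    if ((a % b != 0) && ((a < 0) != (b < 0)))
        q--;
    return q;
}

/* Row of a cycle inside the kernel: always in the range 0 .. ii-1.  */
static int row_of(int cycle, int ii)
{
    assert(ii > 0);
    return cycle - ii * floor_div(cycle, ii);
}

/* Hypothetical helper: shift every schedule time by the same constant
   so that the branch ends up at cycle -1, which is row ii-1.  */
static void rotate_branch_to_last_row(int *times, int n, int branch)
{
    int amount = times[branch] + 1;
    for (int i = 0; i < n; i++)
        times[i] -= amount;
}

/* Number of stages spanned by the schedule, assuming stage boundaries
   fall on multiples of ii (an assumption of this sketch).  */
static int stage_count(const int *times, int n, int ii)
{
    assert(n > 0 && ii > 0);
    int min = times[0], max = times[0];
    for (int i = 1; i < n; i++) {
        if (times[i] < min) min = times[i];
        if (times[i] > max) max = times[i];
    }
    return floor_div(max, ii) - floor_div(min, ii) + 1;
}
```

For example, with ii = 3 and schedule times {0, 1, 2} where the branch
is the first node, stage_count gives 1 before the rotation.  After
rotate_branch_to_last_row the times become {-1, 0, 1}, the branch sits
in row 2 (ii-1), and stage_count gives 2, i.e. SC grew by one.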

The patch was tested together with the rest of the patches in this series
and on top of the patch to support do-loop for ARM (not yet in mainline,
but approved: http://gcc.gnu.org/ml/gcc-patches/2011-01/msg01718.html).
On ppc64-redhat-linux: regtested, and bootstrapped with SMS flags
enabling SMS also on loops with stage count 1.  On SPU: regtested.
On arm-linux-gnueabi: regtested on c, c++, and bootstrapped the C
language with SMS flags enabling SMS also on loops with stage count 1.

OK for mainline?

Thanks,
Revital

ChangeLog:

        * ddg.c (create_ddg_dep_from_intra_loop_link): If a true dep edge
        enters the branch, create an anti edge in the opposite direction
        to prevent the creation of reg-moves.
        * modulo-sched.c: Adjust comment to reflect the fact we are
        scheduling closing branch.
        (PS_STAGE_COUNT): Rename to CALC_STAGE_COUNT and redefine.
        (stage_count): New field in struct partial_schedule.
        (calculate_stage_count): New function.
        (normalize_sched_times): Rename to reset_sched_times and handle
        incrementing the sched time of the nodes by a constant value
        passed as parameter.
        (duplicate_insns_of_cycles): Skip closing branch.
        (sms_schedule_by_order): Schedule closing branch.
        (ps_insn_find_column): Handle closing branch.
        (sms_schedule): Call reset_sched_times and adjust the code to
        support scheduling of the closing branch.
        (ps_insert_empty_row): Update calls to normalize_sched_times
        and rotate_partial_schedule functions.

testsuite ChangeLog:

        * gcc.target/arm/sms-9.c: New file.
        * gcc.target/arm/sms-10.c: New file.

Attachment: patch_final_linaro_6_5.txt
Description: Text document

/* { dg-do run } */
/* { dg-require-effective-target arm_thumb2_ok } */
/* { dg-options "-O2 -fmodulo-sched -fdump-rtl-sms -fno-auto-inc-dec -mthumb  -march=armv7-a" } */

extern void abort (void);

int filter1[8][4] = {
  {
   23170, -23170, -23170, 23170,},
  {
   22005, -26319, -16846, 29621,},
  {
   22005, -26319, -16846, 29621,},
  {
   5, -26319, -16846, 29621,},
  {
   55, -26319, -16846, 29621,},
  {
   77, -26319, -16846, 29621,},
  {
   22005, -26319, -16846, 29621,},
  {
   22005, -26319, -16846, 29621,},

};


int out[32] = {
  22, -22, -22, 22, 21, -25, -16, 28, 21, -25, -16, 28, 0, -25, -16, 28, 0,
    -25, -16, 28, 0, -25, -16, 28, 21, -25, -16, 28, 21, -25, -16, 28
};

__attribute__ ((noinline))
static void
foo (int *arr, int *accums)
{
  typedef int NN[8][4];
  static NN *filter;
  int i;
  filter = &filter1;

  int *filterp;
  int *arrp;
  arrp = arr;
  filterp = (int *) ((*filter)[0]);
  i = 32;

  while (i--)
    {
      *accums++ = (arrp[0] * filterp[0] + arrp[8] * filterp[0]) / 32768;
      filterp += 1;
    }
}

int
main ()
{
  int inarr[32];
  int accums[32];
  int i;
  for (i = 0; i < 32; i++)
    inarr[i] = i << 2;
  foo (inarr, accums);
  for (i = 0; i < 32; i++)
    if (out[i] != accums[i])
      abort ();
  return 0;
}

/* { dg-final { scan-rtl-dump-times "SMS succeeded" 1 "sms" } }  */
/* { dg-final { cleanup-rtl-dump "sms" } } */



/* { dg-do run } */
/* { dg-require-effective-target arm_thumb1_ok } */
/* { dg-options "-O2 -fmodulo-sched -fdump-rtl-sms -fno-auto-inc-dec -fmodulo-sched-allow-regmoves  -mthumb -gtoggle" } */

extern void abort (void);

unsigned char filter1[8] = { 2, 3, 1, 2, 3, 2, 2, 1 };


void
foo (int val, unsigned int size, unsigned char *dest)
{
  while (size != 0)
    {
      *dest++ = val & 0xff;
      --size;
    }
}


int
main ()
{
  int i;
  foo (50, 4, filter1);
  for (i = 0; i < 4; i++)
    if (filter1[i] != 50)
      abort ();
  return 0;
}

/* { dg-final { scan-rtl-dump-times "OK" 1 "sms" } }  */
/* { dg-final { cleanup-rtl-dump "sms" } } */


