27006 – [4.1/4.2 Regression] Invalid altivec constant loading code

Bug 27006 - [4.1/4.2 Regression] Invalid altivec constant loading code

Summary: [4.1/4.2 Regression] Invalid altivec constant loading code

Status:	RESOLVED FIXED

Alias:	None

Product:	gcc
Classification:	Unclassified
Component:	target (show other bugs)
Version:	4.2.0

Importance:	P3 normal
Target Milestone:	4.1.1
Assignee:	Paolo Bonzini

URL:	http://gcc.gnu.org/ml/gcc-patches/200...
Keywords:	patch, wrong-code

Depends on:
Blocks:

Reported:	2006-04-03 18:52 UTC by Ulrich Weigand
Modified:	2006-04-13 20:35 UTC (History)
CC List:	7 users (show)

See Also:
Host:
Target:	powerpc--*
Build:
Known to work:	4.0.2
Known to fail:
Last reconfirmed:	2006-04-03 19:17:18

Attachments
patch to fix the bug (386 bytes, patch) 2006-04-06 14:47 UTC, Paolo Bonzini	Details \| Diff
View All Add an attachment (proposed patch, testcase, etc.)

Note You need to log in before you can comment on or make changes to this bug.

Description Ulrich Weigand 2006-04-03 18:52:14 UTC

When compiling the following code with -O0 -maltivec:

typedef union
{
  int i[4];
  __attribute__((altivec(vector__))) int v;
} vec_int4;

int main (void)
{
   vec_int4 i1;

   i1.v = (__attribute__((altivec(vector__))) int){31, 31, 31, 31};
   printf ("%d\n", i1.i[0]);

   return 0;
}

the output printed is 30, not 31.

The load of the vector constant is done by the following pair
of instructions:

        vspltisw 0,15
        vadduwm 0,0,0

which are generated by this splitter in altivec.md:

(define_split
  [(set (match_operand:VI 0 "altivec_register_operand" "")
        (match_operand:VI 1 "easy_vector_constant_add_self" ""))]
  "TARGET_ALTIVEC && reload_completed"
  [(set (match_dup 0) (match_dup 3))
   (set (match_dup 0) (plus:VI (match_dup 0)
                               (match_dup 0)))]
{
  rtx dup = gen_easy_altivec_constant (operands[1]);
  rtx const_vec;

  /* Divide the operand of the resulting VEC_DUPLICATE, and use
     simplify_rtx to make a CONST_VECTOR.  */
  XEXP (dup, 0) = simplify_const_binary_operation (ASHIFTRT, QImode,
                                                   XEXP (dup, 0), const1_rtx);
  const_vec = simplify_rtx (dup);

  if (GET_MODE (const_vec) == <MODE>mode)
    operands[3] = const_vec;
  else
    operands[3] = gen_lowpart (<MODE>mode, const_vec);
})

Now, easy_vector_constand_add_self accepts all constants between
16 and 31, where I think it should really only be accepting *even*
constants.

The test is really implemented in rs6000.h:

#define EASY_VECTOR_15_ADD_SELF(n) (!EASY_VECTOR_15((n))        \
                                    && EASY_VECTOR_15((n) >> 1))

and adding a condition ((n) & 1) == 0 here fixes the problem.

Is this the proper solution?

Comment 1 Andrew Pinski 2006-04-03 19:10:27 UTC

This worked in 4.0.2, by just loading the constant via memory so this is a regression.

Comment 2 Andrew Pinski 2006-04-03 19:17:18 UTC

Confirmed, this is definitely wrong but we can do better than disabling this for odd vectors I think.  Adding 1 to them should work.

Comment 3 Paolo Bonzini 2006-04-06 10:56:14 UTC

For 4.1 I'd surely go with Ulrich's patch.  And for 4.2 I think that it starts being too expensive to have two vsplat and two adds.

        vspltisw 0,15
        vspltisw 1,1
        vadduwm 0,0,0
        vadduwm 0,0,1

Ulrich, can you prepare a patch or should I do so?

Comment 4 Ulrich Weigand 2006-04-06 14:03:36 UTC

(In reply to comment #3)
> Ulrich, can you prepare a patch or should I do so?

It would be great if you could do that, I don't yet
have a proper setup for ppc testing ...

Comment 5 Paolo Bonzini 2006-04-06 14:47:07 UTC

Created attachment 11217 [details]
patch to fix the bug

Patch untested but quite trivial.

Comment 6 Ulrich Weigand 2006-04-13 11:47:09 UTC

I've now tested and submitted the patch, thanks.

Comment 7 David Edelsohn 2006-04-13 15:37:41 UTC

Paolo, I'm not sure why you think that two splats and two adds is too expensive.  Memory is moving farther and farther away from the processor.

Comment 8 Ulrich Weigand 2006-04-13 20:27:05 UTC

Subject: Bug 27006

Author: uweigand
Date: Thu Apr 13 20:26:59 2006
New Revision: 112923

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=112923
Log:
2006-04-13  Paolo Bonzini  <bonzini@gnu.org>
	    Ulrich Weigand  <uweigand@de.ibm.com>

	PR target/27006
	* config/rs6000/rs6000.h (EASY_VECTOR_15_ADD_SELF): Require n
	to be even.

	PR target/27006
	* gcc.dg/vmx/pr27006.c: New testcase.


Added:
    trunk/gcc/testsuite/gcc.dg/vmx/pr27006.c
Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/config/rs6000/rs6000.h
    trunk/gcc/testsuite/ChangeLog

Comment 9 Ulrich Weigand 2006-04-13 20:33:59 UTC

Subject: Bug 27006

Author: uweigand
Date: Thu Apr 13 20:33:51 2006
New Revision: 112924

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=112924
Log:
2006-04-13  Paolo Bonzini  <bonzini@gnu.org>
	    Ulrich Weigand  <uweigand@de.ibm.com>

	PR target/27006
	* config/rs6000/rs6000.h (EASY_VECTOR_15_ADD_SELF): Require n
	to be even.

	PR target/27006
	* gcc.dg/vmx/pr27006.c: New testcase.

Added:
    branches/gcc-4_1-branch/gcc/testsuite/gcc.dg/vmx/pr27006.c
Modified:
    branches/gcc-4_1-branch/gcc/ChangeLog
    branches/gcc-4_1-branch/gcc/config/rs6000/rs6000.h
    branches/gcc-4_1-branch/gcc/testsuite/ChangeLog

Comment 10 Ulrich Weigand 2006-04-13 20:35:40 UTC

Fixed for 4.1 and mainline.

Comment 11 paolo.bonzini@lu.unisi.ch 2006-04-14 07:27:31 UTC

Subject: Re:  [4.1/4.2 Regression] Invalid altivec constant
 loading code


> I'm not sure why you think that two splats and two adds is too
> expensive.
>   
I'd hope that these constants stay in the cache...

I'd be more interested in implementing a splat+shift combination to make 
EASY_VECTOR_15_ADD_SELF more generic.

Comment 12 David Edelsohn 2006-04-14 14:32:05 UTC

Subject: Re:  [4.1/4.2 Regression] Invalid altivec constant loading code 

	One can produce a few more values fairly easily, but having GCC
know the optimal sequence for all constants could be rather bulky.

http://www.informatik.uni-bremen.de/~hobold/av/AltiVecConstants.asc
http://www.informatik.uni-bremen.de/~hobold/AltiVec.html