Bug 48052

Summary: loop not vectorized if index is "unsigned int"
Product: gcc Reporter: vincenzo Innocente <vincenzo.innocente>
Component: tree-optimization Assignee: Not yet assigned to anyone <unassigned>
Status: RESOLVED FIXED    
Severity: normal CC: evstupac, paolo.carlini, rguenth, spop
Priority: P3    
Version: 4.6.0   
Target Milestone: ---   
Host: Target:
Build: Known to work:
Known to fail: Last reconfirmed: 2011-03-10 09:46:16
Bug Depends on: 66396    
Bug Blocks:    
Attachments: patch

Description vincenzo Innocente 2011-03-09 18:49:49 UTC
is there any reason why "unsigned int" is not suited to index loop for auto-vectorization?
example

cat simpleLoop.cc
#include<cstddef>

void loop1( double const * __restrict__ x_in,  double * __restrict__ x_out, double const * __restrict__ c, int N) { 
   for(int i=0; i!=N; ++i)
       x_out[i] = c[i]*x_in[i];
}



void loop2( double const * __restrict__ x_in,  double * __restrict__ x_out, double const * __restrict__ c, unsigned int N) {
   for(unsigned int i=0; i!=N; ++i)
       x_out[i] = c[i]*x_in[i];
}

void loop21( double const * __restrict__ x_in,  double * __restrict__ x_out, double const * __restrict__ c, size_t N) {
   for(size_t i=0; i!=N; ++i)
       x_out[i] = c[i]*x_in[i];
}

void loop21( double const * __restrict__ x_in,  double * __restrict__ x_out, double const * __restrict__ c, unsigned long long N) {
   for(unsigned long long  i=0; i!=N; ++i)
       x_out[i] = c[i]*x_in[i];
}


void loop3( double const * __restrict__ x_in,  double * __restrict__ x_out, double const * __restrict__ c, size_t N) {
   double const * end = x_in+N;
   for(; x_in!=end; ++x_in, ++x_out, ++c)
       (*x_out) = (*c) * (*x_in);
}

result:

g++ -v -O2 -ftree-vectorize -ftree-vectorizer-verbose=2 -c simpleLoop.cc
Using built-in specs.
COLLECT_GCC=g++
COLLECT_LTO_WRAPPER=/usr/local/libexec/gcc/x86_64-unknown-linux-gnu/4.6.0/lto-wrapper
Target: x86_64-unknown-linux-gnu
Configured with: ./configure --enable-gold=yes --enable-lto --with-fpmath=avx
Thread model: posix
gcc version 4.6.0 20110205 (experimental) (GCC) 
COLLECT_GCC_OPTIONS='-v' '-O2' '-ftree-vectorize' '-ftree-vectorizer-verbose=2' '-c' '-shared-libgcc' '-mtune=generic' '-march=x86-64'
 /usr/local/libexec/gcc/x86_64-unknown-linux-gnu/4.6.0/cc1plus -quiet -v -D_GNU_SOURCE simpleLoop.cc -quiet -dumpbase simpleLoop.cc -mtune=generic -march=x86-64 -auxbase simpleLoop -O2 -version -ftree-vectorize -ftree-vectorizer-verbose=2 -o /tmp/innocent/ccUB9xBg.s
GNU C++ (GCC) version 4.6.0 20110205 (experimental) (x86_64-unknown-linux-gnu)
	compiled by GNU C version 4.6.0 20110205 (experimental), GMP version 4.3.2, MPFR version 2.4.2, MPC version 0.8.1
GGC heuristics: --param ggc-min-expand=30 --param ggc-min-heapsize=4096
ignoring nonexistent directory "/usr/local/lib/gcc/x86_64-unknown-linux-gnu/4.6.0/../../../../x86_64-unknown-linux-gnu/include"
#include "..." search starts here:
#include <...> search starts here:
 /usr/local/lib/gcc/x86_64-unknown-linux-gnu/4.6.0/../../../../include/c++/4.6.0
 /usr/local/lib/gcc/x86_64-unknown-linux-gnu/4.6.0/../../../../include/c++/4.6.0/x86_64-unknown-linux-gnu
 /usr/local/lib/gcc/x86_64-unknown-linux-gnu/4.6.0/../../../../include/c++/4.6.0/backward
 /usr/local/lib/gcc/x86_64-unknown-linux-gnu/4.6.0/include
 /usr/local/include
 /usr/local/lib/gcc/x86_64-unknown-linux-gnu/4.6.0/include-fixed
 /usr/include
End of search list.
GNU C++ (GCC) version 4.6.0 20110205 (experimental) (x86_64-unknown-linux-gnu)
	compiled by GNU C version 4.6.0 20110205 (experimental), GMP version 4.3.2, MPFR version 2.4.2, MPC version 0.8.1
GGC heuristics: --param ggc-min-expand=30 --param ggc-min-heapsize=4096
Compiler executable checksum: 0d52c927b640361d99f7371685058a2b

simpleLoop.cc:4: note: LOOP VECTORIZED.
simpleLoop.cc:3: note: vectorized 1 loops in function.

simpleLoop.cc:11: note: not vectorized: data ref analysis failed D.2386_13 = *D.2385_12;

simpleLoop.cc:10: note: vectorized 0 loops in function.

simpleLoop.cc:16: note: LOOP VECTORIZED.
simpleLoop.cc:15: note: vectorized 1 loops in function.

simpleLoop.cc:21: note: LOOP VECTORIZED.
simpleLoop.cc:20: note: vectorized 1 loops in function.

simpleLoop.cc:28: note: LOOP VECTORIZED.
simpleLoop.cc:26: note: vectorized 1 loops in function.
Comment 1 Paolo Carlini 2011-03-10 00:21:19 UTC
Richard, is this issue known? Seems indeed rather weird to me.
Comment 2 Richard Biener 2011-03-10 09:46:17 UTC
This is a known issue with POINTER_PLUS_EXPR semantics, how the C frontend
handles pointer-based array accesses and fold.  And in the end SCEV analysis.
The issue is we end up with

  *(c + (((long unsigned int)i) * 8))

with that 'long unsigned int' being sizetype.  At the point of SCEV
analysis we do not factor in the fact that i does not wrap around and
that because of this the evolution is
{ c, +, 8 }

With signed integers we simply exploit undefined behavior.

So yes, it's a known problem (but I always fail to remember a testcase
where it matters ;)).

In the very end my plan was to fix this all with no-undefined-overflow
branch, but maybe Sebastian can think of a way to use number-of-iteration
analysis in SCEV?  (Ugh, that's a chicken-and-egg problem, no?)
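
To make the wrap-around concern concrete, here is a minimal sketch (not GCC code; the helper names are made up for illustration) of the byte offset computed for c[i] when a 32-bit unsigned index is widened to a 64-bit sizetype and scaled by sizeof(double) == 8. Unsigned wrap-around is well defined, so unless the analysis can prove that i never wraps, the offsets are not guaranteed to follow { c, +, 8 }:

```cpp
#include <cstdint>

// Offset for c[i] with a 32-bit unsigned index: zero-extend to 64 bits,
// then scale by 8 (assumed sizeof(double)).
std::uint64_t offset_u32(std::uint32_t i) {
    return static_cast<std::uint64_t>(i) * 8u;
}

// Offset used by the iteration that follows index i (32-bit index).
// The increment happens in 32 bits, so it can wrap to 0 before widening.
std::uint64_t next_offset_u32(std::uint32_t i) {
    ++i;
    return offset_u32(i);
}

// Same, but with an index that is already 64 bits wide.
std::uint64_t next_offset_u64(std::uint64_t i) {
    ++i;
    return i * 8u;
}
```

At i == UINT32_MAX the 32-bit variant jumps back to offset 0, while the 64-bit variant simply advances by 8. That is consistent with the observation that only the narrower unsigned type causes trouble on x86_64.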
Comment 3 Paolo Carlini 2011-03-10 10:22:48 UTC
Thanks for the analysis. I knew about the difference between signed and unsigned, makes sense. Not knowing in detail the internals of the optimization the puzzling bit is that types wider than unsigned int already work fine. The problem seems fixable, somehow ;)
Comment 4 vincenzo Innocente 2011-03-10 10:54:07 UTC
Thanks for the fast reaction.
I would like to point out that, at least on x86_64, the only type that does not work is
"unsigned int".
"unsigned long long" and "size_t" (the same width) seem to work (see the 3rd, 4th and 5th loops in my example).

vincenzo


Comment 5 Paolo Carlini 2011-03-10 11:22:26 UTC
Vincenzo, if I understand correctly (maybe Sebastian can also tell us more), the issue seems to be that, at some stage, the logic is fully general only for the widest unsigned type (*) and doesn't cope with narrower types. Thus, if my theory is correct, unsigned char, unsigned short, etc., should all cause problems. On the other hand, on x86_64, unsigned long, unsigned long long and size_t are all the same size, and all work (**).

(*) I don't consider int128; I don't think it is relevant for loop optimization.
(**) On x86, however, unsigned int (aka unsigned long) appears to work, hum.
Comment 6 Paolo Carlini 2011-03-10 11:30:58 UTC
Well, on x86, in terms of addressing unsigned int (aka long) *is* the widest type, morally unsigned long long doesn't count.
Comment 7 vincenzo Innocente 2011-03-11 10:16:37 UTC
What's the probability of having this fixed?
We depend on a third-party matrix library
that is fully templated and uses "unsigned int" everywhere.
I made a test with
sed -i 's/unsigned int/unsigned long long/g'
and it MAKES a difference (up to a factor of 2 in speed).
This modification (although trivial) changes the type of the templated vector and matrix and the signatures of functions,
and it also affects user code.
It is neither transparent nor backward compatible.
I think we cannot afford the change in production: it is much easier to change the compiler version!

    Thanks for any effort dedicated to solve this issue,

         Vincenzo
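
A less invasive variant of that experiment, sketched here under the assumption that the loop bodies can be touched while the signatures cannot, keeps "unsigned int" in the interface and only widens the index locally. The name loop2_wide is hypothetical, and this has not been checked against 4.6.0; it only mirrors the size_t loop that the vectorizer output above reports as vectorized:

```cpp
#include <cstddef>

// Same signature as loop2 (N stays "unsigned int"), but the induction
// variable is a size_t, as in the size_t loop from the original example.
void loop2_wide(double const* __restrict__ x_in, double* __restrict__ x_out,
                double const* __restrict__ c, unsigned int N) {
    std::size_t const n = N;  // widen the bound once, outside the loop
    for (std::size_t i = 0; i != n; ++i)
        x_out[i] = c[i] * x_in[i];
}
```

This leaves template parameters and function signatures unchanged, at the cost of editing each loop by hand.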

Comment 8 rguenther@suse.de 2011-03-11 10:26:47 UTC
On Fri, 11 Mar 2011, vincenzo.innocente at cern dot ch wrote:

> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=48052
> 
> --- Comment #7 from vincenzo Innocente <vincenzo.innocente at cern dot ch> 2011-03-11 10:16:37 UTC ---
> what's the probablity to have this fixed?

A fix for this is quite involved (and honestly it's been on my TODO list for
at least two years - but I chickened out repeatedly because of all the
issues).  I'm not sure if a SCEV-local fix is possible; Sebastian
will probably comment on this.

Richard.
Comment 9 vincenzo Innocente 2011-03-14 10:08:29 UTC
It is interesting to note that with a fixed trip count (as in these trivial or templated examples)
vectorization also works for unsigned int:

void loop10( double const * __restrict__ x_in,  double * __restrict__ x_out, double const * __restrict__ c) {
   for(unsigned int i=0; i!=10; ++i)
       x_out[i] = c[i]*x_in[i];
}


template<typename T, unsigned int N>
void loopTu( T const * __restrict__ x_in,  T * __restrict__ x_out, T const * __restrict__ c) {
   for(unsigned int i=0; i!=N; ++i)
       x_out[i] = c[i]*x_in[i];
}

template<typename T, unsigned long long N>
void loopTull( T const * __restrict__ x_in,  T * __restrict__ x_out, T const * __restrict__ c) {
   for(unsigned long long i=0; i!=N; ++i)
       x_out[i] = c[i]*x_in[i];
}


void go(double const * __restrict__ x_in,  double * __restrict__ x_out, double const * __restrict__ c) {
 
  loopTu<double,10>(x_in, x_out, c);

  loopTull<double,10>(x_in, x_out, c);

}
Comment 10 zaafrani 2015-05-04 19:33:31 UTC
Created attachment 35459 [details]
patch

This is an old thread and we are still running into similar issues: code is not being vectorized on 64-bit targets because SCEV cannot adequately analyze the overflow condition.
While the original test case shown here seems to work now, it does not work if the start value is not a constant and the loop index variable is of unsigned type. Example:

void loop2( double const * __restrict__ x_in,  double * __restrict__ x_out, double const * __restrict__ c, unsigned int N, unsigned int start) {
       for(unsigned int i=start; i!=N; ++i)
                x_out[i] = c[i]*x_in[i];
  }

Here is our unit test: 

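int foo(int* A, int* C, unsigned start, unsigned B)
{
  int s = 0;  /* accumulator must start at zero */
  for (unsigned k = start; k < start + B; k++)
    s += A[k] * C[k];  /* second array named C so it does not clash with block factor B */

  return s;
}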

Our unit test case is extracted from a matrix multiply of a two-dimensional array and all loops are blocked by hand by a factor of B.  Even though a bit modified, above loop corresponds to the innermost loop of the blocked matrix multiply. 

We worked on a patch to solve the problem (see attachment).
Comment 11 Richard Biener 2015-05-06 06:56:57 UTC
That's an interesting idea - your argument is that if niter analysis was able to compute an expression for the number of iterations and the cast we are looking at
is a widening of a BIV then it is ok to assume the BIV does not wrap.

Unfortunately this breaks down (possibly not in practice, due to your exclusion of a constant initial BIV value) for cases like


  for (unsigned i = 3; i != 2; i+=7)
    ;

where niter analysis can still compute the number of iterations (I've made
the numbers up, so maybe that loop will never terminate...).

Still the idea is interesting as we might be able to record whether BIVs
overflow or not.
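
To illustrate the kind of loop meant here, the following sketch (the helper name count_iterations is made up) counts both the iterations and the wrap-arounds of a loop with a "!=" exit test. It can be instantiated with an 8-bit type so that it finishes instantly; the numbers for that case follow from modular arithmetic, not from anything GCC computes:

```cpp
#include <cstdint>
#include <utility>

// Simulates: for (T i = start; i != stop; i += step) ;
// T is assumed to be an unsigned integer type. Returns {iterations, wraps}.
// Note: if stop is unreachable (e.g. an even step and wrong parity),
// this never returns, just like the loop it models.
template <typename T>
std::pair<std::uint64_t, std::uint64_t> count_iterations(T start, T stop, T step) {
    std::uint64_t iterations = 0;
    std::uint64_t wraps = 0;
    for (T i = start; i != stop;) {
        T next = static_cast<T>(i + step);  // arithmetic modulo 2^bits
        if (next < i)
            ++wraps;                        // the addition wrapped around
        i = next;
        ++iterations;
    }
    return {iterations, wraps};
}
```

With std::uint8_t and the values 3, 2, 7 the loop terminates after 73 iterations and wraps twice, so a computable iteration count does not by itself imply a non-wrapping BIV. By the same modular arithmetic, the 32-bit loop as written should also terminate, since 7 is odd and therefore invertible modulo 2^32; it just takes over a billion iterations.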
Comment 12 zaafrani 2015-05-07 20:43:24 UTC
Thank you for the feedback.

We excluded a constant start value because that case is already
working. To our knowledge, it is only when the start value is unknown and
the loop index is of unsigned type that we fail to recognize
non-overflow in situations where it is possible to deduce it. For most
other cases, the current analysis done in scev_probably_wraps_p seems to
be working fine. We also added the assumption that the step equals 1 so that
we can make a correct decision about non-overflow. So, basically, we'd
rather catch a few simple cases and make them work than try to
generalize the scope and not be able to prove much.
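
The step-1 restriction can be checked exhaustively on a small type. The sketch below (helper names are hypothetical, and this is not the patch itself) models a loop of the form "for (k = start; k < bound; ++k)" with an 8-bit unsigned index and looks for any start/bound pair for which the index wraps:

```cpp
#include <cstdint>

// Returns true if the index of `for (k = start; k < bound; ++k)` ever
// wraps around, using an 8-bit unsigned index as a small model.
bool iv_wraps(std::uint8_t start, std::uint8_t bound) {
    for (std::uint8_t k = start; k < bound;) {
        std::uint8_t next = static_cast<std::uint8_t>(k + 1);
        if (next < k)
            return true;  // increment wrapped past the maximum value
        k = next;
    }
    return false;
}

// Tries every (start, bound) pair representable in 8 bits.
bool any_pair_wraps() {
    for (unsigned s = 0; s < 256; ++s)
        for (unsigned b = 0; b < 256; ++b)
            if (iv_wraps(static_cast<std::uint8_t>(s), static_cast<std::uint8_t>(b)))
                return true;
    return false;
}
```

No pair wraps, because inside the body k < bound holds, so k + 1 is at most bound and always fits in the type. This says nothing about whether the bound expression itself (such as start+B in the unit test) wraps when it is computed; that is a separate question.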


Comment 13 AK 2015-05-22 16:18:56 UTC
We have an updated patch that works for both the cases.
https://gcc.gnu.org/ml/gcc-patches/2015-05/msg01991.html
Comment 14 bin cheng 2015-06-02 10:19:50 UTC
Author: amker
Date: Tue Jun  2 10:19:18 2015
New Revision: 224020

URL: https://gcc.gnu.org/viewcvs?rev=224020&root=gcc&view=rev
Log:

	PR tree-optimization/48052
	* cfgloop.h (struct control_iv): New.
	(struct loop): New field control_ivs.
	* tree-ssa-loop-niter.c : Include "stor-layout.h".
	(number_of_iterations_lt): Set no_overflow information.
	(number_of_iterations_exit): Init control iv in niter struct.
	(record_control_iv): New.
	(estimate_numbers_of_iterations_loop): Call record_control_iv.
	(loop_exits_before_overflow): New.  Interface factored out of
	scev_probably_wraps_p.
	(scev_probably_wraps_p): Factor loop niter related code into
	loop_exits_before_overflow.
	(free_numbers_of_iterations_estimates_loop): Free control ivs.
	* tree-ssa-loop-niter.h (free_loop_control_ivs): New.

	gcc/testsuite/ChangeLog
	PR tree-optimization/48052
	* gcc.dg/tree-ssa/scev-8.c: New.
	* gcc.dg/tree-ssa/scev-9.c: New.
	* gcc.dg/tree-ssa/scev-10.c: New.
	* gcc.dg/vect/pr48052.c: New.


Added:
    trunk/gcc/testsuite/gcc.dg/tree-ssa/scev-10.c
    trunk/gcc/testsuite/gcc.dg/tree-ssa/scev-8.c
    trunk/gcc/testsuite/gcc.dg/tree-ssa/scev-9.c
    trunk/gcc/testsuite/gcc.dg/vect/pr48052.c
Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/cfgloop.h
    trunk/gcc/testsuite/ChangeLog
    trunk/gcc/tree-ssa-loop-niter.c
    trunk/gcc/tree-ssa-loop-niter.h
Comment 15 Stupachenko Evgeny 2015-06-23 13:23:19 UTC
The commit caused regressions on some benchmarks. Test to reproduce
(compilation flags: -Ofast):

int foo (int flag, char *a)                                                    
{                                                                              
  short i, j;                                                                  
  short l = 0;                                                                 
  if (flag == 1)                                                               
    l = 3;                                                                     

  for (i = 0; i < 4; i++)                                                      
    {                                                                          
      for (j = l - 1; j > 0; j--)                                              
        a[j] = a[j - 1];                                                       
      a[0] = i;                                                                
    }                                                                          
}

Here the value of l is between 0 and 3, and therefore the initial value of the innermost loop index (l - 1) is between -1 and 2.

After the commit the innermost loop is replaced with a memmove call. This is obviously not optimal, as the amount of memory to move is no greater than 2 bytes.
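
For reference, a sketch of the equivalence being exploited (the function names are made up, and the memmove form is an assumption about what the transformation amounts to, not a dump of the generated code): the inner loop shifts the first l - 1 bytes up by one position, which is what a memmove of l - 1 bytes from a to a + 1 does.

```cpp
#include <cstring>

// Inner loop from the test case, isolated.
void shift_loop(char* a, short l) {
    for (short j = l - 1; j > 0; j--)
        a[j] = a[j - 1];
}

// Assumed equivalent after the loop is turned into a library call.
void shift_memmove(char* a, short l) {
    if (l - 1 > 0)
        std::memmove(a + 1, a, static_cast<std::size_t>(l - 1));
}
```

With l limited to 0 or 3, the call moves either nothing or 2 bytes, which is why the call overhead dominates.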
Comment 16 bin cheng 2015-06-23 13:52:51 UTC
(In reply to Stupachenko Evgeny from comment #15)

Hi, thank you for reporting this.  I shall have a look.
Comment 17 bin cheng 2015-06-24 02:34:07 UTC
(In reply to Stupachenko Evgeny from comment #15)

This is a latent optimization issue in loop-niter/loop-dist, revealed because more SCEVs are recognized now.
I filed PR66646 for tracking.

Thanks,
bin
Comment 18 ctice 2016-04-08 17:09:42 UTC
Author: ctice
Date: Fri Apr  8 17:09:09 2016
New Revision: 234832

URL: https://gcc.gnu.org/viewcvs?rev=234832&root=gcc&view=rev
Log:
Unify changes with Android's GCC 4.9 compiler.

Add the following changes from the Android
GCC 4.9 compiler (mostly adding fixes for aarch64):

Fix mingw build breakage
    1) Add missing _GCOV_fopen if !__KERNEL__
    2) Use _fullpath

Backport Cortex-A57's machine description support from trunk

Adjust generic move costs for aarch64. Backport from trunk

Enable C++ exceptions and RTTI by default.

Modify LINK_SPEC to pass --fix-cortex-a53-843419 as default

Rename libstdc++.so to libgnustl_shared.so when enabling bionic libs.

Drop mips64r2 from Android compiler's multilib

Merge "Drop mips64r2 from Android compiler's multilib"

Adjust several costs for AArch64:
  Refactor aarch64_address_costs; add cost tables for Cortex-A7;
  better estimate cost of building a constant; wrap aarch64_rtx_costs
  to dump verbose output; factor out common MULT cases; det default
  costs and handle vector modes; cost memory accesses using address
  costs; better cost logical operations; improve costs for div/mod and
  sign/zero extend operations; cost comparisons, flag setting
  operators and IF_THEN_ELSE; cost more Floating point RTX; cost
  TRUNCATE, SET, SYMBOL_REF, HIGH and LO_SUM; dump a message if we are
  unable to cost an insn; fix typos in cost data structure.


Add several improvements for AArch64 (Backported from GCC 5):
  (spill code - swap order in shr patterns; spill code - swap order in
  shl pattern; fix aarch64_rtx_costs of PLUS/MINUS; cost operand 0 in
  FP compare-with-0.0 case; properly cost FABD pattern; properly
  handle mvn-register and add EON+shift pattern and cost
  appropriately).

Disable inlining of memcpy for x86 with 'rep movs'.
Default to TLS guard for x86 stack-protector.
Change gcc BASE-VER from 4.9.x-google to 4.9.x

Cherry pick the following fixes from trunk: PR bootstrap/66638, 67954
(svn rev 230894, PR tree-optimization/65447,
PR tree-optimization/52563, tree-optimization/62173,
PR tree-optimization/48052, PR 64878, PR65048, PR65177, PR65735.

Port revision 219584 from linaro/gcc-4_9-branch

Fix for arm64 bad code for copysignl.


Added:
    branches/google/gcc-4_9-mobile/gcc/sancov.c
    branches/google/gcc-4_9-mobile/gcc/testsuite/gcc.dg/sancov/
    branches/google/gcc-4_9-mobile/gcc/testsuite/gcc.dg/sancov/asan.c
    branches/google/gcc-4_9-mobile/gcc/testsuite/gcc.dg/sancov/basic0.c
    branches/google/gcc-4_9-mobile/gcc/testsuite/gcc.dg/sancov/basic1.c
    branches/google/gcc-4_9-mobile/gcc/testsuite/gcc.dg/sancov/basic2.c
Modified:
    branches/google/gcc-4_9-mobile/ChangeLog
    branches/google/gcc-4_9-mobile/config/futex.m4
    branches/google/gcc-4_9-mobile/configure
    branches/google/gcc-4_9-mobile/configure.ac
    branches/google/gcc-4_9-mobile/gcc/BASE-VER
    branches/google/gcc-4_9-mobile/gcc/ChangeLog
    branches/google/gcc-4_9-mobile/gcc/Makefile.in
    branches/google/gcc-4_9-mobile/gcc/builtins.def
    branches/google/gcc-4_9-mobile/gcc/cfghooks.c
    branches/google/gcc-4_9-mobile/gcc/cfgloop.c
    branches/google/gcc-4_9-mobile/gcc/cfgloop.h
    branches/google/gcc-4_9-mobile/gcc/common.opt
    branches/google/gcc-4_9-mobile/gcc/config/aarch64/aarch64-cores.def
    branches/google/gcc-4_9-mobile/gcc/config/aarch64/aarch64-elf-raw.h
    branches/google/gcc-4_9-mobile/gcc/config/aarch64/aarch64-linux.h
    branches/google/gcc-4_9-mobile/gcc/config/aarch64/aarch64-protos.h
    branches/google/gcc-4_9-mobile/gcc/config/aarch64/aarch64-tune.md
    branches/google/gcc-4_9-mobile/gcc/config/aarch64/aarch64.c
    branches/google/gcc-4_9-mobile/gcc/config/aarch64/aarch64.md
    branches/google/gcc-4_9-mobile/gcc/config/aarch64/aarch64.opt
    branches/google/gcc-4_9-mobile/gcc/config/i386/i386.c
    branches/google/gcc-4_9-mobile/gcc/config/linux-android.h
    branches/google/gcc-4_9-mobile/gcc/config/mips/t-linux-android64
    branches/google/gcc-4_9-mobile/gcc/configure
    branches/google/gcc-4_9-mobile/gcc/doc/invoke.texi
    branches/google/gcc-4_9-mobile/gcc/except.c
    branches/google/gcc-4_9-mobile/gcc/expmed.c
    branches/google/gcc-4_9-mobile/gcc/gcov-io.h
    branches/google/gcc-4_9-mobile/gcc/loop-init.c
    branches/google/gcc-4_9-mobile/gcc/lra-constraints.c
    branches/google/gcc-4_9-mobile/gcc/omp-low.c
    branches/google/gcc-4_9-mobile/gcc/params.def
    branches/google/gcc-4_9-mobile/gcc/passes.def
    branches/google/gcc-4_9-mobile/gcc/sanitizer.def
    branches/google/gcc-4_9-mobile/gcc/testsuite/ChangeLog
    branches/google/gcc-4_9-mobile/gcc/testsuite/gcc.dg/tree-ssa/scev-3.c
    branches/google/gcc-4_9-mobile/gcc/testsuite/gcc.dg/tree-ssa/scev-4.c
    branches/google/gcc-4_9-mobile/gcc/tree-cfg.c
    branches/google/gcc-4_9-mobile/gcc/tree-cfg.h
    branches/google/gcc-4_9-mobile/gcc/tree-chrec.c
    branches/google/gcc-4_9-mobile/gcc/tree-chrec.h
    branches/google/gcc-4_9-mobile/gcc/tree-pass.h
    branches/google/gcc-4_9-mobile/gcc/tree-scalar-evolution.c
    branches/google/gcc-4_9-mobile/gcc/tree-scalar-evolution.h
    branches/google/gcc-4_9-mobile/gcc/tree-ssa-loop-ivopts.c
    branches/google/gcc-4_9-mobile/gcc/tree-ssa-loop-niter.c
    branches/google/gcc-4_9-mobile/gcc/tree-ssa-loop-niter.h
    branches/google/gcc-4_9-mobile/gcc/tree-ssa-threadedge.c
    branches/google/gcc-4_9-mobile/gcc/tree-ssa-threadupdate.c
    branches/google/gcc-4_9-mobile/gcc/tree-ssa-threadupdate.h
    branches/google/gcc-4_9-mobile/libgcc/libgcov-util.c
    branches/google/gcc-4_9-mobile/libstdc++-v3/Makefile.in
    branches/google/gcc-4_9-mobile/libstdc++-v3/acinclude.m4
    branches/google/gcc-4_9-mobile/libstdc++-v3/configure
    branches/google/gcc-4_9-mobile/libstdc++-v3/configure.ac
    branches/google/gcc-4_9-mobile/libstdc++-v3/doc/Makefile.in
    branches/google/gcc-4_9-mobile/libstdc++-v3/include/Makefile.in
    branches/google/gcc-4_9-mobile/libstdc++-v3/libsupc++/Makefile.in
    branches/google/gcc-4_9-mobile/libstdc++-v3/po/Makefile.in
    branches/google/gcc-4_9-mobile/libstdc++-v3/python/Makefile.in
    branches/google/gcc-4_9-mobile/libstdc++-v3/src/Makefile.am
    branches/google/gcc-4_9-mobile/libstdc++-v3/src/Makefile.in
    branches/google/gcc-4_9-mobile/libstdc++-v3/src/c++11/Makefile.in
    branches/google/gcc-4_9-mobile/libstdc++-v3/src/c++98/Makefile.in
    branches/google/gcc-4_9-mobile/libstdc++-v3/testsuite/Makefile.in
Comment 19 bin cheng 2016-04-08 17:25:42 UTC
I think this is fixed now.
Comment 20 bin cheng 2016-04-18 11:00:50 UTC
Fixed