Bug 52357 - 64bit-out.go and go.test/test/cmplxdivide.go time out on Solaris/SPARC
Summary: 64bit-out.go and go.test/test/cmplxdivide.go time out on Solaris/SPARC
Status: SUSPENDED
Alias: None
Product: gcc
Classification: Unclassified
Component: go
Version: 4.7.0
Importance: P3 normal
Target Milestone: ---
Assignee: Ian Lance Taylor
URL:
Keywords:
Depends on: 53125
Blocks:
Reported: 2012-02-23 17:07 UTC by Rainer Orth
Modified: 2019-04-01 04:27 UTC

See Also:
Host: sparc-sun-solaris2*
Target: sparc-sun-solaris2*
Build: sparc-sun-solaris2*
Known to work:
Known to fail:
Last reconfirmed: 2012-04-25 00:00:00


Description Rainer Orth 2012-02-23 17:07:57 UTC
The 64bit-out.go and go.test/test/cmplxdivide.go tests often time out on Solaris/SPARC.

On unloaded machines, I find for cmplxdivide.go:

Solaris 11, Sun Fire V890, 1.35 GHz UltraSPARC-IV:

real        1:07.33
user        1:02.18
sys            0.64

Solaris 8, Sun Enterprise T5220, 1.2 GHz UltraSPARC-T2:

real     2:09.40
user     2:07.73
sys         0.63

The latter is too close to the default 5 min timeout.

It's similar for 64bit-out.go:

real        1:13.68
user        1:07.82
sys            0.79

vs.

real     2:17.81
user     2:16.11
sys         1.14

  Rainer
Comment 1 Ian Lance Taylor 2012-04-25 17:39:35 UTC
Interestingly, the time for cmplxdivide.go on SPARC appears to be primarily in the register allocator while compiling.  This is true even though no -O option is used.  Actually running the program after it has been compiled takes less than a second.


Execution times (seconds)
 phase setup             :   0.01 ( 0%) usr   0.00 ( 0%) sys   0.02 ( 0%) wall     109 kB ( 0%) ggc
 phase parsing           :   0.72 ( 1%) usr   0.04 ( 6%) sys   0.77 ( 1%) wall       8 kB ( 0%) ggc
 phase generate          : 118.51 (99%) usr   0.67 (93%) sys 119.17 (99%) wall   54226 kB (100%) ggc
 callgraph construction  :   0.09 ( 0%) usr   0.01 ( 1%) sys   0.09 ( 0%) wall    1806 kB ( 3%) ggc
 callgraph optimization  :   0.07 ( 0%) usr   0.00 ( 0%) sys   0.07 ( 0%) wall       3 kB ( 0%) ggc
 cfg cleanup             :   0.02 ( 0%) usr   0.00 ( 0%) sys   0.02 ( 0%) wall       0 kB ( 0%) ggc
 trivially dead code     :   0.42 ( 0%) usr   0.00 ( 0%) sys   0.43 ( 0%) wall       0 kB ( 0%) ggc
 df scan insns           :   0.39 ( 0%) usr   0.08 (11%) sys   0.47 ( 0%) wall       0 kB ( 0%) ggc
 df live regs            :   0.32 ( 0%) usr   0.00 ( 0%) sys   0.31 ( 0%) wall       0 kB ( 0%) ggc
 df reg dead/unused notes:   0.39 ( 0%) usr   0.02 ( 3%) sys   0.41 ( 0%) wall    1261 kB ( 2%) ggc
 register information    :  52.00 (44%) usr   0.00 ( 0%) sys  52.00 (43%) wall       0 kB ( 0%) ggc
 alias analysis          :   0.20 ( 0%) usr   0.01 ( 1%) sys   0.21 ( 0%) wall    1026 kB ( 2%) ggc
 rebuild jump labels     :   0.18 ( 0%) usr   0.00 ( 0%) sys   0.17 ( 0%) wall       0 kB ( 0%) ggc
 parser (global)         :   0.72 ( 1%) usr   0.04 ( 6%) sys   0.77 ( 1%) wall       8 kB ( 0%) ggc
 inline heuristics       :   0.09 ( 0%) usr   0.00 ( 0%) sys   0.09 ( 0%) wall       4 kB ( 0%) ggc
 tree gimplify           :   0.38 ( 0%) usr   0.02 ( 3%) sys   0.41 ( 0%) wall    5832 kB (11%) ggc
 tree eh                 :   0.03 ( 0%) usr   0.00 ( 0%) sys   0.03 ( 0%) wall       5 kB ( 0%) ggc
 tree CFG construction   :   0.03 ( 0%) usr   0.00 ( 0%) sys   0.02 ( 0%) wall      10 kB ( 0%) ggc
 tree find ref. vars     :   0.06 ( 0%) usr   0.00 ( 0%) sys   0.05 ( 0%) wall     548 kB ( 1%) ggc
 tree PHI insertion      :   0.01 ( 0%) usr   0.00 ( 0%) sys   0.01 ( 0%) wall       1 kB ( 0%) ggc
 tree SSA rewrite        :   0.03 ( 0%) usr   0.00 ( 0%) sys   0.03 ( 0%) wall    1131 kB ( 2%) ggc
 tree SSA other          :   0.15 ( 0%) usr   0.05 ( 7%) sys   0.13 ( 0%) wall       0 kB ( 0%) ggc
 tree operand scan       :   0.07 ( 0%) usr   0.02 ( 3%) sys   0.17 ( 0%) wall     673 kB ( 1%) ggc
 tree STMT verifier      :   0.02 ( 0%) usr   0.00 ( 0%) sys   0.03 ( 0%) wall       0 kB ( 0%) ggc
 out of ssa              :   0.07 ( 0%) usr   0.00 ( 0%) sys   0.06 ( 0%) wall       0 kB ( 0%) ggc
 expand vars             :   0.08 ( 0%) usr   0.03 ( 4%) sys   0.10 ( 0%) wall    1535 kB ( 3%) ggc
 expand                  :   1.24 ( 1%) usr   0.04 ( 6%) sys   1.29 ( 1%) wall   12793 kB (24%) ggc
 post expand cleanups    :   0.10 ( 0%) usr   0.00 ( 0%) sys   0.10 ( 0%) wall       5 kB ( 0%) ggc
 integrated RA           :  50.16 (42%) usr   0.20 (28%) sys  50.35 (42%) wall   12377 kB (23%) ggc
 reload                  :   8.03 ( 7%) usr   0.17 (24%) sys   8.19 ( 7%) wall   13804 kB (25%) ggc
 thread pro- & epilogue  :   0.20 ( 0%) usr   0.00 ( 0%) sys   0.20 ( 0%) wall       4 kB ( 0%) ggc
 final                   :   2.48 ( 2%) usr   0.02 ( 3%) sys   2.50 ( 2%) wall       9 kB ( 0%) ggc
 rest of compilation     :   0.98 ( 1%) usr   0.00 ( 0%) sys   1.02 ( 1%) wall      31 kB ( 0%) ggc
 unaccounted todo        :   0.00 ( 0%) usr   0.00 ( 0%) sys   0.01 ( 0%) wall       0 kB ( 0%) ggc
 TOTAL                 : 119.24             0.72           119.96              54344 kB

real    2m2.183s
user    2m0.976s
sys     0m1.074s
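
The report above can be summarised with a short Go sketch that adds up the user time of the allocator-related rows. Which rows count as register allocation is an assumption here ("register information", "integrated RA" and "reload"); the helper names are hypothetical, and only a subset of the rows is embedded.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// A few rows copied from the time report above.
const report = `
 expand                  :   1.24 ( 1%) usr   0.04 ( 6%) sys   1.29 ( 1%) wall   12793 kB (24%) ggc
 register information    :  52.00 (44%) usr   0.00 ( 0%) sys  52.00 (43%) wall       0 kB ( 0%) ggc
 integrated RA           :  50.16 (42%) usr   0.20 (28%) sys  50.35 (42%) wall   12377 kB (23%) ggc
 reload                  :   8.03 ( 7%) usr   0.17 (24%) sys   8.19 ( 7%) wall   13804 kB (25%) ggc
 final                   :   2.48 ( 2%) usr   0.02 ( 3%) sys   2.50 ( 2%) wall       9 kB ( 0%) ggc
 TOTAL                 : 119.24             0.72           119.96              54344 kB
`

// Assumption: these rows are treated as register allocation work.
var raPasses = map[string]bool{
	"register information": true,
	"integrated RA":        true,
	"reload":               true,
}

// userSeconds splits one report row into the pass name and its user time.
func userSeconds(line string) (string, float64, bool) {
	i := strings.Index(line, ":")
	if i < 0 {
		return "", 0, false
	}
	name := strings.TrimSpace(line[:i])
	fields := strings.Fields(line[i+1:])
	if len(fields) == 0 {
		return "", 0, false
	}
	v, err := strconv.ParseFloat(fields[0], 64)
	if err != nil {
		return "", 0, false
	}
	return name, v, true
}

// allocatorShare sums the user time of the selected passes and
// returns it together with the TOTAL row.
func allocatorShare(report string, passes map[string]bool) (sum, total float64) {
	for _, line := range strings.Split(report, "\n") {
		name, v, ok := userSeconds(line)
		if !ok {
			continue
		}
		switch {
		case name == "TOTAL":
			total = v
		case passes[name]:
			sum += v
		}
	}
	return sum, total
}

func main() {
	sum, total := allocatorShare(report, raPasses)
	fmt.Printf("allocator-related: %.2fs of %.2fs (%.1f%%)\n",
		sum, total, 100*sum/total)
}
```

The "phase ..." rows are left out on purpose: they are totals that overlap with the individual passes, so adding them would count the same time twice.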
Comment 2 Ian Lance Taylor 2012-04-25 22:15:19 UTC
SPARC register allocator slowness filed as PR 53125.
Comment 3 Ian Lance Taylor 2012-04-25 22:48:54 UTC
The 64bit-out.go case appears to be similar.  It is also a generated file, and it also takes a long time to compile.  The register allocator is not quite as dominant, only 43% of compilation time.  In any case I will revisit 64bit-out when and if cmplxdivide is fixed.
Comment 4 Dominik Vogt 2015-02-27 08:48:15 UTC
This also happens intermittently on my s390x development machine (a zEC12) with the current 5.0 development trunk.

(In reply to Ian Lance Taylor from comment #1)
> Interestingly, the time for cmplxdivide.go on SPARC appears to be primarily
> in the register allocator while compiling.

To be specific:

   LRA hard reg assignment : 217.88 (95%) usr   0.29 (74%) sys 218.24 (95%) wall       0 kB ( 0%) ggc


> This is true even though no -O option is used.

Actually, on s390x it does not happen

--

Observation
-----------

Compile time of the test is normally about 4 minutes, but I've seen ~3:50 as well as ~4:45.  When the machine is slow for some reason (probably does not matter why), compile time may become more than 5 minutes and therefore the test times out.

Explanation
-----------

The test defines a long array of structures with three complex numbers in cmplxdivide1.go:

  var tests = []Test{ 
    Test{complex(0, 0), complex(0, 0), complex(-nan, -nan)}, 
    Test{complex(0, 0), complex(0, 1), complex(0, 0)}, 
    ...
  }

The constants like "nan" map to exported symbols of the math package (unlike C, where this would probably be done with macros): "nan" appears in the code as "math.NaN@plt".  With dynamic linkage the actual value is unknown at compile time, so the structure "tests" is initialised in the init function of the main package.  Compiled with -O0, the executable is about 1.5 MB, and more than 90% of that is code in the init function.  For each line in the table, the assembler instructions to initialise it consume about 420 bytes.

As far as I was told, the register allocation code has trouble with huge basic blocks of simple code like this one, where the number of possibilities it has to consider explodes.

Note: With -O3, the code compiles in less than two seconds, probably because the code in the init function is reduced drastically before the expensive register allocation pass.
Comment 5 Eric Gallager 2018-10-01 18:04:21 UTC
(In reply to Ian Lance Taylor from comment #3)
> The 64bit-out.go case appears to be similar.  It is also a generated file,
> and it also takes a long time to compile.  The register allocator is not
> quite as dominant, only 43% of compilation time.  In any case I will revisit
> 64bit-out when and if cmplxdivide is fixed.

Has cmplxdivide been fixed yet?
Comment 6 Eric Gallager 2019-04-01 04:27:17 UTC
(In reply to Eric Gallager from comment #5)
> (In reply to Ian Lance Taylor from comment #3)
> > The 64bit-out.go case appears to be similar.  It is also a generated file,
> > and it also takes a long time to compile.  The register allocator is not
> > quite as dominant, only 43% of compilation time.  In any case I will revisit
> > 64bit-out when and if cmplxdivide is fixed.
> 
> Has cmplxdivide been fixed yet?

No reply; changing to SUSPENDED, since this isn't really a case where closing as INVALID (due to lack of response) is applicable.