The file I am about to attach compiles very slowly with 4.5:

4.3 ([gcc-4_3-branch revision 135036]): 37s
4.4 ([gcc-4_4-branch revision 150482]): 30s
4.5 ([trunk revision 157940]): 6m35s

gfortran -fbounds-check -g -O3 -ffast-math -funroll-loops -ftree-vectorize -march=native -c hog.f90
Created attachment 20287
testcase

Reproduce with:
gfortran -fbounds-check -g -O3 -ffast-math -funroll-loops -ftree-vectorize -march=native -c hog.f90
And a timing report as well (notice the machine is not fully idle). The major consumer is tree canonical iv.

Execution times (seconds)
garbage collection : 7.71 ( 2%) usr 0.07 ( 4%) sys 14.12 ( 2%) wall 0 kB ( 0%) ggc
callgraph construction: 0.18 ( 0%) usr 0.01 ( 1%) sys 0.24 ( 0%) wall 6675 kB ( 1%) ggc
callgraph optimization: 0.61 ( 0%) usr 0.03 ( 2%) sys 0.61 ( 0%) wall 1655 kB ( 0%) ggc
ipa cp : 0.19 ( 0%) usr 0.00 ( 0%) sys 0.19 ( 0%) wall 539 kB ( 0%) ggc
ipa reference : 0.15 ( 0%) usr 0.00 ( 0%) sys 0.15 ( 0%) wall 0 kB ( 0%) ggc
ipa pure const : 0.17 ( 0%) usr 0.00 ( 0%) sys 0.17 ( 0%) wall 0 kB ( 0%) ggc
ipa SRA : 0.00 ( 0%) usr 0.00 ( 0%) sys 0.01 ( 0%) wall 0 kB ( 0%) ggc
cfg cleanup : 0.78 ( 0%) usr 0.01 ( 1%) sys 1.27 ( 0%) wall 3661 kB ( 0%) ggc
CFG verifier : 2.10 ( 1%) usr 0.00 ( 0%) sys 3.40 ( 1%) wall 0 kB ( 0%) ggc
trivially dead code : 0.38 ( 0%) usr 0.00 ( 0%) sys 0.40 ( 0%) wall 0 kB ( 0%) ggc
df multiple defs : 0.59 ( 0%) usr 0.00 ( 0%) sys 0.92 ( 0%) wall 0 kB ( 0%) ggc
df reaching defs : 0.86 ( 0%) usr 0.00 ( 0%) sys 1.83 ( 0%) wall 0 kB ( 0%) ggc
df live regs : 4.92 ( 1%) usr 0.01 ( 1%) sys 8.23 ( 1%) wall 0 kB ( 0%) ggc
df live&initialized regs: 1.48 ( 0%) usr 0.01 ( 1%) sys 3.37 ( 1%) wall 0 kB ( 0%) ggc
df use-def / def-use chains: 0.71 ( 0%) usr 0.00 ( 0%) sys 1.39 ( 0%) wall 0 kB ( 0%) ggc
df reg dead/unused notes: 4.15 ( 1%) usr 0.01 ( 1%) sys 7.47 ( 1%) wall 9314 kB ( 1%) ggc
register information : 1.29 ( 0%) usr 0.01 ( 1%) sys 3.00 ( 0%) wall 0 kB ( 0%) ggc
alias analysis : 0.64 ( 0%) usr 0.00 ( 0%) sys 0.74 ( 0%) wall 21770 kB ( 3%) ggc
alias stmt walking : 1.94 ( 1%) usr 0.06 ( 4%) sys 3.50 ( 1%) wall 0 kB ( 0%) ggc
register scan : 0.18 ( 0%) usr 0.00 ( 0%) sys 0.11 ( 0%) wall 0 kB ( 0%) ggc
rebuild jump labels : 0.23 ( 0%) usr 0.00 ( 0%) sys 0.26 ( 0%) wall 0 kB ( 0%) ggc
parser : 1.27 ( 0%) usr 0.12 ( 7%) sys 1.50 ( 0%) wall 42200 kB ( 5%) ggc
inline heuristics : 0.43 ( 0%) usr 0.02 ( 1%) sys 0.34 ( 0%) wall 0 kB ( 0%) ggc
tree gimplify : 0.69 ( 0%) usr 0.03 ( 2%) sys 0.79 ( 0%) wall 52375 kB ( 6%) ggc
tree eh : 0.03 ( 0%) usr 0.00 ( 0%) sys 0.06 ( 0%) wall 0 kB ( 0%) ggc
tree CFG construction : 0.08 ( 0%) usr 0.00 ( 0%) sys 0.02 ( 0%) wall 9418 kB ( 1%) ggc
tree CFG cleanup : 0.49 ( 0%) usr 0.00 ( 0%) sys 0.80 ( 0%) wall 418 kB ( 0%) ggc
tree VRP : 2.08 ( 1%) usr 0.05 ( 3%) sys 3.67 ( 1%) wall 54923 kB ( 7%) ggc
tree copy propagation : 0.37 ( 0%) usr 0.00 ( 0%) sys 0.59 ( 0%) wall 237 kB ( 0%) ggc
tree find ref. vars : 0.07 ( 0%) usr 0.02 ( 1%) sys 0.09 ( 0%) wall 3774 kB ( 0%) ggc
tree PTA : 0.19 ( 0%) usr 0.00 ( 0%) sys 0.19 ( 0%) wall 425 kB ( 0%) ggc
tree PHI insertion : 0.02 ( 0%) usr 0.00 ( 0%) sys 0.03 ( 0%) wall 315 kB ( 0%) ggc
tree SSA rewrite : 0.44 ( 0%) usr 0.03 ( 2%) sys 0.80 ( 0%) wall 20682 kB ( 3%) ggc
tree SSA other : 0.22 ( 0%) usr 0.02 ( 1%) sys 0.23 ( 0%) wall 434 kB ( 0%) ggc
tree SSA incremental : 0.62 ( 0%) usr 0.04 ( 2%) sys 0.91 ( 0%) wall 438 kB ( 0%) ggc
tree operand scan : 0.27 ( 0%) usr 0.14 ( 8%) sys 0.53 ( 0%) wall 21791 kB ( 3%) ggc
dominator optimization: 0.42 ( 0%) usr 0.00 ( 0%) sys 0.72 ( 0%) wall 4190 kB ( 1%) ggc
tree CCP : 0.56 ( 0%) usr 0.01 ( 1%) sys 0.70 ( 0%) wall 3081 kB ( 0%) ggc
tree PHI const/copy prop: 0.05 ( 0%) usr 0.00 ( 0%) sys 0.01 ( 0%) wall 22 kB ( 0%) ggc
tree split crit edges : 0.03 ( 0%) usr 0.00 ( 0%) sys 0.10 ( 0%) wall 3268 kB ( 0%) ggc
tree reassociation : 0.17 ( 0%) usr 0.00 ( 0%) sys 0.36 ( 0%) wall 161 kB ( 0%) ggc
tree PRE : 6.54 ( 2%) usr 0.02 ( 1%) sys 11.71 ( 2%) wall 25200 kB ( 3%) ggc
tree FRE : 0.76 ( 0%) usr 0.03 ( 2%) sys 1.15 ( 0%) wall 8100 kB ( 1%) ggc
tree code sinking : 0.23 ( 0%) usr 0.04 ( 2%) sys 0.44 ( 0%) wall 12275 kB ( 2%) ggc
tree linearize phis : 0.03 ( 0%) usr 0.00 ( 0%) sys 0.05 ( 0%) wall 0 kB ( 0%) ggc
tree forward propagate: 0.19 ( 0%) usr 0.01 ( 1%) sys 0.25 ( 0%) wall 9572 kB ( 1%) ggc
tree phiprop : 0.00 ( 0%) usr 0.00 ( 0%) sys 0.01 ( 0%) wall 0 kB ( 0%) ggc
tree conservative DCE : 0.19 ( 0%) usr 0.02 ( 1%) sys 0.51 ( 0%) wall 17 kB ( 0%) ggc
tree aggressive DCE : 0.49 ( 0%) usr 0.01 ( 1%) sys 0.74 ( 0%) wall 2998 kB ( 0%) ggc
tree buildin call DCE : 0.01 ( 0%) usr 0.00 ( 0%) sys 0.00 ( 0%) wall 0 kB ( 0%) ggc
tree DSE : 0.06 ( 0%) usr 0.00 ( 0%) sys 0.02 ( 0%) wall 27 kB ( 0%) ggc
tree loop bounds : 0.21 ( 0%) usr 0.00 ( 0%) sys 0.47 ( 0%) wall 6310 kB ( 1%) ggc
tree loop invariant motion: 0.29 ( 0%) usr 0.01 ( 1%) sys 0.45 ( 0%) wall 498 kB ( 0%) ggc
tree canonical iv : 230.79 (62%) usr 0.10 ( 6%) sys 393.03 (61%) wall 146373 kB (18%) ggc
scev constant prop : 0.11 ( 0%) usr 0.00 ( 0%) sys 0.35 ( 0%) wall 5809 kB ( 1%) ggc
tree loop unswitching : 0.01 ( 0%) usr 0.00 ( 0%) sys 0.04 ( 0%) wall 0 kB ( 0%) ggc
complete unrolling : 0.05 ( 0%) usr 0.00 ( 0%) sys 0.15 ( 0%) wall 1123 kB ( 0%) ggc
tree vectorization : 0.00 ( 0%) usr 0.00 ( 0%) sys 0.01 ( 0%) wall 40 kB ( 0%) ggc
tree slp vectorization: 0.48 ( 0%) usr 0.00 ( 0%) sys 0.83 ( 0%) wall 19329 kB ( 2%) ggc
tree iv optimization : 0.59 ( 0%) usr 0.00 ( 0%) sys 0.77 ( 0%) wall 13315 kB ( 2%) ggc
predictive commoning : 1.44 ( 0%) usr 0.00 ( 0%) sys 2.29 ( 0%) wall 40577 kB ( 5%) ggc
tree loop init : 0.17 ( 0%) usr 0.01 ( 1%) sys 0.31 ( 0%) wall 5246 kB ( 1%) ggc
tree loop fini : 0.01 ( 0%) usr 0.00 ( 0%) sys 0.00 ( 0%) wall 0 kB ( 0%) ggc
tree copy headers : 0.02 ( 0%) usr 0.01 ( 1%) sys 0.07 ( 0%) wall 758 kB ( 0%) ggc
tree SSA uncprop : 0.04 ( 0%) usr 0.00 ( 0%) sys 0.09 ( 0%) wall 0 kB ( 0%) ggc
tree rename SSA copies: 0.06 ( 0%) usr 0.00 ( 0%) sys 0.13 ( 0%) wall 0 kB ( 0%) ggc
tree SSA verifier : 9.57 ( 3%) usr 0.01 ( 1%) sys 15.09 ( 2%) wall 0 kB ( 0%) ggc
tree STMT verifier : 18.08 ( 5%) usr 0.10 ( 6%) sys 30.59 ( 5%) wall 0 kB ( 0%) ggc
tree switch initialization conversion: 0.01 ( 0%) usr 0.00 ( 0%) sys 0.00 ( 0%) wall 0 kB ( 0%) ggc
callgraph verifier : 1.64 ( 0%) usr 0.00 ( 0%) sys 1.83 ( 0%) wall 0 kB ( 0%) ggc
dominance frontiers : 0.08 ( 0%) usr 0.00 ( 0%) sys 0.04 ( 0%) wall 0 kB ( 0%) ggc
dominance computation : 0.58 ( 0%) usr 0.00 ( 0%) sys 0.84 ( 0%) wall 0 kB ( 0%) ggc
control dependences : 0.00 ( 0%) usr 0.00 ( 0%) sys 0.01 ( 0%) wall 0 kB ( 0%) ggc
expand : 8.51 ( 2%) usr 0.05 ( 3%) sys 15.28 ( 2%) wall 76554 kB ( 9%) ggc
jump : 0.00 ( 0%) usr 0.00 ( 0%) sys 0.02 ( 0%) wall 0 kB ( 0%) ggc
forward prop : 1.18 ( 0%) usr 0.00 ( 0%) sys 2.75 ( 0%) wall 6749 kB ( 1%) ggc
CSE : 1.51 ( 0%) usr 0.01 ( 1%) sys 2.73 ( 0%) wall 1375 kB ( 0%) ggc
dead code elimination : 0.73 ( 0%) usr 0.00 ( 0%) sys 1.60 ( 0%) wall 0 kB ( 0%) ggc
dead store elim1 : 0.75 ( 0%) usr 0.01 ( 1%) sys 1.18 ( 0%) wall 5337 kB ( 1%) ggc
dead store elim2 : 1.39 ( 0%) usr 0.00 ( 0%) sys 2.67 ( 0%) wall 6079 kB ( 1%) ggc
loop analysis : 0.08 ( 0%) usr 0.01 ( 1%) sys 0.06 ( 0%) wall 61 kB ( 0%) ggc
loop invariant motion : 0.10 ( 0%) usr 0.01 ( 1%) sys 0.16 ( 0%) wall 1 kB ( 0%) ggc
loop unswitching : 0.02 ( 0%) usr 0.00 ( 0%) sys 0.08 ( 0%) wall 0 kB ( 0%) ggc
loop unrolling : 0.02 ( 0%) usr 0.00 ( 0%) sys 0.08 ( 0%) wall 190 kB ( 0%) ggc
CPROP : 1.05 ( 0%) usr 0.00 ( 0%) sys 1.94 ( 0%) wall 7896 kB ( 1%) ggc
PRE : 0.29 ( 0%) usr 0.00 ( 0%) sys 0.52 ( 0%) wall 882 kB ( 0%) ggc
web : 1.08 ( 0%) usr 0.00 ( 0%) sys 1.81 ( 0%) wall 23 kB ( 0%) ggc
CSE 2 : 1.53 ( 0%) usr 0.00 ( 0%) sys 2.51 ( 0%) wall 793 kB ( 0%) ggc
branch prediction : 0.14 ( 0%) usr 0.01 ( 1%) sys 0.25 ( 0%) wall 4053 kB ( 0%) ggc
combiner : 2.39 ( 1%) usr 0.02 ( 1%) sys 4.13 ( 1%) wall 26323 kB ( 3%) ggc
if-conversion : 0.08 ( 0%) usr 0.00 ( 0%) sys 0.01 ( 0%) wall 130 kB ( 0%) ggc
regmove : 0.36 ( 0%) usr 0.00 ( 0%) sys 0.47 ( 0%) wall 4 kB ( 0%) ggc
integrated RA : 8.51 ( 2%) usr 0.01 ( 1%) sys 14.18 ( 2%) wall 8933 kB ( 1%) ggc
reload : 1.93 ( 1%) usr 0.04 ( 2%) sys 3.31 ( 1%) wall 1774 kB ( 0%) ggc
reload CSE regs : 0.80 ( 0%) usr 0.01 ( 1%) sys 1.54 ( 0%) wall 9904 kB ( 1%) ggc
load CSE after reload : 0.15 ( 0%) usr 0.00 ( 0%) sys 0.19 ( 0%) wall 0 kB ( 0%) ggc
thread pro- & epilogue: 0.14 ( 0%) usr 0.00 ( 0%) sys 0.24 ( 0%) wall 572 kB ( 0%) ggc
if-conversion 2 : 0.04 ( 0%) usr 0.00 ( 0%) sys 0.04 ( 0%) wall 59 kB ( 0%) ggc
combine stack adjustments: 0.04 ( 0%) usr 0.00 ( 0%) sys 0.05 ( 0%) wall 0 kB ( 0%) ggc
peephole 2 : 0.44 ( 0%) usr 0.00 ( 0%) sys 0.56 ( 0%) wall 2057 kB ( 0%) ggc
rename registers : 0.44 ( 0%) usr 0.00 ( 0%) sys 0.85 ( 0%) wall 701 kB ( 0%) ggc
hard reg cprop : 0.64 ( 0%) usr 0.00 ( 0%) sys 1.03 ( 0%) wall 35 kB ( 0%) ggc
scheduling 2 : 1.70 ( 0%) usr 0.03 ( 2%) sys 3.15 ( 0%) wall 257 kB ( 0%) ggc
machine dep reorg : 0.18 ( 0%) usr 0.00 ( 0%) sys 0.41 ( 0%) wall 0 kB ( 0%) ggc
reorder blocks : 0.13 ( 0%) usr 0.00 ( 0%) sys 0.26 ( 0%) wall 2145 kB ( 0%) ggc
final : 0.91 ( 0%) usr 0.03 ( 2%) sys 1.67 ( 0%) wall 5904 kB ( 1%) ggc
symout : 0.47 ( 0%) usr 0.07 ( 4%) sys 1.15 ( 0%) wall 50781 kB ( 6%) ggc
variable tracking : 26.64 ( 7%) usr 0.32 (19%) sys 48.05 ( 7%) wall 38563 kB ( 5%) ggc
plugin execution : 0.02 ( 0%) usr 0.00 ( 0%) sys 0.02 ( 0%) wall 0 kB ( 0%) ggc
TOTAL : 374.92 1.71 641.15 817719 kB
Extra diagnostic checks enabled; compiler may run slowly.
Configure with --enable-checking=release to disable checks.

real 10m46.566s
user 6m17.140s
This tells me you are comparing apples and cows: "Extra diagnostic checks enabled; compiler may run slowly."

Could you try again with a compiler configured with --enable-checking=release?
(In reply to comment #3)
> This tells me you are comparing apples and cows: "Extra diagnostic checks
> enabled; compiler may run slowly."
>
> Could you try again with a compiler configured with --enable-checking=release?
>

I'll do that now. For reference, 4.4 gives:

> gfortran -ftime-report -fbounds-check -g -O3 -ffast-math -funroll-loops -ftree-vectorize -march=native hog.f90

Execution times (seconds)
garbage collection : 0.15 ( 1%) usr 0.00 ( 0%) sys 0.14 ( 1%) wall 0 kB ( 0%) ggc
callgraph construction: 0.33 ( 1%) usr 0.03 ( 4%) sys 0.33 ( 1%) wall 9447 kB ( 2%) ggc
callgraph optimization: 0.46 ( 2%) usr 0.01 ( 1%) sys 0.50 ( 2%) wall 239 kB ( 0%) ggc
ipa cp : 0.22 ( 1%) usr 0.00 ( 0%) sys 0.24 ( 1%) wall 0 kB ( 0%) ggc
ipa reference : 0.02 ( 0%) usr 0.00 ( 0%) sys 0.01 ( 0%) wall 0 kB ( 0%) ggc
cfg cleanup : 0.08 ( 0%) usr 0.00 ( 0%) sys 0.12 ( 0%) wall 914 kB ( 0%) ggc
trivially dead code : 0.10 ( 0%) usr 0.00 ( 0%) sys 0.04 ( 0%) wall 0 kB ( 0%) ggc
df reaching defs : 0.06 ( 0%) usr 0.00 ( 0%) sys 0.12 ( 0%) wall 0 kB ( 0%) ggc
df live regs : 0.22 ( 1%) usr 0.00 ( 0%) sys 0.18 ( 1%) wall 0 kB ( 0%) ggc
df live&initialized regs: 0.10 ( 0%) usr 0.00 ( 0%) sys 0.02 ( 0%) wall 0 kB ( 0%) ggc
df use-def / def-use chains: 0.11 ( 0%) usr 0.00 ( 0%) sys 0.05 ( 0%) wall 0 kB ( 0%) ggc
df reg dead/unused notes: 0.17 ( 1%) usr 0.00 ( 0%) sys 0.09 ( 0%) wall 3443 kB ( 1%) ggc
register information : 0.07 ( 0%) usr 0.00 ( 0%) sys 0.06 ( 0%) wall 0 kB ( 0%) ggc
alias analysis : 0.14 ( 1%) usr 0.00 ( 0%) sys 0.12 ( 0%) wall 6273 kB ( 1%) ggc
register scan : 0.04 ( 0%) usr 0.00 ( 0%) sys 0.03 ( 0%) wall 0 kB ( 0%) ggc
rebuild jump labels : 0.08 ( 0%) usr 0.00 ( 0%) sys 0.01 ( 0%) wall 0 kB ( 0%) ggc
parser : 1.27 ( 5%) usr 0.12 (16%) sys 1.31 ( 5%) wall 50936 kB ( 9%) ggc
inline heuristics : 0.13 ( 1%) usr 0.05 ( 6%) sys 0.25 ( 1%) wall 0 kB ( 0%) ggc
tree gimplify : 0.44 ( 2%) usr 0.04 ( 5%) sys 0.54 ( 2%) wall 61550 kB (11%) ggc
tree eh : 0.00 ( 0%) usr 0.00 ( 0%) sys 0.03 ( 0%) wall 0 kB ( 0%) ggc
tree CFG construction : 0.05 ( 0%) usr 0.00 ( 0%) sys 0.02 ( 0%) wall 9734 kB ( 2%) ggc
tree CFG cleanup : 0.28 ( 1%) usr 0.00 ( 0%) sys 0.18 ( 1%) wall 668 kB ( 0%) ggc
tree VRP : 1.21 ( 5%) usr 0.03 ( 4%) sys 1.26 ( 5%) wall 42193 kB ( 8%) ggc
tree copy propagation : 0.21 ( 1%) usr 0.00 ( 0%) sys 0.24 ( 1%) wall 315 kB ( 0%) ggc
tree find ref. vars : 0.07 ( 0%) usr 0.00 ( 0%) sys 0.04 ( 0%) wall 8937 kB ( 2%) ggc
tree PTA : 0.10 ( 0%) usr 0.00 ( 0%) sys 0.10 ( 0%) wall 758 kB ( 0%) ggc
tree alias analysis : 0.12 ( 0%) usr 0.05 ( 6%) sys 0.12 ( 0%) wall 77 kB ( 0%) ggc
tree call clobbering : 0.01 ( 0%) usr 0.00 ( 0%) sys 0.01 ( 0%) wall 18 kB ( 0%) ggc
tree flow sensitive alias: 0.02 ( 0%) usr 0.00 ( 0%) sys 0.04 ( 0%) wall 121 kB ( 0%) ggc
tree flow insensitive alias: 0.02 ( 0%) usr 0.00 ( 0%) sys 0.00 ( 0%) wall 0 kB ( 0%) ggc
tree memory partitioning: 0.00 ( 0%) usr 0.00 ( 0%) sys 0.02 ( 0%) wall 21 kB ( 0%) ggc
tree PHI insertion : 0.02 ( 0%) usr 0.00 ( 0%) sys 0.02 ( 0%) wall 201 kB ( 0%) ggc
tree SSA rewrite : 0.17 ( 1%) usr 0.01 ( 1%) sys 0.13 ( 1%) wall 19668 kB ( 4%) ggc
tree SSA other : 0.11 ( 0%) usr 0.03 ( 4%) sys 0.18 ( 1%) wall 360 kB ( 0%) ggc
tree SSA incremental : 0.24 ( 1%) usr 0.02 ( 3%) sys 0.25 ( 1%) wall 40 kB ( 0%) ggc
tree operand scan : 0.36 ( 1%) usr 0.15 (19%) sys 0.58 ( 2%) wall 27070 kB ( 5%) ggc
dominator optimization: 0.26 ( 1%) usr 0.00 ( 0%) sys 0.14 ( 1%) wall 2270 kB ( 0%) ggc
tree SRA : 0.01 ( 0%) usr 0.00 ( 0%) sys 0.01 ( 0%) wall 0 kB ( 0%) ggc
tree CCP : 0.33 ( 1%) usr 0.01 ( 1%) sys 0.24 ( 1%) wall 4060 kB ( 1%) ggc
tree reassociation : 0.06 ( 0%) usr 0.00 ( 0%) sys 0.10 ( 0%) wall 124 kB ( 0%) ggc
tree PRE : 5.18 (21%) usr 0.05 ( 6%) sys 5.07 (20%) wall 87699 kB (16%) ggc
tree FRE : 0.51 ( 2%) usr 0.00 ( 0%) sys 0.55 ( 2%) wall 7664 kB ( 1%) ggc
tree code sinking : 0.04 ( 0%) usr 0.00 ( 0%) sys 0.03 ( 0%) wall 75 kB ( 0%) ggc
tree linearize phis : 0.00 ( 0%) usr 0.00 ( 0%) sys 0.01 ( 0%) wall 0 kB ( 0%) ggc
tree forward propagate: 0.11 ( 0%) usr 0.02 ( 3%) sys 0.11 ( 0%) wall 11274 kB ( 2%) ggc
tree phiprop : 0.00 ( 0%) usr 0.00 ( 0%) sys 0.01 ( 0%) wall 0 kB ( 0%) ggc
tree conservative DCE : 0.02 ( 0%) usr 0.00 ( 0%) sys 0.08 ( 0%) wall 2 kB ( 0%) ggc
tree aggressive DCE : 0.06 ( 0%) usr 0.00 ( 0%) sys 0.07 ( 0%) wall 5 kB ( 0%) ggc
tree DSE : 0.01 ( 0%) usr 0.00 ( 0%) sys 0.05 ( 0%) wall 65 kB ( 0%) ggc
tree loop bounds : 0.23 ( 1%) usr 0.00 ( 0%) sys 0.11 ( 0%) wall 5701 kB ( 1%) ggc
loop invariant motion : 0.05 ( 0%) usr 0.00 ( 0%) sys 0.15 ( 1%) wall 0 kB ( 0%) ggc
tree canonical iv : 0.00 ( 0%) usr 0.00 ( 0%) sys 0.01 ( 0%) wall 23 kB ( 0%) ggc
scev constant prop : 0.08 ( 0%) usr 0.00 ( 0%) sys 0.12 ( 0%) wall 520 kB ( 0%) ggc
tree loop unswitching : 0.03 ( 0%) usr 0.00 ( 0%) sys 0.06 ( 0%) wall 0 kB ( 0%) ggc
complete unrolling : 0.09 ( 0%) usr 0.00 ( 0%) sys 0.06 ( 0%) wall 1121 kB ( 0%) ggc
tree iv optimization : 2.30 (10%) usr 0.01 ( 1%) sys 2.33 ( 9%) wall 34677 kB ( 6%) ggc
predictive commoning : 0.94 ( 4%) usr 0.02 ( 3%) sys 1.00 ( 4%) wall 42843 kB ( 8%) ggc
tree loop init : 0.02 ( 0%) usr 0.00 ( 0%) sys 0.00 ( 0%) wall 232 kB ( 0%) ggc
tree SSA uncprop : 0.00 ( 0%) usr 0.00 ( 0%) sys 0.01 ( 0%) wall 0 kB ( 0%) ggc
tree SSA to normal : 0.11 ( 0%) usr 0.02 ( 3%) sys 0.22 ( 1%) wall 17790 kB ( 3%) ggc
tree rename SSA copies: 0.06 ( 0%) usr 0.00 ( 0%) sys 0.06 ( 0%) wall 0 kB ( 0%) ggc
dominance frontiers : 0.02 ( 0%) usr 0.00 ( 0%) sys 0.00 ( 0%) wall 0 kB ( 0%) ggc
dominance computation : 0.11 ( 0%) usr 0.00 ( 0%) sys 0.07 ( 0%) wall 0 kB ( 0%) ggc
control dependences : 0.01 ( 0%) usr 0.00 ( 0%) sys 0.00 ( 0%) wall 0 kB ( 0%) ggc
expand : 0.87 ( 4%) usr 0.01 ( 1%) sys 0.89 ( 4%) wall 27910 kB ( 5%) ggc
forward prop : 0.05 ( 0%) usr 0.00 ( 0%) sys 0.07 ( 0%) wall 3922 kB ( 1%) ggc
CSE : 0.53 ( 2%) usr 0.01 ( 1%) sys 0.65 ( 3%) wall 639 kB ( 0%) ggc
dead code elimination : 0.01 ( 0%) usr 0.00 ( 0%) sys 0.04 ( 0%) wall 0 kB ( 0%) ggc
dead store elim1 : 0.11 ( 0%) usr 0.00 ( 0%) sys 0.08 ( 0%) wall 2892 kB ( 1%) ggc
dead store elim2 : 0.07 ( 0%) usr 0.00 ( 0%) sys 0.11 ( 0%) wall 2730 kB ( 1%) ggc
loop analysis : 0.03 ( 0%) usr 0.01 ( 1%) sys 0.04 ( 0%) wall 389 kB ( 0%) ggc
CPROP 1 : 0.02 ( 0%) usr 0.00 ( 0%) sys 0.01 ( 0%) wall 430 kB ( 0%) ggc
PRE : 0.08 ( 0%) usr 0.01 ( 1%) sys 0.11 ( 0%) wall 15 kB ( 0%) ggc
CPROP 2 : 0.07 ( 0%) usr 0.00 ( 0%) sys 0.01 ( 0%) wall 1014 kB ( 0%) ggc
bypass jumps : 0.07 ( 0%) usr 0.00 ( 0%) sys 0.07 ( 0%) wall 1039 kB ( 0%) ggc
web : 0.02 ( 0%) usr 0.00 ( 0%) sys 0.00 ( 0%) wall 14 kB ( 0%) ggc
CSE 2 : 0.20 ( 1%) usr 0.00 ( 0%) sys 0.25 ( 1%) wall 228 kB ( 0%) ggc
branch prediction : 0.14 ( 1%) usr 0.00 ( 0%) sys 0.12 ( 0%) wall 4586 kB ( 1%) ggc
combiner : 0.79 ( 3%) usr 0.01 ( 1%) sys 0.78 ( 3%) wall 15629 kB ( 3%) ggc
if-conversion : 0.00 ( 0%) usr 0.00 ( 0%) sys 0.01 ( 0%) wall 140 kB ( 0%) ggc
regmove : 0.11 ( 0%) usr 0.00 ( 0%) sys 0.08 ( 0%) wall 0 kB ( 0%) ggc
integrated RA : 0.81 ( 3%) usr 0.00 ( 0%) sys 1.02 ( 4%) wall 2360 kB ( 0%) ggc
reload : 0.57 ( 2%) usr 0.00 ( 0%) sys 0.43 ( 2%) wall 2090 kB ( 0%) ggc
reload CSE regs : 0.39 ( 2%) usr 0.01 ( 1%) sys 0.43 ( 2%) wall 4804 kB ( 1%) ggc
load CSE after reload : 0.06 ( 0%) usr 0.00 ( 0%) sys 0.11 ( 0%) wall 31 kB ( 0%) ggc
thread pro- & epilogue: 0.02 ( 0%) usr 0.00 ( 0%) sys 0.01 ( 0%) wall 451 kB ( 0%) ggc
if-conversion 2 : 0.01 ( 0%) usr 0.00 ( 0%) sys 0.00 ( 0%) wall 59 kB ( 0%) ggc
peephole 2 : 0.00 ( 0%) usr 0.00 ( 0%) sys 0.03 ( 0%) wall 81 kB ( 0%) ggc
rename registers : 0.35 ( 1%) usr 0.00 ( 0%) sys 0.36 ( 1%) wall 379 kB ( 0%) ggc
scheduling 2 : 0.51 ( 2%) usr 0.00 ( 0%) sys 0.56 ( 2%) wall 342 kB ( 0%) ggc
machine dep reorg : 0.05 ( 0%) usr 0.00 ( 0%) sys 0.09 ( 0%) wall 2 kB ( 0%) ggc
reorder blocks : 0.05 ( 0%) usr 0.00 ( 0%) sys 0.08 ( 0%) wall 129 kB ( 0%) ggc
final : 0.16 ( 1%) usr 0.03 ( 4%) sys 0.23 ( 1%) wall 745 kB ( 0%) ggc
symout : 0.04 ( 0%) usr 0.01 ( 1%) sys 0.04 ( 0%) wall 3436 kB ( 1%) ggc
variable tracking : 0.09 ( 0%) usr 0.00 ( 0%) sys 0.10 ( 0%) wall 1278 kB ( 0%) ggc
TOTAL : 24.14 0.77 24.91 537367 kB
/data03/vondele/gcc_4_4_branch/build/lib/gcc/x86_64-unknown-linux-gnu/4.4.2/libgfortranbegin.a(fmain.o): In function `main':
/data03/vondele/gcc_4_4_branch/gcc/libgfortran/fmain.c:21: undefined reference to `MAIN__'
collect2: ld returned 1 exit status
(In reply to comment #3)

cows with cows now (i.e. --enable-checking=release), on an idle machine.

Execution times (seconds)
garbage collection : 0.29 ( 0%) usr 0.00 ( 0%) sys 0.31 ( 0%) wall 0 kB ( 0%) ggc
callgraph construction: 0.11 ( 0%) usr 0.01 ( 1%) sys 0.12 ( 0%) wall 5939 kB ( 1%) ggc
callgraph optimization: 0.29 ( 0%) usr 0.00 ( 0%) sys 0.25 ( 0%) wall 184 kB ( 0%) ggc
ipa cp : 0.10 ( 0%) usr 0.00 ( 0%) sys 0.10 ( 0%) wall 539 kB ( 0%) ggc
ipa reference : 0.07 ( 0%) usr 0.00 ( 0%) sys 0.08 ( 0%) wall 0 kB ( 0%) ggc
ipa pure const : 0.09 ( 0%) usr 0.00 ( 0%) sys 0.14 ( 0%) wall 0 kB ( 0%) ggc
cfg cleanup : 0.67 ( 0%) usr 0.00 ( 0%) sys 0.83 ( 0%) wall 3661 kB ( 1%) ggc
trivially dead code : 0.21 ( 0%) usr 0.00 ( 0%) sys 0.17 ( 0%) wall 0 kB ( 0%) ggc
df multiple defs : 0.35 ( 0%) usr 0.00 ( 0%) sys 0.36 ( 0%) wall 0 kB ( 0%) ggc
df reaching defs : 0.69 ( 0%) usr 0.00 ( 0%) sys 0.65 ( 0%) wall 0 kB ( 0%) ggc
df live regs : 3.08 ( 1%) usr 0.00 ( 0%) sys 3.07 ( 1%) wall 0 kB ( 0%) ggc
df live&initialized regs: 1.17 ( 0%) usr 0.00 ( 0%) sys 1.07 ( 0%) wall 0 kB ( 0%) ggc
df use-def / def-use chains: 0.53 ( 0%) usr 0.00 ( 0%) sys 0.35 ( 0%) wall 0 kB ( 0%) ggc
df reg dead/unused notes: 2.50 ( 1%) usr 0.00 ( 0%) sys 2.73 ( 1%) wall 9314 kB ( 1%) ggc
register information : 1.05 ( 0%) usr 0.00 ( 0%) sys 0.84 ( 0%) wall 0 kB ( 0%) ggc
alias analysis : 0.58 ( 0%) usr 0.00 ( 0%) sys 0.61 ( 0%) wall 21770 kB ( 3%) ggc
alias stmt walking : 1.29 ( 0%) usr 0.04 ( 4%) sys 1.36 ( 0%) wall 0 kB ( 0%) ggc
register scan : 0.09 ( 0%) usr 0.00 ( 0%) sys 0.10 ( 0%) wall 0 kB ( 0%) ggc
rebuild jump labels : 0.21 ( 0%) usr 0.00 ( 0%) sys 0.25 ( 0%) wall 0 kB ( 0%) ggc
parser : 1.15 ( 0%) usr 0.12 (11%) sys 1.26 ( 0%) wall 42200 kB ( 6%) ggc
inline heuristics : 0.24 ( 0%) usr 0.01 ( 1%) sys 0.24 ( 0%) wall 0 kB ( 0%) ggc
tree gimplify : 0.43 ( 0%) usr 0.05 ( 4%) sys 0.47 ( 0%) wall 52375 kB ( 8%) ggc
tree eh : 0.02 ( 0%) usr 0.00 ( 0%) sys 0.02 ( 0%) wall 0 kB ( 0%) ggc
tree CFG construction : 0.06 ( 0%) usr 0.00 ( 0%) sys 0.07 ( 0%) wall 9418 kB ( 1%) ggc
tree CFG cleanup : 0.27 ( 0%) usr 0.00 ( 0%) sys 0.46 ( 0%) wall 418 kB ( 0%) ggc
tree VRP : 1.57 ( 1%) usr 0.06 ( 5%) sys 1.60 ( 1%) wall 54731 kB ( 8%) ggc
tree copy propagation : 0.20 ( 0%) usr 0.00 ( 0%) sys 0.29 ( 0%) wall 237 kB ( 0%) ggc
tree find ref. vars : 0.03 ( 0%) usr 0.01 ( 1%) sys 0.10 ( 0%) wall 3774 kB ( 1%) ggc
tree PTA : 0.16 ( 0%) usr 0.00 ( 0%) sys 0.08 ( 0%) wall 423 kB ( 0%) ggc
tree PHI insertion : 0.01 ( 0%) usr 0.00 ( 0%) sys 0.02 ( 0%) wall 315 kB ( 0%) ggc
tree SSA rewrite : 0.24 ( 0%) usr 0.02 ( 2%) sys 0.19 ( 0%) wall 20682 kB ( 3%) ggc
tree SSA other : 0.10 ( 0%) usr 0.04 ( 4%) sys 0.19 ( 0%) wall 434 kB ( 0%) ggc
tree SSA incremental : 0.56 ( 0%) usr 0.02 ( 2%) sys 0.66 ( 0%) wall 438 kB ( 0%) ggc
tree operand scan : 0.21 ( 0%) usr 0.20 (18%) sys 0.42 ( 0%) wall 21791 kB ( 3%) ggc
dominator optimization: 0.35 ( 0%) usr 0.01 ( 1%) sys 0.36 ( 0%) wall 4189 kB ( 1%) ggc
tree SRA : 0.01 ( 0%) usr 0.00 ( 0%) sys 0.00 ( 0%) wall 0 kB ( 0%) ggc
tree CCP : 0.49 ( 0%) usr 0.00 ( 0%) sys 0.34 ( 0%) wall 3081 kB ( 0%) ggc
tree PHI const/copy prop: 0.02 ( 0%) usr 0.00 ( 0%) sys 0.02 ( 0%) wall 22 kB ( 0%) ggc
tree split crit edges : 0.04 ( 0%) usr 0.00 ( 0%) sys 0.00 ( 0%) wall 3265 kB ( 0%) ggc
tree reassociation : 0.12 ( 0%) usr 0.01 ( 1%) sys 0.11 ( 0%) wall 161 kB ( 0%) ggc
tree PRE : 4.88 ( 2%) usr 0.00 ( 0%) sys 4.89 ( 2%) wall 25200 kB ( 4%) ggc
tree FRE : 0.65 ( 0%) usr 0.02 ( 2%) sys 0.67 ( 0%) wall 8099 kB ( 1%) ggc
tree code sinking : 0.16 ( 0%) usr 0.05 ( 4%) sys 0.17 ( 0%) wall 12275 kB ( 2%) ggc
tree linearize phis : 0.00 ( 0%) usr 0.00 ( 0%) sys 0.03 ( 0%) wall 0 kB ( 0%) ggc
tree forward propagate: 0.14 ( 0%) usr 0.00 ( 0%) sys 0.17 ( 0%) wall 9572 kB ( 1%) ggc
tree phiprop : 0.01 ( 0%) usr 0.00 ( 0%) sys 0.00 ( 0%) wall 0 kB ( 0%) ggc
tree conservative DCE : 0.21 ( 0%) usr 0.03 ( 3%) sys 0.15 ( 0%) wall 17 kB ( 0%) ggc
tree aggressive DCE : 0.16 ( 0%) usr 0.00 ( 0%) sys 0.16 ( 0%) wall 2998 kB ( 0%) ggc
tree DSE : 0.02 ( 0%) usr 0.00 ( 0%) sys 0.06 ( 0%) wall 26 kB ( 0%) ggc
PHI merge : 0.01 ( 0%) usr 0.00 ( 0%) sys 0.00 ( 0%) wall 5 kB ( 0%) ggc
tree loop bounds : 0.11 ( 0%) usr 0.00 ( 0%) sys 0.19 ( 0%) wall 6263 kB ( 1%) ggc
tree loop invariant motion: 0.19 ( 0%) usr 0.01 ( 1%) sys 0.19 ( 0%) wall 497 kB ( 0%) ggc
tree canonical iv : 223.30 (75%) usr 0.01 ( 1%) sys 223.28 (75%) wall 21873 kB ( 3%) ggc
scev constant prop : 0.05 ( 0%) usr 0.00 ( 0%) sys 0.04 ( 0%) wall 5809 kB ( 1%) ggc
tree loop unswitching : 0.02 ( 0%) usr 0.00 ( 0%) sys 0.00 ( 0%) wall 0 kB ( 0%) ggc
complete unrolling : 0.06 ( 0%) usr 0.00 ( 0%) sys 0.06 ( 0%) wall 1123 kB ( 0%) ggc
tree slp vectorization: 0.38 ( 0%) usr 0.00 ( 0%) sys 0.31 ( 0%) wall 19328 kB ( 3%) ggc
tree iv optimization : 0.39 ( 0%) usr 0.00 ( 0%) sys 0.47 ( 0%) wall 13309 kB ( 2%) ggc
predictive commoning : 1.13 ( 0%) usr 0.01 ( 1%) sys 1.17 ( 0%) wall 40528 kB ( 6%) ggc
tree loop init : 0.13 ( 0%) usr 0.01 ( 1%) sys 0.07 ( 0%) wall 5208 kB ( 1%) ggc
tree copy headers : 0.01 ( 0%) usr 0.00 ( 0%) sys 0.03 ( 0%) wall 758 kB ( 0%) ggc
tree SSA uncprop : 0.02 ( 0%) usr 0.00 ( 0%) sys 0.00 ( 0%) wall 0 kB ( 0%) ggc
tree rename SSA copies: 0.08 ( 0%) usr 0.00 ( 0%) sys 0.07 ( 0%) wall 0 kB ( 0%) ggc
dominance frontiers : 0.02 ( 0%) usr 0.00 ( 0%) sys 0.07 ( 0%) wall 0 kB ( 0%) ggc
dominance computation : 0.26 ( 0%) usr 0.00 ( 0%) sys 0.21 ( 0%) wall 0 kB ( 0%) ggc
control dependences : 0.00 ( 0%) usr 0.00 ( 0%) sys 0.01 ( 0%) wall 0 kB ( 0%) ggc
expand : 2.97 ( 1%) usr 0.04 ( 4%) sys 3.11 ( 1%) wall 76883 kB (11%) ggc
lower subreg : 0.00 ( 0%) usr 0.00 ( 0%) sys 0.04 ( 0%) wall 0 kB ( 0%) ggc
forward prop : 0.89 ( 0%) usr 0.00 ( 0%) sys 0.85 ( 0%) wall 6749 kB ( 1%) ggc
CSE : 1.46 ( 0%) usr 0.01 ( 1%) sys 1.51 ( 1%) wall 1369 kB ( 0%) ggc
dead code elimination : 0.45 ( 0%) usr 0.00 ( 0%) sys 0.43 ( 0%) wall 0 kB ( 0%) ggc
dead store elim1 : 0.60 ( 0%) usr 0.00 ( 0%) sys 0.44 ( 0%) wall 5337 kB ( 1%) ggc
dead store elim2 : 0.48 ( 0%) usr 0.00 ( 0%) sys 0.42 ( 0%) wall 6072 kB ( 1%) ggc
loop invariant motion : 0.06 ( 0%) usr 0.00 ( 0%) sys 0.09 ( 0%) wall 1 kB ( 0%) ggc
loop unswitching : 0.03 ( 0%) usr 0.00 ( 0%) sys 0.05 ( 0%) wall 0 kB ( 0%) ggc
loop unrolling : 0.02 ( 0%) usr 0.00 ( 0%) sys 0.01 ( 0%) wall 190 kB ( 0%) ggc
CPROP : 0.84 ( 0%) usr 0.02 ( 2%) sys 0.81 ( 0%) wall 7746 kB ( 1%) ggc
PRE : 0.22 ( 0%) usr 0.00 ( 0%) sys 0.35 ( 0%) wall 777 kB ( 0%) ggc
web : 0.16 ( 0%) usr 0.00 ( 0%) sys 0.11 ( 0%) wall 16 kB ( 0%) ggc
CSE 2 : 1.42 ( 0%) usr 0.00 ( 0%) sys 1.54 ( 1%) wall 793 kB ( 0%) ggc
branch prediction : 0.16 ( 0%) usr 0.00 ( 0%) sys 0.10 ( 0%) wall 4053 kB ( 1%) ggc
combiner : 2.05 ( 1%) usr 0.02 ( 2%) sys 2.10 ( 1%) wall 26058 kB ( 4%) ggc
if-conversion : 0.04 ( 0%) usr 0.00 ( 0%) sys 0.04 ( 0%) wall 130 kB ( 0%) ggc
regmove : 0.26 ( 0%) usr 0.00 ( 0%) sys 0.21 ( 0%) wall 4 kB ( 0%) ggc
integrated RA : 4.46 ( 1%) usr 0.00 ( 0%) sys 4.24 ( 1%) wall 8905 kB ( 1%) ggc
reload : 1.47 ( 0%) usr 0.00 ( 0%) sys 1.55 ( 1%) wall 1737 kB ( 0%) ggc
reload CSE regs : 0.73 ( 0%) usr 0.01 ( 1%) sys 0.76 ( 0%) wall 9904 kB ( 1%) ggc
load CSE after reload : 0.11 ( 0%) usr 0.00 ( 0%) sys 0.12 ( 0%) wall 0 kB ( 0%) ggc
thread pro- & epilogue: 0.09 ( 0%) usr 0.00 ( 0%) sys 0.15 ( 0%) wall 572 kB ( 0%) ggc
if-conversion 2 : 0.01 ( 0%) usr 0.00 ( 0%) sys 0.03 ( 0%) wall 59 kB ( 0%) ggc
combine stack adjustments: 0.07 ( 0%) usr 0.00 ( 0%) sys 0.03 ( 0%) wall 0 kB ( 0%) ggc
peephole 2 : 0.22 ( 0%) usr 0.00 ( 0%) sys 0.30 ( 0%) wall 2057 kB ( 0%) ggc
rename registers : 0.48 ( 0%) usr 0.00 ( 0%) sys 0.50 ( 0%) wall 701 kB ( 0%) ggc
hard reg cprop : 0.29 ( 0%) usr 0.00 ( 0%) sys 0.35 ( 0%) wall 35 kB ( 0%) ggc
scheduling 2 : 1.42 ( 0%) usr 0.00 ( 0%) sys 1.42 ( 0%) wall 222 kB ( 0%) ggc
machine dep reorg : 0.24 ( 0%) usr 0.00 ( 0%) sys 0.23 ( 0%) wall 0 kB ( 0%) ggc
reorder blocks : 0.12 ( 0%) usr 0.00 ( 0%) sys 0.09 ( 0%) wall 2144 kB ( 0%) ggc
final : 0.56 ( 0%) usr 0.06 ( 5%) sys 0.76 ( 0%) wall 5904 kB ( 1%) ggc
symout : 0.39 ( 0%) usr 0.06 ( 5%) sys 0.44 ( 0%) wall 50781 kB ( 7%) ggc
variable tracking : 23.48 ( 8%) usr 0.17 (15%) sys 23.48 ( 8%) wall 38556 kB ( 6%) ggc
plugin execution : 0.02 ( 0%) usr 0.01 ( 1%) sys 0.03 ( 0%) wall 0 kB ( 0%) ggc
TOTAL : 298.36 1.14 299.51 690347 kB
COLLECT_GCC_OPTIONS='-v' '-ftime-report' '-fbounds-check' '-g' '-O3' '-ffast-math' '-funroll-loops' '-ftree-vectorize' '-c'
 as -V -Qy --64 -o hog.o /tmp/cclB9I15.s
GNU assembler version 2.18.50 (x86_64-suse-linux) using BFD version (GNU Binutils; openSUSE 11.0) 2.18.50.20080409-11.1
COMPILER_PATH=/data03/vondele/gcc_trunk/build/libexec/gcc/x86_64-unknown-linux-gnu/4.5.0/:/data03/vondele/gcc_trunk/build/libexec/gcc/x86_64-unknown-linux-gnu/4.5.0/:/data03/vondele/gcc_trunk/build/libexec/gcc/x86_64-unknown-linux-gnu/:/data03/vondele/gcc_trunk/build/lib/gcc/x86_64-unknown-linux-gnu/4.5.0/:/data03/vondele/gcc_trunk/build/lib/gcc/x86_64-unknown-linux-gnu/
LIBRARY_PATH=/data03/vondele/gcc_trunk/build/lib/gcc/x86_64-unknown-linux-gnu/4.5.0/:/data03/vondele/gcc_trunk/build/lib/gcc/x86_64-unknown-linux-gnu/4.5.0/../../../../lib64/:/lib/../lib64/:/usr/lib/../lib64/:/data03/vondele/gcc_trunk/build/lib/gcc/x86_64-unknown-linux-gnu/4.5.0/../../../:/lib/:/usr/lib/
COLLECT_GCC_OPTIONS='-v' '-ftime-report' '-fbounds-check' '-g' '-O3' '-ffast-math' '-funroll-loops' '-ftree-vectorize' '-c'
The issue is certainly the many manually unrolled loops, and possibly the new autoinc code. What's your native arch? I can't reproduce this on a core i?86.
(In reply to comment #6)
> What's your native arch? I can't reproduce this on a core i?86.

-v output:

/data03/vondele/gcc_trunk/build/libexec/gcc/x86_64-unknown-linux-gnu/4.5.0/f951 hog.f90 -march=k8-sse3 -mcx16 -msahf --param l1-cache-size=64 --param l1-cache-line-size=64 --param l2-cache-size=1024 -mtune=k8 -quiet -dumpbase hog.f90 -auxbase hog -g -O3 -version -fbounds-check -ffast-math -funroll-loops -ftree-vectorize -fintrinsic-modules-path /data03/vondele/gcc_trunk/build/lib/gcc/x86_64-unknown-linux-gnu/4.5.0/finclude -o /tmp/ccA2YvFn.s
Confirmed on x86_64-linux with -O2 -fbounds-check. find_loop_niter_by_eval takes a lot of time in each of the ints2bits_* routines because the loops have a lot of exits (due to -fbounds-check).
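To make the cost argument concrete, here is a toy model in C. It is not GCC code: it only assumes that brute-force evaluation simulates each exit condition one iteration at a time, up to a fixed cap. The cap value of 1000 and every name below (`find_niter_by_eval`, `exit_test_fn`, the two sample exits) are assumptions made for illustration. An exit that is never taken, such as a bounds check that always passes, costs the full cap, so the work grows with the number of exits.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model only; none of this is GCC source code. */

/* Assumed cap on simulated iterations per exit (illustrative value). */
#define MAX_TRACKED_ITERATIONS 1000

/* Hypothetical exit test: returns nonzero if the exit is taken in
   iteration i. */
typedef int (*exit_test_fn)(long i);

/* Simulate one exit condition iteration by iteration.  Returns the first
   iteration in which the exit is taken, or -1 if the cap is reached.
   *evals counts how many condition evaluations were needed. */
static long niter_by_eval(exit_test_fn exit_taken, unsigned long *evals)
{
    for (long i = 0; i < MAX_TRACKED_ITERATIONS; i++) {
        (*evals)++;
        if (exit_taken(i))
            return i;
    }
    return -1;
}

/* Try every exit and keep the smallest known iteration count. */
long find_niter_by_eval(const exit_test_fn *exits, size_t n_exits,
                        unsigned long *evals)
{
    long best = -1;

    assert(exits != NULL && evals != NULL);
    for (size_t e = 0; e < n_exits; e++) {
        long n = niter_by_eval(exits[e], evals);
        if (n >= 0 && (best < 0 || n < best))
            best = n;
    }
    return best;
}

/* A bounds check that never fails: this exit is never taken. */
int bounds_check_never_fails(long i)
{
    (void)i;
    return 0;
}

/* The regular loop exit, taken in iteration 10. */
int counted_exit_after_10(long i)
{
    return i >= 10;
}
```

With two never-failing bounds checks and one counted exit, the model spends 2000 of its 2011 evaluations on the bounds checks. A manually unrolled loop body compiled with -fbounds-check has many such exits, which would be consistent with the large call counts reported from valgrind.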
Created attachment 20290
smaller testcase (needs 3s, 80% in tree canonical iv)
Created attachment 20291
reduced testcase
Compared to 4.4, 4.5 no longer eliminates most of the bounds checks.
(In reply to comment #9)
> Created an attachment (id=20290)
> smaller testcase (needs 3s, 80% in tree canonical iv)

From valgrind, I see some 13000000 calls to get_val_for / fold_binary_loc for the small testcase.
Testcase for that:

MODULE hfx_compression_core_methods
  IMPLICIT NONE
  INTEGER, PARAMETER :: int_8=8
CONTAINS
  SUBROUTINE ints2bits_3(Ndata,packed_data,full_data)
    INTEGER, INTENT(IN) :: Ndata
    INTEGER(KIND=int_8), INTENT(OUT) :: packed_data(*)
    INTEGER(KIND=int_8), INTENT(IN) :: full_data(*)
    INTEGER, PARAMETER :: Nbits = 3
    INTEGER :: idata, ipack, kdata, Ndata_rep
    INTEGER(KIND=int_8) :: data_tmp, pack_tmp
    idata=0
    ipack=0
    Ndata_rep=(Ndata/2)*2
    DO kdata=1,Ndata_rep,2
      pack_tmp=0
      idata=idata+1
      data_tmp = full_data(idata)
      data_tmp = ISHFT(data_tmp,61)
      pack_tmp = IOR(pack_tmp,data_tmp)
      pack_tmp = ISHFT(pack_tmp,-3)
      idata=idata+1
      data_tmp = full_data(idata)
      data_tmp = ISHFT(data_tmp,61)
      pack_tmp = IOR(pack_tmp,data_tmp)
      pack_tmp = ISHFT(pack_tmp,0)
      pack_tmp = ISHFT(pack_tmp,0)
      ipack = ipack + 1
      packed_data(ipack) = pack_tmp
    ENDDO
  END SUBROUTINE ints2bits_3
END MODULE hfx_compression_core_methods

likely caused by

2010-02-16  Richard Guenther  <rguenther@suse.de>

    PR tree-optimization/41043
    * tree-vrp.c (vrp_var_may_overflow): Only ask SCEV for real loops.
    (vrp_visit_assignment_or_call): Do not ask SCEV for regular
    statements ...
    (vrp_visit_phi_node): ... but only for loop PHI nodes.
Interestingly it works on i?86 ...
C testcase for the missed VRP, fails with long on x86_64 only, with long long also on i?86:

extern void link_error (void) __attribute__((noreturn));
int n;
float *x;
int main()
{
  if (n > 0)
    {
      int i = 0;
      do
        {
          long index;
          i = i + 1;
          index = i;
          if (index <= 0)
            link_error ();
          x[index] = 0;
          i = i + 1;
          index = i;
          if (index <= 0)
            link_error ();
          x[index] = 0;
        }
      while (i < n);
    }
}
It's the strict-overflow stuff that cripples VRP again here. I have a kludge.
Created attachment 20292
minimal patch

I'm testing this minimal patch.
GCC 4.5.0 is being released. Deferring to 4.5.1.
Subject: Bug 43627

Author: rguenth
Date: Tue Apr 6 12:32:25 2010
New Revision: 157992

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=157992
Log:
2010-04-06  Richard Guenther  <rguenther@suse.de>

    PR tree-optimization/43627
    * tree-vrp.c (extract_range_from_unary_expr): Widenings
    of [1, +INF(OVF)] go to [1, +INF(OVF)] of the wider type,
    not varying.

    * gcc.dg/tree-ssa/vrp49.c: New testcase.

Added:
    trunk/gcc/testsuite/gcc.dg/tree-ssa/vrp49.c
Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/testsuite/ChangeLog
    trunk/gcc/tree-vrp.c
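The ChangeLog entry above can be illustrated with a small toy model in C. This is a sketch under stated assumptions, not the code in tree-vrp.c: `struct toy_range`, `widen_int_range` and `proves_positive` are hypothetical names, and the "old" behaviour is only inferred from the ChangeLog wording ("not varying"). The point is that keeping the lower bound 1 across the int to long widening is what lets a check like `index <= 0` be proven false.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* Toy model only; these are not GCC's value-range data structures. */
enum range_kind { RANGE_VARYING, RANGE_BOUNDED };

struct toy_range {
    enum range_kind kind;
    long long min;
    long long max;
    bool max_is_overflow_inf;   /* models an upper bound of +INF(OVF) */
};

/* Widen a range computed for a narrower type to long long.
   keep_lower_bound == false models the assumed old behaviour: an
   overflow-infinity upper bound makes the whole result varying.
   keep_lower_bound == true models the behaviour the ChangeLog describes:
   [1, +INF(OVF)] stays [1, +INF(OVF)] in the wider type. */
struct toy_range widen_int_range(struct toy_range r, bool keep_lower_bound)
{
    struct toy_range out = r;

    if (r.kind != RANGE_BOUNDED)
        return out;
    assert(r.min <= r.max);

    if (r.max_is_overflow_inf) {
        if (!keep_lower_bound) {
            out.kind = RANGE_VARYING;
            out.min = LLONG_MIN;
            out.max = LLONG_MAX;
            out.max_is_overflow_inf = false;
            return out;
        }
        out.max = LLONG_MAX;    /* +INF(OVF) of the wider type */
    }
    return out;
}

/* Can a check of the form "index <= 0" be proven false from the range? */
bool proves_positive(struct toy_range r)
{
    return r.kind == RANGE_BOUNDED && r.min > 0;
}
```

In this model, a range of [1, +INF(OVF)] for `i` only proves `index > 0` after widening when the lower bound is kept, which matches the `index <= 0` checks in the C testcase earlier in this report.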
Fixed on trunk so far. Queued for 4.5.1.
Subject: Bug 43627

Author: rguenth
Date: Thu Apr 15 13:46:42 2010
New Revision: 158377

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=158377
Log:
2010-04-15  Richard Guenther  <rguenther@suse.de>

    PR tree-optimization/43627
    * tree-vrp.c (extract_range_from_unary_expr): Widenings
    of [1, +INF(OVF)] go to [1, +INF(OVF)] of the wider type,
    not varying.

    * gcc.dg/tree-ssa/vrp49.c: New testcase.

Added:
    branches/gcc-4_5-branch/gcc/testsuite/gcc.dg/tree-ssa/vrp49.c
Modified:
    branches/gcc-4_5-branch/gcc/ChangeLog
    branches/gcc-4_5-branch/gcc/testsuite/ChangeLog
    branches/gcc-4_5-branch/gcc/tree-vrp.c
Fixed.