This caused me a lot of pain in trying to disable specific optimizations with -O1.

$ gfortran -O1 -o dasum-1.s -S -fverbose-asm -fno-loop-optimize \
    -fno-tree-sra -fno-tree-ter -fno-omit-frame-pointer -fno-tree-dse \
    -fno-tree-dominator-opts -fno-tree-ch -fno-tree-fre \
    -fno-merge-constants -fno-cprop-registers -fno-if-conversion2 \
    -fno-defer-pop -fno-tree-lrs -fno-guess-branch-probability \
    -fno-tree-ccp -fno-tree-copyrename -fno-tree-dce -fno-if-conversion \
    ../dasum.f
$ gfortran -O0 -o dasum-0.s -S -fverbose-asm ../dasum.f
$ diff -u dasum-0.s dasum-1.s | head -30
--- dasum-0.s   2005-02-09 13:38:52.000000000 +0100
+++ dasum-1.s   2005-02-09 13:38:46.000000000 +0100
@@ -3,7 +3,12 @@
 // GNU F95 version 4.0.0 20050130 (experimental) (ia64-unknown-linux-gnu)
 //     compiled by GNU C version 4.0.0 20050130 (experimental).
 // GGC heuristics: --param ggc-min-expand=30 --param ggc-min-heapsize=4096
-// options passed:  -ffixed-form -auxbase-strip -O0 -fverbose-asm
+// options passed:  -ffixed-form -auxbase-strip -O1 -fverbose-asm
+// -fno-loop-optimize -fno-tree-sra -fno-tree-ter -fno-omit-frame-pointer
+// -fno-tree-dse -fno-tree-dominator-opts -fno-tree-ch -fno-tree-fre
+// -fno-merge-constants -fno-cprop-registers -fno-if-conversion2
+// -fno-defer-pop -fno-tree-lrs -fno-guess-branch-probability -fno-tree-ccp
+// -fno-tree-copyrename -fno-tree-dce -fno-if-conversion
 // options enabled:  -falign-loops -fargument-noalias-global
 // -fbranch-count-reg -fcommon -feliminate-unused-debug-types
 // -ffunction-cse -fgcse-lm -fident -fivopts -fkeep-static-consts
@@ -23,531 +28,280 @@
        .prologue 14, 35
        .save ar.pfs, r36
        alloc r36 = ar.pfs, 3, 4, 2, 0  //,,,,
+       adds r16 = -8, r12      //,,
        .vframe r37
        mov r37 = r12   //,
-       adds r12 = -80, r12     //,,
+       adds r12 = -32, r12     //,,
+       mov r17 = ar.lc //,
+       ;;
+       .savepsp ar.lc, 8
+       st8 [r16] = r17, 8      //,
        mov r38 = r1    //,

Since there are no differences in the "options enabled:" list in the
assembly output, the generated assembly should be identical.
However, the compilation results are very different, so -O1 seems to
do other, undocumented things.

$ cat ../dasum.f
      double precision function dasum(n,dx,incx)
c
c     takes the sum of the absolute values.
c     jack dongarra, linpack, 3/11/78.
c     modified 3/93 to return if incx .le. 0.
c     modified 12/3/93, array(1) declarations changed to array(*)
c
      double precision dx(*),dtemp
      integer i,incx,m,mp1,n,nincx
c
      dasum = 0.0d0
      dtemp = 0.0d0
      if( n.le.0 .or. incx.le.0 )return
      if(incx.eq.1)go to 20
c
c        code for increment not equal to 1
c
      nincx = n*incx
      do 10 i = 1,nincx,incx
        dtemp = dtemp + dabs(dx(i))
   10 continue
      dasum = dtemp
      return
c
c        code for increment equal to 1
c
c
c        clean-up loop
c
   20 m = mod(n,6)
      if( m .eq. 0 ) go to 40
      do 30 i = 1,m
        dtemp = dtemp + dabs(dx(i))
   30 continue
      if( n .lt. 6 ) go to 60
   40 mp1 = m + 1
      do 50 i = mp1,n,6
        dtemp = dtemp + dabs(dx(i)) + dabs(dx(i + 1)) + dabs(dx(i + 2))
     *  + dabs(dx(i + 3)) + dabs(dx(i + 4)) + dabs(dx(i + 5))
   50 continue
   60 dasum = dtemp
      return
      end
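The comparison of the "options enabled:" lists can also be done mechanically rather than by reading the diff. The following is only a minimal sketch, assuming the header layout shown in the -fverbose-asm output above (comment lines, an "options enabled:" line, then continuation lines that start with a flag). The function names `enabled_options` and `compare_enabled` are made up for illustration and are not part of GCC.

```python
def enabled_options(asm_text, comment="//"):
    """Collect the flags listed under 'options enabled:' in -fverbose-asm output.

    `comment` is the assembler comment prefix: "//" in the ia64 output,
    "#" in the i686 output.
    """
    marker = "options enabled:"
    flags = set()
    in_block = False
    for line in asm_text.splitlines():
        if not line.startswith(comment):
            if in_block:
                break          # first non-comment line ends the block
            continue
        body = line[len(comment):].strip()
        if body.startswith(marker):
            in_block = True
            body = body[len(marker):]
        elif in_block and not body.startswith("-"):
            break              # a different header line follows the block
        if in_block:
            flags.update(tok for tok in body.split() if tok.startswith("-"))
    return flags


def compare_enabled(asm_a, asm_b, comment="//"):
    """Return (flags only in asm_a, flags only in asm_b)."""
    a = enabled_options(asm_a, comment)
    b = enabled_options(asm_b, comment)
    return a - b, b - a
```

Calling `compare_enabled(open("dasum-0.s").read(), open("dasum-1.s").read())` would be expected to return two empty sets if the lists really are identical, which is the situation described in this report.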
Same thing on i686-pc-linux-gnu with the gcc driver:

$ cat main.c
int main() { return 0; }
$ gcc -S -fverbose-asm -o main-o0.s main.c
$ gcc -S -fno-cprop-registers -fno-defer-pop -fno-guess-branch-probability \
    -fno-if-conversion -fno-if-conversion2 -fno-loop-optimize \
    -fno-merge-constants -fno-tree-ccp -fno-tree-ch -fno-tree-copyrename \
    -fno-tree-dce -fno-tree-dominator-opts -fno-tree-dse -fno-tree-fre \
    -fno-tree-lrs -fno-tree-sra -fno-tree-ter -fverbose-asm -O1 \
    -o main-o1.s main.c
$ diff -u main-o0.s main-o1.s
--- main-o0.s   2005-02-09 22:17:54.000000000 +0100
+++ main-o1.s   2005-02-09 22:18:14.000000000 +0100
@@ -2,7 +2,12 @@
 # GNU C version 4.0.0 20050208 (experimental) (i686-pc-linux-gnu)
 #      compiled by GNU C version 4.0.0 20050203 (experimental).
 # GGC heuristics: --param ggc-min-expand=30 --param ggc-min-heapsize=4096
-# options passed:  -mtune=pentiumpro -auxbase-strip -fverbose-asm
+# options passed:  -mtune=pentiumpro -auxbase-strip -O1
+# -fno-cprop-registers -fno-defer-pop -fno-guess-branch-probability
+# -fno-if-conversion -fno-if-conversion2 -fno-loop-optimize
+# -fno-merge-constants -fno-tree-ccp -fno-tree-ch -fno-tree-copyrename
+# -fno-tree-dce -fno-tree-dominator-opts -fno-tree-dse -fno-tree-fre
+# -fno-tree-lrs -fno-tree-sra -fno-tree-ter -fverbose-asm
 # options enabled:  -falign-loops -fargument-alias -fbranch-count-reg
 # -fcommon -feliminate-unused-debug-types -ffunction-cse -fgcse-lm -fident
 # -fivopts -fkeep-static-consts -fleading-underscore -floop-optimize2
@@ -21,13 +26,8 @@
        movl    %esp, %ebp      #,
        subl    $8, %esp        #,
        andl    $-16, %esp      #,
-       movl    $0, %eax        #, tmp60
-       addl    $15, %eax       #, tmp61
-       addl    $15, %eax       #, tmp62
-       shrl    $4, %eax        #, tmp63
-       sall    $4, %eax        #, tmp64
-       subl    %eax, %esp      # tmp64,
-       movl    $0, %eax        #, D.1118
+       subl    $16, %esp       #,
+       movl    $0, %eax        #, <result>
        leave
        ret
        .size   main, .-main
There are a gazillion places where we just check "if (optimize)" without any specific flag. It would be quite a lot of work to introduce flags for all of them, and I'm not sure it's worth it...
(In reply to comment #2)
> There are a gazillion places where we just check "if (optimize)" without
> any specific flag.  It would be quite a lot of work to introduce flags for all
> of them, and I'm not sure it's worth it...

$ find . -name '*.c' | xargs grep '( *optimize[) =!><|&]' | wc -l
151

Hmm... It would still be better if this could be at least lumped into
an option (maybe -foptimize-misc or whatever) which would still be
visible in -fverbose-asm.
$ find . -name '*.c' | xargs grep '[(&|!] *optimize[) =!><|&]' | wc -l
204
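For anyone who wants the per-file breakdown rather than just the total, the same count can be reproduced in a script. This is a rough sketch, not a replacement for the grep above: it reuses the second grep pattern unchanged, and the function name `count_bare_optimize_checks` is invented for illustration. Like the grep, it counts matching lines, so it also picks up comments and strings that happen to match.

```python
import os
import re

# Same pattern as the grep command: "optimize" preceded by one of ( & | !
# (with optional spaces) and followed by one of ) space = ! > < | &
BARE_OPTIMIZE = re.compile(r"[(&|!] *optimize[) =!><|&]")


def count_bare_optimize_checks(root):
    """Count lines in *.c files under `root` that match BARE_OPTIMIZE."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if not name.endswith(".c"):
                continue
            path = os.path.join(dirpath, name)
            # latin-1 never fails to decode, which is enough for line matching
            with open(path, encoding="latin-1") as src:
                total += sum(1 for line in src if BARE_OPTIMIZE.search(line))
    return total
```

Run from the same directory as the find command, `count_bare_optimize_checks(".")` should give a figure comparable to the grep count, though that has not been checked against the tree used in this report.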
*** Bug 19825 has been marked as a duplicate of this bug. ***
(In reply to comment #4)
> $ find . -name '*.c' | xargs grep '[(&|!] *optimize[) =!><|&]' | wc -l
> 204

Any idea how I should go about further debugging PR 5900? There is a
wrong-code bug for ia64 there, which apparently depends on one of these
204 places.

        Thomas
Thomas, perhaps we could divide and conquer. We could manually eliminate each of the 204 optimizations one at a time until the breakage disappears. Maybe we could develop a script to locate those, bubblestrap, test, remove, and go on to the next until it's discovered. I know it's a pain, but it could be done. If not a script, then we could get several volunteers and assign the places out to check. It would be nice to claim a final victory on this. Any thoughts, anyone?
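If the breakage comes from a single one of those places, halving the candidate set each round would need roughly eight rebuilds instead of up to 204. The sketch below shows only the search logic, under two loud assumptions: exactly one site is responsible, and the sites can be switched on and off independently. `find_culprit` and the `is_broken` callback are hypothetical names; the callback stands in for "rebuild with only these sites active, run the failing test, report whether the wrong code is still generated".

```python
def find_culprit(sites, is_broken):
    """Bisect `sites` down to the one site that makes is_broken() true.

    is_broken(enabled) is a hypothetical callback supplied by the caller:
    it should rebuild with only the sites in `enabled` active, run the
    failing test case, and return True if the failure still reproduces.
    Assumes a single culprit that does not depend on the other sites.
    """
    candidates = list(sites)
    if not is_broken(candidates):
        raise ValueError("failure does not reproduce with all sites enabled")
    while len(candidates) > 1:
        half = candidates[: len(candidates) // 2]
        # Keep whichever half still reproduces the failure.
        candidates = half if is_broken(half) else candidates[len(half):]
    return candidates[0]
```

If the failure needs two or more sites together, this search can converge on the wrong site, in which case the one-at-a-time approach is the safer fallback.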
I don't think this is a valid bug. There are optimizations that won't be controlled by a flag. We already have way too many flags. Please reopen if I misunderstood the report.