15135 – program hangs in call to sqrt when compiled with -O

Status:	RESOLVED FIXED

Alias:	None

Product:	gcc
Classification:	Unclassified
Component:	rtl-optimization
Version:	3.4.0

Importance:	P2 normal
Target Milestone:	4.0.0
Assignee:	Not yet assigned to anyone

URL:
Keywords:	wrong-code

Depends on:
Blocks:

Reported:	2004-04-25 16:52 UTC by dfg
Modified:	2006-05-29 10:37 UTC
CC List:	1 user

See Also:
Host:	i686-pc-linux-gnu
Target:	i686-pc-linux-gnu
Build:	i686-pc-linux-gnu
Known to work:
Known to fail:
Last reconfirmed:	2004-04-25 17:34:12

Description dfg 2004-04-25 16:52:45 UTC

The program below compiles and runs when built without any compiler options,
but hangs in the call to sqrt when compiled with -O. Every simple change I
tried makes it run to completion.

The system I'm working on is a dual-processor IBM blade with 2 GB of memory,
running Red Hat Enterprise Linux. Please contact me if you need further specs.
I built gcc 3.4.0 myself, using gcc 3.2.3 20030502, with no special
configuration options:

> gcc-3.4.0 -v       
Reading specs from /usr/local/lib/gcc/i686-pc-linux-gnu/3.4.0/specs
Configured with: ./configure --program-suffix=-3.4.0
Thread model: posix
gcc version 3.4.0

Again, "g++-3.4.0 -O test.cc; ./a.out" hangs indefinitely, while "g++-3.4.0
test.cc; ./a.out" exits.

I replaced the inclusion of math.h with an explicit definition of NAN and a
direct call to __builtin_sqrt so that the temporary files would be smaller.

Daniel

test.cc:
# define NAN \
  (__extension__                                                            \
   ((union { unsigned __l __attribute__((__mode__(__SI__))); float __d; })  \
    { __l: 0x7fc00000UL }).__d)

inline double fmax(double a, double b) {
  if (a < b) return b;
  else return a;
}

double f() {
  return __builtin_sqrt(NAN/fmax(1., 1./.0));
}

int main() {
  double u = f();
  return 0;
}
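
For readers tracing the testcase, the sketch below evaluates the argument passed to __builtin_sqrt step by step. It assumes IEEE 754 arithmetic, under which 1./.0 is +infinity. The helper names (float_from_bits, fmax_from_report, sqrt_argument) are hypothetical and not part of the report; memcpy stands in for the union trick used by the NAN macro.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <limits>

// Reinterpret a 32-bit pattern as a float (hypothetical helper; the report's
// NAN macro does the same thing through a union).
float float_from_bits(std::uint32_t bits) {
    static_assert(sizeof(float) == sizeof(std::uint32_t), "float must be 32 bits");
    float value;
    std::memcpy(&value, &bits, sizeof value);
    return value;
}

// Same body as the fmax in test.cc, renamed to avoid clashing with <cmath>.
inline double fmax_from_report(double a, double b) {
    if (a < b) return b;
    else return a;
}

// Computes NAN / fmax(1., 1./.0), the expression handed to __builtin_sqrt.
double sqrt_argument() {
    // Assumption: IEEE 754, so 1./.0 evaluates to +infinity.
    const double inf = std::numeric_limits<double>::infinity();
    const float nan_value = float_from_bits(0x7fc00000u);  // quiet NaN pattern
    assert(nan_value != nan_value);  // a NaN never compares equal to itself
    const double divisor = fmax_from_report(1.0, inf);     // 1 < inf, so inf
    return nan_value / divisor;                            // NaN / inf is NaN
}
```

Under these assumptions sqrt_argument() returns a NaN, so the testcase always calls sqrt on a NaN.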


test.ii:
# 1 "test.cc"
# 1 "<built-in>"
# 1 "<command line>"
# 1 "test.cc"





inline double fmax(double a, double b) {
  if (a < b) return b;
  else return a;
}

double f() {
  return __builtin_sqrt((__extension__ ((union { unsigned __l
__attribute__((__mode__(__SI__))); float __d; }) { __l: 0x7fc00000UL
}).__d)/fmax(1., 1./.0));
}

int main() {
  double u = f();
  return 0;
}


test.s:
	.file	"test.cc"
	.section	.rodata.cst8,"aM",@progbits,8
	.align 8
.LC3:
	.long	0
	.long	0
	.text
	.align 2
.globl _Z1fv
	.type	_Z1fv, @function
_Z1fv:
.LFB3:
	pushl	%ebp
.LCFI0:
	movl	%esp, %ebp
.LCFI1:
	subl	$4, %esp
.LCFI2:
	movl	$2143289344, %eax
	movl	%eax, -4(%ebp)
	flds	-4(%ebp)
	fld1
	fld	%st(0)
	fdivl	.LC3
	fucom	%st(1)
	fnstsw	%ax
	sahf
	ja	.L16
	fstp	%st(0)
	jmp	.L15
.L16:
	fstp	%st(1)
	jmp	.L15
.L18:
	fstp	%st(1)
.L15:
	fdivr	%st(1), %st
	fsqrt
	fucom	%st(0)
	fnstsw	%ax
	sahf
	jp	.L17
	je	.L6
	fstp	%st(0)
	jmp	.L14
.L17:
	fstp	%st(0)
.L14:
	fld1
	fld	%st(0)
	fdivl	.LC3
	fucom	%st(1)
	fnstsw	%ax
	sahf
	ja	.L18
	fstp	%st(0)
	jmp	.L15
.L6:
	fstp	%st(1)
	leave
	ret
.LFE3:
	.size	_Z1fv, .-_Z1fv
	.align 2
.globl main
	.type	main, @function
main:
.LFB4:
	pushl	%ebp
.LCFI3:
	movl	%esp, %ebp
.LCFI4:
	subl	$8, %esp
.LCFI5:
	andl	$-16, %esp
	subl	$16, %esp
	call	_Z1fv
	fstp	%st(0)
	movl	$0, %eax
	leave
	ret
.LFE4:
	.size	main, .-main
	.section	.note.GNU-stack,"",@progbits
	.ident	"GCC: (GNU) 3.4.0"
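
One reading of the listing for _Z1fv: after fsqrt, fucom compares the result with itself. If the comparison is unordered (jp .L17, the NaN case), the code pops the result and jumps back to .L14, which recomputes the divisor and falls into .L15 again. No call to the library sqrt appears anywhere in the function. The C++ sketch below models that control flow only. The function name and the max_attempts cap are hypothetical; the generated code has no such cap.

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// Models the retry loop in _Z1fv. Returns the number of fsqrt attempts made
// before the result compares equal to itself, or -1 if the (hypothetical)
// cap is reached first.
int attempts_until_exit(double numerator, int max_attempts) {
    assert(max_attempts > 0);
    const double inf = std::numeric_limits<double>::infinity();
    for (int attempt = 1; attempt <= max_attempts; ++attempt) {
        // .L14: fld1; fld %st(0); fdivl .LC3; fucom; ja  => fmax(1., 1./.0)
        const double divisor = std::fmax(1.0, inf);
        // .L15: fdivr; fsqrt
        const double root = std::sqrt(numerator / divisor);
        // fucom %st(0); je .L6  => result equals itself, return it
        if (root == root) return attempt;
        // jp .L17 => unordered (NaN): discard the result and go back to .L14
    }
    return -1;
}
```

With a NaN numerator the model never reaches the return path, whatever cap is chosen, which matches the reported hang. With an ordinary numerator such as 4.0 it returns after the first attempt.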

Comment 1 Andrew Pinski 2004-04-25 17:34:12 UTC

A simpler testcase for 3.4.0 and above:
inline double fmax(double a, double b) {
 return (a<=b)? b : a;
}
double f() {
  return __builtin_sqrt(__builtin_nan("")/fmax(1., 1./.0));
}
int main() {
  double u = f();
  return 0;
}

Confirmed.

Comment 2 Andrew Pinski 2004-07-25 07:20:01 UTC

Fixed, or at least masked so thoroughly that I can no longer reproduce it with any variant of the source.

Comment 3 Markus Schoder 2004-12-08 19:39:10 UTC

This bug seems to have resurfaced in 3.4.3. It is actually enough to have a
negative argument to sqrt.

Comment 4 douze 2006-05-29 10:37:04 UTC

What __builtin_sqrt does is:

try fsqrt
if the result is OK (fucom of the result against itself sets the flags for "equal", i.e. it is not a NaN), return it
else call the library sqrt

This was coded badly in gcc 3.4.1, causing an infinite loop (btw, I can't find where the asm code is)

The bug seems corrected in gcc 3.4.3
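
The three steps described in this comment can be sketched in C++ as follows. This is an illustration of the intended behaviour, not GCC's actual expansion. fsqrt_insn and library_sqrt are hypothetical stand-ins for the x87 instruction and the out-of-line libm call, and the errno handling is an assumption about why a fallback would be wanted at all.

```cpp
#include <cassert>
#include <cerrno>
#include <cmath>
#include <limits>

// Hypothetical stand-in for the fsqrt instruction: yields NaN for a NaN or
// negative input and never touches errno.
double fsqrt_insn(double x) {
    if (x != x || x < 0.0) return std::numeric_limits<double>::quiet_NaN();
    return std::sqrt(x);
}

// Hypothetical stand-in for the library sqrt: same value, but reports a
// domain error through errno (assumed reason for the fallback).
double library_sqrt(double x) {
    if (x < 0.0) {
        errno = EDOM;
        return std::numeric_limits<double>::quiet_NaN();
    }
    return std::sqrt(x);
}

// Try the fast path once; fall back to the library call exactly once.
double sqrt_with_fallback(double x) {
    const double fast = fsqrt_insn(x);
    if (fast == fast) return fast;  // result is not a NaN: done
    return library_sqrt(x);         // NaN: let the library handle it
}

// Usage sketch exercising both paths.
void check_sqrt_with_fallback() {
    errno = 0;
    assert(sqrt_with_fallback(4.0) == 2.0 && errno == 0);
    const double r = sqrt_with_fallback(-1.0);
    assert(r != r && errno == EDOM);
}
```

The point of the sketch is that the fallback is a single call that returns. The assembly in the description instead branches back to retry fsqrt, which can never succeed for a NaN input.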