ARM modulo patch

Nick Clifton nickc@redhat.com
Mon Aug 21 12:59:00 GMT 2000


Hi Guys,

  I am withdrawing my earlier patch to fix the ARM port's calculation
  of the modulus of big dividends by small divisors.  Further testing
  revealed that the patch broke the modulus of small numbers (e.g. it
  computed "3 % 2" as 0).

  I believe that the patch below is the correct solution.  It
  certainly works for all of the test cases that I have tried so far
  (including big and small dividends).  It also does not add an extra
  instruction into the body of the main loop, so I hope that it will
  not affect the performance of the code.
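
  As an illustration of the kind of check described above, here is a
  minimal sketch in C.  It is not the test set actually used for this
  patch: ref_umod and count_mismatches are hypothetical helpers, and the
  operand values are chosen only to cover both big and small dividends.
  The idea is that, when built for an ARM target, the "%" operator on
  unsigned ints ends up in __umodsi3, so comparing it against an
  independent shift-and-subtract reference would expose a result such
  as "3 % 2" == 0.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical reference: unsigned modulo by shift-and-subtract.
   This is an independent model, not the libgcc routine. */
static uint32_t ref_umod(uint32_t dividend, uint32_t divisor)
{
    assert(divisor != 0);          /* modulo by zero is not modelled */

    uint64_t d = divisor;          /* 64-bit so the shifts cannot overflow */
    uint64_t rem = dividend;
    int shift = 0;

    /* Line the divisor up under the dividend. */
    while ((d << 1) <= rem) {
        d <<= 1;
        shift++;
    }

    /* Subtract each shifted copy of the divisor that still fits. */
    for (; shift >= 0; shift--, d >>= 1)
        if (rem >= d)
            rem -= d;

    return (uint32_t)rem;
}

/* Compare the compiler's "%" against the reference for a mix of big
   and small operands.  Returns the number of disagreements. */
static int count_mismatches(void)
{
    static const uint32_t dividends[] = {
        0, 1, 2, 3, 7, 100, 0x7fffffffu, 0x80000000u, 0xfffffffeu, 0xffffffffu
    };
    static const uint32_t divisors[] = {
        1, 2, 3, 7, 10, 0x10000u, 0x7fffffffu, 0x80000000u, 0xffffffffu
    };
    int bad = 0;

    for (size_t i = 0; i < sizeof dividends / sizeof dividends[0]; i++) {
        for (size_t j = 0; j < sizeof divisors / sizeof divisors[0]; j++) {
            /* volatile stops the compiler folding the "%" at build time */
            volatile uint32_t a = dividends[i];
            volatile uint32_t b = divisors[j];
            if (a % b != ref_umod(a, b))
                bad++;
        }
    }
    return bad;
}
```

  On a correct implementation count_mismatches() should return 0.  A
  signed variant exercising __modsi3 could follow the same pattern.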

  Note - the same bug exists in both the Thumb and ARM versions of the
  modulo routines, so the patch fixes both sets of functions.

  Any objections to my applying this patch?

Cheers
	Nick


2000-08-21  Nick Clifton  <nickc@redhat.com>

	* config/arm/lib1funcs.asm (__umodsi3): Before performing any
	restorative additions, test for bottom bits of IP being set,
	rather than relying upon the RORs not matching.
	(__modsi3): Ditto.
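
The reasoning behind this entry can be modelled in C.  The sketch below
is illustrative only: it assumes that ip holds a single set bit and that
the fix-up tests rotate it right by 3, 2 and 1 (only the "ror #3" test
is visible in the hunks below), and the helper names are hypothetical.
It shows that a bit left in the top two positions of ip can still line
up with the top three bits of "overdone" after one of the smaller
rotations, which is why the patch tests the bottom bits of ip
explicitly.

```c
#include <assert.h>
#include <stdint.h>

#define OVERDONE_MASK 0xe0000000u   /* top three bits of "overdone" */

/* 32-bit rotate right, modelling the ARM "ror" operand shift. */
static uint32_t ror32(uint32_t x, unsigned n)
{
    n &= 31;
    return n ? (x >> n) | (x << (32 - n)) : x;
}

/* Could any of the assumed ROR-based tests (ror #3, #2, #1) see a set
   bit in the top three bit positions? */
static int ror_tests_can_match(uint32_t ip)
{
    for (unsigned n = 1; n <= 3; n++)
        if (ror32(ip, n) & OVERDONE_MASK)
            return 1;
    return 0;
}

/* The guard added by the patch: fix-ups are only valid when the bit in
   ip is within the bottom three bits ("tst ip, #0x7"). */
static int fixups_allowed(uint32_t ip)
{
    return (ip & 0x7u) != 0;
}

/* True when relying on the RORs alone would wrongly permit a fix-up. */
static int stale_match(uint32_t ip)
{
    return !fixups_allowed(ip) && ror_tests_can_match(ip);
}
```

Under these assumptions, stale_match() is true for ip == 0x80000000 and
ip == 0x40000000 (the top two bits) and false for every other single-bit
value, which agrees with the comment added by the patch.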


Index: gcc/config/arm/lib1funcs.asm
===================================================================
RCS file: /cvs/gcc/egcs/gcc/config/arm/lib1funcs.asm,v
retrieving revision 1.12
diff -w -p -r1.12 lib1funcs.asm
*** lib1funcs.asm	2000/08/18 10:18:14	1.12
--- lib1funcs.asm	2000/08/21 19:57:03
*************** Over6:	
*** 400,415 ****
  	@ Any subtractions that we should not have done will be recorded in
  	@ the top three bits of "overdone".  Exactly which were not needed
  	@ are governed by the position of the bit, stored in ip.
- 	@ If we terminated early, because dividend became zero,
- 	@ then none of the below will match, since the bit in ip will not be
- 	@ in the bottom nibble.
- 
  	mov	work, #0xe
  	lsl	work, #28	
  	and	overdone, work
  	bne	Over7
  	pop	{ work }
  	RET					@ No fixups needed
  Over7:
  	mov	curbit, ip
  	mov	work, #3
--- 400,423 ----
  	@ Any subtractions that we should not have done will be recorded in
  	@ the top three bits of "overdone".  Exactly which were not needed
  	@ are governed by the position of the bit, stored in ip.
  	mov	work, #0xe
  	lsl	work, #28	
  	and	overdone, work
  	bne	Over7
  	pop	{ work }
  	RET					@ No fixups needed
+ 	
+ 	@ If we terminated early, because dividend became zero, then the 
+ 	@ bit in ip will not be in the bottom nibble, and we should not
+ 	@ perform the additions below.  We must test for this though
+ 	@ (rather relying upon the TSTs to prevent the additions) since
+ 	@ the bit in ip could be in the top two bits which might then match
+ 	@ with one of the smaller RORs.
+ 	mov	curbit, ip
+ 	mov	work, #0x7
+ 	tst	curbit, work
+ 	beq	Over10
+ 	
  Over7:
  	mov	curbit, ip
  	mov	work, #3
*************** Loop3:
*** 490,499 ****
  	@ Any subtractions that we should not have done will be recorded in
  	@ the top three bits of "overdone".  Exactly which were not needed
  	@ are governed by the position of the bit, stored in ip.
- 	@ If we terminated early, because dividend became zero,
- 	@ then none of the below will match, since the bit in ip will not be
- 	@ in the bottom nibble.
  	ands	overdone, overdone, #0xe0000000
  	RETc(eq)				@ No fixups needed
  	tst	overdone, ip, ror #3
  	addne	dividend, dividend, divisor, lsr #3
--- 498,511 ----
  	@ Any subtractions that we should not have done will be recorded in
  	@ the top three bits of "overdone".  Exactly which were not needed
  	@ are governed by the position of the bit, stored in ip.
  	ands	overdone, overdone, #0xe0000000
+ 	@ If we terminated early, because dividend became zero, then the 
+ 	@ bit in ip will not be in the bottom nibble, and we should not
+ 	@ perform the additions below.  We must test for this though
+ 	@ (rather relying upon the TSTs to prevent the additions) since
+ 	@ the bit in ip could be in the top two bits which might then match
+ 	@ with one of the smaller RORs.
+ 	tstNE	ip, #0x7
  	RETc(eq)				@ No fixups needed
  	tst	overdone, ip, ror #3
  	addne	dividend, dividend, divisor, lsr #3
*************** Over7:	
*** 797,810 ****
  	@ Any subtractions that we should not have done will be recorded in
  	@ the top three bits of "overdone".  Exactly which were not needed
  	@ are governed by the position of the bit, stored in ip.
- 	@ If we terminated early, because dividend became zero,
- 	@ then none of the below will match, since the bit in ip will not be
- 	@ in the bottom nibble.
  	mov	work, #0xe
  	lsl	work, #28
  	and	overdone, work
  	beq	Lgot_result
  	
  	mov	curbit, ip
  	mov	work, #3
  	ror	curbit, work
--- 809,830 ----
  	@ Any subtractions that we should not have done will be recorded in
  	@ the top three bits of "overdone".  Exactly which were not needed
  	@ are governed by the position of the bit, stored in ip.
  	mov	work, #0xe
  	lsl	work, #28
  	and	overdone, work
  	beq	Lgot_result
  	
+ 	@ If we terminated early, because dividend became zero, then the 
+ 	@ bit in ip will not be in the bottom nibble, and we should not
+ 	@ perform the additions below.  We must test for this though
+ 	@ (rather relying upon the TSTs to prevent the additions) since
+ 	@ the bit in ip could be in the top two bits which might then match
+ 	@ with one of the smaller RORs.
+ 	mov	curbit, ip
+ 	mov	work, #0x7
+ 	tst	curbit, work
+ 	beq	Lgot_result
+ 	
  	mov	curbit, ip
  	mov	work, #3
  	ror	curbit, work
*************** Loop3:
*** 896,905 ****
  	@ Any subtractions that we should not have done will be recorded in
  	@ the top three bits of "overdone".  Exactly which were not needed
  	@ are governed by the position of the bit, stored in ip.
- 	@ If we terminated early, because dividend became zero,
- 	@ then none of the below will match, since the bit in ip will not be
- 	@ in the bottom nibble.
  	ands	overdone, overdone, #0xe0000000
  	beq	Lgot_result
  	tst	overdone, ip, ror #3
  	addne	dividend, dividend, divisor, lsr #3
--- 916,929 ----
  	@ Any subtractions that we should not have done will be recorded in
  	@ the top three bits of "overdone".  Exactly which were not needed
  	@ are governed by the position of the bit, stored in ip.
  	ands	overdone, overdone, #0xe0000000
+ 	@ If we terminated early, because dividend became zero, then the 
+ 	@ bit in ip will not be in the bottom nibble, and we should not
+ 	@ perform the additions below.  We must test for this though
+ 	@ (rather relying upon the TSTs to prevent the additions) since
+ 	@ the bit in ip could be in the top two bits which might then match
+ 	@ with one of the smaller RORs.
+ 	tstNE	ip, #0x7
  	beq	Lgot_result
  	tst	overdone, ip, ror #3
  	addne	dividend, dividend, divisor, lsr #3

