Bug 32494 - gcc-4.3.x _32-bit_ becoming irrelevant to kernel
Status: RESOLVED INVALID
Alias: None
Product: gcc
Classification: Unclassified
Component: c
Version: 4.3.0
Importance: P3 normal
Target Milestone: ---
Assignee: Not yet assigned to anyone
 
Reported: 2007-06-25 12:11 UTC by Ray Malitzke
Modified: 2007-06-30 07:44 UTC

Host: i686-pc-linux-gnu
Target: i686-pc-linux-gnu
Build: i686-pc-linux-gnu


Description Ray Malitzke 2007-06-25 12:11:56 UTC
On Fri, 18 May 2007, Andrew Morton wrote:
> 
> gcc-4.3 appears to have cunningly converted this:

Very cunning indeed.

Considering that gcc converted straightforward and simple code to a total 
disaster with a 64-bit divide, I'd call it a gcc bug.

> into a divide-by-1000000000 operation, so it emits a call to udivdi3 and we
> don't link.

I think the proper fix is to just tell people that version of gcc is 
broken.

		Linus


If anybody wonders where this quote came from, just google "udivdi3 gcc".

Current gcc-20070623 message:

  ld -m elf_i386 -m elf_i386  -o .tmp_vmlinux1 -T arch/i386/kernel/vmlinux.lds arch/i386/kernel/head.o arch/i386/kernel/init_task.o  init/built-in.o --start-group  usr/built-in.o  arch/i386/kernel/built-in.o  arch/i386/mm/built-in.o  arch/i386/mach-default/built-in.o  arch/i386/crypto/built-in.o  kernel/built-in.o  mm/built-in.o  fs/built-in.o  ipc/built-in.o  security/built-in.o  crypto/built-in.o  block/built-in.o  lib/lib.a  arch/i386/lib/lib.a  lib/built-in.o  arch/i386/lib/built-in.o  drivers/built-in.o  sound/built-in.o  arch/i386/pci/built-in.o  arch/i386/power/built-in.o  net/built-in.o --end-group 
kernel/built-in.o: In function `getnstimeofday':
(.text+0x1eba5): undefined reference to `__udivdi3'
kernel/built-in.o: In function `do_gettimeofday':
(.text+0x1ecca): undefined reference to `__udivdi3'
kernel/built-in.o: In function `update_wall_time':
(.text+0x1f0f4): undefined reference to `__udivdi3'
make: *** [.tmp_vmlinux1] Error 1


For more details see PR31541, PR32044, and PR31990.
Comment 1 Andrew Pinski 2007-06-25 12:16:30 UTC
You need to learn this is not a bug.
If you write:

long long f(long long a, long long b)
{
  return a/b;
}

You will get a reference to __divdi3.  There is no bug here except inside the Linux kernel.

Linus is wrong to say that 4.3 is broken, because it is behaving exactly as it should.
Comment 2 Andrew Pinski 2007-06-25 12:33:11 UTC
Also, by the way, the divide is only inside the unlikely part of the code, so it will not slow down the common path.
Comment 3 Ray Malitzke 2007-06-25 12:35:36 UTC
Ping?
Comment 4 Andrew Pinski 2007-06-25 12:37:34 UTC
Huh?  This bug is invalid.  Linus is incorrect.  Please read all the emails, including Segher's.  Note that GCC is not ignoring unlikely at all (except maybe for a code size issue).
Comment 5 Ray Malitzke 2007-06-25 12:50:15 UTC
Ping?
Comment 6 Richard Biener 2007-06-25 12:53:46 UTC
Pong.
Comment 7 Andrew Pinski 2007-06-25 12:53:59 UTC
What is there to ping?????
The problem again is in the Linux kernel.

Please read http://lkml.org/lkml/2007/5/18/371 as I mentioned before.
Linus is incorrect.  GCC is not ignoring unlikely as the divide is only reachable via the unlikely path.  I already checked that.
Comment 8 Ray Malitzke 2007-06-25 13:03:09 UTC
Ping?
Comment 9 Andrew Pinski 2007-06-25 13:06:56 UTC
Now you are getting annoying.  Richard closed the bug already too.  Please read my comments in full, and Segher's.  Nobody has really looked into the code produced, beyond the fact that GCC is emitting a call to divdi3, which exists to support dividing long longs.

As I (and others) have mentioned, Linus is wrong in assuming that GCC is doing something wrong and not taking unlikely into account, because GCC is. Just not the way you think.
Comment 10 Ray Malitzke 2007-06-25 17:01:04 UTC
ping
Comment 11 Andrew Pinski 2007-06-25 17:12:04 UTC
The standard is clear that long long is fully supported in freestanding programs, which means that the implementation needs to support it. GCC supports it by providing the libgcc.a support library.

4/6:
The two forms of conforming implementation are hosted and freestanding. A conforming
hosted implementation shall accept any strictly conforming program. A conforming
freestanding implementation shall accept any strictly conforming program that does not
use complex types and in which the use of the features specified in the library clause
(clause 7) is confined to the contents of the standard headers <float.h>,
<iso646.h>, <limits.h>, <stdarg.h>, <stdbool.h>, <stddef.h>, and
<stdint.h>. A conforming implementation may have extensions (including additional
library functions), provided they do not alter the behavior of any strictly conforming
program.

5.1.2.1/1:
In a freestanding environment (in which C program execution may take place without any
benefit of an operating system), the name and type of the function called at program
startup are implementation-defined. Any library facilities available to a freestanding
program, other than the minimal set required by clause 4, are implementation-defined.

So GCC is doing the correct thing and I just read the standard.
Comment 12 Ray Malitzke 2007-06-28 03:53:16 UTC
Mr Pinski! Thanks for again doing the work for me.

I just had to take some time out for my annual checkup and to rebuild my big machine's software after Gentoo, on shutdown -h now, deleted my /bin, /etc, and /sbin directories. Luckily Volkerding is back in good health and has completely revamped Slackware. I do not use Red Hat because they try to force Gnome down my throat, and I do not use SUSE because they try to do the same thing with KDE.

I hope the somewhat out-of-context quotes you provide come from the real spec and not the preliminary copy available on the net. GCC should have the real spec, while I can find better use for my money.

Now to your first sentence in comment 11:

The support obligation (contrary to that sentence) is levied on the conforming implementation and not on the conforming program. Now let us go to the real issue, namely the transformation of a subtraction with unsigned long (not unsigned long long as in my toy program) minuend and subtrahend into a 64-bit udivdi3. The proof of my assertion lies in the fact that for a 64-bit machine the issue never arises. I certainly cannot speak for Mr. Torvalds, and he certainly does not need my help, but speaking for myself I find it preposterous that a clearly faulty interpretation of the standard is used to ram down __my__ throat a transformation from a clearly specified subtraction into a udivdi3 division.

The erroneous interpretation stems from complete ignorance (feigned or actual) of the fact that the relation between program (this is the part missing from your quote) and implementation is non-reflexive; you treat it as a reflexive or, worse, an equivalence relation.

I never disputed the fact that under the standard even the freestanding implementation has an obligation to provide udivdi3 for 32-bit machines. You disclosed your ignorance of or, worse, morally questionable (your choice) disregard for the real issue, namely by trying to cover up a grave deficiency (bug) in the GNU C compiler with the specious argument of using a division, instead of the subtraction, in your comment 1. If I or the kernel people had specified an actual division instead of a carefully circumscribed subtraction, we would have gotten what we deserved.

Yes, Mr Morton looked for alternatives in order to avoid a confrontation; I am not as conciliatory. In my opinion Mr. Torvalds hit the nail right on the head. As an aside, I am not your messenger boy to propagate your ignorance or maliciousness to third parties.

If this goes unchallenged the next possible result could be as follows:

Some GCC maintainer, claiming register pressure, would resort to the following: all Pentium CPUs have a floating point unit, and there exist integer load and store operations for that floating point unit. Last time I looked, saving and restoring the floating point registers, plus flags, is not atomic; hence reentrancy goes down the drain, with potentially fatal results.

The remainder clearly shows that some members of the GCC community agree that I, as a user, have a right to avoid having this inane substitution rammed down my throat. Unfortunately their proposed remedies either do not work or do not address the real issue:

It would of course be easy to prevent the optimization by declaring nsec to be
volatile.  The question is whether the compiler can reasonably determine that
the optimization is inappropriate in this particular case.

Richard Guenther

an optimization issue and __udivdi3 can be avoided by using volatile as stated
and verified.

  Manuel López-Ibáñez

Isn't there a way for __builtin_expect to modify this behaviour? After all, it
is telling us that the loop is cheap. And the difference in computation time is
not trivial at all.

The volatile fix would be fine, but (at least for me) it does not work with the
kernel. There is that little message:

kernel/time.c:479: warning: passing argument 3 of 'div_long_rem_signed' discards qualifiers from pointer target type.

and others like it, and udivdi3 reappears.
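
For readers who want to see what the volatile workaround amounts to, here is a minimal sketch. The function name secs_from_nsec_volatile and the macro NSEC_PER_SEC are made up for illustration and are not the kernel's code. Whether a particular GCC snapshot then avoids __udivdi3 is something to verify in the generated assembly rather than assume.

```c
#define NSEC_PER_SEC 1000000000ULL

/* Hypothetical helper illustrating the volatile workaround. */
int secs_from_nsec_volatile(unsigned long long nsec_in)
{
    /*
     * Every access to a volatile object has to be performed as written,
     * so the compiler has to keep the subtraction loop as a loop.
     */
    volatile unsigned long long nsec = nsec_in;
    int sec = 0;

    while (nsec >= NSEC_PER_SEC) {
        nsec -= NSEC_PER_SEC;
        ++sec;
    }
    return sec;
}
```

The cost is that the address of such a variable has type pointer-to-volatile. Passing it to a function whose parameter is a plain unsigned long long * drops the qualifier, which is the kind of "discards qualifiers" warning quoted above.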



Thank you (muchas gracias) for looking at the matter from a user's point of
view and considering my arguments concerning __builtin_expect. You seem to be
the first to look at the timings and amount of code generated. If you are
interested I have equivalent data taken on a Mac with dual G4s. I did not send
it so far because until you intervened I got mostly legalistic arguments and
proposed fixes that do not solve the real problem of avoiding both udivdi3 and,
more importantly, libgcc.

 Richard Guenther 

So this is now an enhancement request for sccp to honor loop roll count or
basic-block frequency and cost of the replacement.  Note the loop appears to be
peeled twice before sccp already, but peeling doesn't decay probabilities
further.

Testcase:

int rmg(unsigned long long nsec)
{
   int sec = 0;
   nsec++;
   while (__builtin_expect(nsec >= 1000000000UL, 0)) {
      nsec -= 1000000000UL;
      ++sec;
   }
   return sec;
}

note this can be worked around with -fno-tree-scev-cprop as well.
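
To make the transformation under discussion concrete, the sketch below puts the loop from the testcase next to the closed form it is equivalent to. The names rmg_loop and rmg_closed_form are illustrative only, and the second function is an assumption about the shape of the replacement, not a dump of what GCC emits. On a 32-bit target a 64-bit unsigned division like this one is normally lowered to a call to __udivdi3.

```c
#define NSEC_PER_SEC 1000000000ULL

/* Loop form, as in the testcase: repeated subtraction. */
int rmg_loop(unsigned long long nsec)
{
    int sec = 0;

    nsec++;
    while (nsec >= NSEC_PER_SEC) {
        nsec -= NSEC_PER_SEC;
        ++sec;
    }
    return sec;
}

/* Closed form: the loop's final value written as a single division. */
int rmg_closed_form(unsigned long long nsec)
{
    nsec++;
    return (int)(nsec / NSEC_PER_SEC);
}
```

Both functions return the same value as long as the quotient fits in an int. The difference is purely in cost and in the library dependency: the loop needs only 64-bit compare and subtract, while the closed form needs a 64-bit divide.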

Manuel López-Ibáñez 


The flag just disables an optimisation. If you want to disable optimisations
just  use -O0. On the other hand, shouldn't -ffreestanding prevent udivdi3 ?
What about -fno-builtin-udivdi3 ?

Zdenek Dvorak


We used to take the cost of the replacement into account.  It caused so many
missed-optimization PRs that I decided to just disable it.  The main problem is
that while theoretically you can determine whether replacement is more costly
than performing the computation in the loop (although even this is nontrivial
in practice), it is very difficult to estimate the gains of enabling further
optimizations.

One possible solution would be to annotate the division by the expected value
of the result.  Division expanders then may decide whether to expand to machine
instruction/libcall or to check for small values of the result in if-guards
first.
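
The if-guard idea in the last paragraph can be pictured in source form roughly as follows. This is a hand-written sketch with a made-up name (div_expect_small), not something GCC's division expanders produce; it only shows the shape of "check for small results first, fall back to the real division otherwise".

```c
/*
 * Hypothetical helper: divide n by d when the quotient is expected
 * to be 0 or 1 almost all of the time. d must be nonzero.
 */
unsigned long long div_expect_small(unsigned long long n,
                                    unsigned long long d)
{
    if (n < d)
        return 0;          /* quotient 0: no divide needed */
    n -= d;
    if (n < d)
        return 1;          /* quotient 1: one subtraction was enough */
    return 1 + n / d;      /* rare path: full division of the rest */
}
```

On a 32-bit target the rare path would still pull in the division libcall, so this addresses the timing concern but not the link-time dependency on libgcc.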





 
Comment 13 Andrew Pinski 2007-06-28 19:06:08 UTC
Again, please read what I wrote about what the C99 standard requires.  It requires long long support for a freestanding compiler, and that is provided by libgcc.  If the Linux kernel team decides that they don't want to use libgcc, how can this be a GCC bug?
Comment 14 Ray Malitzke 2007-06-29 21:14:15 UTC
The first two sentences of your comment were never disputed, either by myself or, as far as I read it, by Mr Torvalds's comment.

The only thing under dispute is the completely unwarranted transformation of a subtraction into a division.

I am not speaking for the kernel people here but for myself; their subtraction just started me off. There are very good reasons to avoid libgcc, like atomicity, reentrancy, or plain orneriness.  If I clearly specify a subtraction, any C compiler worthy of its name has no right to transform that subtraction into a division and then claim that the substitution entitles GCC to ram libgcc down my throat.

In a freestanding program I do not want, and apparently the Linux kernel does not want, libgcc painted any color. It is our prerogative to specify the operations we want. In hosted programs it might not be worthwhile fighting against the underhanded way libgcc is dragged in (remember, ldd does not show its use). Even the US Supreme Court looks at the drafting process preceding the Constitution and any laws passed by Congress. Now, the below is what bound the C99 committee in drafting the standard.

If this is what bound the standardization committee, it is certainly binding on
myself; GCC apparently feels differently. The XFree86-Xorg case inspires me to
believe that reason will prevail one way or another.


The original X3J11 charter clearly mandated codifying common existing practice,
and the C89 Committee held fast to precedent wherever that was clear and
unambiguous. The vast majority of the language defined by C89 was precisely the 
same as defined in Appendix A of the first edition of The C Programming 
Language by Brian Kernighan and Dennis Ritchie, and as was implemented in 
almost all C translators of the time. (That document is hereinafter referred to
as K&R.)

K&R was not the only source of existing practice. Much work had been done over
the years to improve the C language by addressing its weaknesses, and the C89
Committee formalized enhancements of proven value which had become part of the
various dialects of C. This practice has continued in the present Committee.

Existing practice, however, has not always been consistent. Various dialects of
C have approached problems in different and sometimes diametrically opposed
ways. This divergence has happened for several reasons. First, K&R, which once
served as the language specification for almost all C translators, is imprecise
in some areas (thereby allowing divergent interpretations), and it does not
address some issues (such as a complete specification of a library) important
for code portability. Second, as the language has matured over the years,
various extensions have been added in different dialects to address limitations
and weaknesses of the language; but these extensions have not been consistent
across dialects.

One of the C89 Committee's goals was to consider such areas of divergence and 
to establish a set of clear, unambiguous rules consistent with the rest of the
language. This effort included the consideration of extensions made in various
C dialects, the specification of a complete set of required library functions,
and the development of a complete, correct syntax for C.

Much of the Committee's work has always been in large part a balancing act. The
C89 Committee tried to improve portability while retaining the definition of
certain features of C as machine-dependent, it attempted to incorporate 
valuable new ideas without disrupting the basic structure and fabric of the
language, and it tried to develop a clear and consistent language without
invalidating existing programs. All of the goals were important and each
decision was weighed in the light of sometimes contradictory requirements in an
attempt to reach a workable compromise.

In specifying a standard language, the C89 Committee used several principles
which continue to guide our deliberations today. The most important of these
are:

Existing code is important, existing implementations are not. A large body of C
code exists of considerable commercial value. Every attempt has been made to
ensure that the bulk of this code will be acceptable to any implementation
conforming to the Standard. The C89 Committee did not want to force most
programmers to modify their C programs just to have them accepted by a
conforming translator.

On the other hand, no one implementation was held up as the exemplar by which
to define C. It was assumed that all existing implementations must change
somewhat to conform to the Standard.

C code can be portable. Although the C language was originally born with the
UNIX operating system on the PDP-11, it has since been implemented on a wide
variety of computers and operating systems. It has also seen considerable use
in cross-compilation of code for embedded systems to be executed in a
free-standing environment. The C89 Committee attempted to specify the language
and the library to be as widely implementable as possible, while recognizing
that a system must meet certain minimum criteria to be considered a viable host
or target for the language.

C code can be non-portable. Although it strove to give programmers the
opportunity to write truly portable programs, the C89 Committee did not want to
force programmers into writing portably, to preclude the use of C as a
high-level assembler: the ability to write machine-specific code is one of the
strengths of C. It is this principle which largely motivates drawing the
distinction between strictly conforming program and conforming program.

Avoid quiet changes. Any change to widespread practice altering the meaning of
existing code causes problems. Changes that cause code to be so ill-formed as
to require diagnostic messages are at least easy to detect. As much as seemed
possible consistent with its other goals, the C89 Committee avoided changes
that quietly alter one valid program to another with different semantics, that
cause a working program to work differently without notice. In important places
where this principle is violated, both the C89 Rationale and this Rationale
point out a QUIET CHANGE.

A standard is a __treaty__ (contract) between implementor and programmer. Some
numerical limits were added to the Standard to give both implementors and
programmers a better understanding of what must be provided by an
implementation, of what can be expected and depended upon to
exist. These limits were, and still are, presented as minimum maxima (that is,
lower limits placed on the values of upper limits specified by an
implementation) with the understanding that any implementor is at liberty to
provide higher limits than the Standard mandates. Any program that
takes advantage of these more tolerant limits is not strictly conforming,
however, since other implementations are at liberty to enforce the mandated
limits.

Keep the spirit of C. The C89 Committee kept as a major goal to preserve the
traditional spirit of C. There are many facets of the spirit of C, but the
essence is a community sentiment of the underlying principles upon which the C
language is based. Some of the facets of the spirit of C can be summarized in
phrases like:

 Trust the programmer.

 Don't prevent the programmer from doing what needs to be done.

 Keep the language small and simple.

 Provide only one way to do an operation.

 
They say it much better than I ever could. I found it when looking for the price of the official C99 Standard. I am leaving it for the benefit of less fanatical people.


Comment 15 Andrew Pinski 2007-06-29 21:20:42 UTC
> A standard is a __treaty__ (contract) between implementor and programmer.

And in our implementation, there is a library for support functions.  If you don't see that, then please stop your rants.  Your rants actually make you look bad.
Comment 16 Ray Malitzke 2007-06-29 21:42:09 UTC
A treaty is a bilateral agreement, not something shoved down one side's throat.

The worse I look, the more I accomplish for people other than GCC fanatics.
Comment 17 Andrew Pinski 2007-06-29 21:45:08 UTC
1) The compiler needs a support library to implement all required features in the standard.
2) That library is libgcc.
3) The Linux kernel has its own support library for these functions.
4) The Linux kernel support library for C is not complete.

So where is the GCC bug if the Linux support library is not complete?
Comment 18 Ray Malitzke 2007-06-29 22:19:45 UTC

As I am clearly rejected by the GCC insiders in my attempts to help make the C
compiler more attuned to the spirit of the C99 committee, I am now forced to
alert the user community to what is happening with a near monopoly.

And why is a GCC maintainer, with privileged access to GCC's bugzilla, and
hence a spokesperson for the GCC community, claiming again and again on GCC's
bugzilla that Mr Linus Torvalds is wrong, instead of having the guts to
confront Mr Torvalds directly?  I do not work for Mr Torvalds, nor am I part of
the kernel community, to deliver an inane message to somebody of the stature
of Mr Torvalds. Actually it is evidently clear that Mr Linus Torvalds and Mr
Andrew Morton do not need my help. I am actually pursuing this on my own as
part of a larger picture.

For an outsider, GCC's bugzilla is the equivalent of the leads of the good old EE black box. You use the tools available.

The bugzilla.kernel.org thread started by myself is 8501. It was my ignorance at the time that led to a poor title. Actually the Linux kernel had the udivdi3 algorithm for many years. Udivdi3 originally came from BSD. Udivdi3 was removed, under some controversy, from kernel-2.6.x. There must have been a good reason, of which I as an outsider am ignorant. However, having examined the algorithm in libgcc, I thoroughly applaud the removal. I would never use udivdi3 in a real-time executive, and I, as a project engineer, would fire for cause any programmer who slipped it in against my edict and hid it from ldd to avoid detection. Again, I am not speaking for Mr. Torvalds.
Comment 19 Andreas Schwab 2007-06-29 22:34:42 UTC
There is no violation of any C standard.
Comment 20 Ray Malitzke 2007-06-29 23:53:12 UTC
Ping
Comment 21 Andrew Pinski 2007-06-29 23:55:36 UTC
How many times do I and others have to say that GCC is doing the correct thing?
If you want to ping someone, go talk to Linus.
Comment 22 Andrew Pinski 2007-06-29 23:57:09 UTC
When I say GCC is doing the correct thing, I am talking about the library issue and not about the code gen issue. The code gen issue is filed in a different bug and will be fixed; it just takes time, and your rants make it harder to fix things because we have to take time out to respond to them.
Comment 23 Ray Malitzke 2007-06-30 00:18:40 UTC
Segher was mentioned twice. First, according to my research he is not a kernel maintainer as implied in comments 4 and 9. He is actually Segher Boessenkool, a GCC maintainer, inactive since 2005-02-01; his latest email address is at kernel.crashing.org, and earlier it was at de.ibm.com. I cannot quote him because he has rather offensively forbidden me to do so. Maybe Mr Pinski can quote him.

Now to comment 19: what is violated are my rights as a programmer, by transforming a carefully circumscribed subtraction into a division. Just repeating the C99 standard writers:

 Trust the programmer.

 Don't prevent the programmer from doing what needs to be done.

 Keep the language small and simple.

 Provide only one way to do an operation.

All I am asking for is an effective way to impede the utterly inane and counterproductive (also mentioned as cunning) subtraction-to-division transformation. Just give me a flag or attribute that works without having to use -O0.

Having to use -O0 would make gcc-4.3 utterly irrelevant to programmers and negate:

We strive to provide regular, high quality releases, which we want to work well on a variety of native and cross targets (including GNU/Linux).

Anybody recognize this?
Comment 24 Ray Malitzke 2007-06-30 00:22:21 UTC
Mr. Torvalds has already answered in comment 1
Comment 25 Andrew Pinski 2007-06-30 00:29:55 UTC
Hello, anyone in here? I guess you did not see my comment that the code gen issue is going to be fixed.  The issue with the library is a different problem and not really a GCC issue: if the programmer uses long long, he either has to use libgcc or create a library which has the same API, which means implementing udivdi3 as well; if they don't, then they don't have a compliant freestanding compiler.

  I never said Segher was a kernel maintainer, only a kernel hacker.
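
As a rough illustration of what "a library which has the same API" would have to contain, here is a minimal sketch of 64-bit unsigned division done by shift-and-subtract. The name udiv64_sketch and its signature are invented for this example; it is neither libgcc's __udivdi3 nor anything from the kernel, and it makes no attempt to be fast.

```c
#include <stdint.h>

/*
 * Hypothetical helper: 64-bit unsigned division by restoring
 * shift-and-subtract. The divisor d must be nonzero. If rem is
 * non-NULL, the remainder is stored through it.
 */
uint64_t udiv64_sketch(uint64_t n, uint64_t d, uint64_t *rem)
{
    uint64_t q = 0;
    uint64_t r = 0;
    int i;

    for (i = 0; i < 64; i++) {
        /* Bring the next dividend bit (most significant first) into r. */
        r = (r << 1) | (n >> 63);
        n <<= 1;
        q <<= 1;

        /* If the partial remainder reaches d, subtract and set the bit. */
        if (r >= d) {
            r -= d;
            q |= 1;
        }
    }
    if (rem)
        *rem = r;
    return q;
}
```

The loop uses only constant shifts, compares, and subtractions on 64-bit values, which is why a routine of this kind can stand in for a hardware divide on a 32-bit machine. Whether a given compiler expands those 64-bit operations inline or with further helper calls is target-dependent.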
Comment 26 Andrew Pinski 2007-06-30 00:30:57 UTC
As mentioned many times, libgcc is correct and the issue with the libgcc is a kernel issue.  Yes GCC should not be emitting udivid3 in the case of the loop but that is a different bug which is still open.