From eager@eagercon.com Sat Dec 1 00:56:00 2007
From: eager@eagercon.com (Michael Eager)
Date: Sat, 01 Dec 2007 00:56:00 -0000
Subject: BITS_PER_UNIT larger than 8 -- word addressing
In-Reply-To: <474C61FB.1070903@eagercon.com>
References: <20071127034635.ABBD773D73@caffeine.csclub.uwaterloo.ca> <474BC7B7.30400@eagercon.com> <474C5B13.201@eagercon.com> <474C61FB.1070903@eagercon.com>
Message-ID: <4750B149.8070801@eagercon.com>

Michael Eager wrote:
> Joseph S. Myers wrote:
>> On Tue, 27 Nov 2007, Michael Eager wrote:
>>
>>> I think that there is a pervasive understanding that SImode is
>>> single precision integer, 32-bits long.
>>
>> Only among contributors not considering non-8-bit bytes. SImode is 4
>> times QImode, 4*BITS_PER_UNIT bits, and may not exist (or at least not
>> be particularly usable, much like the limitations on TImode on 32-bit
>> targets) with large BITS_PER_UNIT.
>
> I think you just described the majority of contributors. :-)
> It may be that my current problem is limited to unwind-dw2.c.
> If so, then I may be able to work around it by simply not building it.

There's also __mode__ (__SI__) in include/sys/types.h where it's
used to define _ST_INT32. The comments are interesting: This
is meant to ensure that the stat struct layout doesn't change
when sizeof(int) changes, but forgets that SI is not always 32 bits.

--
Michael Eager    eager@eagercon.com
1960 Park Blvd., Palo Alto, CA 94306  650-325-8077

From joseph@codesourcery.com Sat Dec 1 01:32:00 2007
From: joseph@codesourcery.com (Joseph S. Myers)
Date: Sat, 01 Dec 2007 01:32:00 -0000
Subject: BITS_PER_UNIT larger than 8 -- word addressing
In-Reply-To: <4750B149.8070801@eagercon.com>
References: <20071127034635.ABBD773D73@caffeine.csclub.uwaterloo.ca> <474BC7B7.30400@eagercon.com> <474C5B13.201@eagercon.com> <474C61FB.1070903@eagercon.com> <4750B149.8070801@eagercon.com>
Message-ID:

On Fri, 30 Nov 2007, Michael Eager wrote:

> There's also __mode__ (__SI__) in include/sys/types.h

Not in GCC.
I don't know about the portability assumptions of newlib.

--
Joseph S. Myers
joseph@codesourcery.com

From mark@codesourcery.com Sat Dec 1 03:32:00 2007
From: mark@codesourcery.com (Mark Mitchell)
Date: Sat, 01 Dec 2007 03:32:00 -0000
Subject: Describing commercial support on our website
In-Reply-To: <20071130180914.GC10692@synopsys.com>
References: <6c33472e0711281608t37d0f9b6m71d5820d1765c766@mail.gmail.com> <20071129221317.GB20723@synopsys.com> <6c33472e0711300653t78654d75u483f72308cb5137a@mail.gmail.com> <20071130175209.GA10692@synopsys.com> <47505067.1020601@avtrex.com> <20071130180914.GC10692@synopsys.com>
Message-ID: <4750D5C8.2070907@codesourcery.com>

Joe Buck wrote:

> I suggest killing the file; we could later establish a consultants web
> page for gcc.gnu.org but we could treat that as a separate issue.

I suggest killing the file as well.

I'm certainly honored that CodeSourcery is being used as an example of a
company that works on GCC in this thread. But, I'm not very keen on
replacing the SERVICE file with some kind of listing on gcc.gnu.org, either.

I've always thought it was a good thing that we kept the GCC site and
mailing lists non-commercial. While various participants in GCC
development work for companies that compete in the marketplace (whether
for GCC business, for design wins, for operating system business, or
whatever), we are, in the context of the GCC web site, collaborators
aiming towards the common goal of a great free compiler.

I'd rather let people who want customization, support, etc. use other
means to figure out who provides those services. Then we don't have to
worry about who's listed in what order on the web site, etc.
--
Mark Mitchell
CodeSourcery
mark@codesourcery.com
(650) 331-3385 x713

From mastapeel@hotmail.com Sat Dec 1 05:54:00 2007
From: mastapeel@hotmail.com (darvish)
Date: Sat, 01 Dec 2007 05:54:00 -0000
Subject: problems compiling a program with -m32
Message-ID: <14102516.post@talk.nabble.com>

Hey, I'm running openSUSE 2.6.22.12-0.1-default with x86_64.
I have several files: pizza.asm, driver.c and asm_io.o
What I'm trying to do is to compile all these files to a single executable
file pizza, I use the following command:

nasm -g -f elf -d ELF_TYPE pizza.asm
gcc -m32 -o pizza -g driver.c pizza.o asm_io.o

and I get this whenever I execute the gcc line:
/usr/lib64/gcc/x86_64-suse-linux/4.2.1/../../../../x86_64-suse-linux/bin/ld:
skipping incompatible /usr/lib64/gcc/x86_64-suse-linux/4.2.1/libgcc.a when
searching for -lgcc
/usr/lib64/gcc/x86_64-suse-linux/4.2.1/../../../../x86_64-suse-linux/bin/ld:
cannot find -lgcc
collect2: ld returned 1 exit status

I tried to compile it again without the -m32 parameter and I get:

/usr/lib64/gcc/x86_64-suse-linux/4.2.1/../../../../x86_64-suse-linux/bin/ld:
i386 architecture of input file `pizza.o' is incompatible with i386:x86-64
output
/usr/lib64/gcc/x86_64-suse-linux/4.2.1/../../../../x86_64-suse-linux/bin/ld:
i386 architecture of input file `asm_io.o' is incompatible with i386:x86-64
output
collect2: ld returned 1 exit status

I don't understand...what am I doing wrong and how can I get pizza.o and
asm_io.o file to link with my 64bit architecture. I'd appreciate any help,
thanks.
--
View this message in context: http://www.nabble.com/problems-compiling-a-program-with--m32-tf4927203.html#a14102516
Sent from the gcc - Dev mailing list archive at Nabble.com.

From rsandifo@nildram.co.uk Sat Dec 1 09:48:00 2007
From: rsandifo@nildram.co.uk (Richard Sandiford)
Date: Sat, 01 Dec 2007 09:48:00 -0000
Subject: Link tests after GCC_NO_EXECUTABLES
In-Reply-To: <20071130211005.GQ17368@sygehus.dk> (Rask Ingemann Lambertsen's message of "Fri\, 30 Nov 2007 22\:10\:05 +0100")
References: <474C98AA.50105@t-online.de> <474C9A65.2060902@codesourcery.com> <474C9B33.8060503@t-online.de> <474C9CBD.2070708@codesourcery.com> <87fxyqdc45.fsf@firetop.home> <474D943C.4030106@codesourcery.com> <20071128210420.GH17368@sygehus.dk> <474DF7E4.6050308@codesourcery.com> <20071130181424.GO17368@sygehus.dk> <4750559E.2090800@codesourcery.com> <20071130211005.GQ17368@sygehus.dk>
Message-ID: <87d4tqu4nv.fsf@firetop.home>

Rask Ingemann Lambertsen writes:
> On Fri, Nov 30, 2007 at 10:25:34AM -0800, Mark Mitchell wrote:
>> Rask Ingemann Lambertsen wrote:
>>
>> >>> I have a feeling it would be more robust to simulate the link tests
>> >>> inside the autoconf/libtool macros themselves as opposed to explicitly
>> >>> avoiding them in each and every configure.{ac,in}. Supply an option
>> >>> --simulate-link-tests=file_with_results with the default being no and be
>> >>> happy.
>
>> The advantage to the current setup is that you get a
>> loud failure, and have to go actually work out the right answer.
>
> That's the --cache-file option, except for clobbering the file. I'll see
> if I can arrange for the toplevel Makefile to copy a pre-made config.cache
> into target library build directories just before running configure. That
> ought to deal with all AC_FUNC(S) macros. That leaves just symbol versioning
> and AC_LIBTOOL_DLOPEN, which is manageable.

I've lost track of whether we're still talking about what to do for 4.3,
or whether we're talking about future directions. So: are we considering
this for 4.3, or for 4.4+?

Richard

From rsandifo@nildram.co.uk Sat Dec 1 09:55:00 2007
From: rsandifo@nildram.co.uk (Richard Sandiford)
Date: Sat, 01 Dec 2007 09:55:00 -0000
Subject: Link tests after GCC_NO_EXECUTABLES
In-Reply-To: <47505D76.4040207@codesourcery.com> (Mark Mitchell's message of "Fri\, 30 Nov 2007 10\:59\:02 -0800")
References: <474C8FA4.2040603@codesourcery.com> <474C95BA.1060807@t-online.de> <474C96C1.7010208@codesourcery.com> <474C98AA.50105@t-online.de> <474C9A65.2060902@codesourcery.com> <474C9B33.8060503@t-online.de> <474C9CBD.2070708@codesourcery.com> <87fxyqdc45.fsf@firetop.home> <474D943C.4030106@codesourcery.com> <877ik0aerh.fsf@firetop.home> <20071130022132.GL17368@sygehus.dk> <87sl2o6s1g.fsf@firetop.home> <47505D76.4040207@codesourcery.com>
Message-ID: <878x4eu4c8.fsf@firetop.home>

Mark Mitchell writes:
> Richard Sandiford wrote:
>> 2006-04-18 DJ Delorie
>>
>> * configure.in (m32c): Build libstdc++-v3. Pass flags to
>> reference libgloss so that libssp can be built in a combined
>> tree.
>> * configure: Regenerate.
>
>> Mark, DJ? Do you agree it's OK to drop that hunk?
>
> I'm not quite sure if you're asking for agreement to leave it in our
> sourcebase, or to remove it therefrom, so, unambiguously:

Yeah, sorry about that. And...

> I would prefer to revert DJ's change, for the same reason as the other
> changes under discussion, so that we're consistent across architectures.

...I was indeed asking whether I could remove that hunk from the source,
rather than restoring it to its original position.

Anyway, given that there have been objections to the patch generally,
I realise that the pre-approval is void.

Richard

From richard.guenther@gmail.com Sat Dec 1 10:21:00 2007
From: richard.guenther@gmail.com (Richard Guenther)
Date: Sat, 01 Dec 2007 10:21:00 -0000
Subject: problems compiling a program with -m32
In-Reply-To: <14102516.post@talk.nabble.com>
References: <14102516.post@talk.nabble.com>
Message-ID: <84fc9c000712010221l1160ada1jd3afa4e4e9e60719@mail.gmail.com>

On Dec 1, 2007 6:54 AM, darvish wrote:
>
> Hey, I'm running openSUSE 2.6.22.12-0.1-default with x86_64.
> I have several files: pizza.asm, driver.c and asm_io.o
> What I'm trying to do is to compile all these files to a single executable
> file pizza, I use the following command:
>
> nasm -g -f elf -d ELF_TYPE pizza.asm
> gcc -m32 -o pizza -g driver.c pizza.o asm_io.o
>
> and I get this whenever I execute the gcc line:
> /usr/lib64/gcc/x86_64-suse-linux/4.2.1/../../../../x86_64-suse-linux/bin/ld:
> skipping incompatible /usr/lib64/gcc/x86_64-suse-linux/4.2.1/libgcc.a when
> searching for -lgcc
> /usr/lib64/gcc/x86_64-suse-linux/4.2.1/../../../../x86_64-suse-linux/bin/ld:
> cannot find -lgcc
> collect2: ld returned 1 exit status
>
> I tried to compile it again without the -m32 parameter and I get:
>
> /usr/lib64/gcc/x86_64-suse-linux/4.2.1/../../../../x86_64-suse-linux/bin/ld:
> i386 architecture of input file `pizza.o' is incompatible with i386:x86-64
> output
> /usr/lib64/gcc/x86_64-suse-linux/4.2.1/../../../../x86_64-suse-linux/bin/ld:
> i386 architecture of input file `asm_io.o' is incompatible with i386:x86-64
> output
> collect2: ld returned 1 exit status
>
> I don't understand...what am I doing wrong and how can I get pizza.o and
> asm_io.o file to link with my 64bit architecture. I'd appreciate any help,
> thanks.

First, this is off-topic for this list; use gcc-help for this kind of
question. Second, you are probably missing the gcc-32bit package.

Richard.
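
Both linker errors quoted above come down to mixing 32-bit and 64-bit
objects: pizza.o and asm_io.o are i386 objects, while the only libgcc.a
the linker found is a 64-bit one. As a rough illustration, the sketch
below reads the EI_CLASS byte of an ELF header (offset 4; 1 means 32-bit,
2 means 64-bit) to tell the two apart. The function name elf_class and the
two fake header files are assumptions made for this example; the function
does not validate the ELF magic, and on a real system `file pizza.o`
reports the same information.

```shell
#!/bin/sh
# elf_class FILE: print "32", "64" or "unknown" from the EI_CLASS byte,
# which is the fifth byte (offset 4) of an ELF header.
# Hypothetical helper for illustration; it does not check the ELF magic.
elf_class() {
    class=$(od -An -tu1 -j4 -N1 "$1" 2>/dev/null | tr -d '[:space:]')
    case "$class" in
        1) echo 32 ;;
        2) echo 64 ;;
        *) echo unknown ;;
    esac
}

# Demonstrate on two fake five-byte headers so the script runs anywhere.
tmp=${TMPDIR:-/tmp}/elfclass.$$
printf '\177ELF\001' > "$tmp.32"   # magic followed by ELFCLASS32
printf '\177ELF\002' > "$tmp.64"   # magic followed by ELFCLASS64

class32=$(elf_class "$tmp.32")
class64=$(elf_class "$tmp.64")
echo "first file: $class32-bit, second file: $class64-bit"

rm -f "$tmp.32" "$tmp.64"
```

To use it on real objects, call the function on each input, for example
`elf_class pizza.o`. If the objects are 32-bit, the link needs -m32 plus
the 32-bit runtime libraries from the distribution.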

From rask@sygehus.dk Sat Dec 1 11:53:00 2007
From: rask@sygehus.dk (Rask Ingemann Lambertsen)
Date: Sat, 01 Dec 2007 11:53:00 -0000
Subject: Link tests after GCC_NO_EXECUTABLES
In-Reply-To: <87d4tqu4nv.fsf@firetop.home>
References: <474C9B33.8060503@t-online.de> <474C9CBD.2070708@codesourcery.com> <87fxyqdc45.fsf@firetop.home> <474D943C.4030106@codesourcery.com> <20071128210420.GH17368@sygehus.dk> <474DF7E4.6050308@codesourcery.com> <20071130181424.GO17368@sygehus.dk> <4750559E.2090800@codesourcery.com> <20071130211005.GQ17368@sygehus.dk> <87d4tqu4nv.fsf@firetop.home>
Message-ID: <20071201115252.GS17368@sygehus.dk>

On Sat, Dec 01, 2007 at 09:48:20AM +0000, Richard Sandiford wrote:
> Rask Ingemann Lambertsen writes:
> >
> > That's the --cache-file option, except for clobbering the file. I'll see
> > if I can arrange for the toplevel Makefile to copy a pre-made config.cache
> > into target library build directories just before running configure. That
> > ought to deal with all AC_FUNC(S) macros. That leaves just symbol versioning
> > and AC_LIBTOOL_DLOPEN, which is manageable.
>
> I've lost track of whether we're still talking about what to do for 4.3,
> or whether we're talking about future directions. So: are we considering
> this for 4.3, or for 4.4+?

I'll post a patch to implement the --cache-file trick just as soon as I
figure out why the $with_newlib variable is lost sometime before configuring
libgfortran, because it seems to basically work apart from that. Then we can
decide for 4.3 or 4.4.

--
Rask Ingemann Lambertsen
Danish law requires addresses in e-mail to be logged and stored for a year

From rask@sygehus.dk Sat Dec 1 12:03:00 2007
From: rask@sygehus.dk (Rask Ingemann Lambertsen)
Date: Sat, 01 Dec 2007 12:03:00 -0000
Subject: Link tests after GCC_NO_EXECUTABLES
In-Reply-To: <20071201115252.GS17368@sygehus.dk>
References: <474C9CBD.2070708@codesourcery.com> <87fxyqdc45.fsf@firetop.home> <474D943C.4030106@codesourcery.com> <20071128210420.GH17368@sygehus.dk> <474DF7E4.6050308@codesourcery.com> <20071130181424.GO17368@sygehus.dk> <4750559E.2090800@codesourcery.com> <20071130211005.GQ17368@sygehus.dk> <87d4tqu4nv.fsf@firetop.home> <20071201115252.GS17368@sygehus.dk>
Message-ID: <20071201120251.GT17368@sygehus.dk>

On Sat, Dec 01, 2007 at 12:52:52PM +0100, Rask Ingemann Lambertsen wrote:
>
> I'll post a patch to implement the --cache-file trick just as soon as I
> figure out why the $with_newlib variable is lost sometime before configuring
> libgfortran, because it seems to basically work apart from that. Then we can
> decide for 4.3 or 4.4.

Correction: $with_newlib seems to be completely unavailable in the toplevel
Makefile(.tpl). Any ideas?

--
Rask Ingemann Lambertsen
Danish law requires addresses in e-mail to be logged and stored for a year

From ebotcazou@adacore.com Sat Dec 1 12:04:00 2007
From: ebotcazou@adacore.com (Eric Botcazou)
Date: Sat, 01 Dec 2007 12:04:00 -0000
Subject: GNAT SVN trunk on PowerPC and SPARC
In-Reply-To: <47503628.3080709@oarcorp.com>
References: <474F3713.4030303@oarcorp.com> <200711300847.06729.ebotcazou@adacore.com> <47503628.3080709@oarcorp.com>
Message-ID: <200712011304.54535.ebotcazou@adacore.com>

> Unfortunately I am nearly 100% sure that
> it did break the run-time. Attached is a very
> simple Ada program that works correctly
> when the run-time is compiled at -O2 and
> breaks when the run-time is at -O0.

I cannot reproduce on x86/Linux, runtime compiled with -g. The ACATS and GNAT
testsuites are also clean.
Maybe it occurs only on non-DWARF2 platforms too.

--
Eric Botcazou

From schwab@suse.de Sat Dec 1 13:37:00 2007
From: schwab@suse.de (Andreas Schwab)
Date: Sat, 01 Dec 2007 13:37:00 -0000
Subject: Link tests after GCC_NO_EXECUTABLES
In-Reply-To: <20071201120251.GT17368@sygehus.dk> (Rask Ingemann Lambertsen's message of "Sat\, 1 Dec 2007 13\:02\:51 +0100")
References: <474C9CBD.2070708@codesourcery.com> <87fxyqdc45.fsf@firetop.home> <474D943C.4030106@codesourcery.com> <20071128210420.GH17368@sygehus.dk> <474DF7E4.6050308@codesourcery.com> <20071130181424.GO17368@sygehus.dk> <4750559E.2090800@codesourcery.com> <20071130211005.GQ17368@sygehus.dk> <87d4tqu4nv.fsf@firetop.home> <20071201115252.GS17368@sygehus.dk> <20071201120251.GT17368@sygehus.dk>
Message-ID:

Rask Ingemann Lambertsen writes:

> On Sat, Dec 01, 2007 at 12:52:52PM +0100, Rask Ingemann Lambertsen wrote:
>>
>> I'll post a patch to implement the --cache-file trick just as soon as I
>> figure out why the $with_newlib variable is lost sometime before configuring
>> libgfortran, because it seems to basically work apart from that. Then we can
>> decide for 4.3 or 4.4.
>
> Correction: $with_newlib seems to be completely unavailable in the toplevel
> Makefile(.tpl). Any ideas?

Only variables subject to AC_SUBST are available in the generated
Makefile. There is extra_host_args which includes --with-newlib, but
this is only passed to configure scripts in host directories (via
host_configargs), not target directories.

Andreas.

--
Andreas Schwab, SuSE Labs, schwab@suse.de
SuSE Linux Products GmbH, Maxfeldstraße 5, 90409 Nürnberg, Germany
PGP key fingerprint = 58CA 54C7 6D53 942B 1756 01D3 44D5 214B 8276 4ED5
"And now for something completely different."
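
To make the AC_SUBST point concrete, here is a minimal sketch (not taken
from the GCC tree) of the substitution step. It uses sed as a stand-in for
what config.status does when it turns a template into a Makefile: an
@name@ placeholder is replaced only if the variable was passed to
AC_SUBST, and any other placeholder is left as literal text. The template
lines, the placeholder name not_substituted and the value "yes" are
assumptions for illustration.

```shell
#!/bin/sh
# Two lines as they might appear in a Makefile template (hypothetical).
template='with_newlib = @with_newlib@
other = @not_substituted@'

# Stand-in for config.status: only the AC_SUBST'ed variable is replaced.
# The value "yes" is assumed for the example.
result=$(printf '%s\n' "$template" | sed 's|@with_newlib@|yes|g')

printf '%s\n' "$result"

first_line=$(printf '%s\n' "$result" | sed -n 1p)
second_line=$(printf '%s\n' "$result" | sed -n 2p)
```

The first line comes out as a usable make variable; the second still
contains the raw placeholder, which is why a shell variable that is set
in configure but never passed to AC_SUBST cannot be used in the Makefile.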

From rask@sygehus.dk Sat Dec 1 22:35:00 2007
From: rask@sygehus.dk (Rask Ingemann Lambertsen)
Date: Sat, 01 Dec 2007 22:35:00 -0000
Subject: Link tests after GCC_NO_EXECUTABLES
In-Reply-To:
References: <474D943C.4030106@codesourcery.com> <20071128210420.GH17368@sygehus.dk> <474DF7E4.6050308@codesourcery.com> <20071130181424.GO17368@sygehus.dk> <4750559E.2090800@codesourcery.com> <20071130211005.GQ17368@sygehus.dk> <87d4tqu4nv.fsf@firetop.home> <20071201115252.GS17368@sygehus.dk> <20071201120251.GT17368@sygehus.dk>
Message-ID: <20071201223447.GU17368@sygehus.dk>

On Sat, Dec 01, 2007 at 02:37:38PM +0100, Andreas Schwab wrote:
>
> Only variables subject to AC_SUBST are available in the generated
> Makefile. There is extra_host_args which includes --with-newlib, but
> this is only passed to configure scripts in host directories (via
> host_configargs), not target directories.

Thanks. I frequently configure using --with-newlib --without-newlib and
configure gets it right, so I won't just grep the configure args for
--with-newlib. Instead I use AC_SUBST to export with_newlib to the Makefile.

So, here is the patch to implement the config.cache file trick: Create a
config.cache file with all the right link test answers for newlib just
before running configure, in both Makefile.tpl and config-ml.in. This allows
sparc-unknown-elf to build libstdc++-v3 with unmodified
libstdc++-v3/configure.ac. Libgfortran's configure.ac needs just the symbol
versioning patch ported from libssp. And that's it!

The file config-newlib-linktests.cache is just an extract of the
config.cache files I had lying around, except for the last four entries that
I had to extract manually from libgfortran's config.log because they're not
saved to config.cache. It's reasonably straightforward to add entries as
needed.
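
For readers unfamiliar with the cache file format, the following is a
small sketch of how a line such as
ac_cv_func_mmap=${ac_cv_func_mmap=no} behaves when the shell reads it.
The ${var=value} expansion assigns the value only if the variable is
unset, so the cache supplies a default and an answer that is already set
takes precedence. The two variable names are taken from the cache file;
the pre-set "yes" for ac_cv_func_strtold is a hypothetical override added
for the demonstration.

```shell
#!/bin/sh
# Case 1: nothing is known yet, so the cached default is used.
unset ac_cv_func_mmap
ac_cv_func_mmap=${ac_cv_func_mmap=no}
echo "ac_cv_func_mmap is $ac_cv_func_mmap"        # default "no" applies

# Case 2: the variable already has a value (hypothetical override),
# so the cache line leaves it alone.
ac_cv_func_strtold=yes
ac_cv_func_strtold=${ac_cv_func_strtold=no}
echo "ac_cv_func_strtold is $ac_cv_func_strtold"  # stays "yes"
```

Because the expansion only fills in missing answers, adding an entry for
a new link test is a one-line change.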

Index: configure.ac
===================================================================
--- configure.ac (revision 130442)
+++ configure.ac (working copy)
 AC_SUBST(CONFIGURE_GDB_TK)
 AC_SUBST(GDB_TK)
 AC_SUBST(INSTALL_GDB_TK)
+AC_SUBST(with_newlib)
 # Build module lists & subconfigure args.
 AC_SUBST(build_configargs)
Index: Makefile.tpl
===================================================================
--- Makefile.tpl (revision 130442)
+++ Makefile.tpl (working copy)
@@ -394,6 +394,7 @@
 # ------------------------------------
 # Miscellaneous targets and flag lists
 # ------------------------------------
+with_newlib = @with_newlib@
 
 # The first rule in the file had better be this one. Don't put any above it.
 # This lives here to allow makefile fragments to contain dependencies.
@@ -814,7 +815,12 @@
    *) topdir=`echo [+subdir+]/[+module+]/ | \
       sed -e 's,\./,,g' -e 's,[^/]*/,../,g' `$(srcdir) ;; \
  esac; \
- srcdiroption="--srcdir=$${topdir}/[+module+]"; \
+ [+ IF check_multilibs
+ +]if test ! -f config.cache -a "x$(with_newlib)" = "xyes"; then \
+   echo "Using link test cache $${s}/config-newlib-linktests.cache."; \
+   cp $${s}/config-newlib-linktests.cache config.cache; \
+ fi; \
+ [+ ENDIF check_multilibs +]srcdiroption="--srcdir=$${topdir}/[+module+]"; \
  libsrcdir="$$s/[+module+]"; \
  [+ IF no-config-site +]rm -f no-such-file || : ; \
  CONFIG_SITE=no-such-file [+ ENDIF +]$(SHELL) $${libsrcdir}/configure \
Index: config-ml.in
===================================================================
--- config-ml.in (revision 130442)
+++ config-ml.in (working copy)
@@ -893,6 +893,10 @@
    fi
  fi
 
+ if test ! -f config.cache -a "x${with_newlib}" = "xyes"; then
+   echo "Using link test cache ${s}/config-newlib-linktests.cache."
+   cp ${s}/config-newlib-linktests.cache config.cache
+ fi
  if eval ${ml_config_env} ${ml_config_shell} ${ml_recprog} \
    --with-multisubdir=${ml_dir} --with-multisrctop=${multisrctop} \
    ${ac_configure_args} ${ml_config_env} ${ml_srcdiroption} ; then
Index: libgfortran/configure.ac
===================================================================
--- libgfortran/configure.ac (revision 130442)
+++ libgfortran/configure.ac (working copy)
@@ -144,7 +144,13 @@
 EOF
 save_LDFLAGS="$LDFLAGS"
 LDFLAGS="$LDFLAGS -fPIC -shared -Wl,--version-script,./conftest.map"
-AC_TRY_LINK([int foo;],[],[gfortran_use_symver=yes],[gfortran_use_symver=no])
+if test x$gcc_no_link = xyes; then
+  # If we cannot link, we cannot build shared libraries, so do not use
+  # symbol versioning.
+  gfortran_use_symver=no
+else
+  AC_TRY_LINK([int foo;],[],[gfortran_use_symver=yes],[gfortran_use_symver=no])
+fi
 LDFLAGS="$save_LDFLAGS"
 AC_MSG_RESULT($gfortran_use_symver)
 AM_CONDITIONAL(LIBGFOR_USE_SYMVER, [test "x$gfortran_use_symver" = xyes])
--- /dev/null 2007-11-25 19:34:41.836000250 +0100
+++ config-newlib-linktests.cache 2007-12-01 20:48:47.000000000 +0100
@@ -0,0 +1,298 @@
+ac_cv_func_accept=${ac_cv_func_accept=no}
+ac_cv_func_access=${ac_cv_func_access=no}
+ac_cv_func_alarm=${ac_cv_func_alarm=no}
+ac_cv_func_alloca_works=${ac_cv_func_alloca_works=yes}
+ac_cv_func_argz_append=${ac_cv_func_argz_append=yes}
+ac_cv_func_argz_create_sep=${ac_cv_func_argz_create_sep=yes}
+ac_cv_func_argz_insert=${ac_cv_func_argz_insert=yes}
+ac_cv_func_argz_next=${ac_cv_func_argz_next=yes}
+ac_cv_func_argz_stringify=${ac_cv_func_argz_stringify=yes}
+ac_cv_func_backtrace=${ac_cv_func_backtrace=no}
+ac_cv_func_backtrace_symbols=${ac_cv_func_backtrace_symbols=no}
+ac_cv_func_bind=${ac_cv_func_bind=no}
+ac_cv_func_chdir=${ac_cv_func_chdir=no}
+ac_cv_func_chsize=${ac_cv_func_chsize=no}
+ac_cv_func_clock=${ac_cv_func_clock=yes}
+ac_cv_func_close=${ac_cv_func_close=yes}
+ac_cv_func_closedir=${ac_cv_func_closedir=no}
+ac_cv_func_connect=${ac_cv_func_connect=no}
+ac_cv_func_ctime=${ac_cv_func_ctime=yes}
+ac_cv_func_dlopen=${ac_cv_func_dlopen=no}
+ac_cv_func_dup2=${ac_cv_func_dup2=no}
+ac_cv_func_dup=${ac_cv_func_dup=no}
+ac_cv_func__dyld_func_lookup=${ac_cv_func__dyld_func_lookup=no}
+ac_cv_func_epoll_create=${ac_cv_func_epoll_create=no}
+ac_cv_func_execl=${ac_cv_func_execl=no}
+ac_cv_func_execve=${ac_cv_func_execve=no}
+ac_cv_func_execvp=${ac_cv_func_execvp=no}
+ac_cv_func_fcntl=${ac_cv_func_fcntl=yes}
+ac_cv_func_fdopen=${ac_cv_func_fdopen=yes}
+ac_cv_func_fork=${ac_cv_func_fork=no}
+ac_cv_func_fork=${ac_cv_func_fork=yes}
+ac_cv_func_fp_enable=${ac_cv_func_fp_enable=no}
+ac_cv_func_fp_trap=${ac_cv_func_fp_trap=no}
+ac_cv_func_fstat=${ac_cv_func_fstat=yes}
+ac_cv_func_fsync=${ac_cv_func_fsync=no}
+ac_cv_func_ftruncate=${ac_cv_func_ftruncate=no}
+ac_cv_func_getcwd=${ac_cv_func_getcwd=no}
+ac_cv_func_gethostbyname_r=${ac_cv_func_gethostbyname_r=no}
+ac_cv_func_gethostname=${ac_cv_func_gethostname=no}
+ac_cv_func_getifaddrs=${ac_cv_func_getifaddrs=no}
+ac_cv_func_getloadavg=${ac_cv_func_getloadavg=no}
+ac_cv_func_getlogin=${ac_cv_func_getlogin=no}
+ac_cv_func_getpagesize=${ac_cv_func_getpagesize=no}
+ac_cv_func_getpeername=${ac_cv_func_getpeername=no}
+ac_cv_func_getpwuid=${ac_cv_func_getpwuid=no}
+ac_cv_func_getrlimit=${ac_cv_func_getrlimit=no}
+ac_cv_func_getrusage=${ac_cv_func_getrusage=no}
+ac_cv_func_getsockname=${ac_cv_func_getsockname=no}
+ac_cv_func_getsockopt=${ac_cv_func_getsockopt=no}
+ac_cv_func_gettimeofday=${ac_cv_func_gettimeofday=yes}
+ac_cv_func_htonl=${ac_cv_func_htonl=no}
+ac_cv_func_htons=${ac_cv_func_htons=no}
+ac_cv_func_inet_addr=${ac_cv_func_inet_addr=no}
+ac_cv_func_inet_aton=${ac_cv_func_inet_aton=no}
+ac_cv_func_inet_pton=${ac_cv_func_inet_pton=no}
+ac_cv_func_kevent=${ac_cv_func_kevent=no}
+ac_cv_func_kill=${ac_cv_func_kill=yes}
+ac_cv_func_kqueue=${ac_cv_func_kqueue=no}
+ac_cv_func_link=${ac_cv_func_link=yes}
+ac_cv_func_listen=${ac_cv_func_listen=no}
+ac_cv_func_localtime_r=${ac_cv_func_localtime_r=yes}
+ac_cv_func_lseek=${ac_cv_func_lseek=yes}
+ac_cv_func_lstat=${ac_cv_func_lstat=no}
+ac_cv_func_madvise=${ac_cv_func_madvise=no}
+ac_cv_func_memcpy=${ac_cv_func_memcpy=yes}
+ac_cv_func_memmove=${ac_cv_func_memmove=yes}
+ac_cv_func_memset=${ac_cv_func_memset=yes}
+ac_cv_func_mincore=${ac_cv_func_mincore=no}
+ac_cv_func_mkstemp=${ac_cv_func_mkstemp=yes}
+ac_cv_func_mktime=${ac_cv_func_mktime=yes}
+ac_cv_func_mmap=${ac_cv_func_mmap=no}
+ac_cv_func_mmap_anon=${ac_cv_func_mmap_anon=no}
+ac_cv_func_mmap_dev_zero=${ac_cv_func_mmap_dev_zero=no}
+ac_cv_func_mmap_file=${ac_cv_func_mmap_file=no}
+ac_cv_func_mmap_fixed_mapped=${ac_cv_func_mmap_fixed_mapped=no}
+ac_cv_func_msync=${ac_cv_func_msync=no}
+ac_cv_func_munmap=${ac_cv_func_munmap=no}
+ac_cv_func_open=${ac_cv_func_open=yes}
+ac_cv_func_opendir=${ac_cv_func_opendir=no}
+ac_cv_func_perror=${ac_cv_func_perror=yes}
+ac_cv_func_pipe=${ac_cv_func_pipe=no}
+ac_cv_func_read=${ac_cv_func_read=yes}
+ac_cv_func_readdir=${ac_cv_func_readdir=no}
+ac_cv_func_readdir_r=${ac_cv_func_readdir_r=no}
+ac_cv_func_readlink=${ac_cv_func_readlink=no}
+ac_cv_func_readv=${ac_cv_func_readv=no}
+ac_cv_func_recvfrom=${ac_cv_func_recvfrom=no}
+ac_cv_func_select=${ac_cv_func_select=no}
+ac_cv_func_send=${ac_cv_func_send=no}
+ac_cv_func_sendto=${ac_cv_func_sendto=no}
+ac_cv_func_setmode=${ac_cv_func_setmode=no}
+ac_cv_func_setsockopt=${ac_cv_func_setsockopt=no}
+ac_cv_func_shl_load=${ac_cv_func_shl_load=no}
+ac_cv_func_signal=${ac_cv_func_signal=yes}
+ac_cv_func_sleep=${ac_cv_func_sleep=no}
+ac_cv_func_snprintf=${ac_cv_func_snprintf=yes}
+ac_cv_func_socket=${ac_cv_func_socket=no}
+ac_cv_func_stat=${ac_cv_func_stat=yes}
+ac_cv_func_strcasestr=${ac_cv_func_strcasestr=yes}
+ac_cv_func_strchr=${ac_cv_func_strchr=yes}
+ac_cv_func_strcmp=${ac_cv_func_strcmp=yes}
+ac_cv_func_strerror=${ac_cv_func_strerror=yes}
+ac_cv_func_strerror_r=${ac_cv_func_strerror_r=yes}
+ac_cv_func_strncmp_works=${ac_cv_func_strncmp_works=no}
+ac_cv_func_strrchr=${ac_cv_func_strrchr=yes}
+ac_cv_func_strtof=${ac_cv_func_strtof=yes}
+ac_cv_func_strtold=${ac_cv_func_strtold=no}
+ac_cv_func_symlink=${ac_cv_func_symlink=no}
+ac_cv_func_sysconf=${ac_cv_func_sysconf=no}
+ac_cv_func_time=${ac_cv_func_time=yes}
+ac_cv_func_times=${ac_cv_func_times=yes}
+ac_cv_func_ttyname=${ac_cv_func_ttyname=no}
+ac_cv_func_vsnprintf=${ac_cv_func_vsnprintf=yes}
+ac_cv_func_wait=${ac_cv_func_wait=yes}
+ac_cv_func_which_gethostbyname_r=${ac_cv_func_which_gethostbyname_r=unknown}
+ac_cv_func_write=${ac_cv_func_write=yes}
+ac_cv_func_writev=${ac_cv_func_writev=no}
+ac_cv_lib_c_geteuid=${ac_cv_lib_c_geteuid=no}
+ac_cv_lib_c_getgid=${ac_cv_lib_c_getgid=no}
+ac_cv_lib_c_getpid=${ac_cv_lib_c_getpid=yes}
+ac_cv_lib_c_getppid=${ac_cv_lib_c_getppid=no}
+ac_cv_lib_c_getuid=${ac_cv_lib_c_getuid=no}
+ac_cv_lib_dld_dld_link=${ac_cv_lib_dld_dld_link=no}
+ac_cv_lib_dl_dlopen=${ac_cv_lib_dl_dlopen=no}
+ac_cv_lib_dld_shl_load=${ac_cv_lib_dld_shl_load=no}
+ac_cv_lib_m_acos=${ac_cv_lib_m_acos=yes}
+ac_cv_lib_m_acosf=${ac_cv_lib_m_acosf=yes}
+ac_cv_lib_m_acosh=${ac_cv_lib_m_acosh=yes}
+ac_cv_lib_m_acoshf=${ac_cv_lib_m_acoshf=yes}
+ac_cv_lib_m_acoshl=${ac_cv_lib_m_acoshl=no}
+ac_cv_lib_m_acosl=${ac_cv_lib_m_acosl=no}
+ac_cv_lib_magic_magic_open=${ac_cv_lib_magic_magic_open=no}
+ac_cv_lib_m_asin=${ac_cv_lib_m_asin=yes}
+ac_cv_lib_m_asinf=${ac_cv_lib_m_asinf=yes}
+ac_cv_lib_m_asinh=${ac_cv_lib_m_asinh=yes}
+ac_cv_lib_m_asinhf=${ac_cv_lib_m_asinhf=yes}
+ac_cv_lib_m_asinhl=${ac_cv_lib_m_asinhl=no}
+ac_cv_lib_m_asinl=${ac_cv_lib_m_asinl=no}
+ac_cv_lib_m_atan2=${ac_cv_lib_m_atan2=yes}
+ac_cv_lib_m_atan2f=${ac_cv_lib_m_atan2f=yes}
+ac_cv_lib_m_atan2l=${ac_cv_lib_m_atan2l=no}
+ac_cv_lib_m_atan=${ac_cv_lib_m_atan=yes}
+ac_cv_lib_m_atanf=${ac_cv_lib_m_atanf=yes}
+ac_cv_lib_m_atanh=${ac_cv_lib_m_atanh=yes}
+ac_cv_lib_m_atanhf=${ac_cv_lib_m_atanhf=yes}
+ac_cv_lib_m_atanhl=${ac_cv_lib_m_atanhl=no}
+ac_cv_lib_m_atanl=${ac_cv_lib_m_atanl=no}
+ac_cv_lib_m_cabs=${ac_cv_lib_m_cabs=yes}
+ac_cv_lib_m_cabsf=${ac_cv_lib_m_cabsf=yes}
+ac_cv_lib_m_cabsl=${ac_cv_lib_m_cabsl=no}
+ac_cv_lib_m_carg=${ac_cv_lib_m_carg=no}
+ac_cv_lib_m_cargf=${ac_cv_lib_m_cargf=no}
+ac_cv_lib_m_cargl=${ac_cv_lib_m_cargl=no}
+ac_cv_lib_m_ccos=${ac_cv_lib_m_ccos=no}
+ac_cv_lib_m_ccosf=${ac_cv_lib_m_ccosf=no}
+ac_cv_lib_m_ccosh=${ac_cv_lib_m_ccosh=no}
+ac_cv_lib_m_ccoshf=${ac_cv_lib_m_ccoshf=no}
+ac_cv_lib_m_ccoshl=${ac_cv_lib_m_ccoshl=no}
+ac_cv_lib_m_ccosl=${ac_cv_lib_m_ccosl=no}
+ac_cv_lib_m_ceil=${ac_cv_lib_m_ceil=yes}
+ac_cv_lib_m_ceilf=${ac_cv_lib_m_ceilf=yes}
+ac_cv_lib_m_ceill=${ac_cv_lib_m_ceill=no}
+ac_cv_lib_m_cexp=${ac_cv_lib_m_cexp=no}
+ac_cv_lib_m_cexpf=${ac_cv_lib_m_cexpf=no}
+ac_cv_lib_m_cexpl=${ac_cv_lib_m_cexpl=no}
+ac_cv_lib_m_clog10=${ac_cv_lib_m_clog10=no}
+ac_cv_lib_m_clog10f=${ac_cv_lib_m_clog10f=no}
+ac_cv_lib_m_clog10l=${ac_cv_lib_m_clog10l=no}
+ac_cv_lib_m___clog=${ac_cv_lib_m___clog=no}
+ac_cv_lib_m_clog=${ac_cv_lib_m_clog=no}
+ac_cv_lib_m_clogf=${ac_cv_lib_m_clogf=no}
+ac_cv_lib_m_clogl=${ac_cv_lib_m_clogl=no}
+ac_cv_lib_m_copysign=${ac_cv_lib_m_copysign=yes}
+ac_cv_lib_m_copysignf=${ac_cv_lib_m_copysignf=yes}
+ac_cv_lib_m_copysignl=${ac_cv_lib_m_copysignl=no}
+ac_cv_lib_m_cos=${ac_cv_lib_m_cos=yes}
+ac_cv_lib_m_cosf=${ac_cv_lib_m_cosf=yes}
+ac_cv_lib_m_cosh=${ac_cv_lib_m_cosh=yes}
+ac_cv_lib_m_coshf=${ac_cv_lib_m_coshf=yes}
+ac_cv_lib_m_coshl=${ac_cv_lib_m_coshl=no}
+ac_cv_lib_m_cosl=${ac_cv_lib_m_cosl=no}
+ac_cv_lib_m_cpow=${ac_cv_lib_m_cpow=no}
+ac_cv_lib_m_cpowf=${ac_cv_lib_m_cpowf=no}
+ac_cv_lib_m_cpowl=${ac_cv_lib_m_cpowl=no}
+ac_cv_lib_m_csin=${ac_cv_lib_m_csin=no}
+ac_cv_lib_m_csinf=${ac_cv_lib_m_csinf=no}
+ac_cv_lib_m_csinh=${ac_cv_lib_m_csinh=no}
+ac_cv_lib_m_csinhf=${ac_cv_lib_m_csinhf=no}
+ac_cv_lib_m_csinhl=${ac_cv_lib_m_csinhl=no}
+ac_cv_lib_m_csinl=${ac_cv_lib_m_csinl=no}
+ac_cv_lib_m_csqrt=${ac_cv_lib_m_csqrt=no}
+ac_cv_lib_m_csqrtf=${ac_cv_lib_m_csqrtf=no}
+ac_cv_lib_m_csqrtl=${ac_cv_lib_m_csqrtl=no}
+ac_cv_lib_m_ctan=${ac_cv_lib_m_ctan=no}
+ac_cv_lib_m_ctanf=${ac_cv_lib_m_ctanf=no}
+ac_cv_lib_m_ctanh=${ac_cv_lib_m_ctanh=no}
+ac_cv_lib_m_ctanhf=${ac_cv_lib_m_ctanhf=no}
+ac_cv_lib_m_ctanhl=${ac_cv_lib_m_ctanhl=no}
+ac_cv_lib_m_ctanl=${ac_cv_lib_m_ctanl=no}
+ac_cv_lib_m_erf=${ac_cv_lib_m_erf=yes}
+ac_cv_lib_m_erfc=${ac_cv_lib_m_erfc=yes}
+ac_cv_lib_m_erfcf=${ac_cv_lib_m_erfcf=yes}
+ac_cv_lib_m_erfcl=${ac_cv_lib_m_erfcl=no}
+ac_cv_lib_m_erff=${ac_cv_lib_m_erff=yes}
+ac_cv_lib_m_erfl=${ac_cv_lib_m_erfl=no}
+ac_cv_lib_m_exp=${ac_cv_lib_m_exp=yes}
+ac_cv_lib_m_expf=${ac_cv_lib_m_expf=yes}
+ac_cv_lib_m_expl=${ac_cv_lib_m_expl=no}
+ac_cv_lib_m_fabs=${ac_cv_lib_m_fabs=yes}
+ac_cv_lib_m_fabsf=${ac_cv_lib_m_fabsf=yes}
+ac_cv_lib_m_fabsl=${ac_cv_lib_m_fabsl=no}
+ac_cv_lib_m_feenableexcept=${ac_cv_lib_m_feenableexcept=no}
+ac_cv_lib_m_floor=${ac_cv_lib_m_floor=yes}
+ac_cv_lib_m_floorf=${ac_cv_lib_m_floorf=yes}
+ac_cv_lib_m_floorl=${ac_cv_lib_m_floorl=no}
+ac_cv_lib_m_fmod=${ac_cv_lib_m_fmod=yes}
+ac_cv_lib_m_fmodf=${ac_cv_lib_m_fmodf=yes}
+ac_cv_lib_m_fmodl=${ac_cv_lib_m_fmodl=no}
+ac_cv_lib_m_frexp=${ac_cv_lib_m_frexp=yes}
+ac_cv_lib_m_frexpf=${ac_cv_lib_m_frexpf=yes}
+ac_cv_lib_m_frexpl=${ac_cv_lib_m_frexpl=no}
+ac_cv_lib_m_hypot=${ac_cv_lib_m_hypot=yes}
+ac_cv_lib_m_hypotf=${ac_cv_lib_m_hypotf=yes}
+ac_cv_lib_m_hypotl=${ac_cv_lib_m_hypotl=no}
+ac_cv_lib_m_j0=${ac_cv_lib_m_j0=yes}
+ac_cv_lib_m_j0f=${ac_cv_lib_m_j0f=yes}
+ac_cv_lib_m_j0l=${ac_cv_lib_m_j0l=no}
+ac_cv_lib_m_j1=${ac_cv_lib_m_j1=yes}
+ac_cv_lib_m_j1f=${ac_cv_lib_m_j1f=yes}
+ac_cv_lib_m_j1l=${ac_cv_lib_m_j1l=no}
+ac_cv_lib_m_jn=${ac_cv_lib_m_jn=yes}
+ac_cv_lib_m_jnf=${ac_cv_lib_m_jnf=yes}
+ac_cv_lib_m_jnl=${ac_cv_lib_m_jnl=no}
+ac_cv_lib_m_ldexp=${ac_cv_lib_m_ldexp=yes}
+ac_cv_lib_m_ldexpf=${ac_cv_lib_m_ldexpf=yes}
+ac_cv_lib_m_ldexpl=${ac_cv_lib_m_ldexpl=no}
+ac_cv_lib_m_lgamma=${ac_cv_lib_m_lgamma=yes}
+ac_cv_lib_m_lgammaf=${ac_cv_lib_m_lgammaf=yes}
+ac_cv_lib_m_lgammal=${ac_cv_lib_m_lgammal=no}
+ac_cv_lib_m_llround=${ac_cv_lib_m_llround=no}
+ac_cv_lib_m_llroundf=${ac_cv_lib_m_llroundf=no}
+ac_cv_lib_m_llroundl=${ac_cv_lib_m_llroundl=no}
+ac_cv_lib_m_log10=${ac_cv_lib_m_log10=yes}
+ac_cv_lib_m_log10f=${ac_cv_lib_m_log10f=yes}
+ac_cv_lib_m_log10l=${ac_cv_lib_m_log10l=no}
+ac_cv_lib_m_log=${ac_cv_lib_m_log=yes}
+ac_cv_lib_m_logf=${ac_cv_lib_m_logf=yes}
+ac_cv_lib_m_logl=${ac_cv_lib_m_logl=no}
+ac_cv_lib_m_lround=${ac_cv_lib_m_lround=yes}
+ac_cv_lib_m_lroundf=${ac_cv_lib_m_lroundf=yes}
+ac_cv_lib_m_lroundl=${ac_cv_lib_m_lroundl=no}
+ac_cv_lib_m_nextafter=${ac_cv_lib_m_nextafter=yes}
+ac_cv_lib_m_nextafterf=${ac_cv_lib_m_nextafterf=yes}
+ac_cv_lib_m_nextafterl=${ac_cv_lib_m_nextafterl=no}
+ac_cv_lib_m_pow=${ac_cv_lib_m_pow=yes}
+ac_cv_lib_m_powf=${ac_cv_lib_m_powf=yes}
+ac_cv_lib_m_powl=${ac_cv_lib_m_powl=no}
+ac_cv_lib_m_round=${ac_cv_lib_m_round=yes}
+ac_cv_lib_m_roundf=${ac_cv_lib_m_roundf=yes}
+ac_cv_lib_m_roundl=${ac_cv_lib_m_roundl=no}
+ac_cv_lib_m_scalbn=${ac_cv_lib_m_scalbn=yes}
+ac_cv_lib_m_scalbnf=${ac_cv_lib_m_scalbnf=yes}
+ac_cv_lib_m_scalbnl=${ac_cv_lib_m_scalbnl=no}
+ac_cv_lib_m_sin=${ac_cv_lib_m_sin=yes}
+ac_cv_lib_m_sinf=${ac_cv_lib_m_sinf=yes}
+ac_cv_lib_m_sinh=${ac_cv_lib_m_sinh=yes}
+ac_cv_lib_m_sinhf=${ac_cv_lib_m_sinhf=yes}
+ac_cv_lib_m_sinhl=${ac_cv_lib_m_sinhl=no}
+ac_cv_lib_m_sinl=${ac_cv_lib_m_sinl=no}
+ac_cv_lib_m_sqrt=${ac_cv_lib_m_sqrt=yes}
+ac_cv_lib_m_sqrtf=${ac_cv_lib_m_sqrtf=yes}
+ac_cv_lib_m_sqrtl=${ac_cv_lib_m_sqrtl=no}
+ac_cv_lib_m_tan=${ac_cv_lib_m_tan=yes}
+ac_cv_lib_m_tanf=${ac_cv_lib_m_tanf=yes}
+ac_cv_lib_m_tanh=${ac_cv_lib_m_tanh=yes}
+ac_cv_lib_m_tanhf=${ac_cv_lib_m_tanhf=yes}
+ac_cv_lib_m_tanhl=${ac_cv_lib_m_tanhl=no}
+ac_cv_lib_m_tanl=${ac_cv_lib_m_tanl=no}
+ac_cv_lib_m_tgamma=${ac_cv_lib_m_tgamma=yes}
+ac_cv_lib_m_tgammaf=${ac_cv_lib_m_tgammaf=yes}
+ac_cv_lib_m_tgammal=${ac_cv_lib_m_tgammal=no}
+ac_cv_lib_m_trunc=${ac_cv_lib_m_trunc=yes} +ac_cv_lib_m_truncf=${ac_cv_lib_m_truncf=yes} +ac_cv_lib_m_truncl=${ac_cv_lib_m_truncl=no} +ac_cv_lib_m_y0=${ac_cv_lib_m_y0=yes} +ac_cv_lib_m_y0f=${ac_cv_lib_m_y0f=yes} +ac_cv_lib_m_y0l=${ac_cv_lib_m_y0l=no} +ac_cv_lib_m_y1=${ac_cv_lib_m_y1=yes} +ac_cv_lib_m_y1f=${ac_cv_lib_m_y1f=yes} +ac_cv_lib_m_y1l=${ac_cv_lib_m_y1l=no} +ac_cv_lib_m_yn=${ac_cv_lib_m_yn=yes} +ac_cv_lib_m_ynf=${ac_cv_lib_m_ynf=yes} +ac_cv_lib_m_ynl=${ac_cv_lib_m_ynl=no} +ac_cv_lib_svld_dlopen=${ac_cv_lib_svld_dlopen=no} +have_attribute_alias=${have_attribute_alias=yes} +have_fpsetmask=${have_fpsetmask=no} +have_mingw_snprintf=${have_mingw_snprintf=no} +have_sync_fetch_and_add=${have_sync_fetch_and_add=no} -- Rask Ingemann Lambertsen Danish law requires addresses in e-mail to be logged and stored for a year From dewar@adacore.com Sat Dec 1 23:32:00 2007 From: dewar@adacore.com (Robert Dewar) Date: Sat, 01 Dec 2007 23:32:00 -0000 Subject: volatile and R/M/W operations In-Reply-To: <2007-11-30-23-19-02+trackit+sam@rfc1149.net> References: <2007-11-30-23-19-02+trackit+sam@rfc1149.net> Message-ID: <4751EEDB.6070008@adacore.com> Samuel Tardieu wrote: > When looking at an Ada PR, I stumbled upon the equivalent of the > following C code: > > unsigned char x; > volatile unsigned char y; > > void f () > { > x |= 16; > y |= 32; > } > > With trunk/i686, the following code is generated (-O3 -fomit-frame-pointer): > > f: > movzbl y, %eax > orb $16, x > orl $32, %eax > movb %al, y > ret > > I cannot see a reason not to use "orb $32,y" here instead of a three > steps read/modify/write operation. Is this only a missed optimization? > (in which case I will open a PR) Are you sure it is an optimization, the timing on these things is very subtle. What evidence do you have that there is a missed optimization here? 
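For readers who want to reproduce the comparison discussed in this thread, here is a minimal C sketch of the two cases. The function `f` and the variables `x` and `y` come from the original report; `g` and `z` are hypothetical names added here so that the volatile and non-volatile variants can sit in one translation unit. It only sets up the experiment and makes no claim about which instruction sequence is faster.

```c
unsigned char x;
volatile unsigned char y;   /* volatile, as in the original report */
unsigned char z;            /* hypothetical non-volatile twin of y */

/* Function from the report: one plain and one volatile
   read-modify-write on a single bit. */
void f(void)
{
    x |= 16;
    y |= 32;
}

/* Same bit operations, but no volatile object is involved, so the
   compiler may choose any instruction sequence it likes. */
void g(void)
{
    x |= 16;
    z |= 32;
}
```

Compiling this file with `gcc -O3 -fomit-frame-pointer -S` on the target of interest and comparing the bodies of `f` and `g` in the generated assembly shows whether the volatile access is still split into separate load, or, and store instructions.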
From sam@rfc1149.net Sun Dec 2 08:59:00 2007 From: sam@rfc1149.net (Samuel Tardieu) Date: Sun, 02 Dec 2007 08:59:00 -0000 Subject: volatile and R/M/W operations In-Reply-To: <4751EEDB.6070008@adacore.com> References: <2007-11-30-23-19-02+trackit+sam@rfc1149.net> <4751EEDB.6070008@adacore.com> Message-ID: <2007-12-02-09-59-09+trackit+sam@rfc1149.net> On 1/12, Robert Dewar wrote: >> I cannot see a reason not to use "orb $32,y" here instead of a three >> steps read/modify/write operation. Is this only a missed optimization? >> (in which case I will open a PR) > > Are you sure it is an optimization, the timing on these things is > very subtle. What evidence do you have that there is a missed > optimization here? For this pattern (isolated setting of one bit in the middle of a byte at a random memory location), this is the best code on this platform AFAIK. As an evidence, if you mark neither variable as volatile GCC generates with -O3 -fomit-frame-pointer: f: orb $16, x orb $32, y ret And I sure expect that GCC didn't choose to generate worst code when I *removed* the volatile constraint :) From ERES@il.ibm.com Sun Dec 2 09:41:00 2007 From: ERES@il.ibm.com (Revital1 Eres) Date: Sun, 02 Dec 2007 09:41:00 -0000 Subject: [RFC] Cleaning up latch blocks Message-ID: Hello, SMS currently works only on single-basic-block loops. This simplifies the task of software pipelining.
PR34263 is an example where outof-ssa creates a non-empty latch block
for a single-basic-block loop and thus prevents SMS from being applied
to it. This issue was raised in the past
(http://gcc.gnu.org/ml/gcc-patches/2005-11/msg01971.html) but I am not
sure what the best approach to address it is.

Cleaning up those latch blocks for the purpose of restoring the
single-basic-block loop should be helpful in general (and not only from
the SMS perspective). So, we can address this inside SMS or
alternatively in outof-ssa as in the attached patch (which was
originally written by Andrew Pinski for PR19038 and rewritten by Mircea;
it also passes bootstrap and regtest on ppc64).

Thanks,
Revital

(See attached file: patch_empty_latch_2_12.txt)
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: patch_empty_latch_2_12.txt
URL:

From sam@rfc1149.net Sun Dec 2 10:05:00 2007
From: sam@rfc1149.net (Samuel Tardieu)
Date: Sun, 02 Dec 2007 10:05:00 -0000
Subject: Rant about ChangeLog entries and commit messages
Message-ID: <2007-12-02-11-05-39+trackit+sam@rfc1149.net>

As a recent committer to GCC, I am always surprised to see the content
of ChangeLog entries and commit messages.

I tend to find GCC ChangeLog entries useless. For example, the most
recent ChangeLog entry in gcc/ChangeLog is:

| 2007-11-30 Jan Hubicka
|
| * ggc-common.c (dump_ggc_loc_statistics): Reset ggc_force_collect
| flag.

How could a newcomer guess why the ggc_force_collect flag needs to be
reset? Jan posted a useful explanation on gcc-patches, but finding it
by searching the mailing-list is not practical and it is not coupled
with the checkin.

Let's look at the corresponding svn log message, which can be found
with "svn blame" if a particular line needs to be pinpointed:

| r130560 | hubicka | 2007-12-01 22:06:31 +0100 (Sat, 01 Dec 2007) | 4 lines
|
| * ggc-common.c (dump_ggc_loc_statistics): Reset ggc_force_collect
| flag.

Ok, same information is mirrored here, not useful.
Let's look at the change itself then:

| Index: gcc/ChangeLog
| ===================================================================
| --- gcc/ChangeLog (revision 130559)
| +++ gcc/ChangeLog (revision 130560)
| @@ -1,3 +1,8 @@
| +2007-11-30 Jan Hubicka
| +
| + * ggc-common.c (dump_ggc_loc_statistics): Reset ggc_force_collect
| + flag.
| +
| 2007-11-30 Seongbae Park
|
| PR rtl-optimization/34171
| Index: gcc/ggc-common.c
| ===================================================================
| --- gcc/ggc-common.c (revision 130559)
| +++ gcc/ggc-common.c (revision 130560)
| @@ -1018,5 +1018,6 @@
|    fprintf (stderr, "%-48s %10s %10s %10s %10s %10s\n",
|             "source location", "Garbage", "Freed", "Leak", "Overhead", "Times");
|    fprintf (stderr, "-------------------------------------------------------\n");
| +  ggc_force_collect = false;
|  #endif
|  }

Ok, still the same information because of the ChangeLog diff, and we
can see that the change shows that... ggc_force_collect is reset. Wow!

Now, compare that with Jan's message on the list:

| pre-ipa-mem-reports force GGC to be done at each invocation in order to
| collect data on live memory references, but forgets to reset the flag
| after done. This means that compiler continues GGC collecting and works
| slowly.

This *is* the information I would expect to be present somewhere in
GCC history: clear and detailed information on why the change was
necessary.

Sure, in some cases the checkin references a PR, but the PR often
contains information on what didn't work before the change and the
same information which is already repeated three times (ChangeLog,
svn log and svn diff).

Compare this to a typical commit in the Linux kernel:

| commit b1812582ba94b5f377d5d3cec7646cc17d84e733
| Author: Joachim Fenkes
| Date: Fri Nov 30 16:19:41 2007 -0800
|
| IB/ehca: Fix static rate if path faster than link
|
| The formula would yield -1 if the path is faster than the link, which
| is wrong in a bad way (max throttling).
| Clamp to 0, which is the
| correct value.
|
| Signed-off-by: Joachim Fenkes
| Signed-off-by: Roland Dreier
|
| diff --git a/drivers/infiniband/hw/ehca/ehca_av.c b/drivers/infiniband/hw/ehca/ehca_av.c
| index 453eb99..f7782c8 100644
| --- a/drivers/infiniband/hw/ehca/ehca_av.c
| +++ b/drivers/infiniband/hw/ehca/ehca_av.c
| @@ -76,8 +76,12 @@ int ehca_calc_ipd(struct ehca_shca *shca, int port,
|
|    link = ib_width_enum_to_int(pa.active_width) * pa.active_speed;
|
| -  /* IPD = round((link / path) - 1) */
| -  *ipd = ((link + (path >> 1)) / path) - 1;
| +  if (path >= link)
| +          /* no need to throttle if path faster than link */
| +          *ipd = 0;
| +  else
| +          /* IPD = round((link / path) - 1) */
| +          *ipd = ((link + (path >> 1)) / path) - 1;
|
|    return 0;
|  }

What do we have here? A one-line high-level description identifying
what has been done, a synthetic analysis of the problem cause and its
solution, and the audit trail with the Signed-off-by lines (which in
the Linux case is more important than in GCC as copyrights are not
assigned to one entity).

Linux doesn't use ChangeLog, but its history is much more useful to
developers and casual observers than GCC's. And it could be done for
GCC (with SVN) as well.

Maybe we should consider dropping ChangeLogs and using better checkin
messages.

Sam

From schwab@suse.de Sun Dec 2 10:23:00 2007
From: schwab@suse.de (Andreas Schwab)
Date: Sun, 02 Dec 2007 10:23:00 -0000
Subject: Rant about ChangeLog entries and commit messages
In-Reply-To: <2007-12-02-11-05-39+trackit+sam@rfc1149.net> (Samuel Tardieu's message of "Sun\, 02 Dec 2007 11\:05\:39 +0100")
References: <2007-12-02-11-05-39+trackit+sam@rfc1149.net>
Message-ID:

Samuel Tardieu writes:

> As a recent committer to GCC, I am always surprised to see the content
> of ChangeLog entries and commit messages.
>
> I tend to find GCC ChangeLog entries useless.
For example, the more > recent ChangeLog entry in gcc/ChangeLog is: > > | 2007-11-30 Jan Hubicka > | > | * ggc-common.c (dump_ggc_loc_statistics): Reset ggc_force_collect > | flag. > > How could a newcomer guess why the gcc_force_collect flag needs to be > reset? That is supposed to be written in a comment. The change log entry should describe _what_ is being changed, so that you can find out when a particular change was made. Andreas. -- Andreas Schwab, SuSE Labs, schwab@suse.de SuSE Linux Products GmbH, Maxfeldstraße 5, 90409 Nürnberg, Germany PGP key fingerprint = 58CA 54C7 6D53 942B 1756 01D3 44D5 214B 8276 4ED5 "And now for something completely different." From ebotcazou@libertysurf.fr Sun Dec 2 10:27:00 2007 From: ebotcazou@libertysurf.fr (Eric Botcazou) Date: Sun, 02 Dec 2007 10:27:00 -0000 Subject: Rant about ChangeLog entries and commit messages In-Reply-To: <2007-12-02-11-05-39+trackit+sam@rfc1149.net> References: <2007-12-02-11-05-39+trackit+sam@rfc1149.net> Message-ID: <200712021127.46198.ebotcazou@libertysurf.fr> > I tend to find GCC ChangeLog entries useless. For example, the more > > recent ChangeLog entry in gcc/ChangeLog is: > | 2007-11-30 Jan Hubicka > | > | * ggc-common.c (dump_ggc_loc_statistics): Reset ggc_force_collect > | flag. > > How could a newcomer guess why the gcc_force_collect flag needs to be > reset? He indeed cannot, but the ChangeLog is not meant to make it possible either. See http://gcc.gnu.org/contribute.html, especially the GNU Coding Standards. > Jan posted a useful explanation on gcc-patches, but finding it > by searching the mailing-list is not practical and it is not coupled > with the checkin. That's how it has always worked so it should be more or less practical. For PRs, there is a link (URL: field), maybe we should use PRs more often.
-- Eric Botcazou From sam@rfc1149.net Sun Dec 2 10:43:00 2007 From: sam@rfc1149.net (Samuel Tardieu) Date: Sun, 02 Dec 2007 10:43:00 -0000 Subject: Rant about ChangeLog entries and commit messages In-Reply-To: <200712021127.46198.ebotcazou@libertysurf.fr> References: <2007-12-02-11-05-39+trackit+sam@rfc1149.net> <200712021127.46198.ebotcazou@libertysurf.fr> Message-ID: <2007-12-02-11-43-44+trackit+sam@rfc1149.net> On 2/12, Eric Botcazou wrote: | He indeed cannot, but the ChangeLog is not meant to make it possible either. | See http://gcc.gnu.org/contribute.html, especially the GNU Coding Standards. I know this document and I think the part on ChangeLog doesn't achieve its purpose: http://www.gnu.org/prep/standards/standards.html#Change-Logs Keep a change log to describe all the changes made to program source files. The purpose of this is so that people investigating bugs in the future will know about the changes that might have introduced the bug. Often a new bug can be found by looking at what was recently changed. More importantly, change logs can help you eliminate conceptual inconsistencies between different parts of a program, by giving you a history of how the conflicting concepts arose and who they came from. This is precisely why I am proposing an evolution in the current process. Also, this document states: There's no need to describe the full purpose of the changes or how they work together. If you think that a change calls for explanation, you're probably right. Please do explain it, but please put the explanation in comments in the code, where people will see it whenever they see the code. When you fix a bug by changing a constant (for example if there has been an offset by one error or, as I did a few minutes ago in config/sh/sh.md, there was an error in the argument to consider), this doesn't always mandate a comment in the code.
For example, I think a description such as the one I wrote when describing the problem cmpgeusi_t splitting code compares operand 0 to 0, while this constant value can only be in operand 1. When compiling the Ada runtime, this leads to a "cmp/hs #0,r7" instruction which is not valid as "cmp/hs" operands must be two registers. along with the above change would have been a better commit message than just gcc/ * config/sh/sh.md (cmpgeusi_t): Fix condition. which I used as suggested. | That's how it has always worked so it should be more or less practical. Sure, it works. But this is not a reason not to improve the process. | For PRs, there is a link (URL: field), maybe we should use PRs more often. This field is useful to look at the discussion that led to the change, but PRs often contain no synthetic information on the analysis of the problem unless when the PR submitter sends a patch himself (in which case he often includes his analysis to get a better chance to get his patch checked in). From ebotcazou@libertysurf.fr Sun Dec 2 10:50:00 2007 From: ebotcazou@libertysurf.fr (Eric Botcazou) Date: Sun, 02 Dec 2007 10:50:00 -0000 Subject: Rant about ChangeLog entries and commit messages In-Reply-To: <2007-12-02-11-43-44+trackit+sam@rfc1149.net> References: <2007-12-02-11-05-39+trackit+sam@rfc1149.net> <200712021127.46198.ebotcazou@libertysurf.fr> <2007-12-02-11-43-44+trackit+sam@rfc1149.net> Message-ID: <200712021151.08971.ebotcazou@libertysurf.fr> > I know this document and I think the part on ChangeLog doesn't achieve > its purpose: > > http://www.gnu.org/prep/standards/standards.html#Change-Logs > > Keep a change log to describe all the changes made to program source > files. The purpose of this is so that people investigating bugs in the > future will know about the changes that might have introduced the bug. > Often a new bug can be found by looking at what was recently changed. 
> More importantly, change logs can help you eliminate conceptual > inconsistencies between different parts of a program, by giving you a > history of how the conflicting concepts arose and who they came from. Could you elaborate? > When you fix a bug by changing a constant (for example if there has been > an offset by one error or, as I did a few minutes ago in > config/sh/sh.md, there was an error in the argument to consider), this > doesn't always mandate a comment in the code. For example, I think a > description such as the one I wrote when describing the problem > > cmpgeusi_t splitting code compares operand 0 to 0, while this constant > value can only be in operand 1. When compiling the Ada runtime, this > leads to a "cmp/hs #0,r7" instruction which is not valid as "cmp/hs" > operands must be two registers. > > along with the above change would have been a better commit message than > just > > gcc/ > * config/sh/sh.md (cmpgeusi_t): Fix condition. > > which I used as suggested. Not really in my opinion, it's a trivial fix and totally unrelated to Ada in itself, "Fix typo" or "Fix obvious mistake" would have been just fine too. -- Eric Botcazou From sam@rfc1149.net Sun Dec 2 10:52:00 2007 From: sam@rfc1149.net (Samuel Tardieu) Date: Sun, 02 Dec 2007 10:52:00 -0000 Subject: Rant about ChangeLog entries and commit messages In-Reply-To: References: <2007-12-02-11-05-39+trackit+sam@rfc1149.net> Message-ID: <2007-12-02-11-52-03+trackit+sam@rfc1149.net> On 2/12, Andreas Schwab wrote: | That is supposed to be written in a comment. The change log entry | should describe _what_ is being changed, so that you can find out when a | particular change was made. This should be the job of the VCS, e.g. "svn log" and "svn blame". Moreover, ChangeLogs are organized by directories. You have to look at the "svn log" to see if a test corresponds to a code change and identify it. 
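The arithmetic behind the Linux commit quoted earlier in this thread can be checked in isolation. The sketch below copies the rounding expression from the quoted diff into two small functions; `calc_ipd_old` and `calc_ipd_new` are hypothetical names chosen for illustration, not kernel functions, and the inputs are assumed to be positive integers in arbitrary units.

```c
/* Pre-patch expression from the quoted diff:
   IPD = round((link / path) - 1), using integer division. */
int calc_ipd_old(int link, int path)
{
    return ((link + (path >> 1)) / path) - 1;
}

/* Post-patch behavior from the quoted diff: no throttling (0) when
   the path is at least as fast as the link, otherwise the same
   rounding expression as before. */
int calc_ipd_new(int link, int path)
{
    if (path >= link)
        return 0;
    return ((link + (path >> 1)) / path) - 1;
}
```

With link = 10 and path = 40, the old expression evaluates to (10 + 20) / 40 - 1 = -1, while the patched version returns 0. With link = 40 and path = 10, both return (40 + 5) / 10 - 1 = 3. This is the kind of reasoning the commit message records and the bare diff does not.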
From sam@rfc1149.net Sun Dec 2 11:14:00 2007 From: sam@rfc1149.net (Samuel Tardieu) Date: Sun, 02 Dec 2007 11:14:00 -0000 Subject: Rant about ChangeLog entries and commit messages In-Reply-To: <200712021151.08971.ebotcazou@libertysurf.fr> References: <2007-12-02-11-05-39+trackit+sam@rfc1149.net> <200712021127.46198.ebotcazou@libertysurf.fr> <2007-12-02-11-43-44+trackit+sam@rfc1149.net> <200712021151.08971.ebotcazou@libertysurf.fr> Message-ID: <2007-12-02-12-14-00+trackit+sam@rfc1149.net> On 2/12, Eric Botcazou wrote: | > I know this document and I think the part on ChangeLog doesn't achieve | > its purpose: | > | > http://www.gnu.org/prep/standards/standards.html#Change-Logs | > | > Keep a change log to describe all the changes made to program source | > files. The purpose of this is so that people investigating bugs in the | > future will know about the changes that might have introduced the bug. | > Often a new bug can be found by looking at what was recently changed. | > More importantly, change logs can help you eliminate conceptual | > inconsistencies between different parts of a program, by giving you a | > history of how the conflicting concepts arose and who they came from. | | Could you elaborate? I'll take an example from one of your recent changes in gcc/ChangeLog: 2007-11-19 Eric Botcazou * stor-layout.c (lang_adjust_rli): Delete. (set_lang_adjust_rli): Likewise. (layout_type): Do not call lang_adjust_rli hook. * tree.h (set_lang_adjust_rli): Delete. Without digging in the mailing-list archives to see why you made the change, if something new breaks on a STABS platform I will have no hint that this change was in any way related to STABS. If we didn't use ChangeLogs and if commit logs contained the information you gave on the mailing-list, this would be much easier: The compiler has been broken on STABS platform since mapped locations were enabled by default. The Ada front-end is emitting debug info too early. 
The patch also reorganizes a little the front-end's initialization and gets rid of dead code in the process, which in turn enables a further cleanup in the middle-end. Also note that the ChangeLog doesn't give any hint that changes in the ada directory have been made at the same time, only "svn log" reveals that. So far for the "The purpose of this is so that people investigating bugs in the future will know about the changes that might have introduced the bug." sentence. I would prefer that information which is deemed necessary for peer-review when a patch is sent to gcc-patches@ is also included in GCC log history. | > [sh.md fix] | | Not really in my opinion, it's a trivial fix and totally unrelated to Ada in | itself, "Fix typo" or "Fix obvious mistake" would have been just fine too. Well, I find it useful to know which part of the compiler has exercized this code path (as obviously there was no test associated with this optimization) and uncovered the bug, but I agree that this was an obvious typo fix. From ebotcazou@libertysurf.fr Sun Dec 2 11:32:00 2007 From: ebotcazou@libertysurf.fr (Eric Botcazou) Date: Sun, 02 Dec 2007 11:32:00 -0000 Subject: Rant about ChangeLog entries and commit messages In-Reply-To: <2007-12-02-12-14-00+trackit+sam@rfc1149.net> References: <2007-12-02-11-05-39+trackit+sam@rfc1149.net> <200712021151.08971.ebotcazou@libertysurf.fr> <2007-12-02-12-14-00+trackit+sam@rfc1149.net> Message-ID: <200712021232.25974.ebotcazou@libertysurf.fr> > I'll take an example from one of your recent changes in gcc/ChangeLog: > > 2007-11-19 Eric Botcazou > > * stor-layout.c (lang_adjust_rli): Delete. > (set_lang_adjust_rli): Likewise. > (layout_type): Do not call lang_adjust_rli hook. > * tree.h (set_lang_adjust_rli): Delete. > > Without digging in the mailing-list archives to see why you made the > change, if something new breaks on a STABS platform I will have no hint > that this change was in any way related to STABS. 
But this change has nothing to do with STABS. :-) > Also note that the ChangeLog doesn't give any hint that changes in the > ada directory have been made at the same time, only "svn log" reveals > that. So far for the "The purpose of this is so that people investigating > bugs in the future will know about the changes that might have introduced > the bug." sentence. Well, any changes in the compiler can potentially introduce bugs elsewhere and I suppose that you aren't proposing to mention all the known dependencies in the commit message. -- Eric Botcazou From sam@rfc1149.net Sun Dec 2 11:43:00 2007 From: sam@rfc1149.net (Samuel Tardieu) Date: Sun, 02 Dec 2007 11:43:00 -0000 Subject: Rant about ChangeLog entries and commit messages In-Reply-To: (Andreas Schwab's message of "Sun\, 02 Dec 2007 11\:23\:30 +0100") References: <2007-12-02-11-05-39+trackit+sam@rfc1149.net> Message-ID: <2007-12-02-12-42-51+trackit+sam@rfc1149.net> Sam> How could a newcomer guess why the gcc_force_collect flag needs to Sam> be reset? Andreas> That is supposed to be written in a comment. The change log Andreas> entry should describe _what_ is being changed, so that you Andreas> can find out when a particular change was made. Out of curiosity, I looked for an example with one of your own commits in the Linux kernel tree and I found one: I think http://tinyurl.com/2j7lt7 is a very helpful explanation of the corresponding change (http://tinyurl.com/2tpw8l). Someone trying to fix a similar bug in another driver will benefit from having this message in the VCS history. A comment in the code would probably have been much shorter than this explanation and would probably not contain the "headphone", "line out" and "muted" words. 
Once again, I agree that the current mechanism works for GCC developers, but I think it could be much better if: 1- commit messages didn't duplicate ChangeLog entries (maybe by getting rid of ChangeLogs) 2- commit messages contained a synthetic information such as the one provided for peer-review I'm not trying to launch a revolution in the GCC development process, I'm only comparing two ways of documenting changes as they are committed and explaining why I find the Linux way of doing it more useful. As a side note, I know several (sick?) people (including myself) who casually read the Linux kernel RSS feed in their RSS aggregator and find it very insightful (if you exclude the "Merge" messages) while lighter than reading the whole linux-kernel mailing-list. People reading this can look at http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=atom with RSS-capable software to see what I mean. The GCC ChangeLogs, even when aggregated, aren't as nice to read when you're having breakfast :) Sam From sam@rfc1149.net Sun Dec 2 11:48:00 2007 From: sam@rfc1149.net (Samuel Tardieu) Date: Sun, 02 Dec 2007 11:48:00 -0000 Subject: Rant about ChangeLog entries and commit messages In-Reply-To: <200712021232.25974.ebotcazou@libertysurf.fr> (Eric Botcazou's message of "Sun\, 2 Dec 2007 12\:32\:25 +0100") References: <2007-12-02-11-05-39+trackit+sam@rfc1149.net> <200712021151.08971.ebotcazou@libertysurf.fr> <2007-12-02-12-14-00+trackit+sam@rfc1149.net> <200712021232.25974.ebotcazou@libertysurf.fr> Message-ID: <2007-12-02-12-48-12+trackit+sam@rfc1149.net> >>>>> "Eric" == Eric Botcazou writes: >> Without digging in the mailing-list archives to see why you made >> the change, if something new breaks on a STABS platform I will have >> no hint that this change was in any way related to STABS. Eric> But this change has nothing to do with STABS. 
:-) Sure, but as you explained yourself in the message I cited, the reason to do this change was because of a problem in STABS info generation :) Eric> Well, any changes in the compiler can potentially introduce bugs Eric> elsewhere and I suppose that you aren't proposing to mention all Eric> the known dependencies in the commit message. Of course not, but when such a dependency is the reason you make a change in the first place, I think it ought to be mentioned. If the change caused a regression on an obscure platform using some barely used debugging format, this could give a hint of where to look (STABS for example) to see how it is done there and if bogus assumptions were made about the previous behaviour. Sam -- Samuel Tardieu -- sam@rfc1149.net -- http://www.rfc1149.net/ From hp@bitrange.com Sun Dec 2 12:04:00 2007 From: hp@bitrange.com (Hans-Peter Nilsson) Date: Sun, 02 Dec 2007 12:04:00 -0000 Subject: Rant about ChangeLog entries and commit messages In-Reply-To: <2007-12-02-11-52-03+trackit+sam@rfc1149.net> References: <2007-12-02-11-05-39+trackit+sam@rfc1149.net> <2007-12-02-11-52-03+trackit+sam@rfc1149.net> Message-ID: <20071202065558.S8303@dair.pair.com> On Sun, 2 Dec 2007, Samuel Tardieu wrote: > On 2/12, Andreas Schwab wrote: > > | That is supposed to be written in a comment. The change log entry > | should describe _what_ is being changed, so that you can find out when a > | particular change was made. > > This should be the job of the VCS, e.g. "svn log" and "svn blame". > Moreover, ChangeLogs are organized by directories. You have to look > at the "svn log" to see if a test corresponds to a code change and > identify it. The comment *in the code* is lacking, other than that I don't see much point in your rant; it's all been said before, for one. It usually takes a while for newcomers to understand the process, in particular the what-not-why of ChangeLogs...
;) brgds, H-P From ebotcazou@libertysurf.fr Sun Dec 2 12:25:00 2007 From: ebotcazou@libertysurf.fr (Eric Botcazou) Date: Sun, 02 Dec 2007 12:25:00 -0000 Subject: Rant about ChangeLog entries and commit messages In-Reply-To: <2007-12-02-12-48-12+trackit+sam@rfc1149.net> References: <2007-12-02-11-05-39+trackit+sam@rfc1149.net> <200712021232.25974.ebotcazou@libertysurf.fr> <2007-12-02-12-48-12+trackit+sam@rfc1149.net> Message-ID: <200712021325.30463.ebotcazou@libertysurf.fr> > Sure, but as you explained yourself in the message I cited, the reason > to do this change was because of a problem in STABS info generation :) "reason" is not quite appropriate in this case, "occasion" is more accurate. -- Eric Botcazou From kenner@vlsi1.ultra.nyu.edu Sun Dec 2 12:28:00 2007 From: kenner@vlsi1.ultra.nyu.edu (Richard Kenner) Date: Sun, 02 Dec 2007 12:28:00 -0000 Subject: Rant about ChangeLog entries and commit messages In-Reply-To: References: <2007-12-02-11-05-39+trackit+sam@rfc1149.net> Message-ID: <10712021228.AA23790@vlsi1.ultra.nyu.edu> > > How could a newcomer guess why the gcc_force_collect flag needs to be > > reset? > > That is supposed to be written in a comment. The change log entry > should describe _what_ is being changed, so that you can find out when a > particular change was made. Not quite. The comments are supposed to say why the code is doing what it's doing (and, where it's helpful, why it ISN'T doing something else). Purely historical references in the comments that don't serve to clarify the present code are discouraged. (We don't want comments turning in a blog, for example.) I view the description in the gcc-patches message as PART of the CM history of GCC in that IT'S the place to go to get this information. What's unfortunate, I think, is that there's no easy way to find this message from the CM revision number. 
From bernds_cb1@t-online.de Sun Dec 2 12:43:00 2007 From: bernds_cb1@t-online.de (Bernd Schmidt) Date: Sun, 02 Dec 2007 12:43:00 -0000 Subject: Rant about ChangeLog entries and commit messages In-Reply-To: <10712021228.AA23790@vlsi1.ultra.nyu.edu> References: <2007-12-02-11-05-39+trackit+sam@rfc1149.net> <10712021228.AA23790@vlsi1.ultra.nyu.edu> Message-ID: <4752A817.7000206@t-online.de> Richard Kenner wrote: >>> How could a newcomer guess why the gcc_force_collect flag needs to be >>> reset? >> That is supposed to be written in a comment. The change log entry >> should describe _what_ is being changed, so that you can find out when a >> particular change was made. > > Not quite. The comments are supposed to say why the code is doing what > it's doing (and, where it's helpful, why it ISN'T doing something else). > Purely historical references in the comments that don't serve to clarify > the present code are discouraged. (We don't want comments turning in a > blog, for example.) > > I view the description in the gcc-patches message as PART of the CM history > of GCC in that IT'S the place to go to get this information. What's > unfortunate, I think, is that there's no easy way to find this message from > the CM revision number. I think that's Samuel's point - it would be much better to have them in the commit log. FWIW, I agree completely - I've never found ChangeLogs useful, I hate writing them, and I think the linux-kernel guys these days generally have much better checkin messages than we do. Bernd -- This footer brought to you by insane German lawmakers. Analog Devices GmbH Wilhelm-Wagenfeld-Str. 6 80807 Muenchen Sitz der Gesellschaft Muenchen, Registergericht Muenchen HRB 40368 Geschaeftsfuehrer Thomas Wessel, William A. 
Martin, Margaret Seif From ebotcazou@libertysurf.fr Sun Dec 2 13:05:00 2007 From: ebotcazou@libertysurf.fr (Eric Botcazou) Date: Sun, 02 Dec 2007 13:05:00 -0000 Subject: Rant about ChangeLog entries and commit messages In-Reply-To: <4752A817.7000206@t-online.de> References: <2007-12-02-11-05-39+trackit+sam@rfc1149.net> <10712021228.AA23790@vlsi1.ultra.nyu.edu> <4752A817.7000206@t-online.de> Message-ID: <200712021405.16103.ebotcazou@libertysurf.fr> > FWIW, I agree completely - I've never found ChangeLogs useful, I hate > writing them, and I think the linux-kernel guys these days generally have > much better checkin messages than we do. I guess nobody really loves writing ChangeLog entries, but in my opinion they are quite effective "executive summaries" for the patches and helpful to the reader/reviewer. Please let's not throw the baby out with the bathwater. -- Eric Botcazou From kiesling@earthlink.net Sun Dec 2 14:27:00 2007 From: kiesling@earthlink.net (Robert Kiesling) Date: Sun, 02 Dec 2007 14:27:00 -0000 Subject: Rant about ChangeLog entries and commit messages In-Reply-To: <200712021405.16103.ebotcazou@libertysurf.fr> Message-ID: > > FWIW, I agree completely - I've never found ChangeLogs useful, I hate > > writing them, and I think the linux-kernel guys these days generally have > > much better checkin messages than we do. > > I guess nobody really loves writing ChangeLog entries, but in my opinion they > are quite effective "executive summaries" for the patches and helpful to the > reader/reviewer. Please let's not throw the baby out with the bathwater. If there's a mechanism to filter checkin messages to ChangeLog summaries, I would be happy to use it - in cases of multiple packages, especially, it's important to know what changes were made, when, and when the changes propagated through packages and releases, and where they got to, occasionally. Anybody know of a useful, built-in mechanism for this task?
-- Ctalk Home Page: http://ctalk-lang.sourceforge.net From andi@firstfloor.org Sun Dec 2 18:33:00 2007 From: andi@firstfloor.org (Andi Kleen) Date: Sun, 02 Dec 2007 18:33:00 -0000 Subject: Rant about ChangeLog entries and commit messages In-Reply-To: <2007-12-02-11-05-39+trackit+sam@rfc1149.net.suse.lists.egcs> (Samuel Tardieu's message of "Sun\, 2 Dec 2007 10\:06\:16 +0000 \(UTC\)") References: <2007-12-02-11-05-39+trackit+sam@rfc1149.net.suse.lists.egcs> Message-ID: Samuel Tardieu writes: > recent ChangeLog entry in gcc/ChangeLog is: From my understanding the gcc changelogs serve two purposes these days: - Force the submitter to read (or rather speed read) the patch again before sending it out. - Serve as a "hash key" to search the gcc-patches archives for the real changelog. The second could be solved far better in many obvious ways, but it's unclear how to replace the first. Linux kernel doesn't have a solution for it. -Andi From dberlin@dberlin.org Sun Dec 2 20:23:00 2007 From: dberlin@dberlin.org (Daniel Berlin) Date: Sun, 02 Dec 2007 20:23:00 -0000 Subject: Rant about ChangeLog entries and commit messages In-Reply-To: <10712021228.AA23790@vlsi1.ultra.nyu.edu> References: <2007-12-02-11-05-39+trackit+sam@rfc1149.net> <10712021228.AA23790@vlsi1.ultra.nyu.edu> Message-ID: <4aca3dc20712021222j34c30fd2y1acd6354a507b621@mail.gmail.com> On 12/2/07, Richard Kenner wrote: > > > How could a newcomer guess why the gcc_force_collect flag needs to be > > > reset? > > > > That is supposed to be written in a comment. The change log entry > > should describe _what_ is being changed, so that you can find out when a > > particular change was made. > > Not quite. The comments are supposed to say why the code is doing what > it's doing (and, where it's helpful, why it ISN'T doing something else). > Purely historical references in the comments that don't serve to clarify > the present code are discouraged. (We don't want comments turning into a > blog, for example.) 
> > I view the description in the gcc-patches message as PART of the CM history > of GCC in that IT'S the place to go to get this information. What's > unfortunate, I think, is that there's no easy way to find this message from > the CM revision number. > Nothing stops people from putting URL's. However, I'd much rather see us put more detailed explanations in svn log or the ChangeLog than try to associate mailing list threads with commits. I'm certainly not going to hunt down the URL for every thread for every patch I commit. From dberlin@dberlin.org Sun Dec 2 20:28:00 2007 From: dberlin@dberlin.org (Daniel Berlin) Date: Sun, 02 Dec 2007 20:28:00 -0000 Subject: Rant about ChangeLog entries and commit messages In-Reply-To: <4752A817.7000206@t-online.de> References: <2007-12-02-11-05-39+trackit+sam@rfc1149.net> <10712021228.AA23790@vlsi1.ultra.nyu.edu> <4752A817.7000206@t-online.de> Message-ID: <4aca3dc20712021227l666309jf7da5c53e9c68352@mail.gmail.com> On 12/2/07, Bernd Schmidt wrote: > Richard Kenner wrote: > >>> How could a newcomer guess why the gcc_force_collect flag needs to be > >>> reset? > >> That is supposed to be written in a comment. The change log entry > >> should describe _what_ is being changed, so that you can find out when a > >> particular change was made. > > > > Not quite. The comments are supposed to say why the code is doing what > > it's doing (and, where it's helpful, why it ISN'T doing something else). > > Purely historical references in the comments that don't serve to clarify > > the present code are discouraged. (We don't want comments turning in a > > blog, for example.) > > > > I view the description in the gcc-patches message as PART of the CM history > > of GCC in that IT'S the place to go to get this information. What's > > unfortunate, I think, is that there's no easy way to find this message from > > the CM revision number. > > I think that's Samuel's point - it would be much better to have them in > the commit log. 
FWIW, I agree completely - I've never found ChangeLogs > useful, I hate writing them, and I think the linux-kernel guys these > days generally have much better checkin messages than we do. > +1. I have never, in 7 years of working on and debugging gcc, found the ChangeLog to be useful in debugging a problem. They are like a useless version of svn diff output. If I wanted to know what the change did, I'd look at what the change did. The ChangeLog is well nigh useless even for that compared to the actual diff. I'd go even further, and say if the GNU coding standards say we shouldn't be putting descriptions of why we are changing things in the ChangeLog, then they should be changed and should be ignored on this point until they do. Pointing to them as if they are The One True Way seems very suspect to me. After all, how else would they ever improve if nobody tries anything different? --Dan From ebotcazou@libertysurf.fr Sun Dec 2 20:36:00 2007 From: ebotcazou@libertysurf.fr (Eric Botcazou) Date: Sun, 02 Dec 2007 20:36:00 -0000 Subject: Rant about ChangeLog entries and commit messages In-Reply-To: <4aca3dc20712021227l666309jf7da5c53e9c68352@mail.gmail.com> References: <2007-12-02-11-05-39+trackit+sam@rfc1149.net> <4752A817.7000206@t-online.de> <4aca3dc20712021227l666309jf7da5c53e9c68352@mail.gmail.com> Message-ID: <200712022136.57819.ebotcazou@libertysurf.fr> > I'd go even further, and say if the GNU coding standards say we > shouldn't be putting descriptions of why we are changing things in the > ChangeLog, then they should be changed and should be ignored on this > point until they do. Pointing to them as if they are The One True > Way seems very suspect to me. After all, how else would they ever > improve if nobody tries anything different? The people who wrote them presumably thought about these issues, too. 
-- Eric Botcazou From dberlin@dberlin.org Sun Dec 2 20:40:00 2007 From: dberlin@dberlin.org (Daniel Berlin) Date: Sun, 02 Dec 2007 20:40:00 -0000 Subject: Rant about ChangeLog entries and commit messages In-Reply-To: <200712022136.57819.ebotcazou@libertysurf.fr> References: <2007-12-02-11-05-39+trackit+sam@rfc1149.net> <4752A817.7000206@t-online.de> <4aca3dc20712021227l666309jf7da5c53e9c68352@mail.gmail.com> <200712022136.57819.ebotcazou@libertysurf.fr> Message-ID: <4aca3dc20712021240k19f3eae5j66453276179c401a@mail.gmail.com> On 12/2/07, Eric Botcazou wrote: > > I'd go even further, and say if the GNU coding standards say we > > shouldn't be putting descriptions of why we are changing things in the > > ChangeLog, than they should be changed and should be ignored on this > > point until they do. Pointing to them as the if they are The One True > > Way seems very suspect to me. After all, how else would they ever > > improve if nobody tries anything different? > > The people who wrote them presumably thought about these issues, too. Right, because surely one size fits all projects and possibilities, and workflow and processes have certainly not changed since then. From tejgcc@westnet.com.au Sun Dec 2 20:52:00 2007 From: tejgcc@westnet.com.au (tim) Date: Sun, 02 Dec 2007 20:52:00 -0000 Subject: Rant about ChangeLog entries and commit messages In-Reply-To: References: Message-ID: <1196628738.6106.16.camel@tim-gcc> On Sun, 2007-12-02 at 09:26 -0500, Robert Kiesling wrote: > > I guess nobody really loves writing ChangeLog entries, but in my opinion there > > are quite effective "executive summaries" for the patches and helpful to the > > reader/reviewer. Please let's not throw the baby with the bath's water. 
> > If there's a mechanism to filter checkin messages to ChangeLog summaries, > I would be happy to use it - in cases of multiple packages, especially, it's > important to know what changes were made, when, and when the changes propagated > through packages and releases, and where they got to, occasionally. Anybody > know of a useful, built-in mechanism for this task? > Personally I find it slow and inefficient tracing through why a given change was made. It is just a slow process searching and sometimes I don't bother because it is so inconvenient. The ChangeLog entries provide little help and there does not seem to be a good alternative. If there is a good alternative no-one has said what it is so far. As people have pointed out, the RCSs pretty well cover the "what" these days. And writing changelog entries, which largely duplicate this information, is time-consuming and tedious. And they are of little to no value to me at least. The coding standards do allow, in some cases, that giving some context would be useful: > See also what the GNU Coding Standards have to say about what goes in > ChangeLogs; in particular, descriptions of the purpose of code and > changes should go in comments rather than the ChangeLog, though a > single line overall description of the changes may be useful above the > ChangeLog entry for a large batch of changes. I personally would strongly favour each ChangeLog entry having a single line of context. This could be the PR number or a single line giving the purpose of the change or what bigger change it is part of. As pointed out by Zach Weinberg in his paper "A Maintenance Programmer's View of GCC", there are many impediments to contributing to GCC. http://www.linux.org.uk/~ajh/gcc/gccsummit-2003-proceedings.pdf Things are not much better than they were when Zach wrote his paper. This small change would be one positive step in the right direction, IMHO. 
Tim Josling From tejgcc@westnet.com.au Sun Dec 2 20:59:00 2007 From: tejgcc@westnet.com.au (tim) Date: Sun, 02 Dec 2007 20:59:00 -0000 Subject: Rant about ChangeLog entries and commit messages In-Reply-To: <200712022136.57819.ebotcazou@libertysurf.fr> References: <2007-12-02-11-05-39+trackit+sam@rfc1149.net> <4752A817.7000206@t-online.de> <4aca3dc20712021227l666309jf7da5c53e9c68352@mail.gmail.com> <200712022136.57819.ebotcazou@libertysurf.fr> Message-ID: <1196629129.6106.21.camel@tim-gcc> On Sun, 2007-12-02 at 21:36 +0100, Eric Botcazou wrote: > > I'd go even further, and say if the GNU coding standards say we > > shouldn't be putting descriptions of why we are changing things in the > > ChangeLog, than they should be changed and should be ignored on this > > point until they do. Pointing to them as the if they are The One True > > Way seems very suspect to me. After all, how else would they ever > > improve if nobody tries anything different? > > The people who wrote them presumably thought about these issues, too. > Unfortunately they didn't document the "why" just the "what"! Tim Josling From mark@codesourcery.com Sun Dec 2 21:01:00 2007 From: mark@codesourcery.com (Mark Mitchell) Date: Sun, 02 Dec 2007 21:01:00 -0000 Subject: Link tests after GCC_NO_EXECUTABLES In-Reply-To: <878x4eu4c8.fsf@firetop.home> References: <474C8FA4.2040603@codesourcery.com> <474C95BA.1060807@t-online.de> <474C96C1.7010208@codesourcery.com> <474C98AA.50105@t-online.de> <474C9A65.2060902@codesourcery.com> <474C9B33.8060503@t-online.de> <474C9CBD.2070708@codesourcery.com> <87fxyqdc45.fsf@firetop.home> <474D943C.4030106@codesourcery.com> <877ik0aerh.fsf@firetop.home> <20071130022132.GL17368@sygehus.dk> <87sl2o6s1g.fsf@firetop.home> <47505D76.4040207@codesourcery.com> <878x4eu4c8.fsf@firetop.home> Message-ID: <47531D1D.8020805@codesourcery.com> Richard Sandiford wrote: > Anyway, given that there have been objections to the patch generally, > I realise that the pre-approval is void. 
I think there's no controversy over the libstdc++ change, so let's put that in. If nothing else, it makes the libstdc++ configury more self-consistent; if we decide to change the overall strategy, then we can do that all at once. Thanks, -- Mark Mitchell CodeSourcery mark@codesourcery.com (650) 331-3385 x713 From mark@codesourcery.com Sun Dec 2 21:11:00 2007 From: mark@codesourcery.com (Mark Mitchell) Date: Sun, 02 Dec 2007 21:11:00 -0000 Subject: Link tests after GCC_NO_EXECUTABLES In-Reply-To: <20071201223447.GU17368@sygehus.dk> References: <474D943C.4030106@codesourcery.com> <20071128210420.GH17368@sygehus.dk> <474DF7E4.6050308@codesourcery.com> <20071130181424.GO17368@sygehus.dk> <4750559E.2090800@codesourcery.com> <20071130211005.GQ17368@sygehus.dk> <87d4tqu4nv.fsf@firetop.home> <20071201115252.GS17368@sygehus.dk> <20071201120251.GT17368@sygehus.dk> <20071201223447.GU17368@sygehus.dk> Message-ID: <47531F54.6010802@codesourcery.com> Rask Ingemann Lambertsen wrote: > So, here is the patch to implement the config.cache file trick: Create a > config.cache file with all the right link test answers for newlib just > before running configure, in both Makefile.tpl and config-ml.in. This allows > sparc-unknown-elf to build libstdc++-v3 with unmodified > libstdc++-v3/configure.ac. Libgfortran's configure.ac needs just the symbol > versioning patch ported from libssp. And that's it! This trick seems plausible to me. Certainly, if it works, it would simplify development of configure scripts for run-time libraries. My only concern with this approach is that Newlib might not be entirely consistent across configurations and architectures. The libstdc++ approach presumably entails some manual verification of each function's presence or absence; before we claim that "Newlib has foo" someone verifies that. I don't know if this is a problem in practice. For example, these lines seem like things that might vary. > +have_fpsetmask=${have_fpsetmask=no} ... 
> +have_sync_fetch_and_add=${have_sync_fetch_and_add=no} I suppose we could solve that problem, if it arises, with different config.cache files for different targets. Perhaps it would be best to generalize this by adding a top-level --with-target-lib-cache= option, and then, if that's not present, and $with_newlib is set, passing in the Newlib cache that you have? That would give people a way to say that for their particular RTOS and/or C library the following functions are available. In theory, at least, we might also have differences between multilibs. It Would Be Nice to be moving GCC in the direction of allowing different multilibs for different operating systems and/or C libraries. So, I suppose the all-singing, all-dancing version of this would be some option that allows you to specify a cache file per multilib. But, I think that could be left for later. What do you and others think? -- Mark Mitchell CodeSourcery mark@codesourcery.com (650) 331-3385 x713 From ebotcazou@libertysurf.fr Sun Dec 2 22:55:00 2007 From: ebotcazou@libertysurf.fr (Eric Botcazou) Date: Sun, 02 Dec 2007 22:55:00 -0000 Subject: Rant about ChangeLog entries and commit messages In-Reply-To: <4aca3dc20712021240k19f3eae5j66453276179c401a@mail.gmail.com> References: <2007-12-02-11-05-39+trackit+sam@rfc1149.net> <200712022136.57819.ebotcazou@libertysurf.fr> <4aca3dc20712021240k19f3eae5j66453276179c401a@mail.gmail.com> Message-ID: <200712022355.23871.ebotcazou@libertysurf.fr> > Right, because surely one size fits all projects and possibilities, It was supposed to be the Coding Standards for The GNU Project. > and workflow and processes have certainly not changed since then. In my opinion, it's now easier to work around their perceived deficiencies. 
-- Eric Botcazou From bje@au1.ibm.com Sun Dec 2 23:03:00 2007 From: bje@au1.ibm.com (Ben Elliston) Date: Sun, 02 Dec 2007 23:03:00 -0000 Subject: Rant about ChangeLog entries and commit messages In-Reply-To: <2007-12-02-11-05-39+trackit+sam@rfc1149.net> References: <2007-12-02-11-05-39+trackit+sam@rfc1149.net> Message-ID: <1196636533.16908.5.camel@localhost> > This *is* the information I would expect to be present somewhere in > GCC history. A clear and detailed information on why the change was > necessary. Sure, in some case the checkin references a PR, but the PR > often contains information of what didn't work before the change and > the same information which is already repeated three times (ChangeLog, > svn log and svn diff). Keep in mind that the GNU coding standard introduced ChangeLogs before networked version control systems. In those days, you would receive a GCC release tarball with a ChangeLog. There was no way to do "svn log" or "svn diff" operations. Even in recent years, I have worked on GCC trees that were exported from the version control systems of other companies and that I did not have access to. In these situations, ChangeLogs are quite a bit more valuable. Having said that, I find the lack of rationale for some changes to be a bit irritating. I know that this should be done through code comments, but those are often made across the changeset and in different files. There is rarely a single summary of the need for the change. It would be nice to consider a practice similar to that used by NetBSD, which is to use a paragraph or so describing the need for the change (similar to what we do when we introduce a patch on gcc-patches) and inserting that comment into the svn commit message. 
Ben From bje@au1.ibm.com Sun Dec 2 23:05:00 2007 From: bje@au1.ibm.com (Ben Elliston) Date: Sun, 02 Dec 2007 23:05:00 -0000 Subject: Describing commercial support on our website In-Reply-To: <20071130175209.GA10692@synopsys.com> References: <6c33472e0711281608t37d0f9b6m71d5820d1765c766@mail.gmail.com> <20071129221317.GB20723@synopsys.com> <6c33472e0711300653t78654d75u483f72308cb5137a@mail.gmail.com> <20071130175209.GA10692@synopsys.com> Message-ID: <1196636697.16908.7.camel@localhost> > But the problem with ordering based on contributions is that people > will then fight over whether company A or company B has contributed > more; also, people who do their homework will know about, say, > CodeSourcery's role in GCC even if we sort the list by alphabetical > order. I'd rather avoid those kind of judgment calls. Indeed. Why do you think so many plumbers call their businesses AAA Plumbing? :-) Ben From dewar@adacore.com Sun Dec 2 23:28:00 2007 From: dewar@adacore.com (Robert Dewar) Date: Sun, 02 Dec 2007 23:28:00 -0000 Subject: Rant about ChangeLog entries and commit messages In-Reply-To: <20071202065558.S8303@dair.pair.com> References: <2007-12-02-11-05-39+trackit+sam@rfc1149.net> <2007-12-02-11-52-03+trackit+sam@rfc1149.net> <20071202065558.S8303@dair.pair.com> Message-ID: <47533F95.5080608@adacore.com> Hans-Peter Nilsson wrote: > The comment *in the code* is lacking, other than that I don't > see much point in your rant; it's all been said before, for one. > It usually takes a while for newcomers to understand the > process, in particular the what-not-why of ChangeLogs... ;) I actually think that often it is helpful to have the why in changelogs. Yes, this should never take the place of comments in the code, and that is a failing you want to watch out for, but sometimes a global motivation for a change would make a changelog entry easier to understand. 
If all the changelog entry says is something like (xyz): new function I don't see much point, since a diff can always easily tell you *what* was changed. > > brgds, H-P From dewar@adacore.com Sun Dec 2 23:29:00 2007 From: dewar@adacore.com (Robert Dewar) Date: Sun, 02 Dec 2007 23:29:00 -0000 Subject: Rant about ChangeLog entries and commit messages In-Reply-To: <200712021405.16103.ebotcazou@libertysurf.fr> References: <2007-12-02-11-05-39+trackit+sam@rfc1149.net> <10712021228.AA23790@vlsi1.ultra.nyu.edu> <4752A817.7000206@t-online.de> <200712021405.16103.ebotcazou@libertysurf.fr> Message-ID: <47533FCD.4030101@adacore.com> Eric Botcazou wrote: >> FWIW, I agree completely - I've never found ChangeLogs useful, I hate >> writing them, and I think the linux-kernel guys these days generally have >> much better checkin messages than we do. > > I guess nobody really loves writing ChangeLog entries, but in my opinion there > are quite effective "executive summaries" for the patches and helpful to the > reader/reviewer. Please let's not throw the baby with the bath's water. Right, but to me an important part of the "executive summary" is why you did it. You don't go to the boss and say "I fired Joe", you go and say "I fired Joe, because ..." > From joseph@codesourcery.com Sun Dec 2 23:31:00 2007 From: joseph@codesourcery.com (Joseph S. Myers) Date: Sun, 02 Dec 2007 23:31:00 -0000 Subject: Rant about ChangeLog entries and commit messages In-Reply-To: <4aca3dc20712021227l666309jf7da5c53e9c68352@mail.gmail.com> References: <2007-12-02-11-05-39+trackit+sam@rfc1149.net> <10712021228.AA23790@vlsi1.ultra.nyu.edu> <4752A817.7000206@t-online.de> <4aca3dc20712021227l666309jf7da5c53e9c68352@mail.gmail.com> Message-ID: On Sun, 2 Dec 2007, Daniel Berlin wrote: > I have never, in 7 years of working on and debugging gcc, found the > ChangeLog to be useful in debugging a problem. 
I find they are useful for finding what has changed in function X (or in functions matching pattern Y) since 4.1, say (given a bug in 4.1-based sources that might be fixed by a backport of a more recent patch, which has been traced to involve function X in some way). The key feature here of course is not that the logs do not contain "why", but that they do contain the names of all the functions changed (beyond purely mechanical "all callers changed" type changes) - and the function names can be stable even as the functions themselves move between source files. I think that part of the standards remains useful with logs with the more detailed "why" as used in the gcc/ada/ directory. -- Joseph S. Myers joseph@codesourcery.com From dewar@adacore.com Sun Dec 2 23:32:00 2007 From: dewar@adacore.com (Robert Dewar) Date: Sun, 02 Dec 2007 23:32:00 -0000 Subject: Rant about ChangeLog entries and commit messages In-Reply-To: <200712022136.57819.ebotcazou@libertysurf.fr> References: <2007-12-02-11-05-39+trackit+sam@rfc1149.net> <4752A817.7000206@t-online.de> <4aca3dc20712021227l666309jf7da5c53e9c68352@mail.gmail.com> <200712022136.57819.ebotcazou@libertysurf.fr> Message-ID: <47534081.50801@adacore.com> Eric Botcazou wrote: >> I'd go even further, and say if the GNU coding standards say we >> shouldn't be putting descriptions of why we are changing things in the >> ChangeLog, than they should be changed and should be ignored on this >> point until they do. Pointing to them as the if they are The One True >> Way seems very suspect to me. After all, how else would they ever >> improve if nobody tries anything different? > > The people who wrote them presumably thought about these issues, too. 
Maybe so, but I guess we only have a record of what they came up with and not why :-) :-) In the Ada revision histories, we have always given the what-and-the-why (and if necessary the why-not), and they have proved very helpful, I always found the RH's for gigi (done in the gcc style, much less helpful because they omitted the why). Of course you have to watch out for people forgetting that RH's never substitute for comments, but all patches are reviewed, and if that happens it gets fixed during the review process. > From drow@false.org Sun Dec 2 23:54:00 2007 From: drow@false.org (Daniel Jacobowitz) Date: Sun, 02 Dec 2007 23:54:00 -0000 Subject: Rant about ChangeLog entries and commit messages In-Reply-To: <1196636533.16908.5.camel@localhost> References: <2007-12-02-11-05-39+trackit+sam@rfc1149.net> <1196636533.16908.5.camel@localhost> Message-ID: <20071202235358.GA9446@caradoc.them.org> On Mon, Dec 03, 2007 at 10:02:13AM +1100, Ben Elliston wrote: > Having said that, I find the lack of rationale for some changes to be a > bit irritating. I know that this should be done through code comments, > but those are often made across the changeset and in different files. > There is rarely a single summary of the need for the change. It would > be nice to consider a practice similar to that used by NetBSD, which is > to use a paragraph or so describing the need for the change (similar to > what we do when we introduce a patch on gcc-patches) and inserting that > comment into the svn commit message. Or even into the ChangeLog... I've worked on other projects that did this. I found it incredibly helpful. 
-- Daniel Jacobowitz CodeSourcery From dewar@adacore.com Mon Dec 3 00:16:00 2007 From: dewar@adacore.com (Robert Dewar) Date: Mon, 03 Dec 2007 00:16:00 -0000 Subject: volatile and R/M/W operations In-Reply-To: <2007-12-02-09-59-09+trackit+sam@rfc1149.net> References: <2007-11-30-23-19-02+trackit+sam@rfc1149.net> <4751EEDB.6070008@adacore.com> <2007-12-02-09-59-09+trackit+sam@rfc1149.net> Message-ID: <47534ADB.3080502@adacore.com> Samuel Tardieu wrote: > For this pattern (isolated setting of one bit in the middle of a byte at > a random memory location), this is the best code on this platform AFAIK. > > As an evidence, if you mark neither variable as volatile GCC generates > with -O3 -fomit-frame-pointer: > > f: > orb $16, x > orb $32, y > ret > > And I sure expect that GCC didn't choose to generate worst code when > I *removed* the volatile constraint :) OK, sounds reasonable, but then I don't understand the logic behind avoiding this instruction sequence for the volatile case, this is two accesses at the bus level so what's the difference? I think on earlier pentiums these instructions were supposed to be avoided but of course this may have changed. From dberlin@dberlin.org Mon Dec 3 00:21:00 2007 From: dberlin@dberlin.org (Daniel Berlin) Date: Mon, 03 Dec 2007 00:21:00 -0000 Subject: Rant about ChangeLog entries and commit messages In-Reply-To: <200712022355.23871.ebotcazou@libertysurf.fr> References: <2007-12-02-11-05-39+trackit+sam@rfc1149.net> <200712022136.57819.ebotcazou@libertysurf.fr> <4aca3dc20712021240k19f3eae5j66453276179c401a@mail.gmail.com> <200712022355.23871.ebotcazou@libertysurf.fr> Message-ID: <4aca3dc20712021621n39a036d2u21f471f231dfffe@mail.gmail.com> On 12/2/07, Eric Botcazou wrote: > > Right, because surely one size fits all projects and possibilities, > > It was supposed to be the Coding Standards for The GNU Project. Sorry, but again, this is not a good enough justification to me. We do a lot of things different than "The GNU Project". 
So do plenty of parts of the "official GNU project". They use different coding standards, bug tracking systems, version control systems, checkin policies, etc, than each other. If you have a better justification, i'd love to hear it. From dj@redhat.com Mon Dec 3 00:30:00 2007 From: dj@redhat.com (DJ Delorie) Date: Mon, 03 Dec 2007 00:30:00 -0000 Subject: Rant about ChangeLog entries and commit messages In-Reply-To: <47533F95.5080608@adacore.com> References: <20071202065558.S8303@dair.pair.com> <47533F95.5080608@adacore.com> Message-ID: Robert Dewar writes: > I don't see much point, since a diff can always easily tell > you *what* was changed. A changelog does help recreate a change *set* though, since CVS is lacking such a thing. Thus, the CL helps you determine what files to diff. True that SVN solves part of that, but SVN is not universal. From kiesling@earthlink.net Mon Dec 3 02:41:00 2007 From: kiesling@earthlink.net (Robert Kiesling) Date: Mon, 03 Dec 2007 02:41:00 -0000 Subject: Rant about ChangeLog entries and commit messages In-Reply-To: <1196628738.6106.16.camel@tim-gcc> Message-ID: > On Sun, 2007-12-02 at 09:26 -0500, Robert Kiesling wrote: > > > I guess nobody really loves writing ChangeLog entries, but in my opinion there > > > are quite effective "executive summaries" for the patches and helpful to the > > > reader/reviewer. Please let's not throw the baby with the bath's water. > > > > If there's a mechanism to filter checkin messages to ChangeLog summaries, > > I would be happy to use it - in cases of multiple packages, especially, it's > > important to know what changes were made, when, and when the changes propagated > > through packages and releases, and where they got to, occasionally. Anybody > > know of a useful, built-in mechanism for this task? > > > > Personally I find it slow and inefficient tracing through why a given > change was made. It is just a slow process searching and sometimes I > don't bother because it is so inconvenient. 
The ChangeLog entries > provide little help and there does not seem to be a good alternative. If > there is a good alternative no-one has said what it is so far. > > As people have pointed out, the RCSs pretty well cover the "what" these > days. And writing changelog entries, which largely duplicate this > information, is time-consuming and tedious. And there are of of little > to no value to me at least. > > The coding standards do allow, in some cases, that giving some context > would be useful: > > > See also what the GNU Coding Standards have to say about what goes in > > ChangeLogs; in particular, descriptions of the purpose of code and > > changes should go in comments rather than the ChangeLog, though a > > single line overall description of the changes may be useful above the > > ChangeLog entry for a large batch of changes. > > I personally would strongly favour each ChangeLog entry having a single > line of context. This could be the PR number or a single line giving the > purpose of the change or what bigger change it is part of. > > As pointed out by Zach Weinberg in his paper "A Maintenance Programmer's > View of GCC", there are many impediments to contributing to GCC. > > http://www.linux.org.uk/~ajh/gcc/gccsummit-2003-proceedings.pdf > > Things are not much better than they were when Zach wrote his paper. This small change would be one positive step n the right direction, IMHO. One line of reference would be sufficient provided that branches other than the main development trunk stick to revisions in that branch only. I haven't glanced through the references yet, but maintenance programming is considerably different than writing new code, or even modifying someone else's code. If it's the latter you're trying to achieve, or anticipate achieving, then an accurate line of reference would be most helpful. Unfortunately, then, _someone_ has to maintain the comments accurately. I wouldn't care to say who (whom?)... just... someone. 
:) -- Ctalk Home Page: http://ctalk-lang.sourceforge.net From kenner@vlsi1.ultra.nyu.edu Mon Dec 3 03:55:00 2007 From: kenner@vlsi1.ultra.nyu.edu (Richard Kenner) Date: Mon, 03 Dec 2007 03:55:00 -0000 Subject: Rant about ChangeLog entries and commit messages In-Reply-To: <200712022136.57819.ebotcazou@libertysurf.fr> References: <2007-12-02-11-05-39+trackit+sam@rfc1149.net> <4752A817.7000206@t-online.de> <4aca3dc20712021227l666309jf7da5c53e9c68352@mail.gmail.com> <200712022136.57819.ebotcazou@libertysurf.fr> Message-ID: <10712030355.AA12237@vlsi1.ultra.nyu.edu> > > I'd go even further, and say if the GNU coding standards say we > > shouldn't be putting descriptions of why we are changing things in the > > ChangeLog, then they should be changed and should be ignored on this > > point until they do. Pointing to them as if they are The One True > > Way seems very suspect to me. After all, how else would they ever > > improve if nobody tries anything different? > > The people who wrote them presumably thought about these issues, too. My understanding is that the concern in going the other way was in having a ChangeLog that was too long to easily scan. Now yes, it's true that the concept of scanning a ChangeLog rather than a CM log is quite dated at this point, but that's a GNU coding standards issue, not a GCC issue and I don't think that trying to change that will produce much more than heat. I do, however, think that we have significant flexibility in the content of the svn commit message and could well decide that it's useful to do more than echo the ChangeLog entry, and instead include most of the text of the patch submission message. 
From kenner@vlsi1.ultra.nyu.edu Mon Dec 3 04:03:00 2007 From: kenner@vlsi1.ultra.nyu.edu (Richard Kenner) Date: Mon, 03 Dec 2007 04:03:00 -0000 Subject: Rant about ChangeLog entries and commit messages In-Reply-To: <1196636533.16908.5.camel@localhost> References: <2007-12-02-11-05-39+trackit+sam@rfc1149.net> <1196636533.16908.5.camel@localhost> Message-ID: <10712030403.AA12660@vlsi1.ultra.nyu.edu> > Having said that, I find the lack of rationale for some changes to be a > bit irritating. I know that this should be done through code comments, > but those are often made across the changeset and in different files. And it's often not appropriate to put the reason (or even nature) of the change in comments because you run the risk of creating what is essentially a useless monologue to somebody trying to understand the present code: /* Extract the two operands of the expression. Note that at one point this code had a typo and extracted the same operand twice. Paul tried to fix it on February 10, 1997, but actually introduced another bug where it again extracted the same operand twice, but this time the other operand. Jeff finally got it right on February 12, 1997 and that's the present code. */ op0 = TREE_OPERAND (t, 0); op1 = TREE_OPERAND (t, 1); That's not very useful! From kenner@vlsi1.ultra.nyu.edu Mon Dec 3 04:06:00 2007 From: kenner@vlsi1.ultra.nyu.edu (Richard Kenner) Date: Mon, 03 Dec 2007 04:06:00 -0000 Subject: Rant about ChangeLog entries and commit messages In-Reply-To: <47533F95.5080608@adacore.com> References: <2007-12-02-11-05-39+trackit+sam@rfc1149.net> <2007-12-02-11-52-03+trackit+sam@rfc1149.net> <20071202065558.S8303@dair.pair.com> <47533F95.5080608@adacore.com> Message-ID: <10712030406.AA12801@vlsi1.ultra.nyu.edu> > If all the changelog entry says is something like > > (xyz): new function > > I don't see much point, since a diff can always easily tell > you *what* was changed. 
The point is that, by just looking at the ChangeLog, you can tell when xyz was introduced and by whom. I've used that quite a number of times. Moreover, as was pointed out, when you have a source distribution, you don't get the commit logs, just the ChangeLog. From dberlin@dberlin.org Mon Dec 3 04:51:00 2007 From: dberlin@dberlin.org (Daniel Berlin) Date: Mon, 03 Dec 2007 04:51:00 -0000 Subject: Rant about ChangeLog entries and commit messages In-Reply-To: <10712030406.AA12801@vlsi1.ultra.nyu.edu> References: <2007-12-02-11-05-39+trackit+sam@rfc1149.net> <2007-12-02-11-52-03+trackit+sam@rfc1149.net> <20071202065558.S8303@dair.pair.com> <47533F95.5080608@adacore.com> <10712030406.AA12801@vlsi1.ultra.nyu.edu> Message-ID: <4aca3dc20712022051u7bafa9bbqba7ef27f13dce46f@mail.gmail.com> On 12/2/07, Richard Kenner wrote: > > If all the changelog entry says is something like > > > > (xyz): new function > > > > I don't see much point, since a diff can always easily tell > > you *what* was changed. > > The point is that, by just looking at the ChangeLog, you can tell when > xyz was introduced and by whom. I've used that quite a number of > times. Moreover, as was pointed out, when you have a source > distribution, you don't get the commit logs, just the ChangeLog. There are in fact, already programs that will generate GNU format changelogs from svn log (see http://ch.tudelft.nl/~arthur/svn2cl/) It would be very easy to run these as part of the release process. Last time I looked, this is in fact how some GNU projects generate ChangeLog for distribution! 
> From ebotcazou@libertysurf.fr Mon Dec 3 06:02:00 2007 From: ebotcazou@libertysurf.fr (Eric Botcazou) Date: Mon, 03 Dec 2007 06:02:00 -0000 Subject: Rant about ChangeLog entries and commit messages In-Reply-To: <47534081.50801@adacore.com> References: <2007-12-02-11-05-39+trackit+sam@rfc1149.net> <200712022136.57819.ebotcazou@libertysurf.fr> <47534081.50801@adacore.com> Message-ID: <200712030700.46354.ebotcazou@libertysurf.fr> > In the Ada revision histories, we have always given the what-and-the-why > (and if necessary the why-not), and they have proved very helpful, I > always found the RH's for gigi (done in the gcc style, much less helpful > because they omitted the why). Sorry, that's simply not true, the ChangeLog and the commit messages in the ada/ subdirectory are on par with the rest of the GCC tree, and that's fine. -- Eric Botcazou From ebotcazou@libertysurf.fr Mon Dec 3 06:16:00 2007 From: ebotcazou@libertysurf.fr (Eric Botcazou) Date: Mon, 03 Dec 2007 06:16:00 -0000 Subject: Rant about ChangeLog entries and commit messages In-Reply-To: <4aca3dc20712021621n39a036d2u21f471f231dfffe@mail.gmail.com> References: <2007-12-02-11-05-39+trackit+sam@rfc1149.net> <200712022355.23871.ebotcazou@libertysurf.fr> <4aca3dc20712021621n39a036d2u21f471f231dfffe@mail.gmail.com> Message-ID: <200712030717.17565.ebotcazou@libertysurf.fr> > If you have a better justification, i'd love to hear it. Justification for what? I only tried to explain to Sam why we do things the way we do, I didn't write the GNU Coding Standards either. 
-- Eric Botcazou From njn@csse.unimelb.edu.au Mon Dec 3 06:45:00 2007 From: njn@csse.unimelb.edu.au (Nicholas Nethercote) Date: Mon, 03 Dec 2007 06:45:00 -0000 Subject: Rant about ChangeLog entries and commit messages In-Reply-To: References: <2007-12-02-11-05-39+trackit+sam@rfc1149.net> Message-ID: On Sun, 2 Dec 2007, Andreas Schwab wrote: >> | 2007-11-30 Jan Hubicka >> | >> | * ggc-common.c (dump_ggc_loc_statistics): Reset ggc_force_collect >> | flag. >> >> How could a newcomer guess why the gcc_force_collect flag needs to be >> reset? > > That is supposed to be written in a comment. Indeed. Some advice I once wrote: Often I see a commit with a log message that lovingly explains a small change made to fix a subtle problem, but adds no comments to the code. Don't do this! Put that careful description in a comment, where people can actually see it. (Commit logs are basically invisible; even if they are auto-emailed to all developers, they are soon forgotten, and they don't benefit people not on the email list.) That comment is not a blemish but an invaluable record of an unusual case that someone didn't anticipate. If the bug-fix was pre-empted by a lengthy email exchange, include some or all of that exchange if it helps. Nick From schwab@suse.de Mon Dec 3 09:28:00 2007 From: schwab@suse.de (Andreas Schwab) Date: Mon, 03 Dec 2007 09:28:00 -0000 Subject: Rant about ChangeLog entries and commit messages In-Reply-To: (Nicholas Nethercote's message of "Mon\, 3 Dec 2007 17\:40\:47 +1100 \(EST\)") References: <2007-12-02-11-05-39+trackit+sam@rfc1149.net> Message-ID: Nicholas Nethercote writes: > On Sun, 2 Dec 2007, Andreas Schwab wrote: > >>> | 2007-11-30 Jan Hubicka >>> | >>> | * ggc-common.c (dump_ggc_loc_statistics): Reset ggc_force_collect >>> | flag. >>> >>> How could a newcomer guess why the gcc_force_collect flag needs to be >>> reset? >> >> That is supposed to be written in a comment. > > Indeed. 
Some advice I once wrote: Often I see a commit with a log > message that lovingly explains a small change made to fix a subtle > problem, but adds no comments to the code. Don't do this! Put that > careful description in a comment, where people can actually see it. > (Commit logs are basically invisible; even if they are auto-emailed to all > developers, they are soon forgotten, and they don't benefit people not on > the email list.) Moreover, if you later look at a commit log you don't know whether it still describes the current code; you have to carefully inspect the later history to see whether there were any further refinements, for example. A comment will be updated over time and is always (supposed to be) on par with the code. Andreas. -- Andreas Schwab, SuSE Labs, schwab@suse.de SuSE Linux Products GmbH, Maxfeldstraße 5, 90409 Nürnberg, Germany PGP key fingerprint = 58CA 54C7 6D53 942B 1756 01D3 44D5 214B 8276 4ED5 "And now for something completely different." From andi@firstfloor.org Mon Dec 3 12:20:00 2007 From: andi@firstfloor.org (Andi Kleen) Date: Mon, 03 Dec 2007 12:20:00 -0000 Subject: Rant about ChangeLog entries and commit messages In-Reply-To: (Nicholas Nethercote's message of "Mon\, 3 Dec 2007 06\:46\:00 +0000 \(UTC\)") References: <2007-12-02-11-05-39+trackit+sam@rfc1149.net.suse.lists.egcs> Message-ID: Nicholas Nethercote writes: > Commit logs are basically invisible; That's just a (fixable) problem in your coding setup. In other projects it is very common to use tools like cvs annotate / cvsps / git blame / git log / etc. to find the reasons for why code is the way it is. In fact in several editors these can be functions on hot keys. Programming is hard enough as is without ignoring such valuable information sources. Don't do it.
-Andi From george@georgeshagov.com Mon Dec 3 13:16:00 2007 From: george@georgeshagov.com (george@georgeshagov.com) Date: Mon, 03 Dec 2007 13:16:00 -0000 Subject: [Fwd: Re: FW: matrix linking] In-Reply-To: <4749ADA3.5000107@georgeshagov.com> References: <474693FF.4020706@georgeshagov.com> <20071123095054.GA25192@dspnet.fr.eu.org> <4749ADA3.5000107@georgeshagov.com> Message-ID: <475401A1.9000807@georgeshagov.com> Oliver. Have you got a chance to take a look at the materials? If yes, what do you think on it? Yours sincerely, George. george@georgeshagov.com wrote: > Thank you for your reply, Oliver. > > Briefly speaking the solution to the problems you have mentioned looks > like this: > 1. take a loot at the first picture here: > http://docs.georgeshagov.com/twiki/tiki-index.php?page=Matrix+Linking+how+it+works > > 2. Pointer 1, 2... are vptrs > 3. The idea is that each module, library (.so) has a row of vptrs, > when it is required to make a dynamic binding this row is going to be > copied to the similar one, the new vptrs are applied to the new row of > vptrs, the previous, old row is unchanged. Then the shift looks like > incremental lock of integer value which is the version of this module > (.so). So it means that these threads which execute the code inside > the 'old module' they are unchanged, and the new code is going to be > executed in case we will have got the new call to the functions of the > module. It might have been said it does not answer the question, since > there might be some loops which needs to be reloaded also, though I > believe it does, since this tends more to the architecture than to > linkage already :-) > > This is quite brief and uncertain explanation. In reality it does not > look like this. More details could have been found here: > http://docs.georgeshagov.com/twiki/tiki-index.php?page=Matrix+Linking+how+it+works. > > > I hope you will find worthy the reading. > > In case of questions do not hesitate to ask.
> > Yours sincerely, > George. > > > Olivier Galibert wrote: >> On Fri, Nov 23, 2007 at 11:49:03AM +0300, george@georgeshagov.com wrote: >> [Changing the _vptr or C equivalent dynamically] >>> I would like the community would have considered the idea. I am >>> ready to answer all the questions you might have. >> >> Changing the virtual function pointer dynamically using a serializing >> instruction is I'm afraid just the tip of the iceberb. Even >> forgetting for a second that some architectures do not have >> serializing instructions per se, there are some not-so-simple details >> to take into account: >> >> - the compiler can cache the vptr in a register, making your >> serialization less than serialized suddently >> >> - changing a group of functions is usually not enough. A component >> version change usually means its internal representation of the state >> changes. Which, in turn, means you need to serialize the object >> (whatever the programming language) in the older version and >> unserialize it in the newer, while deferring calls into the object >> from any thread >> >> - previous point means you also need to be able to know if any thread >> is "inside" the object in order to have it get out before you do a >> version change. Which in objects that use a somee of message fifo >> for work dispatching may never happen in the first place >> >> Dynamic vtpr binding is only the start of the solution. >> >> OG.
>> > From kenner@vlsi1.ultra.nyu.edu Mon Dec 3 13:24:00 2007 From: kenner@vlsi1.ultra.nyu.edu (Richard Kenner) Date: Mon, 03 Dec 2007 13:24:00 -0000 Subject: volatile and R/M/W operations In-Reply-To: <47534ADB.3080502@adacore.com> References: <2007-11-30-23-19-02+trackit+sam@rfc1149.net> <4751EEDB.6070008@adacore.com> <2007-12-02-09-59-09+trackit+sam@rfc1149.net> <47534ADB.3080502@adacore.com> Message-ID: <10712031323.AA20179@vlsi1.ultra.nyu.edu> > OK, sounds reasonable, but then I don't understand the logic behind > avoiding this instruction sequence for the volatile case, this is > two accesses at the bus level so what's the difference? There's no difference from that perspective. The logic behind what's generated is that instead of trying to do a case-by-case analysis of what instruction combinations might actually be valid for volatile memory (which could potentially be target-specific), GCC takes the conservative approach of simply disabling all but trivial combinations for volatile access. From kenner@vlsi1.ultra.nyu.edu Mon Dec 3 13:29:00 2007 From: kenner@vlsi1.ultra.nyu.edu (Richard Kenner) Date: Mon, 03 Dec 2007 13:29:00 -0000 Subject: Rant about ChangeLog entries and commit messages In-Reply-To: <4aca3dc20712021621n39a036d2u21f471f231dfffe@mail.gmail.com> References: <2007-12-02-11-05-39+trackit+sam@rfc1149.net> <200712022136.57819.ebotcazou@libertysurf.fr> <4aca3dc20712021240k19f3eae5j66453276179c401a@mail.gmail.com> <200712022355.23871.ebotcazou@libertysurf.fr> <4aca3dc20712021621n39a036d2u21f471f231dfffe@mail.gmail.com> Message-ID: <10712031329.AA20246@vlsi1.ultra.nyu.edu> > Sorry, but again, this is not a good enough justification to me. > We do a lot of things different than "The GNU Project". > So do plenty of parts of the "official GNU project". > They use different coding standards, bug tracking systems, version > control systems, checkin policies, etc, than each other. Yes, but none of those are visible other than to the development community. 
People who obtain the source distributions of projects don't get to see those things. They DO see things like the ChangeLog format and coding and documentation conventions and THOSE are the things that need to be common among GNU projects. In my view, ChangeLog is mostly "write-only" from a developer's perspective. It's a document that the GNU project requires us to produce for the benefit of people who DON'T want access to our checkin-logs, bug tracking information, and mailing lists. But for our own development purposes, we use the above information much more than ChangeLog. From rguenther@suse.de Mon Dec 3 14:07:00 2007 From: rguenther@suse.de (Richard Guenther) Date: Mon, 03 Dec 2007 14:07:00 -0000 Subject: [RFC] Introduce middle-end expressions on arrays Message-ID: From the last GCC Summit we learned about (Fortran) Frontend Based Optimizations from Toon and it was suggested to tackle one issue by promoting arrays to a first-class middle-end type. Discussion with Sebastian also reveals that this makes sense to ease the analysis of the into-GRAPHITE transformation by not dropping information that cannot be re-computed. For example Fortran has the "load-before-store" semantics on its vector operations, i.e. if the same array cell is accessed twice it is first read, then written i.e. updated. The scalarizer has to use temporary arrays to ensure these semantics, so for example for "A = A + B" with A and B arrays, the scalarizer creates a temporary buffer that will hold the intermediate results of the addition before updating the array A: i.e. "T = A + B" followed by "A = T". This decompression of the memory use is easy to perform (see the array privatization techniques for parallelization), but the inverse operation, i.e. the compression is much more difficult, as it has to analyze in IPA mode the uses of T: before removing the initialization of T it has to ensure that there are no other uses of T.
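To make the load-before-store point concrete, here is a minimal C sketch (not taken from the patch). It assumes a shifted, overlapping update, roughly the Fortran section assignment A(2:N) = A(1:N-1) + B(1:N-1), because a plain element-wise "A = A + B" gives the same result with or without the temporary. The function names and the parameter n are hypothetical.

```c
#include <assert.h>
#include <string.h>

/* In-place forward loop: every store to a[i + 1] is seen by the next
   iteration's load of a[i], so this does NOT have load-before-store
   semantics.  (Hypothetical helper, for illustration only.)  */
void shift_add_in_place (double *a, const double *b, int n)
{
  assert (n >= 2);
  for (int i = 0; i < n - 1; i++)
    a[i + 1] = a[i] + b[i];
}

/* Scalarized form with a temporary, i.e. "T = A + B" followed by
   "A = T": all loads of a[] happen before any store to a[].  */
void shift_add_with_temp (double *a, const double *b, int n)
{
  assert (n >= 2);
  double t[n - 1];                          /* the temporary T */
  for (int i = 0; i < n - 1; i++)
    t[i] = a[i] + b[i];                     /* load phase */
  memcpy (a + 1, t, (n - 1) * sizeof *t);   /* store phase */
}
```

With a = {1, 2, 3, 4, 5} and every element of b equal to 10, the in-place loop leaves a = {1, 11, 21, 31, 41}, while the version with the temporary leaves a = {1, 11, 12, 13, 14}, which is the result load-before-store asks for.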
Another point for the motivation is that vector constructs should survive till the point where GCC knows/uses information about the vector target machine with using the vectorizer (as part of the GRAPHITE framework) to create regular (optimized) scalar IL from the array operations. So this is a proposal to make this happen, allow statements and expressions that operate on (multidimensional) arrays. The first and maybe most important issue is that temporaries required by either design issues of the IL or by optimizations are not going to be cheap. The second issue is the question of if (yes I'd say) and when (very early I'd say) we want to lower statements / expressions on arrays and generate loop constructs for them. On the first issue, how to represent array expressions in the IL we can take the pragmatic approach for now and merely try to make the expressions survive until they get picked up by GRAPHITE (or other to-be-invented special optimizers) but allow scalar optimizers to operate on scalar parts of the expressions. To avoid creating temporaries of array type, which the gimplifier in its current form cannot do for variable length types, we do not create full GIMPLE form of array expressions, but retain GENERIC expressions in RHS. With the very experimental prototype patch (that just was done to experiment with forms of the IL) we would for example end up with float a[16]; float b[16]; float c[16]; tmp = 0.0; a.1 = &a; b.2 = &b; c.3 = &c; n.4 = n; D.1558 = (long unsigned int) n.4; D.1559 = D.1558 * 4; *a.1 = WITH_SIZE_EXPR <(*b.2 + *c.3) + tmp, D.1559>; D.1560 = a[0]; return D.1560; for the whole array expression C source float test5 (int n) { float a[16]; float b[16]; float c[16]; float tmp = 0.0; *(float (*)[n])&a = *(float (*)[n])&b + *(float (*)[n])&c + tmp; return a[0]; } Ignore the WITH_SIZE_EXPR here, the point is that we have two operations and three operands on the rhs, two arrays, one scalar. 
CCP happily propagates 0.0 into the expression and if you do not run SRA (which ICEs) you survive until expand, which of course has no idea how to deal with this expression ;) So we think this approach is sound if these expressions do not survive for too long (that is, only until after out-of-GRAPHITE). A different approach would be to only allow whole arrays as gimple values and proceed with usual gimplification of expressions. To avoid having to allocate memory for the array temporaries we could represent them as pointers to (not yet) allocated storage like for the above case float (* D.29)[n] = (float (*)[n])&a; float (* D.30)[n] = (float (*)[n])&b; float (* D.31)[n] = (float (*)[n])&c; float (* D.42)[n]; D.42 = __builtin_vla_tmp_alloc (n * 4); *D.42 = *D.30 + *D.31; *D.29 = *D.42 + tmp; __builtin_vla_tmp_free (D.42); where we still would have de-references of pointers to arrays as gimple values as well. This approach would also fit inserted temporaries that are required for language semantic reasons. On the second issue we are thinking of leveraging GRAPHITE which needs to create scalar code out of its intermediate representation anyway to do the lowering to scalar code. Especially we do not want to force expand to deal with the whole-array expressions. But this is obviously the issue which we can most easily postpone until we can do experiments. There are two more aspects that need consideration. First, whether, and if so what kind of, tree codes we want to use for the operators. It would be possible to re-use PLUS_EXPR and MINUS_EXPR in general and use MULT_EXPR and RDIV_EXPR for mixed array / scalar operands. But in the end I don't think this is desirable and certainly not sufficient thinking of matrix multiplication, contraction and expansions. So a sound approach would be to introduce an ARRAY_EXPR and use sub-codes to distinguish the operations.
The alternative is to use builtin functions, but that ends up with nested function calls in the IL which I think should be avoided, if only because of aesthetic reasons. The second aspect is that with the current experimental patch (and possibly with the final design) the array shape is encoded in the type of the access rather than using an explicit descriptor object as the Fortran frontend does right now. This has the advantage of less IL (and fewer alias conflicts with such an object) and probably makes the analysis phase easier. But this raises the issue that the middle-end as of today cannot express array shapes with strides not equal to one - still this should be an easy extension of the current array type description. Then there is the question of semantics of an array expression. The simplest and most useful one is IMHO to make iteration order undefined, that is make statements that have overlapping LHS and RHS undefined and require the frontend to introduce temporaries where language semantics request that. The other semantics to follow could be the Fortran load-before-store semantics, which would move all the dependency checking code out of the frontend into the middle-end. But we can also easily have both semantics in parallel by creating two different assignment operators, one that states there is no need for temporaries to follow load-before-store semantics and one stating the opposite. Obviously the second can be lowered to the first one at some point. So, to sum up all of the above we propose to: - Allow expressions on whole arrays. - Iteration order for evaluating an expression is undefined. - There will be both load-before-store and undefined load/store ordering assignment statements. - The IL will have whole-arrays including pointer-dereferences to whole-arrays as gimple values. - The IL will have fake / intermediate VLA temporaries as pointers to (deferred) allocated memory. - The IL will be lowered to GIMPLE loops using GRAPHITE.
- Expressions on arrays and mixed scalar/array operations will use separate tree codes possibly sub-coding a common parent code. - ARRAY_TYPEs will be extended to specify a stride. We do not (yet) propose language extensions to leverage the middle-end capabilities to other frontends than Fortran. But making this feature available using builtin functions that would be lowered to the array IL by the gimplifier is probably easy. (No, the prototype patch is _not_ a language extension proposal ;)) Find the experimentation patch below, with the limitation that it only allows binary + as operand and that it implements GENERIC rhs instead of the VLA temporaries approach. All testcases will ICE at least at the point of expansion, SRA is another ICEr. Otherwise working testcases are appended below, before the actual patch. Thanks, Richard. void test0 (double *p, int pn0, /* int pstride0 == 0, */ int pn1, int pstride1, int pn2 /*, int pstride2 unused, */) { *((double (*)[pn2 + pn1 * pstride1][pn1 /* + pn0 * pstride0 */][pn0])p) = 0.0; } void test1 (double *p, int pn0, int pn1, int pstride1, int pn2, double *q, int qn0, int qn1, int qstride1, int qn2) { *((double (*)[pn2 + pn1 * pstride1][pn1][pn0])p) = *((double (*)[qn2 + qn1 * qstride1][qn1][qn0])q); } void test2 (double *p, int pn0, int pn1, int pstride1, int pn2, double *q, int qn0, int qn1, int qstride1, int qn2) { *((double (*)[pn2 + pn1 * pstride1][pn1][pn0])p) = *((double (*)[qn2 + qn1 * qstride1][qn1][qn0])q) + 1.0; } void test3 (double *p, int pn0, int pn1, int pstride1, int pn2, double *q, int qn0, int qn1, int qstride1, int qn2) { *((double (*)[pn2 + pn1 * pstride1][pn1][pn0])p) += *((double (*)[qn2 + qn1 * qstride1][qn1][qn0])q); } void test4 (double *p, int pn0, int pn1, int pstride1, int pn2, double *q, int qn0, int qn1, int qstride1, int qn2) { *((double (*)[pn2 + pn1 * pstride1][pn1][pn0])p) += *((double (*)[qn2 + qn1 * qstride1][qn1][qn0])q) + 1.0; } float test5 (int n) { float a[16]; float b[16]; float 
c[16]; float tmp = 0.0; *(float (*)[n])&a = *(float (*)[n])&b + *(float (*)[n])&c + tmp; return a[0]; } Index: gcc/fold-const.c =================================================================== *** gcc/fold-const.c (revision 130435) --- gcc/fold-const.c (working copy) *************** fold_convert (tree type, tree arg) *** 2487,2492 **** --- 2487,2504 ---- if (TYPE_MAIN_VARIANT (type) == TYPE_MAIN_VARIANT (orig)) return fold_build1 (NOP_EXPR, type, arg); + /* XXX With VLA types we would need to do more complicated + matching if a conversion is possible or not. */ + if (TREE_CODE (type) == ARRAY_TYPE + && TREE_CODE (orig) == ARRAY_TYPE) + return fold_build1 (NOP_EXPR, type, arg); + + /* XXX Hack alert! We for sure need extra tree codes for + mixed-mode arguments. */ + if (TREE_CODE (type) == ARRAY_TYPE + && TREE_CODE (orig) == REAL_TYPE) + return arg; + switch (TREE_CODE (type)) { case INTEGER_TYPE: case ENUMERAL_TYPE: case BOOLEAN_TYPE: Index: gcc/tree-gimple.c =================================================================== *** gcc/tree-gimple.c (revision 130435) --- gcc/tree-gimple.c (working copy) *************** is_gimple_non_addressable (tree t) *** 382,387 **** --- 382,394 ---- bool is_gimple_val (tree t) { + /* Allow array operands in expressions. */ + if (TREE_CODE (TREE_TYPE (t)) == ARRAY_TYPE + /*&& (is_gimple_variable (t) + || (TREE_CODE (t) == INDIRECT_REF + && is_gimple_variable (TREE_OPERAND (t, 0))))*/) + return true; + /* Make loads from volatiles and memory vars explicit. 
*/ if (is_gimple_variable (t) && is_gimple_reg_type (TREE_TYPE (t)) Index: gcc/c-typeck.c =================================================================== *** gcc/c-typeck.c (revision 130435) --- gcc/c-typeck.c (working copy) *************** default_function_array_conversion (struc *** 1663,1668 **** --- 1663,1674 ---- if (TREE_NO_WARNING (orig_exp)) TREE_NO_WARNING (exp.value) = 1; + /* XXX */ + if (TREE_CODE (exp.value) == INDIRECT_REF + && (TREE_CODE (TREE_OPERAND (exp.value, 0)) == NOP_EXPR + || TREE_CODE (TREE_OPERAND (exp.value, 0)) == CONVERT_EXPR)) + return exp; + lvalue_array_p = !not_lvalue && lvalue_p (exp.value); if (!flag_isoc99 && !lvalue_array_p) { *************** convert_for_assignment (tree type, tree *** 4368,4373 **** --- 4374,4398 ---- } else if (codel == BOOLEAN_TYPE && coder == POINTER_TYPE) return convert (type, rhs); + /* XXX For middle-end operations on arrays we handle ARRAY vs. REAL + types. This needs to be appropriately extended to handle any (valid) + scalar type. */ + else if (codel == ARRAY_TYPE + && coder == REAL_TYPE) + { + tree etype = type; + do { + etype = TREE_TYPE (etype); + } while (TREE_CODE (etype) == ARRAY_TYPE); + return convert (etype, rhs); + } + else if (codel == ARRAY_TYPE + && coder == ARRAY_TYPE) + { + /* XXX No automatic scalar conversion with this assignment. + XXX No check for matching layout. */ + return rhs; + } switch (errtype) { *************** build_binary_op (enum tree_code code, tr *** 7962,7967 **** --- 7987,8002 ---- return pointer_int_sum (PLUS_EXPR, op0, op1); else if (code1 == POINTER_TYPE && code0 == INTEGER_TYPE) return pointer_int_sum (PLUS_EXPR, op1, op0); + else if ((code0 == ARRAY_TYPE && code1 == REAL_TYPE) + || (code0 == REAL_TYPE && code1 == ARRAY_TYPE) + || (code0 == ARRAY_TYPE && code1 == ARRAY_TYPE)) + { + /* XXX Check compatibility. 
*/ + if (code0 == REAL_TYPE) + return build2 (PLUS_EXPR, type1, op1, op0); + else + return build2 (PLUS_EXPR, type0, op0, op1); + } else common = 1; break; Index: gcc/c-parser.c =================================================================== *** gcc/c-parser.c (revision 130435) --- gcc/c-parser.c (working copy) *************** c_parser_expr_no_commas (c_parser *parse *** 4431,4437 **** } c_parser_consume_token (parser); rhs = c_parser_expr_no_commas (parser, NULL); ! rhs = default_function_array_conversion (rhs); ret.value = build_modify_expr (lhs.value, code, rhs.value); if (code == NOP_EXPR) ret.original_code = MODIFY_EXPR; --- 4431,4439 ---- } c_parser_consume_token (parser); rhs = c_parser_expr_no_commas (parser, NULL); ! /* XXX */ ! if (TREE_CODE (TREE_TYPE (lhs.value)) != ARRAY_TYPE) ! rhs = default_function_array_conversion (rhs); ret.value = build_modify_expr (lhs.value, code, rhs.value); if (code == NOP_EXPR) ret.original_code = MODIFY_EXPR; From dewar@adacore.com Mon Dec 3 14:10:00 2007 From: dewar@adacore.com (Robert Dewar) Date: Mon, 03 Dec 2007 14:10:00 -0000 Subject: volatile and R/M/W operations In-Reply-To: <10712031323.AA20179@vlsi1.ultra.nyu.edu> References: <2007-11-30-23-19-02+trackit+sam@rfc1149.net> <4751EEDB.6070008@adacore.com> <2007-12-02-09-59-09+trackit+sam@rfc1149.net> <47534ADB.3080502@adacore.com> <10712031323.AA20179@vlsi1.ultra.nyu.edu> Message-ID: <47540E4A.2060903@adacore.com> Richard Kenner wrote: >> OK, sounds reasonable, but then I don't understand the logic behind >> avoiding this instruction sequence for the volatile case, this is >> two accesses at the bus level so what's the difference? > > There's no difference from that perspective. 
The logic behind what's > generated is that instead of trying to do a case-by-case analysis of > what instruction combinations might actually be valid for volatile > memory (which could potentially be target-specific), GCC takes the > conservative approach of simply disabling all but trivial combinations > for volatile access. Right, but it would seem this is a good candidate for combination. This is especially true since Volatile is often used with the sense of Atomic in Ada, and it is not a bad idea to combine these in practice, giving an atomic update (right, nothing in the language requires it, but it is definitely useful!) From galibert@pobox.com Mon Dec 3 14:15:00 2007 From: galibert@pobox.com (Olivier Galibert) Date: Mon, 03 Dec 2007 14:15:00 -0000 Subject: [Fwd: Re: FW: matrix linking] In-Reply-To: <475401A1.9000807@georgeshagov.com> References: <474693FF.4020706@georgeshagov.com> <20071123095054.GA25192@dspnet.fr.eu.org> <4749ADA3.5000107@georgeshagov.com> <475401A1.9000807@georgeshagov.com> Message-ID: <20071203141520.GC1316@dspnet.fr.eu.org> On Mon, Dec 03, 2007 at 04:16:17PM +0300, george@georgeshagov.com wrote: > Have you got a chance to take a look at the materials? > If yes, what do you think on it? Nope, sorry, too busy with other things. OG.
From kenner@vlsi1.ultra.nyu.edu Mon Dec 3 14:41:00 2007 From: kenner@vlsi1.ultra.nyu.edu (Richard Kenner) Date: Mon, 03 Dec 2007 14:41:00 -0000 Subject: Rant about ChangeLog entries and commit messages In-Reply-To: <4aca3dc20712022051u7bafa9bbqba7ef27f13dce46f@mail.gmail.com> References: <2007-12-02-11-05-39+trackit+sam@rfc1149.net> <2007-12-02-11-52-03+trackit+sam@rfc1149.net> <20071202065558.S8303@dair.pair.com> <47533F95.5080608@adacore.com> <10712030406.AA12801@vlsi1.ultra.nyu.edu> <4aca3dc20712022051u7bafa9bbqba7ef27f13dce46f@mail.gmail.com> Message-ID: <10712031441.AA22354@vlsi1.ultra.nyu.edu> > There are in fact, already programs that will generate GNU format > changelogs from svn log (see http://ch.tudelft.nl/~arthur/svn2cl/) > It would be very easy to run these as part of the release process. Sure, but I think that's bad for this project since I support the idea that the svn log should contain additional information that's not part of what the GNU project wants in ChangeLog. From joel.sherrill@oarcorp.com Mon Dec 3 15:02:00 2007 From: joel.sherrill@oarcorp.com (Joel Sherrill) Date: Mon, 03 Dec 2007 15:02:00 -0000 Subject: gnat1 huge time In-Reply-To: <200711302314.29446.ebotcazou@adacore.com> References: <474DA2F1.1020007@oarcorp.com> <200711302045.30388.ebotcazou@adacore.com> <47507C31.30905@oarcorp.com> <200711302314.29446.ebotcazou@adacore.com> Message-ID: <47541A79.4000105@oarcorp.com> Eric Botcazou wrote: >> On the SPARC, this produced an executable I couldn't run >> on the simulator. It looked like the .text segment may >> have increased enough to not fit in the simulator. >> > > Weird. The EH tables should probably not end up in .text. > > RTEMS applications are statically linked with all support libraries. I wonder if it just makes the run-time larger. I need to look at an executable in more detail.
>> So this appears to work around the build time problem and >> will let me continue to test the real changes I was working >> on but I don't know that I like this as a permanent solution. >> > > This setting should bring the Ada compiler on par with the C++ compiler as > far as the EH mechanism is concerned: same space overhead, same performance > overhead. Which EH mechanism do you use for C++ in RTEMS? > > >> I have seen reports where people complained about the size >> of embedded GNAT and GNAT/RTEMS executables and doubling >> the code size just makes it worse. >> > > It's a tradeoff between space and performance. Certainly EH tables can be > large and setjmp/longjmp EH might be better suited, albeit slower. > > >> But this is progress and gives a real hint as to the underlying >> problem. Maybe it is enough for someone to fix it. >> > > It's not really related to the problem though, which appears to be in DF. > But at least it makes it possible to reproduce it even on platforms where > it doesn't arise natively, by making the opposite change you made. > > >> Is there a PR for this or do I need to try to file one? >> > > No, I don't think there is one. > > From dberlin@dberlin.org Mon Dec 3 15:48:00 2007 From: dberlin@dberlin.org (Daniel Berlin) Date: Mon, 03 Dec 2007 15:48:00 -0000 Subject: Rant about ChangeLog entries and commit messages In-Reply-To: <10712031329.AA20246@vlsi1.ultra.nyu.edu> References: <2007-12-02-11-05-39+trackit+sam@rfc1149.net> <200712022136.57819.ebotcazou@libertysurf.fr> <4aca3dc20712021240k19f3eae5j66453276179c401a@mail.gmail.com> <200712022355.23871.ebotcazou@libertysurf.fr> <4aca3dc20712021621n39a036d2u21f471f231dfffe@mail.gmail.com> <10712031329.AA20246@vlsi1.ultra.nyu.edu> Message-ID: <4aca3dc20712030748v24872dbak7773e2621020e873@mail.gmail.com> On 12/3/07, Richard Kenner wrote: > > Sorry, but again, this is not a good enough justification to me. > > We do a lot of things different than "The GNU Project". 
> > So do plenty of parts of the "official GNU project". > > They use different coding standards, bug tracking systems, version > > control systems, checkin policies, etc, than each other. > > Yes, but none of those are visible other than to the development community. > People who obtain the source distributions of projects don't get to see > those things. They DO see things like the ChangeLog format and coding > and documentation conventions and THOSE are the things that need to be > common among GNU projects. Except they aren't, across large parts of the GNU project. You may find it the same in the "traditional" parts of the GNU project (IE coreutils, emacs, etc). It's certainly not the same across any of the newer parts of the GNU project. From rsandifo@nildram.co.uk Mon Dec 3 15:55:00 2007 From: rsandifo@nildram.co.uk (Richard Sandiford) Date: Mon, 03 Dec 2007 15:55:00 -0000 Subject: Link tests after GCC_NO_EXECUTABLES In-Reply-To: <47531D1D.8020805@codesourcery.com> (Mark Mitchell's message of "Sun\, 02 Dec 2007 13\:01\:17 -0800") References: <474C8FA4.2040603@codesourcery.com> <474C95BA.1060807@t-online.de> <474C96C1.7010208@codesourcery.com> <474C98AA.50105@t-online.de> <474C9A65.2060902@codesourcery.com> <474C9B33.8060503@t-online.de> <474C9CBD.2070708@codesourcery.com> <87fxyqdc45.fsf@firetop.home> <474D943C.4030106@codesourcery.com> <877ik0aerh.fsf@firetop.home> <20071130022132.GL17368@sygehus.dk> <87sl2o6s1g.fsf@firetop.home> <47505D76.4040207@codesourcery.com> <878x4eu4c8.fsf@firetop.home> <47531D1D.8020805@codesourcery.com> Message-ID: <87eje34vui.fsf@firetop.home> Mark Mitchell writes: > Richard Sandiford wrote: >> Anyway, given that there have been objections to the patch generally, >> I realise that the pre-approval is void. > > I think there's no controversy over the libstdc++ change, so let's put > that in. 
> If nothing else, it makes the libstdc++ configury more > self-consistent; if we decide to change the overall strategy, then we > can do that all at once. Well, Rask's patch would make the libstdc++ change unnecessary, so it seems premature to change libstdc++ now. (Not that I'm objecting to anyone else doing it. I'm just not comfortable doing it myself, especially since, on its own, it doesn't affect any of "my" targets.) Richard From kenner@vlsi1.ultra.nyu.edu Mon Dec 3 16:05:00 2007 From: kenner@vlsi1.ultra.nyu.edu (Richard Kenner) Date: Mon, 03 Dec 2007 16:05:00 -0000 Subject: volatile and R/M/W operations In-Reply-To: <47540E4A.2060903@adacore.com> References: <2007-11-30-23-19-02+trackit+sam@rfc1149.net> <4751EEDB.6070008@adacore.com> <2007-12-02-09-59-09+trackit+sam@rfc1149.net> <47534ADB.3080502@adacore.com> <10712031323.AA20179@vlsi1.ultra.nyu.edu> <47540E4A.2060903@adacore.com> Message-ID: <10712031605.AA25563@vlsi1.ultra.nyu.edu> > Right, but it would seem this is a good candidate for combination. This > is especially true since often Volatile is used with the sense of Atomic > in Ada, and it is not a bad idea to combine these in practice, giving an > atomic update (right, nothing in the language requires it, but it is > definitely useful!) I don't disagree that "this" is a good candidate for combination, but one problem is that by the time you're at that level, you don't easily have the source correspondence you want. E.g., y |= 2; and t1 = y | 2; y = t1; are very hard to tell apart at the RTL level. Though it's clear that a single instruction might best match the expected semantics of the former, it's a lot less clear that it would for the latter. From a legacy perspective, it's dangerous to muck around much in this area. As to Ada's Atomic, it's just implementation convenience that it's mapped to GCC's volatile. 
Most things that GCC's volatile implies aren't needed for Ada's atomic (and it may even be the case that most of them aren't even needed for Volatile in Ada). One approach here would be to separate the properties we now consider part of "volatile" (e.g., "can't remove dead load", "can't change access size", "must keep same number of loads", etc.) into separate properties and test those in places where we now test the "volatile" attribute. That would be a fairly straightforward but large and pervasive change. It's not clear it'd be worth the effort, but I'm curious what others think. From kenner@vlsi1.ultra.nyu.edu Mon Dec 3 16:20:00 2007 From: kenner@vlsi1.ultra.nyu.edu (Richard Kenner) Date: Mon, 03 Dec 2007 16:20:00 -0000 Subject: Rant about ChangeLog entries and commit messages In-Reply-To: <4aca3dc20712030748v24872dbak7773e2621020e873@mail.gmail.com> References: <2007-12-02-11-05-39+trackit+sam@rfc1149.net> <200712022136.57819.ebotcazou@libertysurf.fr> <4aca3dc20712021240k19f3eae5j66453276179c401a@mail.gmail.com> <200712022355.23871.ebotcazou@libertysurf.fr> <4aca3dc20712021621n39a036d2u21f471f231dfffe@mail.gmail.com> <10712031329.AA20246@vlsi1.ultra.nyu.edu> <4aca3dc20712030748v24872dbak7773e2621020e873@mail.gmail.com> Message-ID: <10712031620.AA26536@vlsi1.ultra.nyu.edu> > Except they aren't, across large parts of the GNU project. > > You may find it the same in the "traditional" parts of the GNU project > (IE coreutils, emacs, etc). > It's certainly not the same across any of the newer parts of the GNU project. Perhaps, but GCC has always been considered, like emacs, to be one of the most central parts of the GNU project and I think it'd be very wrong for us not to follow their wishes in standardization areas. 
From andi@firstfloor.org Mon Dec 3 16:34:00 2007 From: andi@firstfloor.org (Andi Kleen) Date: Mon, 03 Dec 2007 16:34:00 -0000 Subject: Rant about ChangeLog entries and commit messages In-Reply-To: <10712031329.AA20246@vlsi1.ultra.nyu.edu.suse.lists.egcs> (Richard Kenner's message of "Mon\, 3 Dec 2007 13\:29\:57 +0000 \(UTC\)") References: <2007-12-02-11-05-39+trackit+sam@rfc1149.net.suse.lists.egcs> <200712022136.57819.ebotcazou@libertysurf.fr.suse.lists.egcs> <4aca3dc20712021240k19f3eae5j66453276179c401a@mail.gmail.com.suse.lists.egcs> <200712022355.23871.ebotcazou@libertysurf.fr.suse.lists.egcs> <4aca3dc20712021621n39a036d2u21f471f231dfffe@mail.gmail.com.suse.lists.egcs> <10712031329.AA20246@vlsi1.ultra.nyu.edu.suse.lists.egcs> Message-ID: kenner@vlsi1.ultra.nyu.edu (Richard Kenner) writes: > > Yes, but none of those are visible other than to the development community. > People who obtain the source distributions of projects don't get to see > those things. They DO see things like the ChangeLog format and coding > and documentation conventions and THOSE are the things that need to be > common among GNU projects. 
It would be probably reasonable these days to require of someone who wants to do serious development to just download a SVN checkout for that [or they can use svnweb on http://gcc.gnu.org] -Andi From kenner@vlsi1.ultra.nyu.edu Mon Dec 3 16:38:00 2007 From: kenner@vlsi1.ultra.nyu.edu (Richard Kenner) Date: Mon, 03 Dec 2007 16:38:00 -0000 Subject: Rant about ChangeLog entries and commit messages In-Reply-To: References: <2007-12-02-11-05-39+trackit+sam@rfc1149.net.suse.lists.egcs> <200712022136.57819.ebotcazou@libertysurf.fr.suse.lists.egcs> <4aca3dc20712021240k19f3eae5j66453276179c401a@mail.gmail.com.suse.lists.egcs> <200712022355.23871.ebotcazou@libertysurf.fr.suse.lists.egcs> <4aca3dc20712021621n39a036d2u21f471f231dfffe@mail.gmail.com.suse.lists.egcs> <10712031329.AA20246@vlsi1.ultra.nyu.edu.suse.lists.egcs> Message-ID: <10712031638.AA27752@vlsi1.ultra.nyu.edu> > It would be probably reasonable these days to require of someone who > wants to do serious development to just download a SVN checkout > for that [or they can use svnweb on http://gcc.gnu.org] I agree. But I think the idea of the ChangeLog is for somewhere just short of "serious development". I'm not too far from being willing to agree that ChangeLog is now hopelessly anachronistic (though I'm not there yet), but feel that this is really an FSF issue, not a GCC one. From kiesling@earthlink.net Mon Dec 3 16:50:00 2007 From: kiesling@earthlink.net (Robert Kiesling) Date: Mon, 03 Dec 2007 16:50:00 -0000 Subject: Rant about ChangeLog entries and commit messages In-Reply-To: <4aca3dc20712030748v24872dbak7773e2621020e873@mail.gmail.com> Message-ID: [ Charset ISO-8859-1 unsupported, converting... ] > On 12/3/07, Richard Kenner wrote: > > > Sorry, but again, this is not a good enough justification to me. > > > We do a lot of things different than "The GNU Project". > > > So do plenty of parts of the "official GNU project". 
> > > They use different coding standards, bug tracking systems, version > > > control systems, checkin policies, etc, than each other. > > > > Yes, but none of those are visible other than to the development community. > > People who obtain the source distributions of projects don't get to see > > those things. They DO see things like the ChangeLog format and coding > > and documentation conventions and THOSE are the things that need to be > > common among GNU projects. > > Except they aren't, across large parts of the GNU project. > > You may find it the same in the "traditional" parts of the GNU project > (IE coreutils, emacs, etc). > It's certainly not the same across any of the newer parts of the GNU project. That's because, although the GNU project strictly - and correctly, experience has shown - monitors its code base, with the propagation of the Free Software development model, newer Free Software contributors who maintain their code on sites like sourceforge.net, are subject to commercial pressures that the older, ivory-tower authors in general are shielded from. It's impossible to convince someone who wants your "niche" for a quickie IPO that maintaining code for more than two or three years is worth the investment. -- Ctalk Home Page: http://www.ctalklang.org/ From dewar@adacore.com Mon Dec 3 17:26:00 2007 From: dewar@adacore.com (Robert Dewar) Date: Mon, 03 Dec 2007 17:26:00 -0000 Subject: Rant about ChangeLog entries and commit messages In-Reply-To: References: Message-ID: <47543C42.5090000@adacore.com> Robert Kiesling wrote: > That's because, although the GNU project strictly - and correctly, > experience has shown - monitors its code base, with the propagation > of the Free Software development model, newer Free Software > contributors who maintain their code on sites like sourceforge.net, > are subject to commercial pressures that the older, ivory-tower > authors in general are shielded from. 
> > It's impossible to convince someone who wants your "niche" for a > quickie IPO that maintaining code for more than two or three years > is worth the investment. I don't really understand this comment; we are talking about improving the maintainability here, and what people are saying is that some other parts of the project have already moved in this direction. From dewar@adacore.com Mon Dec 3 17:28:00 2007 From: dewar@adacore.com (Robert Dewar) Date: Mon, 03 Dec 2007 17:28:00 -0000 Subject: volatile and R/M/W operations In-Reply-To: <10712031605.AA25563@vlsi1.ultra.nyu.edu> References: <2007-11-30-23-19-02+trackit+sam@rfc1149.net> <4751EEDB.6070008@adacore.com> <2007-12-02-09-59-09+trackit+sam@rfc1149.net> <47534ADB.3080502@adacore.com> <10712031323.AA20179@vlsi1.ultra.nyu.edu> <47540E4A.2060903@adacore.com> <10712031605.AA25563@vlsi1.ultra.nyu.edu> Message-ID: <47543C9B.60709@adacore.com> Richard Kenner wrote: >> Right, but it would seem this is a good candidate for combination. This >> is especially true since often Volatile is used with the sense of Atomic >> in Ada, and it is not a bad idea to combine these in practice, giving an >> atomic update (right, nothing in the language requires it, but it is >> definitely useful!) > > I don't disagree that "this" is a good candidate for combination, but > one problem is that by the time you're at that level, you don't easily have > the source correspondence you want. > > E.g., > > y |= 2; > > and > > t1 = y | 2; > y = t1; > > are very hard to tell apart at the RTL level. Though it's clear that > a single instruction might best match the expected semantics of the former, > it's a lot less clear that it would for the latter. I think it would still be OK for the latter, why not? 
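[Editorial sketch, not part of the original thread.] The two source forms quoted above can be written out in C roughly as follows. The function names and the `volatile unsigned int *` parameter are illustrative assumptions; only the statement shapes come from the messages. Whether a compiler should emit a single read-modify-write instruction for either form is exactly the open question in the thread, and this sketch makes no claim about it.

```c
/* Two ways of setting bit 1 of a volatile object, as contrasted in the
   thread.  Function names are hypothetical.  */

/* Form 1: compound assignment, which reads like a single
   read-modify-write at the source level.  */
void
set_bit_compound (volatile unsigned int *y)
{
  *y |= 2;
}

/* Form 2: an explicit read into a temporary, followed by a separate
   write back.  */
void
set_bit_split (volatile unsigned int *y)
{
  unsigned int t1 = *y | 2;
  *y = t1;
}
```

Both functions leave the same value in memory; the difference under discussion concerns only whether the load and the store may be merged into one read-modify-write instruction.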
From dnovillo@google.com Mon Dec 3 17:33:00 2007 From: dnovillo@google.com (Diego Novillo) Date: Mon, 03 Dec 2007 17:33:00 -0000 Subject: Rant about ChangeLog entries and commit messages In-Reply-To: <2007-12-02-11-05-39+trackit+sam@rfc1149.net> References: <2007-12-02-11-05-39+trackit+sam@rfc1149.net> Message-ID: <47543DE8.3010003@google.com> On 12/02/07 05:05, Samuel Tardieu wrote: > Maybe we should consider dropping ChangeLogs and using better checkin > messages. I'm not sure people will want to drop ChangeLogs anytime soon. I don't find them all that useful, but I *have* used them extensively when doing archeology. They give you the initial thread to pull when finding out about changes. What I *do* miss a lot is an easier way to link from the ChangeLog entry into the email message explaining the whys and hows of a change. In this respect, the comment in the code is not enough. The comment explains what the code does today; it does not (and should not) explain the history of that piece of code. Otherwise, comments would soon grow to useless proportions. The history is something one finds on the mailing lists. So, my proposal is to add a commit-time check that makes sure that the commit message contains a URL to the message describing the change. IIUC, such a check shouldn't be hard to implement (Dan?) I try to do that with fixed PRs. When closing one, I usually add a link to the message explaining the fix. The only annoying issue with this proposal is that it forces the committer to fish out the message URL from the mailing lists, so perhaps we could make the check a warning instead of an error. Thoughts? Diego. 
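[Editorial sketch, not part of the original thread.] The commit-time check proposed above could look something like the following. This is only a rough illustration under stated assumptions: that the hook has the log message available as a C string, and that a mailing-list archive link always begins with `http://gcc.gnu.org/ml/`. The function name and the prefix constant are hypothetical; nothing here reflects an actual GCC or Subversion hook.

```c
#include <stdbool.h>
#include <string.h>

/* Assumed prefix of a mailing-list archive URL (hypothetical constant).  */
static const char ML_PREFIX[] = "http://gcc.gnu.org/ml/";

/* Return true if MESSAGE contains a substring that starts with the
   assumed mailing-list URL prefix.  A NULL message counts as having
   no URL.  */
bool
commit_message_has_ml_url (const char *message)
{
  return message != NULL && strstr (message, ML_PREFIX) != NULL;
}
```

A caller could treat a `false` result as a warning rather than a hard failure, in line with the suggestion at the end of the message.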
From kenner@vlsi1.ultra.nyu.edu Mon Dec 3 17:35:00 2007 From: kenner@vlsi1.ultra.nyu.edu (Richard Kenner) Date: Mon, 03 Dec 2007 17:35:00 -0000 Subject: volatile and R/M/W operations In-Reply-To: <47543C9B.60709@adacore.com> References: <2007-11-30-23-19-02+trackit+sam@rfc1149.net> <4751EEDB.6070008@adacore.com> <2007-12-02-09-59-09+trackit+sam@rfc1149.net> <47534ADB.3080502@adacore.com> <10712031323.AA20179@vlsi1.ultra.nyu.edu> <47540E4A.2060903@adacore.com> <10712031605.AA25563@vlsi1.ultra.nyu.edu> <47543C9B.60709@adacore.com> Message-ID: <10712031735.AA29165@vlsi1.ultra.nyu.edu> > > t1 = y | 2; > > y = t1; > > > > are very hard to tell apart at the RTL level. Though it's clear that > > a single instruction might best match the expected semantics of the former, > > it's a lot less clear that it would for the latter. > > I think it would still be OK for the latter, why not? There was certainly a time when it would not, because a R/M/W cycle on a device register meant a different thing than a read followed by a write and the latter is more clearly what the above is supposed to represent. Whether there is still such hardware around is another question, but the point is that whether you or I THINK it would be OK really isn't the issue when talking about legacy code. From kenner@vlsi1.ultra.nyu.edu Mon Dec 3 17:37:00 2007 From: kenner@vlsi1.ultra.nyu.edu (Richard Kenner) Date: Mon, 03 Dec 2007 17:37:00 -0000 Subject: Rant about ChangeLog entries and commit messages In-Reply-To: <47543DE8.3010003@google.com> References: <2007-12-02-11-05-39+trackit+sam@rfc1149.net> <47543DE8.3010003@google.com> Message-ID: <10712031736.AA29226@vlsi1.ultra.nyu.edu> > I'm not sure people will want to drop ChangeLogs anytime soon. I don't > find them all that useful, but I *have* used them extensively when doing > archeology. It gives you the initial thread to pull when finding out > about changes. 
> > What I *do* miss a lot is a an easier way to link from the ChangeLog > entry into the email message explaining the whys and hows of a change. > In this respect, the comment in the code is not enough. The comment > explains what the code does today, it does not (and should not) explain > the history of that piece of code. Otherwise, comments would soon grow > to useless proportions. I agree completely with all of that. From bernds_cb1@t-online.de Mon Dec 3 17:47:00 2007 From: bernds_cb1@t-online.de (Bernd Schmidt) Date: Mon, 03 Dec 2007 17:47:00 -0000 Subject: Rant about ChangeLog entries and commit messages In-Reply-To: <47543DE8.3010003@google.com> References: <2007-12-02-11-05-39+trackit+sam@rfc1149.net> <47543DE8.3010003@google.com> Message-ID: <475440E8.4000109@t-online.de> Diego Novillo wrote: > The history is something one finds on the mailing lists. So, my > proposal is to add a commit-time check that makes sure that the commit > message contains a URL to the message describing the change. IIUC, such > check shouldn't be hard to implement (Dan?) I'd much prefer the text from the mail message be repeated in the commit log. Removes one step of indirection both when writing and reading the log. Of course, that way it's not something that can easily be enforced automatically. Bernd -- This footer brought to you by insane German lawmakers. Analog Devices GmbH Wilhelm-Wagenfeld-Str. 6 80807 Muenchen Sitz der Gesellschaft Muenchen, Registergericht Muenchen HRB 40368 Geschaeftsfuehrer Thomas Wessel, William A. Martin, Margaret Seif From Jan.Sjodin@amd.com Mon Dec 3 17:48:00 2007 From: Jan.Sjodin@amd.com (Sjodin, Jan) Date: Mon, 03 Dec 2007 17:48:00 -0000 Subject: How to reinterpret data in GIMPLE. Message-ID: <9BCA02B0979C2A429C5965B8AE29269818C969@SAUSEXMB2.amd.com> I would like to reinterpret (not convert/cast) a 32-bit integer to a 32-bit float in GIMPLE. Is using a NOP_EXPR with the wanted type the correct way of doing this? 
The reinterpretation of a value is needed to optimize reads and writes to unions. I modified the value numbering pass which worked fine, but changing PRE to use a NOP_EXPR to change the type still resulted in a conversion. It may be because of a bug somewhere else, but before doing more work I would like to make sure that NOP_EXPR is the correct operator. Thanks, Jan From pinskia@gmail.com Mon Dec 3 17:54:00 2007 From: pinskia@gmail.com (Andrew Pinski) Date: Mon, 03 Dec 2007 17:54:00 -0000 Subject: How to reinterpret data in GIMPLE. In-Reply-To: <9BCA02B0979C2A429C5965B8AE29269818C969@SAUSEXMB2.amd.com> References: <9BCA02B0979C2A429C5965B8AE29269818C969@SAUSEXMB2.amd.com> Message-ID: On 12/3/07, Sjodin, Jan wrote: > I would like to reinterpret (not convert/cast) a 32-bit integer to a > 32-bit float in GIMPLE. Is using a NOP_EXPR with the wanted type the > correct way of doing this? You want to use the tree code called VIEW_CONVERT_EXPR. Thanks, Andrew Pinski From iant@google.com Mon Dec 3 17:58:00 2007 From: iant@google.com (Ian Lance Taylor) Date: Mon, 03 Dec 2007 17:58:00 -0000 Subject: Rant about ChangeLog entries and commit messages In-Reply-To: <475440E8.4000109@t-online.de> References: <2007-12-02-11-05-39+trackit+sam@rfc1149.net> <47543DE8.3010003@google.com> <475440E8.4000109@t-online.de> Message-ID: Bernd Schmidt writes: > Diego Novillo wrote: > > The history is something one finds on the mailing lists. So, my > > proposal is to add a commit-time check that makes sure that the commit > > message contains a URL to the message describing the change. IIUC, such > > check shouldn't be hard to implement (Dan?) > > I'd much prefer the text from the mail message be repeated in the commit > log. Removes one step of indirection both when writing and reading the log. > > Of course, that way it's not something that can easily be enforced > automatically. I would find the URL to be very useful, because it links to the discussion. 
Reasonably often the last e-mail message is something like "Does this version look OK?" Ian From rask@sygehus.dk Mon Dec 3 18:13:00 2007 From: rask@sygehus.dk (Rask Ingemann Lambertsen) Date: Mon, 03 Dec 2007 18:13:00 -0000 Subject: Link tests after GCC_NO_EXECUTABLES In-Reply-To: <87d4tn4v94.fsf@firetop.home> References: <474C98AA.50105@t-online.de> <474C9A65.2060902@codesourcery.com> <474C9B33.8060503@t-online.de> <474C9CBD.2070708@codesourcery.com> <87fxyqdc45.fsf@firetop.home> <474D943C.4030106@codesourcery.com> <877ik0aerh.fsf@firetop.home> <20071130022132.GL17368@sygehus.dk> <20071203092630.N17510@dair.pair.com> <87d4tn4v94.fsf@firetop.home> Message-ID: <20071203181315.GF17368@sygehus.dk> On Mon, Dec 03, 2007 at 04:07:35PM +0000, Richard Sandiford wrote: > And I haven't yet looked at why the tests are failing. I was just noting > that they did. It looks from PR21185 that Rask was seeing the same thing > on mipsisa64-elf, and TBH, I was so unsurprised that they were failing that > I hadn't even realised it was _supposed_ to work now. I'll have a prod. mips-core: 1 byte read to unmapped address 0x0 at 0xffffffff80021468 program stopped with signal 10. There are 8456 such messages at slightly different addresses out of a total of 8488 failures. And generally, on the targets with problems, the problems seem to be in the simulation part (dejagnu, the simulator itself or the linker script) rather than in GCC. For example, the special linker script used for SPARC testing needs to be updated to handle .e.g. .rodata.* sections so they don't collide with the .bss section. 
-- Rask Ingemann Lambertsen Danish law requires addresses in e-mail to be logged and stored for a year From dnovillo@google.com Mon Dec 3 18:20:00 2007 From: dnovillo@google.com (Diego Novillo) Date: Mon, 03 Dec 2007 18:20:00 -0000 Subject: Rant about ChangeLog entries and commit messages In-Reply-To: <475440E8.4000109@t-online.de> References: <2007-12-02-11-05-39+trackit+sam@rfc1149.net> <47543DE8.3010003@google.com> <475440E8.4000109@t-online.de> Message-ID: <475448BC.2020103@google.com> Bernd Schmidt wrote: > I'd much prefer the text from the mail message be repeated in the commit > log. Removes one step of indirection both when writing and reading the log. I guess that could work, but that wouldn't give a way into the history for the change. Several times there is a post-mortem discussion on the patch, leading to more patches. Diego. From kenner@vlsi1.ultra.nyu.edu Mon Dec 3 18:49:00 2007 From: kenner@vlsi1.ultra.nyu.edu (Richard Kenner) Date: Mon, 03 Dec 2007 18:49:00 -0000 Subject: Rant about ChangeLog entries and commit messages In-Reply-To: <475440E8.4000109@t-online.de> References: <2007-12-02-11-05-39+trackit+sam@rfc1149.net> <47543DE8.3010003@google.com> <475440E8.4000109@t-online.de> Message-ID: <10712031849.AA00555@vlsi1.ultra.nyu.edu> > I'd much prefer the text from the mail message be repeated in the commit > log. Removes one step of indirection both when writing and reading the log. I would as well. Especially if you're trying to scan a large part of the log looking for something. 
From kenner@vlsi1.ultra.nyu.edu Mon Dec 3 18:51:00 2007 From: kenner@vlsi1.ultra.nyu.edu (Richard Kenner) Date: Mon, 03 Dec 2007 18:51:00 -0000 Subject: Rant about ChangeLog entries and commit messages In-Reply-To: <475448BC.2020103@google.com> References: <2007-12-02-11-05-39+trackit+sam@rfc1149.net> <47543DE8.3010003@google.com> <475440E8.4000109@t-online.de> <475448BC.2020103@google.com> Message-ID: <10712031850.AA00730@vlsi1.ultra.nyu.edu> > I guess that could work, but that wouldn't give a way into the history > for the change. Several times there is a post-mortem discussion on the > patch, leading to more patches. How about both? From dnovillo@google.com Mon Dec 3 18:58:00 2007 From: dnovillo@google.com (Diego Novillo) Date: Mon, 03 Dec 2007 18:58:00 -0000 Subject: Rant about ChangeLog entries and commit messages In-Reply-To: <10712031850.AA00730@vlsi1.ultra.nyu.edu> References: <2007-12-02-11-05-39+trackit+sam@rfc1149.net> <47543DE8.3010003@google.com> <475440E8.4000109@t-online.de> <475448BC.2020103@google.com> <10712031850.AA00730@vlsi1.ultra.nyu.edu> Message-ID: <475451C5.1080204@google.com> On 12/03/07 13:50, Richard Kenner wrote: >> I guess that could work, but that wouldn't give a way into the history >> for the change. Several times there is a post-mortem discussion on the >> patch, leading to more patches. > > How about both? Sure. Diego. 
From tejgcc@westnet.com.au Mon Dec 3 20:08:00 2007 From: tejgcc@westnet.com.au (Tim Josling) Date: Mon, 03 Dec 2007 20:08:00 -0000 Subject: Rant about ChangeLog entries and commit messages In-Reply-To: <475451C5.1080204@google.com> References: <2007-12-02-11-05-39+trackit+sam@rfc1149.net> <47543DE8.3010003@google.com> <475440E8.4000109@t-online.de> <475448BC.2020103@google.com> <10712031850.AA00730@vlsi1.ultra.nyu.edu> <475451C5.1080204@google.com> Message-ID: <1196712459.6257.23.camel@tim-gcc> On Mon, 2007-12-03 at 13:58 -0500, Diego Novillo wrote: > On 12/03/07 13:50, Richard Kenner wrote: > >> I guess that could work, but that wouldn't give a way into the history > >> for the change. Several times there is a post-mortem discussion on the > >> patch, leading to more patches. > > > > How about both? > > Sure. > > > Diego. Quite a few people are worried about verbose descriptions of changes cluttering up the ChangeLog. Others (like me) would like a way easily to find the discussions about the change, and would like a brief indication in the ChangeLog of the context of the change. The FSF also has good reasons for keeping solid records of who made what change. So, how about this: 1. For a PR fix, continue to record the PR number and category. Like this: PR tree-optimization/32694 2. For all changes, a one-line record giving the context, plus the URL of a key message in the email message trail, unless the intent is plainly obvious such as bumping the version number. Like this: Gimplification of Fortran front end. http://gcc.gnu.org/ml/gcc-patches/2007-12/msg00072.html 3. Continue to record "who made what change". Like this: * config/xtensa/xtensa.c (xtensa_expand_prologue): Put a REG_FRAME_RELATED_EXPR note on the last insn that sets up the stack pointer or frame pointer. This should satisfy everyone's needs. This would by no means be the largest divergence from the FSF standards by the GCC project. 
The use of languages other than C in the Ada front end is non-compliant by my reading. The compliance of the rest of the code to the FSF standards is spotty at times, e.g., the garbage collection code. While this is a divergence from the FSF standards, it is a positive change and no information is being lost. It would be interesting to ask someone who was around at the time why the guidelines were written as they were. The rationale may no longer be relevant. Tim Josling From dewar@adacore.com Mon Dec 3 20:44:00 2007 From: dewar@adacore.com (Robert Dewar) Date: Mon, 03 Dec 2007 20:44:00 -0000 Subject: volatile and R/M/W operations In-Reply-To: <10712031735.AA29165@vlsi1.ultra.nyu.edu> References: <2007-11-30-23-19-02+trackit+sam@rfc1149.net> <4751EEDB.6070008@adacore.com> <2007-12-02-09-59-09+trackit+sam@rfc1149.net> <47534ADB.3080502@adacore.com> <10712031323.AA20179@vlsi1.ultra.nyu.edu> <47540E4A.2060903@adacore.com> <10712031605.AA25563@vlsi1.ultra.nyu.edu> <47543C9B.60709@adacore.com> <10712031735.AA29165@vlsi1.ultra.nyu.edu> Message-ID: <47546A8F.1030607@adacore.com> Richard Kenner wrote: >>> t1 = y | 2; >>> y = t1; >>> >>> are very hard to tell apart at the RTL level. Though it's clear that >>> a single instruction might best match the expected semantics of the former, >>> it's a lot less clear that it would for the latter. >> I think it would still be OK for the latter, why not? > > There was certainly a time when it would not, because a R/M/W cycle on > a device register meant a different thing than a read followed by a write > and the latter is more clearly what the above is supposed to represent. > > Whether there is still such hardware around is another question, but > the point is that whether you or I THINK it would be OK really isn't > the issue when talking about legacy code. What is interesting is whether this translation is appropriate for modern Intel architecture chips, and as far as I know the answer is yes. 
From gccadmin@gcc.gnu.org Mon Dec 3 22:41:00 2007 From: gccadmin@gcc.gnu.org (gccadmin@gcc.gnu.org) Date: Mon, 03 Dec 2007 22:41:00 -0000 Subject: gcc-4.1-20071203 is now available Message-ID: <20071203224122.27936.qmail@sourceware.org> Snapshot gcc-4.1-20071203 is now available on ftp://gcc.gnu.org/pub/gcc/snapshots/4.1-20071203/ and on various mirrors, see http://gcc.gnu.org/mirrors.html for details. This snapshot has been generated from the GCC 4.1 SVN branch with the following options: svn://gcc.gnu.org/svn/gcc/branches/gcc-4_1-branch revision 130587 You'll find: gcc-4.1-20071203.tar.bz2 Complete GCC (includes all of below) gcc-core-4.1-20071203.tar.bz2 C front end and core compiler gcc-ada-4.1-20071203.tar.bz2 Ada front end and runtime gcc-fortran-4.1-20071203.tar.bz2 Fortran front end and runtime gcc-g++-4.1-20071203.tar.bz2 C++ front end and runtime gcc-java-4.1-20071203.tar.bz2 Java front end and runtime gcc-objc-4.1-20071203.tar.bz2 Objective-C front end and runtime gcc-testsuite-4.1-20071203.tar.bz2 The GCC testsuite Diffs from 4.1-20071126 are available in the diffs/ subdirectory. When a particular snapshot is ready for public consumption the LATEST-4.1 link is updated and a message is sent to the gcc list. Please do not use a snapshot before it has been announced that way. From tschwinge@gnu.org Mon Dec 3 22:49:00 2007 From: tschwinge@gnu.org (Thomas Schwinge) Date: Mon, 03 Dec 2007 22:49:00 -0000 Subject: Issue with fixincludes (?) and `limits.h' Message-ID: <20071203224854.GS16318@fencepost.gnu.org> Hello! What is the reason for GCC (trunk version) installing the header file as `PREFIX/lib/gcc/*/*/include-fixed/limits.h' instead of putting it into `PREFIX/lib/gcc/*/*/include/', which is what gcc-4_2-branch and earlier have been doing? This leads to a problem as follows. You're about to bootstrap a cross compiler from only source code. You build cross binutils. You build a minimal bootstrapping GCC (``--with-sysroot=[...]
--disable-shared --disable-threads --without-headers --enable-languages=c''; ``make all-gcc install-gcc all-target-libgcc install-target-libgcc''). Then you attempt to bootstrap glibc, which will eventually fail like this: #v+ i586-pc-gnu-gcc [...] -I[glibc internal] -nostdinc -isystem [GCC target]/4.3.0/include -isystem [sysroot]/include [glibc stuff] In file included from ../sysdeps/unix/bsd/bsd4.4/bits/socket.h:31, from [...] ../include/limits.h:125:26: error: limits.h: No such file or directory #v- Is this a GCC issue or should the glibc build system be adding a ``-isystem [GCC target]/4.3.0/include-fixed''? #v+ $ find lib/gcc/*/*/ -name \*.h | sort lib/gcc/i586-pc-gnu/4.3.0/include-fixed/limits.h lib/gcc/i586-pc-gnu/4.3.0/include-fixed/syslimits.h lib/gcc/i586-pc-gnu/4.3.0/include/ammintrin.h lib/gcc/i586-pc-gnu/4.3.0/include/bmmintrin.h lib/gcc/i586-pc-gnu/4.3.0/include/cpuid.h [...] lib/gcc/i586-pc-gnu/4.3.0/include/unwind.h lib/gcc/i586-pc-gnu/4.3.0/include/varargs.h lib/gcc/i586-pc-gnu/4.3.0/include/xmmintrin.h lib/gcc/i586-pc-gnu/4.3.0/install-tools/gsyslimits.h lib/gcc/i586-pc-gnu/4.3.0/install-tools/include/limits.h #v- Note that this can also be reproduced with a pseudo GNU/Linux to GNU/Linux ``cross'' compiler, e.g., from `i686-pc-linux-gnu' to `i586-pc-linux-gnu': #v+ $ ../trunk-work/configure --target=i586-pc-linux-gnu --prefix=$(pwd).install --disable-nls --disable-shared --disable-threads --enable-languages=c --with-arch=i586 [...] $ make all-gcc install-gcc [...] 
$ find $(pwd).install/lib/gcc/*/*/ -name \*.h | sort /home/thomas/tmp/source/gcc/trunk-work.build.install/lib/gcc/i586-pc-linux-gnu/4.3.0/include-fixed/limits.h /home/thomas/tmp/source/gcc/trunk-work.build.install/lib/gcc/i586-pc-linux-gnu/4.3.0/include-fixed/syslimits.h /home/thomas/tmp/source/gcc/trunk-work.build.install/lib/gcc/i586-pc-linux-gnu/4.3.0/include/ammintrin.h /home/thomas/tmp/source/gcc/trunk-work.build.install/lib/gcc/i586-pc-linux-gnu/4.3.0/include/bmmintrin.h /home/thomas/tmp/source/gcc/trunk-work.build.install/lib/gcc/i586-pc-linux-gnu/4.3.0/include/cpuid.h [...] /home/thomas/tmp/source/gcc/trunk-work.build.install/lib/gcc/i586-pc-linux-gnu/4.3.0/include/unwind.h /home/thomas/tmp/source/gcc/trunk-work.build.install/lib/gcc/i586-pc-linux-gnu/4.3.0/include/varargs.h /home/thomas/tmp/source/gcc/trunk-work.build.install/lib/gcc/i586-pc-linux-gnu/4.3.0/include/xmmintrin.h /home/thomas/tmp/source/gcc/trunk-work.build.install/lib/gcc/i586-pc-linux-gnu/4.3.0/install-tools/gsyslimits.h /home/thomas/tmp/source/gcc/trunk-work.build.install/lib/gcc/i586-pc-linux-gnu/4.3.0/install-tools/include/limits.h #v- Why is `limits.h' put into `include-fixed/'? And what about this `install-tools' directory which duplicates some of the files? Regards, Thomas From iant@google.com Mon Dec 3 23:19:00 2007 From: iant@google.com (Ian Lance Taylor) Date: Mon, 03 Dec 2007 23:19:00 -0000 Subject: Issue with fixincludes (?) and `limits.h' In-Reply-To: <20071203224854.GS16318@fencepost.gnu.org> References: <20071203224854.GS16318@fencepost.gnu.org> Message-ID: Thomas Schwinge writes: > Is this a GCC issue or should the glibc build system be adding a > ``-isystem [GCC target]/4.3.0/include-fixed''? The latter. 
http://sourceware.org/ml/libc-alpha/2007-03/msg00017.html Ian From tschwinge@gnu.org Mon Dec 3 23:47:00 2007 From: tschwinge@gnu.org (Thomas Schwinge) Date: Mon, 03 Dec 2007 23:47:00 -0000 Subject: Issue with fixincludes (?) and `limits.h' In-Reply-To: References: <20071203224854.GS16318@fencepost.gnu.org> Message-ID: <20071203234733.GT16318@fencepost.gnu.org> Hello! On Mon, Dec 03, 2007 at 03:18:42PM -0800, Ian Lance Taylor wrote: > Thomas Schwinge writes: > > Is this a GCC issue or should the glibc build system be adding a > > ``-isystem [GCC target]/4.3.0/include-fixed''? > > The latter. > > http://sourceware.org/ml/libc-alpha/2007-03/msg00017.html I will ping the glibc crew and put your patch into the glibc bugzilla, for CVS HEAD and glibc-2_7-branch, OK? Regards, Thomas -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 191 bytes Desc: Digital signature URL: From iant@google.com Tue Dec 4 01:05:00 2007 From: iant@google.com (Ian Lance Taylor) Date: Tue, 04 Dec 2007 01:05:00 -0000 Subject: Issue with fixincludes (?) and `limits.h' In-Reply-To: <20071203234733.GT16318@fencepost.gnu.org> References: <20071203224854.GS16318@fencepost.gnu.org> <20071203234733.GT16318@fencepost.gnu.org> Message-ID: Thomas Schwinge writes: > Hello! > > On Mon, Dec 03, 2007 at 03:18:42PM -0800, Ian Lance Taylor wrote: > > Thomas Schwinge writes: > > > Is this a GCC issue or should the glibc build system be adding a > > > ``-isystem [GCC target]/4.3.0/include-fixed''? > > > > The latter. > > > > http://sourceware.org/ml/libc-alpha/2007-03/msg00017.html > > I will ping the glibc crew and put your patch into the glibc bugzilla, > for CVS HEAD and glibc-2_7-branch, OK? It's not my patch, but that sounds like a good plan. Ian From tschwinge@gnu.org Tue Dec 4 01:21:00 2007 From: tschwinge@gnu.org (Thomas Schwinge) Date: Tue, 04 Dec 2007 01:21:00 -0000 Subject: Issue with fixincludes (?) 
and `limits.h' In-Reply-To: References: <20071203224854.GS16318@fencepost.gnu.org> <20071203234733.GT16318@fencepost.gnu.org> Message-ID: <20071204012136.GA19089@fencepost.gnu.org> Hello! On Mon, Dec 03, 2007 at 05:05:10PM -0800, Ian Lance Taylor wrote: > Thomas Schwinge writes: > > On Mon, Dec 03, 2007 at 03:18:42PM -0800, Ian Lance Taylor wrote: > > > Thomas Schwinge writes: > > > > Is this a GCC issue or should the glibc build system be adding a > > > > ``-isystem [GCC target]/4.3.0/include-fixed''? > > > > > > The latter. > > > > > > http://sourceware.org/ml/libc-alpha/2007-03/msg00017.html > > > > I will ping the glibc crew and put your patch into the glibc bugzilla, > > for CVS HEAD and glibc-2_7-branch, OK? > > It's not my patch Indeed. Sorry. > but that sounds like a good plan. Regards, Thomas -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 191 bytes Desc: Digital signature URL: From zhrgcc@gmail.com Tue Dec 4 01:56:00 2007 From: zhrgcc@gmail.com (r z) Date: Tue, 04 Dec 2007 01:56:00 -0000 Subject: problems when move one pseudo reg to another (need help) Message-ID: Hello! Excuse me, but something is bothering me. I inserted a function into rest_of_compilation() (before the call to rest_of_handle_life()) to insert some new insns before the insns are translated to asm. But after my modification, when processing an insn that moves one pseudo reg to another pseudo reg (created by me), reg_overlap_mentioned_for_reload_p() calls abort() at line 6341: if (regno >= FIRST_PSEUDO_REGISTER) { if (reg_equiv_memory_loc[regno]) return refers_to_mem_for_reload_p (in); else if (reg_equiv_constant[regno]) return 0; abort (); } By the way, I am using the source code of gcc 3.4.4. Could anyone give some help? I am new here. Thanks. (I will be back a day later, after school.) Appendix: 1. The insn with the problem: (insn 59 24 60 0 (set (reg:SI 74) (reg:SI 71)) -1 (nil) (nil)) 2.
gdb backtrace after calling abort() (gdb) backtrace #0 fancy_abort (file=0x839ae65 "reload.c", line=6334, function=0x8383600 "reg_overlap_mentioned_for_reload_p") at diagnostic.c:584 #1 0x0823d648 in reg_overlap_mentioned_for_reload_p (x= 0xb7cd15d0, in=0xb7cd15c0) at reload.c:6334 #2 0x0824917f in find_reloads (insn=0xb7ed6730, replace=0, ind_levels=0, live_known=0, reload_reg_p=0x84081c0) at reload.c:1721 #3 0x08254514 in reload (first=0xb7ed3240, global=0) at reload1.c:1459 #4 0x08274f58 in rest_of_handle_old_regalloc (decl=0xb7cabf30, insns=0xb7ed3240) at ./toplev.c:2295 #5 0x082765df in rest_of_compilation (decl=0xb7cabf30) at ./toplev.c:3457 #6 0x082b50e5 in tree_rest_of_compilation (fndecl=0xb7cabf30, nested_p=false) at tree-optimize.c:168 #7 0x08059230 in c_expand_body_1 (fndecl=0xb7cabf30, nested_p=74) at c-decl.c:6189 #8 0x082b6ad9 in cgraph_expand_function (node=0xb7c8d438) at cgraphunit.c:538 #9 0x082b7aac in cgraph_assemble_pending_functions () at cgraphunit.c:144 #10 0x082b8380 in cgraph_finalize_function (decl=0xb7cabf30, nested=false) at cgraphunit.c:225 #11 0x0805f345 in finish_function () at c-decl.c:6146 #12 0x0804c76b in yyparse () at c-parse.y:385 #13 0x0804f57b in c_parse_file () at c-parse.y:3029 #14 0x0807e297 in c_common_parse_file (set_yydebug=0) at c-opts.c:1249 #15 0x08273b0e in toplev_main (argc=3, argv=0xbfc0b9f4) at ./toplev.c:1833 #16 0x0809e75e in main (argc=Cannot access memory at address 0x4a ) at main.c:35 (gdb) From njn@csse.unimelb.edu.au Tue Dec 4 04:03:00 2007 From: njn@csse.unimelb.edu.au (Nicholas Nethercote) Date: Tue, 04 Dec 2007 04:03:00 -0000 Subject: Rant about ChangeLog entries and commit messages In-Reply-To: References: <2007-12-02-11-05-39+trackit+sam@rfc1149.net.suse.lists.egcs> Message-ID: On Mon, 3 Dec 2007, Andi Kleen wrote: >> Commit logs are basically invisible; > > That's just a (fixable) problem in your coding setup. 
In other > projects it is very common to use tools like cvs annotate / cvsps / > git blame / git log / etc. to find the reasons for why code is the way > it is. In fact in several editors these can be functions on hot > keys. Programming is hard enough as is without ignoring such valuable > information sources. Don't do it. I didn't say you cannot or should not use these tools. But a good comment on a piece of code sure beats a good commit message, which must be looked at separately, and can be fragmented over multiple commits, etc. Nick From kiesling@earthlink.net Tue Dec 4 10:19:00 2007 From: kiesling@earthlink.net (Robert Kiesling) Date: Tue, 04 Dec 2007 10:19:00 -0000 Subject: Rant about ChangeLog entries and commit messages In-Reply-To: Message-ID: > Nicholas Nethercote writes: > > > Commit logs are basically invisible; > > That's just a (fixable) problem in your coding setup. In other > projects it is very common to use tools like cvs annotate / cvsps / > git blame / git log / etc. to find the reasons for why code is the way > it is. In fact in several editors these can be functions on hot > keys. Programming is hard enough as is without ignoring such valuable > information sources. Don't do it. Unless there's a good reason _not_ to derive from a source, and there are several, most of which require a clean-room approach, or a simulation of it. I'm just now starting to move over to Subversion, and I'm sure it has the same ability, to publish CVS logs, though not via CVS itself. C-x v u :) -- Ctalk Home Page: http://www.ctalklang.org/
From kenner@vlsi1.ultra.nyu.edu Tue Dec 4 13:05:00 2007 From: kenner@vlsi1.ultra.nyu.edu (Richard Kenner) Date: Tue, 04 Dec 2007 13:05:00 -0000 Subject: Rant about ChangeLog entries and commit messages In-Reply-To: References: <2007-12-02-11-05-39+trackit+sam@rfc1149.net.suse.lists.egcs> Message-ID: <10712041305.AA18081@vlsi1.ultra.nyu.edu> > I didn't say you cannot or should not use these tools. But a good comment > on a piece of code sure beats a good commit message, which must be looked at > separately, and can be fragmented over multiple commits, etc. I don't see one as "beating" the other because they have very different purposes. Sometimes you need one and sometimes you need the other. The purpose of COMMENTS is to help somebody understand the code as it stands at some point in time. In most cases, that means saying WHAT the code does and WHY (at some level) it does what it does. Once in a while, it also means saying why it DOESN'T do something, for example, if it might appear that there's a simpler way of doing what the code is doing now but it doesn't work for some subtle reason. But it's NOT appropriate to put into comments the historical remark that this code used to have a typo which caused a miscompilation at some specific place. However, the commit log IS the place for that sort of note. My view is that, in general, the comments are usually the most appropriate place to put information about how the code currently works and the commit log is generally the best place for information that contrasts how the code currently works with how it used to work and provides the motivation for making the change. But there are exceptions to both of those generalizations.
From rask@sygehus.dk Tue Dec 4 14:46:00 2007 From: rask@sygehus.dk (Rask Ingemann Lambertsen) Date: Tue, 04 Dec 2007 14:46:00 -0000 Subject: Link tests after GCC_NO_EXECUTABLES In-Reply-To: <20071201223447.GU17368@sygehus.dk> References: <20071128210420.GH17368@sygehus.dk> <474DF7E4.6050308@codesourcery.com> <20071130181424.GO17368@sygehus.dk> <4750559E.2090800@codesourcery.com> <20071130211005.GQ17368@sygehus.dk> <87d4tqu4nv.fsf@firetop.home> <20071201115252.GS17368@sygehus.dk> <20071201120251.GT17368@sygehus.dk> <20071201223447.GU17368@sygehus.dk> Message-ID: <20071204144622.GG17368@sygehus.dk> On Sat, Dec 01, 2007 at 11:34:47PM +0100, Rask Ingemann Lambertsen wrote: > Index: configure.ac > =================================================================== > --- configure.ac (revision 130442) > +++ configure.ac (working copy) > AC_SUBST(CONFIGURE_GDB_TK) > AC_SUBST(GDB_TK) > AC_SUBST(INSTALL_GDB_TK) > +AC_SUBST(with_newlib) > > # Build module lists & subconfigure args. > AC_SUBST(build_configargs) That hunk is corrupt, it should look like this: Index: configure.ac =================================================================== --- configure.ac (revision 130442) +++ configure.ac (working copy) @@ -2435,6 +2435,7 @@ AC_SUBST(CONFIGURE_GDB_TK) AC_SUBST(GDB_TK) AC_SUBST(INSTALL_GDB_TK) +AC_SUBST(with_newlib) # Build module lists & subconfigure args. AC_SUBST(build_configargs) -- Rask Ingemann Lambertsen Danish law requires addresses in e-mail to be logged and stored for a year
From tromey@redhat.com Tue Dec 4 16:45:00 2007 From: tromey@redhat.com (Tom Tromey) Date: Tue, 04 Dec 2007 16:45:00 -0000 Subject: Rant about ChangeLog entries and commit messages In-Reply-To: <47543DE8.3010003@google.com> (Diego Novillo's message of "Mon\, 03 Dec 2007 12\:33\:28 -0500") References: <2007-12-02-11-05-39+trackit+sam@rfc1149.net> <47543DE8.3010003@google.com> Message-ID: >>>>> "Diego" == Diego Novillo writes: Diego> I'm not sure people will want to drop ChangeLogs anytime soon. I Diego> don't find them all that useful, but I *have* used them extensively Diego> when doing archeology. It gives you the initial thread to pull when Diego> finding out about changes. Yeah. They aren't incredibly useful, but they aren't useless, either. One thing they give you that 'svn annotate' does not is a record of when things were deleted. I use them this way on occasion. I have a few concerns about a change in this area. First, continuing to have good quality messages. Right now at the very least you get a (semi-) accurate record of what was touched. I've seen plenty of ChangeLog-less projects out there that end up with commits like "fixed a bug", or even worse. I suppose we'll need to review the commit messages just like we review ChangeLog entries now. That doesn't sound fun, but I suppose it won't be too much work. Second, whether it makes the process heavier: Diego> The only annoying issue with this proposal is that it forces the Diego> committer to fish out the message URL from the mailing lists, so Diego> perhaps we could make the check a warning instead of an error. This would be a big pain. Perhaps it wouldn't be so bad if the mail server added a URL to the header somewhere, so I could wait for the mail to come back and then look it up. The obvious alternative of hitting reload on the gcc web page is unattractive. Also it seems to me that this will make it a bit harder for developers without write access to get their patches checked in ...
because it will mean even more work for whoever does the commit. Tom From tdineen@ix.netcom.com Tue Dec 4 19:02:00 2007 From: tdineen@ix.netcom.com (Thomas Dineen) Date: Tue, 04 Dec 2007 19:02:00 -0000 Subject: Serious Bugs In Gcc Builds Message-ID: <4755A3B9.9060601@ix.netcom.com> Gentle People: I am writing to you today to document several serious build bugs in GCC releases gcc-3.4.6, gcc-4.0.4, and gcc-4.1.2. To be honest I have wasted several days of work on reflector interaction and attempts to work around these issues all to no avail! I have been unable to build a usable gcc on my Solaris 8 Sparc System! By the way don't bother flaming back at me I am way beyond this and thus impervious! Hopefully by fixing the issues documented below you will open up the GCC software to be usable by a larger audience of users. Issue 1) The configure and build scripts insist (read that fight to the death) on using the solaris linker (/usr/ccs/bin/ld) despite every effort to the contrary, and this of course causes errors. The following were my failed efforts to redirect it to the Gnu Linker (/opt/sfw/bin/gld): a) Providing explicit command line direction to configure. /export/home/tools/gcc/gcc-4.0.4/configure --with-gnu-ld --with-gnu-as --with-as=/opt/sfw/bin/as --with-ld=/opt/sfw/bin/ld b) Provide links in the object directory attempting to redirect it: ln -s /opt/sfw/bin/gld ld c) Provide links in the executable directory attempting to redirect it: ln -s gld ld d) Reordering the path so that the Gnu Tools would appear first. e) Removing the Solaris Linker from the path. Issue 2) Same as Issue 1 except for the Gnu assembler (/opt/sfw/bin/gas). Issue 3) After unzipping and untarring release gcc-4.1.2 I changed the owner (chown) and file mode (chmod) to values compatible with my environment. This caused build errors with make complaining of files being touched or changes which required a call to makeinfo, and a further complaint that makeinfo was missing. 
A subsequent test of makeinfo --version in the same shell as the attempted build indicated that makeinfo was present. Issue 4) What's In A Name? Or what the hell should we name it? When I download and install various releases of GNU Bintools a tool like GNU Make is sometimes called gmake and sometimes called make. This causes confusion and thus errors in that the Gcc build scripts use make. I would suggest standardizing on the names to prevent confusion. To this end I would suggest that GNU Make always, always, always be called gmake and when you want to use GNU Make in your project that you type gmake. Issue 5) The build process is way too complicated for the average user to negotiate successfully. The user interface should be simplified to the following for a native compiler: ./configure gmake gmake install - A listing of the commands used in the various build attempts: ; ; Gcc Build gcc-3.4.6 ; ; Use gmake gls, gas - required! ; Using csh ; Changed $path in /root/.cshrc to put /opt/sfw/bin first to pick up ; GNU Tools first. cd /export/home/tools/gcc gunzip gcc-3.4.6.tar.gz tar -xvif gcc-3.4.6.tar ; -i -> Ignore directory checksum errors. mkdir gcc-3.4.6-obj chmod 777 gcc-3.4.6-obj cd gcc-3.4.6-obj ln -s /opt/sfw/bin/gmake make ln -s /opt/sfw/bin/gld ld ln -s /opt/sfw/bin/gas as /export/home/tools/gcc/gcc-3.4.6/configure --with-gnu-ld --with-gnu-as --with-as=/opt/sfw/bin/as --with-ld=/opt/sfw/bin/ld gmake DESTDIR=/export/home/tools/gcc/gcc-3.4.6-bin install ; ; Gcc Build gcc-4.0.4 ; ; Use gmake gls, gas - required! ; Using csh cd /export/home/tools/gcc gunzip gcc-4.0.4.tar.gz tar -xvif gcc-4.0.4.tar ; -i -> Ignore directory checksum errors.
cd gcc-4.0.4-obj ln -s /opt/sfw/bin/gmake make ln -s /opt/sfw/bin/gld ld ln -s /opt/sfw/bin/gas as /export/home/tools/gcc/gcc-4.0.4/configure --with-gnu-ld --with-gnu-as --with-as=/opt/sfw/bin --with-ld=/opt/sfw/bin gmake DESTDIR=/export/home/tools/gcc/gcc-4.0.4-bin install ; ; Gcc Build gcc-4.1.2 ; ; Use gmake gls, gas - required! ; Using csh cd gcc-4.1.2-obj ln -s /opt/sfw/bin/gmake make ln -s /opt/sfw/bin/gld ld ln -s /opt/sfw/bin/gas as /export/home/tools/gcc/gcc-4.1.2/configure --with-gnu-ld --with-gnu-as --with-as=/opt/sfw/bin --with-ld=/opt/sfw/bin gmake DESTDIR=/export/home/tools/gcc/gcc-4.1.2-bin install From vagabon.xyz@gmail.com Tue Dec 4 20:44:00 2007 From: vagabon.xyz@gmail.com (Franck Bui-Huu) Date: Tue, 04 Dec 2007 20:44:00 -0000 Subject: Clarification on section variable attribute usage Message-ID: <4755BC01.3000303@gmail.com> Hi, Since at least 3.4, the GCC manual says: Use the `section' attribute with an _initialized_ definition of a _global_ variable, as shown in the example. GCC issues a warning and otherwise ignores the `section' attribute in uninitialized variable declarations. but this doesn't seem correct. For example the Linux kernel creates several data sections mainly for parking data which are only used during boot time and freed at runtime. Taken from the kernel source code (drivers/acpi/tables.c), this is how such variable is stated: static int acpi_apic_instance __attribute__ ((__section__ (".init.data"))); When compiling the driver, no warning is issued _and_ the section attribute is not ignored. So either the documentation is wrong or the compiler is misbehaving. Could anybody clarify this point ? 
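The observation above can be reduced to a small test case. What follows is only a minimal sketch and not part of the original message: the section name `.demo.init.data` and the function name are made up for illustration, and an ELF target is assumed. It defines an uninitialized static variable with a `section' attribute, the same shape as the kernel declaration quoted above.

```c
#include <assert.h>

/* Hypothetical section name, used only for this illustration.
   Assumes an ELF target. */
#define DEMO_SECTION ".demo.init.data"

/* Uninitialized definition carrying a section attribute, the same
   shape as the acpi_apic_instance declaration quoted above. */
static int boot_only_counter __attribute__ ((__section__ (DEMO_SECTION)));

/* Objects with static storage duration start out as zero, so the
   counter begins at 0 no matter which section it ends up in. */
int
bump_boot_only_counter (int step)
{
  assert (step > 0);
  boot_only_counter += step;
  return boot_only_counter;
}
```

Compiling this with warnings enabled and then listing the object file's symbol table (for instance with `objdump -t') is one way to check whether a warning is issued and which section the variable actually lands in.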
Thanks, Franck From law@redhat.com Tue Dec 4 22:10:00 2007 From: law@redhat.com (Jeffrey Law) Date: Tue, 04 Dec 2007 22:10:00 -0000 Subject: Rant about ChangeLog entries and commit messages In-Reply-To: <10712031329.AA20246@vlsi1.ultra.nyu.edu> References: <2007-12-02-11-05-39+trackit+sam@rfc1149.net> <200712022136.57819.ebotcazou@libertysurf.fr> <4aca3dc20712021240k19f3eae5j66453276179c401a@mail.gmail.com> <200712022355.23871.ebotcazou@libertysurf.fr> <4aca3dc20712021621n39a036d2u21f471f231dfffe@mail.gmail.com> <10712031329.AA20246@vlsi1.ultra.nyu.edu> Message-ID: <1196805952.11808.20.camel@omfg.slc.redhat.com> On Mon, 2007-12-03 at 08:29 -0500, Richard Kenner wrote: > > Sorry, but again, this is not a good enough justification to me. > > We do a lot of things different than "The GNU Project". > > So do plenty of parts of the "official GNU project". > > They use different coding standards, bug tracking systems, version > > control systems, checkin policies, etc, than each other. > > Yes, but none of those are visible other than to the development community. > People who obtain the source distributions of projects don't get to see > those things. They DO see things like the ChangeLog format and coding > and documentation conventions and THOSE are the things that need to be > common among GNU projects. > > In my view, ChangeLog is mostly "write-only" from a developer's > perspective. It's a document that the GNU project requires us to produce > for the benefit of people who DON'T want access to our checkin-logs, bug > tracking information, and mailing lists. But for our own development > purposes, we use the above information much more than ChangeLog. Right. I don't necessarily want verbose ChangeLogs -- there are times I just want to know what changed and who changed it. That's nice and easy to extract from the ChangeLog. Sometimes I want to look at the code/comments. Obviously I go to the source to read those.
Sometimes I want even more information for a particularly complex or controversial change -- in those cases I go back to the mailing list archives and review the discussion(s) leading to changes to the code. Each repository of information provides a different level of detail and each (IMHO) has its place/utility. Jeff From Paul.Zimmerman@synopsys.com Wed Dec 5 00:55:00 2007 From: Paul.Zimmerman@synopsys.com (Paul Zimmerman) Date: Wed, 05 Dec 2007 00:55:00 -0000 Subject: Bad offset to struct member in generated code Message-ID: <76283BCB2FE6D9479105D17FAD968EE803005A8E@US01WEMBX3.internal.synopsys.com> Hi, I have a problem while porting gcc to a custom processor. Here is a simplified example: struct big_struct { ... int a; char b; char c; ... } testme; char char_tmp; int int_tmp; int testit(void) { char_tmp = testme.c; int_tmp = testme.a; } For the first line of code in testit(), gcc will load the address of 'testme.c' into a register, then do an indirect 8-bit load with an offset of 0 to fetch the value of 'testme.c'. This is fine. For the second line, gcc will reuse the same address register, and generate an indirect 32-bit load with an offset of -5 to fetch the value of 'testme.a'. But on our custom processor, the offsets of indirect load instructions are scaled by the size of the data item. So the assembly instruction generated for the second line is something like "ld32 %r1, %r0, -5/4", which can not be translated by the assembler. So when accessing a data structure, we need gcc to always use a base address that is a multiple of 4, to prevent this from happening. So how do I tell gcc about this limitation of our architecture? Do I do this somehow using the REG_MODE_OK_FOR_BASE_P or GO_IF_MODE_DEPENDENT_ADDRESS macros in our architecture's .h file? I am using the gcc 4.0.2 sources if that matters. 
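The arithmetic behind the bad offset can be sketched in plain C. This is an illustration only and not code from the port: the struct is a cut-down stand-in with the elided members left out, it assumes a 4-byte `int', and `offset_is_encodable' is a hypothetical helper that merely states the scaling rule described above (an offset is usable only if it is a multiple of the access size).

```c
#include <assert.h>
#include <stddef.h>

/* Cut-down stand-in for the struct in the example; the members
   elided there with "..." are simply left out here. */
struct big_struct
{
  int a;
  char b;
  char c;
};

/* Byte displacement needed to reach member a from the address of
   member c.  With a 4-byte int this is 0 - 5 = -5. */
long
displacement_c_to_a (void)
{
  return (long) offsetof (struct big_struct, a)
         - (long) offsetof (struct big_struct, c);
}

/* Hypothetical helper: with offsets scaled by the access size, a
   byte offset can be encoded only if it is an exact multiple of
   that size. */
int
offset_is_encodable (long byte_offset, long access_size)
{
  assert (access_size > 0);
  return byte_offset % access_size == 0;
}
```

Under these assumptions `offset_is_encodable (displacement_c_to_a (), 4)' is false, which matches the "-5/4" operand the assembler rejects, while the 8-bit load of `testme.c' at offset 0 passes the same check.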
Thanks, Paul paulz at synopsys dot com From iant@google.com Wed Dec 5 01:21:00 2007 From: iant@google.com (Ian Lance Taylor) Date: Wed, 05 Dec 2007 01:21:00 -0000 Subject: Bad offset to struct member in generated code In-Reply-To: <76283BCB2FE6D9479105D17FAD968EE803005A8E@US01WEMBX3.internal.synopsys.com> References: <76283BCB2FE6D9479105D17FAD968EE803005A8E@US01WEMBX3.internal.synopsys.com> Message-ID: "Paul Zimmerman" writes: > For the second line, gcc will reuse the same address register, and > generate an indirect 32-bit load with an offset of -5 to fetch the > value of 'testme.a'. But on our custom processor, the offsets of > indirect load instructions are scaled by the size of the data item. > So the assembly instruction generated for the second line is something > like "ld32 %r1, %r0, -5/4", which can not be translated by the > assembler. > > So when accessing a data structure, we need gcc to always use a base > address that is a multiple of 4, to prevent this from happening. > > So how do I tell gcc about this limitation of our architecture? Do I do > this somehow using the REG_MODE_OK_FOR_BASE_P or > GO_IF_MODE_DEPENDENT_ADDRESS macros in our architecture's .h file? I am > using the gcc 4.0.2 sources if that matters. You need to fix this in GO_IF_LEGITIMATE_ADDRESS. It may help to look at thumb1_legitimate_address_p and thumb_legitimate_offset_p in config/arm/arm.c. Ian From jbeulich@novell.com Wed Dec 5 09:31:00 2007 From: jbeulich@novell.com (Jan Beulich) Date: Wed, 05 Dec 2007 09:31:00 -0000 Subject: [RFC] [PATCH] 32-bit pointers in x86-64 In-Reply-To: References: <1ba5638c0711250829o416e3ccasba572d3c205ea7f4@mail.gmail.com> Message-ID: <47567DF8.76E4.0078.0@novell.com> >>> "Andrew Pinski" 25.11.07 19:45 >>> >On 11/25/07, Luca wrote: >> 7.1. Add __attribute__((pointer_size(XXX))) and #pragma pointer_size >> to allow 64-bit pointers in 32-bit mode and viceversa > >This is already there, try using __attribute__((mode(DI) )). 
Hmm, unless this is a new feature in 4.3, I can't seem to get this to work on either i386 (using mode DI) or x86-64 (using mode SI). Could you clarify? If this worked consistently on at least all 64-bit architectures, I would have a use for it in the kernel (cutting down the kernel size by perhaps several pages). Btw., I continue to think that the error message 'initializer element is not computable at load time' on 64-bit code like this extern char array[]; unsigned int p = (unsigned long)array; or 32-bit code like this extern char array[]; unsigned long long p = (unsigned long)array; is incorrect - the compiler generally has no knowledge what 'array' is (it may know whether the architecture is generally capable of expressing the necessary relocation, but if 'array' is really a placeholder for an assembly level constant, possibly even defined through __asm__() in the same translation unit, this diagnostic should at best be a warning). I'm pretty sure I have an open bug for this, but the sad thing is that bugs like this never appear to really get looked at. Thanks, Jan From pinskia@gmail.com Wed Dec 5 10:48:00 2007 From: pinskia@gmail.com (Andrew Pinski) Date: Wed, 05 Dec 2007 10:48:00 -0000 Subject: [RFC] [PATCH] 32-bit pointers in x86-64 In-Reply-To: <47567DF8.76E4.0078.0@novell.com> References: <1ba5638c0711250829o416e3ccasba572d3c205ea7f4@mail.gmail.com> <47567DF8.76E4.0078.0@novell.com> Message-ID: On 12/5/07, Jan Beulich wrote: > >>> "Andrew Pinski" 25.11.07 19:45 >>> > >On 11/25/07, Luca wrote: > >> 7.1. Add __attribute__((pointer_size(XXX))) and #pragma pointer_size > >> to allow 64-bit pointers in 32-bit mode and viceversa > > > >This is already there, try using __attribute__((mode(DI) )). > > Hmm, unless this is a new feature in 4.3, I can't seem to get this to work on > either i386 (using mode DI) or x86-64 (using mode SI). Could you clarify? This only works when you add support for the different pointer modes. 
I was saying the middle support for this feature was already there, just the target support was not. Also there are issues with mode on pointers for C++, I don't know what they are though. Note this feature is used on the s390 target and also the ia64-hpux targets. --Pinski From rep.dot.nop@gmail.com Wed Dec 5 14:02:00 2007 From: rep.dot.nop@gmail.com (Bernhard Fischer) Date: Wed, 05 Dec 2007 14:02:00 -0000 Subject: libbid and floatingpoint exception access funcs Message-ID: <20071205140232.GB11485@aon.at> Hi, My libc is configured to omit any FP support (UCLIBC_HAS_FLOATS is not set) but the recent libbid updates seems to unconditionally pull in floatingpoint accessor functions thus breaking bootstrap. My notes on this read: --------8<-------- Follows: Precedes: do not pull in allegedly unneeded floatingpoint exception access funcs HJL's recent update of libbid would pull in Floating-point exception handling, although __GCC_FLOAT_NOT_NEEDED is defined. Prevent pulling in feclearexcept, feraiseexcept et al for now. FIXME: revisit --------8<-------- H.J., please advise. PS: I currently do: libgcc/ChangeLog: 2007-10-13 Bernhard Fischer <> * config/libbid/bid_conf.h: Do not define DECIMAL_GLOBAL_EXCEPTION_FLAGS_ACCESS_FUNCTIONS if __GCC_FLOAT_NOT_NEEDED is defined. -------------- next part -------------- A non-text attachment was scrubbed... 
Name: 306-libbid-no-decimal-global-exception-access-funcs.patch Type: text/x-diff Size: 600 bytes Desc: not available URL: From dnovillo@google.com Wed Dec 5 14:21:00 2007 From: dnovillo@google.com (Diego Novillo) Date: Wed, 05 Dec 2007 14:21:00 -0000 Subject: Designs for better debug info in GCC In-Reply-To: <4749DE66.1090602@codesourcery.com> References: <84fc9c000711050327x74845c78ya18a3329fcf9e4d2@mail.gmail.com> <4733A637.8070004@adacore.com> <4737BF2C.70408@codesourcery.com> <47388599.2040701@codesourcery.com> <4749DE66.1090602@codesourcery.com> Message-ID: <4756B02D.9010302@google.com> On 11/25/07 3:43 PM, Mark Mitchell wrote: > My suggestion (not as a GCC SC member or GCC RM, but just as a fellow > GCC developer with an interest in improving the compiler in the same way > that you're trying to do) is that you stop writing code and start > writing a paper about what you're trying to do. > > Ignore the implementation. Describe the problem in detail. Narrow its > scope if necessary. Describe the success criteria in detail. Ideally, > the success criteria are mechanically checkable properties: i.e., given > a C program as input, and optimized code + debug information as output, > it should be possible to algorithmically prove whether the output is > correct. Yes, please. I would very much like to see an abstract design document on what you are trying to accomplish. I have been trying to follow this thread but I've gotten lost. It's full of implementation details, rhetoric and high-level discussion. I would like to see exactly what Mark is asking for. Perhaps a presentation in next year's Summit? I don't think I understand the goal of the project. "Correct debugging info" means little, particularly if you say that it's not debuggers that you are thinking about. It's certainly worrisome that your implementation seems to be intrusive to the point of brittleness. 
Will every new optimization need to think about debug information from scratch and refrain from doing certain transformations? In my simplistic view of this problem, I've always had the idea that -O0 -g means "full debugging bliss", -O1 -g means "tolerable debugging" (symbols shouldn't disappear, for instance, though they do now) and -O2 -g means "you can probably know what line+function you're executing". But you seem to be addressing other problems. And it even seems to me that you want debugging information that is capable of deconstructing arbitrary transformations done by the optimizers. But I think I'm just lost in this thread, so a high-level design document would be perfect to expose your ideas. Diego. From dberlin@dberlin.org Wed Dec 5 19:08:00 2007 From: dberlin@dberlin.org (Daniel Berlin) Date: Wed, 05 Dec 2007 19:08:00 -0000 Subject: Git and GCC Message-ID: <4aca3dc20712051108s216d3331t8061ef45b9aa324a@mail.gmail.com> So I tried a full history conversion using git-svn of the gcc repository (IE every trunk revision from 1-HEAD as of yesterday) The git-svn import was done using repacks every 1000 revisions. After it finished, I used git-gc --aggressive --prune. Two hours later, it finished. The final size after this is 1.5 gig for all of the history of gcc for just trunk. dberlin@home:/compilerstuff/gitgcc/gccrepo/.git/objects/pack$ ls -trl total 1568899 -r--r--r-- 1 dberlin dberlin 1585972834 2007-12-05 14:01 pack-cd328fcf0bd673d8f2f72c42fbe67da64cbcd218.pack -r--r--r-- 1 dberlin dberlin 19008488 2007-12-05 14:01 pack-cd328fcf0bd673d8f2f72c42fbe67da64cbcd218.idx This is 3x bigger than hg *and* hg doesn't require me to waste my life repacking every so often. The hg operations run roughly as fast as the git ones I'm sure there are magic options, magic command lines, etc, i could use to make it smaller. I'm sure if i spent the next few weeks fucking around with git, it may even be usable! 
But given that git is harder to use, requires manual repacking to get any kind of sane space usage, and is 3x bigger anyway, i don't see any advantage to continuing to experiment with git and gcc. I already have two way sync with hg. Maybe someday when git is more usable than hg to a normal developer, or it at least is significantly smaller than hg, i'll look at it again. For now, it seems a net loss. --Dan > > git clone --depth 100 git://git.infradead.org/gcc.git > > should give around ~50mb repository with usable trunk. This is all thanks to > Bernardo Innocenti for setting up an up-to-date gcc git repo. > > P.S:Please cut down on the usage of exclamation mark. > > Regards, > ismail > > -- > Never learn by your mistakes, if you do you may never dare to try again. > From dberlin@dberlin.org Wed Dec 5 19:11:00 2007 From: dberlin@dberlin.org (Daniel Berlin) Date: Wed, 05 Dec 2007 19:11:00 -0000 Subject: Git and GCC In-Reply-To: <4aca3dc20712051108s216d3331t8061ef45b9aa324a@mail.gmail.com> References: <4aca3dc20712051108s216d3331t8061ef45b9aa324a@mail.gmail.com> Message-ID: <4aca3dc20712051110if55ef95u6ffd1d1dd2068c50@mail.gmail.com> For the record: dberlin@home:/compilerstuff/gitgcc/gccrepo$ git --version git version 1.5.3.7 (I downloaded it yesterday when i started the import) On 12/5/07, Daniel Berlin wrote: > So I tried a full history conversion using git-svn of the gcc > repository (IE every trunk revision from 1-HEAD as of yesterday) > The git-svn import was done using repacks every 1000 revisions. > After it finished, I used git-gc --aggressive --prune. Two hours > later, it finished. > The final size after this is 1.5 gig for all of the history of gcc for > just trunk. 
> > dberlin@home:/compilerstuff/gitgcc/gccrepo/.git/objects/pack$ ls -trl > total 1568899 > -r--r--r-- 1 dberlin dberlin 1585972834 2007-12-05 14:01 > pack-cd328fcf0bd673d8f2f72c42fbe67da64cbcd218.pack > -r--r--r-- 1 dberlin dberlin 19008488 2007-12-05 14:01 > pack-cd328fcf0bd673d8f2f72c42fbe67da64cbcd218.idx > > This is 3x bigger than hg *and* hg doesn't require me to waste my life > repacking every so often. > The hg operations run roughly as fast as the git ones > > I'm sure there are magic options, magic command lines, etc, i could > use to make it smaller. > > I'm sure if i spent the next few weeks fucking around with git, it may > even be usable! > > But given that git is harder to use, requires manual repacking to get > any kind of sane space usage, and is 3x bigger anyway, i don't see any > advantage to continuing to experiment with git and gcc. > > I already have two way sync with hg. > Maybe someday when git is more usable than hg to a normal developer, > or it at least is significantly smaller than hg, i'll look at it > again. > For now, it seems a net loss. > > --Dan > > > > git clone --depth 100 git://git.infradead.org/gcc.git > > > > should give around ~50mb repository with usable trunk. This is all thanks to > > Bernardo Innocenti for setting up an up-to-date gcc git repo. > > > > P.S:Please cut down on the usage of exclamation mark. > > > > Regards, > > ismail > > > > -- > > Never learn by your mistakes, if you do you may never dare to try again. > > > From nightstrike@gmail.com Wed Dec 5 19:13:00 2007 From: nightstrike@gmail.com (NightStrike) Date: Wed, 05 Dec 2007 19:13:00 -0000 Subject: Git and GCC In-Reply-To: <4aca3dc20712051108s216d3331t8061ef45b9aa324a@mail.gmail.com> References: <4aca3dc20712051108s216d3331t8061ef45b9aa324a@mail.gmail.com> Message-ID: On 12/5/07, Daniel Berlin wrote: > I already have two way sync with hg. 
> Maybe someday when git is more usable than hg to a normal developer, > or it at least is significantly smaller than hg, i'll look at it > again. Sorry, what is hg? From dberlin@dberlin.org Wed Dec 5 19:16:00 2007 From: dberlin@dberlin.org (Daniel Berlin) Date: Wed, 05 Dec 2007 19:16:00 -0000 Subject: Git and GCC In-Reply-To: References: <4aca3dc20712051108s216d3331t8061ef45b9aa324a@mail.gmail.com> Message-ID: <4aca3dc20712051116w7d622974lb22472f37e7c09ae@mail.gmail.com> On 12/5/07, NightStrike wrote: > On 12/5/07, Daniel Berlin wrote: > > I already have two way sync with hg. > > Maybe someday when git is more usable than hg to a normal developer, > > or it at least is significantly smaller than hg, i'll look at it > > again. > > Sorry, what is hg? > http://www.selenic.com/mercurial/ From ismail@pardus.org.tr Wed Dec 5 19:36:00 2007 From: ismail@pardus.org.tr (Ismail Dönmez) Date: Wed, 05 Dec 2007 19:36:00 -0000 Subject: Git and GCC In-Reply-To: <4aca3dc20712051108s216d3331t8061ef45b9aa324a@mail.gmail.com> References: <4aca3dc20712051108s216d3331t8061ef45b9aa324a@mail.gmail.com> Message-ID: <200712052137.09528.ismail@pardus.org.tr> On Wednesday 05 December 2007 21:08:41, Daniel Berlin wrote: > So I tried a full history conversion using git-svn of the gcc > repository (IE every trunk revision from 1-HEAD as of yesterday) > The git-svn import was done using repacks every 1000 revisions. > After it finished, I used git-gc --aggressive --prune. Two hours > later, it finished. > The final size after this is 1.5 gig for all of the history of gcc for > just trunk.
> > dberlin@home:/compilerstuff/gitgcc/gccrepo/.git/objects/pack$ ls -trl > total 1568899 > -r--r--r-- 1 dberlin dberlin 1585972834 2007-12-05 14:01 > pack-cd328fcf0bd673d8f2f72c42fbe67da64cbcd218.pack > -r--r--r-- 1 dberlin dberlin 19008488 2007-12-05 14:01 > pack-cd328fcf0bd673d8f2f72c42fbe67da64cbcd218.idx > > This is 3x bigger than hg *and* hg doesn't require me to waste my life > repacking every so often. > The hg operations run roughly as fast as the git ones I think this (gcc HG repo) is very good but only problem is its not always in sync with SVN, it would really rock if a post svn commit hook would sync hg repo. Thanks for doing this anyhow. Regards, ismail -- Never learn by your mistakes, if you do you may never dare to try again. From aaw@google.com Wed Dec 5 19:36:00 2007 From: aaw@google.com (Ollie Wild) Date: Wed, 05 Dec 2007 19:36:00 -0000 Subject: Git and GCC In-Reply-To: <4aca3dc20712051108s216d3331t8061ef45b9aa324a@mail.gmail.com> References: <4aca3dc20712051108s216d3331t8061ef45b9aa324a@mail.gmail.com> Message-ID: <65dd6fd50712051136j2528b15yc16e51f1457ed6b1@mail.gmail.com> On Dec 5, 2007 11:08 AM, Daniel Berlin wrote: > So I tried a full history conversion using git-svn of the gcc > repository (IE every trunk revision from 1-HEAD as of yesterday) > The git-svn import was done using repacks every 1000 revisions. > After it finished, I used git-gc --aggressive --prune. Two hours > later, it finished. > The final size after this is 1.5 gig for all of the history of gcc for > just trunk. Out of curiosity, how much of that is the .git/svn directory? This is where git-svn-specific data is stored. It is *very* inefficient, at least for the 1.5.2.5 version I'm using. 
Ollie From baembel@gmx.de Wed Dec 5 20:08:00 2007 From: baembel@gmx.de (Boris Boesler) Date: Wed, 05 Dec 2007 20:08:00 -0000 Subject: BITS_PER_UNIT larger than 8 -- word addressing Message-ID: <2649CC51-1ACF-4C2B-88BA-396CB39D4BEE@gmx.de> On 2007-11-27 18:29, Michael Eager wrote: > Joseph S. Myers wrote: > > On Tue, 27 Nov 2007, Michael Eager wrote: > > > >> I think that there is a pervasive understanding that SImode is > >> single precision integer, 32-bits long. > > > > Only among contributors not considering non-8-bit bytes. SImode is 4 > > times QImode, 4*BITS_PER_UNIT bits, and may not exist (or at least not be > > particularly usable, much like the limitations on TImode on 32-bit > > targets) with large BITS_PER_UNIT. > > I think you just described the majority of contributors. :-) > > It's human nature not to recognize one's tacit assumptions or their > consequences. I assume that GCC internals assume that memory can be byte (8 bits) addressed - for historical reasons. Therefore, the sizes of all types are multiples of a byte. The same is true for addressing values in memory. (Sizes of types and their addresses must be separated more precisely. A 32 bit value could be on a 4 bit boundary!) But this is changing. Addressable units are on 32 bit boundaries or even on 4 bit boundaries today. Well, this is the problem I'm running in right now. 
Boris From sam@rfc1149.net Wed Dec 5 20:23:00 2007 From: sam@rfc1149.net (Samuel Tardieu) Date: Wed, 05 Dec 2007 20:23:00 -0000 Subject: Git and GCC In-Reply-To: <4aca3dc20712051108s216d3331t8061ef45b9aa324a@mail.gmail.com> (Daniel Berlin's message of "Wed\, 5 Dec 2007 14\:08\:41 -0500") References: <4aca3dc20712051108s216d3331t8061ef45b9aa324a@mail.gmail.com> Message-ID: <2007-12-05-21-23-14+trackit+sam@rfc1149.net> >>>>> "Daniel" == Daniel Berlin writes: Daniel> So I tried a full history conversion using git-svn of the gcc Daniel> repository (IE every trunk revision from 1-HEAD as of Daniel> yesterday) The git-svn import was done using repacks every Daniel> 1000 revisions. After it finished, I used git-gc --aggressive Daniel> --prune. Two hours later, it finished. The final size after Daniel> this is 1.5 gig for all of the history of gcc for just trunk. Most of the space is probably taken by the SVN specific data. To get an idea of how GIT would handle GCC data, you should clone the GIT directory or checkout one from infradead.org: % git clone git://git.infradead.org/gcc.git On my machine, it takes 856M with a checkout copy of trunk and contains the trunk, autovect, fixed-point, 4.1 and 4.2 branches. In comparaison, my checked out copy of trunk using SVN requires 1.2G, and I don't have any history around... Sam -- Samuel Tardieu -- sam@rfc1149.net -- http://www.rfc1149.net/ From dberlin@dberlin.org Wed Dec 5 21:04:00 2007 From: dberlin@dberlin.org (Daniel Berlin) Date: Wed, 05 Dec 2007 21:04:00 -0000 Subject: Patch manager dying for a week or two Message-ID: <4aca3dc20712051304q786e1ef8ve7e73ab0546fe5c3@mail.gmail.com> Patch manager will be dying for a week or two while i change hosting. of course, if nobody is still using it, i can just kill it permanently. 
From kenner@vlsi1.ultra.nyu.edu Wed Dec 5 21:20:00 2007 From: kenner@vlsi1.ultra.nyu.edu (Richard Kenner) Date: Wed, 05 Dec 2007 21:20:00 -0000 Subject: ACATS c460008 and VRP (was: Bootstrap failure on trunk: x86_64-linux-gnu) In-Reply-To: <1141256224.24449.64.camel@pc.site> References: <17400.42579.162536.967995@zapata.pink> <200602281206.32138.ebotcazou@adacore.com> <1141145312.2618.57.camel@localhost.localdomain> <200602281842.22672.ebotcazou@adacore.com> <1141166437.2618.103.camel@localhost.localdomain> <20060228235946.GA12161@nevyn.them.org> <1141214164.24449.52.camel@pc.site> <1141252514.3223.101.camel@localhost.localdomain> <1141256224.24449.64.camel@pc.site> Message-ID: <10712052119.AA29653@vlsi1.ultra.nyu.edu> > Richard, Arnaud, could you check amongst GNAT experts if for such types > (non power of two modulus), it's not worth enabling overflow checks by > default now that we have VRP doing non trivial optimisations? People > using non power of two modulus are not caring for performance anyway, so > having a compliant implementation by default won't harm. I don't think that either of us are the best people to ask, but my sense is that it's not a great idea to have the default overflow handling differ between types. For one thing, what option would then disable overflow checking for those types? -gnato is required for ACATS tests because you need -gnato for RM compliance. 
From kenner@vlsi1.ultra.nyu.edu Wed Dec 5 21:25:00 2007 From: kenner@vlsi1.ultra.nyu.edu (Richard Kenner) Date: Wed, 05 Dec 2007 21:25:00 -0000 Subject: ACATS c460008 and VRP (was: Bootstrap failure on trunk: x86_64-linux-gnu) In-Reply-To: <1141257264.24449.73.camel@pc.site> References: <10603012348.AA17328@vlsi1.ultra.nyu.edu> <1141257264.24449.73.camel@pc.site> Message-ID: <10712052125.AA29882@vlsi1.ultra.nyu.edu> > On GCC we use -gnato on tests known to need it > (/gcc/testsuite/ada/acats/overflow.lst) since we want to test > flags the typical GCC/Ada user does use and not what official validation > requires (which is -gnato -gnatE IIRC). But you're running a test that's *part* of the official validation and it assumes the options that implement the full language (including overflow checks). I don't see the relevance of what options the "typical user" specifies: this isn't typical user *code*! From ralmquist@ssi-corp.com Wed Dec 5 21:25:00 2007 From: ralmquist@ssi-corp.com (Richard Almquist) Date: Wed, 05 Dec 2007 21:25:00 -0000 Subject: Common logging config Message-ID: Tony, To configure commons-logging to use the JDK logger, create a file named "commons-logging.properties" with the following: org.apache.commons.logging.Log=org.apache.commons.logging.impl.Jdk14Logger In a webapp this would go into the WEB-INF/classes directory. I'm not sure where to put it for the routing engine. Richard From iant@google.com Wed Dec 5 21:32:00 2007 From: iant@google.com (Ian Lance Taylor) Date: Wed, 05 Dec 2007 21:32:00 -0000 Subject: BITS_PER_UNIT larger than 8 -- word addressing In-Reply-To: <2649CC51-1ACF-4C2B-88BA-396CB39D4BEE@gmx.de> References: <2649CC51-1ACF-4C2B-88BA-396CB39D4BEE@gmx.de> Message-ID: Boris Boesler writes: > I assume that GCC internals assume that memory can be byte (8 bits) > addressed - for historical reasons. No. gcc internals assume that memory can be addressed in units of size BITS_PER_UNIT. The default for BITS_PER_UNIT is 8.
I have written backends for machines for which that is not true. It is unusual, and there is only one official target with BITS_PER_UNIT != 8 (c4x), so there is often some minor breakage. Ian From nightstrike@gmail.com Wed Dec 5 21:32:00 2007 From: nightstrike@gmail.com (NightStrike) Date: Wed, 05 Dec 2007 21:32:00 -0000 Subject: Patch manager dying for a week or two In-Reply-To: <4aca3dc20712051304q786e1ef8ve7e73ab0546fe5c3@mail.gmail.com> References: <4aca3dc20712051304q786e1ef8ve7e73ab0546fe5c3@mail.gmail.com> Message-ID: On 12/5/07, Daniel Berlin wrote: > Patch manager will be dying for a week or two while i change hosting. > > of course, if nobody is still using it, i can just kill it permanently. > What is the patch manager? From dberlin@dberlin.org Wed Dec 5 21:40:00 2007 From: dberlin@dberlin.org (Daniel Berlin) Date: Wed, 05 Dec 2007 21:40:00 -0000 Subject: Git and GCC In-Reply-To: <65dd6fd50712051136j2528b15yc16e51f1457ed6b1@mail.gmail.com> References: <4aca3dc20712051108s216d3331t8061ef45b9aa324a@mail.gmail.com> <65dd6fd50712051136j2528b15yc16e51f1457ed6b1@mail.gmail.com> Message-ID: <4aca3dc20712051340l616480bau28030ee6ba906d0a@mail.gmail.com> On 12/5/07, Ollie Wild wrote: > On Dec 5, 2007 11:08 AM, Daniel Berlin wrote: > > So I tried a full history conversion using git-svn of the gcc > > repository (IE every trunk revision from 1-HEAD as of yesterday) > > The git-svn import was done using repacks every 1000 revisions. > > After it finished, I used git-gc --aggressive --prune. Two hours > > later, it finished. > > The final size after this is 1.5 gig for all of the history of gcc for > > just trunk. > > Out of curiosity, how much of that is the .git/svn directory? This is > where git-svn-specific data is stored. It is *very* inefficient, at > least for the 1.5.2.5 version I'm using. > I was only counting the space in .the packs dir. 
> Ollie > From dberlin@dberlin.org Wed Dec 5 21:48:00 2007 From: dberlin@dberlin.org (Daniel Berlin) Date: Wed, 05 Dec 2007 21:48:00 -0000 Subject: Git and GCC In-Reply-To: <2007-12-05-21-23-14+trackit+sam@rfc1149.net> References: <4aca3dc20712051108s216d3331t8061ef45b9aa324a@mail.gmail.com> <2007-12-05-21-23-14+trackit+sam@rfc1149.net> Message-ID: <4aca3dc20712051347v6bb8a4dbm901d1d9ddf1dff34@mail.gmail.com> On 12/5/07, Samuel Tardieu wrote: > >>>>> "Daniel" == Daniel Berlin writes: > > Daniel> So I tried a full history conversion using git-svn of the gcc > Daniel> repository (IE every trunk revision from 1-HEAD as of > Daniel> yesterday) The git-svn import was done using repacks every > Daniel> 1000 revisions. After it finished, I used git-gc --aggressive > Daniel> --prune. Two hours later, it finished. The final size after > Daniel> this is 1.5 gig for all of the history of gcc for just trunk. > > Most of the space is probably taken by the SVN specific data. I showed a du of the pack directory. Everyone tells me that svn specfic data is in .svn, so i am disinclined to believe this. Also, given that hg can store the svn data without this kind of penalty, it's just another strike against git. > To get > an idea of how GIT would handle GCC data, you should clone the GIT > directory or checkout one from infradead.org: Does infradead have the entire history? > % git clone git://git.infradead.org/gcc.git > > On my machine, it takes 856M with a checkout copy of trunk and > contains the trunk, autovect, fixed-point, 4.1 and 4.2 branches. In > comparaison, my checked out copy of trunk using SVN requires 1.2G, and > I don't have any history around... This is about git's usability and space usage, not SVN. People say we should consider GIT. I have been considering GIT and hg, and right now, GIT looks like a massive loser in every respect. It's harder to use. It takes up more space than hg to store the same data. 
It requires manual repacking. Its diff/etc commands are not any faster. Humorously, i tried to verify whether infradead has full history or not, but of course git log git://git.infradead.org/gcc.git says "fatal, not a git repository" (though git clone is happy to clone it, because it is a git repository). I'm sure there is some magic option or command line i need to use to view remote log history without cloning the repository. But all the other systems we look at don't require this kind of bullshit to actually get things done. As I said, maybe i'll look at git in another year or so. But i'm certainly going to ignore all the "git is so great, we should move gcc to it" people until it works better, while i am much more inclined to believe the "hg is so great, we should move gc to it" people. From harvey.harrison@gmail.com Wed Dec 5 21:50:00 2007 From: harvey.harrison@gmail.com (Harvey Harrison) Date: Wed, 05 Dec 2007 21:50:00 -0000 Subject: Git and GCC In-Reply-To: <2007-12-05-21-23-14+trackit+sam@rfc1149.net> References: <4aca3dc20712051108s216d3331t8061ef45b9aa324a@mail.gmail.com> <2007-12-05-21-23-14+trackit+sam@rfc1149.net> Message-ID: <1196891451.10408.54.camel@brick> On Wed, 2007-12-05 at 21:23 +0100, Samuel Tardieu wrote: > >>>>> "Daniel" == Daniel Berlin writes: > > Daniel> So I tried a full history conversion using git-svn of the gcc > Daniel> repository (IE every trunk revision from 1-HEAD as of > Daniel> yesterday) The git-svn import was done using repacks every > Daniel> 1000 revisions. After it finished, I used git-gc --aggressive > Daniel> --prune. Two hours later, it finished. The final size after > Daniel> this is 1.5 gig for all of the history of gcc for just trunk. > > Most of the space is probably taken by the SVN specific data.
To get > an idea of how GIT would handle GCC data, you should clone the GIT > directory or checkout one from infradead.org: > > % git clone git://git.infradead.org/gcc.git > Actually I went through and created the basis for that repo. It contains all branches and tags in the gcc svn repo and the final pack comes to about 600M. This has _everything_, not just trunk. For the first time after doing such an import, I found it much better to do git repack -a -f --depth=100 --window=100. After that initial repack a plain git-gc occasionally will be just fine. If you want any more information about this, let me know. CHeers, Harvey From dominiq@lps.ens.fr Wed Dec 5 21:55:00 2007 From: dominiq@lps.ens.fr (Dominique Dhumieres) Date: Wed, 05 Dec 2007 21:55:00 -0000 Subject: Broken regression testing on Intel Darwin9 Message-ID: <20071205215459.20EFA5BB6C@mailhost.lps.ens.fr> At revision 130629 regtesting on Intel Darwin9 gives a dozen The process has forked and you cannot use this CoreFoundation functionality safely. You MUST exec(). Break on __THE_PROCESS_HAS_FORKED_AND_YOU_CANNOT_USE_THIS_COREFOUNDATION_FUNCTIONALITY___YOU_MUST_EXEC__() to debug. then stop to do anything untill I kill it. I did not see that with revision 130589. What is the meaning of the message? and what could I do? TIA Dominique From nightstrike@gmail.com Wed Dec 5 21:56:00 2007 From: nightstrike@gmail.com (NightStrike) Date: Wed, 05 Dec 2007 21:56:00 -0000 Subject: Git and GCC In-Reply-To: <4aca3dc20712051347v6bb8a4dbm901d1d9ddf1dff34@mail.gmail.com> References: <4aca3dc20712051108s216d3331t8061ef45b9aa324a@mail.gmail.com> <2007-12-05-21-23-14+trackit+sam@rfc1149.net> <4aca3dc20712051347v6bb8a4dbm901d1d9ddf1dff34@mail.gmail.com> Message-ID: On 12/5/07, Daniel Berlin wrote: > As I said, maybe i'll look at git in another year or so. 
> But i'm certainly going to ignore all the "git is so great, we should > move gcc to it" people until it works better, while i am much more > inclined to believe the "hg is so great, we should move gc to it" > people. Just out of curiosity, is there something wrong with the current choice of svn? As I recall, it wasn't too long ago that gcc converted from cvs to svn. What's the motivation to change again? (I'm not trying to oppose anything.. I'm just curious, as I don't know much about this kind of thing). From Joe.Buck@synopsys.COM Wed Dec 5 22:10:00 2007 From: Joe.Buck@synopsys.COM (Joe Buck) Date: Wed, 05 Dec 2007 22:10:00 -0000 Subject: Designs for better debug info in GCC In-Reply-To: <4756B02D.9010302@google.com> References: <4733A637.8070004@adacore.com> <4737BF2C.70408@codesourcery.com> <47388599.2040701@codesourcery.com> <4749DE66.1090602@codesourcery.com> <4756B02D.9010302@google.com> Message-ID: <20071205221035.GB31543@synopsys.com> On Wed, Dec 05, 2007 at 09:05:33AM -0500, Diego Novillo wrote: > In my simplistic view of this problem, I've always had the idea that -O0 > -g means "full debugging bliss", -O1 -g means "tolerable debugging" > (symbols shouldn't disappear, for instance, though they do now) and -O2 > -g means "you can probably know what line+function you're executing". I'd be happy enough if the state of -O1 -g debugging were improved, perhaps using some of Alexandre's ideas so that it could be "full debugging bliss" with some optimization as well. Speeding up the compile/test/debug/modify cycle would result. We could then have fast but fully debuggable code at -O1, and even faster code at -O2 not constrained by the requirement of, as Diego says, "deconstructing arbitrary transformations done by the optimizers". 
From drow@false.org Wed Dec 5 22:11:00 2007 From: drow@false.org (Daniel Jacobowitz) Date: Wed, 05 Dec 2007 22:11:00 -0000 Subject: iWMMXt/Linux EABI toolchain In-Reply-To: <20060301182053.67859.qmail@web25003.mail.ukl.yahoo.com> References: <20060301175459.GA7475@nevyn.them.org> <20060301182053.67859.qmail@web25003.mail.ukl.yahoo.com> Message-ID: <20060301182204.GA8497@nevyn.them.org> On Wed, Mar 01, 2006 at 06:20:53PM +0000, Steven Newbury wrote: > OK, thank-you. I'll target "arm-iwmmxt-linux-gnueabi" with --with-cpu= etc and > --disable-multilib. The vendor string is for my build scripts and also will > help differentiate the toolchain, is that valid? Yep. -- Daniel Jacobowitz CodeSourcery From dave.korn@artimi.com Wed Dec 5 22:37:00 2007 From: dave.korn@artimi.com (Dave Korn) Date: Wed, 05 Dec 2007 22:37:00 -0000 Subject: Patch manager dying for a week or two In-Reply-To: <4aca3dc20712051304q786e1ef8ve7e73ab0546fe5c3@mail.gmail.com> References: <4aca3dc20712051304q786e1ef8ve7e73ab0546fe5c3@mail.gmail.com> Message-ID: <04ec01c8378f$5b549520$2e08a8c0@CAM.ARTIMI.COM> On 05 December 2007 21:04, Daniel Berlin wrote: > Patch manager will be dying for a week or two while i change hosting. > > of course, if nobody is still using it, i can just kill it permanently. Well I haven't submitted any patches just lately, but I always use it when I do, I think it's very useful indeed. Thanks for organising it. cheers, DaveK -- Can't think of a witty .sigline today.... From rask@sygehus.dk Wed Dec 5 22:41:00 2007 From: rask@sygehus.dk (Rask Ingemann Lambertsen) Date: Wed, 05 Dec 2007 22:41:00 -0000 Subject: Patch manager dying for a week or two In-Reply-To: References: <4aca3dc20712051304q786e1ef8ve7e73ab0546fe5c3@mail.gmail.com> Message-ID: <20071205224111.GN17368@sygehus.dk> On Wed, Dec 05, 2007 at 04:32:00PM -0500, NightStrike wrote: > On 12/5/07, Daniel Berlin wrote: > > Patch manager will be dying for a week or two while i change hosting. 
> > > > of course, if nobody is still using it, i can just kill it permanently. grep -F -e patchapp gcc-bugs@ says it is being used. I use it and would like to keep doing so. As well as tracking my patches, I find the notice automatically posted to the bug database a lot more convenient that having to do so manually. > What is the patch manager? http://gcc.gnu.org/wiki/GCC_Patch_Tracking -- Rask Ingemann Lambertsen Danish law requires addresses in e-mail to be logged and stored for a year From gccadmin@gcc.gnu.org Wed Dec 5 22:44:00 2007 From: gccadmin@gcc.gnu.org (gccadmin@gcc.gnu.org) Date: Wed, 05 Dec 2007 22:44:00 -0000 Subject: gcc-4.2-20071205 is now available Message-ID: <20071205224413.13703.qmail@sourceware.org> Snapshot gcc-4.2-20071205 is now available on ftp://gcc.gnu.org/pub/gcc/snapshots/4.2-20071205/ and on various mirrors, see http://gcc.gnu.org/mirrors.html for details. This snapshot has been generated from the GCC 4.2 SVN branch with the following options: svn://gcc.gnu.org/svn/gcc/branches/gcc-4_2-branch revision 130635 You'll find: gcc-4.2-20071205.tar.bz2 Complete GCC (includes all of below) gcc-core-4.2-20071205.tar.bz2 C front end and core compiler gcc-ada-4.2-20071205.tar.bz2 Ada front end and runtime gcc-fortran-4.2-20071205.tar.bz2 Fortran front end and runtime gcc-g++-4.2-20071205.tar.bz2 C++ front end and runtime gcc-java-4.2-20071205.tar.bz2 Java front end and runtime gcc-objc-4.2-20071205.tar.bz2 Objective-C front end and runtime gcc-testsuite-4.2-20071205.tar.bz2 The GCC testsuite Diffs from 4.2-20071128 are available in the diffs/ subdirectory. When a particular snapshot is ready for public consumption the LATEST-4.2 link is updated and a message is sent to the gcc list. Please do not use a snapshot before it has been announced that way. From jcpiza@gmail.com Wed Dec 5 22:51:00 2007 From: jcpiza@gmail.com (J.C. 
Pizarro) Date: Wed, 05 Dec 2007 22:51:00 -0000 Subject: Git and GCC Message-ID: <998d0e4a0712051451l47590558x4f8297eb2b9aea7b@mail.gmail.com> On 12/5/07, Daniel Berlin wrote: > So I tried a full history conversion using git-svn of the gcc > repository (IE every trunk revision from 1-HEAD as of yesterday) > The git-svn import was done using repacks every 1000 revisions. > After it finished, I used git-gc --aggressive --prune. Two hours > later, it finished. > The final size after this is 1.5 gig for all of the history of gcc for > just trunk. > > dberlin@home:/compilerstuff/gitgcc/gccrepo/.git/objects/pack$ ls -trl > total 1568899 > -r--r--r-- 1 dberlin dberlin 1585972834 2007-12-05 14:01 > pack-cd328fcf0bd673d8f2f72c42fbe67da64cbcd218.pack > -r--r--r-- 1 dberlin dberlin 19008488 2007-12-05 14:01 > pack-cd328fcf0bd673d8f2f72c42fbe67da64cbcd218.idx > > This is 3x bigger than hg *and* hg doesn't require me to waste my life > repacking every so often. > The hg operations run roughly as fast as the git ones > > I'm sure there are magic options, magic command lines, etc, i could > use to make it smaller. > > I'm sure if i spent the next few weeks fucking around with git, it may > even be usable! > > But given that git is harder to use, requires manual repacking to get > any kind of sane space usage, and is 3x bigger anyway, i don't see any > advantage to continuing to experiment with git and gcc. > > I already have two way sync with hg. > Maybe someday when git is more usable than hg to a normal developer, > or it at least is significantly smaller than hg, i'll look at it > again. > For now, it seems a net loss. > > --Dan > > > > git clone --depth 100 git://git.infradead.org/gcc.git > > > > should give around ~50mb repository with usable trunk. This is all thanks to > > Bernardo Innocenti for setting up an up-to-date gcc git repo. > > > > P.S:Please cut down on the usage of exclamation mark. 
> > > > Regards, > > ismail > > > > -- > > Never learn by your mistakes, if you do you may never dare to try again. > > To see "Re: svn trunk reaches nearly 1 GiB!!! That massive!!!" http://gcc.gnu.org/ml/gcc/2007-11/msg00805.html http://gcc.gnu.org/ml/gcc/2007-11/msg00770.html http://gcc.gnu.org/ml/gcc/2007-11/msg00769.html http://gcc.gnu.org/ml/gcc/2007-11/msg00768.html http://gcc.gnu.org/ml/gcc/2007-11/msg00767.html
* In http://gcc.gnu.org/ml/gcc/2007-11/msg00675.html , i did put The generated files from flex/bison are a lot of "trashing hexadecimals" that don't must to be commited to any cvs/svn/git/hg because it consumes a lot of diskspace for only a modification of few lines of flex/bison sources. * In http://gcc.gnu.org/ml/gcc/2007-11/msg00683.html , i did put I hate considering temporary files as sources of the tree. They aren't sources. It's good idea to remove ALL generated files from sources: A) generated *.c, *.h from lex/bison sources *.l/*.y B) generated not-handwritten configure, makefile, aclocal.m4, config.h.in, makefile.in from the configure.ac and makefile.am sources. [the handwritten configure and makefile have to be rewritten to *.ac/*.am] C) generated binary objects *.class, *.o, *.a, *.so, ... D) generated *.c, *.h, *.cpp, *.hpp, ... from *.java E) any generated from any available source by available tool. The only exception is when the project need a bootstrapping system. See to understand http://en.wikipedia.org/wiki/GNU_build_system easy! So, the cvs/svn/git/hg repositories of sources will be "small and clean" without "trashing generated files, hexadecimals, ..." This recommendation is an advantage to navigate by web to clean cvs/svn/git/hg repositories.
In another case, it's an inconvenient. J.C.Pizarro From bje@au1.ibm.com Wed Dec 5 23:16:00 2007 From: bje@au1.ibm.com (Ben Elliston) Date: Wed, 05 Dec 2007 23:16:00 -0000 Subject: Rant about ChangeLog entries and commit messages In-Reply-To: References: <2007-12-02-11-05-39+trackit+sam@rfc1149.net> <47543DE8.3010003@google.com> Message-ID: <1196896524.12541.3.camel@localhost> On Tue, 2007-12-04 at 09:18 -0700, Tom Tromey wrote: > First, continuing to have good quality messages. Right now at the > very least you get a (semi-) accurate record of what was touched. > I've seen plenty of ChangeLog-less projects out there than end up with > commits like "fixed a bug", or even worse. Something else that hasn't been raised is that ChangeLogs can be revised. We often see people making mistakes with their ChangeLog entries, but since the ChangeLog is versioned, they can revise it. If you screw up a commit message, it's much harder to fix it (and a purist might argue that to do so would be destroying revision history). > Also it seems to me that this will make it a bit harder for developers > without write access to get their patches checked in ... because it > will mean even more work for whoever does the commit. That's a good point. 
Ben From schwab@suse.de Wed Dec 5 23:34:00 2007 From: schwab@suse.de (Andreas Schwab) Date: Wed, 05 Dec 2007 23:34:00 -0000 Subject: Git and GCC In-Reply-To: <1196891451.10408.54.camel@brick> (Harvey Harrison's message of "Wed\, 05 Dec 2007 13\:50\:51 -0800") References: <4aca3dc20712051108s216d3331t8061ef45b9aa324a@mail.gmail.com> <2007-12-05-21-23-14+trackit+sam@rfc1149.net> <1196891451.10408.54.camel@brick> Message-ID: Harvey Harrison writes: > On Wed, 2007-12-05 at 21:23 +0100, Samuel Tardieu wrote: >> >>>>> "Daniel" == Daniel Berlin writes: >> >> Daniel> So I tried a full history conversion using git-svn of the gcc >> Daniel> repository (IE every trunk revision from 1-HEAD as of >> Daniel> yesterday) The git-svn import was done using repacks every >> Daniel> 1000 revisions. After it finished, I used git-gc --aggressive >> Daniel> --prune. Two hours later, it finished. The final size after >> Daniel> this is 1.5 gig for all of the history of gcc for just trunk. >> >> Most of the space is probably taken by the SVN specific data. To get >> an idea of how GIT would handle GCC data, you should clone the GIT >> directory or checkout one from infradead.org: >> >> % git clone git://git.infradead.org/gcc.git >> > > Actually I went through and created the basis for that repo. It > contains all branches and tags in the gcc svn repo and the final > pack comes to about 600M. This has _everything_, not just trunk. Not everything. Only trunk and a few selected branches, and no tags. Andreas. -- Andreas Schwab, SuSE Labs, schwab@suse.de SuSE Linux Products GmbH, Maxfeldstraße 5, 90409 Nürnberg, Germany PGP key fingerprint = 58CA 54C7 6D53 942B 1756 01D3 44D5 214B 8276 4ED5 "And now for something completely different."
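A side note on the disagreement above about which branches and tags the infradead repository carries: one way to check what a remote publishes, without cloning it, is probably git ls-remote. What follows is a minimal sketch, not something anyone in the thread ran. The helper name list_remote_refs is hypothetical, and the URL in the usage comment is the one quoted in the thread, which may no longer be reachable.

```shell
# Sketch only: list the branch heads and tags a remote repository
# advertises, without cloning it. list_remote_refs is a hypothetical
# helper name; pass it the URL of whatever remote you want to inspect.
list_remote_refs() {
    repo_url=$1
    git ls-remote --heads --tags "$repo_url"
}

# Example (URL taken from the thread; it may no longer be live):
# list_remote_refs git://git.infradead.org/gcc.git
```

This shows only the advertised refs, not the commit log, so it answers "which branches and tags are exposed" rather than "is the full history there".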
From dannyb@google.com Wed Dec 5 23:35:00 2007 From: dannyb@google.com (Daniel Berlin) Date: Wed, 05 Dec 2007 23:35:00 -0000 Subject: Rant about ChangeLog entries and commit messages In-Reply-To: <1196896524.12541.3.camel@localhost> References: <2007-12-02-11-05-39+trackit+sam@rfc1149.net> <47543DE8.3010003@google.com> <1196896524.12541.3.camel@localhost> Message-ID: <2fbe2a060712051535l7a9991c9q573106ac6f02a1ed@mail.gmail.com> On Dec 5, 2007 6:15 PM, Ben Elliston wrote: > On Tue, 2007-12-04 at 09:18 -0700, Tom Tromey wrote: > > > First, continuing to have good quality messages. Right now at the > > very least you get a (semi-) accurate record of what was touched. > > I've seen plenty of ChangeLog-less projects out there than end up with > > commits like "fixed a bug", or even worse. > > Something else that hasn't been raised is that ChangeLogs can be > revised. We often see people making mistakes with their ChangeLog > entries, but since the ChangeLog is versioned, they can revise it. If > you screw up a commit message, it's much harder to fix it (and a purist > might argue that to do so would be destroying revision history). Uh? svn propedit --revision svn:log Hope this helps! 
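Spelled out in full, the command above might look like the sketch below. The helper name is made up for illustration, and the revision number and repository URL in the usage comment are placeholders rather than values from this thread. Subversion only accepts the edit if the repository has a pre-revprop-change hook that allows it, and the old message is overwritten rather than versioned, which is the purist's objection Ben mentions.

```shell
# Sketch only: edit_svn_log is a hypothetical helper name.
# Usage: edit_svn_log REV REPO_URL
# svn:log is an unversioned revision property, hence --revprop.
edit_svn_log() {
    svn propedit svn:log --revprop -r "$1" "$2"
}

# Example with placeholder values (opens an editor on the old message):
#   edit_svn_log 12345 svn://example.org/svn/project
```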
From harvey.harrison@gmail.com Wed Dec 5 23:37:00 2007 From: harvey.harrison@gmail.com (Harvey Harrison) Date: Wed, 05 Dec 2007 23:37:00 -0000 Subject: Git and GCC In-Reply-To: References: <4aca3dc20712051108s216d3331t8061ef45b9aa324a@mail.gmail.com> <2007-12-05-21-23-14+trackit+sam@rfc1149.net> <1196891451.10408.54.camel@brick> Message-ID: <1196897840.10408.57.camel@brick> On Thu, 2007-12-06 at 00:34 +0100, Andreas Schwab wrote: > Harvey Harrison writes: > > > On Wed, 2007-12-05 at 21:23 +0100, Samuel Tardieu wrote: > >> >>>>> "Daniel" == Daniel Berlin writes: > >> > >> Daniel> So I tried a full history conversion using git-svn of the gcc > >> Daniel> repository (IE every trunk revision from 1-HEAD as of > >> Daniel> yesterday) The git-svn import was done using repacks every > >> Daniel> 1000 revisions. After it finished, I used git-gc --aggressive > >> Daniel> --prune. Two hours later, it finished. The final size after > >> Daniel> this is 1.5 gig for all of the history of gcc for just trunk. > >> > >> Most of the space is probably taken by the SVN specific data. To get > >> an idea of how GIT would handle GCC data, you should clone the GIT > >> directory or checkout one from infradead.org: > >> > >> % git clone git://git.infradead.org/gcc.git > >> > > > > Actually I went through and created the basis for that repo. It > > contains all branches and tags in the gcc svn repo and the final > > pack comes to about 600M. This has _everything_, not just trunk. > > Not everything. Only trunk and a few selected branches, and no tags. > Yes, everything, by default you only get the more modern branches/tags, but it's all in there. If there is interest I can work with Bernardo and get the rest publically exposed. 
Harvey From aaw@google.com Thu Dec 6 00:01:00 2007 From: aaw@google.com (Ollie Wild) Date: Thu, 06 Dec 2007 00:01:00 -0000 Subject: Git and GCC In-Reply-To: <4aca3dc20712051340l616480bau28030ee6ba906d0a@mail.gmail.com> References: <4aca3dc20712051108s216d3331t8061ef45b9aa324a@mail.gmail.com> <65dd6fd50712051136j2528b15yc16e51f1457ed6b1@mail.gmail.com> <4aca3dc20712051340l616480bau28030ee6ba906d0a@mail.gmail.com> Message-ID: <65dd6fd50712051601i8b28a97g5950a03e78973d23@mail.gmail.com> On Dec 5, 2007 1:40 PM, Daniel Berlin wrote: > > > Out of curiosity, how much of that is the .git/svn directory? This is > > where git-svn-specific data is stored. It is *very* inefficient, at > > least for the 1.5.2.5 version I'm using. > > > > I was only counting the space in the packs dir. In my personal client, which includes the entire history of GCC, the packs dir is only 652MB. Obviously, you're not a big fan of Git, and you're entitled to your opinion. I, however, find it very useful. Given a choice between Git and Mercurial, I choose git, but only because I have prior experience working with the Linux kernel. From what I've heard, both do the job reasonably well. Thanks to git-svn, using Git to develop GCC is practical with or without explicit support from the GCC maintainers. As I see it, the main barrier is the inordinate amount of time it takes to bring up a repository from scratch. As has already been noted, Harvey has provided a read-only copy, but it (a) only allows access to a subset of GCC's branches and (b) doesn't provide a mechanism for developers to push changes directly via git-svn. This sounds like a homework project. I'll do some investigation and see if I can come up with a good bootstrap process.
Ollie From paul@codesourcery.com Thu Dec 6 00:17:00 2007 From: paul@codesourcery.com (Paul Brook) Date: Thu, 06 Dec 2007 00:17:00 -0000 Subject: iWMMXt/Linux EABI toolchain In-Reply-To: <20060301173402.40630.qmail@web25011.mail.ukl.yahoo.com> References: <20060301173402.40630.qmail@web25011.mail.ukl.yahoo.com> Message-ID: <200712060017.31717.paul@codesourcery.com> > > > > Thanks for the quick response! > > > > I'm sure it seems I like to make hard work for myself! It gets worse, > > > > I'm porting Gentoo Linux to iWMMXt with pure EABI kernel and > > > > userspace. I'm not concerned about being able to run old binaries. > > > > So is using abi=iwmmxt really not what I want? A really bad idea? > > > > > > Absolutely. You want the AAPCS, not Intel's pre-AAPCS ABI. > > > > Actually, -mabi=iwmmxt is AAPCS based. It's different from the old intel > > iwmmxt ABI. Yes, but not all AAPCS ABIs are equal. There are some aspects of the ABI (e.g. enum sizes) that are target specific. gcc currently does not have an option for both Linux and iwmmxt. Paul From bje@au1.ibm.com Thu Dec 6 00:29:00 2007 From: bje@au1.ibm.com (Ben Elliston) Date: Thu, 06 Dec 2007 00:29:00 -0000 Subject: Rant about ChangeLog entries and commit messages In-Reply-To: <2fbe2a060712051535l7a9991c9q573106ac6f02a1ed@mail.gmail.com> References: <2007-12-02-11-05-39+trackit+sam@rfc1149.net> <47543DE8.3010003@google.com> <1196896524.12541.3.camel@localhost> <2fbe2a060712051535l7a9991c9q573106ac6f02a1ed@mail.gmail.com> Message-ID: <1196899962.17430.0.camel@localhost> On Wed, 2007-12-05 at 18:35 -0500, Daniel Berlin wrote: > svn propedit --revision svn:log OK, well, it used to be a bit trickier in CVS ..
:-) Ben From dewar@adacore.com Thu Dec 6 00:42:00 2007 From: dewar@adacore.com (Robert Dewar) Date: Thu, 06 Dec 2007 00:42:00 -0000 Subject: Rant about ChangeLog entries and commit messages In-Reply-To: <1196896524.12541.3.camel@localhost> References: <2007-12-02-11-05-39+trackit+sam@rfc1149.net> <47543DE8.3010003@google.com> <1196896524.12541.3.camel@localhost> Message-ID: <4757453C.9080408@adacore.com> Ben Elliston wrote: > Something else that hasn't been raised is that ChangeLogs can be > revised. We often see people making mistakes with their ChangeLog > entries, but since the ChangeLog is versioned, they can revise it. If > you screw up a commit message, it's much harder to fix it (and a purist > might argue that to do so would be destroying revision history). What we do with Ada is to allow *additions* to an existing revision history entry, but not modifications of what is already there. From hjl@lucon.org Thu Dec 6 02:07:00 2007 From: hjl@lucon.org (H.J. Lu) Date: Thu, 06 Dec 2007 02:07:00 -0000 Subject: libbid and floatingpoint exception access funcs In-Reply-To: <20071205140232.GB11485@aon.at> References: <20071205140232.GB11485@aon.at> Message-ID: <20071206020637.GA5180@lucon.org> Hi Bernhard, Please open a gcc bug and assign it to me. Thanks. H.J. --- On Wed, Dec 05, 2007 at 03:02:32PM +0100, Bernhard Fischer wrote: > Hi, > > My libc is configured to omit any FP support (UCLIBC_HAS_FLOATS is not set) > but the recent libbid updates seems to unconditionally pull in floatingpoint > accessor functions thus breaking bootstrap. My notes on this read: > > --------8<-------- > Follows: > Precedes: > > do not pull in allegedly unneeded floatingpoint exception access funcs > > HJL's recent update of libbid would pull in Floating-point exception > handling, although __GCC_FLOAT_NOT_NEEDED is defined. > > Prevent pulling in feclearexcept, feraiseexcept et al for now. > FIXME: revisit > --------8<-------- > > H.J., please advise. 
> > PS: I currently do: > libgcc/ChangeLog: > 2007-10-13 Bernhard Fischer <> > > * config/libbid/bid_conf.h: Do not define > DECIMAL_GLOBAL_EXCEPTION_FLAGS_ACCESS_FUNCTIONS if > __GCC_FLOAT_NOT_NEEDED is defined. > Index: gcc-4.3.0/libgcc/config/libbid/bid_conf.h > =================================================================== > --- gcc-4.3.0/libgcc/config/libbid/bid_conf.h (revision 129202) > +++ gcc-4.3.0/libgcc/config/libbid/bid_conf.h (working copy) > @@ -535,7 +535,9 @@ Software Foundation, 51 Franklin Street, > #define DECIMAL_GLOBAL_ROUNDING 1 > #define DECIMAL_GLOBAL_ROUNDING_ACCESS_FUNCTIONS 1 > #define DECIMAL_GLOBAL_EXCEPTION_FLAGS 1 > +#ifndef __GCC_FLOAT_NOT_NEEDED > #define DECIMAL_GLOBAL_EXCEPTION_FLAGS_ACCESS_FUNCTIONS 1 > +#endif > #define BID_HAS_GCC_DECIMAL_INTRINSICS 1 > #endif /* IN_LIBGCC2 */ > From davem@davemloft.net Thu Dec 6 02:28:00 2007 From: davem@davemloft.net (David Miller) Date: Thu, 06 Dec 2007 02:28:00 -0000 Subject: Git and GCC In-Reply-To: <4aca3dc20712051108s216d3331t8061ef45b9aa324a@mail.gmail.com> References: <4aca3dc20712051108s216d3331t8061ef45b9aa324a@mail.gmail.com> Message-ID: <20071205.182815.249974508.davem@davemloft.net> From: "Daniel Berlin" Date: Wed, 5 Dec 2007 14:08:41 -0500 > So I tried a full history conversion using git-svn of the gcc > repository (IE every trunk revision from 1-HEAD as of yesterday) > The git-svn import was done using repacks every 1000 revisions. > After it finished, I used git-gc --aggressive --prune. Two hours > later, it finished. > The final size after this is 1.5 gig for all of the history of gcc for > just trunk. 
> > dberlin@home:/compilerstuff/gitgcc/gccrepo/.git/objects/pack$ ls -trl > total 1568899 > -r--r--r-- 1 dberlin dberlin 1585972834 2007-12-05 14:01 > pack-cd328fcf0bd673d8f2f72c42fbe67da64cbcd218.pack > -r--r--r-- 1 dberlin dberlin 19008488 2007-12-05 14:01 > pack-cd328fcf0bd673d8f2f72c42fbe67da64cbcd218.idx > > This is 3x bigger than hg *and* hg doesn't require me to waste my life > repacking every so often. > The hg operations run roughly as fast as the git ones > > I'm sure there are magic options, magic command lines, etc, i could > use to make it smaller. > > I'm sure if i spent the next few weeks fucking around with git, it may > even be usable! > > But given that git is harder to use, requires manual repacking to get > any kind of sane space usage, and is 3x bigger anyway, i don't see any > advantage to continuing to experiment with git and gcc. I would really appreciate it if you would share experiences like this with the GIT community, who have been now CC:'d. That's the only way this situation is going to improve. When you don't CC: the people who can fix the problem, I can only speculate that perhaps at least subconsciously you don't care if the situation improves or not. The OpenSolaris folks behaved similarly, and that really ticked me off. From dberlin@dberlin.org Thu Dec 6 02:41:00 2007 From: dberlin@dberlin.org (Daniel Berlin) Date: Thu, 06 Dec 2007 02:41:00 -0000 Subject: Git and GCC In-Reply-To: <20071205.182815.249974508.davem@davemloft.net> References: <4aca3dc20712051108s216d3331t8061ef45b9aa324a@mail.gmail.com> <20071205.182815.249974508.davem@davemloft.net> Message-ID: <4aca3dc20712051841o71ab773ft6dd0714ebc355dd5@mail.gmail.com> On 12/5/07, David Miller wrote: > From: "Daniel Berlin" > Date: Wed, 5 Dec 2007 14:08:41 -0500 > > > So I tried a full history conversion using git-svn of the gcc > > repository (IE every trunk revision from 1-HEAD as of yesterday) > > The git-svn import was done using repacks every 1000 revisions. 
> > After it finished, I used git-gc --aggressive --prune. Two hours > > later, it finished. > > The final size after this is 1.5 gig for all of the history of gcc for > > just trunk. > > > > dberlin@home:/compilerstuff/gitgcc/gccrepo/.git/objects/pack$ ls -trl > > total 1568899 > > -r--r--r-- 1 dberlin dberlin 1585972834 2007-12-05 14:01 > > pack-cd328fcf0bd673d8f2f72c42fbe67da64cbcd218.pack > > -r--r--r-- 1 dberlin dberlin 19008488 2007-12-05 14:01 > > pack-cd328fcf0bd673d8f2f72c42fbe67da64cbcd218.idx > > > > This is 3x bigger than hg *and* hg doesn't require me to waste my life > > repacking every so often. > > The hg operations run roughly as fast as the git ones > > > > I'm sure there are magic options, magic command lines, etc, i could > > use to make it smaller. > > > > I'm sure if i spent the next few weeks fucking around with git, it may > > even be usable! > > > > But given that git is harder to use, requires manual repacking to get > > any kind of sane space usage, and is 3x bigger anyway, i don't see any > > advantage to continuing to experiment with git and gcc. > > I would really appreciate it if you would share experiences > like this with the GIT community, who have been now CC:'d. > > That's the only way this situation is going to improve. > > When you don't CC: the people who can fix the problem, I can only > speculate that perhaps at least subconsciously you don't care if > the situation improves or not. > I didn't cc the git community for three reasons 1. It's not the nicest message in the world, and thus, more likely to get bad responses than constructive ones. 2. Based on the level of usability, I simply assume it is too young for regular developers to use. At least, I hope this is the case. 3. People i know have had bad experiences talking usability issues with the git community in the past. 
I am not likely to fare any better, so I would rather have someone who is involved with both our community and theirs, raise these issues, rather than a complete newcomer. But hey, whatever floats your boat :) It is true I gave up quickly, but this is mainly because i don't like to fight with my tools. I am quite fine with a distributed workflow, I now use 8 or so gcc branches in mercurial (auto synced from svn) and merge a lot between them. I wanted to see if git would sanely let me manage the commits back to svn. After fighting with it, i gave up and just wrote a python extension to hg that lets me commit non-svn changesets back to svn directly from hg. --Dan From davem@davemloft.net Thu Dec 6 02:52:00 2007 From: davem@davemloft.net (David Miller) Date: Thu, 06 Dec 2007 02:52:00 -0000 Subject: Git and GCC In-Reply-To: <4aca3dc20712051841o71ab773ft6dd0714ebc355dd5@mail.gmail.com> References: <4aca3dc20712051108s216d3331t8061ef45b9aa324a@mail.gmail.com> <20071205.182815.249974508.davem@davemloft.net> <4aca3dc20712051841o71ab773ft6dd0714ebc355dd5@mail.gmail.com> Message-ID: <20071205.185203.262588544.davem@davemloft.net> From: "Daniel Berlin" Date: Wed, 5 Dec 2007 21:41:19 -0500 > It is true I gave up quickly, but this is mainly because i don't like > to fight with my tools. > I am quite fine with a distributed workflow, I now use 8 or so gcc > branches in mercurial (auto synced from svn) and merge a lot between > them. I wanted to see if git would sanely let me manage the commits > back to svn. After fighting with it, i gave up and just wrote a > python extension to hg that lets me commit non-svn changesets back to > svn directly from hg. I find it ironic that you were even willing to write tools to facilitate your hg based gcc workflow. That really shows what your thinking is on this matter, in that you're willing to put effort towards making hg work better for you but you're not willing to expend that level of effort to see if git can do so as well. 
This is what really eats me from the inside about your dissatisfaction with git. Your analysis seems to be a self-fulfilling prophecy, and that's totally unfair to both hg and git. From dberlin@dberlin.org Thu Dec 6 03:47:00 2007 From: dberlin@dberlin.org (Daniel Berlin) Date: Thu, 06 Dec 2007 03:47:00 -0000 Subject: Git and GCC In-Reply-To: <20071205.185203.262588544.davem@davemloft.net> References: <4aca3dc20712051108s216d3331t8061ef45b9aa324a@mail.gmail.com> <20071205.182815.249974508.davem@davemloft.net> <4aca3dc20712051841o71ab773ft6dd0714ebc355dd5@mail.gmail.com> <20071205.185203.262588544.davem@davemloft.net> Message-ID: <4aca3dc20712051947t5fbbb383ua1727c652eb25d7e@mail.gmail.com> On 12/5/07, David Miller wrote: > From: "Daniel Berlin" > Date: Wed, 5 Dec 2007 21:41:19 -0500 > > > It is true I gave up quickly, but this is mainly because i don't like > > to fight with my tools. > > I am quite fine with a distributed workflow, I now use 8 or so gcc > > branches in mercurial (auto synced from svn) and merge a lot between > > them. I wanted to see if git would sanely let me manage the commits > > back to svn. After fighting with it, i gave up and just wrote a > > python extension to hg that lets me commit non-svn changesets back to > > svn directly from hg. > > I find it ironic that you were even willing to write tools to > facilitate your hg based gcc workflow. Why? > That really shows what your > thinking is on this matter, in that you're willing to put effort > towards making hg work better for you but you're not willing to expend > that level of effort to see if git can do so as well. See, now you claim to know my thinking. I went back to hg because GIT's space usage wasn't even in the ballpark, and i couldn't get git-svn rebase to update the revs after the initial import (even though i had properly used a rewriteRoot). The size is clearly not just svn data, it's in the git pack itself.
I spent a long time working on SVN to reduce its space usage (repo side and cleaning up the client side and giving a path to svn devs to reduce it further), as well as ui issues, and I really don't feel like having to do the same for GIT. I'm tired of having to spend a large amount of effort to get my tools to work. If the community wants to find and fix the problem, i've already said repeatedly i'll happily give over my repo, data, whatever. You are correct i am not going to spend even more effort when i can be productive with something else much quicker. The devil i know (committing to svn) is better than the devil i don't (diving into git source code and finding/fixing what is causing this space blowup). The python extension took me a few hours (< 4). In git, i spent these hours waiting for git-gc to finish. > This is what really eats me from the inside about your dissatisfaction > with git. Your analysis seems to be a self-fulfilling prophecy, and > that's totally unfair to both hg and git. Oh? You seem to be taking this awfully personally. I came into this completely open minded. Really, I did (i'm sure you'll claim otherwise). GIT people told me it would work great and i'd have a really small git repo and be able to commit back to svn. I tried it. It didn't work out. It doesn't seem to be usable for whatever reason. I'm happy to give details, data, whatever. I made the engineering decision that my effort would be better spent doing something I knew i could do quickly (make hg commit back to svn for my purposes) than trying to improve larger issues in GIT (UI and space usage). That took me a few hours, and I was happy again. I would have been incredibly happy to have git just have come up with a 400 meg gcc repository, and to be happily committing away from git-svn to gcc's repository ... But it didn't happen. So far, you have yet to actually do anything but incorrectly tell me what I am thinking.
I'll probably try again in 6 months, and maybe it will be better. From jadamcze@utas.edu.au Thu Dec 6 04:04:00 2007 From: jadamcze@utas.edu.au (Jonathan Adamczewski) Date: Thu, 06 Dec 2007 04:04:00 -0000 Subject: Function specific optimizations call for discussion In-Reply-To: <20071128205737.GA28277@mmeissner-gold.amd.com> References: <20071128205737.GA28277@mmeissner-gold.amd.com> Message-ID: <47577411.5000608@utas.edu.au> Michael Meissner wrote: > One of the things that I've been interested in is adding support to GCC to > compile individual functions with specific target options. I first presented a > draft at the Google mini-summit, and then another draft at the GCC developer > summit last July. > > ... > > The proposal is at: > http://gcc.gnu.org/wiki/FunctionSpecificOpt > Have you given any thought to specifying --param values? jonathan. From davem@davemloft.net Thu Dec 6 04:20:00 2007 From: davem@davemloft.net (David Miller) Date: Thu, 06 Dec 2007 04:20:00 -0000 Subject: Git and GCC In-Reply-To: <4aca3dc20712051947t5fbbb383ua1727c652eb25d7e@mail.gmail.com> References: <4aca3dc20712051841o71ab773ft6dd0714ebc355dd5@mail.gmail.com> <20071205.185203.262588544.davem@davemloft.net> <4aca3dc20712051947t5fbbb383ua1727c652eb25d7e@mail.gmail.com> Message-ID: <20071205.202047.58135920.davem@davemloft.net> From: "Daniel Berlin" Date: Wed, 5 Dec 2007 22:47:01 -0500 > The size is clearly not just svn data, it's in the git pack itself. And other users have shown much smaller metadata from a GIT import, and yes those are including all of the repository history and branches not just the trunk. 
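Pack sizes like the ones being compared in this thread can be read straight from a repository instead of from ls output. The sketch below assumes a Git version whose `git count-objects -v` output includes a `size-pack` line (reported in KiB); the helper name is made up for illustration.

```shell
# Sketch only: pack_size_kib is a hypothetical helper name.
# Prints the on-disk size of all packs in the current repository, in KiB,
# taken from the "size-pack" line of `git count-objects -v`.
pack_size_kib() {
    git count-objects -v | awk '/^size-pack:/ { print $2 }'
}

# Example (run inside a repository):
#   pack_size_kib
```

Running it before and after a repack gives a like-for-like number for the same repository.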
From harvey.harrison@gmail.com Thu Dec 6 04:25:00 2007 From: harvey.harrison@gmail.com (Harvey Harrison) Date: Thu, 06 Dec 2007 04:25:00 -0000 Subject: Git and GCC In-Reply-To: <4aca3dc20712051947t5fbbb383ua1727c652eb25d7e@mail.gmail.com> References: <4aca3dc20712051108s216d3331t8061ef45b9aa324a@mail.gmail.com> <20071205.182815.249974508.davem@davemloft.net> <4aca3dc20712051841o71ab773ft6dd0714ebc355dd5@mail.gmail.com> <20071205.185203.262588544.davem@davemloft.net> <4aca3dc20712051947t5fbbb383ua1727c652eb25d7e@mail.gmail.com> Message-ID: <1196915112.10408.66.camel@brick> I fought with this a few months ago when I did my own clone of gcc svn. My bad for only discussing this on #git at the time. Should have put this to the list as well. If anyone recalls, my report was something along the lines of: git gc --aggressive explodes pack size. git repack -a -d --depth=100 --window=100 produced a ~550MB packfile; immediately afterwards a git gc --aggressive produces a 1.5G packfile. This was for all branches/tags, not just trunk like Daniel's repo. The best theory I had at the time was that the gc doesn't find as good deltas or doesn't allow the same delta chain depth and so generates a new object in the pack, rather than reusing a good delta it already has in the well-packed pack. Cheers, Harvey From harvey.harrison@gmail.com Thu Dec 6 04:28:00 2007 From: harvey.harrison@gmail.com (Harvey Harrison) Date: Thu, 06 Dec 2007 04:28:00 -0000 Subject: Git and GCC In-Reply-To: <20071205.202047.58135920.davem@davemloft.net> References: <4aca3dc20712051841o71ab773ft6dd0714ebc355dd5@mail.gmail.com> <20071205.185203.262588544.davem@davemloft.net> <4aca3dc20712051947t5fbbb383ua1727c652eb25d7e@mail.gmail.com> <20071205.202047.58135920.davem@davemloft.net> Message-ID: <1196915319.10408.71.camel@brick> On Wed, 2007-12-05 at 20:20 -0800, David Miller wrote: > From: "Daniel Berlin" > Date: Wed, 5 Dec 2007 22:47:01 -0500 > > > The size is clearly not just svn data, it's in the git pack itself.
> > And other users have shown much smaller metadata from a GIT import, > and yes those are including all of the repository history and branches > not just the trunk. David, I think it is actually a bug in git gc with the --aggressive option...mind you, even if he solves that the format git svn uses for its bi-directional metadata is so space-inefficient Daniel will be crying for other reasons immediately afterwards...4MB for every branch and tag in gcc svn (more than a few thousand). You only need it around for any branches you are planning on committing to but it is all created during the default git svn import. FYI Harvey From dberlin@dberlin.org Thu Dec 6 04:33:00 2007 From: dberlin@dberlin.org (Daniel Berlin) Date: Thu, 06 Dec 2007 04:33:00 -0000 Subject: Git and GCC In-Reply-To: <20071205.202047.58135920.davem@davemloft.net> References: <4aca3dc20712051841o71ab773ft6dd0714ebc355dd5@mail.gmail.com> <20071205.185203.262588544.davem@davemloft.net> <4aca3dc20712051947t5fbbb383ua1727c652eb25d7e@mail.gmail.com> <20071205.202047.58135920.davem@davemloft.net> Message-ID: <4aca3dc20712052032n521c344cla07a5df1f2c26cb8@mail.gmail.com> On 12/5/07, David Miller wrote: > From: "Daniel Berlin" > Date: Wed, 5 Dec 2007 22:47:01 -0500 > > > The size is clearly not just svn data, it's in the git pack itself. > > And other users have shown much smaller metadata from a GIT import, > and yes those are including all of the repository history and branches > not just the trunk. I followed the instructions in the tutorials. I followed the instructions given to me by people who created these. I came up with a 1.5 gig pack file. You want to help, or you want to argue with me. Right now it sounds like you are trying to blame me or make it look like i did something wrong. You are of course, welcome to try it yourself. I can give you the exact commands I gave, and with git 1.5.3.7, it will give you a 1.5 gig pack file.
From davem@davemloft.net Thu Dec 6 04:48:00 2007 From: davem@davemloft.net (David Miller) Date: Thu, 06 Dec 2007 04:48:00 -0000 Subject: Git and GCC In-Reply-To: <4aca3dc20712052032n521c344cla07a5df1f2c26cb8@mail.gmail.com> References: <4aca3dc20712051947t5fbbb383ua1727c652eb25d7e@mail.gmail.com> <20071205.202047.58135920.davem@davemloft.net> <4aca3dc20712052032n521c344cla07a5df1f2c26cb8@mail.gmail.com> Message-ID: <20071205.204848.227521641.davem@davemloft.net> From: "Daniel Berlin" Date: Wed, 5 Dec 2007 23:32:52 -0500 > On 12/5/07, David Miller wrote: > > From: "Daniel Berlin" > > Date: Wed, 5 Dec 2007 22:47:01 -0500 > > > > > The size is clearly not just svn data, it's in the git pack itself. > > > > And other users have shown much smaller metadata from a GIT import, > > and yes those are including all of the repository history and branches > > not just the trunk. > I followed the instructions in the tutorials. > I followed the instructions given to me by people who created these. > I came up with a 1.5 gig pack file. > You want to help, or you want to argue with me. Several people replied in this thread showing what options can lead to smaller pack files. They also listed what the GIT limitations are that would affect the kind of work you are doing, which seemed to mostly deal with the high space cost of branching and tags when converting to/from SVN repos.
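For reference, the repack invocation Harvey reported earlier in the thread can be wrapped as in the sketch below. The helper name is made up for illustration; the default depth and window of 100 are the values from his message, where he reported a pack of about 550MB for all branches and tags. Other repositories or Git versions may give different results.

```shell
# Sketch only: repack_deep is a hypothetical helper name.
# Usage: repack_deep [DEPTH] [WINDOW]
# Both arguments default to 100, the values from Harvey's report.
repack_deep() {
    git repack -a -d --depth="${1:-100}" --window="${2:-100}"
}

# Example (run inside a repository; can take a long time on a large history):
#   repack_deep
```

Going by the same report, a later git gc --aggressive on the result would undo the gain, so the sketch stops at the repack.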
From torvalds@linux-foundation.org Thu Dec 6 04:54:00 2007 From: torvalds@linux-foundation.org (Linus Torvalds) Date: Thu, 06 Dec 2007 04:54:00 -0000 Subject: Git and GCC In-Reply-To: <1196915112.10408.66.camel@brick> References: <4aca3dc20712051108s216d3331t8061ef45b9aa324a@mail.gmail.com> <20071205.182815.249974508.davem@davemloft.net> <4aca3dc20712051841o71ab773ft6dd0714ebc355dd5@mail.gmail.com> <20071205.185203.262588544.davem@davemloft.net> <4aca3dc20712051947t5fbbb383ua1727c652eb25d7e@mail.gmail.com> <1196915112.10408.66.camel@brick> Message-ID: On Wed, 5 Dec 2007, Harvey Harrison wrote: > > If anyone recalls my report was something along the lines of > git gc --aggressive explodes pack size. Yes, --aggressive is generally a bad idea. I think we should remove it or at least fix it. It doesn't do what the name implies, because it actually throws away potentially good packing, and re-does it all from a clean slate. That said, it's totally pointless for a person who isn't a git proponent to do an initial import, and in that sense I agree with Daniel: he shouldn't waste his time with tools that he doesn't know or care about, since there are people who *can* do a better job, and who know what they are doing, and understand and like the tool. While you can do a half-assed job with just mindlessly running "git svnimport" (which is deprecated these days) or "git svn clone" (better), the fact is, to do a *good* import does likely mean spending some effort on it. Trying to make the user names / emails to be better with a mailmap, for example. [ By default, for example, "git svn clone/fetch" seems to create those horrible fake email addresses that contain the ID of the SVN repo in each commit - I'm not talking about the "git-svn-id", I'm talking about the "user@hex-string-goes-here" thing for the author. Maybe people don't really care, but isn't that ugly as hell? I'd think it's worth it doing a really nice import, spending some effort on it. 
But maybe those things come from the older CVS->SVN import, I don't really know. I've done a few SVN imports, but I've done them just for stuff where I didn't want to touch SVN, but just wanted to track some project like libgpod. For things like *that*, a totally mindless "git svn" thing is fine ] Of course, that does require there to be git people in the gcc crowd who are motivated enough to do the proper import and then make sure it's up-to-date and hosted somewhere. If those people don't exist, I'm not sure there's much idea to it. The point being, you cannot ask a non-git person to do a major git import for an actual switch-over. Yes, it *can* be as simple as just doing a git svn clone --stdlayout svn://gcc.gnu.org/svn/gcc gcc but the fact remains, you want to spend more effort and expertise on it if you actually want the result to be used as a basis for future work (as opposed to just tracking somebody else's SVN tree). That includes: - do the historic import with good packing (and no, "--aggressive" is not it, never mind the misleading name and man-page) - probably mailmap entries, certainly spending some time validating the results. - hosting it and perhaps most importantly - helping people who are *not* git users get up to speed. because doing a good job at it is like asking a CVS newbie to set up a branch in CVS. I'm sure you can do it from man-pages, but I'm also sure you sure as hell won't like the end result.
Linus From harvey.harrison@gmail.com Thu Dec 6 05:04:00 2007 From: harvey.harrison@gmail.com (Harvey Harrison) Date: Thu, 06 Dec 2007 05:04:00 -0000 Subject: Git and GCC In-Reply-To: References: <4aca3dc20712051108s216d3331t8061ef45b9aa324a@mail.gmail.com> <20071205.182815.249974508.davem@davemloft.net> <4aca3dc20712051841o71ab773ft6dd0714ebc355dd5@mail.gmail.com> <20071205.185203.262588544.davem@davemloft.net> <4aca3dc20712051947t5fbbb383ua1727c652eb25d7e@mail.gmail.com> <1196915112.10408.66.camel@brick> Message-ID: <1196917487.10408.82.camel@brick> On Wed, 2007-12-05 at 20:54 -0800, Linus Torvalds wrote: > > On Wed, 5 Dec 2007, Harvey Harrison wrote: > > > > If anyone recalls my report was something along the lines of > > git gc --aggressive explodes pack size. > [ By default, for example, "git svn clone/fetch" seems to create those > horrible fake email addresses that contain the ID of the SVN repo in > each commit - I'm not talking about the "git-svn-id", I'm talking about > the "user@hex-string-goes-here" thing for the author. Maybe people don't > really care, but isn't that ugly as hell? I'd think it's worth it doing > a really nice import, spending some effort on it. > > But maybe those things come from the older CVS->SVN import, I don't > really know. I've done a few SVN imports, but I've done them just for > stuff where I didn't want to touch SVN, but just wanted to track some > project like libgpod. For things like *that*, a totally mindless "git > svn" thing is fine ] > git svn does accept a mailmap at import time with the same format as the cvs importer I think. But for someone that just wants a repo to check out this was easiest. I'd be willing to spend the time to do a nicer job if there was any interest from the gcc side, but I'm not that invested (other than owing them for an often-used tool). 
Harvey From dberlin@dberlin.org Thu Dec 6 05:11:00 2007 From: dberlin@dberlin.org (Daniel Berlin) Date: Thu, 06 Dec 2007 05:11:00 -0000 Subject: Git and GCC In-Reply-To: <20071205.204848.227521641.davem@davemloft.net> References: <4aca3dc20712051947t5fbbb383ua1727c652eb25d7e@mail.gmail.com> <20071205.202047.58135920.davem@davemloft.net> <4aca3dc20712052032n521c344cla07a5df1f2c26cb8@mail.gmail.com> <20071205.204848.227521641.davem@davemloft.net> Message-ID: <4aca3dc20712052111o730f6fb6h7a329ee811a70f28@mail.gmail.com> On 12/5/07, David Miller wrote: > From: "Daniel Berlin" > Date: Wed, 5 Dec 2007 23:32:52 -0500 > > > On 12/5/07, David Miller wrote: > > > From: "Daniel Berlin" > > > Date: Wed, 5 Dec 2007 22:47:01 -0500 > > > > > > > The size is clearly not just svn data, it's in the git pack itself. > > > > > > And other users have shown much smaller metadata from a GIT import, > > > and yes those are including all of the repository history and branches > > > not just the trunk. > > I followed the instructions in the tutorials. > > I followed the instructions given to by people who created these. > > I came up with a 1.5 gig pack file. > > You want to help, or you want to argue with me. > > Several people replied in this thread showing what options can lead to > smaller pack files. Actually, one person did, but that's okay, let's assume it was several. I am currently trying Harvey's options. I asked about using the pre-existing repos so i didn't have to do this, but they were all 1. Done using read-only imports or 2. Don't contain full history (IE the one that contains full history that is often posted here was done as a read only import and thus doesn't have the metadata). > They also listed what the GIT limitations are that would effect the > kind of work you are doing, which seemed to mostly deal with the high > space cost of branching and tags when converting to/from SVN repos. 
Actually, it turns out that git-gc --aggressive does this dumb thing to pack files sometimes regardless of whether you converted from an SVN repo or not. From harvey.harrison@gmail.com Thu Dec 6 05:15:00 2007 From: harvey.harrison@gmail.com (Harvey Harrison) Date: Thu, 06 Dec 2007 05:15:00 -0000 Subject: Git and GCC In-Reply-To: <4aca3dc20712052111o730f6fb6h7a329ee811a70f28@mail.gmail.com> References: <4aca3dc20712051947t5fbbb383ua1727c652eb25d7e@mail.gmail.com> <20071205.202047.58135920.davem@davemloft.net> <4aca3dc20712052032n521c344cla07a5df1f2c26cb8@mail.gmail.com> <20071205.204848.227521641.davem@davemloft.net> <4aca3dc20712052111o730f6fb6h7a329ee811a70f28@mail.gmail.com> Message-ID: <1196918132.10408.85.camel@brick> On Thu, 2007-12-06 at 00:11 -0500, Daniel Berlin wrote: > On 12/5/07, David Miller wrote: > > From: "Daniel Berlin" > > Date: Wed, 5 Dec 2007 23:32:52 -0500 > > > > > On 12/5/07, David Miller wrote: > > > > From: "Daniel Berlin" > > > > Date: Wed, 5 Dec 2007 22:47:01 -0500 > > > > > > > > > The size is clearly not just svn data, it's in the git pack itself. > > > > > > > > And other users have shown much smaller metadata from a GIT import, > > > > and yes those are including all of the repository history and branches > > > > not just the trunk. > > > I followed the instructions in the tutorials. > > > I followed the instructions given to by people who created these. > > > I came up with a 1.5 gig pack file. > > > You want to help, or you want to argue with me. > > > > Several people replied in this thread showing what options can lead to > > smaller pack files. > > Actually, one person did, but that's okay, let's assume it was several. > I am currently trying Harvey's options. > > I asked about using the pre-existing repos so i didn't have to do > this, but they were all > 1. Done using read-only imports or > 2. 
Don't contain full history > (IE the one that contains full history that is often posted here was > done as a read only import and thus doesn't have the metadata). While you won't get the git svn metadata if you clone the infradead repo, it can be recreated on the fly by git svn if you want to start commiting directly to gcc svn. Harvey From dberlin@dberlin.org Thu Dec 6 05:17:00 2007 From: dberlin@dberlin.org (Daniel Berlin) Date: Thu, 06 Dec 2007 05:17:00 -0000 Subject: Git and GCC In-Reply-To: <1196918132.10408.85.camel@brick> References: <4aca3dc20712051947t5fbbb383ua1727c652eb25d7e@mail.gmail.com> <20071205.202047.58135920.davem@davemloft.net> <4aca3dc20712052032n521c344cla07a5df1f2c26cb8@mail.gmail.com> <20071205.204848.227521641.davem@davemloft.net> <4aca3dc20712052111o730f6fb6h7a329ee811a70f28@mail.gmail.com> <1196918132.10408.85.camel@brick> Message-ID: <4aca3dc20712052117j3ef5cf99y848d4962ae8ddf33@mail.gmail.com> > While you won't get the git svn metadata if you clone the infradead > repo, it can be recreated on the fly by git svn if you want to start > commiting directly to gcc svn. > I will give this a try :) From torvalds@linux-foundation.org Thu Dec 6 06:09:00 2007 From: torvalds@linux-foundation.org (Linus Torvalds) Date: Thu, 06 Dec 2007 06:09:00 -0000 Subject: Git and GCC In-Reply-To: <4aca3dc20712052111o730f6fb6h7a329ee811a70f28@mail.gmail.com> References: <4aca3dc20712051947t5fbbb383ua1727c652eb25d7e@mail.gmail.com> <20071205.202047.58135920.davem@davemloft.net> <4aca3dc20712052032n521c344cla07a5df1f2c26cb8@mail.gmail.com> <20071205.204848.227521641.davem@davemloft.net> <4aca3dc20712052111o730f6fb6h7a329ee811a70f28@mail.gmail.com> Message-ID: On Thu, 6 Dec 2007, Daniel Berlin wrote: > > Actually, it turns out that git-gc --aggressive does this dumb thing > to pack files sometimes regardless of whether you converted from an > SVN repo or not. Absolutely. git --aggressive is mostly dumb. 
It's really only useful for the case of "I know I have a *really* bad pack, and I want to throw away all the bad packing decisions I have done". To explain this, it's worth explaining (you are probably aware of it, but let me go through the basics anyway) how git delta-chains work, and how they are so different from most other systems. In other SCMs, a delta-chain is generally fixed. It might be "forwards" or "backwards", and it might evolve a bit as you work with the repository, but generally it's a chain of changes to a single file represented as some kind of single SCM entity. In CVS, it's obviously the *,v file, and a lot of other systems do rather similar things. Git also does delta-chains, but it does them a lot more "loosely". There is no fixed entity. Deltas are generated against any random other version that git deems to be a good delta candidate (with various fairly successful heuristics), and there are absolutely no hard grouping rules. This is generally a very good thing. It's good for various conceptual reasons (i.e. git internally never really even needs to care about the whole revision chain - it doesn't really think in terms of deltas at all), but it's also great because getting rid of the inflexible delta rules means that git doesn't have any problems at all with merging two files together, for example - there simply are no arbitrary *,v "revision files" that have some hidden meaning. It also means that the choice of deltas is a much more open-ended question. If you limit the delta chain to just one file, you really don't have a lot of choices on what to do about deltas, but in git, it really can be a totally different issue. And this is where the really badly named "--aggressive" comes in.
While git generally tries to re-use delta information (because it's a good idea, and it doesn't waste CPU time re-finding all the good deltas we found earlier), sometimes you want to say "let's start all over, with a blank slate, and ignore all the previous delta information, and try to generate a new set of deltas". So "--aggressive" is not really about being aggressive, but about wasting CPU time re-doing a decision we already did earlier! *Sometimes* that is a good thing. Some import tools in particular could generate really horribly bad deltas. Anything that uses "git fast-import", for example, likely doesn't have much of a great delta layout, so it might be worth saying "I want to start from a clean slate". But almost always, in other cases, it's actually a really bad thing to do. It's going to waste CPU time, and especially if you had actually done a good job at deltaing earlier, the end result isn't going to re-use all those *good* deltas you already found, so you'll actually end up with a much worse end result too! I'll send a patch to Junio to just remove the "git gc --aggressive" documentation. It can be useful, but it generally is useful only when you really understand at a very deep level what it's doing, and that documentation doesn't help you do that. Generally, doing incremental "git gc" is the right approach, and better than doing "git gc --aggressive". It's going to re-use old deltas, and when those old deltas can't be found (the reason for doing incremental GC in the first place!) it's going to create new ones. On the other hand, it's definitely true that an "initial import of a long and involved history" is a point where it can be worth spending a lot of time finding the *really*good* deltas. Then, every user ever after (as long as they don't use "git gc --aggressive" to undo it!) will get the advantage of that one-time event. 
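As a rough illustration of the reuse-versus-restart trade-off described above, here is a toy model in Python. It is only a sketch under stated assumptions and not git's actual packing code: `delta_cost`, `repack`, `total_size`, `HISTORY` and `NEW_OBJECT` are hypothetical names, the "delta size" is a crude character-difference count, and objects only ever delta against earlier entries in the list.

```python
"""Toy model of delta reuse versus a forced re-search (NOT git's real algorithm)."""


def delta_cost(base: str, target: str) -> int:
    """Crude stand-in for a delta size: differing positions plus length difference."""
    differing = sum(1 for x, y in zip(base, target) if x != y)
    return differing + abs(len(base) - len(target))


def repack(objects, window, depth, old_deltas=None, force=False):
    """Pick a delta base for every object.

    objects    -- list of (name, content) pairs in pack order
    window     -- how many preceding objects each object may try as a base
    depth      -- maximum allowed delta-chain length
    old_deltas -- result of an earlier repack, reused unless force is set
    Returns {name: (base_name_or_None, stored_size)}.
    """
    deltas, chain_len = {}, {}
    for i, (name, content) in enumerate(objects):
        if old_deltas is not None and not force and name in old_deltas:
            # Incremental case: keep the earlier decision, spend no CPU on it.
            base, size = old_deltas[name]
        else:
            # No usable earlier decision: search the window for the cheapest base.
            base, size = None, len(content)  # fall back to storing the object whole
            for cand_name, cand_content in objects[max(0, i - window):i]:
                if chain_len[cand_name] + 1 > depth:
                    continue  # using this base would make the chain too deep
                cost = delta_cost(cand_content, content)
                if cost < size:
                    base, size = cand_name, cost
        deltas[name] = (base, size)
        chain_len[name] = 0 if base is None else chain_len[base] + 1
    return deltas


def total_size(deltas) -> int:
    """Sum of stored sizes over all objects."""
    return sum(size for _base, size in deltas.values())


# Two interleaved "files" (A and B); versions of the same file are near-identical.
HISTORY = [
    ("A1", "aaaaaaaaaa"), ("B1", "zzzzzzzzzz"),
    ("A2", "aaaaaaaaab"), ("B2", "zzzzzzzzzy"),
    ("A3", "aaaaaaaabb"), ("B3", "zzzzzzzzyy"),
]
NEW_OBJECT = ("A4", "aaaaaaabbb")

if __name__ == "__main__":
    good = repack(HISTORY, window=2, depth=10)             # careful one-time pack
    later = HISTORY + [NEW_OBJECT]
    incremental = repack(later, window=1, depth=10, old_deltas=good)
    restarted = repack(later, window=1, depth=10, old_deltas=good, force=True)
    print(total_size(good), total_size(incremental), total_size(restarted))
```

In this toy setup the careful pack with the wider window finds the same-file bases, the incremental run keeps them and only searches for the new object, and the forced run with the narrower window throws them away and ends up larger.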
So especially for big projects with a long history, it's probably worth doing some extra work, telling the delta finding code to go wild. So the equivalent of "git gc --aggressive" - but done *properly* - is to do (overnight) something like git repack -a -d --depth=250 --window=250 where that depth thing is just about how deep the delta chains can be (make them longer for old history - it's worth the space overhead), and the window thing is about how big an object window we want each delta candidate to scan. And here, you might well want to add the "-f" flag (which is the "drop all old deltas" flag), since you now are actually trying to make sure that this one actually finds good candidates. And then it's going to take forever and a day (i.e. a "do it overnight" thing). But the end result is that everybody downstream from that repository will get much better packs, without having to spend any effort on it themselves. Linus From jonsmirl@gmail.com Thu Dec 6 06:48:00 2007 From: jonsmirl@gmail.com (Jon Smirl) Date: Thu, 06 Dec 2007 06:48:00 -0000 Subject: Git and GCC In-Reply-To: <4aca3dc20712052117j3ef5cf99y848d4962ae8ddf33@mail.gmail.com> References: <4aca3dc20712051947t5fbbb383ua1727c652eb25d7e@mail.gmail.com> <20071205.202047.58135920.davem@davemloft.net> <4aca3dc20712052032n521c344cla07a5df1f2c26cb8@mail.gmail.com> <20071205.204848.227521641.davem@davemloft.net> <4aca3dc20712052111o730f6fb6h7a329ee811a70f28@mail.gmail.com> <1196918132.10408.85.camel@brick> <4aca3dc20712052117j3ef5cf99y848d4962ae8ddf33@mail.gmail.com> Message-ID: <9e4733910712052247x116cabb4q48ebafffb93f7e03@mail.gmail.com> On 12/6/07, Daniel Berlin wrote: > > While you won't get the git svn metadata if you clone the infradead > > repo, it can be recreated on the fly by git svn if you want to start > > committing directly to gcc svn.
> > > I will give this a try :) Back when I was working on the Mozilla repository we were able to convert the full 4GB CVS repository complete with all history into a 450MB pack file. That work is where the git-fastimport tool came from. But it took a month of messing with the import tools to achieve this and Mozilla still chose another VCS (mainly because of poor Windows support in git). Like Linus says, this type of command will yield the smallest pack file: git repack -a -d --depth=250 --window=250 I do agree that importing multi-gigabyte repositories is not a daily occurrence nor a turn-key operation. There are significant issues when translating from one VCS to another. The lack of global branch tracking in CVS causes extreme problems on import. Hand editing of CVS files also caused endless trouble. The key to converting repositories of this size is RAM. 4GB minimum, more would be better. git-repack is not multi-threaded. There were a few attempts at making it multi-threaded but none were too successful. If I remember right, with loads of RAM, a repack on a 450MB repository was taking about five hours on a 2.8Ghz Core2. But this is something you only have to do once for the import. Later repacks will reuse the original deltas. 
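For anyone scripting the one-time import repack, the command line discussed in this thread can be assembled as in the small Python sketch below. The helper name `repack_argv` is hypothetical; the flags themselves (-a, -d, -f, --depth, --window) are only the ones quoted in the thread, and nothing here runs git.

```python
import shlex


def repack_argv(window=250, depth=250, force=False):
    """Build the argument list for the repack command quoted in this thread.

    force=True adds -f, which makes the repack ignore previously found deltas.
    The function only builds the list; running it is left to the caller.
    """
    argv = ["git", "repack", "-a", "-d"]
    if force:
        argv.append("-f")
    argv += [f"--depth={depth}", f"--window={window}"]
    return argv


if __name__ == "__main__":
    # Show the command for a fresh import, where old deltas are not worth keeping.
    print(shlex.join(repack_argv(force=True)))
```

Passing the resulting list to something like `subprocess.run` would be the obvious next step, but given the run times reported here it is probably best started by hand the first time.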
-- Jon Smirl jonsmirl@gmail.com From peff@peff.net Thu Dec 6 07:15:00 2007 From: peff@peff.net (Jeff King) Date: Thu, 06 Dec 2007 07:15:00 -0000 Subject: Git and GCC In-Reply-To: <9e4733910712052247x116cabb4q48ebafffb93f7e03@mail.gmail.com> References: <4aca3dc20712051947t5fbbb383ua1727c652eb25d7e@mail.gmail.com> <20071205.202047.58135920.davem@davemloft.net> <4aca3dc20712052032n521c344cla07a5df1f2c26cb8@mail.gmail.com> <20071205.204848.227521641.davem@davemloft.net> <4aca3dc20712052111o730f6fb6h7a329ee811a70f28@mail.gmail.com> <1196918132.10408.85.camel@brick> <4aca3dc20712052117j3ef5cf99y848d4962ae8ddf33@mail.gmail.com> <9e4733910712052247x116cabb4q48ebafffb93f7e03@mail.gmail.com> Message-ID: <20071206071503.GA19504@coredump.intra.peff.net> On Thu, Dec 06, 2007 at 01:47:54AM -0500, Jon Smirl wrote: > The key to converting repositories of this size is RAM. 4GB minimum, > more would be better. git-repack is not multi-threaded. There were a > few attempts at making it multi-threaded but none were too successful. > If I remember right, with loads of RAM, a repack on a 450MB repository > was taking about five hours on a 2.8Ghz Core2. But this is something > you only have to do once for the import. Later repacks will reuse the > original deltas. Actually, Nicolas put quite a bit of work into multi-threading the repack process; the results have been in master for some time, and will be in the soon-to-be-released v1.5.4. The downside is that the threading partitions the object space, so the resulting size is not necessarily as small (but I don't know that anybody has done testing on large repos to find out how large the difference is). 
-Peff From harvey.harrison@gmail.com Thu Dec 6 07:49:00 2007 From: harvey.harrison@gmail.com (Harvey Harrison) Date: Thu, 06 Dec 2007 07:49:00 -0000 Subject: Git and GCC In-Reply-To: References: <4aca3dc20712051947t5fbbb383ua1727c652eb25d7e@mail.gmail.com> <20071205.202047.58135920.davem@davemloft.net> <4aca3dc20712052032n521c344cla07a5df1f2c26cb8@mail.gmail.com> <20071205.204848.227521641.davem@davemloft.net> <4aca3dc20712052111o730f6fb6h7a329ee811a70f28@mail.gmail.com> Message-ID: <1196927361.13109.1.camel@brick> > git repack -a -d --depth=250 --window=250 > Since I have the whole gcc repo locally I'll give this a shot overnight just to see what can be done at the extreme end or things. Harvey From git@davidb.org Thu Dec 6 08:12:00 2007 From: git@davidb.org (David Brown) Date: Thu, 06 Dec 2007 08:12:00 -0000 Subject: Git and GCC In-Reply-To: <1196927361.13109.1.camel@brick> References: <4aca3dc20712051947t5fbbb383ua1727c652eb25d7e@mail.gmail.com> <20071205.202047.58135920.davem@davemloft.net> <4aca3dc20712052032n521c344cla07a5df1f2c26cb8@mail.gmail.com> <20071205.204848.227521641.davem@davemloft.net> <4aca3dc20712052111o730f6fb6h7a329ee811a70f28@mail.gmail.com> <1196927361.13109.1.camel@brick> Message-ID: <20071206081139.GA14370@old.davidb.org> On Wed, Dec 05, 2007 at 11:49:21PM -0800, Harvey Harrison wrote: > >> git repack -a -d --depth=250 --window=250 >> > >Since I have the whole gcc repo locally I'll give this a shot overnight >just to see what can be done at the extreme end or things. When I tried this on a very large repo, at least one with some large files in it, git quickly exceeded my physical memory and started thrashing the machine. I had good results with git config pack.deltaCacheSize 512m git config pack.windowMemory 512m of course adjusting based on your physical memory. I think changing the windowMemory will affect the resulting compression, so changing these ratios might get better compression out of the result. 
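A minimal Python sketch of the "adjust to your physical memory" step follows. The two configuration keys are the ones named above; the helper name `memory_config_commands` and the one-quarter-of-RAM default are purely illustrative assumptions (chosen so that a 2 GB machine reproduces the 512m values), not a recommendation from the git documentation.

```python
def memory_config_commands(physical_mb, fraction=0.25):
    """Return `git config` argument lists that cap repack memory use.

    physical_mb -- installed RAM in megabytes
    fraction    -- share of RAM given to each limit (an assumed rule of thumb)
    """
    budget_mb = max(1, int(physical_mb * fraction))
    return [
        ["git", "config", "pack.deltaCacheSize", f"{budget_mb}m"],
        ["git", "config", "pack.windowMemory", f"{budget_mb}m"],
    ]


if __name__ == "__main__":
    # Print the commands for a machine with 2 GB of RAM.
    for argv in memory_config_commands(2048):
        print(" ".join(argv))
```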
If you're really patient, though, you could leave the unbounded window, hope you have enough swap, and just let it run. Dave
From dragonfly@linux-vs.org Thu Dec 6 09:40:00 2007 From: dragonfly@linux-vs.org (Li Wang) Date: Thu, 06 Dec 2007 09:40:00 -0000 Subject: Generate Codes for a something like stack/dataflow computer Message-ID: <4757C37C.5010704@linux-vs.org> Hi, We are retargeting GCC to a VLIW chip that runs as a coprocessor to a general-purpose processor. The coprocessor is responsible for accelerating code sections that parallelize well and have no dependences. Its ISA only lets it fetch data sequentially, not by random access, from an on-chip memory shared with the host processor, through dedicated function units named DBx. The host processor is responsible for placing the data there and for telling each DBx the base address and data length. Once the data is fetched by the coprocessor, it is stored in local registers owned by the coprocessor, and it stays in those registers until the computation ends. In other words, there are no spills, and the hardware permits none. From the coprocessor's standpoint, the instructions support no memory operands and no addressing modes at all; only register moves and arithmetic operations are supported. It looks something like a dataflow computer or a stack computer. Take the following code as an example:

    void compute(int a[], int b[], int c[]);

    int main()
    {
        int a[16], b[16], c[16];
        compute(a, b, c);
        return 0;
    }

    void compute(int a[], int b[], int c[])
    {
        for (int j = 0; j < 16; j++)
            c[j] = a[j] + b[j];
        return;
    }

We want the function compute() to be executed on the coprocessor, while the host processor organizes and places the data at the proper positions in the on-chip memory and prepares the DBx function units. Assume DB0 is allocated to array a[], DB1 to b[], and DB2 to c[].
Then the assembly code we want to generate for the coprocessor looks like this:

    L3: if (data in DB0 not exhausted) goto L1; else goto L2;
    L1: get R0, DB0;      // load one datum from the on-chip memory through DB0 into R0
        get R1, DB1;
        add R2, R0, R1;
        put R2, DB2;      // store result to DB2
        goto L3;
    L2: end;

Could anyone give some hints on how to implement this? Can the current GCC internals for addressing modes in the machine description support it? Li From schwab@suse.de Thu Dec 6 09:52:00 2007 From: schwab@suse.de (Andreas Schwab) Date: Thu, 06 Dec 2007 09:52:00 -0000 Subject: Git and GCC In-Reply-To: <1196917487.10408.82.camel@brick> (Harvey Harrison's message of "Wed\, 05 Dec 2007 21\:04\:47 -0800") References: <4aca3dc20712051108s216d3331t8061ef45b9aa324a@mail.gmail.com> <20071205.182815.249974508.davem@davemloft.net> <4aca3dc20712051841o71ab773ft6dd0714ebc355dd5@mail.gmail.com> <20071205.185203.262588544.davem@davemloft.net> <4aca3dc20712051947t5fbbb383ua1727c652eb25d7e@mail.gmail.com> <1196915112.10408.66.camel@brick> <1196917487.10408.82.camel@brick> Message-ID: Harvey Harrison writes: > git svn does accept a mailmap at import time with the same format as the > cvs importer I think. But for someone that just wants a repo to check > out this was easiest. I'd be willing to spend the time to do a nicer > job if there was any interest from the gcc side, but I'm not that > invested (other than owing them for an often-used tool). I have a complete list of the uid<->mail mapping for the gcc repository. Andreas. -- Andreas Schwab, SuSE Labs, schwab@suse.de SuSE Linux Products GmbH, Maxfeldstraße 5, 90409 Nürnberg, Germany PGP key fingerprint = 58CA 54C7 6D53 942B 1756 01D3 44D5 214B 8276 4ED5 "And now for something completely different."
From schwab@suse.de Thu Dec 6 09:53:00 2007 From: schwab@suse.de (Andreas Schwab) Date: Thu, 06 Dec 2007 09:53:00 -0000 Subject: Rant about ChangeLog entries and commit messages In-Reply-To: <1196899962.17430.0.camel@localhost> (Ben Elliston's message of "Thu\, 06 Dec 2007 11\:12\:42 +1100") References: <2007-12-02-11-05-39+trackit+sam@rfc1149.net> <47543DE8.3010003@google.com> <1196896524.12541.3.camel@localhost> <2fbe2a060712051535l7a9991c9q573106ac6f02a1ed@mail.gmail.com> <1196899962.17430.0.camel@localhost> Message-ID: Ben Elliston writes: > On Wed, 2007-12-05 at 18:35 -0500, Daniel Berlin wrote: > >> svn propedit --revision svn:log > > OK, well, it used to be a bit trickier in CVS .. :-) In CVS it's just a cvs admin -m as well. Andreas. -- Andreas Schwab, SuSE Labs, schwab@suse.de SuSE Linux Products GmbH, Maxfeldstraße 5, 90409 Nürnberg, Germany PGP key fingerprint = 58CA 54C7 6D53 942B 1756 01D3 44D5 214B 8276 4ED5 "And now for something completely different." From burnus@net-b.de Thu Dec 6 10:45:00 2007 From: burnus@net-b.de (Tobias Burnus) Date: Thu, 06 Dec 2007 10:45:00 -0000 Subject: Patch manager dying for a week or two Message-ID: <4757D2D6.3080602@net-b.de> Daniel Berlin wrote: > Patch manager will be dying for a week or two while i change hosting. > of course, if nobody is still using it, i can just kill it permanently. I, at least, use it almost always to make sure patches do not get forgotten; thus I regularly check http://dberlin.org/patches/patches/list Additionally, I like that it automatically adds a link to the mailing list in the PR; that way one can easily check the discussion in the mailing list. (I also like PRs: they not only help to obtain more information about a patch [cf. the recent ChangeLog discussion], but also ensure that one does not forget something.)
I think many gfortraners use :ADDPATCH: Tobias From bmei@broadcom.com Thu Dec 6 11:01:00 2007 From: bmei@broadcom.com (Bingfeng Mei) Date: Thu, 06 Dec 2007 11:01:00 -0000 Subject: How to define a blackbox data type in gcc? Message-ID: <2E073B3ABB3F664DBA1D1C4D5FB47EF40710B6CC@NT-IRVA-0752.brcm.ad.broadcom.com> Hello, I am wondering how to define a blackbox data type in gcc. It can be too wide and irregular to be represented by current data types. It needs to be assigned to special register files. I don't care and don't want to touch its content except using intrinsics (builtin) functions. An example of such data type is to represent value in a MAC register. Is there are convenient way to do that? Thanks in advance, Cheers, Bingfeng Mei Broadcom UK From Johannes.Schindelin@gmx.de Thu Dec 6 11:57:00 2007 From: Johannes.Schindelin@gmx.de (Johannes Schindelin) Date: Thu, 06 Dec 2007 11:57:00 -0000 Subject: Git and GCC In-Reply-To: <20071205.185203.262588544.davem@davemloft.net> References: <4aca3dc20712051108s216d3331t8061ef45b9aa324a@mail.gmail.com> <20071205.182815.249974508.davem@davemloft.net> <4aca3dc20712051841o71ab773ft6dd0714ebc355dd5@mail.gmail.com> <20071205.185203.262588544.davem@davemloft.net> Message-ID: Hi, On Wed, 5 Dec 2007, David Miller wrote: > From: "Daniel Berlin" > Date: Wed, 5 Dec 2007 21:41:19 -0500 > > > It is true I gave up quickly, but this is mainly because i don't like > > to fight with my tools. > > > > I am quite fine with a distributed workflow, I now use 8 or so gcc > > branches in mercurial (auto synced from svn) and merge a lot between > > them. I wanted to see if git would sanely let me manage the commits > > back to svn. After fighting with it, i gave up and just wrote a > > python extension to hg that lets me commit non-svn changesets back to > > svn directly from hg. > > I find it ironic that you were even willing to write tools to facilitate > your hg based gcc workflow. 
That really shows what your thinking is on > this matter, in that you're willing to put effort towards making hg work > better for you but you're not willing to expend that level of effort to > see if git can do so as well. While this is true... > This is what really eats me from the inside about your dissatisfaction > with git. Your analysis seems to be a self-fullfilling prophecy, and > that's totally unfair to both hg and git. ... I actually appreciate people complaining -- in the meantime. It shows right away what group you belong to in the "Those who can do, do, those who can't, complain.". You can see that very easily on the git list, or on the #git channel on irc.freenode.net. There is enough data for a study which yearns to be written, that shows how quickly we resolve issues with people that are sincerely interested in a solution. (Of course, on the other hand, there are also quite a few cases which show how frustrating (for both sides) and unfruitful discussions started by a complaint are.) So I fully expect an issue like Daniel's to be resolved in a matter of minutes on the git list, if the OP gives us a chance. If we are not even Cc'ed, you are completely right, she or he probably does not want the issue to be resolved. Ciao, Dscho From Johannes.Schindelin@gmx.de Thu Dec 6 12:04:00 2007 From: Johannes.Schindelin@gmx.de (Johannes Schindelin) Date: Thu, 06 Dec 2007 12:04:00 -0000 Subject: [PATCH] gc --aggressive: make it really aggressive In-Reply-To: References: <4aca3dc20712051947t5fbbb383ua1727c652eb25d7e@mail.gmail.com> <20071205.202047.58135920.davem@davemloft.net> <4aca3dc20712052032n521c344cla07a5df1f2c26cb8@mail.gmail.com> <20071205.204848.227521641.davem@davemloft.net> <4aca3dc20712052111o730f6fb6h7a329ee811a70f28@mail.gmail.com> Message-ID: The default was not to change the window or depth at all. 
As suggested by Jon Smirl, Linus Torvalds and others, default to --window=250 --depth=250 Signed-off-by: Johannes Schindelin --- On Wed, 5 Dec 2007, Linus Torvalds wrote: > On Thu, 6 Dec 2007, Daniel Berlin wrote: > > > > Actually, it turns out that git-gc --aggressive does this dumb > > thing to pack files sometimes regardless of whether you > > converted from an SVN repo or not. > > Absolutely. git --aggressive is mostly dumb. It's really only > useful for the case of "I know I have a *really* bad pack, and I > want to throw away all the bad packing decisions I have done". > > [...] > > So the equivalent of "git gc --aggressive" - but done *properly* > - is to do (overnight) something like > > git repack -a -d --depth=250 --window=250 How about this, then? builtin-gc.c | 3 ++- 1 files changed, 2 insertions(+), 1 deletions(-) diff --git a/builtin-gc.c b/builtin-gc.c index 799c263..c6806d3 100644 --- a/builtin-gc.c +++ b/builtin-gc.c @@ -23,7 +23,7 @@ static const char * const builtin_gc_usage[] = { }; static int pack_refs = 1; -static int aggressive_window = -1; +static int aggressive_window = 250; static int gc_auto_threshold = 6700; static int gc_auto_pack_limit = 20; @@ -192,6 +192,7 @@ int cmd_gc(int argc, const char **argv, const char *prefix) if (aggressive) { append_option(argv_repack, "-f", MAX_ADD); + append_option(argv_repack, "--depth=250", MAX_ADD); if (aggressive_window > 0) { sprintf(buf, "--window=%d", aggressive_window); append_option(argv_repack, buf, MAX_ADD); -- 1.5.3.7.2157.g9598e From ismail@pardus.org.tr Thu Dec 6 12:04:00 2007 From: ismail@pardus.org.tr (Ismail Dönmez) Date: Thu, 06 Dec 2007 12:04:00 -0000 Subject: Git and GCC In-Reply-To: References: <4aca3dc20712051108s216d3331t8061ef45b9aa324a@mail.gmail.com> <20071205.185203.262588544.davem@davemloft.net> Message-ID: <200712061404.58827.ismail@pardus.org.tr> Thursday 06 December 2007 13:57:06 Johannes Schindelin wrote: [...]
> So I fully expect an issue like Daniel's to be resolved in a matter of > minutes on the git list, if the OP gives us a chance. If we are not even > Cc'ed, you are completely right, she or he probably does not want the > issue to be resolved. Let's be fair about this: Ollie Wild already sent a mail about git-svn disk usage and there is no concrete solution yet, though it seems the bottleneck is known. Regards, ismail -- Never learn by your mistakes, if you do you may never dare to try again. From harvey.harrison@gmail.com Thu Dec 6 12:23:00 2007 From: harvey.harrison@gmail.com (Harvey Harrison) Date: Thu, 06 Dec 2007 12:23:00 -0000 Subject: Git and GCC In-Reply-To: References: <4aca3dc20712051108s216d3331t8061ef45b9aa324a@mail.gmail.com> <20071205.182815.249974508.davem@davemloft.net> <4aca3dc20712051841o71ab773ft6dd0714ebc355dd5@mail.gmail.com> <20071205.185203.262588544.davem@davemloft.net> <4aca3dc20712051947t5fbbb383ua1727c652eb25d7e@mail.gmail.com> <1196915112.10408.66.camel@brick> <1196917487.10408.82.camel@brick> Message-ID: <1196943792.13311.5.camel@brick> On Thu, 2007-12-06 at 10:52 +0100, Andreas Schwab wrote: > Harvey Harrison writes: > > > git svn does accept a mailmap at import time with the same format as the > > cvs importer I think. But for someone that just wants a repo to check > > out this was easiest. I'd be willing to spend the time to do a nicer > > job if there was any interest from the gcc side, but I'm not that > > invested (other than owing them for an often-used tool). > > I have a complete list of the uid<->mail mapping for the gcc repository. > > Andreas. > Feel free to send it along, but for now I'll keep on going without a mapping. If I went back now and changed it, all those people who are already using the existing mirror would have to download a whole new history.
If gcc decides they would like a cleaner import for more official use, I'd be more than happy to work with you guys to produce one, with Author/committer names cleaned up, etc. Cheers, Harvey From amacleod@redhat.com Thu Dec 6 13:41:00 2007 From: amacleod@redhat.com (Andrew MacLeod) Date: Thu, 06 Dec 2007 13:41:00 -0000 Subject: update_stmt calls In-Reply-To: <20060301043123.GA12384@atrey.karlin.mff.cuni.cz> References: <20060301043123.GA12384@atrey.karlin.mff.cuni.cz> Message-ID: <4757FB8B.2080903@redhat.com> Zdenek Dvorak wrote: > Hello, > > during a recent discussion, it was pointed to my attention that > update_stmt is performance critical. I wondered why; this is the number > of update_stmt calls for combine.i (all the other passes have less than > 1000 calls): > > <...> > > I have a patch that decreases number of update_stmt calls in tree alias > analysis to 46525; still, is it really that useful to run pass_may_alias > *six* times during compilation? Obviously, we need the initial one, and > there are comments after pass_sra and pass_fold_builtins that indicate > that the following pass_may_alias cannot be avoided (which seems > doubtful to me, at the very least in the latter case), but the remaining > three seem to be just placed randomly. > > I also have a patch that decreases the number of update_stmt calls > in VRP to 5229 (which is more or less the number of ASSERT_EXPRs it > creates, so this cannot be improved significantly). > I can't say I'm surprised; there was a time when the general rule was "when in doubt, update", so not a lot of thought has gone into those calls, especially in the older passes. Anything which reduces the calls to update_stmt() is probably a good thing :-) I can't speak to the number of calls to pass_may_alias, but that does seem a bit excessive to me as well. To the best of my knowledge, no one has recently (if ever) sat down and figured out which passes we actually need where and when.
They just get added when people think they are a good idea, but rarely get removed later when things change. I would suspect there are numerous passes that could be eliminated with some analysis and shuffling. That's something that would be easier to do with a dynamic pass manager :-) Andrew From tytso@mit.edu Thu Dec 6 13:43:00 2007 From: tytso@mit.edu (Theodore Tso) Date: Thu, 06 Dec 2007 13:43:00 -0000 Subject: [PATCH] gc --aggressive: make it really aggressive In-Reply-To: References: <4aca3dc20712051947t5fbbb383ua1727c652eb25d7e@mail.gmail.com> <20071205.202047.58135920.davem@davemloft.net> <4aca3dc20712052032n521c344cla07a5df1f2c26cb8@mail.gmail.com> <20071205.204848.227521641.davem@davemloft.net> <4aca3dc20712052111o730f6fb6h7a329ee811a70f28@mail.gmail.com> Message-ID: <20071206134243.GA17037@thunk.org> On Thu, Dec 06, 2007 at 12:03:38PM +0000, Johannes Schindelin wrote: > > The default was not to change the window or depth at all. As suggested > by Jon Smirl, Linus Torvalds and others, default to > > --window=250 --depth=250 I'd also suggest adding a comment in the man pages that this should only be done rarely, and that it can potentially take a *long* time (i.e., overnight) for big repositories, and in general it's not worth the effort to use --aggressive. Apologies to Linus and to the gcc folks, since I was the one who originally coded up gc --aggressive, and at the time my intent was "rarely does it make sense, and it may take a long time". The reason why I didn't make the default --window and --depth larger is because at the time the biggest repo I had easy access to was the Linux kernel's, and there you rapidly hit diminishing returns at much smaller numbers, so there was no real point in using --window=250 --depth=250. Linus later pointed out that what we *really* should do at some point is to change repack -f to potentially retry to find a better delta, but to reuse the existing delta if it was no worse.
That automatically does the right thing in the case where you had previously done a repack with --window= --depth=, but then later try using "gc --aggressive", which ends up doing a worse job and throwing away the information from the previous repack with large window and depth sizes. Unfortunately no one ever got around to implementing that. Regards, - Ted From nico@cam.org Thu Dec 6 14:02:00 2007 From: nico@cam.org (Nicolas Pitre) Date: Thu, 06 Dec 2007 14:02:00 -0000 Subject: Git and GCC In-Reply-To: <1196927361.13109.1.camel@brick> References: <4aca3dc20712051947t5fbbb383ua1727c652eb25d7e@mail.gmail.com> <20071205.202047.58135920.davem@davemloft.net> <4aca3dc20712052032n521c344cla07a5df1f2c26cb8@mail.gmail.com> <20071205.204848.227521641.davem@davemloft.net> <4aca3dc20712052111o730f6fb6h7a329ee811a70f28@mail.gmail.com> <1196927361.13109.1.camel@brick> Message-ID: On Wed, 5 Dec 2007, Harvey Harrison wrote: > > > git repack -a -d --depth=250 --window=250 > > > > Since I have the whole gcc repo locally I'll give this a shot overnight > just to see what can be done at the extreme end of things. Don't forget to add -f as well. Nicolas From nico@cam.org Thu Dec 6 14:15:00 2007 From: nico@cam.org (Nicolas Pitre) Date: Thu, 06 Dec 2007 14:15:00 -0000 Subject: [PATCH] gc --aggressive: make it really aggressive In-Reply-To: <20071206134243.GA17037@thunk.org> References: <4aca3dc20712051947t5fbbb383ua1727c652eb25d7e@mail.gmail.com> <20071205.202047.58135920.davem@davemloft.net> <4aca3dc20712052032n521c344cla07a5df1f2c26cb8@mail.gmail.com> <20071205.204848.227521641.davem@davemloft.net> <4aca3dc20712052111o730f6fb6h7a329ee811a70f28@mail.gmail.com> <20071206134243.GA17037@thunk.org> Message-ID: On Thu, 6 Dec 2007, Theodore Tso wrote: > Linus later pointed out that what we *really* should do at some > point is change repack -f to potentially retry to find a better > delta, but to reuse the existing delta if it was no worse.
That > automatically does the right thing in the case where you had > previously done a repack with --window= --depth=, > but then later try using "gc --agressive", which ends up doing a worse > job and throwing away the information from the previous repack with > large window and depth sizes. Unfortunately no one ever got around to > implementing that. I did start looking at it, but there are subtle issues to consider, such as making sure not to create delta loops. Currently this is avoided by never involving already reused deltas in new delta chains, except for edge base objects. IOW, this requires some head scratching which I didn't have the time for so far. Nicolas From nico@cam.org Thu Dec 6 14:18:00 2007 From: nico@cam.org (Nicolas Pitre) Date: Thu, 06 Dec 2007 14:18:00 -0000 Subject: Git and GCC In-Reply-To: <20071206071503.GA19504@coredump.intra.peff.net> References: <4aca3dc20712051947t5fbbb383ua1727c652eb25d7e@mail.gmail.com> <20071205.202047.58135920.davem@davemloft.net> <4aca3dc20712052032n521c344cla07a5df1f2c26cb8@mail.gmail.com> <20071205.204848.227521641.davem@davemloft.net> <4aca3dc20712052111o730f6fb6h7a329ee811a70f28@mail.gmail.com> <1196918132.10408.85.camel@brick> <4aca3dc20712052117j3ef5cf99y848d4962ae8ddf33@mail.gmail.com> <9e4733910712052247x116cabb4q48ebafffb93f7e03@mail.gmail.com> <20071206071503.GA19504@coredump.intra.peff.net> Message-ID: On Thu, 6 Dec 2007, Jeff King wrote: > On Thu, Dec 06, 2007 at 01:47:54AM -0500, Jon Smirl wrote: > > > The key to converting repositories of this size is RAM. 4GB minimum, > > more would be better. git-repack is not multi-threaded. There were a > > few attempts at making it multi-threaded but none were too successful. > > If I remember right, with loads of RAM, a repack on a 450MB repository > > was taking about five hours on a 2.8Ghz Core2. But this is something > > you only have to do once for the import. Later repacks will reuse the > > original deltas. 
> > Actually, Nicolas put quite a bit of work into multi-threading the > repack process; the results have been in master for some time, and will > be in the soon-to-be-released v1.5.4. > > The downside is that the threading partitions the object space, so the > resulting size is not necessarily as small (but I don't know that > anybody has done testing on large repos to find out how large the > difference is). Quick guesstimate is in the 1% ballpark. Nicolas From madcoder@debian.org Thu Dec 6 14:23:00 2007 From: madcoder@debian.org (Pierre Habouzit) Date: Thu, 06 Dec 2007 14:23:00 -0000 Subject: [PATCH] gc --aggressive: make it really aggressive In-Reply-To: References: <4aca3dc20712051947t5fbbb383ua1727c652eb25d7e@mail.gmail.com> <20071205.202047.58135920.davem@davemloft.net> <4aca3dc20712052032n521c344cla07a5df1f2c26cb8@mail.gmail.com> <20071205.204848.227521641.davem@davemloft.net> <4aca3dc20712052111o730f6fb6h7a329ee811a70f28@mail.gmail.com> Message-ID: <20071206142254.GD5959@artemis.madism.org> On Thu, Dec 06, 2007 at 12:03:38PM +0000, Johannes Schindelin wrote: > > The default was not to change the window or depth at all. As suggested > by Jon Smirl, Linus Torvalds and others, default to > > --window=250 --depth=250 Well, this will explode on many quite reasonably sized systems. This should also use a memory limit that could be auto-guessed from the system's total physical memory (e.g. 50% of the actual memory could be a good idea). On very large repositories, e.g. the Linux kernel, using that swaps like hell on a machine with 1GB of RAM and almost nothing running on it (less than 200MB of RAM actually used).
From harvey.harrison@gmail.com Thu Dec 6 15:31:00 2007 From: harvey.harrison@gmail.com (Harvey Harrison) Date: Thu, 06 Dec 2007 15:31:00 -0000 Subject: [PATCH] gc --aggressive: make it really aggressive In-Reply-To: References: <4aca3dc20712051947t5fbbb383ua1727c652eb25d7e@mail.gmail.com> <20071205.202047.58135920.davem@davemloft.net> <4aca3dc20712052032n521c344cla07a5df1f2c26cb8@mail.gmail.com> <20071205.204848.227521641.davem@davemloft.net> <4aca3dc20712052111o730f6fb6h7a329ee811a70f28@mail.gmail.com> Message-ID: <1196955059.13633.3.camel@brick> Wow /usr/bin/time git repack -a -d -f --window=250 --depth=250 23266.37user 581.04system 7:41:25elapsed 86%CPU (0avgtext+0avgdata 0maxresident)k 0inputs+0outputs (419835major+123275804minor)pagefaults 0swaps -r--r--r-- 1 hharrison hharrison 29091872 2007-12-06 07:26 pack-1d46ca030c3d6d6b95ad316deb922be06b167a3d.idx -r--r--r-- 1 hharrison hharrison 324094684 2007-12-06 07:26 pack-1d46ca030c3d6d6b95ad316deb922be06b167a3d.pack That extra delta depth really does make a difference. Just over a 300MB pack in the end, for all gcc branches/tags as of last night. Cheers, Harvey From Johannes.Schindelin@gmx.de Thu Dec 6 15:56:00 2007 From: Johannes.Schindelin@gmx.de (Johannes Schindelin) Date: Thu, 06 Dec 2007 15:56:00 -0000 Subject: [PATCH] gc --aggressive: make it really aggressive In-Reply-To: <1196955059.13633.3.camel@brick> References: <4aca3dc20712051947t5fbbb383ua1727c652eb25d7e@mail.gmail.com> <20071205.202047.58135920.davem@davemloft.net> <4aca3dc20712052032n521c344cla07a5df1f2c26cb8@mail.gmail.com> <20071205.204848.227521641.davem@davemloft.net> <4aca3dc20712052111o730f6fb6h7a329ee811a70f28@mail.gmail.com> <1196955059.13633.3.camel@brick> Message-ID: Hi, On Thu, 6 Dec 2007, Harvey Harrison wrote: > -r--r--r-- 1 hharrison hharrison 324094684 2007-12-06 07:26 > pack-1d46ca030c3d6d6b95ad316deb922be06b167a3d.pack Wow.
Ciao, Dscho From Johannes.Schindelin@gmx.de Thu Dec 6 15:56:00 2007 From: Johannes.Schindelin@gmx.de (Johannes Schindelin) Date: Thu, 06 Dec 2007 15:56:00 -0000 Subject: [PATCH] gc --aggressive: make it really aggressive In-Reply-To: <20071206142254.GD5959@artemis.madism.org> References: <4aca3dc20712051947t5fbbb383ua1727c652eb25d7e@mail.gmail.com> <20071205.202047.58135920.davem@davemloft.net> <4aca3dc20712052032n521c344cla07a5df1f2c26cb8@mail.gmail.com> <20071205.204848.227521641.davem@davemloft.net> <4aca3dc20712052111o730f6fb6h7a329ee811a70f28@mail.gmail.com> <20071206142254.GD5959@artemis.madism.org> Message-ID: Hi, On Thu, 6 Dec 2007, Pierre Habouzit wrote: > On Thu, Dec 06, 2007 at 12:03:38PM +0000, Johannes Schindelin wrote: > > > > The default was not to change the window or depth at all. As > > suggested by Jon Smirl, Linus Torvalds and others, default to > > > > --window=250 --depth=250 > > well, this will explode on many quite reasonably sized systems. This > should also use a memory-limit that could be auto-guessed from the > system total physical memory (50% of the actual memory could be a good > idea e.g.). > > On very large repositories, using that on the e.g. linux kernel, swaps > like hell on a machine with 1GB of RAM, and almost nothing running on it > (less than 200MB of RAM actually used) Yes. However, I think that --aggressive should be aggressive, and if you decide to run it on a machine which lacks the muscle to be aggressive, well, you should have known better. The upside: if you run this on a strong machine and clone it to a weak machine, you'll still have the benefit of a small pack (and you should mark it as .keep, too, to keep the benefit...)
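Marking a pack as kept is simple enough to sketch. Git leaves a pack alone during later repacks when a file with the same base name and a .keep suffix sits next to it. The helper below only creates that marker file for a given pack path; the name mark_pack_keep is made up for illustration and is not a git command.

```python
from pathlib import Path


def mark_pack_keep(pack_path, reason=""):
    """Create the .keep marker next to a pack file and return its path.

    Hypothetical helper, not part of git: git itself only looks for
    pack-<id>.keep beside pack-<id>.pack and then leaves that pack
    alone when repacking.
    """
    pack = Path(pack_path)
    if pack.suffix != ".pack":
        raise ValueError(f"not a pack file: {pack}")
    keep = pack.with_suffix(".keep")
    # The content is free-form; a short note on why the pack is kept
    # helps whoever finds the file later.
    keep.write_text(reason + "\n" if reason else "")
    return keep
```

Called with the path of the aggressively repacked pack under .git/objects/pack, this would protect it from being rewritten by an ordinary repack on the weaker machine.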
Ciao, Dscho From torvalds@linux-foundation.org Thu Dec 6 16:19:00 2007 From: torvalds@linux-foundation.org (Linus Torvalds) Date: Thu, 06 Dec 2007 16:19:00 -0000 Subject: [PATCH] gc --aggressive: make it really aggressive In-Reply-To: <1196955059.13633.3.camel@brick> References: <4aca3dc20712051947t5fbbb383ua1727c652eb25d7e@mail.gmail.com> <20071205.202047.58135920.davem@davemloft.net> <4aca3dc20712052032n521c344cla07a5df1f2c26cb8@mail.gmail.com> <20071205.204848.227521641.davem@davemloft.net> <4aca3dc20712052111o730f6fb6h7a329ee811a70f28@mail.gmail.com> <1196955059.13633.3.camel@brick> Message-ID: On Thu, 6 Dec 2007, Harvey Harrison wrote: > > 7:41:25elapsed 86%CPU Heh. And this is why you want to do it exactly *once*, and then just export the end result for others ;) > -r--r--r-- 1 hharrison hharrison 324094684 2007-12-06 07:26 pack-1d46ca030c3d6d6b95ad316deb922be06b167a3d.pack But yeah, especially if you allow longer delta chains, the end result can be much smaller (and what makes the one-time repack more expensive is the window size, not the delta chain - you could make the delta chains longer with no cost overhead at packing time) HOWEVER. The longer delta chains do make it potentially much more expensive to then use old history. So there's a trade-off. And quite frankly, a delta depth of 250 is likely going to cause overflows in the delta cache (which is only 256 entries in size *and* it's a hash, so it's going to start having hash conflicts long before hitting the 250 depth limit). So when I said "--depth=250 --window=250", I chose those numbers more as an example of extremely aggressive packing, and I'm not at all sure that the end result is necessarily wonderfully usable. It's going to save disk space (and network bandwidth - the deltas will be re-used for the network protocol too!), but there are definitely downsides too, and using long delta chains may simply not be worth it in practice.
(And some of it might just want to have git tuning, ie if people think that long deltas are worth it, we could easily just expand on the delta hash, at the cost of some more memory used!) That said, the good news is that working with *new* history will not be affected negatively, and if you want to be _really_ sneaky, there are ways to say "create a pack that contains the history up to a version one year ago, and be very aggressive about those old versions that we still want to have around, but do a separate pack for newer stuff using less aggressive parameters" So this is something that can be tweaked, although we don't really have any really nice interfaces for stuff like that (ie the git delta cache size is hardcoded in the sources and cannot be set in the config file, and the "pack old history more aggressively" involves some manual scripting and knowing how "git pack-objects" works rather than any nice simple command line switch). So the thing to take away from this is: - git is certainly flexible as hell - .. but to get the full power you may need to tweak things - .. happily you really only need to have one person to do the tweaking, and the tweaked end results will be available to others that do not need to know/care. And whether the difference between 320MB and 500MB is worth any really involved tweaking (considering the potential downsides), I really don't know. Only testing will tell. 
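The remark about the delta cache can be made concrete with a toy model. The sketch below assumes a cache of 256 slots indexed directly by a hash, which is what the text describes; the hash function, the random object ids, and the name simulate_chain are illustrative assumptions and not git's actual code. It counts how many objects in one delta chain would evict an earlier member of the same chain.

```python
import random

CACHE_SLOTS = 256  # cache size quoted in the text


def simulate_chain(depth, seed=0):
    """Return (slots_used, evictions) for one delta chain of `depth` objects.

    Toy model: every object gets a random 64-bit id and lands in slot
    id % CACHE_SLOTS. An object that hits an occupied slot evicts the
    previous occupant, which would then have to be rebuilt from its base.
    """
    rng = random.Random(seed)
    slots = {}
    evictions = 0
    for _ in range(depth):
        obj_id = rng.getrandbits(64)
        slot = obj_id % CACHE_SLOTS
        if slot in slots:
            evictions += 1
        slots[slot] = obj_id
    return len(slots), evictions


for depth in (10, 50, 250):
    used, evicted = simulate_chain(depth)
    print(f"depth={depth:3d}  slots used={used:3d}  evictions={evicted}")
```

Under uniform hashing the expected number of distinct slots touched by n objects is 256 * (1 - (255/256)^n), roughly 160 for n = 250. In this model, then, around 90 of the 250 chain members collide even though the chain is nominally shorter than the cache.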
Linus From dak@gnu.org Thu Dec 6 17:08:00 2007 From: dak@gnu.org (David Kastrup) Date: Thu, 06 Dec 2007 17:08:00 -0000 Subject: [PATCH] gc --aggressive: make it really aggressive In-Reply-To: (Johannes Schindelin's message of "Thu, 6 Dec 2007 15:55:43 +0000 (GMT)") References: <4aca3dc20712051947t5fbbb383ua1727c652eb25d7e@mail.gmail.com> <20071205.202047.58135920.davem@davemloft.net> <4aca3dc20712052032n521c344cla07a5df1f2c26cb8@mail.gmail.com> <20071205.204848.227521641.davem@davemloft.net> <4aca3dc20712052111o730f6fb6h7a329ee811a70f28@mail.gmail.com> <20071206142254.GD5959@artemis.madism.org> Message-ID: <8563zbpxde.fsf@lola.goethe.zz> Johannes Schindelin writes: > However, I think that --aggressive should be aggressive, and if you > decide to run it on a machine which lacks the muscle to be aggressive, > well, you should have known better. That's a rather cheap shot. "you should have known better" than expecting to be able to use a documented command and option because the git developers happened to have a nicer machine... _How_ is one supposed to have known better? -- David Kastrup, Kriemhildstr. 
15, 44793 Bochum From peff@peff.net Thu Dec 6 17:39:00 2007 From: peff@peff.net (Jeff King) Date: Thu, 06 Dec 2007 17:39:00 -0000 Subject: Git and GCC In-Reply-To: References: <4aca3dc20712051947t5fbbb383ua1727c652eb25d7e@mail.gmail.com> <20071205.202047.58135920.davem@davemloft.net> <4aca3dc20712052032n521c344cla07a5df1f2c26cb8@mail.gmail.com> <20071205.204848.227521641.davem@davemloft.net> <4aca3dc20712052111o730f6fb6h7a329ee811a70f28@mail.gmail.com> <1196918132.10408.85.camel@brick> <4aca3dc20712052117j3ef5cf99y848d4962ae8ddf33@mail.gmail.com> <9e4733910712052247x116cabb4q48ebafffb93f7e03@mail.gmail.com> <20071206071503.GA19504@coredump.intra.peff.net> Message-ID: <20071206173946.GA10845@sigill.intra.peff.net> On Thu, Dec 06, 2007 at 09:18:39AM -0500, Nicolas Pitre wrote: > > The downside is that the threading partitions the object space, so the > > resulting size is not necessarily as small (but I don't know that > > anybody has done testing on large repos to find out how large the > > difference is). > > Quick guesstimate is in the 1% ballpark. Fortunately, we now have numbers. 
Harvey Harrison reported repacking the gcc repo and getting these results: > /usr/bin/time git repack -a -d -f --window=250 --depth=250 > > 23266.37user 581.04system 7:41:25elapsed 86%CPU (0avgtext+0avgdata 0maxresident)k > 0inputs+0outputs (419835major+123275804minor)pagefaults 0swaps > > -r--r--r-- 1 hharrison hharrison 29091872 2007-12-06 07:26 pack-1d46ca030c3d6d6b95ad316deb922be06b167a3d.idx > -r--r--r-- 1 hharrison hharrison 324094684 2007-12-06 07:26 pack-1d46ca030c3d6d6b95ad316deb922be06b167a3d.pack I tried the threaded repack with pack.threads = 3 on a dual-processor machine, and got: time git repack -a -d -f --window=250 --depth=250 real 309m59.849s user 377m43.948s sys 8m23.319s -r--r--r-- 1 peff peff 28570088 2007-12-06 10:11 pack-1fa336f33126d762988ed6fc3f44ecbe0209da3c.idx -r--r--r-- 1 peff peff 339922573 2007-12-06 10:11 pack-1fa336f33126d762988ed6fc3f44ecbe0209da3c.pack So it is about 5% bigger. What is really disappointing is that we saved only about 20% of the time. I didn't sit around watching the stages, but my guess is that we spent a long time in the single threaded "writing objects" stage with a thrashing delta cache. 
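The figures above can be checked with a few lines of arithmetic. The sketch below reuses only the pack sizes and the time output quoted above; the helper name cpu_utilisation is made up for illustration. Because the two runs were on different machines, only the size comparison is like for like.

```python
# Pack sizes in bytes, copied from the two `ls -l` listings above.
SINGLE_THREADED_PACK = 324094684   # Harvey Harrison's run
THREADED_PACK = 339922573          # threaded run, pack.threads = 3

growth = (THREADED_PACK - SINGLE_THREADED_PACK) / SINGLE_THREADED_PACK
print(f"threaded pack is {growth:.1%} larger")


def cpu_utilisation(user_s, sys_s, real_s):
    """Average number of cores kept busy: (user + sys) / wall-clock."""
    return (user_s + sys_s) / real_s


# 7:41:25 elapsed, 23266.37 s user, 581.04 s system
single = cpu_utilisation(23266.37, 581.04, 7 * 3600 + 41 * 60 + 25)
# real 309m59.849s, user 377m43.948s, sys 8m23.319s
threaded = cpu_utilisation(377 * 60 + 43.948, 8 * 60 + 23.319,
                           309 * 60 + 59.849)
print(f"cores busy on average: {single:.2f} vs {threaded:.2f}")
```

An average well below two busy cores on the threaded run is at least consistent with a long phase that did not run in parallel, though it cannot say which phase that was.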
-Peff From nico@cam.org Thu Dec 6 18:03:00 2007 From: nico@cam.org (Nicolas Pitre) Date: Thu, 06 Dec 2007 18:03:00 -0000 Subject: Git and GCC In-Reply-To: <20071206173946.GA10845@sigill.intra.peff.net> References: <4aca3dc20712051947t5fbbb383ua1727c652eb25d7e@mail.gmail.com> <20071205.202047.58135920.davem@davemloft.net> <4aca3dc20712052032n521c344cla07a5df1f2c26cb8@mail.gmail.com> <20071205.204848.227521641.davem@davemloft.net> <4aca3dc20712052111o730f6fb6h7a329ee811a70f28@mail.gmail.com> <1196918132.10408.85.camel@brick> <4aca3dc20712052117j3ef5cf99y848d4962ae8ddf33@mail.gmail.com> <9e4733910712052247x116cabb4q48ebafffb93f7e03@mail.gmail.com> <20071206071503.GA19504@coredump.intra.peff.net> <20071206173946.GA10845@sigill.intra.peff.net> Message-ID: On Thu, 6 Dec 2007, Jeff King wrote: > On Thu, Dec 06, 2007 at 09:18:39AM -0500, Nicolas Pitre wrote: > > > > The downside is that the threading partitions the object space, so the > > > resulting size is not necessarily as small (but I don't know that > > > anybody has done testing on large repos to find out how large the > > > difference is). > > > > Quick guesstimate is in the 1% ballpark. > > Fortunately, we now have numbers. 
Harvey Harrison reported repacking the > gcc repo and getting these results: > > > /usr/bin/time git repack -a -d -f --window=250 --depth=250 > > > > 23266.37user 581.04system 7:41:25elapsed 86%CPU (0avgtext+0avgdata 0maxresident)k > > 0inputs+0outputs (419835major+123275804minor)pagefaults 0swaps > > > > -r--r--r-- 1 hharrison hharrison 29091872 2007-12-06 07:26 pack-1d46ca030c3d6d6b95ad316deb922be06b167a3d.idx > > -r--r--r-- 1 hharrison hharrison 324094684 2007-12-06 07:26 pack-1d46ca030c3d6d6b95ad316deb922be06b167a3d.pack > > I tried the threaded repack with pack.threads = 3 on a dual-processor > machine, and got: > > time git repack -a -d -f --window=250 --depth=250 > > real 309m59.849s > user 377m43.948s > sys 8m23.319s > > -r--r--r-- 1 peff peff 28570088 2007-12-06 10:11 pack-1fa336f33126d762988ed6fc3f44ecbe0209da3c.idx > -r--r--r-- 1 peff peff 339922573 2007-12-06 10:11 pack-1fa336f33126d762988ed6fc3f44ecbe0209da3c.pack > > So it is about 5% bigger. Right. I should probably revisit that idea of finding deltas across partition boundaries to mitigate that loss. And those partitions could be made coarser as well to reduce the number of such partition gaps (just increase the value of chunk_size on line 1648 in builtin-pack-objects.c). > What is really disappointing is that we saved > only about 20% of the time. I didn't sit around watching the stages, but > my guess is that we spent a long time in the single threaded "writing > objects" stage with a thrashing delta cache. Maybe you should run the non threaded repack on the same machine to have a good comparison. And if you have only 2 CPUs, you will have better performance with pack.threads = 2, otherwise there'll be wasteful task switching going on. And of course, if the delta cache is being thrashed, that might be due to the way the existing pack was previously packed. Hence the current pack might impact object _access_ when repacking them.
So for a really really fair performance comparison, you'd have to preserve the original pack and swap it back before each repack attempt. Nicolas From dberlin@dberlin.org Thu Dec 6 18:05:00 2007 From: dberlin@dberlin.org (Daniel Berlin) Date: Thu, 06 Dec 2007 18:05:00 -0000 Subject: Git and GCC In-Reply-To: References: <4aca3dc20712051947t5fbbb383ua1727c652eb25d7e@mail.gmail.com> <20071205.202047.58135920.davem@davemloft.net> <4aca3dc20712052032n521c344cla07a5df1f2c26cb8@mail.gmail.com> <20071205.204848.227521641.davem@davemloft.net> <4aca3dc20712052111o730f6fb6h7a329ee811a70f28@mail.gmail.com> Message-ID: <4aca3dc20712061004g43f5902cw79bf633917d3ade9@mail.gmail.com> On 12/6/07, Linus Torvalds wrote: > > > On Thu, 6 Dec 2007, Daniel Berlin wrote: > > > > Actually, it turns out that git-gc --aggressive does this dumb thing > > to pack files sometimes regardless of whether you converted from an > > SVN repo or not. > > Absolutely. git --aggressive is mostly dumb. It's really only useful for > the case of "I know I have a *really* bad pack, and I want to throw away > all the bad packing decisions I have done". > > To explain this, it's worth explaining (you are probably aware of it, but > let me go through the basics anyway) how git delta-chains work, and how > they are so different from most other systems. > I worked on Monotone and other systems that use object stores. for a little while :) In particular, I believe GIT's original object store was based on Monotone, IIRC. > In other SCM's, a delta-chain is generally fixed. It might be "forwards" > or "backwards", and it might evolve a bit as you work with the repository, > but generally it's a chain of changes to a single file represented as some > kind of single SCM entity. In CVS, it's obviously the *,v file, and a lot > of other systems do rather similar things. > > Git also does delta-chains, but it does them a lot more "loosely". There > is no fixed entity. 
Delta's are generated against any random other version > that git deems to be a good delta candidate (with various fairly > successful heursitics), and there are absolutely no hard grouping rules. Sure. SVN actually supports this (surprisingly), it just never happens to choose delta bases that aren't related by ancestry. (IE it would have absolutely no problem with you using random other parts of the repository as delta bases, and i've played with it before). I actually advocated we move towards an object store model, as ancestry can be a crappy way of approximating similarity when you have a lot of branches. > So the equivalent of "git gc --aggressive" - but done *properly* - is to > do (overnight) something like > > git repack -a -d --depth=250 --window=250 > I gave this a try overnight, and it definitely helps a lot. Thanks! > And then it's going to take forever and a day (ie a "do it overnight" > thing). But the end result is that everybody downstream from that > repository will get much better packs, without having to spend any effort > on it themselves. > If your forever and a day is spent figuring out which deltas to use, you can reduce this significantly. If it is spent writing out the data, it's much harder. :) From iant@google.com Thu Dec 6 18:15:00 2007 From: iant@google.com (Ian Lance Taylor) Date: Thu, 06 Dec 2007 18:15:00 -0000 Subject: Git and GCC In-Reply-To: References: <4aca3dc20712051108s216d3331t8061ef45b9aa324a@mail.gmail.com> <2007-12-05-21-23-14+trackit+sam@rfc1149.net> <4aca3dc20712051347v6bb8a4dbm901d1d9ddf1dff34@mail.gmail.com> Message-ID: NightStrike writes: > On 12/5/07, Daniel Berlin wrote: > > As I said, maybe i'll look at git in another year or so. > > But i'm certainly going to ignore all the "git is so great, we should > > move gcc to it" people until it works better, while i am much more > > inclined to believe the "hg is so great, we should move gc to it" > > people. 
> > Just out of curiosity, is there something wrong with the current > choice of svn? As I recall, it wasn't too long ago that gcc converted > from cvs to svn. What's the motivation to change again? (I'm not > trying to oppose anything.. I'm just curious, as I don't know much > about this kind of thing). Distributed version systems like git or Mercurial have some advantages over Subversion. For example, it is easy for developers to produce patches which can be reliably committed or exchanged with other developers. With Subversion, we send around patch files generated by diff and applied with patch. This works, but is inconvenient, and there is no way to track them. With regard to git, I think it's worth noting that it was initially designed to solve the problems faced by one man, Linus Torvalds. The problems he faces are not the problems which gcc developers face. Our development process is not the Linux kernel development process. Of course, many people have worked on git, and I expect that git can do what we need. For any git proponents, I'm curious to hear what advantages it offers over Mercurial. From this thread, one advantage of Mercurial seems clear: it is easier to understand how to use it correctly. Ian From nightstrike@gmail.com Thu Dec 6 18:24:00 2007 From: nightstrike@gmail.com (NightStrike) Date: Thu, 06 Dec 2007 18:24:00 -0000 Subject: Git and GCC In-Reply-To: References: <4aca3dc20712051947t5fbbb383ua1727c652eb25d7e@mail.gmail.com> <20071205.202047.58135920.davem@davemloft.net> <4aca3dc20712052032n521c344cla07a5df1f2c26cb8@mail.gmail.com> <20071205.204848.227521641.davem@davemloft.net> <4aca3dc20712052111o730f6fb6h7a329ee811a70f28@mail.gmail.com> Message-ID: On 12/6/07, Linus Torvalds wrote: > > > On Thu, 6 Dec 2007, Daniel Berlin wrote: > > > > Actually, it turns out that git-gc --aggressive does this dumb thing > > to pack files sometimes regardless of whether you converted from an > > SVN repo or not. 
> I'll send a patch to Junio to just remove the "git gc --aggressive" > documentation. It can be useful, but it generally is useful only when you > really understand at a very deep level what it's doing, and that > documentation doesn't help you do that. No disrespect is meant by this reply. I am just curious (and I am probably misunderstanding something).. Why remove all of the documentation entirely? Wouldn't it be better to just document it more thoroughly? I thought you did a fine job in this post in explaining its purpose, when to use it, when not to, etc. Removing the documentation seems counter-intuitive when you've already gone to the trouble of creating good documentation here in this post. From torvalds@linux-foundation.org Thu Dec 6 18:29:00 2007 From: torvalds@linux-foundation.org (Linus Torvalds) Date: Thu, 06 Dec 2007 18:29:00 -0000 Subject: Git and GCC In-Reply-To: <4aca3dc20712061004g43f5902cw79bf633917d3ade9@mail.gmail.com> References: <4aca3dc20712051947t5fbbb383ua1727c652eb25d7e@mail.gmail.com> <20071205.202047.58135920.davem@davemloft.net> <4aca3dc20712052032n521c344cla07a5df1f2c26cb8@mail.gmail.com> <20071205.204848.227521641.davem@davemloft.net> <4aca3dc20712052111o730f6fb6h7a329ee811a70f28@mail.gmail.com> <4aca3dc20712061004g43f5902cw79bf633917d3ade9@mail.gmail.com> Message-ID: On Thu, 6 Dec 2007, Daniel Berlin wrote: > > I worked on Monotone and other systems that use object stores. for a > little while :) In particular, I believe GIT's original object store was > based on Monotone, IIRC. Yes and no. Monotone does what git does for the blobs. But there is a big difference in how git then does it for everything else too, ie trees and history.
Trees being in that object store in particular are very important, and one of the biggest deals for deltas (actually, for two reasons: most of the time they don't change AT ALL if some subdirectory gets no changes and you don't need any delta, and even when they do change, it's usually going to delta very well, since it's usually just a small part that changes). > > And then it's going to take forever and a day (ie a "do it overnight" > > thing). But the end result is that everybody downstream from that > > repository will get much better packs, without having to spend any effort > > on it themselves. > > If your forever and a day is spent figuring out which deltas to use, > you can reduce this significantly. It's almost all about figuring out the delta. Which is why *not* using "-f" (or "--aggressive") is such a big deal for normal operation, because then you just skip it all. Linus From torvalds@linux-foundation.org Thu Dec 6 18:36:00 2007 From: torvalds@linux-foundation.org (Linus Torvalds) Date: Thu, 06 Dec 2007 18:36:00 -0000 Subject: Git and GCC In-Reply-To: <20071206173946.GA10845@sigill.intra.peff.net> References: <4aca3dc20712051947t5fbbb383ua1727c652eb25d7e@mail.gmail.com> <20071205.202047.58135920.davem@davemloft.net> <4aca3dc20712052032n521c344cla07a5df1f2c26cb8@mail.gmail.com> <20071205.204848.227521641.davem@davemloft.net> <4aca3dc20712052111o730f6fb6h7a329ee811a70f28@mail.gmail.com> <1196918132.10408.85.camel@brick> <4aca3dc20712052117j3ef5cf99y848d4962ae8ddf33@mail.gmail.com> <9e4733910712052247x116cabb4q48ebafffb93f7e03@mail.gmail.com> <20071206071503.GA19504@coredump.intra.peff.net> <20071206173946.GA10845@sigill.intra.peff.net> Message-ID: On Thu, 6 Dec 2007, Jeff King wrote: > > What is really disappointing is that we saved only about 20% of the > time. I didn't sit around watching the stages, but my guess is that we > spent a long time in the single threaded "writing objects" stage with a > thrashing delta cache.
I don't think you spent all that much time writing the objects. That part isn't very intensive, it's mostly about the IO. I suspect you may simply be dominated by memory-throughput issues. The delta matching doesn't cache all that well, and using two or more cores isn't going to help all that much if they are largely waiting for memory (and quite possibly also perhaps fighting each other for a shared cache? Is this a Core 2 with the shared L2?) Linus From torvalds@linux-foundation.org Thu Dec 6 18:46:00 2007 From: torvalds@linux-foundation.org (Linus Torvalds) Date: Thu, 06 Dec 2007 18:46:00 -0000 Subject: Git and GCC In-Reply-To: References: <4aca3dc20712051947t5fbbb383ua1727c652eb25d7e@mail.gmail.com> <20071205.202047.58135920.davem@davemloft.net> <4aca3dc20712052032n521c344cla07a5df1f2c26cb8@mail.gmail.com> <20071205.204848.227521641.davem@davemloft.net> <4aca3dc20712052111o730f6fb6h7a329ee811a70f28@mail.gmail.com> Message-ID: On Thu, 6 Dec 2007, NightStrike wrote: > > No disrespect is meant by this reply. I am just curious (and I am > probably misunderstanding something).. Why remove all of the > documentation entirely? Wouldn't it be better to just document it > more thoroughly? Well, part of it is that I don't think "--aggressive" as it is implemented right now is really almost *ever* the right answer. We could change the implementation, of course, but generally the right thing to do is to not use it (tweaking the "--window" and "--depth" manually for the repacking is likely the more natural thing to do). The other part of the answer is that, when you *do* want to do what that "--aggressive" tries to achieve, it's such a special case event that while it should probably be documented, I don't think it should necessarily be documented where it is now (as part of "git gc"), but as part of a much more technical manual for "deep and subtle tricks you can play". 
> I thought you did a fine job in this post in explaining its purpose, > when to use it, when not to, etc. Removing the documentation seems > counter-intuitive when you've already gone to the trouble of creating > good documentation here in this post. I'm so used to writing emails, and I *like* trying to explain what is going on, so I have no problems at all doing that kind of thing. However, trying to write a manual or man-page or other technical documentation is something rather different. IOW, I like explaining git within the _context_ of a discussion or a particular problem/issue. But documentation should work regardless of context (or at least set it up), and that's the part I am not so good at. In other words, if somebody (hint hint) thinks my explanation was good and readable, I'd love for them to try to turn it into real documentation by editing it up and creating enough context for it! But I'm not personally very likely to do that. I'd just send Junio the patch to remove a misleading part of the documentation we have. Linus From jonsmirl@gmail.com Thu Dec 6 18:56:00 2007 From: jonsmirl@gmail.com (Jon Smirl) Date: Thu, 06 Dec 2007 18:56:00 -0000 Subject: Git and GCC In-Reply-To: References: <4aca3dc20712051947t5fbbb383ua1727c652eb25d7e@mail.gmail.com> <20071205.204848.227521641.davem@davemloft.net> <4aca3dc20712052111o730f6fb6h7a329ee811a70f28@mail.gmail.com> <1196918132.10408.85.camel@brick> <4aca3dc20712052117j3ef5cf99y848d4962ae8ddf33@mail.gmail.com> <9e4733910712052247x116cabb4q48ebafffb93f7e03@mail.gmail.com> <20071206071503.GA19504@coredump.intra.peff.net> <20071206173946.GA10845@sigill.intra.peff.net> Message-ID: <9e4733910712061055p353775d8wd0321bc9c81297b7@mail.gmail.com> On 12/6/07, Linus Torvalds wrote: > > > On Thu, 6 Dec 2007, Jeff King wrote: > > > > What is really disappointing is that we saved only about 20% of the > > time.
I didn't sit around watching the stages, but my guess is that we > > spent a long time in the single threaded "writing objects" stage with a > > thrashing delta cache. > > I don't think you spent all that much time writing the objects. That part > isn't very intensive, it's mostly about the IO. > > I suspect you may simply be dominated by memory-throughput issues. The > delta matching doesn't cache all that well, and using two or more cores > isn't going to help all that much if they are largely waiting for memory > (and quite possibly also perhaps fighting each other for a shared cache? > Is this a Core 2 with the shared L2?) When I last looked at the code, the problem was in evenly dividing the work. I was using a four core machine and most of the time one core would end up with 3-5x the work of the lightest loaded core. Setting pack.threads up to 20 fixed the problem. With a high number of threads I was able to get a 4hr pack to finish in something like 1:15. A scheme where each core could work a minute without communicating to the other cores would be best. It would also be more efficient if the cores could avoid having sync points between them. -- Jon Smirl jonsmirl@gmail.com From jcpiza@gmail.com Thu Dec 6 19:07:00 2007 From: jcpiza@gmail.com (J.C. Pizarro) Date: Thu, 06 Dec 2007 19:07:00 -0000 Subject: [PATCH] gc --aggressive: make it really aggressive Message-ID: <998d0e4a0712061107r47f99599m262ffc7aefc4938a@mail.gmail.com> On 2007/12/06, David Kastrup wrote: > Johannes Schindelin writes: > > > However, I think that --aggressive should be aggressive, and if you > > decide to run it on a machine which lacks the muscle to be aggressive, > > well, you should have known better. > > That's a rather cheap shot. "you should have known better" than > expecting to be able to use a documented command and option because the > git developers happened to have a nicer machine... > > _How_ is one supposed to have known better? > > -- > David Kastrup, Kriemhildstr.
15, 44793 Bochum In GIT, the --aggressive option doesn't make it aggressive. In GCC, the -Wall option doesn't enable all warnings. # It's a "Tie one to one" with the similar reputations. ####### To have a rest in peace. # # J.C.Pizarro # From nico@cam.org Thu Dec 6 19:08:00 2007 From: nico@cam.org (Nicolas Pitre) Date: Thu, 06 Dec 2007 19:08:00 -0000 Subject: Git and GCC In-Reply-To: <9e4733910712061055p353775d8wd0321bc9c81297b7@mail.gmail.com> References: <4aca3dc20712051947t5fbbb383ua1727c652eb25d7e@mail.gmail.com> <20071205.204848.227521641.davem@davemloft.net> <4aca3dc20712052111o730f6fb6h7a329ee811a70f28@mail.gmail.com> <1196918132.10408.85.camel@brick> <4aca3dc20712052117j3ef5cf99y848d4962ae8ddf33@mail.gmail.com> <9e4733910712052247x116cabb4q48ebafffb93f7e03@mail.gmail.com> <20071206071503.GA19504@coredump.intra.peff.net> <20071206173946.GA10845@sigill.intra.peff.net> <9e4733910712061055p353775d8wd0321bc9c81297b7@mail.gmail.com> Message-ID: On Thu, 6 Dec 2007, Jon Smirl wrote: > On 12/6/07, Linus Torvalds wrote: > > > > > > On Thu, 6 Dec 2007, Jeff King wrote: > > > > > > What is really disappointing is that we saved only about 20% of the > > > time. I didn't sit around watching the stages, but my guess is that we > > > spent a long time in the single threaded "writing objects" stage with a > > > thrashing delta cache. > > > > I don't think you spent all that much time writing the objects. That part > > isn't very intensive, it's mostly about the IO. > > > > I suspect you may simply be dominated by memory-throughput issues. The > > delta matching doesn't cache all that well, and using two or more cores > > isn't going to help all that much if they are largely waiting for memory > > (and quite possibly also perhaps fighting each other for a shared cache? > > Is this a Core 2 with the shared L2?) > > When I lasted looked at the code, the problem was in evenly dividing > the work. 
I was using a four core machine and most of the time one > core would end up with 3-5x the work of the lightest loaded core. > Setting pack.threads up to 20 fixed the problem. With a high number of > threads I was able to get a 4hr pack to finished in something like > 1:15. But as far as I know you didn't try my latest incarnation which has been available in Git's master branch for a few months already. Nicolas From jdl@freescale.com Thu Dec 6 19:13:00 2007 From: jdl@freescale.com (Jon Loeliger) Date: Thu, 06 Dec 2007 19:13:00 -0000 Subject: Git and GCC In-Reply-To: References: <4aca3dc20712051947t5fbbb383ua1727c652eb25d7e@mail.gmail.com> <20071205.202047.58135920.davem@davemloft.net> <4aca3dc20712052032n521c344cla07a5df1f2c26cb8@mail.gmail.com> <20071205.204848.227521641.davem@davemloft.net> <4aca3dc20712052111o730f6fb6h7a329ee811a70f28@mail.gmail.com> Message-ID: <1196968371.18340.30.camel@ld0161-tx32> On Thu, 2007-12-06 at 00:09, Linus Torvalds wrote: > Git also does delta-chains, but it does them a lot more "loosely". There > is no fixed entity. Delta's are generated against any random other version > that git deems to be a good delta candidate (with various fairly > successful heursitics), and there are absolutely no hard grouping rules. I'd like to learn more about that. Can someone point me to either more documentation on it? In the absence of that, perhaps a pointer to the source code that implements it? I guess one question I posit is, would it be more accurate to think of this as a "delta net" in a weighted graph rather than a "delta chain"? Thanks, jdl From jcpiza@gmail.com Thu Dec 6 19:25:00 2007 From: jcpiza@gmail.com (J.C. Pizarro) Date: Thu, 06 Dec 2007 19:25:00 -0000 Subject: Git and GCC. Why not with fork, exec and pipes like in linux? 
Message-ID: <998d0e4a0712061125h3d44139ctb7f5600bc8467292@mail.gmail.com> On 2007/12/06, "Jon Smirl" wrote: > On 12/6/07, Linus Torvalds wrote: > > On Thu, 6 Dec 2007, Jeff King wrote: > > > > > > What is really disappointing is that we saved only about 20% of the > > > time. I didn't sit around watching the stages, but my guess is that we > > > spent a long time in the single threaded "writing objects" stage with a > > > thrashing delta cache. > > > > I don't think you spent all that much time writing the objects. That part > > isn't very intensive, it's mostly about the IO. > > > > I suspect you may simply be dominated by memory-throughput issues. The > > delta matching doesn't cache all that well, and using two or more cores > > isn't going to help all that much if they are largely waiting for memory > > (and quite possibly also perhaps fighting each other for a shared cache? > > Is this a Core 2 with the shared L2?) > > When I lasted looked at the code, the problem was in evenly dividing > the work. I was using a four core machine and most of the time one > core would end up with 3-5x the work of the lightest loaded core. > Setting pack.threads up to 20 fixed the problem. With a high number of > threads I was able to get a 4hr pack to finished in something like > 1:15. > > A scheme where each core could work a minute without communicating to > the other cores would be best. It would also be more efficient if the > cores could avoid having sync points between them. > > -- > Jon Smirl > jonsmirl@gmail.com For multicores CPUs, don't divide the work in threads. To divide the work in processes! Tips, tricks and hacks: to use fork, exec, pipes and another IPC mechanisms like mutexes, shared memory's IPC, file locks, pipes, semaphores, RPCs, sockets, etc. to access concurrently and parallely to the filelocked database. For Intel Quad Core e.g., x4 cores, it need a parent process and 4 child processes linked to the parent with pipes. 
The parent process can be * no-threaded using select/epoll/libevent * threaded using Pth (GNU Portable Threads), NPTL (from RedHat) or whatever. J.C.Pizarro From vincent+gcc@vinc17.org Thu Dec 6 19:29:00 2007 From: vincent+gcc@vinc17.org (Vincent Lefevre) Date: Thu, 06 Dec 2007 19:29:00 -0000 Subject: Git and GCC In-Reply-To: References: <4aca3dc20712051108s216d3331t8061ef45b9aa324a@mail.gmail.com> <2007-12-05-21-23-14+trackit+sam@rfc1149.net> <4aca3dc20712051347v6bb8a4dbm901d1d9ddf1dff34@mail.gmail.com> Message-ID: <20071206192859.GU5855@prunille.vinc17.org> On 2007-12-06 10:15:17 -0800, Ian Lance Taylor wrote: > Distributed version systems like git or Mercurial have some advantages > over Subversion. It's surprising that you don't mention svk, which is based on top of Subversion[*]. Has anyone tried? Is there any problem with it? [*] You have currently an obvious advantage here. -- Vincent Lefèvre - Web: 100% accessible validated (X)HTML - Blog: Work: CR INRIA - computer arithmetic / Arenaire project (LIP, ENS-Lyon) From ismail@pardus.org.tr Thu Dec 6 19:31:00 2007 From: ismail@pardus.org.tr (Ismail Dönmez) Date: Thu, 06 Dec 2007 19:31:00 -0000 Subject: Git and GCC In-Reply-To: <20071206192859.GU5855@prunille.vinc17.org> References: <4aca3dc20712051108s216d3331t8061ef45b9aa324a@mail.gmail.com> <20071206192859.GU5855@prunille.vinc17.org> Message-ID: <200712062132.05006.ismail@pardus.org.tr> On Thursday 06 December 2007 21:28:59, Vincent Lefevre wrote: > On 2007-12-06 10:15:17 -0800, Ian Lance Taylor wrote: > > Distributed version systems like git or Mercurial have some advantages > > over Subversion. > > It's surprising that you don't mention svk, which is based on top > of Subversion[*]. Has anyone tried? Is there any problem with it? > > [*] You have currently an obvious advantage here. Last time I tried SVK it was slow and buggy. I wouldn't recommend it. /ismail -- Never learn by your mistakes, if you do you may never dare to try again.
From torvalds@linux-foundation.org Thu Dec 6 19:40:00 2007
From: torvalds@linux-foundation.org (Linus Torvalds)
Date: Thu, 06 Dec 2007 19:40:00 -0000
Subject: Git and GCC
In-Reply-To: <1196968371.18340.30.camel@ld0161-tx32>
References: <4aca3dc20712051947t5fbbb383ua1727c652eb25d7e@mail.gmail.com> <20071205.202047.58135920.davem@davemloft.net> <4aca3dc20712052032n521c344cla07a5df1f2c26cb8@mail.gmail.com> <20071205.204848.227521641.davem@davemloft.net> <4aca3dc20712052111o730f6fb6h7a329ee811a70f28@mail.gmail.com> <1196968371.18340.30.camel@ld0161-tx32>
Message-ID:

On Thu, 6 Dec 2007, Jon Loeliger wrote:
>
> On Thu, 2007-12-06 at 00:09, Linus Torvalds wrote:
> > Git also does delta-chains, but it does them a lot more "loosely". There
> > is no fixed entity. Deltas are generated against any random other version
> > that git deems to be a good delta candidate (with various fairly
> > successful heuristics), and there are absolutely no hard grouping rules.
>
> I'd like to learn more about that. Can someone point me to
> either more documentation on it? In the absence of that,
> perhaps a pointer to the source code that implements it?

Well, in a very real sense, what the delta code does is:

 - just list every single object in the whole repository
 - walk over each object, trying to find another object that it can be
   written as a delta against
 - write out the result as a pack-file

That's simplified: we may not walk _all_ objects, for example: only a global
repack does that (and most pack creations are actually for pushing and
pulling between two repositories, so we only walk the objects that are in
the source but not the destination repository).

The interesting phase is the "walk each object, try to find a delta" part.
In particular, you don't want to try to find a delta by comparing each
object to every other object out there (that would be O(n^2) in objects,
and with a fairly high constant cost too!).
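
A rough sketch of the cost argument above, written in Python rather than git's C and not taken from git's sources. It counts how many candidate pairs an all-pairs search would examine, compared with a search that only tries each object against a bounded number of neighbours in some sorted order. The function names and the `window` parameter are made up for illustration; `window` only echoes the "--window" option mentioned earlier in this thread.

```python
def all_pairs_comparisons(n):
    """Pairs examined if every object is tried against every other one."""
    return n * (n - 1) // 2


def windowed_comparisons(n, window):
    """Pairs examined if each object in a sorted list is only tried against
    the (at most) `window` objects that precede it. Hypothetical model."""
    return sum(min(i, window) for i in range(n))


if __name__ == "__main__":
    n = 100_000  # arbitrary illustrative object count
    print("all pairs:", all_pairs_comparisons(n))
    print("window=10:", windowed_comparisons(n, 10))
```

The first count grows quadratically with the number of objects, while the second stays close to n * window.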
So what it does is to sort the objects by a few heuristics (type of object,
base name that object was found as when traversing a tree and size, and how
recently it was found in the history). And then over that sorted list, it
tries to find deltas between entries that are "close" to each other (and
that's where the "--window=xyz" thing comes in - it says how big the window
is for objects being close. A smaller window generates somewhat less good
deltas, but takes a lot less effort to generate).

The source is in git/builtin-pack-objects.c, with the core of it being

 - try_delta() - try to generate a *single* delta when given an object pair.
 - find_deltas() - do the actual list traversal
 - prepare_pack() and type_size_sort() - create the delta sort list from the
   list of objects.

but that whole file is probably some of the more opaque parts of git.

> I guess one question I posit is, would it be more accurate
> to think of this as a "delta net" in a weighted graph rather
> than a "delta chain"?

It's certainly not a simple chain, it's more of a set of acyclic directed
graphs in the object list. And yes, it's weighted by the size of the delta
between objects, and the optimization problem is kind of akin to finding the
smallest spanning tree (well, forest - since you do *not* want to create one
large graph, you also want to make the individual trees shallow enough that
you don't have excessive delta depth).

There are good algorithms for finding minimum spanning trees, but this one
is complicated by the fact that the biggest cost (by far!) is the
calculation of the weights itself. So rather than really worry about finding
the minimal tree/forest, the code needs to worry about not having to even
calculate all the weights!

(That, btw, is a common theme. A lot of git is about traversing graphs, like
the revision graph.
And most of the trivial graph problems all assume that you have the whole graph, but since the "whole graph" is the whole history of the repository, those algorithms are totally worthless, since they are fundamentally much too expensive - if we have to generate the whole history, we're already screwed for a big project. So things like revision graph calculation, the main performance issue is to avoid having to even *look* at parts of the graph that we don't need to see!) Linus From gitster@pobox.com Thu Dec 6 20:04:00 2007 From: gitster@pobox.com (Junio C Hamano) Date: Thu, 06 Dec 2007 20:04:00 -0000 Subject: Git and GCC In-Reply-To: <1196968371.18340.30.camel@ld0161-tx32> (Jon Loeliger's message of "Thu, 06 Dec 2007 13:12:51 -0600") References: <4aca3dc20712051947t5fbbb383ua1727c652eb25d7e@mail.gmail.com> <20071205.202047.58135920.davem@davemloft.net> <4aca3dc20712052032n521c344cla07a5df1f2c26cb8@mail.gmail.com> <20071205.204848.227521641.davem@davemloft.net> <4aca3dc20712052111o730f6fb6h7a329ee811a70f28@mail.gmail.com> <1196968371.18340.30.camel@ld0161-tx32> Message-ID: <7vk5nrd1yq.fsf@gitster.siamese.dyndns.org> Jon Loeliger writes: > On Thu, 2007-12-06 at 00:09, Linus Torvalds wrote: > >> Git also does delta-chains, but it does them a lot more "loosely". There >> is no fixed entity. Delta's are generated against any random other version >> that git deems to be a good delta candidate (with various fairly >> successful heursitics), and there are absolutely no hard grouping rules. > > I'd like to learn more about that. Can someone point me to > either more documentation on it? In the absence of that, > perhaps a pointer to the source code that implements it? 
See Documentation/technical/pack-heuristics.txt, but the document predates and does not talk about delta reusing, which was covered here: http://thread.gmane.org/gmane.comp.version-control.git/16223/focus=16267 > I guess one question I posit is, would it be more accurate > to think of this as a "delta net" in a weighted graph rather > than a "delta chain"? Yes. From abel@ispras.ru Thu Dec 6 20:35:00 2007 From: abel@ispras.ru (Andrey Belevantsev) Date: Thu, 06 Dec 2007 20:35:00 -0000 Subject: Git and GCC In-Reply-To: <20071206192859.GU5855@prunille.vinc17.org> References: <4aca3dc20712051108s216d3331t8061ef45b9aa324a@mail.gmail.com> <2007-12-05-21-23-14+trackit+sam@rfc1149.net> <4aca3dc20712051347v6bb8a4dbm901d1d9ddf1dff34@mail.gmail.com> <20071206192859.GU5855@prunille.vinc17.org> Message-ID: <47585D1D.2010807@ispras.ru> Vincent Lefevre wrote: > It's surprising that you don't mention svk, which is based on top > of Subversion[*]. Has anyone tried? Is there any problem with it? I must agree with Ismail's reply here. We have used svk for our internal development for about two years, for the reason of easy mirroring of gcc trunk and branching from it locally. I would not complain about its speed, but sometimes we had problems with merge from trunk, ending up with e.g. zero-sized files in our branch which were removed from trunk, or we even couldn't merge at all, and I had to resort to underlying subversion repository for merging. As a result, we're currently migrating to mercurial. Andrey From jcpiza@gmail.com Thu Dec 6 20:37:00 2007 From: jcpiza@gmail.com (J.C. Pizarro) Date: Thu, 06 Dec 2007 20:37:00 -0000 Subject: Git and GCC. Why not with fork, exec and pipes like in linux? In-Reply-To: <998d0e4a0712061125h3d44139ctb7f5600bc8467292@mail.gmail.com> References: <998d0e4a0712061125h3d44139ctb7f5600bc8467292@mail.gmail.com> Message-ID: <998d0e4a0712061237j6ed43aaav5934e4fe63398233@mail.gmail.com> On 2007/12/6, J.C. 
Pizarro , i wrote: > For multicores CPUs, don't divide the work in threads. > To divide the work in processes! > > Tips, tricks and hacks: to use fork, exec, pipes and another IPC mechanisms like > mutexes, shared memory's IPC, file locks, pipes, semaphores, RPCs, sockets, etc. > to access concurrently and parallely to the filelocked database. I'm sorry, we don't need exec. We need fork, pipes and another IPC mechanisms because it so shares easy the C code for parallelism. Thanks to Linus because GIT is implemented in C language to interact with system calls of the kernel written in C. > For Intel Quad Core e.g., x4 cores, it need a parent process and 4 > child processes linked to the parent with pipes. For peak performance (e.g 99.9% usage), the minimum number of child processes should be more than 4, normally between e.g. 6 and 10 processes depending on the statistics of idle's stalls of the cores. > The parent process can be > * no-threaded using select/epoll/libevent > * threaded using Pth (GNU Portable Threads), NPTL (from RedHat) or whatever. Note: there is a little design's problem with slowdown of I/O bandwith when the parent is multithreaded and the children MUST to be multithreaded that we can't avoid them to be non-multithreaded for maximum I/O bandwith. The "finding of the smallest spanning forest with deltas" consumes a lot of CPU, so if it scales well in a CPU x4 cores then it can to reduce 4 hours to 1 hour. 
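
A minimal sketch of the parent-plus-children layout described above, written in Python for brevity and assuming a POSIX system where os.fork() is available. It is an illustration only, not how git divides its work: the function name run_workers and the choice to pass each result back as text over a pipe are assumptions made for the example.

```python
import os


def run_workers(chunks, work):
    """Fork one child per chunk; each child sends its result back to the
    parent through its own pipe. Assumes work(chunk) returns an int."""
    children = []
    for chunk in chunks:
        read_fd, write_fd = os.pipe()
        pid = os.fork()
        if pid == 0:
            # Child: compute, report through the pipe, then exit at once
            # without running the parent's cleanup handlers.
            os.close(read_fd)
            try:
                os.write(write_fd, str(work(chunk)).encode())
            finally:
                os.close(write_fd)
                os._exit(0)
        # Parent: keep only the read end of this child's pipe.
        os.close(write_fd)
        children.append((pid, read_fd))

    results = []
    for pid, read_fd in children:
        with os.fdopen(read_fd, "rb") as pipe:
            results.append(int(pipe.read().decode()))
        os.waitpid(pid, 0)  # reap the child
    return results


if __name__ == "__main__":
    print(run_workers([[1, 2], [3, 4], [5]], sum))
```

The children run concurrently; the parent here simply collects the results in order, whereas the message above suggests multiplexing with select or epoll.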
J.C.Pizarro :) From dberlin@dberlin.org Thu Dec 6 20:43:00 2007 From: dberlin@dberlin.org (Daniel Berlin) Date: Thu, 06 Dec 2007 20:43:00 -0000 Subject: Git and GCC In-Reply-To: <47585D1D.2010807@ispras.ru> References: <4aca3dc20712051108s216d3331t8061ef45b9aa324a@mail.gmail.com> <2007-12-05-21-23-14+trackit+sam@rfc1149.net> <4aca3dc20712051347v6bb8a4dbm901d1d9ddf1dff34@mail.gmail.com> <20071206192859.GU5855@prunille.vinc17.org> <47585D1D.2010807@ispras.ru> Message-ID: <4aca3dc20712061243k427f49b2p142a55b925e69544@mail.gmail.com> On 12/6/07, Andrey Belevantsev wrote: > Vincent Lefevre wrote: > > It's surprising that you don't mention svk, which is based on top > > of Subversion[*]. Has anyone tried? Is there any problem with it? > I must agree with Ismail's reply here. We have used svk for our > internal development for about two years, for the reason of easy > mirroring of gcc trunk and branching from it locally. I would not > complain about its speed, but sometimes we had problems with merge from > trunk, ending up with e.g. zero-sized files in our branch which were > removed from trunk, or we even couldn't merge at all, and I had to > resort to underlying subversion repository for merging. As a result, > we're currently migrating to mercurial. I would not recommend SVK either (even being an SVN committer). While i love the SVK guys to death, it's just not the way to go if you want a distributed system. > > Andrey > From gitster@pobox.com Thu Dec 6 21:02:00 2007 From: gitster@pobox.com (Junio C Hamano) Date: Thu, 06 Dec 2007 21:02:00 -0000 Subject: Git and GCC In-Reply-To: <7vk5nrd1yq.fsf@gitster.siamese.dyndns.org> (Junio C. 
Hamano's message of "Thu, 06 Dec 2007 12:04:29 -0800") References: <4aca3dc20712051947t5fbbb383ua1727c652eb25d7e@mail.gmail.com> <20071205.202047.58135920.davem@davemloft.net> <4aca3dc20712052032n521c344cla07a5df1f2c26cb8@mail.gmail.com> <20071205.204848.227521641.davem@davemloft.net> <4aca3dc20712052111o730f6fb6h7a329ee811a70f28@mail.gmail.com> <1196968371.18340.30.camel@ld0161-tx32> <7vk5nrd1yq.fsf@gitster.siamese.dyndns.org> Message-ID: <7vabonczad.fsf@gitster.siamese.dyndns.org> Junio C Hamano writes: > Jon Loeliger writes: > >> I'd like to learn more about that. Can someone point me to >> either more documentation on it? In the absence of that, >> perhaps a pointer to the source code that implements it? > > See Documentation/technical/pack-heuristics.txt, A somewhat funny thing about this is ... $ git show --stat --summary b116b297 commit b116b297a80b54632256eb89dd22ea2b140de622 Author: Jon Loeliger Date: Thu Mar 2 19:19:29 2006 -0600 Added Packing Heursitics IRC writeup. Signed-off-by: Jon Loeliger Signed-off-by: Junio C Hamano Documentation/technical/pack-heuristics.txt | 466 +++++++++++++++++++++++++++ 1 files changed, 466 insertions(+), 0 deletions(-) create mode 100644 Documentation/technical/pack-heuristics.txt From jonsmirl@gmail.com Thu Dec 6 21:39:00 2007 From: jonsmirl@gmail.com (Jon Smirl) Date: Thu, 06 Dec 2007 21:39:00 -0000 Subject: Git and GCC In-Reply-To: References: <4aca3dc20712051947t5fbbb383ua1727c652eb25d7e@mail.gmail.com> <1196918132.10408.85.camel@brick> <4aca3dc20712052117j3ef5cf99y848d4962ae8ddf33@mail.gmail.com> <9e4733910712052247x116cabb4q48ebafffb93f7e03@mail.gmail.com> <20071206071503.GA19504@coredump.intra.peff.net> <20071206173946.GA10845@sigill.intra.peff.net> <9e4733910712061055p353775d8wd0321bc9c81297b7@mail.gmail.com> Message-ID: <9e4733910712061339n3aef023r22e5b73aac120c8a@mail.gmail.com> On 12/6/07, Nicolas Pitre wrote: > > When I lasted looked at the code, the problem was in evenly dividing > > the work. 
I was using a four core machine and most of the time one
> > core would end up with 3-5x the work of the lightest loaded core.
> > Setting pack.threads up to 20 fixed the problem. With a high number of
> > threads I was able to get a 4hr pack to finish in something like
> > 1:15.
>
> But as far as I know you didn't try my latest incarnation which has been
> available in Git's master branch for a few months already.

I've deleted all my giant packs. Using the kernel pack:
4GB Q6600

Using the current thread pack code I get these results.

The interesting case is the last one. I set it to 15 threads and
monitored with 'top'.
For 0-60% compression I was at 300% CPU, 60-74% was 200% CPU and
74-100% was 100% CPU. It never used all four cores. The only other
things running were top and my desktop. This is the same load
balancing problem I observed earlier. Much more clock time was spent
in the 2/1 core phases than the 3 core one.

Threaded, threads = 5

jonsmirl@terra:/home/linux$ time git repack -a -d -f
Counting objects: 648366, done.
Compressing objects: 100% (647457/647457), done.
Writing objects: 100% (648366/648366), done.
Total 648366 (delta 528994), reused 0 (delta 0)

real 1m31.395s
user 2m59.239s
sys 0m3.048s
jonsmirl@terra:/home/linux$

12 seconds counting
53 seconds compressing
38 seconds writing

Without threads,

jonsmirl@terra:/home/linux$ time git repack -a -d -f
warning: no threads support, ignoring pack.threads
Counting objects: 648366, done.
Compressing objects: 100% (647457/647457), done.
Writing objects: 100% (648366/648366), done.
Total 648366 (delta 528999), reused 0 (delta 0)

real 2m54.849s
user 2m51.267s
sys 0m1.412s
jonsmirl@terra:/home/linux$

Threaded, threads = 5

jonsmirl@terra:/home/linux$ time git repack -a -d -f --depth=250 --window=250
Counting objects: 648366, done.
Compressing objects: 100% (647457/647457), done.
Writing objects: 100% (648366/648366), done.
Total 648366 (delta 539080), reused 0 (delta 0)

real 9m18.032s
user 19m7.484s
sys 0m3.880s
jonsmirl@terra:/home/linux$
jonsmirl@terra:/home/linux/.git/objects/pack$ ls -l
total 182156
-r--r--r-- 1 jonsmirl jonsmirl 15561848 2007-12-06 16:15 pack-f1f8637d2c68eb1c964ec7c1877196c0c7513412.idx
-r--r--r-- 1 jonsmirl jonsmirl 170768761 2007-12-06 16:15 pack-f1f8637d2c68eb1c964ec7c1877196c0c7513412.pack
jonsmirl@terra:/home/linux/.git/objects/pack$

Non-threaded:

jonsmirl@terra:/home/linux$ time git repack -a -d -f --depth=250 --window=250
warning: no threads support, ignoring pack.threads
Counting objects: 648366, done.
Compressing objects: 100% (647457/647457), done.
Writing objects: 100% (648366/648366), done.
Total 648366 (delta 539080), reused 0 (delta 0)

real 18m51.183s
user 18m46.538s
sys 0m1.604s
jonsmirl@terra:/home/linux$
jonsmirl@terra:/home/linux/.git/objects/pack$ ls -l
total 182156
-r--r--r-- 1 jonsmirl jonsmirl 15561848 2007-12-06 15:33 pack-f1f8637d2c68eb1c964ec7c1877196c0c7513412.idx
-r--r--r-- 1 jonsmirl jonsmirl 170768761 2007-12-06 15:33 pack-f1f8637d2c68eb1c964ec7c1877196c0c7513412.pack
jonsmirl@terra:/home/linux/.git/objects/pack$

Threaded, threads = 15

jonsmirl@terra:/home/linux$ time git repack -a -d -f --depth=250 --window=250
Counting objects: 648366, done.
Compressing objects: 100% (647457/647457), done.
Writing objects: 100% (648366/648366), done.
Total 648366 (delta 539080), reused 0 (delta 0)

real 9m18.325s
user 19m14.340s
sys 0m3.996s
jonsmirl@terra:/home/linux$

--
Jon Smirl
jonsmirl@gmail.com

From nico@cam.org Thu Dec 6 22:08:00 2007
From: nico@cam.org (Nicolas Pitre)
Date: Thu, 06 Dec 2007 22:08:00 -0000
Subject: Git and GCC
In-Reply-To: <9e4733910712061339n3aef023r22e5b73aac120c8a@mail.gmail.com>
References: <4aca3dc20712051947t5fbbb383ua1727c652eb25d7e@mail.gmail.com> <1196918132.10408.85.camel@brick> <4aca3dc20712052117j3ef5cf99y848d4962ae8ddf33@mail.gmail.com> <9e4733910712052247x116cabb4q48ebafffb93f7e03@mail.gmail.com> <20071206071503.GA19504@coredump.intra.peff.net> <20071206173946.GA10845@sigill.intra.peff.net> <9e4733910712061055p353775d8wd0321bc9c81297b7@mail.gmail.com> <9e4733910712061339n3aef023r22e5b73aac120c8a@mail.gmail.com>
Message-ID:

On Thu, 6 Dec 2007, Jon Smirl wrote:

> On 12/6/07, Nicolas Pitre wrote:
> > > When I last looked at the code, the problem was in evenly dividing
> > > the work. I was using a four core machine and most of the time one
> > > core would end up with 3-5x the work of the lightest loaded core.
> > > Setting pack.threads up to 20 fixed the problem. With a high number of
> > > threads I was able to get a 4hr pack to finish in something like
> > > 1:15.
> >
> > But as far as I know you didn't try my latest incarnation which has been
> > available in Git's master branch for a few months already.
>
> I've deleted all my giant packs. Using the kernel pack:
> 4GB Q6600
>
> Using the current thread pack code I get these results.
>
> The interesting case is the last one. I set it to 15 threads and
> monitored with 'top'.
> For 0-60% compression I was at 300% CPU, 60-74% was 200% CPU and
> 74-100% was 100% CPU. It never used all four cores. The only other
> things running were top and my desktop. This is the same load
> balancing problem I observed earlier.

Well, that's possible with a window 25 times larger than the default.
The load balancing is solved with a master thread serving relatively small object list segments to any work thread that finished with its previous segment. But the size for those segments is currently fixed to window * 1000 which is way too large when window == 250. I have to find a way to auto-tune that segment size somehow. But with the default window size there should not be any such noticeable load balancing problem. Note that threading only happens in the compression phase. The count and write phase are hardly paralleled. Nicolas From jonsmirl@gmail.com Thu Dec 6 22:11:00 2007 From: jonsmirl@gmail.com (Jon Smirl) Date: Thu, 06 Dec 2007 22:11:00 -0000 Subject: Git and GCC In-Reply-To: References: <4aca3dc20712051947t5fbbb383ua1727c652eb25d7e@mail.gmail.com> <9e4733910712052247x116cabb4q48ebafffb93f7e03@mail.gmail.com> <20071206071503.GA19504@coredump.intra.peff.net> <20071206173946.GA10845@sigill.intra.peff.net> <9e4733910712061055p353775d8wd0321bc9c81297b7@mail.gmail.com> <9e4733910712061339n3aef023r22e5b73aac120c8a@mail.gmail.com> Message-ID: <9e4733910712061411y77f800dcx46bb8fdd5d97941f@mail.gmail.com> On 12/6/07, Nicolas Pitre wrote: > On Thu, 6 Dec 2007, Jon Smirl wrote: > > > On 12/6/07, Nicolas Pitre wrote: > > > > When I lasted looked at the code, the problem was in evenly dividing > > > > the work. I was using a four core machine and most of the time one > > > > core would end up with 3-5x the work of the lightest loaded core. > > > > Setting pack.threads up to 20 fixed the problem. With a high number of > > > > threads I was able to get a 4hr pack to finished in something like > > > > 1:15. > > > > > > But as far as I know you didn't try my latest incarnation which has been > > > available in Git's master branch for a few months already. > > > > I've deleted all my giant packs. Using the kernel pack: > > 4GB Q6600 > > > > Using the current thread pack code I get these results. > > > > The interesting case is the last one. 
I set it to 15 threads and > > monitored with 'top'. > > For 0-60% compression I was at 300% CPU, 60-74% was 200% CPU and > > 74-100% was 100% CPU. It never used all four cores. The only other > > things running were top and my desktop. This is the same load > > balancing problem I observed earlier. > > Well, that's possible with a window 25 times larger than the default. > > The load balancing is solved with a master thread serving relatively > small object list segments to any work thread that finished with its > previous segment. But the size for those segments is currently fixed to > window * 1000 which is way too large when window == 250. > > I have to find a way to auto-tune that segment size somehow. That would be nice. Threading is most important on the giant pack/window combinations. The normal case is fast enough that I don't really notice it. These giant pack/window combos can run 8-10 hours. > > But with the default window size there should not be any such noticeable > load balancing problem. I only spend 30 seconds in the compression phase without making the window larger. It's not long enough to really see what is going on. > > Note that threading only happens in the compression phase. The count > and write phase are hardly paralleled.
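
The hand-out scheme quoted above can be sketched roughly as follows. This is a Python illustration, not git's implementation: a shared queue stands in for the master thread, and the names process_in_segments, segment_size and work are invented for the example. In standard CPython, threads do not run CPU-bound work in parallel, so the sketch shows only the shape of the scheduling, not a speedup.

```python
import queue
import threading


def process_in_segments(objects, segment_size, n_threads, work):
    """Split `objects` into fixed-size segments and let worker threads pull
    the next segment whenever they finish the previous one."""
    segments = queue.Queue()
    for start in range(0, len(objects), segment_size):
        segments.put(objects[start:start + segment_size])

    results = []
    lock = threading.Lock()

    def worker():
        while True:
            try:
                segment = segments.get_nowait()
            except queue.Empty:
                return  # no segments left to hand out
            done = [work(obj) for obj in segment]
            with lock:  # results is shared between threads
                results.extend(done)

    threads = [threading.Thread(target=worker) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


if __name__ == "__main__":
    squares = process_in_segments(list(range(10)), 3, 4, lambda x: x * x)
    print(sorted(squares))
```

With small segments a thread that finishes early just takes another one; with segments that are too large, there may be fewer segments than threads and some threads never get any work.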
> > > Nicolas > -- Jon Smirl jonsmirl@gmail.com From jonsmirl@gmail.com Thu Dec 6 22:22:00 2007 From: jonsmirl@gmail.com (Jon Smirl) Date: Thu, 06 Dec 2007 22:22:00 -0000 Subject: Git and GCC In-Reply-To: References: <4aca3dc20712051947t5fbbb383ua1727c652eb25d7e@mail.gmail.com> <9e4733910712052247x116cabb4q48ebafffb93f7e03@mail.gmail.com> <20071206071503.GA19504@coredump.intra.peff.net> <20071206173946.GA10845@sigill.intra.peff.net> <9e4733910712061055p353775d8wd0321bc9c81297b7@mail.gmail.com> <9e4733910712061339n3aef023r22e5b73aac120c8a@mail.gmail.com> Message-ID: <9e4733910712061422w139273c0gf3cfb04c6ba8c509@mail.gmail.com> On 12/6/07, Nicolas Pitre wrote: > On Thu, 6 Dec 2007, Jon Smirl wrote: > > > On 12/6/07, Nicolas Pitre wrote: > > > > When I lasted looked at the code, the problem was in evenly dividing > > > > the work. I was using a four core machine and most of the time one > > > > core would end up with 3-5x the work of the lightest loaded core. > > > > Setting pack.threads up to 20 fixed the problem. With a high number of > > > > threads I was able to get a 4hr pack to finished in something like > > > > 1:15. > > > > > > But as far as I know you didn't try my latest incarnation which has been > > > available in Git's master branch for a few months already. > > > > I've deleted all my giant packs. Using the kernel pack: > > 4GB Q6600 > > > > Using the current thread pack code I get these results. > > > > The interesting case is the last one. I set it to 15 threads and > > monitored with 'top'. > > For 0-60% compression I was at 300% CPU, 60-74% was 200% CPU and > > 74-100% was 100% CPU. It never used all for cores. The only other > > things running were top and my desktop. This is the same load > > balancing problem I observed earlier. > > Well, that's possible with a window 25 times larger than the default. Why did it never use more than three cores? 
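
One possible reading of the three-core ceiling, using only figures quoted in this thread (647457 objects to compress, segments of window * 1000 objects, window == 250, and a default window 25 times smaller): the work splits into very few segments. This is an inference, not something confirmed by anyone above, and segment_count is a hypothetical helper.

```python
import math


def segment_count(n_objects, window, factor=1000):
    """Number of segments if each segment holds window * factor objects."""
    return math.ceil(n_objects / (window * factor))


if __name__ == "__main__":
    print(segment_count(647457, 250))  # window used in the slow runs -> 3
    print(segment_count(647457, 10))   # window 25 times smaller -> 65
```

Three segments would keep at most three threads busy no matter how many are configured, which is at least consistent with the 300%, 200% and 100% CPU phases reported above.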
> > The load balancing is solved with a master thread serving relatively > small object list segments to any work thread that finished with its > previous segment. But the size for those segments is currently fixed to > window * 1000 which is way too large when window == 250. > > I have to find a way to auto-tune that segment size somehow. > > But with the default window size there should not be any such noticeable > load balancing problem. > > Note that threading only happens in the compression phase. The count > and write phase are hardly paralleled. > > > Nicolas > -- Jon Smirl jonsmirl@gmail.com From dak@gnu.org Thu Dec 6 22:25:00 2007 From: dak@gnu.org (David Kastrup) Date: Thu, 06 Dec 2007 22:25:00 -0000 Subject: Git and GCC In-Reply-To: <7vabonczad.fsf@gitster.siamese.dyndns.org> (Junio C. Hamano's message of "Thu, 06 Dec 2007 13:02:18 -0800") References: <4aca3dc20712051947t5fbbb383ua1727c652eb25d7e@mail.gmail.com> <20071205.202047.58135920.davem@davemloft.net> <4aca3dc20712052032n521c344cla07a5df1f2c26cb8@mail.gmail.com> <20071205.204848.227521641.davem@davemloft.net> <4aca3dc20712052111o730f6fb6h7a329ee811a70f28@mail.gmail.com> <1196968371.18340.30.camel@ld0161-tx32> <7vk5nrd1yq.fsf@gitster.siamese.dyndns.org> <7vabonczad.fsf@gitster.siamese.dyndns.org> Message-ID: <85r6hzo3y8.fsf@lola.goethe.zz> Junio C Hamano writes: > Junio C Hamano writes: > >> Jon Loeliger writes: >> >>> I'd like to learn more about that. Can someone point me to >>> either more documentation on it? In the absence of that, >>> perhaps a pointer to the source code that implements it? >> >> See Documentation/technical/pack-heuristics.txt, > > A somewhat funny thing about this is ... > > $ git show --stat --summary b116b297 > commit b116b297a80b54632256eb89dd22ea2b140de622 > Author: Jon Loeliger > Date: Thu Mar 2 19:19:29 2006 -0600 > > Added Packing Heursitics IRC writeup. Ah, fishing for compliments. The cookie baking season... -- David Kastrup, Kriemhildstr. 
15, 44793 Bochum From nico@cam.org Thu Dec 6 22:30:00 2007 From: nico@cam.org (Nicolas Pitre) Date: Thu, 06 Dec 2007 22:30:00 -0000 Subject: Git and GCC In-Reply-To: <9e4733910712061422w139273c0gf3cfb04c6ba8c509@mail.gmail.com> References: <4aca3dc20712051947t5fbbb383ua1727c652eb25d7e@mail.gmail.com> <9e4733910712052247x116cabb4q48ebafffb93f7e03@mail.gmail.com> <20071206071503.GA19504@coredump.intra.peff.net> <20071206173946.GA10845@sigill.intra.peff.net> <9e4733910712061055p353775d8wd0321bc9c81297b7@mail.gmail.com> <9e4733910712061339n3aef023r22e5b73aac120c8a@mail.gmail.com> <9e4733910712061422w139273c0gf3cfb04c6ba8c509@mail.gmail.com> Message-ID: On Thu, 6 Dec 2007, Jon Smirl wrote: > On 12/6/07, Nicolas Pitre wrote: > > On Thu, 6 Dec 2007, Jon Smirl wrote: > > > > > On 12/6/07, Nicolas Pitre wrote: > > > > > When I lasted looked at the code, the problem was in evenly dividing > > > > > the work. I was using a four core machine and most of the time one > > > > > core would end up with 3-5x the work of the lightest loaded core. > > > > > Setting pack.threads up to 20 fixed the problem. With a high number of > > > > > threads I was able to get a 4hr pack to finished in something like > > > > > 1:15. > > > > > > > > But as far as I know you didn't try my latest incarnation which has been > > > > available in Git's master branch for a few months already. > > > > > > I've deleted all my giant packs. Using the kernel pack: > > > 4GB Q6600 > > > > > > Using the current thread pack code I get these results. > > > > > > The interesting case is the last one. I set it to 15 threads and > > > monitored with 'top'. > > > For 0-60% compression I was at 300% CPU, 60-74% was 200% CPU and > > > 74-100% was 100% CPU. It never used all for cores. The only other > > > things running were top and my desktop. This is the same load > > > balancing problem I observed earlier. > > > > Well, that's possible with a window 25 times larger than the default. 
> > Why did it never use more than three cores? You have 648366 objects total, and only 647457 of them are subject to delta compression. With a window size of 250 and a default thread segment of window * 1000 that means only 3 segments will be distributed to threads, hence only 3 threads with work to do. Nicolas From rdunlap@xenotime.net Thu Dec 6 22:38:00 2007 From: rdunlap@xenotime.net (Randy Dunlap) Date: Thu, 06 Dec 2007 22:38:00 -0000 Subject: [OT] Re: Git and GCC In-Reply-To: <85r6hzo3y8.fsf@lola.goethe.zz> References: <4aca3dc20712051947t5fbbb383ua1727c652eb25d7e@mail.gmail.com> <20071205.202047.58135920.davem@davemloft.net> <4aca3dc20712052032n521c344cla07a5df1f2c26cb8@mail.gmail.com> <20071205.204848.227521641.davem@davemloft.net> <4aca3dc20712052111o730f6fb6h7a329ee811a70f28@mail.gmail.com> <1196968371.18340.30.camel@ld0161-tx32> <7vk5nrd1yq.fsf@gitster.siamese.dyndns.org> <7vabonczad.fsf@gitster.siamese.dyndns.org> <85r6hzo3y8.fsf@lola.goethe.zz> Message-ID: <20071206143827.004991f8.rdunlap@xenotime.net> On Thu, 06 Dec 2007 23:26:07 +0100 David Kastrup wrote: > Junio C Hamano writes: > > > Junio C Hamano writes: > > > >> Jon Loeliger writes: > >> > >>> I'd like to learn more about that. Can someone point me to > >>> either more documentation on it? In the absence of that, > >>> perhaps a pointer to the source code that implements it? > >> > >> See Documentation/technical/pack-heuristics.txt, > > > > A somewhat funny thing about this is ... > > > > $ git show --stat --summary b116b297 > > commit b116b297a80b54632256eb89dd22ea2b140de622 > > Author: Jon Loeliger > > Date: Thu Mar 2 19:19:29 2006 -0600 > > > > Added Packing Heursitics IRC writeup. > > Ah, fishing for compliments. The cookie baking season... Indeed. Here are some really good & sweet recipes (IMHO). 
http://www.xenotime.net/linux/recipes/ --- ~Randy Features and documentation: http://lwn.net/Articles/260136/ From jonsmirl@gmail.com Thu Dec 6 22:44:00 2007 From: jonsmirl@gmail.com (Jon Smirl) Date: Thu, 06 Dec 2007 22:44:00 -0000 Subject: Git and GCC In-Reply-To: References: <4aca3dc20712051947t5fbbb383ua1727c652eb25d7e@mail.gmail.com> <20071206173946.GA10845@sigill.intra.peff.net> <9e4733910712061055p353775d8wd0321bc9c81297b7@mail.gmail.com> <9e4733910712061339n3aef023r22e5b73aac120c8a@mail.gmail.com> <9e4733910712061422w139273c0gf3cfb04c6ba8c509@mail.gmail.com> Message-ID: <9e4733910712061444i64c115e2y94f6212dd7a4ddda@mail.gmail.com> On 12/6/07, Nicolas Pitre wrote: > > > Well, that's possible with a window 25 times larger than the default. > > > > Why did it never use more than three cores? > > You have 648366 objects total, and only 647457 of them are subject to > delta compression. > > With a window size of 250 and a default thread segment of window * 1000 > that means only 3 segments will be distributed to threads, hence only 3 > threads with work to do. One little tweak and the clock time drops from 9.5 to 6 minutes. The tweak makes all four cores work. jonsmirl@terra:/home/apps/git$ git diff diff --git a/builtin-pack-objects.c b/builtin-pack-objects.c index 4f44658..e0dd12e 100644 --- a/builtin-pack-objects.c +++ b/builtin-pack-objects.c @@ -1645,7 +1645,7 @@ static void ll_find_deltas(struct object_entry **list, unsigned list_size, } /* this should be auto-tuned somehow */ - chunk_size = window * 1000; + chunk_size = window * 50; do { unsigned sublist_size = chunk_size; jonsmirl@terra:/home/linux/.git$ time git repack -a -d -f --depth=250 --window=250 Counting objects: 648366, done. Compressing objects: 100% (647457/647457), done. Writing objects: 100% (648366/648366), done. 
Total 648366 (delta 539043), reused 0 (delta 0) real 6m2.109s user 20m0.491s sys 0m4.608s jonsmirl@terra:/home/linux/.git$ > > > Nicolas > -- Jon Smirl jonsmirl@gmail.com From bviyer@ncsu.edu Thu Dec 6 22:52:00 2007 From: bviyer@ncsu.edu (Balaji V. Iyer) Date: Thu, 06 Dec 2007 22:52:00 -0000 Subject: Help with the Machine Description Message-ID: <000001c8385a$b3a22820$33160e98@ece.ncsu.edu> Hello Everyone, I am trying to modify the OpenRISC GCC to modify the existing instructions and add more instructions into the system. I had to rewrite most of the or32.md. When I try to compile something, it says the following constraint is not found. Can someone please help me with reading this constraint correctly? (insn 112 110 478 12 (set (mem:QI (reg/v/f:SI 16 r16 [orig:72 line.183 ] [72]) [0 S1 A8]) (const_int 0 [0x0])) 16 {movqi} (nil) (nil)) From what I see, it is just that we are trying to set 1 byte of a memory location with the value in register #16 (r16) with an offset of 0....which I have handled already in my machine description...so what can this be? Any help is highly appreciated. Thanking You, Yours Sincerely, Balaji V. Iyer. -- Balaji V. Iyer PhD Student, Center for Efficient, Scalable and Reliable Computing, Department of Electrical and Computer Engineering, North Carolina State University. 
From jnareb@gmail.com Fri Dec 7 00:30:00 2007 From: jnareb@gmail.com (Jakub Narebski) Date: Fri, 07 Dec 2007 00:30:00 -0000 Subject: Git and GCC In-Reply-To: References: <4aca3dc20712051947t5fbbb383ua1727c652eb25d7e@mail.gmail.com> <20071205.202047.58135920.davem@davemloft.net> <4aca3dc20712052032n521c344cla07a5df1f2c26cb8@mail.gmail.com> <20071205.204848.227521641.davem@davemloft.net> <4aca3dc20712052111o730f6fb6h7a329ee811a70f28@mail.gmail.com> <1196968371.18340.30.camel@ld0161-tx32> Message-ID: Linus Torvalds writes: > On Thu, 6 Dec 2007, Jon Loeliger wrote: >> I guess one question I posit is, would it be more accurate >> to think of this as a "delta net" in a weighted graph rather >> than a "delta chain"? > > It's certainly not a simple chain, it's more of a set of acyclic directed > graphs in the object list. And yes, it's weigted by the size of the delta > between objects, and the optimization problem is kind of akin to finding > the smallest spanning tree (well, forest - since you do *not* want to > create one large graph, you also want to make the individual trees shallow > enough that you don't have excessive delta depth). > > There are good algorithms for finding minimum spanning trees, but this one > is complicated by the fact that the biggest cost (by far!) is the > calculation of the weights itself. So rather than really worry about > finding the minimal tree/forest, the code needs to worry about not having > to even calculate all the weights! > > (That, btw, is a common theme. A lot of git is about traversing graphs, > like the revision graph. And most of the trivial graph problems all assume > that you have the whole graph, but since the "whole graph" is the whole > history of the repository, those algorithms are totally worthless, since > they are fundamentally much too expensive - if we have to generate the > whole history, we're already screwed for a big project. 
So things like > revision graph calculation, the main performance issue is to avoid having > to even *look* at parts of the graph that we don't need to see!) Hmmm... I think that these two problems (finding a minimal spanning forest with limited depth, and traversing a graph), with the additional constraint of avoiding calculating the weights / avoiding calculating the whole graph, would be good problems to present in a CompSci course. Just a thought... -- Jakub Narebski Poland ShadeHawk on #git From jcpiza@gmail.com Fri Dec 7 02:10:00 2007 From: jcpiza@gmail.com (J.C. Pizarro) Date: Fri, 07 Dec 2007 02:10:00 -0000 Subject: In future, to replace autotools by cmake like KDE4 did? Message-ID: <998d0e4a0712061810k18e6388jde9d7bc5bd006b57@mail.gmail.com> The autotools (automake + libtool + autoconf + ...) generate many big files that slow down the build and enormously grow the cvs/svn/git/hg repositories because of the generated files. See these interesting links: 1. http://dot.kde.org/1172083974/ 2. http://sam.zoy.org/lectures/20050910-debian/ 3. https://lwn.net/Articles/188693/ 4. http://en.wikipedia.org/wiki/GNU_Build_Tools 5. http://en.wikipedia.org/wiki/GNU_Automake The benefits could be: * +40% faster builds in KDE4 vs KDE 3.5.6. * elimination of redundant and unnecessary generated files such as those from autotools. * smaller cvs/svn/git/hg repositories. * fewer errors/crashes during configuration. * cmake's own sources can be improved for further performance gains. * a good and long maintenance life. 
I hope the cmake+make files can be well integrated in GCC 4.4. J.C.Pizarro From harvey.harrison@gmail.com Fri Dec 7 02:42:00 2007 From: harvey.harrison@gmail.com (Harvey Harrison) Date: Fri, 07 Dec 2007 02:42:00 -0000 Subject: Git and GCC In-Reply-To: <4aca3dc20712061004g43f5902cw79bf633917d3ade9@mail.gmail.com> References: <4aca3dc20712051947t5fbbb383ua1727c652eb25d7e@mail.gmail.com> <20071205.202047.58135920.davem@davemloft.net> <4aca3dc20712052032n521c344cla07a5df1f2c26cb8@mail.gmail.com> <20071205.204848.227521641.davem@davemloft.net> <4aca3dc20712052111o730f6fb6h7a329ee811a70f28@mail.gmail.com> <4aca3dc20712061004g43f5902cw79bf633917d3ade9@mail.gmail.com> Message-ID: <1196995353.22471.20.camel@brick> On Thu, 2007-12-06 at 13:04 -0500, Daniel Berlin wrote: > On 12/6/07, Linus Torvalds wrote: > > > So the equivalent of "git gc --aggressive" - but done *properly* - is to > > do (overnight) something like > > > > git repack -a -d --depth=250 --window=250 > > > I gave this a try overnight, and it definitely helps a lot. > Thanks! I've updated the public mirror repo with the very-packed version. People cloning it should now get the just-over-300MB repo. 
git.infradead.org/gcc.git Cheers, Harvey From davem@davemloft.net Fri Dec 7 03:31:00 2007 From: davem@davemloft.net (David Miller) Date: Fri, 07 Dec 2007 03:31:00 -0000 Subject: Git and GCC In-Reply-To: <20071206173946.GA10845@sigill.intra.peff.net> References: <20071206071503.GA19504@coredump.intra.peff.net> <20071206173946.GA10845@sigill.intra.peff.net> Message-ID: <20071206.193121.40404287.davem@davemloft.net> From: Jeff King Date: Thu, 6 Dec 2007 12:39:47 -0500 > I tried the threaded repack with pack.threads = 3 on a dual-processor > machine, and got: > > time git repack -a -d -f --window=250 --depth=250 > > real 309m59.849s > user 377m43.948s > sys 8m23.319s > > -r--r--r-- 1 peff peff 28570088 2007-12-06 10:11 pack-1fa336f33126d762988ed6fc3f44ecbe0209da3c.idx > -r--r--r-- 1 peff peff 339922573 2007-12-06 10:11 pack-1fa336f33126d762988ed6fc3f44ecbe0209da3c.pack > > So it is about 5% bigger. What is really disappointing is that we saved > only about 20% of the time. I didn't sit around watching the stages, but > my guess is that we spent a long time in the single threaded "writing > objects" stage with a thrashing delta cache. If someone can give me a good way to run this test case I can have my 64-cpu Niagara-2 box crunch on this and see how fast it goes and how much larger the resulting pack file is. 
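Two figures from this exchange can be checked with a little arithmetic. What follows is only a minimal Python sketch, not code from git. It assumes that "saved" means wall-clock time compared with total CPU time (user + sys), and that work segments are cut by plain ceiling division of the object count by window * multiplier; any boundary adjustment the real code might make is ignored. The function names are made up for illustration, and the object count is the kernel-pack figure reported by Jon Smirl, used here only as an example input, since no count for the gcc repository is given.

```python
import math


def to_seconds(minutes, seconds):
    """Convert a `time`-style XmY.YYYs reading to seconds."""
    return minutes * 60 + seconds


def overlap_saving(real_s, user_s, sys_s):
    """Fraction of the CPU time that did not show up as wall-clock time.

    Assumption: user + sys is the most a single-threaded run would have
    needed (zero threading overhead, fully CPU bound).
    """
    cpu_s = user_s + sys_s
    return 1.0 - real_s / cpu_s


def segment_count(objects, window, multiplier):
    """Approximate number of work segments handed out to threads.

    Simplification: plain ceiling division of the object count by
    chunk_size = window * multiplier (hypothetical helper, not git code).
    """
    chunk_size = window * multiplier
    return math.ceil(objects / chunk_size)


# Jeff King's dual-processor timing, as quoted in the message above.
saving = overlap_saving(
    real_s=to_seconds(309, 59.849),
    user_s=to_seconds(377, 43.948),
    sys_s=to_seconds(8, 23.319),
)
print(f"wall-clock saving from threading: {saving:.1%}")

# Object count from the kernel repack with --window=250.
for multiplier in (1000, 50):
    n = segment_count(objects=647457, window=250, multiplier=multiplier)
    print(f"chunk_size = window * {multiplier}: {n} segments")
```

Under these assumptions the saving comes out a little under 20%, and the segment count goes from 3 to 52 when the multiplier drops from 1000 to 50. Either count is below 64, so on a 64-CPU machine the segment size would probably have to shrink further before every thread had work.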
From nico@cam.org Fri Dec 7 04:21:00 2007 From: nico@cam.org (Nicolas Pitre) Date: Fri, 07 Dec 2007 04:21:00 -0000 Subject: Git and GCC In-Reply-To: <9e4733910712062006l651571f3w7f76ce64c6650dff@mail.gmail.com> References: <4aca3dc20712051947t5fbbb383ua1727c652eb25d7e@mail.gmail.com> <20071205.202047.58135920.davem@davemloft.net> <4aca3dc20712052032n521c344cla07a5df1f2c26cb8@mail.gmail.com> <20071205.204848.227521641.davem@davemloft.net> <4aca3dc20712052111o730f6fb6h7a329ee811a70f28@mail.gmail.com> <4aca3dc20712061004g43f5902cw79bf633917d3ade9@mail.gmail.com> <1196995353.22471.20.camel@brick> <9e4733910712062006l651571f3w7f76ce64c6650dff@mail.gmail.com> Message-ID: On Thu, 6 Dec 2007, Jon Smirl wrote: > I have a 4.8GB git process with 4GB of physical memory. Everything > started slowing down a lot when the process got that big. Does git > really need 4.8GB to repack? I could only keep 3.4GB resident. Luckily > this happen at 95% completion. With 8GB of memory you should be able > to do this repack in under 20 minutes. Probably you have too many cached delta results. By default, every delta smaller than 1000 bytes is kept in memory until the write phase. Try using pack.deltacachesize = 256M or lower, or try disabling this caching entirely with pack.deltacachelimit = 0. 
Nicolas From torvalds@linux-foundation.org Fri Dec 7 05:22:00 2007 From: torvalds@linux-foundation.org (Linus Torvalds) Date: Fri, 07 Dec 2007 05:22:00 -0000 Subject: Git and GCC In-Reply-To: <9e4733910712062006l651571f3w7f76ce64c6650dff@mail.gmail.com> References: <4aca3dc20712051947t5fbbb383ua1727c652eb25d7e@mail.gmail.com> <20071205.202047.58135920.davem@davemloft.net> <4aca3dc20712052032n521c344cla07a5df1f2c26cb8@mail.gmail.com> <20071205.204848.227521641.davem@davemloft.net> <4aca3dc20712052111o730f6fb6h7a329ee811a70f28@mail.gmail.com> <4aca3dc20712061004g43f5902cw79bf633917d3ade9@mail.gmail.com> <1196995353.22471.20.camel@brick> <9e4733910712062006l651571f3w7f76ce64c6650dff@mail.gmail.com> Message-ID: On Thu, 6 Dec 2007, Jon Smirl wrote: > > > > time git blame -C gcc/regclass.c > /dev/null > > jonsmirl@terra:/video/gcc$ time git blame -C gcc/regclass.c > /dev/null > > real 1m21.967s > user 1m21.329s Well, I was also hoping for a "compared to not-so-aggressive packing" number on the same machine.. IOW, what I was wondering is whether there is a visible performance downside to the deeper delta chains in the 300MB pack vs the (less aggressive) 500MB pack. Linus From nightstrike@gmail.com Fri Dec 7 05:37:00 2007 From: nightstrike@gmail.com (NightStrike) Date: Fri, 07 Dec 2007 05:37:00 -0000 Subject: Git and GCC In-Reply-To: References: <4aca3dc20712051947t5fbbb383ua1727c652eb25d7e@mail.gmail.com> <20071205.202047.58135920.davem@davemloft.net> <4aca3dc20712052032n521c344cla07a5df1f2c26cb8@mail.gmail.com> <20071205.204848.227521641.davem@davemloft.net> <4aca3dc20712052111o730f6fb6h7a329ee811a70f28@mail.gmail.com> Message-ID: On 12/6/07, Linus Torvalds wrote: > > > On Thu, 6 Dec 2007, NightStrike wrote: > > > > No disrespect is meant by this reply. I am just curious (and I am > > probably misunderstanding something).. Why remove all of the > > documentation entirely? Wouldn't it be better to just document it > > more thoroughly? 
> > Well, part of it is that I don't think "--aggressive" as it is implemented > right now is really almost *ever* the right answer. We could change the > implementation, of course, but generally the right thing to do is to not > use it (tweaking the "--window" and "--depth" manually for the repacking > is likely the more natural thing to do). > > The other part of the answer is that, when you *do* want to do what that > "--aggressive" tries to achieve, it's such a special case event that while > it should probably be documented, I don't think it should necessarily be > documented where it is now (as part of "git gc"), but as part of a much > more technical manual for "deep and subtle tricks you can play". > > > I thought you did a fine job in this post in explaining its purpose, > > when to use it, when not to, etc. Removing the documention seems > > counter-intuitive when you've already gone to the trouble of creating > > good documentation here in this post. > > I'm so used to writing emails, and I *like* trying to explain what is > going on, so I have no problems at all doing that kind of thing. However, > trying to write a manual or man-page or other technical documentation is > something rather different. > > IOW, I like explaining git within the _context_ of a discussion or a > particular problem/issue. But documentation should work regardless of > context (or at least set it up), and that's the part I am not so good at. > > In other words, if somebody (hint hint) thinks my explanation was good and > readable, I'd love for them to try to turn it into real documentation by > editing it up and creating enough context for it! But I'm nort personally > very likely to do that. I'd just send Junio the patch to remove a > misleading part of the documentation we have. hehe.. I'd love to, actually. I can work on it next week. 
From ERES@il.ibm.com Fri Dec 7 05:40:00 2007 From: ERES@il.ibm.com (Revital1 Eres) Date: Fri, 07 Dec 2007 05:40:00 -0000 Subject: Help with the Machine Description In-Reply-To: <000001c8385a$b3a22820$33160e98@ece.ncsu.edu> Message-ID: Hello, I think you should look at the constraint of the instruction in your md file, for example (taken from altivec.md file under config/rs6000 dir): (define_insn "altivec_stvx" [(parallel [(set (match_operand:V4SI 0 "memory_operand" "=Z") (match_operand:V4SI 1 "register_operand" "v")) (unspec [(const_int 0)] UNSPEC_STVX)])] "TARGET_ALTIVEC" "stvx %1,%y0" [(set_attr "type" "vecstore")]) The v and Z indicate constraints on the operands of the instruction. Their description can be found in constraints.md file in the same dir:: (define_memory_constraint "Z" "Indexed or indirect memory operand" (match_operand 0 "indexed_or_indirect_operand")) You can take a look at the gcc internals for more info about this. Revital gcc-owner@gcc.gnu.org wrote on 07/12/2007 00:52:38: > Hello Everyone, > I am trying to modify the OpenRISC GCC to modify the existing > instructions and add more instructions into the system. I had to rewrite > most of the or32.md. When I am trying to compile something, it says the > following constaint is not found. Can someone please help me with > reading this contraint correctly? > > (insn 112 110 478 12 (set (mem:QI (reg/v/f:SI 16 r16 [orig:72 line.183 ] > [72]) [0 S1 A8]) > (const_int 0 [0x0])) 16 {movqi} (nil) > (nil)) > > From what I see, it is just a that we are trying to set 1 byte of a > memory location with the value in register #16 (r16) with an offset of > 0....which I have handled already in my machine description...so what > can this be? > > Any help is highly appreciated. > > Thanking You, > > Yours Sincerely, > > Balaji V. Iyer. > > -- > > Balaji V. 
Iyer > PhD Student, > Center for Efficient, Scalable and Reliable Computing, > Department of Electrical and Computer Engineering, > North Carolina State University. > > From peff@peff.net Fri Dec 7 06:39:00 2007 From: peff@peff.net (Jeff King) Date: Fri, 07 Dec 2007 06:39:00 -0000 Subject: Git and GCC In-Reply-To: <20071206.193121.40404287.davem@davemloft.net> References: <20071206071503.GA19504@coredump.intra.peff.net> <20071206173946.GA10845@sigill.intra.peff.net> <20071206.193121.40404287.davem@davemloft.net> Message-ID: <20071207063848.GA13101@coredump.intra.peff.net> On Thu, Dec 06, 2007 at 07:31:21PM -0800, David Miller wrote: > > So it is about 5% bigger. What is really disappointing is that we saved > > only about 20% of the time. I didn't sit around watching the stages, but > > my guess is that we spent a long time in the single threaded "writing > > objects" stage with a thrashing delta cache. > > If someone can give me a good way to run this test case I can > have my 64-cpu Niagara-2 box crunch on this and see how fast > it goes and how much larger the resulting pack file is. That would be fun to see. The procedure I am using is this: # compile recent git master with threaded delta cd git echo THREADED_DELTA_SEARCH = 1 >>config.mak make install # get the gcc pack mkdir gcc && cd gcc git --bare init git config remote.gcc.url git://git.infradead.org/gcc.git git config remote.gcc.fetch \ '+refs/remotes/gcc.gnu.org/*:refs/remotes/gcc.gnu.org/*' git remote update # make a copy, so we can run further tests from a known point cd .. 
cp -a gcc test # and test multithreaded large depth/window repacking cd test git config pack.threads 4 time git repack -a -d -f --window=250 --depth=250 -Peff From peff@peff.net Fri Dec 7 06:50:00 2007 From: peff@peff.net (Jeff King) Date: Fri, 07 Dec 2007 06:50:00 -0000 Subject: Git and GCC In-Reply-To: References: <4aca3dc20712052032n521c344cla07a5df1f2c26cb8@mail.gmail.com> <20071205.204848.227521641.davem@davemloft.net> <4aca3dc20712052111o730f6fb6h7a329ee811a70f28@mail.gmail.com> <1196918132.10408.85.camel@brick> <4aca3dc20712052117j3ef5cf99y848d4962ae8ddf33@mail.gmail.com> <9e4733910712052247x116cabb4q48ebafffb93f7e03@mail.gmail.com> <20071206071503.GA19504@coredump.intra.peff.net> <20071206173946.GA10845@sigill.intra.peff.net> Message-ID: <20071207065047.GB13101@coredump.intra.peff.net> On Thu, Dec 06, 2007 at 01:02:58PM -0500, Nicolas Pitre wrote: > > What is really disappointing is that we saved > > only about 20% of the time. I didn't sit around watching the stages, but > > my guess is that we spent a long time in the single threaded "writing > > objects" stage with a thrashing delta cache. > > Maybe you should run the non threaded repack on the same machine to have > a good comparison. Sorry, I should have been more clear. By "saved" I meant "we needed N minutes of CPU time, but took only M minutes of real time to use it." IOW, if we assume that the threading had zero overhead and that we were completely CPU bound, then the task would have taken N minutes of real time. And obviously those assumptions aren't true, but I was attempting to say "it would have been at most N minutes of real time to do it single-threaded." > And if you have only 2 CPUs, you will have better performances with > pack.threads = 2, otherwise there'll be wasteful task switching going > on. Yes, but balanced by one thread running out of data way earlier than the other, and completing the task with only one CPU. 
I am doing a 4-thread test on a quad-CPU right now, and I will also try it with threads=1 and threads=6 for comparison. > And of course, if the delta cache is being trashed, that might be due to > the way the existing pack was previously packed. Hence the current pack > might impact object _access_ when repacking them. So for a really > really fair performance comparison, you'd have to preserve the original > pack and swap it back before each repack attempt. I am working each time from the pack generated by fetching from git://git.infradead.org/gcc.git. -Peff From jonsmirl@gmail.com Fri Dec 7 07:08:00 2007 From: jonsmirl@gmail.com (Jon Smirl) Date: Fri, 07 Dec 2007 07:08:00 -0000 Subject: Git and GCC In-Reply-To: References: <4aca3dc20712051947t5fbbb383ua1727c652eb25d7e@mail.gmail.com> <4aca3dc20712052032n521c344cla07a5df1f2c26cb8@mail.gmail.com> <20071205.204848.227521641.davem@davemloft.net> <4aca3dc20712052111o730f6fb6h7a329ee811a70f28@mail.gmail.com> <4aca3dc20712061004g43f5902cw79bf633917d3ade9@mail.gmail.com> <1196995353.22471.20.camel@brick> <9e4733910712062006l651571f3w7f76ce64c6650dff@mail.gmail.com> Message-ID: <9e4733910712062308t22258c6anb685b18a663e0a31@mail.gmail.com> On 12/7/07, Linus Torvalds wrote: > > > On Thu, 6 Dec 2007, Jon Smirl wrote: > > > > > > time git blame -C gcc/regclass.c > /dev/null > > > > jonsmirl@terra:/video/gcc$ time git blame -C gcc/regclass.c > /dev/null > > > > real 1m21.967s > > user 1m21.329s > > Well, I was also hoping for a "compared to not-so-aggressive packing" > number on the same machine.. IOW, what I was wondering is whether there is > a visible performance downside to the deeper delta chains in the 300MB > pack vs the (less aggressive) 500MB pack. 
Same machine with a default pack jonsmirl@terra:/video/gcc/.git/objects/pack$ ls -l total 2145716 -r--r--r-- 1 jonsmirl jonsmirl 23667932 2007-12-07 02:03 pack-bd163555ea9240a7fdd07d2708a293872665f48b.idx -r--r--r-- 1 jonsmirl jonsmirl 2171385413 2007-12-07 02:03 pack-bd163555ea9240a7fdd07d2708a293872665f48b.pack jonsmirl@terra:/video/gcc/.git/objects/pack$ Delta lengths have virtually no impact. The bigger pack file causes more IO which offsets the increased delta processing time. One of my rules is smaller is almost always better. Smaller eliminates IO and helps with the CPU cache. It's like the kernel being optimized for size instead of speed ending up being faster. time git blame -C gcc/regclass.c > /dev/null real 1m19.289s user 1m17.853s sys 0m0.952s > > Linus > -- Jon Smirl jonsmirl@gmail.com From jonsmirl@gmail.com Fri Dec 7 07:11:00 2007 From: jonsmirl@gmail.com (Jon Smirl) Date: Fri, 07 Dec 2007 07:11:00 -0000 Subject: Git and GCC In-Reply-To: <20071207063848.GA13101@coredump.intra.peff.net> References: <20071206071503.GA19504@coredump.intra.peff.net> <20071206173946.GA10845@sigill.intra.peff.net> <20071206.193121.40404287.davem@davemloft.net> <20071207063848.GA13101@coredump.intra.peff.net> Message-ID: <9e4733910712062310s30153afibc44a5550fd9ea99@mail.gmail.com> On 12/7/07, Jeff King wrote: > On Thu, Dec 06, 2007 at 07:31:21PM -0800, David Miller wrote: > > > > So it is about 5% bigger. What is really disappointing is that we saved > > > only about 20% of the time. I didn't sit around watching the stages, but > > > my guess is that we spent a long time in the single threaded "writing > > > objects" stage with a thrashing delta cache. > > > > If someone can give me a good way to run this test case I can > > have my 64-cpu Niagara-2 box crunch on this and see how fast > > it goes and how much larger the resulting pack file is. > > That would be fun to see. 
The procedure I am using is this: > > # compile recent git master with threaded delta > cd git > echo THREADED_DELTA_SEARCH = 1 >>config.mak > make install > > # get the gcc pack > mkdir gcc && cd gcc > git --bare init > git config remote.gcc.url git://git.infradead.org/gcc.git > git config remote.gcc.fetch \ > '+refs/remotes/gcc.gnu.org/*:refs/remotes/gcc.gnu.org/*' > git remote update > > # make a copy, so we can run further tests from a known point > cd .. > cp -a gcc test > > # and test multithreaded large depth/window repacking > cd test > git config pack.threads 4 Use 64 threads with 64 CPUs; if they are multicore you want even more. You need to adjust chunk_size as mentioned in the other mail. > time git repack -a -d -f --window=250 --depth=250 > > -Peff > -- Jon Smirl jonsmirl@gmail.com From peff@peff.net Fri Dec 7 07:27:00 2007 From: peff@peff.net (Jeff King) Date: Fri, 07 Dec 2007 07:27:00 -0000 Subject: Git and GCC In-Reply-To: <20071207065047.GB13101@coredump.intra.peff.net> References: <20071205.204848.227521641.davem@davemloft.net> <4aca3dc20712052111o730f6fb6h7a329ee811a70f28@mail.gmail.com> <1196918132.10408.85.camel@brick> <4aca3dc20712052117j3ef5cf99y848d4962ae8ddf33@mail.gmail.com> <9e4733910712052247x116cabb4q48ebafffb93f7e03@mail.gmail.com> <20071206071503.GA19504@coredump.intra.peff.net> <20071206173946.GA10845@sigill.intra.peff.net> <20071207065047.GB13101@coredump.intra.peff.net> Message-ID: <20071207072710.GA13620@coredump.intra.peff.net> On Fri, Dec 07, 2007 at 01:50:47AM -0500, Jeff King wrote: > Yes, but balanced by one thread running out of data way earlier than the > other, and completing the task with only one CPU. I am doing a 4-thread > test on a quad-CPU right now, and I will also try it with threads=1 and > threads=6 for comparison. Hmm. As this has been running, I read the rest of the thread, and it looks like Jon Smirl has already posted the interesting numbers. 
So nevermind, unless there is something particular you would like to see. -Peff From peff@peff.net Fri Dec 7 07:31:00 2007 From: peff@peff.net (Jeff King) Date: Fri, 07 Dec 2007 07:31:00 -0000 Subject: Git and GCC In-Reply-To: References: <4aca3dc20712052032n521c344cla07a5df1f2c26cb8@mail.gmail.com> <20071205.204848.227521641.davem@davemloft.net> <4aca3dc20712052111o730f6fb6h7a329ee811a70f28@mail.gmail.com> <1196918132.10408.85.camel@brick> <4aca3dc20712052117j3ef5cf99y848d4962ae8ddf33@mail.gmail.com> <9e4733910712052247x116cabb4q48ebafffb93f7e03@mail.gmail.com> <20071206071503.GA19504@coredump.intra.peff.net> <20071206173946.GA10845@sigill.intra.peff.net> Message-ID: <20071207073109.GA13638@coredump.intra.peff.net> On Thu, Dec 06, 2007 at 10:35:22AM -0800, Linus Torvalds wrote: > > What is really disappointing is that we saved only about 20% of the > > time. I didn't sit around watching the stages, but my guess is that we > > spent a long time in the single threaded "writing objects" stage with a > > thrashing delta cache. > > I don't think you spent all that much time writing the objects. That part > isn't very intensive, it's mostly about the IO. It can get nasty with super-long deltas thrashing the cache, I think. But in this case, I think it ended up being just a poor division of labor caused by the chunk_size parameter using the quite large window size (see elsewhere in the thread for discussion). > I suspect you may simply be dominated by memory-throughput issues. The > delta matching doesn't cache all that well, and using two or more cores > isn't going to help all that much if they are largely waiting for memory > (and quite possibly also perhaps fighting each other for a shared cache? > Is this a Core 2 with the shared L2?) I think the chunk_size more or less explains it. I have had reasonable success keeping both CPUs busy on similar tasks in the past (but with smaller window sizes). 
For reference, it was a Core 2 Duo; do they all share L2, or is there something I can look for in /proc/cpuinfo? -Peff From marcel@holtmann.org Fri Dec 7 07:57:00 2007 From: marcel@holtmann.org (Marcel Holtmann) Date: Fri, 07 Dec 2007 07:57:00 -0000 Subject: In future, to replace autotools by cmake like KDE4 did? In-Reply-To: <998d0e4a0712061810k18e6388jde9d7bc5bd006b57@mail.gmail.com> References: <998d0e4a0712061810k18e6388jde9d7bc5bd006b57@mail.gmail.com> Message-ID: <7FAE3E89-4C56-47AE-8AD8-9191D6BB8FAC@holtmann.org> Hi, > The autotools ( automake + libtool + autoconf + ... ) generate many > big > files that they have been slowing the building's computation and > growing > enormously their cvs/svn/git/hg repositories because of generated > files. > > To see below interesting links: > 1. http://dot.kde.org/1172083974/ > 2. http://sam.zoy.org/lectures/20050910-debian/ > 3. https://lwn.net/Articles/188693/ > 4. http://en.wikipedia.org/wiki/GNU_Build_Tools > 5. http://en.wikipedia.org/wiki/GNU_Automake > > The benefits could be: > * +40% faster in the KDE4 building vs KDE 3.5.6. > * elimination of redundant and unnecesary generated files as those > from autotools. > * smaller cvs/svn/git/hg repositories. stop spreading this FUD. If you leave the auto-generated files from autotools in the source control repositories, then it is your fault. They are generated files and can always be generated. Hence putting them under revision control makes no sense and so don't do it. And more certain don't complain about it if you did. Regards Marcel From jnareb@gmail.com Fri Dec 7 12:14:00 2007 From: jnareb@gmail.com (Jakub Narebski) Date: Fri, 07 Dec 2007 12:14:00 -0000 Subject: In future, to replace autotools by cmake like KDE4 did? In-Reply-To: <998d0e4a0712061810k18e6388jde9d7bc5bd006b57@mail.gmail.com> References: <998d0e4a0712061810k18e6388jde9d7bc5bd006b57@mail.gmail.com> Message-ID: "J.C. Pizarro" writes: > The autotools ( automake + libtool + autoconf + ... 
) generate many big > files that they have been slowing the building's computation and growing > enormously their cvs/svn/git/hg repositories because of generated files. [cut] And this is relevant for this mailing list exactly how? From the whole autotools package git uses only autoconf, and only as an optional part to configure only Makefile configuration variables. Generated files should not be put into version control, unless it is for convenience only in separate branch like HTML and manpage versions of git documentation are in 'html and 'man' branches, respectively. The same could be done with ./configure script. Although there was some talk about whether giw should use autotools, or perhaps CMake, or handmade ./configure script like MPlayer IIRC, instead of its own handmade Makefile... -- Jakub Narebski ShadeHawk on #git From ae@op5.se Fri Dec 7 12:44:00 2007 From: ae@op5.se (Andreas Ericsson) Date: Fri, 07 Dec 2007 12:44:00 -0000 Subject: In future, to replace autotools by cmake like KDE4 did? In-Reply-To: References: <998d0e4a0712061810k18e6388jde9d7bc5bd006b57@mail.gmail.com> Message-ID: <47594021.40200@op5.se> Jakub Narebski wrote: > > Although there was some talk about whether giw should use autotools, > or perhaps CMake, or handmade ./configure script like MPlayer IIRC, > instead of its own handmade Makefile... > To tell the truth, I'd be much happier if everything like that got put in a header file or some such. 95% of what we figure out by looking at "uname" output can already be learned by looking at the various pre-defined macros. Fortunately, there's a project devoted solely to this, so most of the tedious research need not be done. 
It can be found at http://predef.sourceforge.net/ -- Andreas Ericsson andreas.ericsson@op5.se OP5 AB www.op5.se Tel: +46 8-230225 Fax: +46 8-230231 From davem@davemloft.net Fri Dec 7 12:53:00 2007 From: davem@davemloft.net (David Miller) Date: Fri, 07 Dec 2007 12:53:00 -0000 Subject: Git and GCC In-Reply-To: <9e4733910712062310s30153afibc44a5550fd9ea99@mail.gmail.com> References: <20071206.193121.40404287.davem@davemloft.net> <20071207063848.GA13101@coredump.intra.peff.net> <9e4733910712062310s30153afibc44a5550fd9ea99@mail.gmail.com> Message-ID: <20071207.045329.204650714.davem@davemloft.net> From: "Jon Smirl" Date: Fri, 7 Dec 2007 02:10:49 -0500 > On 12/7/07, Jeff King wrote: > > On Thu, Dec 06, 2007 at 07:31:21PM -0800, David Miller wrote: > > > > # and test multithreaded large depth/window repacking > > cd test > > git config pack.threads 4 > > 64 threads with 64 CPUs, if they are multicore you want even more. > you need to adjust chunk_size as mentioned in the other mail. It's an 8 core system with 64 cpu threads. > > time git repack -a -d -f --window=250 --depth=250 Didn't work very well, even with the one-liner patch for chunk_size it died. I think I need to build 64-bit binaries. davem@huronp11:~/src/GCC/git/test$ time git repack -a -d -f --window=250 --depth=250 Counting objects: 1190671, done. fatal: Out of memory? mmap failed: Cannot allocate memory real 58m36.447s user 289m8.270s sys 4m40.680s davem@huronp11:~/src/GCC/git/test$ While it did run the load was anywhere between 5 and 9, although it did create 64 threads, and the size of the process was about 3.2GB This may be in part why it wasn't able to use all 64 thread effectively. Like I said it seemed to have 9 active at best, at any one time, most of the time only 4 or 5 were busy doing anything. Also I could end up being performance limited by SHA, it's not very well tuned on Sparc. 
It's been on my TODO list to code up the crypto unit support for Niagara-2 in the kernel, then work with Herbert Xu on the userland interfaces to take advantage of that in things like libssl. Even a better C/asm version would probably improve GIT performance a bit. Is SHA a significant portion of the compute during these repacks? I should run oprofile... From jnareb@gmail.com Fri Dec 7 13:56:00 2007 From: jnareb@gmail.com (Jakub Narebski) Date: Fri, 07 Dec 2007 13:56:00 -0000 Subject: In future, to replace autotools by cmake like KDE4 did? In-Reply-To: <47594021.40200@op5.se> References: <998d0e4a0712061810k18e6388jde9d7bc5bd006b57@mail.gmail.com> <47594021.40200@op5.se> Message-ID: <200712071456.11019.jnareb@gmail.com> Andreas Ericsson wrote: > Jakub Narebski wrote: > > > > Although there was some talk about whether giw should use autotools, > > or perhaps CMake, or handmade ./configure script like MPlayer IIRC, > > instead of its own handmade Makefile... > > > > To tell the truth, I'd be much happier if everything like that got > put in a header file or some such. 95% of what we figure out by looking > at "uname" output can already be learned by looking at the various > pre-defined macros. > > Fortunately, there's a project devoted solely to this, so most of > the tedious research need not be done. It can be found at > http://predef.sourceforge.net/ Code talks, bullsh*t walks. Pre-defined macros cannot tell us if one have specific libraries installed, cannot tell us if formatted IO functions support 'size specifiers' even though compiler claim C99 compliance or even though compiler doesn't claim C99 compliance but supports this, etc. But perhaps the "uname" based compile configuration could be replaced by testing pre-defined macros... at least for C code, and git is not only C code. 
-- Jakub Narebski Poland From amonakov@ispras.ru Fri Dec 7 14:04:00 2007 From: amonakov@ispras.ru (Alexander Monakov) Date: Fri, 07 Dec 2007 14:04:00 -0000 Subject: [RFC/RFT] Improving SMS by data dependence export Message-ID: Hi. Attached is the patch that allows to save dependence info obtained on tree level by data-reference analysis for usage on RTL level (for RTL memory disambiguation and dependence graph construction for modulo scheduling). It helps for RTL disambiguation on platforms without base+offset memory addressing modes, and impact on SMS is described below. We would like to see it in 4.4 mainline. We have tested this patch with modulo scheduling on ia64, using SPEC CPU2000 benchmark suite. It allows to apply software pipelining to more loops, resulting in ~1-2% speedup (compared to SMS without exported info). The most frequent improvements are removal of cross-iteration memory dependencies, as currently SMS adds such dependencies for all pair of memory references, even in cases when they cannot alias (for example, for different arrays or different fields of a struct). As I understand, SMS does not use RTL alias analysis here because pairs that do not alias within one iteration, but may alias when cross-iteration movement is performed (like a[i] and a[i+1]), should be marked as dependent. So, SMS data dependence analysis can be greatly improved even without data-dependence export patch by using RTL-like memory disambiguation, but without pointer arithmetic analysis. There are currently two miscompiled SPEC tests with this patch; in one of them, the problem is related to generation of register moves in the prologue of software pipelined loop (which was not pipelined without the patch). The problem is reported and discussed with Revital Eres from IBM Haifa. We would like to ask people interested in SMS performance on PowerPC and Cell SPU to conduct tests with this patch. Any feedback is greatly appreciated. Thanks. 
--
Alexander Monakov
-------------- next part --------------
A non-text attachment was scrubbed...
Name: export-ddg-20071120.patch
Type: application/octet-stream
Size: 56758 bytes
Desc: not available
URL:

From fleury@labri.fr Fri Dec 7 14:28:00 2007
From: fleury@labri.fr (Emmanuel Fleury)
Date: Fri, 07 Dec 2007 14:28:00 -0000
Subject: ETAPS Conferences
Message-ID: <47595871.1080306@labri.fr>

Hi all,

Is anyone planning to go to CC'08 or COCV'08 ?

http://www.sable.mcgill.ca/~hendren/CC2008/
http://www.complang.tuwien.ac.at/cocv2008/cocv2008.html

They're held within the ETAPS'08 joint conferences:
http://etaps08.mit.bme.hu/

Regards
--
Emmanuel Fleury

Associate Professor,          | Room:  261
LaBRI, Domaine Universitaire  | Phone: +33 (0)5 40 00 69 34
351, Cours de la Libération   | Email: emmanuel.fleury@labri.fr
33405 Talence Cedex, France   | URL:   http://www.labri.fr/~fleury

From jcpiza@gmail.com Fri Dec 7 14:42:00 2007
From: jcpiza@gmail.com (J.C. Pizarro)
Date: Fri, 07 Dec 2007 14:42:00 -0000
Subject: In future, to replace autotools by cmake like KDE4 did?
In-Reply-To: <200712071456.11019.jnareb@gmail.com>
References: <998d0e4a0712061810k18e6388jde9d7bc5bd006b57@mail.gmail.com>
 <47594021.40200@op5.se>
 <200712071456.11019.jnareb@gmail.com>
Message-ID: <998d0e4a0712070642u6ae75232t9cb5bfd0920b2439@mail.gmail.com>

On 2007/12/7, Jakub Narebski wrote:
> Andreas Ericsson wrote:
> > Jakub Narebski wrote:
> > >
> > > Although there was some talk about whether giw should use autotools,
> > > or perhaps CMake, or handmade ./configure script like MPlayer IIRC,
> > > instead of its own handmade Makefile...
> > >
> >
> > To tell the truth, I'd be much happier if everything like that got
> > put in a header file or some such. 95% of what we figure out by looking
> > at "uname" output can already be learned by looking at the various
> > pre-defined macros.
> >
> > Fortunately, there's a project devoted solely to this, so most of
> > the tedious research need not be done. It can be found at
> > http://predef.sourceforge.net/
>
> Code talks, bullsh*t walks.
>
> Pre-defined macros cannot tell us if one have specific libraries
> installed, cannot tell us if formatted IO functions support 'size
> specifiers' even though compiler claim C99 compliance or even though
> compiler doesn't claim C99 compliance but supports this, etc.
>
> But perhaps the "uname" based compile configuration could be replaced
> by testing pre-defined macros... at least for C code, and git is not
> only C code.
>
> --
> Jakub Narebski
> Poland
>

A powerful tool can do better things than old generators-based tools
(as autotools).

To imagine, there are many scripts in subdirectories or subprojects:

* Before: (many copy and paste of code as below paragraph)

A_VARIABLE_OS=`uname -a | grep .... `   # <- slow
case "$A_VARIABLE_OS" in
   *linux*) ... ;;
   *bsd*) ... ;;
   *aix*) ... ;;
   *) ...;;
esac
m4 foo.sh.m4 > bar.sh # <- very slow
./bar.sh

* Later: (with the powerful tool that had cached many predefined variables in
a ramdisk's file or in a daemon's memory)

# call once at 1st time to internal uname of powerful tool for all occurrences of
# below predefined variable from many scripts:
case "$FOO_VARIABLE_OS" in
   *linux*) ... ;;
   *bsd*) ... ;;
   *aix*) ... ;;
   *) ...;;
esac
# i don't need to generate more scripts to inspect still more it.
J.C.Pizarro From jonsmirl@gmail.com Fri Dec 7 15:01:00 2007 From: jonsmirl@gmail.com (Jon Smirl) Date: Fri, 07 Dec 2007 15:01:00 -0000 Subject: Git and GCC In-Reply-To: References: <4aca3dc20712051947t5fbbb383ua1727c652eb25d7e@mail.gmail.com> <20071205.202047.58135920.davem@davemloft.net> <4aca3dc20712052032n521c344cla07a5df1f2c26cb8@mail.gmail.com> <20071205.204848.227521641.davem@davemloft.net> <4aca3dc20712052111o730f6fb6h7a329ee811a70f28@mail.gmail.com> <4aca3dc20712061004g43f5902cw79bf633917d3ade9@mail.gmail.com> <1196995353.22471.20.camel@brick> Message-ID: <9e4733910712062006l651571f3w7f76ce64c6650dff@mail.gmail.com> On 12/6/07, Linus Torvalds wrote: > > > On Thu, 6 Dec 2007, Harvey Harrison wrote: > > > > I've updated the public mirror repo with the very-packed version. > > Side note: it might be interesting to compare timings for > history-intensive stuff with and without this kind of very-packed > situation. > > The very density of a smaller pack-file might be enough to overcome the > downsides (more CPU time to apply longer delta-chains), but regardless, > real numbers talks, bullshit walks. So wouldn't it be nice to have real > numbers? > > One easy way to get real numbers for history would be to just time some > reasonably costly operation that uses lots of history. Ie just do a > > time git blame -C gcc/regclass.c > /dev/null > > and see if the deeper delta chains are very expensive. jonsmirl@terra:/video/gcc$ time git blame -C gcc/regclass.c > /dev/null real 1m21.967s user 1m21.329s sys 0m0.640s The Mozilla repo is at least 50% larger than the gcc one. It took me 23 minutes to repack the gcc one on my $800 Dell. The trick to this is lots of RAM and 64b. There is little disk IO during the compression phase, everything is cached. I have a 4.8GB git process with 4GB of physical memory. Everything started slowing down a lot when the process got that big. Does git really need 4.8GB to repack? I could only keep 3.4GB resident. 
Luckily this happen at 95% completion. With 8GB of memory you should be able to do this repack in under 20 minutes. jonsmirl@terra:/video/gcc$ time git repack -a -d -f --depth=250 --window=250 real 22m54.380s user 69m18.948s sys 0m23.773s > (Yeah, the above is pretty much designed to be the worst possible case for > this kind of aggressive history packing, but I don't know if that choice > of file to try to annotate is a good choice or not. I suspect that "git > blame -C" with a CVS import is just horrid, because CVS commits tend to be > pretty big and nasty and not as localized as we've tried to make things in > the kernel, so doing the code copy detection is probably horrendously > expensive) > > Linus > - > To unsubscribe from this list: send the line "unsubscribe git" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html > -- Jon Smirl jonsmirl@gmail.com From torvalds@linux-foundation.org Fri Dec 7 15:01:00 2007 From: torvalds@linux-foundation.org (Linus Torvalds) Date: Fri, 07 Dec 2007 15:01:00 -0000 Subject: Git and GCC In-Reply-To: <1196995353.22471.20.camel@brick> References: <4aca3dc20712051947t5fbbb383ua1727c652eb25d7e@mail.gmail.com> <20071205.202047.58135920.davem@davemloft.net> <4aca3dc20712052032n521c344cla07a5df1f2c26cb8@mail.gmail.com> <20071205.204848.227521641.davem@davemloft.net> <4aca3dc20712052111o730f6fb6h7a329ee811a70f28@mail.gmail.com> <4aca3dc20712061004g43f5902cw79bf633917d3ade9@mail.gmail.com> <1196995353.22471.20.camel@brick> Message-ID: On Thu, 6 Dec 2007, Harvey Harrison wrote: > > I've updated the public mirror repo with the very-packed version. Side note: it might be interesting to compare timings for history-intensive stuff with and without this kind of very-packed situation. The very density of a smaller pack-file might be enough to overcome the downsides (more CPU time to apply longer delta-chains), but regardless, real numbers talks, bullshit walks. 
So wouldn't it be nice to have real numbers?

One easy way to get real numbers for history would be to just time some
reasonably costly operation that uses lots of history. Ie just do a

	time git blame -C gcc/regclass.c > /dev/null

and see if the deeper delta chains are very expensive.

(Yeah, the above is pretty much designed to be the worst possible case for
this kind of aggressive history packing, but I don't know if that choice
of file to try to annotate is a good choice or not. I suspect that "git
blame -C" with a CVS import is just horrid, because CVS commits tend to be
pretty big and nasty and not as localized as we've tried to make things in
the kernel, so doing the code copy detection is probably horrendously
expensive)

		Linus

From dave.korn@artimi.com Fri Dec 7 15:08:00 2007
From: dave.korn@artimi.com (Dave Korn)
Date: Fri, 07 Dec 2007 15:08:00 -0000
Subject: libiberty/pex-unix vfork abuse?
Message-ID: <05ea01c838e3$001b9b40$2e08a8c0@CAM.ARTIMI.COM>

    Hey all,

  This is what posix says about vfork:

http://www.opengroup.org/onlinepubs/000095399/functions/vfork.html

"The vfork() function shall be equivalent to fork(), except that the
behavior is undefined if the process created by vfork() either modifies
any data other than a variable of type pid_t used to store the return
value from vfork(), or returns from the function in which vfork() was
called, or calls any other function before successfully calling _exit()
or one of the exec family of functions."

  This is how pex-unix.c uses vfork:

static long
pex_unix_exec_child (struct pex_obj *obj, int flags, const char *executable,
                     char * const * argv, char * const * env,
                     int in, int out, int errdes,
                     int toclose, const char **errmsg, int *err)
{
  pid_t pid;

  /* We declare these to be volatile to avoid warnings from gcc about
     them being clobbered by vfork.  */
  volatile int sleep_interval;
  volatile int retries;

  sleep_interval = 1;
  pid = -1;
  for (retries = 0; retries < 4; ++retries)
    {
      pid = vfork ();
      if (pid >= 0)
        break;
      sleep (sleep_interval);
      sleep_interval *= 2;
    }

  switch (pid)
    {
    case -1:
      *err = errno;
      *errmsg = VFORK_STRING;
      return -1;

    case 0:
      /* Child process.  */
      if (in != STDIN_FILE_NO)
        {
          if (dup2 (in, STDIN_FILE_NO) < 0)
            pex_child_error (obj, executable, "dup2", errno);
          if (close (in) < 0)
            pex_child_error (obj, executable, "close", errno);
        }
      if (out != STDOUT_FILE_NO)
        {
          if (dup2 (out, STDOUT_FILE_NO) < 0)
            pex_child_error (obj, executable, "dup2", errno);
          if (close (out) < 0)
            pex_child_error (obj, executable, "close", errno);
        }
      if (errdes != STDERR_FILE_NO)
        {
          if (dup2 (errdes, STDERR_FILE_NO) < 0)
            pex_child_error (obj, executable, "dup2", errno);
          if (close (errdes) < 0)
            pex_child_error (obj, executable, "close", errno);
        }
      if (toclose >= 0)
        {
          if (close (toclose) < 0)
            pex_child_error (obj, executable, "close", errno);
        }
      if ((flags & PEX_STDERR_TO_STDOUT) != 0)
        {
          if (dup2 (STDOUT_FILE_NO, STDERR_FILE_NO) < 0)
            pex_child_error (obj, executable, "dup2", errno);
        }

      if (env)
        environ = (char**) env;

      if ((flags & PEX_SEARCH) != 0)
        {
          execvp (executable, argv);
          pex_child_error (obj, executable, "execvp", errno);
        }
      else
        {
          execv (executable, argv);
          pex_child_error (obj, executable, "execv", errno);
        }

  Note the several calls to dup2() and close(), which seem to me to be
"calls [to] any other function", and the setting of environ, which seem
to me to be modification of "any data other than a variable of type pid_t
used to store the return value from vfork()".

  The comment on pex_child_error (which uses write() to do output) gives
a hint at the thinking here:

/* Report an error from a child process.  We don't use stdio routines,
   because we might be here due to a vfork call.  */

static void
pex_child_error (struct pex_obj *obj, const char *executable,
                 const char *errmsg, int err)
{
#define writeerr(s) (void) write (STDERR_FILE_NO, s, strlen (s))
  writeerr (obj->pname);

  But I don't see any reason to assume the restriction only applies to
f*() stdio functions, in fact by my reading I don't think you're
[technically] even allowed to call a pure const inline function that's
part of your own code. (I assume that that would in fact work ok in
practice at least most of the time).

  Are we ok here? This code seems like it's doing the wrong thing to me.
As far as I can tell, we only get away with this in cygwin because of
paranoid defensive programming that backs up the fd table before running
the vfork'd child's code in the parent's context up to the first exec*()
call, and then restores it afterward, but I'm fairly sure that this
implementation will still overwrite the parent's environment.... which
could well be Not A Good Thing!

    cheers,
      DaveK
--
Can't think of a witty .sigline today....

From mcostalba@gmail.com Fri Dec 7 16:10:00 2007
From: mcostalba@gmail.com (Marco Costalba)
Date: Fri, 07 Dec 2007 16:10:00 -0000
Subject: In future, to replace autotools by cmake like KDE4 did?
In-Reply-To: <998d0e4a0712070642u6ae75232t9cb5bfd0920b2439@mail.gmail.com>
References: <998d0e4a0712061810k18e6388jde9d7bc5bd006b57@mail.gmail.com>
 <47594021.40200@op5.se>
 <200712071456.11019.jnareb@gmail.com>
 <998d0e4a0712070642u6ae75232t9cb5bfd0920b2439@mail.gmail.com>
Message-ID:

On Dec 7, 2007 3:42 PM, J.C. Pizarro wrote:
>
> A powerful tool can do better things that old generators-based tools
> (as autotools).
>
--- cut ---
>
> * Later: (with the powerful tool that had cached many predefined variables in

Insisting on highlighting your proposal as "powerful tool" vs what is
in git now (on which people spent long hours to tune it out) will give
you hard times on this list ;-)

Just my guess...
Marco From ESmith-rowland@alionscience.com Fri Dec 7 16:19:00 2007 From: ESmith-rowland@alionscience.com (Smith-Rowland, Edward M) Date: Fri, 07 Dec 2007 16:19:00 -0000 Subject: Broken link for Modula-3 front end. Message-ID: <893D428105AB9F49AC7A5C96A454A8B8138D2F@email4a.alionscience.com> Here is a minute tidbit. When I click on the Modula-3 link: http://www.m3.org/ it rolls over to: http://www.igencorp.com/igencorp/ Ii looks like they moved to: http://www.modula3.org/ Ed Smith-Rowland From iant@google.com Fri Dec 7 16:59:00 2007 From: iant@google.com (Ian Lance Taylor) Date: Fri, 07 Dec 2007 16:59:00 -0000 Subject: libiberty/pex-unix vfork abuse? In-Reply-To: <05ea01c838e3$001b9b40$2e08a8c0@CAM.ARTIMI.COM> References: <05ea01c838e3$001b9b40$2e08a8c0@CAM.ARTIMI.COM> Message-ID: "Dave Korn" writes: > Note the several calls to dup2() and close(), which seem to me to be "calls > [to] any other function", and the setting of environ, which seem to me to be > modification of "any data other than a variable of type pid_t used to store > the return value from vfork()". Despite the standardese, vfork was invented to support calling dup/dup2 before calling exec. Without that feature, it would be nearly useless. Any actual implementation of vfork must support calling dup/dup2, or it will break all real programs which use vfork. On the other hand, the setting of environ is very dubious and is likely to break on real systems. The code should be changed to call execve instead. Unfortunately there is no standard execvpe function. Fortunately gcc never uses the variant which sets environ. Offhand I'm not sure what does. Ian From dave.korn@artimi.com Fri Dec 7 17:09:00 2007 From: dave.korn@artimi.com (Dave Korn) Date: Fri, 07 Dec 2007 17:09:00 -0000 Subject: libiberty/pex-unix vfork abuse? 
In-Reply-To: References: <05ea01c838e3$001b9b40$2e08a8c0@CAM.ARTIMI.COM> Message-ID: <060401c838f3$f6333aa0$2e08a8c0@CAM.ARTIMI.COM> On 07 December 2007 16:59, Ian Lance Taylor wrote: > "Dave Korn" writes: > >> Note the several calls to dup2() and close(), which seem to me to be >> "calls [to] any other function", and the setting of environ, which seem to >> me to be modification of "any data other than a variable of type pid_t >> used to store the return value from vfork()". > > Despite the standardese, vfork was invented to support calling > dup/dup2 before calling exec. Without that feature, it would be > nearly useless. Any actual implementation of vfork must support > calling dup/dup2, or it will break all real programs which use vfork. Ah, right, hence the 'defensive coding' relating to the fdtab in cygwin. I can see how hard it would be to do the standard unix fork-and-fd-swap dance without that. (Should possibly Cc. the austin group ml and suggest a revision to the wording, assuming they aren't deciding to remove it altogether). > On the other hand, the setting of environ is very dubious and is > likely to break on real systems. The code should be changed to call > execve instead. Unfortunately there is no standard execvpe function. > Fortunately gcc never uses the variant which sets environ. Offhand > I'm not sure what does. Perhaps we could work around this case by setting environ in the parent before the vfork call and restoring it afterward, but we'd need kind of serialisation there, and I don't know how to do a critical section using pthreads/posix. cheers, DaveK -- Can't think of a witty .sigline today.... 
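
[Editorial sketch] Ian's suggestion in this thread, handing the new
environment to execve() instead of assigning environ in the vfork child,
might look roughly like the code below. This is only an illustration under
stated assumptions: spawn_with_env is a hypothetical helper and is not
libiberty code, it omits the dup2()/close() descriptor handling, and it
covers only the plain execv path (not PEX_SEARCH), since, as noted above,
there is no standard execvpe.

```c
#define _DEFAULT_SOURCE 1   /* assumption: ask glibc to declare vfork() */

#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper (not part of libiberty): run PATH with argument
   vector ARGV and environment ENVP, wait for it, and return its exit
   status, or -1 on failure.  The child never stores to `environ`; the
   environment is passed directly to execve().  */
int
spawn_with_env (const char *path, char *const argv[], char *const envp[])
{
  pid_t pid;
  int status;

  assert (path != NULL && argv != NULL && envp != NULL);

  pid = vfork ();
  if (pid < 0)
    return -1;

  if (pid == 0)
    {
      /* Child: exec or _exit only, no writes to shared globals.  */
      execve (path, argv, envp);
      _exit (127);
    }

  /* Parent: reap the child, retrying if interrupted by a signal.  */
  while (waitpid (pid, &status, 0) < 0)
    if (errno != EINTR)
      return -1;

  return WIFEXITED (status) ? WEXITSTATUS (status) : -1;
}
```

Because the parent's environ is never written, there is nothing to restore
after the call and no window in which another thread could observe a
temporary value.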
From torvalds@linux-foundation.org Fri Dec 7 17:24:00 2007 From: torvalds@linux-foundation.org (Linus Torvalds) Date: Fri, 07 Dec 2007 17:24:00 -0000 Subject: Git and GCC In-Reply-To: <20071207.045329.204650714.davem@davemloft.net> References: <20071206.193121.40404287.davem@davemloft.net> <20071207063848.GA13101@coredump.intra.peff.net> <9e4733910712062310s30153afibc44a5550fd9ea99@mail.gmail.com> <20071207.045329.204650714.davem@davemloft.net> Message-ID: On Fri, 7 Dec 2007, David Miller wrote: > > Also I could end up being performance limited by SHA, it's not very > well tuned on Sparc. It's been on my TODO list to code up the crypto > unit support for Niagara-2 in the kernel, then work with Herbert Xu on > the userland interfaces to take advantage of that in things like > libssl. Even a better C/asm version would probably improve GIT > performance a bit. I doubt yu can use the hardware support. Kernel-only hw support is inherently broken for any sane user-space usage, the setup costs are just way way too high. To be useful, crypto engines need to support direct user space access (ie a regular instruction, with all state being held in normal registers that get saved/restored by the kernel). > Is SHA a significant portion of the compute during these repacks? > I should run oprofile... SHA1 is almost totally insignificant on x86. It hardly shows up. But we have a good optimized version there. zlib tends to be a lot more noticeable (especially the uncompression: it may be faster than compression, but it's done _so_ much more that it totally dominates). Linus From jcpiza@gmail.com Fri Dec 7 17:24:00 2007 From: jcpiza@gmail.com (J.C. Pizarro) Date: Fri, 07 Dec 2007 17:24:00 -0000 Subject: libiberty/pex-unix vfork abuse? Message-ID: <998d0e4a0712070924w3e8cbfa0pe6fbc98553e44a48@mail.gmail.com> On 2007/12/07, "Dave Korn" wrote: > > On the other hand, the setting of environ is very dubious and is > > likely to break on real systems. 
The code should be changed to call > > execve instead. Unfortunately there is no standard execvpe function. > > Fortunately gcc never uses the variant which sets environ. Offhand > > I'm not sure what does. > > Perhaps we could work around this case by setting environ in the parent > before the vfork call and restoring it afterward, but we'd need kind of > serialisation there, and I don't know how to do a critical section using > pthreads/posix. You can do a critical section mainly between processes using system calls of IPC synchronization like filelocks, RPCs, shared memory with mmap and mutexes/semaphores, messages passing through pipes as tunnels, MPI, etc. Now well, a critical section between multithreaded processes are complicated. J.C.Pizarro From dave.korn@artimi.com Fri Dec 7 17:42:00 2007 From: dave.korn@artimi.com (Dave Korn) Date: Fri, 07 Dec 2007 17:42:00 -0000 Subject: libiberty/pex-unix vfork abuse? In-Reply-To: <998d0e4a0712070924w3e8cbfa0pe6fbc98553e44a48@mail.gmail.com> References: <998d0e4a0712070924w3e8cbfa0pe6fbc98553e44a48@mail.gmail.com> Message-ID: <061401c838f8$74373880$2e08a8c0@CAM.ARTIMI.COM> On 07 December 2007 17:24, J.C. Pizarro wrote: > You can do a critical section mainly between processes Thanks for your well-meaning attempt to help, but you don't understand what we're talking about, and sending a generic list of synchronisation techniques without regard to their relevance or applicability in this situation doesn't actually tell either me or Ian anything we didn't already know nor advance the discussion any. cheers, DaveK -- Can't think of a witty .sigline today.... From Joe.Buck@synopsys.COM Fri Dec 7 17:50:00 2007 From: Joe.Buck@synopsys.COM (Joe Buck) Date: Fri, 07 Dec 2007 17:50:00 -0000 Subject: libiberty/pex-unix vfork abuse? 
In-Reply-To: <061401c838f8$74373880$2e08a8c0@CAM.ARTIMI.COM> References: <998d0e4a0712070924w3e8cbfa0pe6fbc98553e44a48@mail.gmail.com> <061401c838f8$74373880$2e08a8c0@CAM.ARTIMI.COM> Message-ID: <20071207175005.GB12580@synopsys.com> On Fri, Dec 07, 2007 at 05:41:50PM -0000, Dave Korn wrote: > On 07 December 2007 17:24, J.C. Pizarro wrote: > > > You can do a critical section mainly between processes > > Thanks for your well-meaning attempt to help, but you don't understand what > we're talking about, and sending a generic list of synchronisation techniques > without regard to their relevance or applicability in this situation doesn't > actually tell either me or Ian anything we didn't already know nor advance the > discussion any. And this is hardly an isolated case. J.C., please stop responding to every issue with a grab-bag of buzzwords and generic solutions. As Dave says, you aren't telling people anything they didn't know. If you actually experiment with a specific idea and have data, by all means submit that. But random suggestions based on something you read in school are useless, and many developers on this list already took those classes and read those papers. From sebpop@gmail.com Fri Dec 7 17:59:00 2007 From: sebpop@gmail.com (Sebastian Pop) Date: Fri, 07 Dec 2007 17:59:00 -0000 Subject: ETAPS Conferences In-Reply-To: <47595871.1080306@labri.fr> References: <47595871.1080306@labri.fr> Message-ID: On Dec 7, 2007 8:28 AM, Emmanuel Fleury wrote: > Is anyone planning to go at CC'08 or COCV'08 ? I was planning, but my paper at ESOP'08 was rejected, so I won't go there ;-) Sebastian From jcpiza@gmail.com Fri Dec 7 18:09:00 2007 From: jcpiza@gmail.com (J.C. Pizarro) Date: Fri, 07 Dec 2007 18:09:00 -0000 Subject: libiberty/pex-unix vfork abuse? 
In-Reply-To: <20071207175005.GB12580@synopsys.com> References: <998d0e4a0712070924w3e8cbfa0pe6fbc98553e44a48@mail.gmail.com> <061401c838f8$74373880$2e08a8c0@CAM.ARTIMI.COM> <20071207175005.GB12580@synopsys.com> Message-ID: <998d0e4a0712071009y5b2b8bc8x2d5940461a171538@mail.gmail.com> 2007/12/7, Joe Buck wrote: > On Fri, Dec 07, 2007 at 05:41:50PM -0000, Dave Korn wrote: > > On 07 December 2007 17:24, J.C. Pizarro wrote: > > > > > You can do a critical section mainly between processes > > > > Thanks for your well-meaning attempt to help, but you don't understand what > > we're talking about, and sending a generic list of synchronisation techniques > > without regard to their relevance or applicability in this situation doesn't > > actually tell either me or Ian anything we didn't already know nor advance the > > discussion any. > > And this is hardly an isolated case. J.C., please stop responding to > every issue with a grab-bag of buzzwords and generic solutions. As > Dave says, you aren't telling people anything they didn't know. > > If you actually experiment with a specific idea and have data, by all > means submit that. But random suggestions based on something you read in > school are useless, and many developers on this list already took those > classes and read those papers. > But random suggestions based on something you read in school are useless You're wrong. My suggestions are not based from school and are not useless. My suggestions are based from university, books, papers and internet, and i did put those by a same reason, my freedom. Do you permit me a question for you? "Are important the suggestions?" J.C.Pizarro From aph@redhat.com Fri Dec 7 18:14:00 2007 From: aph@redhat.com (Andrew Haley) Date: Fri, 07 Dec 2007 18:14:00 -0000 Subject: libiberty/pex-unix vfork abuse? 
In-Reply-To: <998d0e4a0712071009y5b2b8bc8x2d5940461a171538@mail.gmail.com> References: <998d0e4a0712070924w3e8cbfa0pe6fbc98553e44a48@mail.gmail.com> <061401c838f8$74373880$2e08a8c0@CAM.ARTIMI.COM> <20071207175005.GB12580@synopsys.com> <998d0e4a0712071009y5b2b8bc8x2d5940461a171538@mail.gmail.com> Message-ID: <18265.36225.838319.242593@zebedee.pink> J.C. Pizarro writes: > You're wrong. My suggestions are not based from school and are not useless. > My suggestions are based from university, books, papers and internet, and > i did put those by a same reason, my freedom. You have the freedom to make useless postings to this list, just as we have freedom to ask you to stop. Please stop. Andrew. -- Red Hat UK Ltd, Amberley Place, 107-111 Peascod Street, Windsor, Berkshire, SL4 1TE, UK Registered in England and Wales No. 3798903 From dave.korn@artimi.com Fri Dec 7 18:28:00 2007 From: dave.korn@artimi.com (Dave Korn) Date: Fri, 07 Dec 2007 18:28:00 -0000 Subject: libiberty/pex-unix vfork abuse? In-Reply-To: <998d0e4a0712071009y5b2b8bc8x2d5940461a171538@mail.gmail.com> References: <998d0e4a0712070924w3e8cbfa0pe6fbc98553e44a48@mail.gmail.com> <061401c838f8$74373880$2e08a8c0@CAM.ARTIMI.COM> <20071207175005.GB12580@synopsys.com> <998d0e4a0712071009y5b2b8bc8x2d5940461a171538@mail.gmail.com> Message-ID: <062301c838fe$ec4cbd80$2e08a8c0@CAM.ARTIMI.COM> On 07 December 2007 18:09, J.C. Pizarro wrote: > You're wrong. My suggestions are not based from school and are not useless. Now /you're/ wrong: your suggestions *are* useless. You suggested using inter-process communications to try and resolve a potential data-access race condition between multiple threads within the *same* process. This can NOT be done by any of "filelocks, RPCs, shared memory with mmap and mutexes/semaphores, messages passing through pipes as tunnels, MPI". 
  - A filelock cannot be used to prevent a second thread within the same
process from reading from or writing to the environ[] array (or any other
variable) or calling setenv/getenv while a first thread temporarily swaps the
value in it.

  - A remote procedure call cannot stop a second thread in the same process
from reading the environ[] array (or any other ... (etc).

  - An mmap'd shared memory block cannot stop a second thread ... (etc).

  - Passing a message through a pipe cannot stop a second thread ... (etc).

  - A mutex/semaphore *could* do this, but only if every call to getenv/setenv
and every direct access to the environ array were *also* wrapped in mutex lock
calls.  This is not something you can work around in a library implementation
of pexecute, it would require every author of every program that uses
libiberty to modify their entire code.

  The fact that you have suggested all those useless suggestions proves that
you have not understood what we are discussing, which was how to temporarily
alter the value of a global variable in one thread of a program without any
other thread seeing the altered value, when we do not control the other
threads because we are just a library function, not the application itself.
The generic solution for this problem, which I mentioned in my original post,
is a "critical section", which is a stretch of code that locks out the
scheduler from scheduling any other threads of the process to run during the
period the lock is held.

> Do you permit me a question for you?
>
> "Are important the suggestions?"

  Their importance is proportional to the product of their relevance
multiplied by their feasibility.  Your suggestions above were all irrelevant,
except for the unfeasible one.  0*(anything) == (anything)*0 == 0*0 == 0.

    cheers,
      DaveK
-- 
Can't think of a witty .sigline today....
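
The mutex caveat in the list above can be made concrete with a minimal
sketch.  Every name in it (env_lock, env_swap_begin, env_swap_end,
locked_getenv) is hypothetical and exists in neither libc nor libiberty.
The point is only that such a lock protects callers that opt in to it,
while a plain getenv() anywhere else in the process bypasses it.

```c
#include <pthread.h>
#include <stdlib.h>
#include <string.h>

extern char **environ;

/* Hypothetical lock: neither libc nor libiberty defines anything like it.  */
static pthread_mutex_t env_lock = PTHREAD_MUTEX_INITIALIZER;

/* Take the lock and install TEMP_ENV as the process environment.
   Returns the previous value of environ so it can be restored later.  */
char **
env_swap_begin (char **temp_env)
{
  char **saved;

  pthread_mutex_lock (&env_lock);
  saved = environ;
  environ = temp_env;
  return saved;
}

/* Restore SAVED as the process environment and release the lock.  */
void
env_swap_end (char **saved)
{
  environ = saved;
  pthread_mutex_unlock (&env_lock);
}

/* Cooperative lookup: copy the value of NAME into BUF while holding the
   lock.  Returns 1 on success, 0 if NAME is unset or does not fit.
   Must not be called between env_swap_begin and env_swap_end on the
   same thread, because the mutex is not recursive.  */
int
locked_getenv (const char *name, char *buf, size_t buflen)
{
  const char *value;
  int found = 0;

  pthread_mutex_lock (&env_lock);
  value = getenv (name);
  if (value != NULL && strlen (value) < buflen)
    {
      strcpy (buf, value);
      found = 1;
    }
  pthread_mutex_unlock (&env_lock);
  return found;
}
```

Code that goes through locked_getenv() never observes the temporary
environment, but any code that calls getenv() directly between
env_swap_begin() and env_swap_end() still does, which is the gap
described above: a library cannot force the rest of the program to
take the lock.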
From dnovillo@google.com Fri Dec 7 18:31:00 2007 From: dnovillo@google.com (Diego Novillo) Date: Fri, 07 Dec 2007 18:31:00 -0000 Subject: libiberty/pex-unix vfork abuse? In-Reply-To: <998d0e4a0712071009y5b2b8bc8x2d5940461a171538@mail.gmail.com> References: <998d0e4a0712070924w3e8cbfa0pe6fbc98553e44a48@mail.gmail.com> <061401c838f8$74373880$2e08a8c0@CAM.ARTIMI.COM> <20071207175005.GB12580@synopsys.com> <998d0e4a0712071009y5b2b8bc8x2d5940461a171538@mail.gmail.com> Message-ID: <47599167.3070301@google.com> On 12/7/07 1:09 PM, J.C. Pizarro wrote: > Do you permit me a question for you? > > "Are important the suggestions?" > > J.C.Pizarro JC, The problem that many of us have with your responses is that they are almost always content-free. You do not seem to grasp the basic principles of the issues that you write about. Your responses are purely an amalgamation of very basic terms that you seem to have gotten from books/papers. We *all* know those things. We generally take that for granted. So, when you offer only that information, you are not contributing anything to the discussion. You are just adding noise. The problem is that over time, people stop listening to you and then when you eventually do have an actual contribution, nobody will listen to it, because people will already be too tired of arguing with you. Just a suggestion. Maybe you *do* have something useful to contribute, it's just hard to see what it might be. Diego. From iant@google.com Fri Dec 7 18:40:00 2007 From: iant@google.com (Ian Lance Taylor) Date: Fri, 07 Dec 2007 18:40:00 -0000 Subject: libiberty/pex-unix vfork abuse? In-Reply-To: <060401c838f3$f6333aa0$2e08a8c0@CAM.ARTIMI.COM> References: <05ea01c838e3$001b9b40$2e08a8c0@CAM.ARTIMI.COM> <060401c838f3$f6333aa0$2e08a8c0@CAM.ARTIMI.COM> Message-ID: "Dave Korn" writes: > > On the other hand, the setting of environ is very dubious and is > > likely to break on real systems. The code should be changed to call > > execve instead. 
Unfortunately there is no standard execvpe function. > > Fortunately gcc never uses the variant which sets environ. Offhand > > I'm not sure what does. > > Perhaps we could work around this case by setting environ in the parent > before the vfork call and restoring it afterward, but we'd need kind of > serialisation there, and I don't know how to do a critical section using > pthreads/posix. The setting of environ came in here: http://gcc.gnu.org/ml/gcc-patches/2006-05/msg00377.html Mark, setting the global environ variable won't work correctly on Unix when using vfork. Your e-mail refers to the prelinker. The prelinker sources that I have don't use the pex routines. Do your prelinker sources require setting the environment when using the PEX_SEARCH flag while setting the environment? If not, I think the simple approach is to disallow that flag for pex_run_in_environment, to not set environ, and to use execve instead. Ian From jcpiza@gmail.com Fri Dec 7 18:47:00 2007 From: jcpiza@gmail.com (J.C. Pizarro) Date: Fri, 07 Dec 2007 18:47:00 -0000 Subject: libiberty/pex-unix vfork abuse? In-Reply-To: <062301c838fe$ec4cbd80$2e08a8c0@CAM.ARTIMI.COM> References: <998d0e4a0712070924w3e8cbfa0pe6fbc98553e44a48@mail.gmail.com> <061401c838f8$74373880$2e08a8c0@CAM.ARTIMI.COM> <20071207175005.GB12580@synopsys.com> <998d0e4a0712071009y5b2b8bc8x2d5940461a171538@mail.gmail.com> <062301c838fe$ec4cbd80$2e08a8c0@CAM.ARTIMI.COM> Message-ID: <998d0e4a0712071047g2a35a43av67017bdf8495cf00@mail.gmail.com> On 2007/12/7, Dave Korn wrote: > On 07 December 2007 18:09, J.C. Pizarro wrote: > > > You're wrong. My suggestions are not based from school and are not useless. > > Now /you're/ wrong: your suggestions *are* useless. You suggested using > inter-process communications to try and resolve a potential data-access race > condition between multiple threads within the *same* process. 
This can NOT be > done by any of "filelocks, RPCs, shared memory with mmap and > mutexes/semaphores, messages passing through pipes as tunnels, MPI". > > - A filelock cannot be used to prevent a second thread within the same > process from reading from or writing to the environ[] array (or any other > variable) or calling setenv/getenv while a first thread temporarily swaps the > value in it. > > - A remote procedure call cannot stop a second thread in the same process > from reading the environ[] array (or any other ... (etc). > > - An mmap'd shared memory block cannot stop a second thread ... (etc). > > - Passing a message through a pipe cannot stop a second thread ... (etc). > > - A mutex/semaphore *could* do this, but only if every call to getenv/setenv > and every direct access to the environ array were *also* wrapped in mutex lock > calls. This is not something you can work around in a library implementation > of pexecute, it would require every author of every program that uses > libiberty to modify their entire code. > > The fact that you have suggested all those useless suggestions proves that > you have not understood what we are discussing, which was how to temporarily > alter the value of a global variable in one thread of a program without any > other thread seeing the altered value, when we do not control the other > threads because we are just a library function, not the application itself. > The generic solution for this problem, which I mentioned in my original post, > is a "critical section", which is a stretch of code that locks out the > scheduler from scheduling any other threads of the process to run during the > period the lock is held. > > > Do you permit me a question for you? > > > > "Are important the suggestions?" > > Their importance is proportional to the product of their relevance > multiplied by their feasibility. Your suggestions above were all irrelevant, > except for the unfeasible one. 0*(anything) == (anything)*0 == 0*0 == 0. 
> > cheers, > DaveK > -- > Can't think of a witty .sigline today.... and you wrote too at start of this topic: > "The vfork() function shall be equivalent to fork(), except that the behavior > is undefined if the process created by vfork() either modifies any data other > than a variable of type pid_t used to store the return value from vfork(), or > returns from the function in which vfork() was called, or calls any other > function before successfully calling _exit() or one of the exec family of > functions." Briefly, it's fork-like. It's about processes, not threads. It's about parent and children processes. And the another man talked about threads (critical section for threads). J.C.Pizarro "the noiser" (theirs freedoms that they want to stop me) From baembel@gmx.de Fri Dec 7 19:04:00 2007 From: baembel@gmx.de (Boris Boesler) Date: Fri, 07 Dec 2007 19:04:00 -0000 Subject: BITS_PER_UNIT less than 8 (was: Re: BITS_PER_UNIT larger than 8 -- word addressing) In-Reply-To: References: <2649CC51-1ACF-4C2B-88BA-396CB39D4BEE@gmx.de> Message-ID: <50DAF4FC-6D88-41B8-B3D5-C706A75B1930@gmx.de> Am 05.12.2007 um 22:32 schrieb Ian Lance Taylor: > Boris Boesler writes: > >> I assume that GCC internals assume that memory can be byte (8 bits) >> addressed - for historical reasons. > > No. gcc internals assume that memory can be addressed in units of > size BITS_PER_UNIT. The default for BITS_PER_UNIT is 8. I have > written backends for machines for which that is not true. > > It is unusual, and there is only one official target with > BITS_PER_UNIT != 8 (c4x), so there is often some minor breakage. Ok, so what have I to do to write a back-end where all addresses are given in bits? Memory is addressed in bits, not bytes. So I set: #define BITS_PER_UNIT 1 #define UNITS_PER_WORD 32 (As far as I can see, offsets are divided by BITS_PER_UNIT, so this seems to be a precondition for bit addressing.) All sizes and and boundary are set to 32. 
SImode is only four bits wide, so I added the integer modes OI and XI: INT_MODE(OI, 32) INT_MODE(XI, 64) In builtin_define_type_max I added the case "1", which will return without doing anything. Without these changes the compiler stops with internal error messages. With these changes gcc/cc1 generates a bus error. So, what can I do to get this running for my architecture? Thanks in advance, Boris From nico@cam.org Fri Dec 7 19:36:00 2007 From: nico@cam.org (Nicolas Pitre) Date: Fri, 07 Dec 2007 19:36:00 -0000 Subject: Git and GCC In-Reply-To: <9e4733910712062308t22258c6anb685b18a663e0a31@mail.gmail.com> References: <4aca3dc20712051947t5fbbb383ua1727c652eb25d7e@mail.gmail.com> <4aca3dc20712052032n521c344cla07a5df1f2c26cb8@mail.gmail.com> <20071205.204848.227521641.davem@davemloft.net> <4aca3dc20712052111o730f6fb6h7a329ee811a70f28@mail.gmail.com> <4aca3dc20712061004g43f5902cw79bf633917d3ade9@mail.gmail.com> <1196995353.22471.20.camel@brick> <9e4733910712062006l651571f3w7f76ce64c6650dff@mail.gmail.com> <9e4733910712062308t22258c6anb685b18a663e0a31@mail.gmail.com> Message-ID: On Fri, 7 Dec 2007, Jon Smirl wrote: > On 12/7/07, Linus Torvalds wrote: > > > > > > On Thu, 6 Dec 2007, Jon Smirl wrote: > > > > > > > > time git blame -C gcc/regclass.c > /dev/null > > > > > > jonsmirl@terra:/video/gcc$ time git blame -C gcc/regclass.c > /dev/null > > > > > > real 1m21.967s > > > user 1m21.329s > > > > Well, I was also hoping for a "compared to not-so-aggressive packing" > > number on the same machine.. IOW, what I was wondering is whether there is > > a visible performance downside to the deeper delta chains in the 300MB > > pack vs the (less aggressive) 500MB pack.
> > Same machine with a default pack > > jonsmirl@terra:/video/gcc/.git/objects/pack$ ls -l > total 2145716 > -r--r--r-- 1 jonsmirl jonsmirl 23667932 2007-12-07 02:03 > pack-bd163555ea9240a7fdd07d2708a293872665f48b.idx > -r--r--r-- 1 jonsmirl jonsmirl 2171385413 2007-12-07 02:03 > pack-bd163555ea9240a7fdd07d2708a293872665f48b.pack > jonsmirl@terra:/video/gcc/.git/objects/pack$ > > Delta lengths have virtually no impact. I can confirm this. I just did a repack keeping the default depth of 50 but with window=100 instead of the default of 10, and the pack shrunk from 2171385413 bytes down to 410607140 bytes. So our default window size is definitely not adequate for the gcc repo. OTOH, I recall tytso mentioning something about not having much return on a bigger window size in his tests when he proposed to increase the default delta depth to 50. So there is definitely some kind of threshold at which point the increased window size stops being advantageous wrt the number of cycles involved, and we should find a way to correlate it to the data set to have a better default window size than the current fixed default. Nicolas From rridge@csclub.uwaterloo.ca Fri Dec 7 19:38:00 2007 From: rridge@csclub.uwaterloo.ca (Ross Ridge) Date: Fri, 07 Dec 2007 19:38:00 -0000 Subject: libiberty/pex-unix vfork abuse? Message-ID: <20071207193813.1E8F873D4D@caffeine.csclub.uwaterloo.ca> Dave Korn writes: > Perhaps we could work around this case by setting environ in the parent > before the vfork call and restoring it afterward, but we'd need kind of > serialisation there, and I don't know how to do a critical section using > pthreads/posix. A simple solution would be to call fork() instead of vfork() when changing the environment. 
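
A minimal sketch of the fork() approach suggested above, assuming a
POSIX system.  run_in_environment is a hypothetical helper, not the
libiberty pex API.  It relies on the forked child having its own copy
of the address space, so assigning environ there cannot be observed by
any thread in the parent.  In a multithreaded parent the child should
do very little between fork() and exec, so this is an illustration
rather than a drop-in fix.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

extern char **environ;

/* Hypothetical helper (not part of libiberty): run PROGRAM, searched for
   on PATH, with ENV as its environment.  Returns the child's exit status,
   or -1 if the child could not be created or did not exit normally.  */
int
run_in_environment (const char *program, char *const argv[], char **env)
{
  int status;
  pid_t pid = fork ();

  if (pid < 0)
    return -1;

  if (pid == 0)
    {
      /* Child: fork() gave us a private copy of the address space, so
         this assignment is invisible to every thread in the parent.  */
      environ = env;
      execvp (program, argv);
      _exit (127);		/* Only reached if execvp failed.  */
    }

  /* Parent: environ was never modified on this side.  */
  if (waitpid (pid, &status, 0) < 0)
    return -1;
  return WIFEXITED (status) ? WEXITSTATUS (status) : -1;
}
```

The trade-off is that fork() has to duplicate the parent's page tables,
which is the cost vfork() exists to avoid, so this path would only be
worth taking when a caller actually asks for a different environment.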
Ross Ridge From rasky@develer.com Fri Dec 7 20:27:00 2007 From: rasky@develer.com (Giovanni Bajo) Date: Fri, 07 Dec 2007 20:27:00 -0000 Subject: Git and GCC In-Reply-To: References: <20071206.193121.40404287.davem@davemloft.net> <20071207063848.GA13101@coredump.intra.peff.net> <9e4733910712062310s30153afibc44a5550fd9ea99@mail.gmail.com> <20071207.045329.204650714.davem@davemloft.net> Message-ID: <4759AC8E.3070102@develer.com> On 12/7/2007 6:23 PM, Linus Torvalds wrote: >> Is SHA a significant portion of the compute during these repacks? >> I should run oprofile... > > SHA1 is almost totally insignificant on x86. It hardly shows up. But we > have a good optimized version there. > > zlib tends to be a lot more noticeable (especially the uncompression: it > may be faster than compression, but it's done _so_ much more that it > totally dominates). Have you considered alternatives, like: http://www.oberhumer.com/opensource/ucl/ -- Giovanni Bajo From rridge@csclub.uwaterloo.ca Fri Dec 7 20:37:00 2007 From: rridge@csclub.uwaterloo.ca (Ross Ridge) Date: Fri, 07 Dec 2007 20:37:00 -0000 Subject: BITS_PER_UNIT less than 8 Message-ID: <20071207203726.4690573D41@caffeine.csclub.uwaterloo.ca> Boris Boesler writes: > Ok, so what have I to do to write a back-end where all addresses are > given in bits? Memory is addressed in bits, not bytes. So I set: > > #define BITS_PER_UNIT 1 > #define UNITS_PER_WORD 32 I don't know if it's useful to define the size of a byte to be less than 8-bits, even if that more accurately reflects the hardware. Standard C requires that the char type both be at least 8 bits (UCHAR_MAX >= 255) and the same size as a byte (sizeof(char) == 1). You can't define any types that are smaller than a char and have sizeof work correctly. >So, what can I do to get this running for my architecture? If you think there's still some benefit from having GCC use a 1-bit byte, you'll probably have to fix a number of assumptions made in the code.
Things like that the size of a byte is at least 8 bits and is the same in frontend and backend. Ross Ridge From dberlin@dberlin.org Fri Dec 7 20:52:00 2007 From: dberlin@dberlin.org (Daniel Berlin) Date: Fri, 07 Dec 2007 20:52:00 -0000 Subject: [RFC/RFT] Improving SMS by data dependence export In-Reply-To: References: Message-ID: <4aca3dc20712071249u2cc5242age0fc1a3b62d6c00@mail.gmail.com> On 12/7/07, Alexander Monakov wrote: > Hi. > > Attached is the patch that allows to save dependence info obtained on tree > level by data-reference analysis for usage on RTL level (for RTL memory > disambiguation and dependence graph construction for modulo scheduling). > It helps for RTL disambiguation on platforms without base+offset memory > addressing modes, and impact on SMS is described below. We would like to > see it in 4.4 mainline. > > We have tested this patch with modulo scheduling on ia64, using SPEC > CPU2000 benchmark suite.
It allows to apply software pipelining to more > loops, resulting in ~1-2% speedup (compared to SMS without exported > info). The most frequent improvements are removal of cross-iteration > memory dependencies, as currently SMS adds such dependencies for all pair > of memory references, even in cases when they cannot alias (for example, > for different arrays or different fields of a struct). As I understand, > SMS does not use RTL alias analysis here because pairs that do not alias > within one iteration, but may alias when cross-iteration movement is > performed (like a[i] and a[i+1]), should be marked as dependent. So, SMS > data dependence analysis can be greatly improved even without > data-dependence export patch by using RTL-like memory disambiguation, but > without pointer arithmetic analysis. > > There are currently two miscompiled SPEC tests with this patch; in one of > them, the problem is related to generation of register moves in the > prologue of software pipelined loop (which was not pipelined without the > patch). The problem is reported and discussed with Revital Eres from IBM > Haifa. > > We would like to ask people interested in SMS performance on PowerPC and > Cell SPU to conduct tests with this patch. Any feedback is greatly > appreciated. > I see a few random unrelated changes, like, for example: if (may_eliminate_iv (data, use, cand, &bound)) - { - elim_cost = force_var_cost (data, bound, &depends_on_elim); - /* The bound is a loop invariant, so it will be only computed - once. */ - elim_cost /= AVG_LOOP_NITER (data->current_loop); - } + elim_cost = force_var_cost (data, bound, &depends_on_elim); else elim_cost = INFTY; Please pull these out into separate patches or don't do them :) also, i see + /* We do not use operand_equal_p for ORIG_EXPRs because we need to + distinguish memory references at different points of the loop (which + would have different indices in SSA form, like a[i_1] and a[i_2], but + were later rewritten to same a[i]). 
*/ + && (p->orig_expr == q->orig_expr)); This doesn't do enough to distinguish memory references at different points of the loop, while also eliminating from consideration that *are* the same. What if they are regular old VAR_DECL? This will still return true, but they may be different accesses at different points in the loop. In any case, this doesn't belong in mem_attrs_htab_eq, because if they are operand_equal_p, for purposes of memory attributes, they *are* equal. They may still be different accesses, which is something you have to discover later on. IE You should be doing this check somewhere else, not in a hashtable equality function :) DDR will mark them as data refs > Thanks. > > -- > Alexander Monakov > From schwab@suse.de Fri Dec 7 20:55:00 2007 From: schwab@suse.de (Andreas Schwab) Date: Fri, 07 Dec 2007 20:55:00 -0000 Subject: libiberty/pex-unix vfork abuse? In-Reply-To: <060401c838f3$f6333aa0$2e08a8c0@CAM.ARTIMI.COM> (Dave Korn's message of "Fri\, 7 Dec 2007 17\:09\:41 -0000") References: <05ea01c838e3$001b9b40$2e08a8c0@CAM.ARTIMI.COM> <060401c838f3$f6333aa0$2e08a8c0@CAM.ARTIMI.COM> Message-ID: "Dave Korn" writes: > Perhaps we could work around this case by setting environ in the parent > before the vfork call and restoring it afterward, but we'd need kind of > serialisation there, Do we? vfork should block the parent until the child calls execve or exit. Andreas. -- Andreas Schwab, SuSE Labs, schwab@suse.de SuSE Linux Products GmbH, Maxfeldstraße 5, 90409 Nürnberg, Germany PGP key fingerprint = 58CA 54C7 6D53 942B 1756 01D3 44D5 214B 8276 4ED5 "And now for something completely different." From gdr@cs.tamu.edu Fri Dec 7 22:14:00 2007 From: gdr@cs.tamu.edu (Gabriel Dos Reis) Date: Fri, 07 Dec 2007 22:14:00 -0000 Subject: libiberty/pex-unix vfork abuse?
In-Reply-To: <998d0e4a0712071009y5b2b8bc8x2d5940461a171538@mail.gmail.com> References: <998d0e4a0712070924w3e8cbfa0pe6fbc98553e44a48@mail.gmail.com> <061401c838f8$74373880$2e08a8c0@CAM.ARTIMI.COM> <20071207175005.GB12580@synopsys.com> <998d0e4a0712071009y5b2b8bc8x2d5940461a171538@mail.gmail.com> Message-ID: <87prxidxic.fsf@soliton.cs.tamu.edu> "J.C. Pizarro" writes: | > But random suggestions based on something you read in school are useless | | You're wrong. My suggestions are not based from school and are not useless. | My suggestions are based from university, books, papers and internet What is the difference? -- Gaby From jnareb@gmail.com Fri Dec 7 22:33:00 2007 From: jnareb@gmail.com (Jakub Narebski) Date: Fri, 07 Dec 2007 22:33:00 -0000 Subject: Git and GCC In-Reply-To: <4759AC8E.3070102@develer.com> References: <20071206.193121.40404287.davem@davemloft.net> <20071207063848.GA13101@coredump.intra.peff.net> <9e4733910712062310s30153afibc44a5550fd9ea99@mail.gmail.com> <20071207.045329.204650714.davem@davemloft.net> <4759AC8E.3070102@develer.com> Message-ID: Giovanni Bajo writes: > On 12/7/2007 6:23 PM, Linus Torvalds wrote: > > >> Is SHA a significant portion of the compute during these repacks? > >> I should run oprofile... > > SHA1 is almost totally insignificant on x86. It hardly shows up. But > > we have a good optimized version there. > > zlib tends to be a lot more noticeable (especially the > > *uncompression*: it may be faster than compression, but it's done _so_ > > much more that it totally dominates). > > Have you considered alternatives, like: > http://www.oberhumer.com/opensource/ucl/ As compared to LZO, the UCL algorithms achieve a better compression ratio but *decompression* is a little bit slower. See below for some rough timings. It is uncompression speed that is more important, because it is used much more often. 
-- Jakub Narebski ShadeHawk on #git From iant@google.com Fri Dec 7 22:53:00 2007 From: iant@google.com (Ian Lance Taylor) Date: Fri, 07 Dec 2007 22:53:00 -0000 Subject: BITS_PER_UNIT less than 8 (was: Re: BITS_PER_UNIT larger than 8 -- word addressing) In-Reply-To: <50DAF4FC-6D88-41B8-B3D5-C706A75B1930@gmx.de> References: <2649CC51-1ACF-4C2B-88BA-396CB39D4BEE@gmx.de> <50DAF4FC-6D88-41B8-B3D5-C706A75B1930@gmx.de> Message-ID: Boris Boesler writes: > Ok, so what have I to do to write a back-end where all addresses > are given in bits? That's kind of an extreme case. But it sounds like you are following the right approach. > Without these changes the compiler stops with internal error > mesages. With these changes gcc/cc1 generates a bus error. > > So, what can I do to get this running for my architecture? Well, you have to look at the generated code, find out where it is wrong, and fix it. There is no royal road to success. It's always hard to write a new gcc backend. And since your backend is so unusual, it's likely to be unusually hard. Ian From gccadmin@gcc.gnu.org Fri Dec 7 23:04:00 2007 From: gccadmin@gcc.gnu.org (gccadmin@gcc.gnu.org) Date: Fri, 07 Dec 2007 23:04:00 -0000 Subject: gcc-4.3-20071207 is now available Message-ID: <20071207225314.20792.qmail@sourceware.org> Snapshot gcc-4.3-20071207 is now available on ftp://gcc.gnu.org/pub/gcc/snapshots/4.3-20071207/ and on various mirrors, see http://gcc.gnu.org/mirrors.html for details. 
This snapshot has been generated from the GCC 4.3 SVN branch
with the following options: svn://gcc.gnu.org/svn/gcc/trunk revision 130696

You'll find:

gcc-4.3-20071207.tar.bz2            Complete GCC (includes all of below)

gcc-core-4.3-20071207.tar.bz2       C front end and core compiler

gcc-ada-4.3-20071207.tar.bz2        Ada front end and runtime

gcc-fortran-4.3-20071207.tar.bz2    Fortran front end and runtime

gcc-g++-4.3-20071207.tar.bz2        C++ front end and runtime

gcc-java-4.3-20071207.tar.bz2       Java front end and runtime

gcc-objc-4.3-20071207.tar.bz2       Objective-C front end and runtime

gcc-testsuite-4.3-20071207.tar.bz2  The GCC testsuite

Diffs from 4.3-20071130 are available in the diffs/ subdirectory.

When a particular snapshot is ready for public consumption the LATEST-4.3
link is updated and a message is sent to the gcc list.  Please do not use
a snapshot before it has been announced that way.

From git@vicaya.com Fri Dec 7 23:14:00 2007
From: git@vicaya.com (Luke Lu)
Date: Fri, 07 Dec 2007 23:14:00 -0000
Subject: Git and GCC
In-Reply-To:
References: <20071206.193121.40404287.davem@davemloft.net> <20071207063848.GA13101@coredump.intra.peff.net> <9e4733910712062310s30153afibc44a5550fd9ea99@mail.gmail.com> <20071207.045329.204650714.davem@davemloft.net> <4759AC8E.3070102@develer.com>
Message-ID:

On Dec 7, 2007, at 2:14 PM, Jakub Narebski wrote:

> Giovanni Bajo writes:
>> On 12/7/2007 6:23 PM, Linus Torvalds wrote:
>>>> Is SHA a significant portion of the compute during these repacks?
>>>> I should run oprofile...
>>> SHA1 is almost totally insignificant on x86. It hardly shows up. But
>>> we have a good optimized version there.
>>> zlib tends to be a lot more noticeable (especially the
>>> *uncompression*: it may be faster than compression, but it's done
>>> _so_
>>> much more that it totally dominates).
>> >> Have you considered alternatives, like: >> http://www.oberhumer.com/opensource/ucl/ > > > As compared to LZO, the UCL algorithms achieve a better compression > ratio but *decompression* is a little bit slower. See below for some > rough timings. > > > It is uncompression speed that is more important, because it is used > much more often. So why didn't we consider lzo then? It's much faster than zlib. __Luke From rasky@develer.com Fri Dec 7 23:33:00 2007 From: rasky@develer.com (Giovanni Bajo) Date: Fri, 07 Dec 2007 23:33:00 -0000 Subject: Git and GCC In-Reply-To: References: <20071206.193121.40404287.davem@davemloft.net> <20071207063848.GA13101@coredump.intra.peff.net> <9e4733910712062310s30153afibc44a5550fd9ea99@mail.gmail.com> <20071207.045329.204650714.davem@davemloft.net> <4759AC8E.3070102@develer.com> Message-ID: <1197069298.6118.1.camel@ozzu> On Fri, 2007-12-07 at 14:14 -0800, Jakub Narebski wrote: > > >> Is SHA a significant portion of the compute during these repacks? > > >> I should run oprofile... > > > SHA1 is almost totally insignificant on x86. It hardly shows up. But > > > we have a good optimized version there. > > > zlib tends to be a lot more noticeable (especially the > > > *uncompression*: it may be faster than compression, but it's done _so_ > > > much more that it totally dominates). > > > > Have you considered alternatives, like: > > http://www.oberhumer.com/opensource/ucl/ > > > As compared to LZO, the UCL algorithms achieve a better compression > ratio but *decompression* is a little bit slower. See below for some > rough timings. > > > It is uncompression speed that is more important, because it is used > much more often. I know, but the point is not what is the fastestest, but if it's fast enough to get off the profiles. I think UCL is fast enough since it's still times faster than zlib. Anyway, LZO is GPL too, so why not considering it too. They are good libraries. 
-- Giovanni Bajo From dberlin@dberlin.org Sat Dec 8 00:47:00 2007 From: dberlin@dberlin.org (Daniel Berlin) Date: Sat, 08 Dec 2007 00:47:00 -0000 Subject: Git and GCC In-Reply-To: <1197069298.6118.1.camel@ozzu> References: <20071206.193121.40404287.davem@davemloft.net> <20071207063848.GA13101@coredump.intra.peff.net> <9e4733910712062310s30153afibc44a5550fd9ea99@mail.gmail.com> <20071207.045329.204650714.davem@davemloft.net> <4759AC8E.3070102@develer.com> <1197069298.6118.1.camel@ozzu> Message-ID: <4aca3dc20712071533k3189d25dp901c5941e5326ead@mail.gmail.com> On 12/7/07, Giovanni Bajo wrote: > On Fri, 2007-12-07 at 14:14 -0800, Jakub Narebski wrote: > > > > >> Is SHA a significant portion of the compute during these repacks? > > > >> I should run oprofile... > > > > SHA1 is almost totally insignificant on x86. It hardly shows up. But > > > > we have a good optimized version there. > > > > zlib tends to be a lot more noticeable (especially the > > > > *uncompression*: it may be faster than compression, but it's done _so_ > > > > much more that it totally dominates). > > > > > > Have you considered alternatives, like: > > > http://www.oberhumer.com/opensource/ucl/ > > > > > > As compared to LZO, the UCL algorithms achieve a better compression > > ratio but *decompression* is a little bit slower. See below for some > > rough timings. > > > > > > It is uncompression speed that is more important, because it is used > > much more often. > > I know, but the point is not what is the fastestest, but if it's fast > enough to get off the profiles. I think UCL is fast enough since it's > still times faster than zlib. Anyway, LZO is GPL too, so why not > considering it too. They are good libraries. At worst, you could also use fastlz (www.fastlz.org), which is faster than all of these by a factor of 4 (and compression wise, is actually sometimes better, sometimes worse, than LZO). 
From harvey.harrison@gmail.com Sat Dec 8 01:49:00 2007 From: harvey.harrison@gmail.com (Harvey Harrison) Date: Sat, 08 Dec 2007 01:49:00 -0000 Subject: Git and GCC In-Reply-To: References: <4aca3dc20712051947t5fbbb383ua1727c652eb25d7e@mail.gmail.com> <20071205.202047.58135920.davem@davemloft.net> <4aca3dc20712052032n521c344cla07a5df1f2c26cb8@mail.gmail.com> <20071205.204848.227521641.davem@davemloft.net> <4aca3dc20712052111o730f6fb6h7a329ee811a70f28@mail.gmail.com> <1196918132.10408.85.camel@brick> <4aca3dc20712052117j3ef5cf99y848d4962ae8ddf33@mail.gmail.com> <9e4733910712052247x116cabb4q48ebafffb93f7e03@mail.gmail.com> <20071206071503.GA19504@coredump.intra.peff.net> <20071206173946.GA10845@sigill.intra.peff.net> Message-ID: <1197074839.22471.34.camel@brick> Some interesting stats from the highly packed gcc repo. The long chain lengths very quickly tail off. Over 60% of the objects have a chain length of 20 or less. If anyone wants the full list let me know. I also have included a few other interesting points, the git default depth of 50, my initial guess of 100 and every 10% in the cumulative distribution from 60-100%. This shows the git default of 50 really isn't that bad, and after about 100 it really starts to get sparse. 
Harvey

Total objects: 1017922

chain length   objects   cumulative   cumulative %
   1:          103817      103817       10.20%
   2:           67332      171149       16.81%
   3:           57520      228669       22.46%
   4:           52570      281239       27.63%
   5:           43910      325149       31.94%
   6:           37520      362669       35.63%
   7:           35248      397917       39.09%
   8:           29819      427736       42.02%
   9:           27619      455355       44.73%
  10:           22656      478011       46.96%
  11:           21073      499084       49.03%
  12:           18738      517822       50.87%
  13:           16674      534496       52.51%
  14:           14882      549378       53.97%
  15:           14424      563802       55.39%
  16:           12765      576567       56.64%
  17:           11662      588229       57.79%
  18:           11845      600074       58.95%
  19:           11694      611768       60.10%
  20:            9625      621393       61.05%
  34:            5354      719356       70.67%
  50:            3395      785342       77.15%
  60:            2547      815072       80.07%
 100:            1644      898284       88.25%
 113:            1292      917046       90.09%
 158:             959      967429       95.04%
 200:             652      997653       98.01%
 219:             491     1008132       99.04%
 245:             179     1017717       99.98%
 246:             111     1017828       99.99%
 247:              61     1017889      100.00%
 248:              27     1017916      100.00%
 249:               6     1017922      100.00%

From joseph@codesourcery.com Sat Dec 8 01:55:00 2007
From: joseph@codesourcery.com (Joseph S. Myers)
Date: Sat, 08 Dec 2007 01:55:00 -0000
Subject: BITS_PER_UNIT less than 8
In-Reply-To: <20071207203726.4690573D41@caffeine.csclub.uwaterloo.ca>
References: <20071207203726.4690573D41@caffeine.csclub.uwaterloo.ca>
Message-ID:

On Fri, 7 Dec 2007, Ross Ridge wrote:

> Boris Boesler writes:
> > Ok, so what have I to do to write a back-end where all addresses are
> > given in bits? Memory is addressed in bits, not bytes. So I set:
> >
> > #define BITS_PER_UNIT 1
> > #define UNITS_PER_WORD 32
>
> I don't know if it's useful to define the size of a byte to be less than
> 8 bits, even if that more accurately reflects the hardware. Standard C
> requires that the char type both be at least 8 bits (UCHAR_MAX >= 255)
> and the same size as a byte (sizeof(char) == 1). You can't define any
> types that are smaller than a char and have sizeof work correctly.

In theory GCC supports CHAR_TYPE_SIZE > BITS_PER_UNIT, so sizeof(char) is still 1 (sizeof counts in units of CHAR_TYPE_SIZE not BITS_PER_UNIT) but a char is not the hardware addressing unit.
I expect this is even more broken in practice than BITS_PER_UNIT > 8. -- Joseph S. Myers joseph@codesourcery.com From davem@davemloft.net Sat Dec 8 02:21:00 2007 From: davem@davemloft.net (David Miller) Date: Sat, 08 Dec 2007 02:21:00 -0000 Subject: Git and GCC In-Reply-To: References: <9e4733910712062310s30153afibc44a5550fd9ea99@mail.gmail.com> <20071207.045329.204650714.davem@davemloft.net> Message-ID: <20071207.175529.104353710.davem@davemloft.net> From: Linus Torvalds Date: Fri, 7 Dec 2007 09:23:47 -0800 (PST) > > > On Fri, 7 Dec 2007, David Miller wrote: > > > > Also I could end up being performance limited by SHA, it's not very > > well tuned on Sparc. It's been on my TODO list to code up the crypto > > unit support for Niagara-2 in the kernel, then work with Herbert Xu on > > the userland interfaces to take advantage of that in things like > > libssl. Even a better C/asm version would probably improve GIT > > performance a bit. > > I doubt yu can use the hardware support. Kernel-only hw support is > inherently broken for any sane user-space usage, the setup costs are just > way way too high. To be useful, crypto engines need to support direct user > space access (ie a regular instruction, with all state being held in > normal registers that get saved/restored by the kernel). Unfortunately they are hypervisor calls, and you have to give the thing physical addresses for the buffer to work on, so letting userland get at it directly isn't currently doable. I still believe that there are cases where userland can take advantage of in-kernel crypto devices, such as when we are streaming the data into the kernel anyways (for a write() or sendmsg()) and the user just wants the transformation to be done on that stream. As a specific case, hardware crypto SSL support works quite well for sendmsg() user packet data. And this the kind of API Solaris provides to get good SSL performance with Niagara. > > Is SHA a significant portion of the compute during these repacks? 
> > I should run oprofile... > > SHA1 is almost totally insignificant on x86. It hardly shows up. But we > have a good optimized version there. Ok. > zlib tends to be a lot more noticeable (especially the uncompression: it > may be faster than compression, but it's done _so_ much more that it > totally dominates). zlib is really hard to optimize on Sparc, I've tried numerous times. Actually compress is the real cycle killer, and in that case the inner loop wants to dereference 2-byte shorts at a time but they are unaligned half of the time, and any the check for alignment nullifies the gains of avoiding the two byte loads. Uncompress I don't think is optimized at all on any platform with asm stuff like the compress side is. It's a pretty straightforward transformation and the memory accesses dominate the overhead. I'll do some profiling to see what might be worth looking into. From jcpiza@gmail.com Sat Dec 8 03:51:00 2007 From: jcpiza@gmail.com (J.C. Pizarro) Date: Sat, 08 Dec 2007 03:51:00 -0000 Subject: Git and GCC Message-ID: <998d0e4a0712071821o520a75c4lbcaae92256071f48@mail.gmail.com> On 2007/12/07, "Linus Torvalds" wrote: > On Fri, 7 Dec 2007, David Miller wrote: > > > > Also I could end up being performance limited by SHA, it's not very > > well tuned on Sparc. It's been on my TODO list to code up the crypto > > unit support for Niagara-2 in the kernel, then work with Herbert Xu on > > the userland interfaces to take advantage of that in things like > > libssl. Even a better C/asm version would probably improve GIT > > performance a bit. > > I doubt yu can use the hardware support. Kernel-only hw support is > inherently broken for any sane user-space usage, the setup costs are just > way way too high. To be useful, crypto engines need to support direct user > space access (ie a regular instruction, with all state being held in > normal registers that get saved/restored by the kernel). > > > Is SHA a significant portion of the compute during these repacks? 
> > I should run oprofile... > > SHA1 is almost totally insignificant on x86. It hardly shows up. But we > have a good optimized version there. If SHA1 is slow then why dont he contribute adding Haval160 (3 rounds) that it's faster than SHA1? And to optimize still more it with SIMD instructions in kernelspace and userland. > > zlib tends to be a lot more noticeable (especially the uncompression: it > may be faster than compression, but it's done _so_ much more that it > totally dominates). > > Linus It's better 1. "Don't compress this repo but compact this uncompressed repo using minimal spanning forest and deltas" 2. "After, compress this whole repo with LZMA (e.g. 48MiB) from 7zip before burning it to DVD for backup reasons or before replicating it to internet". J.C.Pizarro "the noiser" From hailijuan@gmail.com Sat Dec 8 04:39:00 2007 From: hailijuan@gmail.com (Lijuan Hai) Date: Sat, 08 Dec 2007 04:39:00 -0000 Subject: Howto make another convertion with _identifiers_ following '#' in libcpp Message-ID: <48353bf60712071950k2c273b33l6529082b757444f6@mail.gmail.com> Hi all, I have a plan to convert UCN to alphabet instead of UTF8 in GCC-4.2.0, and already handled it in libcpp. But I encountered a problem when compiling the code like following: -------------------cut------------------- 1: #define str(t) #t 2: int foo() 3: { 4: char* cc = str(\u1234); 5: if (!strcmp(cc, "\u1234")) 6: abort(); 7: } -------------------cut------------------- With my changes, \u1234 is converted to alphabet in line 4 while kept in line 5. It's incorrect and also unexpected to convert it in line 4 for '#' makes it different from plain identifiers. So how could I catch the case and prevent converting it to alphabet? I believe there's someway in libcpp to handle it well. Anyone familiar with libcpp processing? Thanks in advance. Nice weekends. -- Best wishes! 
Yours, Lijuan Hai _ _ (_)(_) (,,) =()= ((__)\ _|L\_______/ From zackw@panix.com Sat Dec 8 06:12:00 2007 From: zackw@panix.com (Zack Weinberg) Date: Sat, 08 Dec 2007 06:12:00 -0000 Subject: Howto make another convertion with _identifiers_ following '#' in libcpp Message-ID: Lijuan Hai wrote: > > I have a plan to convert UCN to alphabet instead of UTF8 in > GCC-4.2.0, and already handled it in libcpp. I would like to offer advice, but I don't understand what you are trying to do. You say you want to "convert UCN[s] to [an] alphabet instead of UTF8" but that doesn't make any sense. Alphabets are abstract sets of glyphs commonly used to write a language. They are not alternatives to UTF8 (a scheme for encoding integers as sequences of bytes) or even to Unicode (a mapping from integers to glyphs). The only thing I can guess is that you want to convert UCNs to some specific character set other than Unicode, like EUC-JP or ISO8859.n. In that case the first thing I must ask you is to read up on the -fexec-charset option, and to explain why that doesn't do what you need it to do. > But I encountered a problem when compiling the code like following: > -------------------cut------------------- > 1: #define str(t) #t > 2: int foo() > 3: { > 4: char* cc = str(\u1234); > 5: if (!strcmp(cc, "\u1234")) > 6: abort(); > 7: } > -------------------cut------------------- > With my changes, \u1234 is converted to alphabet in line 4 while > kept in line 5. It's incorrect and also unexpected to convert it in > line 4 for '#' makes it different from plain identifiers. As I don't know what you mean by "converted to alphabet", I can't say for sure, but if I had to guess, I'd say you inserted your code into the routines for scanning identifiers? But at that point there is no way to know that there is a '#' in effect. 
You need to postpone the conversion, whatever it is, until much later; the point where cpplib hands off identifiers to the compiler proper, or perhaps even the assembly output macros, depending on your goal. (Have you read the long comment at the top of libcpp/charset.c? Do you understand all of the fine distinctions made there?) zw From jcpiza@gmail.com Sat Dec 8 12:01:00 2007 From: jcpiza@gmail.com (J.C. Pizarro) Date: Sat, 08 Dec 2007 12:01:00 -0000 Subject: The Regents of the University of California BSD-license in GPLed GCC. Message-ID: <998d0e4a0712080331r58c2ac83mc66ce54130e9c2ca@mail.gmail.com> In GPLed GCC-4.1 branch appears a notice of BSD license gcc/config/i386/gmon-sol2.c * Copyright (c) 1991 The Regents of the University of California. * All rights reserved. ... J.C.Pizarro sincerely ;) From Johannes.Schindelin@gmx.de Sat Dec 8 12:24:00 2007 From: Johannes.Schindelin@gmx.de (Johannes Schindelin) Date: Sat, 08 Dec 2007 12:24:00 -0000 Subject: Git and GCC In-Reply-To: <4aca3dc20712071533k3189d25dp901c5941e5326ead@mail.gmail.com> References: <20071206.193121.40404287.davem@davemloft.net> <20071207063848.GA13101@coredump.intra.peff.net> <9e4733910712062310s30153afibc44a5550fd9ea99@mail.gmail.com> <20071207.045329.204650714.davem@davemloft.net> <4759AC8E.3070102@develer.com> <1197069298.6118.1.camel@ozzu> <4aca3dc20712071533k3189d25dp901c5941e5326ead@mail.gmail.com> Message-ID: Hi, On Fri, 7 Dec 2007, Daniel Berlin wrote: > On 12/7/07, Giovanni Bajo wrote: > > On Fri, 2007-12-07 at 14:14 -0800, Jakub Narebski wrote: > > > > > > >> Is SHA a significant portion of the compute during these > > > > >> repacks? I should run oprofile... > > > > > SHA1 is almost totally insignificant on x86. It hardly shows up.
> > > > > But we have a good optimized version there. zlib tends to be a > > > > > lot more noticeable (especially the *uncompression*: it may be > > > > > faster than compression, but it's done _so_ much more that it > > > > > totally dominates). > > > > > > > > Have you considered alternatives, like: > > > > http://www.oberhumer.com/opensource/ucl/ > > > > > > > > > As compared to LZO, the UCL algorithms achieve a better > > > compression ratio but *decompression* is a little bit slower. See > > > below for some rough timings. > > > > > > > > > It is uncompression speed that is more important, because it is used > > > much more often. > > > > I know, but the point is not what is the fastestest, but if it's fast > > enough to get off the profiles. I think UCL is fast enough since it's > > still times faster than zlib. Anyway, LZO is GPL too, so why not > > considering it too. They are good libraries. > > > At worst, you could also use fastlz (www.fastlz.org), which is faster > than all of these by a factor of 4 (and compression wise, is actually > sometimes better, sometimes worse, than LZO). fastLZ is awfully short on details when it comes to a comparison of the resulting file sizes. The only result I saw was that for the (single) example they chose, compressed size was 470MB as opposed to 361MB for zip's _fastest_ mode. Really, that's not acceptable for me in the context of git. Besides, if you change the compression algorithm you will have to add support for legacy clients to _recompress_ with libz. Which most likely would make Sisyphos grin watching them servers. Ciao, Dscho From Johannes.Schindelin@gmx.de Sat Dec 8 15:06:00 2007 From: Johannes.Schindelin@gmx.de (Johannes Schindelin) Date: Sat, 08 Dec 2007 15:06:00 -0000 Subject: Git and GCC In-Reply-To: <998d0e4a0712071821o520a75c4lbcaae92256071f48@mail.gmail.com> References: <998d0e4a0712071821o520a75c4lbcaae92256071f48@mail.gmail.com> Message-ID: Hi, On Sat, 8 Dec 2007, J.C. 
Pizarro wrote: > On 2007/12/07, "Linus Torvalds" wrote: > > > SHA1 is almost totally insignificant on x86. It hardly shows up. But > > we have a good optimized version there. > > If SHA1 is slow then why dont he contribute adding Haval160 (3 rounds) > that it's faster than SHA1? And to optimize still more it with SIMD > instructions in kernelspace and userland. He said SHA-1 is insignificant. > > zlib tends to be a lot more noticeable (especially the uncompression: > > it may be faster than compression, but it's done _so_ much more that > > it totally dominates). > > It's better > > 1. "Don't compress this repo but compact this uncompressed repo > using minimal spanning forest and deltas" > 2. "After, compress this whole repo with LZMA (e.g. 48MiB) from 7zip before > burning it to DVD for backup reasons or before replicating it to > internet". Patches? ;-) Ciao, Dscho From hariharans@picochip.com Sat Dec 8 18:33:00 2007 From: hariharans@picochip.com (Hariharan Sandanagobalane) Date: Sat, 08 Dec 2007 18:33:00 -0000 Subject: VLIW scheduling and delayed branch Message-ID: <475AB2D4.8060409@picochip.com> Hi, I am trying to enable delayed branch scheduling on our port of GCC for picochip (16-bit VLIW DSP). I understand that delayed-branch is run as a separate pass after the DFA scheduling is done. We basically depend on the TImode set on the cycle-start instructions to decide which instructions form a valid VLIW. With delayed-branch enabled, the delay-branch pass seems to take any instruction and put it in the delay slot. It sometimes seems to pick instructions that have TImode set, but does not seem to set TImode on the next instruction. Has anyone faced a similar problem before? Are there targets for which both VLIW and DBR are enabled? Perhaps ia64? Thanks for your help.
Regards Hari From toon@moene.indiv.nluug.nl Sat Dec 8 19:45:00 2007 From: toon@moene.indiv.nluug.nl (Toon Moene) Date: Sat, 08 Dec 2007 19:45:00 -0000 Subject: The Regents of the University of California BSD-license in GPLed GCC. In-Reply-To: <998d0e4a0712080331r58c2ac83mc66ce54130e9c2ca@mail.gmail.com> References: <998d0e4a0712080331r58c2ac83mc66ce54130e9c2ca@mail.gmail.com> Message-ID: <475AE377.4020008@moene.indiv.nluug.nl> J.C. Pizarro wrote: > In GPLed GCC-4.1 branch appears a notice of BSD license > gcc/config/i386/gmon-sol2.c > > * Copyright (c) 1991 The Regents of the University of California. > * All rights reserved. No doubt. And in the mean time I'm listening to: Title: Bach: Prelude and Fugue in C, BWV 531 Artist: Darren L. Slider Genre: Classical Someone who performs the above Bach prelude and fugue on a house organ. Good enough for those who appreciate Bach, whatever, and not good enough for those who prefer a church organ. If you need a church organ, turn to the Regents of the Church of California. -- Toon Moene - e-mail: toon@moene.indiv.nluug.nl - phone: +31 346 214290 Saturnushof 14, 3738 XG Maartensdijk, The Netherlands At home: http://moene.indiv.nluug.nl/~toon/ GNU Fortran's path to Fortran 2003: http://gcc.gnu.org/wiki/Fortran2003 From Joe.Buck@synopsys.COM Sat Dec 8 19:54:00 2007 From: Joe.Buck@synopsys.COM (Joe Buck) Date: Sat, 08 Dec 2007 19:54:00 -0000 Subject: The Regents of the University of California BSD-license in GPLed GCC. In-Reply-To: <998d0e4a0712080331r58c2ac83mc66ce54130e9c2ca@mail.gmail.com> References: <998d0e4a0712080331r58c2ac83mc66ce54130e9c2ca@mail.gmail.com> Message-ID: <20071208194506.GA4731@synopsys.com> On Sat, Dec 08, 2007 at 12:31:43PM +0100, J.C. Pizarro wrote: > In GPLed GCC-4.1 branch appears a notice of BSD license > gcc/config/i386/gmon-sol2.c > > * Copyright (c) 1991 The Regents of the University of California. > * All rights reserved. And why are you sending this to both gcc and gcc-help? 
This is known, it is not news, and it is not a problem. From Joe.Buck@synopsys.COM Sat Dec 8 20:28:00 2007 From: Joe.Buck@synopsys.COM (Joe Buck) Date: Sat, 08 Dec 2007 20:28:00 -0000 Subject: Git and GCC In-Reply-To: References: <998d0e4a0712071821o520a75c4lbcaae92256071f48@mail.gmail.com> Message-ID: <20071208195352.GB4731@synopsys.com> On Sat, 8 Dec 2007, J.C. Pizarro wrote: > > 1. "Don't compress this repo but compact this uncompressed repo > > using minimal spanning forest and deltas" > > 2. "After, compress this whole repo with LZMA (e.g. 48MiB) from 7zip before > > burning it to DVD for backup reasons or before replicating it to > > internet". On Sat, Dec 08, 2007 at 12:24:00PM +0000, Johannes Schindelin wrote: > Patches? ;-) git list, meet J.C. Pizarro. Care to take him off of our hands for a while? He's been hanging on the gcc list for some time, and perhaps seeks new horizons. Mr. Pizarro has endless ideas, and he'll give you some new ones every day. He thinks that no one else knows any computer science, and he will attempt to teach you what he knows, and tell you to rewrite all of your code based on something he read and half-understood. But he's not interested in actually DOING the work, mind you; that's up to you. When you object that he's wasting your time, he'll start talking about freedom of speech. From mcostalba@gmail.com Sat Dec 8 20:49:00 2007 From: mcostalba@gmail.com (Marco Costalba) Date: Sat, 08 Dec 2007 20:49:00 -0000 Subject: Git and GCC In-Reply-To: <20071208195352.GB4731@synopsys.com> References: <998d0e4a0712071821o520a75c4lbcaae92256071f48@mail.gmail.com> <20071208195352.GB4731@synopsys.com> Message-ID: On Dec 8, 2007 8:53 PM, Joe Buck wrote: > > Mr. Pizarro has endless ideas, and he'll give you some new ones every day. That's true. > He thinks that no one else knows any computer science, and he will attempt > to teach you what he knows, It's not the only one ;-) is in good and numerous company. 
> But he's not interested in > actually DOING the work, mind you; that's up to you. Where did have you read this ? I missed that part. > When you object > that he's wasting your time, he'll start talking about freedom of speech. > Actually he never spoke like that (probably I missed that part too). Thanks Marco From sailer@ife.ee.ethz.ch Sat Dec 8 23:21:00 2007 From: sailer@ife.ee.ethz.ch (Thomas Sailer) Date: Sat, 08 Dec 2007 23:21:00 -0000 Subject: VLIW scheduling and delayed branch In-Reply-To: <475AB2D4.8060409@picochip.com> References: <475AB2D4.8060409@picochip.com> Message-ID: <1197146980.4613.13.camel@unreal.localdomain> > Has anyone faced a similar problem before? Are there targets for which > both VLIW and DBR are enabled? Perhaps ia64? I did something similar a few months ago. The problem is that haifa and the delayed branch scheduling passes don't really fit together. delayed branch scheduling happily undoes all the haifa decisions. The question is how much you gain by delayed branch scheduling. I don't have numbers, but it wasn't much in my case. And since your company name is picochip, you certainly value size more than speed ?! I pursued two approaches. The first one was to insert "stop bit" pseudo insns into the RTL stream in machdep reorg, so I didn't have to rely on TImode insn flags during output. But then delayed branch scheduling just took one insn out of an insn group and put it into the delay slot, meaning there was usually no cycle gain at all, just larger code size (due to insn duplication). The second approach was having lots of parallel insns (using match parallel and a custom predicate). machdep reorg then converts insn bundles into a single parallel insn. Delayed branch scheduling then does the right thing. This approach works fairly well for me, but there are a few complications. My output code is pretty hackish, as I didn't want to duplicate outputing a single insn / outputing the same insn as component of a parallel insn group. 
Tom From ludovic@ludovic-brenta.org Sun Dec 9 01:51:00 2007 From: ludovic@ludovic-brenta.org (Ludovic Brenta) Date: Sun, 09 Dec 2007 01:51:00 -0000 Subject: gnat1 huge time In-Reply-To: Message-ID: <8763z87oyl.fsf@ludovic-brenta.org> Having observed the bug while building a native, SJLJ version of libgnat on x86_64, I have filed PR ada/34400 for this. -- Ludovic Brenta. From dberlin@dberlin.org Sun Dec 9 07:02:00 2007 From: dberlin@dberlin.org (Daniel Berlin) Date: Sun, 09 Dec 2007 07:02:00 -0000 Subject: Git and GCC In-Reply-To: References: <998d0e4a0712071821o520a75c4lbcaae92256071f48@mail.gmail.com> <20071208195352.GB4731@synopsys.com> Message-ID: <4aca3dc20712081751v6c6a7c84w40d093bcac93a2bb@mail.gmail.com> > > Where did have you read this ? I missed that part. > > > When you object > > that he's wasting your time, he'll start talking about freedom of speech. > > > > Actually he never spoke like that (probably I missed that part too). > > Read gcc mailing list archives, if you have a lot of time on your hands. From ERES@il.ibm.com Sun Dec 9 08:55:00 2007 From: ERES@il.ibm.com (Revital1 Eres) Date: Sun, 09 Dec 2007 08:55:00 -0000 Subject: [RFC/RFT] Improving SMS by data dependence export In-Reply-To: Message-ID: Hi Alexander, > We would like to ask people interested in SMS performance on PowerPC and > Cell SPU to conduct tests with this patch. Any feedback is greatly > appreciated. I intend to perform testing with this patch (on ppc and SPU), after resolving the miscompilation issues mentioned above. Thanks, Revital From bviyer@ncsu.edu Sun Dec 9 13:07:00 2007 From: bviyer@ncsu.edu (Balaji V. Iyer) Date: Sun, 09 Dec 2007 13:07:00 -0000 Subject: Help with another constraint Message-ID: <000c01c83a41$4499e240$33160e98@ece.ncsu.edu> Hello Everyone, I am trying to partition register files in GCC port of Opencores (OPENRISC 1000). 
It is currently failing the following constraint in negdi2

(insn 15 13 16 (set (mem:SI (plus:SI (reg/f:SI 2 r2)
                (const_int -28 [0xffffffe4])) [0 D.1256+0 S4 A32])
        (neg:SI (reg:SI 3 r3 [orig:80 D.1255 ] [80]))) 38 {negsi2} (nil)
    (nil))
../../gcc-4.0.2/gcc/libgcc2.c:72: internal compiler error: in final_scan_insn, at final.c:2439
Please submit a full bug report,

REGISTER R2 is the frame pointer!! I think this is because of the way I am handling the frame pointer...This is how I do it in the or32.h file

#define FRAME_POINTER_REGNUM 2
#define FRAME_POINTER_REQUIRED 0

#define INITIAL_FRAME_POINTER_OFFSET(DEPTH) \
{ int regno; \
  int offset = 0; \
  for( regno=0; regno < FIRST_PSEUDO_REGISTER; regno++ ) \
    if( regs_ever_live[regno] && !call_used_regs[regno] ) \
      offset += 4; \
  (DEPTH) = (!current_function_is_leaf || regs_ever_live[LINK_REGNUM] ? 4 : 0) + \
            (frame_pointer_needed ? 4 : 0) + \
            offset + \
            OR32_ALIGN(current_function_outgoing_args_size,4) + \
            OR32_ALIGN(get_frame_size(),4); \
}

#define FIX_FRAME_POINTER_ADDRESS(ADDR,DEPTH) \
{ int offset = -1; \
  rtx regs = stack_pointer_rtx; \
  if (ADDR == frame_pointer_rtx) \
    offset = 0; \
  else if (GET_CODE (ADDR) == PLUS && XEXP (ADDR, 1) == frame_pointer_rtx \
           && GET_CODE (XEXP (ADDR, 0)) == CONST_INT) \
    offset = INTVAL (XEXP (ADDR, 0)); \
  else if (GET_CODE (ADDR) == PLUS && XEXP (ADDR, 0) == frame_pointer_rtx \
           && GET_CODE (XEXP (ADDR, 1)) == CONST_INT) \
    offset = INTVAL (XEXP (ADDR, 1)); \
  else if (GET_CODE (ADDR) == PLUS && XEXP (ADDR, 0) == frame_pointer_rtx) \
    { rtx other_reg = XEXP (ADDR, 1); \
      offset = 0; \
      regs = gen_rtx (PLUS, Pmode, stack_pointer_rtx, other_reg); } \
  else if (GET_CODE (ADDR) == PLUS && XEXP (ADDR, 1) == frame_pointer_rtx) \
    { rtx other_reg = XEXP (ADDR, 0); \
      offset = 0; \
      regs = gen_rtx (PLUS, Pmode, stack_pointer_rtx, other_reg); } \
  if (offset >= 0) \
    { int regno; \
      extern char call_used_regs[]; \
      offset += 4; /* I don't know why??? */ \
      for (regno = 0; regno < FIRST_PSEUDO_REGISTER; regno++) \
        if (regs_ever_live[regno] && ! call_used_regs[regno]) \
          offset += 4; \
      ADDR = plus_constant (regs, offset + (DEPTH)); } }

What am I doing wrong?? Any help is highly highly appreciated!

Yours Sincerely,

Balaji V. Iyer.

--
Balaji V. Iyer
PhD Student,
Center for Efficient, Scalable and Reliable Computing,
Department of Electrical and Computer Engineering,
North Carolina State University.

From rask@sygehus.dk Sun Dec 9 14:28:00 2007
From: rask@sygehus.dk (Rask Ingemann Lambertsen)
Date: Sun, 09 Dec 2007 14:28:00 -0000
Subject: Help with another constraint
In-Reply-To: <000c01c83a41$4499e240$33160e98@ece.ncsu.edu>
References: <000c01c83a41$4499e240$33160e98@ece.ncsu.edu>
Message-ID: <20071209130740.GI17368@sygehus.dk>

On Sun, Dec 09, 2007 at 03:55:36AM -0500, Balaji V. Iyer wrote:
> Hello Everyone,
> I am trying to partition register files in GCC port of Opencores
> (OPENRISC 1000). It is currently failing the following constraint in
> negdi2
>
> (insn 15 13 16 (set (mem:SI (plus:SI (reg/f:SI 2 r2)
                       ^^^
>                 (const_int -28 [0xffffffe4])) [0 D.1256+0 S4 A32])
>         (neg:SI (reg:SI 3 r3 [orig:80 D.1255 ] [80]))) 38 {negsi2} (nil)
>     (nil))
> ../../gcc-4.0.2/gcc/libgcc2.c:72: internal compiler error: in
> final_scan_insn, at final.c:2439
> Please submit a full bug report,

+(define_insn "negsi2"
+  [(set (match_operand:SI 0 "register_operand" "=r")
                              ^^^^^^^^^^^^^^^^
+        (neg:SI (match_operand:SI 1 "register_operand" "r")))]
+  ""
+  "l.sub \t%0,r0,%1"
+  [(set_attr "type" "add")
+   (set_attr "length" "1")])

How did that happen? Look at the dump files.

Btw, what is the error message above the insn dump?
-- Rask Ingemann Lambertsen Danish law requires addresses in e-mail to be logged and stored for a year From suj.ranga@gmail.com Sun Dec 9 16:35:00 2007 From: suj.ranga@gmail.com (S.Reng) Date: Sun, 09 Dec 2007 16:35:00 -0000 Subject: Fwd: Cross compiler build stops In-Reply-To: <7a2580440712081952w241fd7e4u74a2304e2a83622a@mail.gmail.com> References: <7a2580440712081952w241fd7e4u74a2304e2a83622a@mail.gmail.com> Message-ID: <7a2580440712090628k5de74662jb1e2daed95fa3e26@mail.gmail.com> Hi, I am using latest cygwin to build cross gcc for linux. I am using crosstool-0.43 with demo-i686.sh modified to have as given below. I dont know if I am right in this config. or any this else is need, but build stop with attached error. I have also posted the question in crossgcc group. As seen I have latest ld in cygwin. Thanks a lot. in demo-i686.sh eval `cat i686.dat gcc-4.2.2-glibc-2.7-tls.dat` sh all.sh --notest --nounpack gcc-4.2.2-glibc-2.7-tls.dat IS: BINUTILS_DIR=binutils-2.18 GCC_CORE_DIR=gcc-4.2.2 GCC_DIR=gcc-4.2.2 GLIBC_DIR=glibc-2.7 LINUX_DIR=linux-2.6.23.9 LINUX_SANITIZED_HEADER_DIR=linux-libc-headers-2.6.12.0 GLIBCTHREADS_FILENAME=glibc-linuxthreads-2.5 GDB_DIR=gdb-6.7.1 But the script stops + /cygdrive/e/crosstool-0.43/build/i686-unknown-linux-gnu/gcc-4.2.2-glibc-2.7/glibc-2.7/configure --prefix=/usr --build=i686-pc-cygwin --host=i686-unknown linux-gnu --without-cvs --disable-sanity-checks --with-headers=/cygdrive/e/crosstool/gcc- 4.2.2-glibc-2.7/i686-unknown-linux-gnu/i686-unknown-linux-gnu/inc ude --enable-hacker-mode checking build system type... i686-pc-cygwin checking host system type... i686-unknown-linux-gnu configure: running configure fragment for add-on linuxthreads linuxthreads disabled because nptl add-on is also in use configure: running configure fragment for add-on nptl checking sysdep dirs... 
sysdeps/i386/elf nptl/sysdeps/unix/sysv/linux/i386/i686 nptl/sysdeps/unix/sysv/linux/i386 sysdeps/unix/sysv/linux/i386 nptl/sysdep /unix/sysv/linux nptl/sysdeps/pthread sysdeps/pthread sysdeps/unix/sysv/linux sysdeps/gnu sysdeps/unix/common sysdeps/unix/mman sysdeps/unix/inet sysdeps/ nix/sysv/i386 nptl/sysdeps/unix/sysv sysdeps/unix/sysv sysdeps/unix/i386 nptl/sysdeps/unix sysdeps/unix sysdeps/posix sysdeps/i386/i686/fpu nptl/sysdeps/i 86/i686 sysdeps/i386/i686 sysdeps/i386/i486 nptl/sysdeps/i386/i486 sysdeps/i386/fpu nptl/sysdeps/i386 sysdeps/i386 sysdeps/wordsize-32 sysdeps/ieee754/ldb -96 sysdeps/ieee754/dbl-64 sysdeps/ieee754/flt-32 sysdeps/ieee754 sysdeps/generic/elf sysdeps/generic checking for a BSD-compatible install... /usr/bin/install -c checking whether ln -s works... yes checking for i686-unknown-linux-gnu-gcc... gcc checking for suffix of object files... o checking whether we are using the GNU C compiler... yes checking whether gcc accepts -g... yes checking for gcc option to accept ISO C89... none needed checking for gcc... gcc checking how to run the C preprocessor... gcc -E checking for i686-unknown-linux-gnu-g++... no checking for i686-unknown-linux-gnu-c++... no checking for i686-unknown-linux-gnu-gpp... no checking for i686-unknown-linux-gnu-aCC... no checking for i686-unknown-linux-gnu-CC... no checking for i686-unknown-linux-gnu-cxx... no checking for i686-unknown-linux-gnu-cc++... no checking for i686-unknown-linux-gnu-cl.exe... no checking for i686-unknown-linux-gnu-FCC... no checking for i686-unknown-linux-gnu-KCC... no checking for i686-unknown-linux-gnu-RCC... no checking for i686-unknown-linux-gnu-xlC_r... no checking for i686-unknown-linux-gnu-xlC... no checking for g++... g++ configure: WARNING: In the future, Autoconf will not detect cross-tools whose name does not start with the host triplet. If you think this configuration is useful to you, please write to autoconf@gnu.org. checking whether we are using the GNU C++ compiler... 
yes checking whether g++ accepts -g... yes checking whether /usr/lib/gcc/i686-pc-cygwin/3.4.4/../../../../i686-pc-cygwin/bin/as.exe is GNU as... yes checking whether /usr/lib/gcc/i686-pc-cygwin/3.4.4/../../../../i686-pc-cygwin/bin/ld.exe is GNU ld... yes checking for /usr/lib/gcc/i686-pc-cygwin/3.4.4/../../../../i686-pc-cygwin/bin/as.exe... /usr/lib/gcc/i686-pc-cygwin/3.4.4/../../../../i686-pc-cygwin/bin/a .exe checking version of /usr/lib/gcc/i686-pc-cygwin/3.4.4/../../../../i686-pc-cygwin/bin/as.exe... 2.17.50, ok checking for /usr/lib/gcc/i686-pc-cygwin/3.4.4/../../../../i686-pc-cygwin/bin/ld.exe... /usr/lib/gcc/i686-pc-cygwin/3.4.4/../../../../i686-pc-cygwin/bin/l .exe checking version of /usr/lib/gcc/i686-pc-cygwin/3.4.4/../../../../i686-pc-cygwin/bin/ld.exe... 2.17.50, ok checking for pwd... /usr/bin/pwd checking for i686-unknown-linux-gnu-gcc... (cached) gcc checking version of gcc... 3.4.4, ok checking for gnumake... no checking for gmake... no checking for make... make checking version of make... 3.81, ok checking for gnumsgfmt... no checking for gmsgfmt... no checking for msgfmt... msgfmt checking version of msgfmt... 0.15, ok checking for makeinfo... makeinfo checking version of makeinfo... 4.8, ok checking for sed... sed checking version of sed... 4.1.5, ok checking for autoconf... autoconf checking whether autoconf works... yes checking whether ranlib is necessary... yes checking LD_LIBRARY_PATH variable... ok checking whether GCC supports -static-libgcc... -static-libgcc checking for bash... /usr/bin/bash checking for gawk... gawk checking for perl... /usr/bin/perl checking for install-info... /usr/bin/install-info checking for bison... /usr/bin/bison checking for signed size_t type... no checking for libc-friendly stddef.h... yes checking whether we need to use -P to assemble .S files... no checking whether .text pseudo-op must be used... yes checking for assembler global-symbol directive... .globl checking for .set assembler directive... 
no checking for assembler .type directive prefix... no checking for .symver assembler directive... no checking for ld --version-script... no *** WARNING: You should not compile GNU libc without versioning. Not using *** versioning will introduce incompatibilities so that old binaries *** will not run anymore. *** For versioning you need recent binutils (binutils-2.8.1.0.23 or newer). checking for .previous assembler directive... no checking for .popsection assembler directive... no checking for .protected and .hidden assembler directive... configure: error: assembler support for symbol visibility is required # my system is $ uname -a CYGWIN_NT-5.1 D3DTQM51 1.5.24(0.156/4/2) 2007-01-31 10:57 i686 Cygwin with $ ld -v GNU ld version 2.17.50 20060817 Thanks a lot, Suj.Renga From bviyer@ncsu.edu Sun Dec 9 22:06:00 2007 From: bviyer@ncsu.edu (Balaji V. Iyer) Date: Sun, 09 Dec 2007 22:06:00 -0000 Subject: Help with another constraint In-Reply-To: <20071209130740.GI17368@sygehus.dk> References: <000c01c83a41$4499e240$33160e98@ece.ncsu.edu> <20071209130740.GI17368@sygehus.dk> Message-ID: <000d01c83a81$85462ca0$33160e98@ece.ncsu.edu> Hello Rask, I am not understanding your response, can you clarify it for me? As per the question about the error message above? ../../gcc-4.0.2/gcc/libgcc2.c -o libgcc/./_negdi2.o ../../gcc-4.0.2/gcc/libgcc2.c: In function '__negdi2': ../../gcc-4.0.2/gcc/libgcc2.c:72: error: insn does not satisfy its constraints: (insn 15 13 16 (set (mem:SI (plus:SI (reg/f:SI 2 r2) (const_int -28 [0xffffffe4])) [0 D.1256+0 S4 A32]) (neg:SI (reg:SI 3 r3 [orig:80 D.1255 ] [80]))) 38 {negsi2} (nil) (nil)) ../../gcc-4.0.2/gcc/libgcc2.c:72: internal compiler error: in final_scan_insn, at final.c:2439 Please submit a full bug report, with preprocessed source if appropriate. See for instructions. make[2]: *** [libgcc/./_negdi2.o] Error 1 make[2]: Leaving directory -Balaji V. Iyer. -- Balaji V. 
Iyer PhD Student, Center for Efficient, Scalable and Reliable Computing, Department of Electrical and Computer Engineering, North Carolina State University. -----Original Message----- From: Rask Ingemann Lambertsen [mailto:rask@sygehus.dk] Sent: Sunday, December 09, 2007 8:08 AM To: Balaji V. Iyer Cc: gcc@gcc.gnu.org; openrisc@opencores.org Subject: Re: Help with another constraint On Sun, Dec 09, 2007 at 03:55:36AM -0500, Balaji V. Iyer wrote: > Hello Everyone, > I am trying to partition register files in GCC port of Opencores > (OPENRISC 1000). It is currently failing the following constraint in > negdi2 > > (insn 15 13 16 (set (mem:SI (plus:SI (reg/f:SI 2 r2) ^^^ > (const_int -28 [0xffffffe4])) [0 D.1256+0 S4 A32]) > (neg:SI (reg:SI 3 r3 [orig:80 D.1255 ] [80]))) 38 {negsi2} (nil) > (nil)) > ../../gcc-4.0.2/gcc/libgcc2.c:72: internal compiler error: in > final_scan_insn, at final.c:2439 Please submit a full bug report, +(define_insn "negsi2" + [(set (match_operand:SI 0 "register_operand" "=r") ^^^^^^^^^^^^^^^^ + (neg:SI (match_operand:SI 1 "register_operand" "r")))] "" + "l.sub \t%0,r0,%1" + [(set_attr "type" "add") + (set_attr "length" "1")]) How did that happen? Look at the dump files. Btw, what is the error message above the insn dump? -- Rask Ingemann Lambertsen Danish law requires addresses in e-mail to be logged and stored for a year From ghazi@caip.rutgers.edu Sun Dec 9 23:21:00 2007 From: ghazi@caip.rutgers.edu (Kaveh R. GHAZI) Date: Sun, 09 Dec 2007 23:21:00 -0000 Subject: Revisiting GCC's minimum MPFR version Message-ID: As requested by Richard G here: http://gcc.gnu.org/ml/gcc-patches/2007-05/msg00945.html I'm re-visiting during stage3 the minimum MPFR version required by GCC. 
At the time of the above post, mpfr-2.3.0 had not yet been released, but it was this past August, and one can obtain it here: http://www.mpfr.org/mpfr-current The current situation is that GCC requires only mpfr-2.2.0, however it recommends mpfr-2.2.1 in the documentation and configure checks. (If configure finds 2.2.0, it will say something like "buggy but acceptable" and continue bootstrapping). Also, there is some functionality for builtin bessel, remquo and gamma functions that is only active when mpfr-2.3.0 is available. The testcase gcc.dg/torture/builtin-math-4.c for these mpfr-2.3.0 functions is XFAILed at the moment. Our options include: 1. Do nothing. Things work, don't break it. Revisit again in stage1. 2. Continue accepting 2.2.0, but update the recommended version from 2.2.1 to 2.3.0. This would entail updating the configure warning, the docs and removing the XFAIL from the testcase. This option would cause no change in hard bootstrap requirements. 3. In addition to #2, hard fail for anything less than mpfr-2.3.0. I have no strong opinion on which way to go. Thoughts? --Kaveh -- Kaveh R. Ghazi ghazi@caip.rutgers.edu From richard.guenther@gmail.com Mon Dec 10 00:37:00 2007 From: richard.guenther@gmail.com (Richard Guenther) Date: Mon, 10 Dec 2007 00:37:00 -0000 Subject: Revisiting GCC's minimum MPFR version In-Reply-To: References: Message-ID: <84fc9c000712091521w11b4ef73i4418cb873496a579@mail.gmail.com> On Dec 9, 2007 11:05 PM, Kaveh R. GHAZI wrote: > As requested by Richard G here: > http://gcc.gnu.org/ml/gcc-patches/2007-05/msg00945.html > > I'm re-visiting during stage3 the minimum MPFR version required by GCC. > At the time of the above post, mpfr-2.3.0 had not yet been released, but > it was this past August, and one can obtain it here: > http://www.mpfr.org/mpfr-current > > The current situation is that GCC requires only mpfr-2.2.0, however it > recommends mpfr-2.2.1 in the documentation and configure checks.
(If > configure find 2.2.0, it will say something like "buggy but acceptable" > and continue bootstrapping). > > Also, there is some functionality for builtin bessel, remquo and gamma > functions that is only active when mpfr-2.3.0 is available. The testcase > gcc.dg/torture/builtin-math-4.c for these mpfr-2.3.0 functions is XFAILed > at the moment. > > > Our options include: > > 1. Do nothing. Things work, don't break it. Revisit again in stage1. > > 2. Continue accepting 2.2.0, but update the recommended version from > 2.2.1 to 2.3.0. This would entail updating the configure warning, > the docs and the removing the XFAIL from the testcase. This > option would cause no change in hard bootstrap requirements. > > 3. In addtion to #2, hard fail for anything less than mpfr-2.3.0. > > > I have no strong opinion on which way to go. I would update the recommended version to 2.3.0 and fail for anything less than 2.2.1. Richard. From mark@codesourcery.com Mon Dec 10 01:48:00 2007 From: mark@codesourcery.com (Mark Mitchell) Date: Mon, 10 Dec 2007 01:48:00 -0000 Subject: Revisiting GCC's minimum MPFR version In-Reply-To: <84fc9c000712091521w11b4ef73i4418cb873496a579@mail.gmail.com> References: <84fc9c000712091521w11b4ef73i4418cb873496a579@mail.gmail.com> Message-ID: <475C8A24.9060901@codesourcery.com> Richard Guenther wrote: > I would update the recommended version to 2.3.0 and fail for anything less > than 2.2.1. I agree. Not optimizing bessel functions as builtins doesn't bother me too much, but we might as well move past the buggy version. 
Thanks, -- Mark Mitchell CodeSourcery mark@codesourcery.com (650) 331-3385 x713 From gdr@cs.tamu.edu Mon Dec 10 07:25:00 2007 From: gdr@cs.tamu.edu (Gabriel Dos Reis) Date: Mon, 10 Dec 2007 07:25:00 -0000 Subject: Revisiting GCC's minimum MPFR version In-Reply-To: <84fc9c000712091521w11b4ef73i4418cb873496a579@mail.gmail.com> References: <84fc9c000712091521w11b4ef73i4418cb873496a579@mail.gmail.com> Message-ID: <871w9vb953.fsf@soliton.cs.tamu.edu> "Richard Guenther" writes: | I would update the recommended version to 2.3.0 and fail for anything less | than 2.2.1. Yes, that makes sense to me. I don't think we should require 2.3.0. -- Gaby From ashitpro@yahoo.co.in Mon Dec 10 08:44:00 2007 From: ashitpro@yahoo.co.in (ashish mahamuni) Date: Mon, 10 Dec 2007 08:44:00 -0000 Subject: Regarding Message-ID: <684158.87440.qm@web94115.mail.in2.yahoo.com> Hi All, Can I know the list of file names in gcc source where endianness support is implemented Thanks.. Ashish
From ddaney@avtrex.com Mon Dec 10 09:53:00 2007 From: ddaney@avtrex.com (David Daney) Date: Mon, 10 Dec 2007 09:53:00 -0000 Subject: Regarding In-Reply-To: <684158.87440.qm@web94115.mail.in2.yahoo.com> References: <684158.87440.qm@web94115.mail.in2.yahoo.com> Message-ID: <475CFC55.2030806@avtrex.com> ashish mahamuni wrote: > Hi All, > > Can I know the list of file names in gcc source where > endianness support is implemented > > Yes, from the root of the gcc source tree do something like this: grep -r _ENDIAN * | grep -v .svn David Daney From gabriele.svelto@st.com Mon Dec 10 09:54:00 2007 From: gabriele.svelto@st.com (Gabriele SVELTO) Date: Mon, 10 Dec 2007 09:54:00 -0000 Subject: Inserting arbitrary GIMPLE statements & alias analysis Message-ID: <475D0C91.1040006@st.com> Hi everybody, I'm working on a pass for the CLI back-end which 'simplifies' GIMPLE code before entering the tree-ssa passes in order to simplify and improve CLI emission by removing or simplifying nodes which don't have a corresponding straightforward implementation in CLI. The pass runs between pass_lower_eh and pass_build_cfg and replaces some GIMPLE nodes with more-or-less arbitrary GIMPLE. However, a problem has arisen when I replace COMPONENT_REFs accessing bit-fields with explicit load-mask or load-mask-store sequences: it seems that GCC loses track of pointer aliasing. Here's an example from the testsuite (gcc.dg/tree-ssa/alias-14.c compiled with -O2).
The original code is:

struct s
{
  long long a:12;
  long long b:12;
  long long c:40;
};

struct s s, *p = &s;

int
main ()
{
  p->a = 1;
  s.a = 0;
  s.b = 0;
  return p->a + s.b;
}

What gets out of lower_eh is this:

main ()
{
  int D.1519;
  D.1518;
  int D.1517;
  D.1516;
  int D.1515;
  struct s * p.0;

  p.0 = p;
  p.0->a = 1;
  s.a = 0;
  s.b = 0;
  p.0 = p;
  D.1516 = p.0->a;
  D.1517 = (int) D.1516;
  D.1518 = s.b;
  D.1519 = (int) D.1518;
  D.1515 = D.1517 + D.1519;
  goto ;
  :;
  return D.1515;
}

which my pass turns into this:

;; Function main (main)

main ()
{
  struct s * cilsimp.18;
  long long int * cilsimp.17;
  long long int cilsimp.16;
  struct s * cilsimp.15;
  long long int * cilsimp.14;
  long long int cilsimp.13;
  struct s * cilsimp.12;
  long long int * cilsimp.11;
  long long int cilsimp.10;
  long long int cilsimp.9;
  struct s * cilsimp.8;
  long long int * cilsimp.7;
  long long int cilsimp.6;
  long long int cilsimp.5;
  struct s * cilsimp.4;
  long long int * cilsimp.3;
  long long int cilsimp.2;
  long long int cilsimp.1;
  int D.1519;
  D.1518;
  int D.1517;
  D.1516;
  int D.1515;
  struct s * p.0;

  p.0 = p;
  cilsimp.4 = p.0;
  cilsimp.3 = (long long int *) cilsimp.4;
  cilsimp.1 = *cilsimp.3;
  cilsimp.1 = cilsimp.1 & -4096;
  cilsimp.1 = cilsimp.1 | 1;
  *cilsimp.3 = cilsimp.1;
  cilsimp.8 = &s;
  cilsimp.7 = (long long int *) cilsimp.8;
  cilsimp.5 = *cilsimp.7;
  cilsimp.5 = cilsimp.5 & -4096;
  cilsimp.5 = cilsimp.5 | 0;
  *cilsimp.7 = cilsimp.5;
  cilsimp.12 = &s;
  cilsimp.11 = (long long int *) cilsimp.12;
  cilsimp.9 = *cilsimp.11;
  cilsimp.9 = cilsimp.9 & -16773121;
  cilsimp.9 = cilsimp.9 | 0;
  *cilsimp.11 = cilsimp.9;
  p.0 = p;
  cilsimp.15 = p.0;
  cilsimp.14 = (long long int *) cilsimp.15;
  cilsimp.13 = *cilsimp.14;
  cilsimp.13 = cilsimp.13 << 52;
  cilsimp.13 = cilsimp.13 >> 52;
  D.1516 = () cilsimp.13;
  D.1517 = (int) D.1516;
  cilsimp.18 = &s;
  cilsimp.17 = (long long int *) cilsimp.18;
  cilsimp.16 = *cilsimp.17;
  cilsimp.16 = cilsimp.16 << 40;
  cilsimp.16 = cilsimp.16 >> 52;
  D.1518 = () cilsimp.16;
  D.1519 = (int) D.1518;
  D.1515 = D.1517 + D.1519;
  goto ;
  :;
  return D.1515;
}

... and later FRE into this:

main ()
{
  long long int * cilsimp.17;
  long long int cilsimp.16;
  struct s * cilsimp.15;
  long long int * cilsimp.14;
  long long int cilsimp.13;
  long long int * cilsimp.11;
  long long int cilsimp.9;
  long long int * cilsimp.7;
  long long int cilsimp.5;
  struct s * cilsimp.4;
  long long int * cilsimp.3;
  long long int cilsimp.1;
  int D.1519;
  D.1518;
  int D.1517;
  D.1516;
  int D.1515;

  :
  cilsimp.4_1 = p;
  cilsimp.3_3 = (long long int *) cilsimp.4_1;
  cilsimp.1_4 = *cilsimp.3_3;
  cilsimp.1_5 = cilsimp.1_4 & -4096;
  cilsimp.1_6 = cilsimp.1_5 | 1;
  *cilsimp.3_3 = cilsimp.1_6;
  cilsimp.7_8 = (long long int *) &s;
  cilsimp.5_9 = *cilsimp.7_8;
  cilsimp.5_10 = cilsimp.5_9 & -4096;
  cilsimp.5_11 = cilsimp.5_10;
  *cilsimp.7_8 = cilsimp.5_11;
  cilsimp.11_13 = cilsimp.7_8;
  cilsimp.9_14 = cilsimp.5_10;
  cilsimp.9_15 = cilsimp.9_14 & -16773121;
  cilsimp.9_16 = cilsimp.9_15;
  *cilsimp.11_13 = cilsimp.9_16;
  cilsimp.15_17 = cilsimp.4_1;
  cilsimp.14_19 = cilsimp.3_3;
  cilsimp.13_20 = cilsimp.1_6;
  cilsimp.13_21 = cilsimp.13_20 << 52;
  cilsimp.13_22 = cilsimp.13_21 >> 52;
  D.1516_23 = () cilsimp.13_22;
  D.1517_24 = (int) D.1516_23;
  cilsimp.17_26 = cilsimp.7_8;
  cilsimp.16_27 = cilsimp.9_15;
  cilsimp.16_28 = cilsimp.16_27 << 40;
  cilsimp.16_29 = cilsimp.16_28 >> 52;
  D.1518_30 = () cilsimp.16_29;
  D.1519_31 = (int) D.1518_30;
  D.1515_32 = D.1517_24 + D.1519_31;
  return D.1515_32;

}

The problem is that FRE optimizes away the explicit load used for getting the value of p->a and replaces it with the constant value assigned in the first line of main (1). This is wrong because the assignment s.a = 0 overwrites p->a; however, it seems that FRE doesn't realize that the pointers cilsimp.3 and cilsimp.7 are aliases and that the assignment *cilsimp.7 = cilsimp.5; overwrites the value of p->a with 0. I believe I must be doing something horribly wrong which breaks alias analysis, and I'm not sure when this information is built in the first place and how to keep it up to date with the transformed code. Sorry for the long post but I'm really stuck and even looking at the other passes and internal documentation didn't provide many clues about how to deal with this problem.
Gabriele Svelto From paubert@iram.es Mon Dec 10 09:57:00 2007 From: paubert@iram.es (Gabriel Paubert) Date: Mon, 10 Dec 2007 09:57:00 -0000 Subject: Git and GCC In-Reply-To: <1197074839.22471.34.camel@brick> References: <20071205.204848.227521641.davem@davemloft.net> <4aca3dc20712052111o730f6fb6h7a329ee811a70f28@mail.gmail.com> <1196918132.10408.85.camel@brick> <4aca3dc20712052117j3ef5cf99y848d4962ae8ddf33@mail.gmail.com> <9e4733910712052247x116cabb4q48ebafffb93f7e03@mail.gmail.com> <20071206071503.GA19504@coredump.intra.peff.net> <20071206173946.GA10845@sigill.intra.peff.net> <1197074839.22471.34.camel@brick> Message-ID: <20071210095426.GA32611@iram.es> On Fri, Dec 07, 2007 at 04:47:19PM -0800, Harvey Harrison wrote: > Some interesting stats from the highly packed gcc repo. The long chain > lengths very quickly tail off. Over 60% of the objects have a chain > length of 20 or less. If anyone wants the full list let me know. I > also have included a few other interesting points, the git default > depth of 50, my initial guess of 100 and every 10% in the cumulative > distribution from 60-100%. > > This shows the git default of 50 really isn't that bad, and after > about 100 it really starts to get sparse. Do you have a way to know which files have the longest chains? I have a suspicion that the ChangeLog* files are among them, not only because they are, almost without exception, only modified by prepending text to the previous version (and a fairly small amount compared to the size of the file), and therefore the diff is simple (a single hunk) so that the limit on chain depth is probably what causes a new copy to be created.
Besides that these files grow quite large and become some of the largest files in the tree, and at least one of them is changed for every commit. This leads again to many versions of fairly large files. If this guess is right, this implies that most of the size gains from longer chains comes from having less copies of the ChangeLog* files. From a performance point of view, it is rather favourable since the differences are simple. This would also explain why the window parameter has little effect. Regards, Gabriel From ashitpro@yahoo.co.in Mon Dec 10 09:57:00 2007 From: ashitpro@yahoo.co.in (ashish mahamuni) Date: Mon, 10 Dec 2007 09:57:00 -0000 Subject: Using -mlittle-endian or -mbig-endian options.... Message-ID: <890571.85786.qm@web94110.mail.in2.yahoo.com> Hi, I am working on Intel i686 machine I've Hello_World.c file. When I give following command compiler gives error that Invalid Option. gcc -mlittle-endian Hello_World.c or gcc -mlittle-endian Hello_World.c I am using 4.2 version of gcc (Latest one I guess). How can I use this options? Thanks Ashish From davem@davemloft.net Mon Dec 10 10:15:00 2007 From: davem@davemloft.net (David Miller) Date: Mon, 10 Dec 2007 10:15:00 -0000 Subject: Git and GCC In-Reply-To: <20071207.045329.204650714.davem@davemloft.net> References: <20071207063848.GA13101@coredump.intra.peff.net> <9e4733910712062310s30153afibc44a5550fd9ea99@mail.gmail.com> <20071207.045329.204650714.davem@davemloft.net> Message-ID: <20071210.015749.204978503.davem@davemloft.net> From: David Miller Date: Fri, 07 Dec 2007 04:53:29 -0800 (PST) > I should run oprofile... While doing the initial object counting, most of the time is spent in lookup_object(), memcmp() (via hashcmp()), and inflate(). I tried to see if I could do some tricks on sparc with the hashcmp() but the sha1 pointers are very often not even 4 byte aligned.
I suspect lookup_object() could be improved if it didn't use a hash table without chaining, but I can see why 'struct object' size is a concern and thus why things are done the way they are.

samples  %        app name          symbol name
504      13.7517  libc-2.6.1.so     memcmp
386      10.5321  libz.so.1.2.3.3   inflate
288       7.8581  git               lookup_object
248       6.7667  libz.so.1.2.3.3   inflate_fast
201       5.4843  libz.so.1.2.3.3   inflate_table
175       4.7749  git               decode_tree_entry
...

Deltifying is 94% consumed by create_delta(); the rest is completely in the noise.

samples  %        app name          symbol name
10581    94.8373  git               create_delta
181       1.6223  git               create_delta_index
72        0.6453  git               prepare_pack
55        0.4930  libc-2.6.1.so     loop
34        0.3047  libz.so.1.2.3.3   inflate_fast
33        0.2958  libc-2.6.1.so     _int_malloc
22        0.1972  libshadow.so      shadowUpdatePacked
21        0.1882  libc-2.6.1.so     _int_free
19        0.1703  libc-2.6.1.so     malloc
...

From pranav.bhandarkar@gmail.com Mon Dec 10 10:18:00 2007 From: pranav.bhandarkar@gmail.com (Pranav Bhandarkar) Date: Mon, 10 Dec 2007 10:18:00 -0000 Subject: VLIW scheduling and delayed branch In-Reply-To: <1197146980.4613.13.camel@unreal.localdomain> References: <475AB2D4.8060409@picochip.com> <1197146980.4613.13.camel@unreal.localdomain> Message-ID: <649555d50712100215m4c16c8ccha0285519b8315a6c@mail.gmail.com> On Dec 9, 2007 2:19 AM, Thomas Sailer wrote: > > Has anyone faced a similar problem before? Are there targets for which > > both VLIW and DBR are enabled? Perhaps ia64? > Ok, this was a long time back, but yes, I have faced a similar problem. We disabled delayed branch scheduling and used the machdep reorg pass. We examined the dependencies of the branch instructions moving backwards from the branch instruction and marking all the instructions (and the containing insn bundle) that the branch depended upon.
Then again, moving backwards from the branch insn, we picked the first insn bundle with all unmarked insns (and with the cycle size of the bundle <= the number of delay slots of a branch insn) and put that bundle into the delay slot. This approach worked fine for the small testcases that we had, but we really didn't test this on any monstrous piece of software. We implemented this for the TMS320C6x VLIW DSP. HTH, Pranav From hariharans@picochip.com Mon Dec 10 10:45:00 2007 From: hariharans@picochip.com (Hariharan Sandanagobalane) Date: Mon, 10 Dec 2007 10:45:00 -0000 Subject: VLIW scheduling and delayed branch In-Reply-To: <1197146980.4613.13.camel@unreal.localdomain> References: <475AB2D4.8060409@picochip.com> <1197146980.4613.13.camel@unreal.localdomain> Message-ID: <475D124D.2030509@picochip.com> Hi Thomas, Thanks for your reply. A couple of questions below. Thomas Sailer wrote: >> Has anyone faced a similar problem before? Are there targets for which >> both VLIW and DBR are enabled? Perhaps ia64? > > I did something similar a few months ago. What was your target? Is the target code available in GCC mainline? If not, could you pass your code to me? > > The problem is that haifa and the delayed branch scheduling passes don't > really fit together. delayed branch scheduling happily undoes all the > haifa decisions. > > The question is how much you gain by delayed branch scheduling. I don't > have numbers, but it wasn't much in my case. And since your company name > is picochip, you certainly value size more than speed ?! Yeah. We do. But, in our architecture, a branch has to have a delay slot instruction anyway. In the absence of one, we put a "nop" in there. If GCC manages to move a "single" instruction vliw into the delay slot, we would benefit in both size and speed, otherwise, we will just have no impact on either. > > I pursued two approaches.
The first one was to insert "stop bit" pseudo > insns into the RTL stream in machdep reorg, so I didn't have to rely on > TImode insn flags during output. But then delayed branch scheduling just > took one insn out of an insn group and put it into the delay slot, > meaning there was usually no cycle gain at all, just larger code size > (due to insn duplication). This seems fairly straightforward to implement. > > The second approach was having lots of parallel insns (using match > parallel and a custom predicate). machdep reorg then converts insn > bundles into a single parallel insn. Delayed branch scheduling then does > the right thing. This approach works fairly well for me, but there are a > few complications. My output code is pretty hackish, as I didn't want to > duplicate outputing a single insn / outputing the same insn as component > of a parallel insn group. When do you un-parallel those instructions? And, how? Regards Hari > > Tom > From richard.guenther@gmail.com Mon Dec 10 11:39:00 2007 From: richard.guenther@gmail.com (Richard Guenther) Date: Mon, 10 Dec 2007 11:39:00 -0000 Subject: Inserting arbitrary GIMPLE statements & alias analysis In-Reply-To: <475D0C91.1040006@st.com> References: <475D0C91.1040006@st.com> Message-ID: <84fc9c000712100245q748ba5a1t2c1216af20824dff@mail.gmail.com> On Dec 10, 2007 10:53 AM, Gabriele SVELTO wrote: > Hi everybody, > I'm working on a pass for the CLI back-end which 'simplifies' GIMPLE code before > entering the tree-ssa passes in order to simplify and improva CLI emission by > removing or simplifying nodes which don't have a corresponding straightforward > implementation in CLI. The pass runs between pass_lower_eh and pass_build_cfg > and replaces some GIMPLE nodes with more-or-less arbitrary GIMPLE. 
> However a problem has arisen when I replace COMPONENT_REFs accessing > bit-fields with explicit load-mask or load-mask-store sequences, it seems that > GCC loses track of pointer aliasing, here's an example from the testsuite > (gcc.dg/tree-ssa/alias-14.c compiled with -O2). The original code is: > > struct s > { > long long a:12; > long long b:12; > long long c:40; > }; > > struct s s, *p = &s; > > int > main () > { > p->a = 1; > s.a = 0; > s.b = 0; > return p->a + s.b; > } > > > What gets out of lower_eh is this: > > main () > { > int D.1519; > D.1518; > int D.1517; > D.1516; > int D.1515; > struct s * p.0; > > p.0 = p; > p.0->a = 1; > s.a = 0; > s.b = 0; > p.0 = p; > D.1516 = p.0->a; > D.1517 = (int) D.1516; > D.1518 = s.b; > D.1519 = (int) D.1518; > D.1515 = D.1517 + D.1519; > goto ; > :; > return D.1515; > } > > which my pass turns main() into this: > > ;; Function main (main) > > main () > { > struct s * cilsimp.18; > long long int * cilsimp.17; > long long int cilsimp.16; > struct s * cilsimp.15; > long long int * cilsimp.14; > long long int cilsimp.13; > struct s * cilsimp.12; > long long int * cilsimp.11; > long long int cilsimp.10; > long long int cilsimp.9; > struct s * cilsimp.8; > long long int * cilsimp.7; > long long int cilsimp.6; > long long int cilsimp.5; > struct s * cilsimp.4; > long long int * cilsimp.3; > long long int cilsimp.2; > long long int cilsimp.1; > int D.1519; > D.1518; > int D.1517; > D.1516; > int D.1515; > struct s * p.0; > > p.0 = p; > cilsimp.4 = p.0; > cilsimp.3 = (long long int *) cilsimp.4; > cilsimp.1 = *cilsimp.3; > cilsimp.1 = cilsimp.1 & -4096; > cilsimp.1 = cilsimp.1 | 1; > *cilsimp.3 = cilsimp.1; > cilsimp.8 = &s; > cilsimp.7 = (long long int *) cilsimp.8; > cilsimp.5 = *cilsimp.7; > cilsimp.5 = cilsimp.5 & -4096; > cilsimp.5 = cilsimp.5 | 0; > *cilsimp.7 = cilsimp.5; > cilsimp.12 = &s; > cilsimp.11 = (long long int *) cilsimp.12; > cilsimp.9 = *cilsimp.11; > cilsimp.9 = cilsimp.9 & -16773121; > cilsimp.9 = cilsimp.9 
| 0; > *cilsimp.11 = cilsimp.9; > p.0 = p; > cilsimp.15 = p.0; > cilsimp.14 = (long long int *) cilsimp.15; > cilsimp.13 = *cilsimp.14; > cilsimp.13 = cilsimp.13 << 52; > cilsimp.13 = cilsimp.13 >> 52; > D.1516 = () cilsimp.13; > D.1517 = (int) D.1516; > cilsimp.18 = &s; > cilsimp.17 = (long long int *) cilsimp.18; > cilsimp.16 = *cilsimp.17; > cilsimp.16 = cilsimp.16 << 40; > cilsimp.16 = cilsimp.16 >> 52; > D.1518 = () cilsimp.16; > D.1519 = (int) D.1518; > D.1515 = D.1517 + D.1519; > goto ; > :; > return D.1515; > } > > ... and later FRE into this: > > main () > { > long long int * cilsimp.17; > long long int cilsimp.16; > struct s * cilsimp.15; > long long int * cilsimp.14; > long long int cilsimp.13; > long long int * cilsimp.11; > long long int cilsimp.9; > long long int * cilsimp.7; > long long int cilsimp.5; > struct s * cilsimp.4; > long long int * cilsimp.3; > long long int cilsimp.1; > int D.1519; > D.1518; > int D.1517; > D.1516; > int D.1515; > > : > cilsimp.4_1 = p; > cilsimp.3_3 = (long long int *) cilsimp.4_1; > cilsimp.1_4 = *cilsimp.3_3; > cilsimp.1_5 = cilsimp.1_4 & -4096; > cilsimp.1_6 = cilsimp.1_5 | 1; > *cilsimp.3_3 = cilsimp.1_6; > cilsimp.7_8 = (long long int *) &s; > cilsimp.5_9 = *cilsimp.7_8; > cilsimp.5_10 = cilsimp.5_9 & -4096; > cilsimp.5_11 = cilsimp.5_10; > *cilsimp.7_8 = cilsimp.5_11; > cilsimp.11_13 = cilsimp.7_8; > cilsimp.9_14 = cilsimp.5_10; > cilsimp.9_15 = cilsimp.9_14 & -16773121; > cilsimp.9_16 = cilsimp.9_15; > *cilsimp.11_13 = cilsimp.9_16; > cilsimp.15_17 = cilsimp.4_1; > cilsimp.14_19 = cilsimp.3_3; > cilsimp.13_20 = cilsimp.1_6; > cilsimp.13_21 = cilsimp.13_20 << 52; > cilsimp.13_22 = cilsimp.13_21 >> 52; > D.1516_23 = () cilsimp.13_22; > D.1517_24 = (int) D.1516_23; > cilsimp.17_26 = cilsimp.7_8;looking at the other passes didn't provide > cilsimp.16_27 = cilsimp.9_15; > cilsimp.16_28 = cilsimp.16_27 << 40; > cilsimp.16_29 = cilsimp.16_28 >> 52; > D.1518_30 = () cilsimp.16_29; > D.1519_31 = (int) D.1518_30; > 
D.1515_32 = D.1517_24 + D.1519_31; > return D.1515_32; > > } > > The problem is that FRE optimizes away the explicit load used for getting the > value of p->a and replaces it with the constant value assigned in the first line > of main (1). This is wrong because the assignment s.a = 0 overwrites p->a > however it seems that FRE doesn't realize that the pointers simpcil.3 and > simpcil.7 are aliases and that the assignment *cilsimp.7 = cilsimp.5; overwrites > the value of p->a with 0. > I believe I must be doing something horribly wrong which breaks alias > analysis. and I'm not sure when this is information is built in the first place > and how to keep it up to date with the transformed code. > Sorry for the long post but I'm really stuck and even looking at the other > passes and internal documentation didn't provide much clues about how to deal > with this problem. This transformation is indeed invalid according to our type-based alias rules. There is no 'easy' way to make it work (well, force -fno-strict-aliasing) other than to make the access through a pointer to a union. That is: union { struct s; long long int x; } *cilsimp.3 = (union ... *)p.0; cilsimp.1 = cilsimp.3->x; Or you can try using a VIEW_CONVERT_EXPR (I don't know if this will work, you'll have to try): cilsimp.1 = VIEW_CONVERT_EXPR(*p); Richard. From gabriele.svelto@st.com Mon Dec 10 13:07:00 2007 From: gabriele.svelto@st.com (Gabriele SVELTO) Date: Mon, 10 Dec 2007 13:07:00 -0000 Subject: Inserting arbitrary GIMPLE statements & alias analysis In-Reply-To: <84fc9c000712100245q748ba5a1t2c1216af20824dff@mail.gmail.com> References: <475D0C91.1040006@st.com> <84fc9c000712100245q748ba5a1t2c1216af20824dff@mail.gmail.com> Message-ID: <475D254C.1060303@st.com> Richard Guenther wrote: > This transformation is indeed invalid according to our type-based alias > rules. There is no 'easy' way to make it work (well, force > -fno-strict-aliasing) > other than to make the access through a pointer to a union. 
That is: > > union { > struct s; > long long int x; > } *cilsimp.3 = (union ... *)p.0; > > cilsimp.1 = cilsimp.3->x; Thanks for the tip, I'll try that out. Will it work even if I have to introduce a whole structure with 'container fields' matching the original bit-fields, as for example: struct original_struct { int a : 32; int b : 32; int c : 16; int d : 8; int e : 8; } s; int a = s->d; Which I might turn into union { struct original_struct s; struct container_struct { long long int a_and_b; long int c_d_and_e; } c; } *container_ptr = (union ... *)s; container = container_ptr->c.c_d_and_e; a = (container << 16) >> 8; > Or you can try using a VIEW_CONVERT_EXPR (I don't know if this will > work, you'll have to try): > > cilsimp.1 = VIEW_CONVERT_EXPR(*p); If I understand it correctly that might tell gcc that cilsimp.1 could basically alias with everything else. That would defeat a lot of optimizations though. Since my aim is to let the optimizer clean up the mess resulting from removing bit-field accesses I guess that using your above suggestion should be better. Thanks again Gabriele From tprince@computer.org Mon Dec 10 13:59:00 2007 From: tprince@computer.org (Tim Prince) Date: Mon, 10 Dec 2007 13:59:00 -0000 Subject: Using -mlittle-endian or -mbig-endian options.... In-Reply-To: <890571.85786.qm@web94110.mail.in2.yahoo.com> References: <890571.85786.qm@web94110.mail.in2.yahoo.com> Message-ID: <475D381A.8090505@computer.org> ashish mahamuni wrote: > Hi, > > I am working on Intel i686 machine > I've Hello_World.c file. > When I give following command compiler gives error > that Invalid Option. > > gcc -mlittle-endian Hello_World.c > or > gcc -mlittle-endian Hello_World.c > > I am using 4.2 version of gcc (Latest one I guess). > How can I use this options? How about omitting that option, and asking usage questions on gcc-help@gcc.gnu.org?
From phil.lello@googlemail.com Mon Dec 10 15:36:00 2007 From: phil.lello@googlemail.com (Phil Lello) Date: Mon, 10 Dec 2007 15:36:00 -0000 Subject: libiberty: make install doesn't install obstack.h (mingw32) Message-ID: <475D4614.2040809@gmail.com> According to http://gcc.gnu.org/onlinedocs/libiberty/Using.html: > Passing --enable-install-libiberty to the configure script when building libiberty causes the header files and archive library to be installed when make install is run When I run make install against the current svn libiberty code, obstack.h doesn't get installed. I'm cross-compiling for Win32 (via ./configure --host=i586-mingw32msvc --prefix=/usr/i586-mingw32msvc --enable-install-libiberty), specifically because I want the obstack functionality missing on this platform. Is the omission of obstack.h a mistake, or by design? If by design, what is the reason? Thanks, Phil From nico@cam.org Mon Dec 10 16:09:00 2007 From: nico@cam.org (Nicolas Pitre) Date: Mon, 10 Dec 2007 16:09:00 -0000 Subject: Git and GCC In-Reply-To: <20071210095426.GA32611@iram.es> References: <20071205.204848.227521641.davem@davemloft.net> <4aca3dc20712052111o730f6fb6h7a329ee811a70f28@mail.gmail.com> <1196918132.10408.85.camel@brick> <4aca3dc20712052117j3ef5cf99y848d4962ae8ddf33@mail.gmail.com> <9e4733910712052247x116cabb4q48ebafffb93f7e03@mail.gmail.com> <20071206071503.GA19504@coredump.intra.peff.net> <20071206173946.GA10845@sigill.intra.peff.net> <1197074839.22471.34.camel@brick> <20071210095426.GA32611@iram.es> Message-ID: On Mon, 10 Dec 2007, Gabriel Paubert wrote: > On Fri, Dec 07, 2007 at 04:47:19PM -0800, Harvey Harrison wrote: > > Some interesting stats from the highly packed gcc repo. The long chain > > lengths very quickly tail off. Over 60% of the objects have a chain > > length of 20 or less. If anyone wants the full list let me know.
I > > also have included a few other interesting points, the git default > > depth of 50, my initial guess of 100 and every 10% in the cumulative > > distribution from 60-100%. > > > > This shows the git default of 50 really isn't that bad, and after > > about 100 it really starts to get sparse. > > Do you have a way to know which files have the longest chains? With 'git verify-pack -v' you get the delta depth for each object. Then you can use 'git show' with the object SHA1 to see its content. > I have a suspiscion that the ChangeLog* files are among them, > not only because they are, almost without exception, only modified > by prepending text to the previous version (and a fairly small amount > compared to the size of the file), and therefore the diff is simple > (a single hunk) so that the limit on chain depth is probably what > causes a new copy to be created. My gcc repo is currently repacked with a max delta depth of 50, and a quick sample of those objects at the depth limit does indeed show the content of the ChangeLog file. But I have occurrences of the root directory tree object too, and the "GCC machine description for IA-32" content as well. But yes, the really deep delta chains are most certainly going to contain those ChangeLog files. > Besides that these files grow quite large and become some of the > largest files in the tree, and at least one of them is changed > for every commit. This leads again to many versions of fairly > large files. > > If this guess is right, this implies that most of the size gains > from longer chains comes from having less copies of the ChangeLog* > files. From a performance point of view, it is rather favourable > since the differences are simple. This would also explain why > the window parameter has little effect. Well, actually the window parameter does have big effects. 
For instance the default of 10 is completely inadequate for the gcc repo, since changing the window size from 10 to 100 made the corresponding pack shrink from 2.1GB down to 400MB, with the same max delta depth. Nicolas From monoid@ispras.ru Mon Dec 10 16:11:00 2007 From: monoid@ispras.ru (Alexander Monakov) Date: Mon, 10 Dec 2007 16:11:00 -0000 Subject: [RFC/RFT] Improving SMS by data dependence export In-Reply-To: <4aca3dc20712071249u2cc5242age0fc1a3b62d6c00@mail.gmail.com> References: <4aca3dc20712071249u2cc5242age0fc1a3b62d6c00@mail.gmail.com> Message-ID: On Fri, 07 Dec 2007 23:49:28 +0300, Daniel Berlin wrote: > On 12/7/07, Alexander Monakov wrote: >> Hi. >> >> Attached is the patch that allows to save dependence info obtained on >> tree >> level by data-reference analysis for usage on RTL level (for RTL memory >> disambiguation and dependence graph construction for modulo scheduling). >> It helps for RTL disambiguation on platforms without base+offset memory >> addressing modes, and impact on SMS is described below. We would like >> to >> see it in 4.4 mainline. >> >> We have tested this patch with modulo scheduling on ia64, using SPEC >> CPU2000 benchmark suite. It allows to apply software pipelining to more >> loops, resulting in ~1-2% speedup (compared to SMS without exported >> info). The most frequent improvements are removal of cross-iteration >> memory dependencies, as currently SMS adds such dependencies for all >> pair >> of memory references, even in cases when they cannot alias (for example, >> for different arrays or different fields of a struct). As I understand, >> SMS does not use RTL alias analysis here because pairs that do not alias >> within one iteration, but may alias when cross-iteration movement is >> performed (like a[i] and a[i+1]), should be marked as dependent. 
So, >> SMS >> data dependence analysis can be greatly improved even without >> data-dependence export patch by using RTL-like memory disambiguation, >> but >> without pointer arithmetic analysis. >> >> There are currently two miscompiled SPEC tests with this patch; in one >> of >> them, the problem is related to generation of register moves in the >> prologue of software pipelined loop (which was not pipelined without the >> patch). The problem is reported and discussed with Revital Eres from >> IBM >> Haifa. >> >> We would like to ask people interested in SMS performance on PowerPC and >> Cell SPU to conduct tests with this patch. Any feedback is greatly >> appreciated. >> > > I see a few random unrelated changes, like, for example: > > if (may_eliminate_iv (data, use, cand, &bound)) > - { > - elim_cost = force_var_cost (data, bound, &depends_on_elim); > - /* The bound is a loop invariant, so it will be only computed > - once. */ > - elim_cost /= AVG_LOOP_NITER (data->current_loop); > - } > + elim_cost = force_var_cost (data, bound, &depends_on_elim); > else > elim_cost = INFTY; > > > Please pull these out into separate patches or don't do them :) > also, i see > + /* We do not use operand_equal_p for ORIG_EXPRs because we need to > + distinguish memory references at different points of the loop > (which > + would have different indices in SSA form, like a[i_1] and a[i_2], > but > + were later rewritten to same a[i]). */ > + && (p->orig_expr == q->orig_expr)); > > This doesn't do enough to distinguish memory references at different > points of the loop, while also eliminating from consideration that > *are* the same. > > What if they are regular old VAR_DECL? > This will still return true, but they may be different accesses at > different points in the loop. > > In any case, this doesn't belong in mem_attrs_htab_eq, because if they > are operand_equal_p, for purposes of memory attributes, they *are* > equal. 
They may still be different accesses, which is something you > have to discover later on. > > IE You should be doing this check somewhere else, not in a hashtable > equality function :) > > > DDR will mark them as data refs >> Thanks. >> >> -- >> Alexander Monakov >> From qiyaoltc@gmail.com Mon Dec 10 16:55:00 2007 From: qiyaoltc@gmail.com (Yao Qi) Date: Mon, 10 Dec 2007 16:55:00 -0000 Subject: Alias-analysis in gccint Message-ID: When I read the example of alias analysis from http://gcc.gnu.org/onlinedocs/gccint/Alias-analysis.html, I could not understand it. Here is this example and text, "For instance, consider the following function: foo (int i) { int *p, *q, a, b; if (i > 10) p = &a; else q = &b; *p = 3; *q = 5; a = b + 2; return *p; } After aliasing analysis has finished, the symbol memory tag for pointer p will have two aliases, namely variables a and b. Every time pointer p is dereferenced, we want to mark the operation as a potential reference to a and b". My questions is How many aliases do p have? According to the doc here, it is said that "p will have two aliases, namely variables a and b." What I learned from compiler book is that p points-to &a, and q points-to &b. Best Regards -- Yao Qi GNU/Linux Developer http://duewayqi.googlepages.com/ From sailer@ife.ee.ethz.ch Mon Dec 10 17:02:00 2007 From: sailer@ife.ee.ethz.ch (Thomas Sailer) Date: Mon, 10 Dec 2007 17:02:00 -0000 Subject: VLIW scheduling and delayed branch In-Reply-To: <475D124D.2030509@picochip.com> References: <475AB2D4.8060409@picochip.com> <1197146980.4613.13.camel@unreal.localdomain> <475D124D.2030509@picochip.com> Message-ID: <1197305709.15205.66.camel@xbox360> > When do you un-parallel those instructions? And, how? I don't; I use a C function to output such an insn group. In that C function, I basically save the global state of final, and use functions of final.c to output constitutent insns. 
The insn group output function basically looks like this:

first prepare:

  static char buf[256];
  FILE *old_out_file;

  /* open memory file */
  old_out_file = asm_out_file;
  asm_out_file = fmemopen (buf, sizeof(buf), "w");
  gcc_assert (asm_out_file);

then loop over all constituent insns:

  cleanup_subreg_operands (insn);
  if (! constrain_operands_cached (1))
    fatal_insn_not_found (insn);
  current_output_insn = insn;
  /* Find the proper template for this insn. */
  template = get_insn_template (insn_code_number, insn);
  gcc_assert (template);
  gcc_assert (!(template[0] == '#' && template[1] == '\0'));
  fprintf (asm_out_file, "\t||");
  output_asm_insn (template, recog_data.operand);
  fseek (asm_out_file, ftell (asm_out_file) - 1, SEEK_SET);

finally cleanup:

  fclose (asm_out_file);
  asm_out_file = old_out_file;
  return &buf[4];

That's why I wrote it's kind of hackish :-) fmemopen also isn't necessarily very portable, but is needed since all the final output routines directly output to a FILE *, and I need to intercept that output.

Tom

From monoid@ispras.ru Mon Dec 10 17:16:00 2007 From: monoid@ispras.ru (Alexander Monakov) Date: Mon, 10 Dec 2007 17:16:00 -0000 Subject: [RFC/RFT] Improving SMS by data dependence export In-Reply-To: <4aca3dc20712071249u2cc5242age0fc1a3b62d6c00@mail.gmail.com> References: <4aca3dc20712071249u2cc5242age0fc1a3b62d6c00@mail.gmail.com> Message-ID: Hi. Sorry for the previous empty reply. > also, i see > + /* We do not use operand_equal_p for ORIG_EXPRs because we need to > + distinguish memory references at different points of the loop > (which > + would have different indices in SSA form, like a[i_1] and a[i_2], > but > + were later rewritten to same a[i]). */ > + && (p->orig_expr == q->orig_expr)); > > This doesn't do enough to distinguish memory references at different > points of the loop, while also eliminating from consideration that > *are* the same. > > What if they are regular old VAR_DECL?
> This will still return true, but they may be different accesses at > different points in the loop. Sorry, I don't really follow. The comment is somewhat badly worded indeed. The purpose of making handling of MEM_ORIG_EXPRs (introduced by this patch) different from MEM_EXPRs in ignoring operand_equal'ity of trees pointed to by this field is enforcing that MEMs corresponding to accesses to objects of the same type but with (potentially) different addresses will not share MEM_ATTRS structure. So, if both are VAR_DECLs, returning true is OK, since different accesses still correspond to the same memory location. The first sentence also implies that potentially different accesses could be merged here, but I don't see any reason for that except for NULL MEM_ORIG_EXPRs. Could you please elaborate on this? > In any case, this doesn't belong in mem_attrs_htab_eq, because if they > are operand_equal_p, for purposes of memory attributes, they *are* > equal. They may still be different accesses, which is something you > have to discover later on. I don't follow this either. Since I add a new field to MEM_ATTRS struct, which in some cases allows better disambiguation, why should I enforce MEM_EXPR's rules on it? If I, similarly to MEM_EXPRs, apply operand_equal_p also to MEM_ORIG_EXPRs, this will give me incorrect results, since different MEMs will be annotated with same MEM_ORIG_EXPR, which is wrong, since the latter is flow-sensitive, and operand_equal_p will discard that (since trees will look the same after out-of-SSA). I do not see a better way to provide flow-sensitive annotations for MEMs. > DDR will mark them as data refs Come again? :) Thanks. 
-- Alexander Monakov From rask@sygehus.dk Mon Dec 10 17:31:00 2007 From: rask@sygehus.dk ('Rask Ingemann Lambertsen') Date: Mon, 10 Dec 2007 17:31:00 -0000 Subject: Help with another constraint In-Reply-To: <000d01c83a81$85462ca0$33160e98@ece.ncsu.edu> References: <000c01c83a41$4499e240$33160e98@ece.ncsu.edu> <20071209130740.GI17368@sygehus.dk> <000d01c83a81$85462ca0$33160e98@ece.ncsu.edu> Message-ID: <20071210171542.GL17368@sygehus.dk> On Sun, Dec 09, 2007 at 11:35:32AM -0500, Balaji V. Iyer wrote: > Hello Rask, > I am not understanding your response, can you clarify it for me? > > As per the question about the error message above? > > ../../gcc-4.0.2/gcc/libgcc2.c -o libgcc/./_negdi2.o > ../../gcc-4.0.2/gcc/libgcc2.c: In function '__negdi2': > ../../gcc-4.0.2/gcc/libgcc2.c:72: error: insn does not satisfy its > constraints: I think this is misleading you. It seems likely that the problem is with the predicate and not the constraint. > (insn 15 13 16 (set (mem:SI (plus:SI (reg/f:SI 2 r2) ^^^ This has to be a register, doesn't it? If so, use -fdump-rtl-all and look at the dump files to see where it goes wrong. > (const_int -28 [0xffffffe4])) [0 D.1256+0 S4 A32]) > (neg:SI (reg:SI 3 r3 [orig:80 D.1255 ] [80]))) 38 {negsi2} (nil) > (nil)) Please also post your negsi2 pattern. -- Rask Ingemann Lambertsen Danish law requires addresses in e-mail to be logged and stored for a year From dberlin@dberlin.org Mon Dec 10 18:32:00 2007 From: dberlin@dberlin.org (Daniel Berlin) Date: Mon, 10 Dec 2007 18:32:00 -0000 Subject: [RFC/RFT] Improving SMS by data dependence export In-Reply-To: References: <4aca3dc20712071249u2cc5242age0fc1a3b62d6c00@mail.gmail.com> Message-ID: <4aca3dc20712100930m144f2e45o7f8af0d21b1bbc05@mail.gmail.com> On 12/10/07, Alexander Monakov wrote: > Hi. Sorry for the previous empty reply. 
> > > also, i see > > + /* We do not use operand_equal_p for ORIG_EXPRs because we need to > > + distinguish memory references at different points of the loop > > (which > > + would have different indices in SSA form, like a[i_1] and a[i_2], > > but > > + were later rewritten to same a[i]). */ > > + && (p->orig_expr == q->orig_expr)); > > > > This doesn't do enough to distinguish memory references at different > > points of the loop, while also eliminating from consideration that > > *are* the same. > > > > What if they are regular old VAR_DECL? > > This will still return true, but they may be different accesses at > > different points in the loop. > > Sorry, I don't really follow. The comment is somewhat badly worded > indeed. The purpose of making handling of MEM_ORIG_EXPRs (introduced by > this patch) different from MEM_EXPRs in ignoring operand_equal'ity of > trees pointed to by this field is enforcing that MEMs corresponding to > accesses to objects of the same type but with (potentially) different > addresses will not share MEM_ATTRS structure. So, if both are VAR_DECLs, > returning true is OK, since different accesses still correspond to the > same memory location. Okay, then you should edit the comment to make this clear. Because it is certainly incorrect as is. > > The first sentence also implies that potentially different accesses could > be merged here, but I don't see any reason for that except for NULL > MEM_ORIG_EXPRs. Could you please elaborate on this? COMPONENT_REF of INDIRECT_REF ( IE a->c), for example, would be merged here, incorrectly (since a may not be the same memory at this point in time) but it's not clear we ever generate them as MEM_ORIG_EXPR. Relying on pointer equality of tree expressions to give you some semantic value seems a very bad idea to me. > > > In any case, this doesn't belong in mem_attrs_htab_eq, because if they > > are operand_equal_p, for purposes of memory attributes, they *are* > > equal. 
They may still be different accesses, which is something you > > have to discover later on. > > I don't follow this either. Since I add a new field to MEM_ATTRS struct, I misread this portion, my apologies. I thought you were changing the semantics of an existing field. From dave.korn@artimi.com Mon Dec 10 19:19:00 2007 From: dave.korn@artimi.com (Dave Korn) Date: Mon, 10 Dec 2007 19:19:00 -0000 Subject: libiberty/pex-unix vfork abuse? In-Reply-To: References: <05ea01c838e3$001b9b40$2e08a8c0@CAM.ARTIMI.COM><060401c838f3$f6333aa0$2e08a8c0@CAM.ARTIMI.COM> Message-ID: <0a8201c83b5a$fa391700$2e08a8c0@CAM.ARTIMI.COM> On 07 December 2007 20:52, Andreas Schwab wrote: > "Dave Korn" writes: > >> Perhaps we could work around this case by setting environ in the parent >> before the vfork call and restoring it afterward, but we'd need kind of >> serialisation there, > > Do we? vfork should block the parent until the child calls execve or > exit. I don't see anything in posix that suggests that? I'm worrying in this case about races between multiple threads in the parent vfork'ing multiple children, not about the child-parent interaction, which this suggestion was a workaround for. (But in any case, the subsequent suggestion by Ross to just fall back on fork instead of vfork when the environment needs setting is probably the simplest and most robust, obsoleting this earlier suggestion of mine). cheers, DaveK -- Can't think of a witty .sigline today.... From Joe.Buck@synopsys.COM Mon Dec 10 19:22:00 2007 From: Joe.Buck@synopsys.COM (Joe Buck) Date: Mon, 10 Dec 2007 19:22:00 -0000 Subject: libiberty/pex-unix vfork abuse? 
In-Reply-To: <0a8201c83b5a$fa391700$2e08a8c0@CAM.ARTIMI.COM> References: <0a8201c83b5a$fa391700$2e08a8c0@CAM.ARTIMI.COM> Message-ID: <20071210191857.GG12580@synopsys.com> On Mon, Dec 10, 2007 at 06:32:08PM -0000, Dave Korn wrote: > On 07 December 2007 20:52, Andreas Schwab wrote: > > > "Dave Korn" writes: > > > >> Perhaps we could work around this case by setting environ in the parent > >> before the vfork call and restoring it afterward, but we'd need kind of > >> serialisation there, > > > > Do we? vfork should block the parent until the child calls execve or > > exit. > > I don't see anything in posix that suggests that? I'm worrying in this case > about races between multiple threads in the parent vfork'ing multiple children, > not about the child-parent interaction, which this suggestion was a workaround > for. (But in any case, the subsequent suggestion by Ross to just fall back on > fork instead of vfork when the environment needs setting is probably the > simplest and most robust, obsoleting this earlier suggestion of mine). While the standard's wording might need fixing, with every implementation of vfork I know of, there are no threads. It's a mechanism for systems that don't support fork (or that can only do fork in a horribly inefficient way, say because there's no MMU, and no support for copy on write), but that support the creation of new processes. It's just a cheat to support the traditional fork-followed-by-exec, an aid for porting Unix code to non-Unix systems. The reason vfork blocks the parent until the child makes a new process or quits is because that's the only supported behavior on systems that support it; it is not really "blocking the parent" at all as there is only one process. I don't think it's wise to waste time fixing theoretical bugs exposed by close reading of the standard. Now, messing with environ with vfork will mess up the parent process, and if that happens it's a bug. 
But getting around it by using fork will harm portability, as the only reason to bother with vfork at all is that fork might not be available. From drow@false.org Mon Dec 10 19:37:00 2007 From: drow@false.org (Daniel Jacobowitz) Date: Mon, 10 Dec 2007 19:37:00 -0000 Subject: libiberty/pex-unix vfork abuse? In-Reply-To: <20071210191857.GG12580@synopsys.com> References: <0a8201c83b5a$fa391700$2e08a8c0@CAM.ARTIMI.COM> <20071210191857.GG12580@synopsys.com> Message-ID: <20071210192245.GA3454@caradoc.them.org> On Mon, Dec 10, 2007 at 11:18:57AM -0800, Joe Buck wrote: > While the standard's wording might need fixing, with every implementation > of vfork I know of, there are no threads. It's a mechanism for systems > that don't support fork (or that can only do fork in a horribly > inefficient way, say because there's no MMU, and no support for copy on > write), but that support the creation of new processes. No, Dave's right. On GNU/Linux you can have two threads running on different processors simultaneously calling vfork. And even with an MMU it is considerably more efficient than requiring setup of a new copy-on-write page table. -- Daniel Jacobowitz CodeSourcery From schwab@suse.de Mon Dec 10 20:01:00 2007 From: schwab@suse.de (Andreas Schwab) Date: Mon, 10 Dec 2007 20:01:00 -0000 Subject: libiberty/pex-unix vfork abuse? In-Reply-To: <0a8201c83b5a$fa391700$2e08a8c0@CAM.ARTIMI.COM> (Dave Korn's message of "Mon\, 10 Dec 2007 18\:32\:08 -0000") References: <05ea01c838e3$001b9b40$2e08a8c0@CAM.ARTIMI.COM> <060401c838f3$f6333aa0$2e08a8c0@CAM.ARTIMI.COM> <0a8201c83b5a$fa391700$2e08a8c0@CAM.ARTIMI.COM> Message-ID: "Dave Korn" writes: > On 07 December 2007 20:52, Andreas Schwab wrote: > >> "Dave Korn" writes: >> >>> Perhaps we could work around this case by setting environ in the parent >>> before the vfork call and restoring it afterward, but we'd need kind of >>> serialisation there, >> >> Do we? vfork should block the parent until the child calls execve or >> exit. 
> > I don't see anything in posix that suggests that? That is true, but technically it is rather difficult to implement a true vfork without blocking the parent between vfork and exec/exit, since these are the only synchronisation points. > I'm worrying in this case about races between multiple threads in the > parent vfork'ing multiple children, Typically in a multithreaded environment vfork is mapped to fork anyway. Andreas. -- Andreas Schwab, SuSE Labs, schwab@suse.de SuSE Linux Products GmbH, Maxfeldstraße 5, 90409 Nürnberg, Germany PGP key fingerprint = 58CA 54C7 6D53 942B 1756 01D3 44D5 214B 8276 4ED5 "And now for something completely different." From brian@dessent.net Mon Dec 10 20:06:00 2007 From: brian@dessent.net (Brian Dessent) Date: Mon, 10 Dec 2007 20:06:00 -0000 Subject: libiberty/pex-unix vfork abuse? References: <05ea01c838e3$001b9b40$2e08a8c0@CAM.ARTIMI.COM> <060401c838f3$f6333aa0$2e08a8c0@CAM.ARTIMI.COM> <0a8201c83b5a$fa391700$2e08a8c0@CAM.ARTIMI.COM> Message-ID: <475D9B0E.25E2A77@dessent.net> Andreas Schwab wrote: > Typically in a multithreaded environment vfork is mapped to fork anyway. ...which is what I don't understand about this whole thread. It seems Dave is seeing some strange behavior in Cygwin, but Cygwin's vfork = fork, there is no difference. There used to be a vfork specialization in Cygwin, but it is broken and has not been enabled in quite a long time. Brian From Joe.Buck@synopsys.COM Mon Dec 10 20:24:00 2007 From: Joe.Buck@synopsys.COM (Joe Buck) Date: Mon, 10 Dec 2007 20:24:00 -0000 Subject: libiberty/pex-unix vfork abuse?
In-Reply-To: <20071210192245.GA3454@caradoc.them.org> References: <0a8201c83b5a$fa391700$2e08a8c0@CAM.ARTIMI.COM> <20071210191857.GG12580@synopsys.com> <20071210192245.GA3454@caradoc.them.org> Message-ID: <20071210200606.GH12580@synopsys.com> On Mon, Dec 10, 2007 at 02:22:45PM -0500, Daniel Jacobowitz wrote: > On Mon, Dec 10, 2007 at 11:18:57AM -0800, Joe Buck wrote: > > While the standard's wording might need fixing, with every implementation > > of vfork I know of, there are no threads. It's a mechanism for systems > > that don't support fork (or that can only do fork in a horribly > > inefficient way, say because there's no MMU, and no support for copy on > > write), but that support the creation of new processes. > > No, Dave's right. On GNU/Linux you can have two threads running on > different processors simultaneously calling vfork. And even with an > MMU it is considerably more efficient than requiring setup of a new > copy-on-write page table. Yes, that's true. And of course some implementation might map vfork to fork, which complies with standards. I only meant to talk about races between parent and child "process". From bulb@ucw.cz Mon Dec 10 20:35:00 2007 From: bulb@ucw.cz (Jan Hudec) Date: Mon, 10 Dec 2007 20:35:00 -0000 Subject: In future, to replace autotools by cmake like KDE4 did? In-Reply-To: <998d0e4a0712070642u6ae75232t9cb5bfd0920b2439@mail.gmail.com> References: <998d0e4a0712061810k18e6388jde9d7bc5bd006b57@mail.gmail.com> <47594021.40200@op5.se> <200712071456.11019.jnareb@gmail.com> <998d0e4a0712070642u6ae75232t9cb5bfd0920b2439@mail.gmail.com> Message-ID: <20071210202343.GB3517@efreet.light.src> On Fri, Dec 07, 2007 at 15:42:31 +0100, J.C. Pizarro wrote: > A powerful tool can do better things that old generators-based tools > (as autotools). > > To imagine, there are many scripts in subdirectories or subprojects: No, there are not. There is just one. Multiple configuration scripts rarely make sense. 
> * Before: (many copy and paste of code as below paragraph) > A_VARIABLE_OS = `uname -a | grep .... ` # <- slow > case "$A_VARIABLE_OS" in > *linux*) ... ;; > *bsd*) ... ;; > *aix*) ... ;; > *) ...;; > esac > m4 foo.sh.m4 > bar.sh # <- very slow Done once at release time. > ./bar.sh > > * Later: (with the powerful tool that had cached many predefined variables in > a ramdisk's file or in a daemon's memory) A daemon not runnin' here. No ramdisk here either. Freshly downloaded tarball to an ancient Un*x with some quirky barely POSIX-compliant shell. > # call once at 1st time to internal uname of powerful tool for all ocurrences of > # below predefined variable from many scripts: > case "$FOO_VARIABLE_OS" in Someone had to create that variable. And there is just one way to: uname -a | .... > *linux*) ... ;; > *bsd*) ... ;; > *aix*) ... ;; > *) ...;; > esac And how exactly does this differ from the code you had above? For task that runs once per installation (and for most users never, because their distribution's build server runs it for them), it's simplicity of code that matters. > # i don't need to generate more scripts to inspect still more it. And how exactly did you find, from the uname, whether I have libcrypto installed? And whether I have it in /usr/lib, /opt/openssl/lib or /usr/@foobar.com/sw/system/lib? How did you find libcurl, tcl, zlib...? Besides, I inspected the configure.ac script that comes with git and it does not actually contain any code like you show above. Git's configure script is NOT looking at the platform name AT ALL. The makefile does, but that obviously does not need anything generated by M4. -- Jan 'Bulb' Hudec From ghazi@caip.rutgers.edu Mon Dec 10 20:59:00 2007 From: ghazi@caip.rutgers.edu (Kaveh R.
GHAZI) Date: Mon, 10 Dec 2007 20:59:00 -0000 Subject: PATCH: Update MPFR versions (was Re: Revisiting GCC's minimum MPFR version) In-Reply-To: <475C8A24.9060901@codesourcery.com> References: <84fc9c000712091521w11b4ef73i4418cb873496a579@mail.gmail.com> <475C8A24.9060901@codesourcery.com> Message-ID: On Sun, 9 Dec 2007, Mark Mitchell wrote: > Richard Guenther wrote: > > > I would update the recommended version to 2.3.0 and fail for anything less > > than 2.2.1. > > I agree. Not optimizing bessel functions as builtins doesn't bother me > too much, but we might as well move past the buggy version. > > Thanks, > Mark Mitchell Ok, here's my patch. Since we may have some developers still using 2.2.0, I'll wait say a week after approval before installing to give them time to upgrade. I have limited ability to test patches at the moment. The sparc-solaris infrastructure at rutgers.edu which I had access to fried and will not be fixed or replaced any time soon. So I haven't tested this patch beyond top level a configure run. However I see several XPASSes for builtin-math-4.c from people which indicates to me that the testcase does in fact pass with mpfr-2.3.0. The only other change is the one liner to the docs. http://gcc.gnu.org/ml/gcc-testresults/2007-12/msg00458.html http://gcc.gnu.org/ml/gcc-testresults/2007-12/msg00382.html http://gcc.gnu.org/ml/gcc-testresults/2007-12/msg00374.html In the mean time I'm in search of a new place to play with gcc. If further testing is required, I'll do more rigorous checks before installing when I move my stuff to a new home. Ok for mainline? Thanks, --Kaveh 2007-12-10 Kaveh R. Ghazi * configure.ac: Change required MPFR from 2.2.0 -> 2.2.1. Change recommended MPFR from 2.2.1 > 2.3.0. * configure: Regenerate. gcc: * doc/install.texi: Change recommended MPFR from 2.2.1 > 2.3.0. testsuite: * gcc.dg/torture/builtin-math-4.c: Remove XFAIL. 
diff -rup orig/egcc-SVN20071209/configure.ac egcc-SVN20071209/configure.ac --- orig/egcc-SVN20071209/configure.ac Mon Oct 8 23:02:51 2007 +++ egcc-SVN20071209/configure.ac Mon Dec 10 14:34:45 2007 @@ -1220,11 +1220,11 @@ if test -d ${srcdir}/gcc && test "x$have if test x"$have_gmp" = xyes; then saved_LIBS="$LIBS" LIBS="$LIBS $gmplibs" - dnl MPFR 2.2.0 is acceptable but buggy, MPFR 2.2.1 is better. + dnl MPFR 2.2.1 is acceptable, but MPFR 2.3.0 is better. AC_MSG_CHECKING([for correct version of mpfr.h]) AC_TRY_LINK([#include <gmp.h> #include <mpfr.h>],[ - #if MPFR_VERSION < MPFR_VERSION_NUM(2,2,0) + #if MPFR_VERSION < MPFR_VERSION_NUM(2,2,1) choke me #endif mpfr_t n; @@ -1237,7 +1237,7 @@ if test -d ${srcdir}/gcc && test "x$have mpfr_subnormalize (x, t, GMP_RNDN); ], [AC_TRY_LINK([#include <gmp.h> #include <mpfr.h>],[ - #if MPFR_VERSION < MPFR_VERSION_NUM(2,2,1) + #if MPFR_VERSION < MPFR_VERSION_NUM(2,3,0) choke me #endif mpfr_t n; mpfr_init(n); @@ -1248,7 +1248,7 @@ if test -d ${srcdir}/gcc && test "x$have CFLAGS="$saved_CFLAGS" if test x$have_gmp != xyes; then - AC_MSG_ERROR([Building GCC requires GMP 4.1+ and MPFR 2.2.1+. + AC_MSG_ERROR([Building GCC requires GMP 4.1+ and MPFR 2.3.0+. Try the --with-gmp and/or --with-mpfr options to specify their locations. Copies of these libraries' source code can be found at their respective hosting sites as well as at ftp://gcc.gnu.org/pub/gcc/infrastructure/. diff -rup orig/egcc-SVN20071209/gcc/doc/install.texi egcc-SVN20071209/gcc/doc/install.texi --- orig/egcc-SVN20071209/gcc/doc/install.texi Fri Dec 7 23:02:22 2007 +++ egcc-SVN20071209/gcc/doc/install.texi Mon Dec 10 14:34:45 2007 @@ -302,7 +302,7 @@ library search path, you will have to co @option{--with-gmp} configure option. See also @option{--with-gmp-lib} and @option{--with-gmp-include}. -@item MPFR Library version 2.2.1 (or later) +@item MPFR Library version 2.3.0 (or later) Necessary to build GCC. It can be downloaded from @uref{http://www.mpfr.org/}.
The version of MPFR that is bundled with diff -rup orig/egcc-SVN20071209/gcc/testsuite/gcc.dg/torture/builtin-math-4.c egcc-SVN20071209/gcc/testsuite/gcc.dg/torture/builtin-math-4.c --- orig/egcc-SVN20071209/gcc/testsuite/gcc.dg/torture/builtin-math-4.c Fri May 25 23:02:37 2007 +++ egcc-SVN20071209/gcc/testsuite/gcc.dg/torture/builtin-math-4.c Mon Dec 10 14:34:45 2007 @@ -7,8 +7,6 @@ Origin: Kaveh R. Ghazi, April 23, 2007. */ /* { dg-do link } */ -/* Expect failures at least until mpfr-2.3.0 is released. */ -/* { dg-xfail-if "This test requires mpfr-2.3.0" { *-*-* } { "*" } { "" } } */ /* All references to link_error should go away at compile-time. */ extern void link_error(int); From iant@google.com Mon Dec 10 21:05:00 2007 From: iant@google.com (Ian Lance Taylor) Date: Mon, 10 Dec 2007 21:05:00 -0000 Subject: libiberty/pex-unix vfork abuse? In-Reply-To: <20071210191857.GG12580@synopsys.com> References: <0a8201c83b5a$fa391700$2e08a8c0@CAM.ARTIMI.COM> <20071210191857.GG12580@synopsys.com> Message-ID: Joe Buck writes: > I don't think it's wise to waste time fixing theoretical bugs > exposed by close reading of the standard. Now, messing with environ > with vfork will mess up the parent process, and if that happens it's a > bug. But getting around it by using fork will harm portability, as the > only reason to bother with vfork at all is that fork might not be > available. That's not the only reason. I used vfork because I measured performance improvements with vfork over fork. I can't remember whether I did the measurements on GNU/Linux or on NetBSD. Performance improvements are not particularly surprising. Despite the fact that, as the Open Group specification suggests, vfork is a broken interface, it was implemented to be faster than fork/exec, and it is faster. 
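As a rough illustration of the pattern under discussion, the sketch below uses vfork and hands the child's environment to execve through its envp argument, so the child never writes to environ (which, after vfork, may be shared with the parent). This is only a minimal sketch under those assumptions, not libiberty's pex-unix code; spawn_and_wait is a hypothetical name introduced here.

```c
#define _DEFAULT_SOURCE   /* ask glibc to declare vfork */
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper (not libiberty's API): run argv[0] with the
   environment envp and return its exit status, or -1 on failure.  */
static int spawn_and_wait(char *const argv[], char *const envp[])
{
    int status;
    pid_t pid = vfork();

    if (pid < 0)
        return -1;              /* vfork itself failed */

    if (pid == 0) {
        /* Child: only exec or _exit here.  The environment is passed
           as an argument instead of being stored into environ.  */
        execve(argv[0], argv, envp);
        _exit(127);             /* reached only if execve failed */
    }

    /* Parent: wait for the child and report how it exited.  */
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Whether this is actually faster than fork followed by exec depends on the system, as the messages in this thread point out; the sketch only shows how the child can avoid touching environ.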
Ian From vagabon.xyz@gmail.com Mon Dec 10 21:05:00 2007 From: vagabon.xyz@gmail.com (Franck Bui-Huu) Date: Mon, 10 Dec 2007 21:05:00 -0000 Subject: Clarification on section variable attribute usage [try #2] Message-ID: <475DA879.4060708@gmail.com> [ example updated ] Hi, Since at least 3.4, the GCC manual says: Use the `section' attribute with an _initialized_ definition of a _global_ variable, as shown in the example. GCC issues a warning and otherwise ignores the `section' attribute in uninitialized variable declarations. but this doesn't seem correct. For example compiling the following tiny program: int foo __attribute__ ((__section__ (".init.data"))); int main(int argc, char **argv) { foo = 4; return 0; } produces no warning and the section attribute is not ignored at all: $ readelf -S a.out | grep -A1 init.data [24] .init.data PROGBITS 000000000060080c 0000080c 0000000000000004 0000000000000000 WA 0 0 4 This is with 4.1.2 from fedora, but I guess other GCC give the same result. Could anybody clarify this point ? Thanks, Franck From michael.meissner@amd.com Mon Dec 10 22:22:00 2007 From: michael.meissner@amd.com (Michael Meissner) Date: Mon, 10 Dec 2007 22:22:00 -0000 Subject: Using -mlittle-endian or -mbig-endian options.... In-Reply-To: <890571.85786.qm@web94110.mail.in2.yahoo.com> References: <890571.85786.qm@web94110.mail.in2.yahoo.com> Message-ID: <20071210210434.GA31179@mmeissner-gold.amd.com> On Mon, Dec 10, 2007 at 09:55:24AM +0000, ashish mahamuni wrote: > Hi, > > I am working on Intel i686 machine > I've Hello_World.c file. > When I give following command compiler gives error > that Invalid Option. > > gcc -mlittle-endian Hello_World.c > or > gcc -mlittle-endian Hello_World.c > > I am using 4.2 version of gcc (Latest one I guess). > How can I use this options? The x86 port does not support -mlittle-endian or -mbig-endian options (the -m options are machine/port specific). Note, all x86's are little endian, so you don't need the switch in this case. 
The -mlittle-endian switch is available on ports that support both little endian and big endian, such as the ARM, MIPS, POWERPC, etc. -- Michael Meissner, AMD 90 Central Street, MS 83-29, Boxborough, MA, 01719, USA michael.meissner@amd.com From mark@codesourcery.com Mon Dec 10 22:35:00 2007 From: mark@codesourcery.com (Mark Mitchell) Date: Mon, 10 Dec 2007 22:35:00 -0000 Subject: PATCH: Update MPFR versions (was Re: Revisiting GCC's minimum MPFR version) In-Reply-To: References: <84fc9c000712091521w11b4ef73i4418cb873496a579@mail.gmail.com> <475C8A24.9060901@codesourcery.com> Message-ID: <475DBC0C.8080204@codesourcery.com> Kaveh R. GHAZI wrote: >>> I would update the recommended version to 2.3.0 and fail for anything less >>> than 2.2.1. > Ok, here's my patch. Since we may have some developers still using 2.2.0, > I'll wait say a week after approval before installing to give them time to > upgrade. > Ok for mainline? OK, under the guidelines you suggest above. Thanks, -- Mark Mitchell CodeSourcery mark@codesourcery.com (650) 331-3385 x713 From schwab@suse.de Mon Dec 10 22:44:00 2007 From: schwab@suse.de (Andreas Schwab) Date: Mon, 10 Dec 2007 22:44:00 -0000 Subject: libiberty/pex-unix vfork abuse? In-Reply-To: <20071210192245.GA3454@caradoc.them.org> (Daniel Jacobowitz's message of "Mon\, 10 Dec 2007 14\:22\:45 -0500") References: <0a8201c83b5a$fa391700$2e08a8c0@CAM.ARTIMI.COM> <20071210191857.GG12580@synopsys.com> <20071210192245.GA3454@caradoc.them.org> Message-ID: Daniel Jacobowitz writes: > On Mon, Dec 10, 2007 at 11:18:57AM -0800, Joe Buck wrote: >> While the standard's wording might need fixing, with every implementation >> of vfork I know of, there are no threads. It's a mechanism for systems >> that don't support fork (or that can only do fork in a horribly >> inefficient way, say because there's no MMU, and no support for copy on >> write), but that support the creation of new processes. > > No, Dave's right. 
On GNU/Linux you can have two threads running on > different processors simultaneously calling vfork. And even with an > MMU it is considerably more efficient than requiring setup of a new > copy-on-write page table. Glibc will map vfork to fork in a multithreaded environment. Andreas. -- Andreas Schwab, SuSE Labs, schwab@suse.de SuSE Linux Products GmbH, Maxfeldstraße 5, 90409 Nürnberg, Germany PGP key fingerprint = 58CA 54C7 6D53 942B 1756 01D3 44D5 214B 8276 4ED5 "And now for something completely different." From gccadmin@gcc.gnu.org Mon Dec 10 22:48:00 2007 From: gccadmin@gcc.gnu.org (gccadmin@gcc.gnu.org) Date: Mon, 10 Dec 2007 22:48:00 -0000 Subject: gcc-4.1-20071210 is now available Message-ID: <20071210224423.11726.qmail@sourceware.org> Snapshot gcc-4.1-20071210 is now available on ftp://gcc.gnu.org/pub/gcc/snapshots/4.1-20071210/ and on various mirrors, see http://gcc.gnu.org/mirrors.html for details. This snapshot has been generated from the GCC 4.1 SVN branch with the following options: svn://gcc.gnu.org/svn/gcc/branches/gcc-4_1-branch revision 130753 You'll find: gcc-4.1-20071210.tar.bz2 Complete GCC (includes all of below) gcc-core-4.1-20071210.tar.bz2 C front end and core compiler gcc-ada-4.1-20071210.tar.bz2 Ada front end and runtime gcc-fortran-4.1-20071210.tar.bz2 Fortran front end and runtime gcc-g++-4.1-20071210.tar.bz2 C++ front end and runtime gcc-java-4.1-20071210.tar.bz2 Java front end and runtime gcc-objc-4.1-20071210.tar.bz2 Objective-C front end and runtime gcc-testsuite-4.1-20071210.tar.bz2 The GCC testsuite Diffs from 4.1-20071203 are available in the diffs/ subdirectory. When a particular snapshot is ready for public consumption the LATEST-4.1 link is updated and a message is sent to the gcc list. Please do not use a snapshot before it has been announced that way. From schwab@suse.de Mon Dec 10 22:56:00 2007 From: schwab@suse.de (Andreas Schwab) Date: Mon, 10 Dec 2007 22:56:00 -0000 Subject: libiberty/pex-unix vfork abuse?
In-Reply-To: <20071210191857.GG12580@synopsys.com> (Joe Buck's message of "Mon\, 10 Dec 2007 11\:18\:57 -0800") References: <0a8201c83b5a$fa391700$2e08a8c0@CAM.ARTIMI.COM> <20071210191857.GG12580@synopsys.com> Message-ID: Joe Buck writes: > While the standard's wording might need fixing, with every implementation > of vfork I know of, there are no threads. It's a mechanism for systems > that don't support fork (or that can only do fork in a horribly > inefficient way, say because there's no MMU, and no support for copy on > write), but that support the creation of new processes. It's just a > cheat to support the traditional fork-followed-by-exec, an aid for porting > Unix code to non-Unix systems. vfork was invented by Unix; it has nothing to do with porting. It was needed to avoid the overhead of copying the address space just to throw the copy away during execve. On modern systems the overhead is almost nonexistent. > The reason vfork blocks the parent until the child makes a new process > or quits is because that's the only supported behavior on systems that > support it; it is not really "blocking the parent" at all as there is > only one process. The only reason the parent needs to be suspended is because it shares the stack with the child. Andreas. -- Andreas Schwab, SuSE Labs, schwab@suse.de SuSE Linux Products GmbH, Maxfeldstraße 5, 90409 Nürnberg, Germany PGP key fingerprint = 58CA 54C7 6D53 942B 1756 01D3 44D5 214B 8276 4ED5 "And now for something completely different." From drow@false.org Mon Dec 10 23:10:00 2007 From: drow@false.org (Daniel Jacobowitz) Date: Mon, 10 Dec 2007 23:10:00 -0000 Subject: libiberty/pex-unix vfork abuse?
In-Reply-To: References: <0a8201c83b5a$fa391700$2e08a8c0@CAM.ARTIMI.COM> <20071210191857.GG12580@synopsys.com> <20071210192245.GA3454@caradoc.them.org> Message-ID: <20071210225553.GA14648@caradoc.them.org> On Mon, Dec 10, 2007 at 11:35:15PM +0100, Andreas Schwab wrote: > Glibc will map vfork to fork in a multithreaded environment. LinuxThreads used to. NPTL does not; this caused various trouble for GDB at the time. -- Daniel Jacobowitz CodeSourcery From hp@bitrange.com Mon Dec 10 23:28:00 2007 From: hp@bitrange.com (Hans-Peter Nilsson) Date: Mon, 10 Dec 2007 23:28:00 -0000 Subject: $prefix/lib/../$target/sys-include not in <> search path Message-ID: <20071210175823.Y61896@dair.pair.com> When configured with just a --prefix=x and --target=y, $prefix/lib/../$target/sys-include used to be searched, for e.g. limits.h, stdio.h and stdlib.h. No $prefix-rooted path shows up as a "ignoring nonexistent directory" message either. I don't know when this changed, but it doesn't seem like a deliberate change. Or at least, not an improvement. I'll open a PR if I can't resolve this. brgds, H-P From dave.korn@artimi.com Tue Dec 11 03:31:00 2007 From: dave.korn@artimi.com (Dave Korn) Date: Tue, 11 Dec 2007 03:31:00 -0000 Subject: libiberty/pex-unix vfork abuse? In-Reply-To: <475D9B0E.25E2A77@dessent.net> References: <05ea01c838e3$001b9b40$2e08a8c0@CAM.ARTIMI.COM> <060401c838f3$f6333aa0$2e08a8c0@CAM.ARTIMI.COM> <0a8201c83b5a$fa391700$2e08a8c0@CAM.ARTIMI.COM> <475D9B0E.25E2A77@dessent.net> Message-ID: <0aaf01c83b84$526ff550$2e08a8c0@CAM.ARTIMI.COM> On 10 December 2007 20:01, Brian Dessent wrote: > Andreas Schwab wrote: > >> Typically in a multithreaded environment vfork is mapped to fork anyway. > > ...which is what I don't understand about this whole thread. It seems > Dave is seeing some strange behavior in Cygwin, but Cygwin's vfork = > fork, there is no difference. 
There used to be a vfork specialization > in Cygwin, but it is broken and has not been enabled in quite a long > time. Yes, I've noticed it's #ifdef'd out now I've been through the code more thoroughly, and it's also the case that in the old version of gcc/libiberty I'm using the manipulation of the environ variable isn't there either, so I don't have a current problem to solve myself, but some of the embedded guys might want to keep an eye out for it. cheers, DaveK -- Can't think of a witty .sigline today.... From hp@bitrange.com Tue Dec 11 05:29:00 2007 From: hp@bitrange.com (Hans-Peter Nilsson) Date: Tue, 11 Dec 2007 05:29:00 -0000 Subject: $prefix/lib/../$target/sys-include not in <> search path In-Reply-To: <20071210175823.Y61896@dair.pair.com> References: <20071210175823.Y61896@dair.pair.com> Message-ID: <20071210222210.O34784@dair.pair.com> On Mon, 10 Dec 2007, Hans-Peter Nilsson wrote: > When configured with just a --prefix=x and --target=y, > $prefix/lib/../$target/sys-include used to be searched, for e.g. > limits.h, stdio.h and stdlib.h. No $prefix-rooted path shows > up as a "ignoring nonexistent directory" message either. > > I don't know when this changed, but it doesn't seem like a > deliberate change. Or at least, not an improvement. > > I'll open a PR if I can't resolve this. I wouldn't call it a resolution, but it seems an installed toolchain will work. Equivalently: mkdir -p $prefix/lib/gcc/$target/$gcc_version e.g. mkdir -p $prefix/lib/gcc/cris-axis-linux-gnu/4.3.0 should let gcc find $prefix/sys-include, because it wants to go back and forth: $prefix/lib/gcc/cris-axis-linux-gnu/4.3.0/../../../../cris-axis-linux-gnu/sys-include IIRC it used to be this bad, then something or other changed to let it find $prefix/sys-include without the intermediate directories, and now it's bad again. (Bad in that I have to write into $prefix in order to test, and have to keep a $gcc_version directory there.) 
Or perhaps better hack the testsuite to add a -idirafter or something. Suggestions welcome. Except "use buildroot" (not even sure it'd help). brgds, H-P From jonsmirl@gmail.com Tue Dec 11 07:01:00 2007 From: jonsmirl@gmail.com (Jon Smirl) Date: Tue, 11 Dec 2007 07:01:00 -0000 Subject: Something is broken in repack In-Reply-To: <9e4733910712102125w56c70c0cxb8b00a060b62077@mail.gmail.com> References: <9e4733910712071505y6834f040k37261d65a2d445c4@mail.gmail.com> <9e4733910712101825l33cdc2c0mca2ddbfd5afdb298@mail.gmail.com> <9e4733910712102125w56c70c0cxb8b00a060b62077@mail.gmail.com> Message-ID: <9e4733910712102129v140c2affqf2e73e75855b61ea@mail.gmail.com> I added the gcc people to the CC, it's their repository. Maybe they can help up sort this out. On 12/11/07, Jon Smirl wrote: > On 12/10/07, Nicolas Pitre wrote: > > On Mon, 10 Dec 2007, Jon Smirl wrote: > > > > > New run using same configuration. With the addition of the more > > > efficient load balancing patches and delta cache accounting. > > > > > > Seconds are wall clock time. They are lower since the patch made > > > threading better at using all four cores. I am stuck at 380-390% CPU > > > utilization for the git process. > > > > > > complete seconds RAM > > > 10% 60 900M (includes counting) > > > 20% 15 900M > > > 30% 15 900M > > > 40% 50 1.2G > > > 50% 80 1.3G > > > 60% 70 1.7G > > > 70% 140 1.8G > > > 80% 180 2.0G > > > 90% 280 2.2G > > > 95% 530 2.8G - 1,420 total to here, previous was 1,983 > > > 100% 1390 2.85G > > > During the writing phase RAM fell to 1.6G > > > What is being freed in the writing phase?? > > > > The cached delta results, but you put a cap of 256MB for them. > > > > Could you try again with that cache disabled entirely, with > > pack.deltacachesize = 1 (don't use 0 as that means unbounded). > > > > And then, while still keeping the delta cache disabled, could you try > > with pack.threads = 2, and pack.threads = 1 ? 
> > > > I'm sorry to ask you to do this but I don't have enough ram to even > > complete a repack with threads=2 so I'm reattempting single threaded at > > the moment. But I really wonder if the threading has such an effect on > > memory usage. > > I already have a threads = 1 running with this config. Binary and > config were same from threads=4 run. > > 10% 28min 950M > 40% 135min 950M > 50% 157min 900M > 60% 160min 830M > 100% 170min 830M > > Something is hurting bad with threads. 170 CPU minutes with one > thread, versus 195 CPU minutes with four threads. > > Is there a different memory allocator that can be used when > multithreaded on gcc? This whole problem may be coming from the memory > allocation function. git is hardly interacting at all on the thread > level so it's likely a problem in the C run-time. > > [core] > repositoryformatversion = 0 > filemode = true > bare = false > logallrefupdates = true > [pack] > threads = 1 > deltacachesize = 256M > windowmemory = 256M > deltacachelimit = 0 > [remote "origin"] > url = git://git.infradead.org/gcc.git > fetch = +refs/heads/*:refs/remotes/origin/* > [branch "trunk"] > remote = origin > merge = refs/heads/trunk > > > > > > > > > > > > > > > > I have no explanation for the change in RAM usage. Two guesses come to > > > mind. Memory fragmentation. Or the change in the way the work was > > > split up altered RAM usage. > > > > > > Total CPU time was 195 minutes in 70 minutes clock time. About 70% > > > efficient. During the compress phase all four cores were active until > > > the last 90 seconds. Writing the objects took over 23 minutes CPU > > > bound on one core. > > > > > > New pack file is: 270,594,853 > > > Old one was: 344,543,752 > > > It still has 828,660 objects > > > > You mean the pack for the gcc repo is now less than 300MB? Wow. 
> > > > > > Nicolas > > > > > -- > Jon Smirl > jonsmirl@gmail.com > -- Jon Smirl jonsmirl@gmail.com From jonsmirl@gmail.com Tue Dec 11 07:34:00 2007 From: jonsmirl@gmail.com (Jon Smirl) Date: Tue, 11 Dec 2007 07:34:00 -0000 Subject: Something is broken in repack In-Reply-To: <9e4733910712102129v140c2affqf2e73e75855b61ea@mail.gmail.com> References: <9e4733910712071505y6834f040k37261d65a2d445c4@mail.gmail.com> <9e4733910712101825l33cdc2c0mca2ddbfd5afdb298@mail.gmail.com> <9e4733910712102125w56c70c0cxb8b00a060b62077@mail.gmail.com> <9e4733910712102129v140c2affqf2e73e75855b61ea@mail.gmail.com> Message-ID: <9e4733910712102301p5e6c4165v6afb32d157478828@mail.gmail.com> Switching to the Google perftools malloc http://goog-perftools.sourceforge.net/ 10% 30 828M 20% 15 831M 30% 10 834M 40% 50 1014M 50% 80 1086M 60% 80 1500M 70% 200 1.53G 80% 200 1.85G 90% 260 1.87G 95% 520 1.97G 100% 1335 2.24G Google allocator knocked 600MB off from memory use. Memory consumption did not fall during the write out phase like it did with gcc. Since all of this is with the same code except for changing the threading split, those runs where memory consumption went to 4.5GB with the gcc allocator must have triggered an extreme problem with fragmentation. Total CPU time 196 CPU minutes vs 190 for gcc. Google's claims of being faster are not true. So why does our threaded code take 20 CPU minutes longer (12%) to run than the same code with a single thread? Clock time is obviously faster. Are the threads working too close to each other in memory and bouncing cache lines between the cores? Q6600 is just two E6600s in the same package, the caches are not shared. Why does the threaded code need 2.24GB (google allocator, 2.85GB gcc) with 4 threads? But only need 950MB with one thread? Where's the extra gigabyte going? Is there another allocator to try? One that combines Google's efficiency with gcc's speed? On 12/11/07, Jon Smirl wrote: > I added the gcc people to the CC, it's their repository. 
Maybe they > can help up sort this out. > > On 12/11/07, Jon Smirl wrote: > > On 12/10/07, Nicolas Pitre wrote: > > > On Mon, 10 Dec 2007, Jon Smirl wrote: > > > > > > > New run using same configuration. With the addition of the more > > > > efficient load balancing patches and delta cache accounting. > > > > > > > > Seconds are wall clock time. They are lower since the patch made > > > > threading better at using all four cores. I am stuck at 380-390% CPU > > > > utilization for the git process. > > > > > > > > complete seconds RAM > > > > 10% 60 900M (includes counting) > > > > 20% 15 900M > > > > 30% 15 900M > > > > 40% 50 1.2G > > > > 50% 80 1.3G > > > > 60% 70 1.7G > > > > 70% 140 1.8G > > > > 80% 180 2.0G > > > > 90% 280 2.2G > > > > 95% 530 2.8G - 1,420 total to here, previous was 1,983 > > > > 100% 1390 2.85G > > > > During the writing phase RAM fell to 1.6G > > > > What is being freed in the writing phase?? > > > > > > The cached delta results, but you put a cap of 256MB for them. > > > > > > Could you try again with that cache disabled entirely, with > > > pack.deltacachesize = 1 (don't use 0 as that means unbounded). > > > > > > And then, while still keeping the delta cache disabled, could you try > > > with pack.threads = 2, and pack.threads = 1 ? > > > > > > I'm sorry to ask you to do this but I don't have enough ram to even > > > complete a repack with threads=2 so I'm reattempting single threaded at > > > the moment. But I really wonder if the threading has such an effect on > > > memory usage. > > > > I already have a threads = 1 running with this config. Binary and > > config were same from threads=4 run. > > > > 10% 28min 950M > > 40% 135min 950M > > 50% 157min 900M > > 60% 160min 830M > > 100% 170min 830M > > > > Something is hurting bad with threads. 170 CPU minutes with one > > thread, versus 195 CPU minutes with four threads. > > > > Is there a different memory allocator that can be used when > > multithreaded on gcc? 
This whole problem may be coming from the memory > > allocation function. git is hardly interacting at all on the thread > > level so it's likely a problem in the C run-time. > > > > [core] > > repositoryformatversion = 0 > > filemode = true > > bare = false > > logallrefupdates = true > > [pack] > > threads = 1 > > deltacachesize = 256M > > windowmemory = 256M > > deltacachelimit = 0 > > [remote "origin"] > > url = git://git.infradead.org/gcc.git > > fetch = +refs/heads/*:refs/remotes/origin/* > > [branch "trunk"] > > remote = origin > > merge = refs/heads/trunk > > > > > > > > > > > > > > > > > > > > > > > > > I have no explanation for the change in RAM usage. Two guesses come to > > > > mind. Memory fragmentation. Or the change in the way the work was > > > > split up altered RAM usage. > > > > > > > > Total CPU time was 195 minutes in 70 minutes clock time. About 70% > > > > efficient. During the compress phase all four cores were active until > > > > the last 90 seconds. Writing the objects took over 23 minutes CPU > > > > bound on one core. > > > > > > > > New pack file is: 270,594,853 > > > > Old one was: 344,543,752 > > > > It still has 828,660 objects > > > > > > You mean the pack for the gcc repo is now less than 300MB? Wow. 
> > > > > > > > > Nicolas > > > > > > > > > -- > > Jon Smirl > > jonsmirl@gmail.com > > > > > -- > Jon Smirl > jonsmirl@gmail.com > -- Jon Smirl jonsmirl@gmail.com From ae@op5.se Tue Dec 11 11:11:00 2007 From: ae@op5.se (Andreas Ericsson) Date: Tue, 11 Dec 2007 11:11:00 -0000 Subject: Something is broken in repack In-Reply-To: <9e4733910712102301p5e6c4165v6afb32d157478828@mail.gmail.com> References: <9e4733910712071505y6834f040k37261d65a2d445c4@mail.gmail.com> <9e4733910712101825l33cdc2c0mca2ddbfd5afdb298@mail.gmail.com> <9e4733910712102125w56c70c0cxb8b00a060b62077@mail.gmail.com> <9e4733910712102129v140c2affqf2e73e75855b61ea@mail.gmail.com> <9e4733910712102301p5e6c4165v6afb32d157478828@mail.gmail.com> Message-ID: <475E3D86.9030408@op5.se> Jon Smirl wrote: > Switching to the Google perftools malloc > http://goog-perftools.sourceforge.net/ > > Google allocator knocked 600MB off from memory use. > Memory consumption did not fall during the write out phase like it did with gcc. > > Since all of this is with the same code except for changing the > threading split, those runs where memory consumption went to 4.5GB > with the gcc allocator must have triggered an extreme problem with > fragmentation. > > Total CPU time 196 CPU minutes vs 190 for gcc. Google's claims of > being faster are not true. > Did you use the tcmalloc with heap checker/profiler, or tcmalloc_minimal? 
-- Andreas Ericsson andreas.ericsson@op5.se OP5 AB www.op5.se Tel: +46 8-230225 Fax: +46 8-230231 From r.emrich@de.tecosim.com Tue Dec 11 13:31:00 2007 From: r.emrich@de.tecosim.com (Rainer Emrich) Date: Tue, 11 Dec 2007 13:31:00 -0000 Subject: Bootstrap Failure in trunk (fortran) Message-ID: <475E704E.8010700@de.tecosim.com> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 /SCRATCH/gcc-build/Linux/i686-pc-linux-gnu/gcc-4.3.0/gcc-4.3.0/./prev-gcc/xgcc - -B/SCRATCH/gcc-build/Linux/i686-pc-linux-gnu/gcc-4.3.0/gcc-4.3.0/./prev-gcc/ - -B/opt/gcc/Linux/i686-pc-linux-gnu/gcc-4.3.0/i686-pc-linux-gnu/bin/ -c -g -O2 - -fomit-frame-pointer -DIN_GCC -W -Wall -Wwrite-strings -Wstrict-prototypes - -Wmissing-prototypes -Wold-style-definition -Wmissing-format-attribute -pedantic - -Wno-long-long -Wno-variadic-macros - -Wno-overlength-strings -Werror -DHAVE_CONFIG_H -I. -Ifortran - -I/home/em/devel/projects/develtools/src/gcc-4.3.0/gcc - -I/home/em/devel/projects/develtools/src/gcc-4.3.0/gcc/fortran - -I/home/em/devel/projects/develtools/src/gcc-4.3.0/gcc/../include - -I/home/em/devel/projects/develtools/src/gcc-4.3.0/gcc/../libcpp/include - -I/SCRATCH/gcc-build/Linux/i686-pc-linux-gnu/install/include - -I/SCRATCH/gcc-build/Linux/i686-pc-linux-gnu/install/include - -I/home/em/devel/projects/develtools/src/gcc-4.3.0/gcc/../libdecnumber - -I/home/em/devel/projects/develtools/src/gcc-4.3.0/gcc/../libdecnumber/bid - -I../libdecnumber /home/em/devel/projects/develtools/src/gcc-4.3.0/gcc/fortran/decl.c -o fortran/decl.o cc1: warnings being treated as errors /home/em/devel/projects/develtools/src/gcc-4.3.0/gcc/fortran/decl.c: In function ‘add_global_entry’: /home/em/devel/projects/develtools/src/gcc-4.3.0/gcc/fortran/decl.c:4344: error: comparison between signed and unsigned gmake[3]: *** [fortran/decl.o] Error 1 gmake[3]: Leaving directory `/SCRATCH/gcc-build/Linux/i686-pc-linux-gnu/gcc-4.3.0/gcc-4.3.0/gcc' caused by: 2007-12-11 Bernhard Fischer * decl.c (match_prefix): Make seen_type
a boolean. (add_global_entry): Cache type distinction. * trans-decl.c: Whitespace cleanup. Rainer -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.7 (MingW32) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org iD8DBQFHXnBO3s6elE6CYeURAiWUAKCkrkYFxlt9sr2gN4fN93L+KKyZ7QCfTIBz 42ApNC7+HlrqQgbpX3tNNMI= =pmC0 -----END PGP SIGNATURE----- From nico@cam.org Tue Dec 11 13:49:00 2007 From: nico@cam.org (Nicolas Pitre) Date: Tue, 11 Dec 2007 13:49:00 -0000 Subject: Something is broken in repack In-Reply-To: <9e4733910712102129v140c2affqf2e73e75855b61ea@mail.gmail.com> References: <9e4733910712071505y6834f040k37261d65a2d445c4@mail.gmail.com> <9e4733910712101825l33cdc2c0mca2ddbfd5afdb298@mail.gmail.com> <9e4733910712102125w56c70c0cxb8b00a060b62077@mail.gmail.com> <9e4733910712102129v140c2affqf2e73e75855b61ea@mail.gmail.com> Message-ID: On Tue, 11 Dec 2007, Jon Smirl wrote: > I added the gcc people to the CC, it's their repository. Maybe they > can help up sort this out. Unless there is a Git expert amongst the gcc crowd, I somehow doubt it. And gcc people with an interest in Git internals are probably already on the Git mailing list. 
Nicolas From nico@cam.org Tue Dec 11 15:01:00 2007 From: nico@cam.org (Nicolas Pitre) Date: Tue, 11 Dec 2007 15:01:00 -0000 Subject: Something is broken in repack In-Reply-To: <9e4733910712102301p5e6c4165v6afb32d157478828@mail.gmail.com> References: <9e4733910712071505y6834f040k37261d65a2d445c4@mail.gmail.com> <9e4733910712101825l33cdc2c0mca2ddbfd5afdb298@mail.gmail.com> <9e4733910712102125w56c70c0cxb8b00a060b62077@mail.gmail.com> <9e4733910712102129v140c2affqf2e73e75855b61ea@mail.gmail.com> <9e4733910712102301p5e6c4165v6afb32d157478828@mail.gmail.com> Message-ID: On Tue, 11 Dec 2007, Jon Smirl wrote: > Switching to the Google perftools malloc > http://goog-perftools.sourceforge.net/ > > 10% 30 828M > 20% 15 831M > 30% 10 834M > 40% 50 1014M > 50% 80 1086M > 60% 80 1500M > 70% 200 1.53G > 80% 200 1.85G > 90% 260 1.87G > 95% 520 1.97G > 100% 1335 2.24G > > Google allocator knocked 600MB off from memory use. > Memory consumption did not fall during the write out phase like it did with gcc. > > Since all of this is with the same code except for changing the > threading split, those runs where memory consumption went to 4.5GB > with the gcc allocator must have triggered an extreme problem with > fragmentation. Did you mean the glibc allocator? > Total CPU time 196 CPU minutes vs 190 for gcc. Google's claims of > being faster are not true. > > So why does our threaded code take 20 CPU minutes longer (12%) to run > than the same code with a single thread? Clock time is obviously > faster. Are the threads working too close to each other in memory and > bouncing cache lines between the cores? Q6600 is just two E6600s in > the same package, the caches are not shared. Of course there'll always be a certain amount of wasted cycles when threaded. The locking overhead, the extra contention for IO, etc. So 12% overhead (3% per thread) when using 4 threads is not that bad I would say. > Why does the threaded code need 2.24GB (google allocator, 2.85GB gcc) > with 4 threads? 
But only need 950MB with one thread? Where's the extra > gigabyte going? I really don't know. Did you try with pack.deltacachesize set to 1? And yet, this is still missing the actual issue. The issue being that the 2.1GB pack as a _source_ doesn't cause as much memory to be allocated even if the _result_ pack ends up being the same. I was able to repack the 2.1GB pack on my machine which has 1GB of ram. Now that it has been repacked, I can't repack it anymore, even when single threaded, as it starts crawling into swap fairly quickly. It is really non intuitive and actually senseless that Git would require twice as much RAM to deal with a pack that is 7 times smaller. Nicolas (still puzzled) From nico@cam.org Tue Dec 11 15:36:00 2007 From: nico@cam.org (Nicolas Pitre) Date: Tue, 11 Dec 2007 15:36:00 -0000 Subject: Something is broken in repack In-Reply-To: References: <9e4733910712071505y6834f040k37261d65a2d445c4@mail.gmail.com> <9e4733910712101825l33cdc2c0mca2ddbfd5afdb298@mail.gmail.com> <9e4733910712102125w56c70c0cxb8b00a060b62077@mail.gmail.com> <9e4733910712102129v140c2affqf2e73e75855b61ea@mail.gmail.com> <9e4733910712102301p5e6c4165v6afb32d157478828@mail.gmail.com> Message-ID: On Tue, 11 Dec 2007, Nicolas Pitre wrote: > And yet, this is still missing the actual issue. The issue being that > the 2.1GB pack as a _source_ doesn't cause as much memory to be > allocated even if the _result_ pack ends up being the same. > > I was able to repack the 2.1GB pack on my machine which has 1GB of ram. > Now that it has been repacked, I can't repack it anymore, even when > single threaded, as it starts crawling into swap fairly quickly. It is > really non intuitive and actually senseless that Git would require twice > as much RAM to deal with a pack that is 7 times smaller.
OK, here's something else for you to try: core.deltabasecachelimit=0 pack.threads=2 pack.deltacachesize=1 With that I'm able to repack the small gcc pack on my machine with 1GB of ram using: git repack -a -f -d --window=250 --depth=250 and top reports a ~700m virt and ~500m res without hitting swap at all. It is only at 25% so far, but I was unable to get that far before. Would be curious to know what you get with 4 threads on your machine. Nicolas From jonsmirl@gmail.com Tue Dec 11 16:20:00 2007 From: jonsmirl@gmail.com (Jon Smirl) Date: Tue, 11 Dec 2007 16:20:00 -0000 Subject: Something is broken in repack In-Reply-To: References: <9e4733910712071505y6834f040k37261d65a2d445c4@mail.gmail.com> <9e4733910712101825l33cdc2c0mca2ddbfd5afdb298@mail.gmail.com> <9e4733910712102125w56c70c0cxb8b00a060b62077@mail.gmail.com> <9e4733910712102129v140c2affqf2e73e75855b61ea@mail.gmail.com> <9e4733910712102301p5e6c4165v6afb32d157478828@mail.gmail.com> Message-ID: <9e4733910712110736w34495ba2l86b2de82055620fd@mail.gmail.com> On 12/11/07, Nicolas Pitre wrote: > On Tue, 11 Dec 2007, Nicolas Pitre wrote: > > > And yet, this is still missing the actual issue. The issue being that > > the 2.1GB pack as a _source_ doesn't cause as much memory to be > > allocated even if the _result_ pack ends up being the same. > > > > I was able to repack the 2.1GB pack on my machine which has 1GB of ram. > > Now that it has been repacked, I can't repack it anymore, even when > > single threaded, as it starts crawling into swap fairly quickly. It is > > really non intuitive and actually senseless that Git would require twice > > as much RAM to deal with a pack that is 7 times smaller.
> > OK, here's something else for you to try: > > core.deltabasecachelimit=0 > pack.threads=2 > pack.deltacachesize=1 > > With that I'm able to repack the small gcc pack on my machine with 1GB > of ram using: > > git repack -a -f -d --window=250 --depth=250 > > and top reports a ~700m virt and ~500m res without hitting swap at all. > It is only at 25% so far, but I was unable to get that far before. > > Would be curious to know what you get with 4 threads on your machine. Changing those parameters really slowed down counting the objects. I used to be able to count in 45 seconds; now it took 130 seconds. I still have the Google allocator linked in. 4 threads, cumulative clock time 25% 200 seconds, 820/627M 55% 510 seconds, 1240/1000M - little late recording 75% 15 minutes, 1658/1500M 90% 22 minutes, 1974/1800M it's still running but there is no significant change. Are two types of allocations being mixed? 1) long term, global objects kept until the end of everything 2) volatile, private objects allocated only while the object is being compressed and then freed Separating these would make a big difference to the fragmentation problem. Single threading probably wouldn't see a fragmentation problem from mixing the allocation types. When a thread is created it could allocate a private 20MB (or whatever) pool. The volatile, private objects would come from that pool. Long term objects would stay in the global pool. Since they are long term they will just get laid down sequentially in memory. Separating these allocation types makes things way easier for malloc. CPU time would be helped by removing some of the locking if possible.
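One way to picture the split suggested above, as a minimal sketch and not git's actual code: each worker thread owns a fixed-size bump-pointer pool for its short-lived buffers and resets it after each object, while long-lived data keeps coming from the global malloc. The names `thread_pool`, `pool_init`, `pool_alloc`, `pool_reset` and `pool_destroy` are hypothetical, and the pool size is whatever the caller picks (20MB was only the figure floated above).

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical per-thread pool: one block, handed out front to back. */
struct thread_pool {
    unsigned char *base;  /* start of the private block */
    size_t size;          /* total bytes in the block */
    size_t used;          /* bytes handed out so far */
};

/* Grab the private block once, when the worker thread starts. */
int pool_init(struct thread_pool *p, size_t size)
{
    p->base = malloc(size);
    p->size = p->base ? size : 0;
    p->used = 0;
    return p->base ? 0 : -1;
}

/* Bump-pointer allocation; no lock, since only one thread uses the pool. */
void *pool_alloc(struct thread_pool *p, size_t n)
{
    size_t aligned = (n + 15) & ~(size_t)15;  /* round up to 16 bytes */
    void *ptr;

    if (aligned < n || aligned > p->size - p->used)
        return NULL;  /* pool exhausted: caller falls back to malloc() */

    ptr = p->base + p->used;
    p->used += aligned;
    return ptr;
}

/* Drop every volatile allocation at once, e.g. after one object is done. */
void pool_reset(struct thread_pool *p)
{
    p->used = 0;
}

/* Return the block to the system when the thread exits. */
void pool_destroy(struct thread_pool *p)
{
    free(p->base);
    p->base = NULL;
    p->size = 0;
    p->used = 0;
}
```

Because the volatile buffers never touch the shared heap, they cannot leave holes between the long-term objects, and they need no allocator lock either. Whether this would actually cure the memory growth seen here is untested.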
-- Jon Smirl jonsmirl@gmail.com From nico@cam.org Tue Dec 11 16:22:00 2007 From: nico@cam.org (Nicolas Pitre) Date: Tue, 11 Dec 2007 16:22:00 -0000 Subject: Something is broken in repack In-Reply-To: References: <9e4733910712071505y6834f040k37261d65a2d445c4@mail.gmail.com> <9e4733910712101825l33cdc2c0mca2ddbfd5afdb298@mail.gmail.com> <9e4733910712102125w56c70c0cxb8b00a060b62077@mail.gmail.com> <9e4733910712102129v140c2affqf2e73e75855b61ea@mail.gmail.com> <9e4733910712102301p5e6c4165v6afb32d157478828@mail.gmail.com> Message-ID: On Tue, 11 Dec 2007, Nicolas Pitre wrote: > OK, here's something else for you to try: > > core.deltabasecachelimit=0 > pack.threads=2 > pack.deltacachesize=1 > > With that I'm able to repack the small gcc pack on my machine with 1GB > of ram using: > > git repack -a -f -d --window=250 --depth=250 > > and top reports a ~700m virt and ~500m res without hitting swap at all. > It is only at 25% so far, but I was unable to get that far before. Well, around 55% memory usage skyrocketed to 1.6GB and the system went deep into swap. So I restarted it with no threads. 
Nicolas (even more puzzled) From jonsmirl@gmail.com Tue Dec 11 16:34:00 2007 From: jonsmirl@gmail.com (Jon Smirl) Date: Tue, 11 Dec 2007 16:34:00 -0000 Subject: Something is broken in repack In-Reply-To: References: <9e4733910712071505y6834f040k37261d65a2d445c4@mail.gmail.com> <9e4733910712101825l33cdc2c0mca2ddbfd5afdb298@mail.gmail.com> <9e4733910712102125w56c70c0cxb8b00a060b62077@mail.gmail.com> <9e4733910712102129v140c2affqf2e73e75855b61ea@mail.gmail.com> <9e4733910712102301p5e6c4165v6afb32d157478828@mail.gmail.com> Message-ID: <9e4733910712110821o7748802ag75d9df4be8b2c123@mail.gmail.com> On 12/11/07, Nicolas Pitre wrote: > On Tue, 11 Dec 2007, Nicolas Pitre wrote: > > > OK, here's something else for you to try: > > > > core.deltabasecachelimit=0 > > pack.threads=2 > > pack.deltacachesize=1 > > > > With that I'm able to repack the small gcc pack on my machine with 1GB > > of ram using: > > > > git repack -a -f -d --window=250 --depth=250 > > > > and top reports a ~700m virt and ~500m res without hitting swap at all. > > It is only at 25% so far, but I was unable to get that far before. > > Well, around 55% memory usage skyrocketed to 1.6GB and the system went > deep into swap. So I restarted it with no threads. > > Nicolas (even more puzzled) On the plus side you are seeing what I see, so it proves I am not imagining it. 
-- Jon Smirl jonsmirl@gmail.com From torvalds@linux-foundation.org Tue Dec 11 17:19:00 2007 From: torvalds@linux-foundation.org (Linus Torvalds) Date: Tue, 11 Dec 2007 17:19:00 -0000 Subject: Something is broken in repack In-Reply-To: <9e4733910712102301p5e6c4165v6afb32d157478828@mail.gmail.com> References: <9e4733910712071505y6834f040k37261d65a2d445c4@mail.gmail.com> <9e4733910712101825l33cdc2c0mca2ddbfd5afdb298@mail.gmail.com> <9e4733910712102125w56c70c0cxb8b00a060b62077@mail.gmail.com> <9e4733910712102129v140c2affqf2e73e75855b61ea@mail.gmail.com> <9e4733910712102301p5e6c4165v6afb32d157478828@mail.gmail.com> Message-ID: On Tue, 11 Dec 2007, Jon Smirl wrote: > > So why does our threaded code take 20 CPU minutes longer (12%) to run > than the same code with a single thread? Threaded code *always* takes more CPU time. The only thing you can hope for is a wall-clock reduction. You're seeing probably a combination of (a) more cache misses (b) bigger dataset active at a time and a probably fairly miniscule (c) threading itself tends to have some overheads. > Q6600 is just two E6600s in the same package, the caches are not shared. Sure they are shared. They're just not *entirely* shared. But they are shared between each two cores, so each thread essentially has only half the cache they had with the non-threaded version. Threading is *not* a magic solution to all problems. It gives you potentially twice the CPU power, but there are real downsides that you should keep in mind. > Why does the threaded code need 2.24GB (google allocator, 2.85GB gcc) > with 4 threads? But only need 950MB with one thread? Where's the extra > gigabyte going? I suspect that it's really simple: you have a few rather big files in the gcc history, with deep delta chains. And what happens when you have four threads running at the same time is that they all need to keep all those objects that they are working on - and their hash state - in memory at the same time! 
So if you want to use more threads, that _forces_ you to have a bigger memory footprint, simply because you have more "live" objects that you work on. Normally, that isn't much of a problem, since most source files are small, but if you have a few deep delta chains on big files, both the delta chain itself is going to use memory (you may have limited the size of the cache, but it's still needed for the actual delta generation, so it's not like the memory usage went away). That said, I suspect there are a few things fighting you: - threading is hard. I haven't looked a lot at the changes Nico did to do a threaded object packer, but what I've seen does not convince me it is correct. The "trg_entry" accesses are *mostly* protected with "cache_lock", but nothing else really seems to be, so quite frankly, I wouldn't trust the threaded version very much. It's off by default, and for a good reason, I think. For example: the packing code does this: if (!src->data) { read_lock(); src->data = read_sha1_file(src_entry->idx.sha1, &type, &sz); read_unlock(); ... and that's racy. If two threads come in at roughly the same time and see a NULL src->data, they'll both get the lock, and they'll both (serially) try to fill it in. It will all *work*, but one of them will have done unnecessary work, and one of them will have their result thrown away and leaked. Are you hitting issues like this? I dunno. The object sorting means that different threads normally shouldn't look at the same objects (not even the sources), so probably not, but basically, I wouldn't trust the threading 100%. It needs work, and it needs to stay off by default. - you're working on a problem that isn't really even worth optimizing that much. The *normal* case is to re-use old deltas, which makes all of the issues you are fighting basically go away (because you only have a few _incremental_ objects that need deltaing).
In other words: the _real_ optimizations have already been done, and are done elsewhere, and are much smarter (the best way to optimize X is not to make X run fast, but to avoid doing X in the first place!). The thing you are trying to work with is the one-time-only case where you explicitly disable that big and important optimization, and then you complain about the end result being slow! It's like saying that you're compiling with extreme debugging and no optimizations, and then complaining that the end result doesn't run as fast as if you used -O2. Except this is a hundred times worse, because you literally asked git to do the really expensive thing that it really really doesn't want to do ;) > Is there another allocator to try? One that combines Google's > efficiency with gcc's speed? See above: I'd look around at threading-related bugs and check the way we lock (or don't) accesses. Linus From bmei@broadcom.com Tue Dec 11 17:21:00 2007 From: bmei@broadcom.com (Bingfeng Mei) Date: Tue, 11 Dec 2007 17:21:00 -0000 Subject: error: no data type for mode ".." Message-ID: <2E073B3ABB3F664DBA1D1C4D5FB47EF407769805@NT-IRVA-0752.brcm.ad.broadcom.com> Hello, I tried to define a new machine mode for a data type only allocated to certain registers, e.g., MAC registers. I first used an unused PDI mode (same as Blackfin porting). In target-modes.def file: PARTIAL_INT_MODE (DI); Then in my test program, I tried to define a new data type using PDI mode. typedef int __attribute__ ((mode (PDI))) MREG; But GCC reports error: tst.c:13: error: no data type for mode 'PDI' On this typedef statement. How can I use the newly defined MODE to specify a new data type? I cannot find any example for Blackfin, where PDI mode is used to represent 40-bit MAC similarly. I also tried to define a new INT_MODE by: INT_MODE(PDI, 8). The error message is the same. Any hint? Thanks in advance.
Cheers, Bingfeng Mei Broadcom UK From nico@cam.org Tue Dec 11 17:24:00 2007 From: nico@cam.org (Nicolas Pitre) Date: Tue, 11 Dec 2007 17:24:00 -0000 Subject: Something is broken in repack In-Reply-To: References: <9e4733910712071505y6834f040k37261d65a2d445c4@mail.gmail.com> <9e4733910712101825l33cdc2c0mca2ddbfd5afdb298@mail.gmail.com> <9e4733910712102125w56c70c0cxb8b00a060b62077@mail.gmail.com> <9e4733910712102129v140c2affqf2e73e75855b61ea@mail.gmail.com> <9e4733910712102301p5e6c4165v6afb32d157478828@mail.gmail.com> Message-ID: On Tue, 11 Dec 2007, Linus Torvalds wrote: > That said, I suspect there are a few things fighting you: > > - threading is hard. I haven't looked a lot at the changes Nico did to do > a threaded object packer, but what I've seen does not convince me it is > correct. The "trg_entry" accesses are *mostly* protected with > "cache_lock", but nothing else really seems to be, so quite frankly, I > wouldn't trust the threaded version very much. It's off by default, and > for a good reason, I think. I beg to differ (of course, since I always know precisely what I do, and like you, my code never has bugs). Seriously though, the trg_entry has not to be protected at all. Why? Simply because each thread has its own exclusive set of objects which no other threads ever mess with. They never overlap. > For example: the packing code does this: > > if (!src->data) { > read_lock(); > src->data = read_sha1_file(src_entry->idx.sha1, &type, &sz); > read_unlock(); > ... > > and that's racy. If two threads come in at roughly the same time and > see a NULL src->data, they'll both get the lock, and they'll both > (serially) try to fill it in. It will all *work*, but one of them will > have done unnecessary work, and one of them will have their result > thrown away and leaked. No. Once again, it is impossible for two threads to ever see the same src->data at all. The lock is there simply because read_sha1_file() is not reentrant. > Are you hitting issues like this?
I dunno. The object sorting means > that different threads normally shouldn't look at the same objects (not > even the sources), so probably not, but basically, I wouldn't trust the > threading 100%. It needs work, and it needs to stay off by default. For now it is, but I wouldn't say it really needs significant work at this point. The latest thread patches were more about tuning than correctness. What the threading could be doing, though, is uncovering some other bugs, like in the pack mmap windowing code for example. Although that code is serialized by the read lock above, the fact that multiple threads are hammering on it in turns means that the mmap window is possibly seeking back and forth much more often than otherwise, possibly leaking something in the process. > - you're working on a problem that isn't really even worth optimizing > that much. The *normal* case is to re-use old deltas, which makes all > of the issues you are fighting basically go away (because you only have > a few _incremental_ objects that need deltaing). > > In other words: the _real_ optimizations have already been done, and > are done elsewhere, and are much smarter (the best way to optimize X is > not to make X run fast, but to avoid doing X in the first place!). The > thing you are trying to work with is the one-time-only case where you > explicitly disable that big and important optimization, and then you > complain about the end result being slow! > > It's like saying that you're compiling with extreme debugging and no > optimizations, and then complaining that the end result doesn't run as > fast as if you used -O2. Except this is a hundred times worse, because > you literally asked git to do the really expensive thing that it really > really doesn't want to do ;) Linus, please pay attention to the _actual_ important issue here. Sure I've been tuning the threading code in parallel to the attempt to debug this memory usage issue. BUT. 
The point is that repacking the gcc repo using "git repack -a -f --window=250" has a radically different memory usage profile whether you do the repack on the earlier 2.1GB pack or the later 300MB pack. _That_ is the issue. Ironically, it is the 300MB pack that causes the repack to blow memory usage out of proportion. And in both cases, the threading code has to do the same work whether or not the original pack was densely packed or not since -f throws away every existing deltas anyway. So something is fishy elsewhere than in the packing code. Nicolas From davem@davemloft.net Tue Dec 11 17:28:00 2007 From: davem@davemloft.net (David Miller) Date: Tue, 11 Dec 2007 17:28:00 -0000 Subject: Something is broken in repack In-Reply-To: References: <9e4733910712102301p5e6c4165v6afb32d157478828@mail.gmail.com> Message-ID: <20071211.092402.266823343.davem@davemloft.net> From: Nicolas Pitre Date: Tue, 11 Dec 2007 12:21:11 -0500 (EST) > BUT. The point is that repacking the gcc repo using "git repack -a -f > --window=250" has a radically different memory usage profile whether you > do the repack on the earlier 2.1GB pack or the later 300MB pack. If you repack on the smaller pack file, git has to expand more stuff internally in order to search the deltas, whereas with the larger pack file I bet git has to less often undelta'ify to get base objects blobs for delta search. 
In fact that behavior makes perfect sense to me and I don't understand GIT internals very well :-) From dberlin@dberlin.org Tue Dec 11 17:44:00 2007 From: dberlin@dberlin.org (Daniel Berlin) Date: Tue, 11 Dec 2007 17:44:00 -0000 Subject: Something is broken in repack In-Reply-To: <9e4733910712102301p5e6c4165v6afb32d157478828@mail.gmail.com> References: <9e4733910712071505y6834f040k37261d65a2d445c4@mail.gmail.com> <9e4733910712101825l33cdc2c0mca2ddbfd5afdb298@mail.gmail.com> <9e4733910712102125w56c70c0cxb8b00a060b62077@mail.gmail.com> <9e4733910712102129v140c2affqf2e73e75855b61ea@mail.gmail.com> <9e4733910712102301p5e6c4165v6afb32d157478828@mail.gmail.com> Message-ID: <4aca3dc20712110928ybb84c16n40b6dbd50feddb06@mail.gmail.com> On 12/11/07, Jon Smirl wrote: > > Total CPU time 196 CPU minutes vs 190 for gcc. Google's claims of > being faster are not true. Depends on your allocation patterns. For our apps, it certainly is :) Of course, i don't know if we've updated the external allocator in a while, i'll bug the people in charge of it. From nico@cam.org Tue Dec 11 18:43:00 2007 From: nico@cam.org (Nicolas Pitre) Date: Tue, 11 Dec 2007 18:43:00 -0000 Subject: Something is broken in repack In-Reply-To: <20071211.092402.266823343.davem@davemloft.net> References: <9e4733910712102301p5e6c4165v6afb32d157478828@mail.gmail.com> <20071211.092402.266823343.davem@davemloft.net> Message-ID: On Tue, 11 Dec 2007, David Miller wrote: > From: Nicolas Pitre > Date: Tue, 11 Dec 2007 12:21:11 -0500 (EST) > > > BUT. The point is that repacking the gcc repo using "git repack -a -f > > --window=250" has a radically different memory usage profile whether you > > do the repack on the earlier 2.1GB pack or the later 300MB pack. > > If you repack on the smaller pack file, git has to expand more stuff > internally in order to search the deltas, whereas with the larger pack > file I bet git has to less often undelta'ify to get base objects blobs > for delta search. Of course. 
I came to that conclusion two days ago. And despite being pretty familiar with the involved code (I wrote part of it myself) I just can't spot anything wrong with it so far. But somehow the threading code keep distracting people from that issue since it gets to do the same work whether or not the source pack is densely packed or not. Nicolas (who wish he had access to a much faster machine to investigate this issue) From jonsmirl@gmail.com Tue Dec 11 18:57:00 2007 From: jonsmirl@gmail.com (Jon Smirl) Date: Tue, 11 Dec 2007 18:57:00 -0000 Subject: Something is broken in repack In-Reply-To: References: <9e4733910712071505y6834f040k37261d65a2d445c4@mail.gmail.com> <9e4733910712101825l33cdc2c0mca2ddbfd5afdb298@mail.gmail.com> <9e4733910712102125w56c70c0cxb8b00a060b62077@mail.gmail.com> <9e4733910712102129v140c2affqf2e73e75855b61ea@mail.gmail.com> <9e4733910712102301p5e6c4165v6afb32d157478828@mail.gmail.com> Message-ID: <9e4733910712111043h6a361996x740f4dba3d742da5@mail.gmail.com> On 12/11/07, Linus Torvalds wrote: > > > On Tue, 11 Dec 2007, Jon Smirl wrote: > > > > So why does our threaded code take 20 CPU minutes longer (12%) to run > > than the same code with a single thread? > > Threaded code *always* takes more CPU time. The only thing you can hope > for is a wall-clock reduction. You're seeing probably a combination of > (a) more cache misses > (b) bigger dataset active at a time > and a probably fairly miniscule > (c) threading itself tends to have some overheads. > > > Q6600 is just two E6600s in the same package, the caches are not shared. > > Sure they are shared. They're just not *entirely* shared. But they are > shared between each two cores, so each thread essentially has only half > the cache they had with the non-threaded version. > > Threading is *not* a magic solution to all problems. It gives you > potentially twice the CPU power, but there are real downsides that you > should keep in mind. 
> > > Why does the threaded code need 2.24GB (google allocator, 2.85GB gcc) > > with 4 threads? But only need 950MB with one thread? Where's the extra > > gigabyte going? > > I suspect that it's really simple: you have a few rather big files in the > gcc history, with deep delta chains. And what happens when you have four > threads running at the same time is that they all need to keep all those > objects that they are working on - and their hash state - in memory at the > same time! > > So if you want to use more threads, that _forces_ you to have a bigger > memory footprint, simply because you have more "live" objects that you > work on. Normally, that isn't much of a problem, since most source files > are small, but if you have a few deep delta chains on big files, both the > delta chain itself is going to use memory (you may have limited the size > of the cache, but it's still needed for the actual delta generation, so > it's not like the memory usage went away). This makes sense. Those runs that blew up to 4.5GB were a combination of this effect and fragmentation in the gcc allocator. Google allocator appears to be much better at controlling fragmentation. Is there a reasonable scheme to force the chains to only be loaded once and then shared between worker threads? The memory blow up appears to be directly correlated with chain length. > > That said, I suspect there are a few things fighting you: > > - threading is hard. I haven't looked a lot at the changes Nico did to do > a threaded object packer, but what I've seen does not convince me it is > correct. The "trg_entry" accesses are *mostly* protected with > "cache_lock", but nothing else really seems to be, so quite frankly, I > wouldn't trust the threaded version very much. It's off by default, and > for a good reason, I think. > > For example: the packing code does this: > > if (!src->data) { > read_lock(); > src->data = read_sha1_file(src_entry->idx.sha1, &type, &sz); > read_unlock(); > ... 
> > and that's racy. If two threads come in at roughly the same time and > see a NULL src->data, they'll both get the lock, and they'll both > (serially) try to fill it in. It will all *work*, but one of them will > have done unnecessary work, and one of them will have their result > thrown away and leaked. That may account for the threaded version needing an extra 20 minutes CPU time. An extra 12% of CPU seems like too much overhead for threading. Just letting a couple of those long chain compressions be done twice > > Are you hitting issues like this? I dunno. The object sorting means > that different threads normally shouldn't look at the same objects (not > even the sources), so probably not, but basically, I wouldn't trust the > threading 100%. It needs work, and it needs to stay off by default. > > - you're working on a problem that isn't really even worth optimizing > that much. The *normal* case is to re-use old deltas, which makes all > of the issues you are fighting basically go away (because you only have > a few _incremental_ objects that need deltaing). I agree, this problem only occurs when people import giant repositories. But every time someone hits these problems they declare git to be screwed up and proceed to thrash it in their blogs. > In other words: the _real_ optimizations have already been done, and > are done elsewhere, and are much smarter (the best way to optimize X is > not to make X run fast, but to avoid doing X in the first place!). The > thing you are trying to work with is the one-time-only case where you > explicitly disable that big and important optimization, and then you > complain about the end result being slow! > > It's like saying that you're compiling with extreme debugging and no > optimizations, and then complaining that the end result doesn't run as > fast as if you used -O2.
Except this is a hundred times worse, because > you literally asked git to do the really expensive thing that it really > really doesn't want to do ;) > > > Is there another allocator to try? One that combines Google's > > efficiency with gcc's speed? > > See above: I'd look around at threading-related bugs and check the way we > lock (or don't) accesses. > > Linus > -- Jon Smirl jonsmirl@gmail.com From nico@cam.org Tue Dec 11 19:17:00 2007 From: nico@cam.org (Nicolas Pitre) Date: Tue, 11 Dec 2007 19:17:00 -0000 Subject: Something is broken in repack In-Reply-To: <9e4733910712111043h6a361996x740f4dba3d742da5@mail.gmail.com> References: <9e4733910712071505y6834f040k37261d65a2d445c4@mail.gmail.com> <9e4733910712101825l33cdc2c0mca2ddbfd5afdb298@mail.gmail.com> <9e4733910712102125w56c70c0cxb8b00a060b62077@mail.gmail.com> <9e4733910712102129v140c2affqf2e73e75855b61ea@mail.gmail.com> <9e4733910712102301p5e6c4165v6afb32d157478828@mail.gmail.com> <9e4733910712111043h6a361996x740f4dba3d742da5@mail.gmail.com> Message-ID: On Tue, 11 Dec 2007, Jon Smirl wrote: > This makes sense. Those runs that blew up to 4.5GB were a combination > of this effect and fragmentation in the gcc allocator. I disagree. This is insane. > Google allocator appears to be much better at controlling fragmentation. Indeed. And if fragmentation is indeed wasting half of Git's memory usage then we'll have to come with a custom memory allocator. > Is there a reasonable scheme to force the chains to only be loaded > once and then shared between worker threads? The memory blow up > appears to be directly correlated with chain length. No. That would be the equivalent of holding each revision of all files uncompressed all at once in memory. > > That said, I suspect there are a few things fighting you: > > > > - threading is hard. I haven't looked a lot at the changes Nico did to do > > a threaded object packer, but what I've seen does not convince me it is > > correct. 
The "trg_entry" accesses are *mostly* protected with > > "cache_lock", but nothing else really seems to be, so quite frankly, I > > wouldn't trust the threaded version very much. It's off by default, and > > for a good reason, I think. > > > > For example: the packing code does this: > > > > if (!src->data) { > > read_lock(); > > src->data = read_sha1_file(src_entry->idx.sha1, &type, &sz); > > read_unlock(); > > ... > > > > and that's racy. If two threads come in at roughly the same time and > > see a NULL src->data, they'll both get the lock, and they'll both > > (serially) try to fill it in. It will all *work*, but one of them will > > have done unnecessary work, and one of them will have their result > > thrown away and leaked. > > That may account for the threaded version needing an extra 20 minutes > CPU time. An extra 12% of CPU seems like too much overhead for > threading. Just letting a couple of those long chain compressions be > done twice No it may not. This theory is wrong as explained before. > > > > Are you hitting issues like this? I dunno. The object sorting means > > that different threads normally shouldn't look at the same objects (not > > even the sources), so probably not, but basically, I wouldn't trust the > > threading 100%. It needs work, and it needs to stay off by default. > > > > - you're working on a problem that isn't really even worth optimizing > > that much. The *normal* case is to re-use old deltas, which makes all > > of the issues you are fighting basically go away (because you only have > > a few _incremental_ objects that need deltaing). > > I agree, this problem only occurs when people import giant > repositories. But every time someone hits these problems they declare > git to be screwed up and proceed to thrash it in their blogs. It's not only for repack. Someone just reported git-blame being unusable too due to insane memory usage, which I suspect is due to the same issue.
Nicolas From torvalds@linux-foundation.org Tue Dec 11 19:40:00 2007 From: torvalds@linux-foundation.org (Linus Torvalds) Date: Tue, 11 Dec 2007 19:40:00 -0000 Subject: Something is broken in repack In-Reply-To: <9e4733910712111043h6a361996x740f4dba3d742da5@mail.gmail.com> References: <9e4733910712071505y6834f040k37261d65a2d445c4@mail.gmail.com> <9e4733910712101825l33cdc2c0mca2ddbfd5afdb298@mail.gmail.com> <9e4733910712102125w56c70c0cxb8b00a060b62077@mail.gmail.com> <9e4733910712102129v140c2affqf2e73e75855b61ea@mail.gmail.com> <9e4733910712102301p5e6c4165v6afb32d157478828@mail.gmail.com> <9e4733910712111043h6a361996x740f4dba3d742da5@mail.gmail.com> Message-ID: On Tue, 11 Dec 2007, Jon Smirl wrote: > > > > So if you want to use more threads, that _forces_ you to have a bigger > > memory footprint, simply because you have more "live" objects that you > > work on. Normally, that isn't much of a problem, since most source files > > are small, but if you have a few deep delta chains on big files, both the > > delta chain itself is going to use memory (you may have limited the size > > of the cache, but it's still needed for the actual delta generation, so > > it's not like the memory usage went away). > > This makes sense. Those runs that blew up to 4.5GB were a combination > of this effect and fragmentation in the gcc allocator. Google > allocator appears to be much better at controlling fragmentation. Yes. I think we do have some case where we simply keep a lot of objects around, and if we are talking reasonably large deltas, we'll have the whole delta-chain in memory just to unpack one single object. The delta cache size limits kick in only when we explicitly cache old delta results (in case they will be re-used, which is rather common), it doesn't affect the normal "I'm using this data right now" case at all. And then fragmentation makes it much much worse. 
Since the allocation patterns aren't nice (they are pretty random and depend on just the sizes of the objects), and the lifetimes aren't always nicely nested _either_ (they become more so when you disable the cache entirely, but that's just death for performance), I'm not surprised that there can be memory allocators that end up having some issues. > Is there a reasonable scheme to force the chains to only be loaded > once and then shared between worker threads? The memory blow up > appears to be directly correlated with chain length. The worker threads explicitly avoid touching the same objects, and no, you definitely don't want to explode the chains globally once, because the whole point is that we do fit 15 years worth of history into 300MB of pack-file thanks to having a very dense representation. The "loaded once" part is the mmap'ing of the pack-file into memory, but if you were to actually then try to expand the chains, you'd be talking about many *many* more gigabytes of memory than you already see used ;) So what you actually want to do is to just re-use already packed delta chains directly, which is what we normally do. But you are explicitly looking at the "--no-reuse-delta" (aka "git repack -f") case, which is why it then blows up. I'm sure we can find places to improve. But I would like to re-iterate the statement that you're kind of doing a "don't do that then" case which is really - by design - meant to be done once and never again, and is using resources - again, pretty much by design - wildly inappropriately just to get an initial packing done. > That may account for the threaded version needing an extra 20 minutes > CPU time. An extra 12% of CPU seems like too much overhead for > threading. Just letting a couple of those long chain compressions be > done twice Well, Nico pointed out that those things should all be thread-private data, so no, the race isn't there (unless there's some other bug there). 
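For reference, the check-then-lock pattern debated above only matters if two threads can ever reach the same entry. What follows is a minimal sketch, in Python rather than git's C, of how re-checking under the lock would keep a shared entry from being loaded twice. `Entry`, `load_blob`, `get_data` and `demo` are hypothetical names for illustration, not git functions, and the sketch assumes shared entries, which Nico says the packer never has.

```python
import threading

# One global lock standing in for the read lock around a non-reentrant loader.
read_lock = threading.Lock()


class Entry:
    """Hypothetical stand-in for a delta source entry with lazily loaded data."""

    def __init__(self, key):
        self.key = key
        self.data = None
        self.loads = 0  # how many times the expensive load ran for this entry


def load_blob(entry):
    """Hypothetical expensive loader; only ever called with read_lock held."""
    entry.loads += 1
    return b"contents of " + entry.key.encode()


def get_data(entry):
    """Load entry.data at most once, even if several threads share the entry."""
    if entry.data is None:              # cheap check without the lock
        with read_lock:
            if entry.data is None:      # re-check: another thread may have filled it
                entry.data = load_blob(entry)
    return entry.data


def demo(num_threads=8):
    """Have several threads request the same entry; return its data and load count."""
    shared = Entry("blob-1")
    threads = [threading.Thread(target=get_data, args=(shared,))
               for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return shared.data, shared.loads


if __name__ == "__main__":
    print(demo())
```

Without the second check inside the lock, two threads that both saw an empty entry would each run the loader in turn, and the first result would be overwritten.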
> I agree, this problem only occurs when people import giant > repositories. But every time someone hits these problems they declare > git to be screwed up and proceed to thrash it in their blogs. Sure. I'd love to do global packing without paying the cost, but it really was a design decision. Thanks to doing off-line packing ("let it run overnight on some beefy machine") we can get better results. It's expensive, yes. But it was pretty much meant to be expensive. It's a very efficient compression algorithm, after all, and you're turning it up to eleven ;) I also suspect that the gcc archive makes things more interesting thanks to having some rather large files. The ChangeLog is probably the worst case (large file with *lots* of edits), but I suspect the *.po files aren't wonderful either. Linus From gitster@pobox.com Tue Dec 11 20:26:00 2007 From: gitster@pobox.com (Junio C Hamano) Date: Tue, 11 Dec 2007 20:26:00 -0000 Subject: Something is broken in repack In-Reply-To: (Linus Torvalds's message of "Tue, 11 Dec 2007 11:17:08 -0800 (PST)") References: <9e4733910712071505y6834f040k37261d65a2d445c4@mail.gmail.com> <9e4733910712101825l33cdc2c0mca2ddbfd5afdb298@mail.gmail.com> <9e4733910712102125w56c70c0cxb8b00a060b62077@mail.gmail.com> <9e4733910712102129v140c2affqf2e73e75855b61ea@mail.gmail.com> <9e4733910712102301p5e6c4165v6afb32d157478828@mail.gmail.com> <9e4733910712111043h6a361996x740f4dba3d742da5@mail.gmail.com> Message-ID: <7v7ijldnq1.fsf@gitster.siamese.dyndns.org> Linus Torvalds writes: > On Tue, 11 Dec 2007, Jon Smirl wrote: >> > >> > So if you want to use more threads, that _forces_ you to have a bigger >> > memory footprint, simply because you have more "live" objects that you >> > work on. 
Normally, that isn't much of a problem, since most source files >> > are small, but if you have a few deep delta chains on big files, both the >> > delta chain itself is going to use memory (you may have limited the size >> > of the cache, but it's still needed for the actual delta generation, so >> > it's not like the memory usage went away). >> >> This makes sense. Those runs that blew up to 4.5GB were a combination >> of this effect and fragmentation in the gcc allocator. Google >> allocator appears to be much better at controlling fragmentation. > > Yes. I think we do have some case where we simply keep a lot of objects > around, and if we are talking reasonably large deltas, we'll have the > whole delta-chain in memory just to unpack one single object. Eh, excuse me. unpack_delta_entry() - first unpacks the base object (this goes recursive); - uncompresses the delta; - applies the delta to the base to obtain the target object; - frees delta; - frees (but allows it to be cached) the base object; - returns the result So no matter how deep a chain is, you keep only one delta at a time in core, not whole delta-chain in core. > So what you actually want to do is to just re-use already packed delta > chains directly, which is what we normally do. But you are explicitly > looking at the "--no-reuse-delta" (aka "git repack -f") case, which is why > it then blows up. While that does not explain, as Nico pointed out, the huge difference between the two repack runs that have different starting pack, I would say it is a fair thing to say. If you have a suboptimal pack (i.e. not enough reusable deltas, as in the 2.1GB pack case), do run "repack -f", but if you have a good pack (i.e. 300MB pack), don't. 
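The unpack_delta_entry() sequence listed earlier in this message can be sketched roughly as follows. This is a toy model in Python rather than git's C: the delta format (a JSON list of copy/insert operations) and the names `ToyPack`, `make_delta`, `apply_delta` and `demo` are all assumptions made for illustration, not git's implementation. The counter is there only to show that one uncompressed delta is live at a time, however deep the chain is.

```python
import json
import zlib


def make_delta(ops):
    """Compress a toy delta: a list of ["copy", offset, length] or ["insert", text]."""
    return zlib.compress(json.dumps(ops).encode())


def apply_delta(base, ops):
    """Rebuild the target object from the base bytes and the toy delta operations."""
    out = bytearray()
    for op in ops:
        if op[0] == "copy":
            _, offset, length = op
            out += base[offset:offset + length]
        else:  # "insert"
            out += op[1].encode()
    return bytes(out)


class ToyPack:
    """Hypothetical pack: name -> ("base", compressed) or ("delta", base_name, compressed)."""

    def __init__(self):
        self.objects = {}
        self.live_deltas = 0
        self.peak_live_deltas = 0

    def unpack(self, name):
        entry = self.objects[name]
        if entry[0] == "base":
            return zlib.decompress(entry[1])
        _, base_name, compressed = entry
        base = self.unpack(base_name)                   # unpack the base (recursive)
        ops = json.loads(zlib.decompress(compressed))   # uncompress the delta
        self.live_deltas += 1
        self.peak_live_deltas = max(self.peak_live_deltas, self.live_deltas)
        result = apply_delta(base, ops)                 # apply the delta to the base
        del ops                                         # free the delta
        self.live_deltas -= 1
        del base                                        # free the base (a real cache could keep it)
        return result                                   # return the result


def demo():
    """Build a three-object chain and unpack its tip."""
    pack = ToyPack()
    pack.objects["v1"] = ("base", zlib.compress(b"hello world"))
    pack.objects["v2"] = ("delta", "v1",
                          make_delta([["copy", 0, 6], ["insert", "there"]]))
    pack.objects["v3"] = ("delta", "v2",
                          make_delta([["copy", 0, 11], ["insert", "!"]]))
    return pack.unpack("v3"), pack.peak_live_deltas


if __name__ == "__main__":
    print(demo())
```

Because the recursion finishes unpacking the base before the next delta is uncompressed, the peak count of live deltas stays at one for this chain.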
From ae@op5.se Tue Dec 11 20:34:00 2007 From: ae@op5.se (Andreas Ericsson) Date: Tue, 11 Dec 2007 20:34:00 -0000 Subject: Something is broken in repack In-Reply-To: References: <9e4733910712102301p5e6c4165v6afb32d157478828@mail.gmail.com> <20071211.092402.266823343.davem@davemloft.net> Message-ID: <475EF27B.7060609@op5.se> Nicolas Pitre wrote: > On Tue, 11 Dec 2007, David Miller wrote: > >> From: Nicolas Pitre >> Date: Tue, 11 Dec 2007 12:21:11 -0500 (EST) >> >>> BUT. The point is that repacking the gcc repo using "git repack -a -f >>> --window=250" has a radically different memory usage profile whether you >>> do the repack on the earlier 2.1GB pack or the later 300MB pack. >> If you repack on the smaller pack file, git has to expand more stuff >> internally in order to search the deltas, whereas with the larger pack >> file I bet git has to less often undelta'ify to get base objects blobs >> for delta search. > > Of course. I came to that conclusion two days ago. And despite being > pretty familiar with the involved code (I wrote part of it myself) I > just can't spot anything wrong with it so far. > > But somehow the threading code keep distracting people from that issue > since it gets to do the same work whether or not the source pack is > densely packed or not. > > Nicolas > (who wish he had access to a much faster machine to investigate this issue) If it's still an issue next week, we'll have a 16 core (8 dual-core cpu's) machine with some 32gb of ram in that'll be free for about two days. You'll have to remind me about it though, as I've got a lot on my mind these days. 
-- Andreas Ericsson andreas.ericsson@op5.se OP5 AB www.op5.se Tel: +46 8-230225 Fax: +46 8-230231 From ae@op5.se Wed Dec 12 05:06:00 2007 From: ae@op5.se (Andreas Ericsson) Date: Wed, 12 Dec 2007 05:06:00 -0000 Subject: Something is broken in repack In-Reply-To: <7v7ijldnq1.fsf@gitster.siamese.dyndns.org> References: <9e4733910712071505y6834f040k37261d65a2d445c4@mail.gmail.com> <9e4733910712101825l33cdc2c0mca2ddbfd5afdb298@mail.gmail.com> <9e4733910712102125w56c70c0cxb8b00a060b62077@mail.gmail.com> <9e4733910712102129v140c2affqf2e73e75855b61ea@mail.gmail.com> <9e4733910712102301p5e6c4165v6afb32d157478828@mail.gmail.com> <9e4733910712111043h6a361996x740f4dba3d742da5@mail.gmail.com> <7v7ijldnq1.fsf@gitster.siamese.dyndns.org> Message-ID: <475EF453.90404@op5.se> Junio C Hamano wrote: > Linus Torvalds writes: > >> So what you actually want to do is to just re-use already packed delta >> chains directly, which is what we normally do. But you are explicitly >> looking at the "--no-reuse-delta" (aka "git repack -f") case, which is why >> it then blows up. > > While that does not explain, as Nico pointed out, the huge difference > between the two repack runs that have different starting pack, I would > say it is a fair thing to say. If you have a suboptimal pack (i.e. not > enough reusable deltas, as in the 2.1GB pack case), do run "repack -f", > but if you have a good pack (i.e. 300MB pack), don't. I think this is too much of a mystery for a lot of people to let it go. Even I started looking into it, and I've got so little spare time just now that I wouldn't stand much of a chance of making a contribution even if I had written the code originally. That being said, the fact that some git repositories really *can't* be repacked on some machines (because it eats ALL virtual memory) is really something that lowers git's reputation among huge projects.
-- Andreas Ericsson andreas.ericsson@op5.se OP5 AB www.op5.se Tel: +46 8-230225 Fax: +46 8-230231 From bviyer@ncsu.edu Wed Dec 12 05:13:00 2007 From: bviyer@ncsu.edu (Balaji V. Iyer) Date: Wed, 12 Dec 2007 05:13:00 -0000 Subject: Help with another constraint In-Reply-To: <20071210171542.GL17368@sygehus.dk> References: <000c01c83a41$4499e240$33160e98@ece.ncsu.edu> <20071209130740.GI17368@sygehus.dk> <000d01c83a81$85462ca0$33160e98@ece.ncsu.edu> <20071210171542.GL17368@sygehus.dk> Message-ID: <002601c83c7c$b3763960$33160e98@ece.ncsu.edu> Hello Everyone, I got past that negdi2 and some errors..now I am trying to compile some linux module, and it says I am not able to find this constraint: init/main.c: In function 'start_kernel': init/main.c:441: error: insn does not satisfy its constraints: (insn 112 110 478 12 (set (mem:QI (reg/v/f:SI 16 r16 [orig:72 line.183 ] [72]) [0 S1 A8]) (const_int 0 [0x0])) 16 {movqi} (nil) (nil)) init/main.c:441: internal compiler error: in reload_cse_simplify_operands, at postreload.c:391 Please submit a full bug report, Here is what I have for movqi: (define_insn "movqi" [(set (match_operand:QI 0 "nonimmediate_operand" "=p,q,m,m,p,q,p,q") (match_operand:QI 1 "general_operand" "m,m,p,q,p,q,I,I"))] "" "* switch(which_alternative) { case 0: case 1: return \"l.lbz \\t%0,%1\"; case 2: case 3: return \"l.sb \\t%0,%1\"; case 4: case 5: return \"l.ori \\t%0,%1,0\\t # move reg to reg\"; case 6: case 7: return \"l.addi \\t%0,r0,%1\\t # move immediate\"; default: return \"invalid alternative\"; } " To give a quick explanation: p = register numbers between 0-31 (inclusive) q = register numbers between 32-63 (inclusive) I = constant int value: ((VALUE) >=-32768 && (VALUE) <=32767) So, what am I missing? Any help is highly appreciated! Thanking You, Yours Sincerely, Balaji V. Iyer. -- Balaji V. Iyer PhD Student, Center for Efficient, Scalable and Reliable Computing, Department of Electrical and Computer Engineering, North Carolina State University. 
-----Original Message----- From: 'Rask Ingemann Lambertsen' [mailto:rask@sygehus.dk] Sent: Monday, December 10, 2007 12:16 PM To: Balaji V. Iyer Cc: gcc@gcc.gnu.org; openrisc@opencores.org Subject: Re: Help with another constraint On Sun, Dec 09, 2007 at 11:35:32AM -0500, Balaji V. Iyer wrote: > Hello Rask, > I am not understanding your response, can you clarify it for me? > > As per the question about the error message above? > > ../../gcc-4.0.2/gcc/libgcc2.c -o libgcc/./_negdi2.o > ../../gcc-4.0.2/gcc/libgcc2.c: In function '__negdi2': > ../../gcc-4.0.2/gcc/libgcc2.c:72: error: insn does not satisfy its > constraints: I think this is misleading you. It seems likely that the problem is with the predicate and not the constraint. > (insn 15 13 16 (set (mem:SI (plus:SI (reg/f:SI 2 r2) ^^^ This has to be a register, doesn't it? If so, use -fdump-rtl-all and look at the dump files to see where it goes wrong. > (const_int -28 [0xffffffe4])) [0 D.1256+0 S4 A32]) > (neg:SI (reg:SI 3 r3 [orig:80 D.1255 ] [80]))) 38 {negsi2} (nil) > (nil)) Please also post your negsi2 pattern. 
-- Rask Ingemann Lambertsen Danish law requires addresses in e-mail to be logged and stored for a year From nico@cam.org Wed Dec 12 07:25:00 2007 From: nico@cam.org (Nicolas Pitre) Date: Wed, 12 Dec 2007 07:25:00 -0000 Subject: Something is broken in repack In-Reply-To: <9e4733910712110821o7748802ag75d9df4be8b2c123@mail.gmail.com> References: <9e4733910712071505y6834f040k37261d65a2d445c4@mail.gmail.com> <9e4733910712101825l33cdc2c0mca2ddbfd5afdb298@mail.gmail.com> <9e4733910712102125w56c70c0cxb8b00a060b62077@mail.gmail.com> <9e4733910712102129v140c2affqf2e73e75855b61ea@mail.gmail.com> <9e4733910712102301p5e6c4165v6afb32d157478828@mail.gmail.com> <9e4733910712110821o7748802ag75d9df4be8b2c123@mail.gmail.com> Message-ID: On Tue, 11 Dec 2007, Jon Smirl wrote: > On 12/11/07, Nicolas Pitre wrote: > > On Tue, 11 Dec 2007, Nicolas Pitre wrote: > > > > > OK, here's something else for you to try: > > > > > > core.deltabasecachelimit=0 > > > pack.threads=2 > > > pack.deltacachesize=1 > > > > > > With that I'm able to repack the small gcc pack on my machine with 1GB > > > of ram using: > > > > > > git repack -a -f -d --window=250 --depth=250 > > > > > > and top reports a ~700m virt and ~500m res without hitting swap at all. > > > It is only at 25% so far, but I was unable to get that far before. > > > > Well, around 55% memory usage skyrocketed to 1.6GB and the system went > > deep into swap. So I restarted it with no threads. > > > > Nicolas (even more puzzled) > > On the plus side you are seeing what I see, so it proves I am not imagining it. Well... This is weird. It seems that memory fragmentation is really really killing us here. The fact that the Google allocator did manage to waste quite less memory is a good indicator already. I did modify the progress display to show accounted memory that was allocated vs memory that was freed but still not released to the system. 
At least that gives you an idea of memory allocation and fragmentation with glibc in real time:

diff --git a/progress.c b/progress.c
index d19f80c..46ac9ef 100644
--- a/progress.c
+++ b/progress.c
@@ -8,6 +8,7 @@
  * published by the Free Software Foundation.
  */
 
+#include <malloc.h>
 #include "git-compat-util.h"
 #include "progress.h"
 
@@ -94,10 +95,12 @@ static int display(struct progress *progress, unsigned n, const char *done)
 	if (progress->total) {
 		unsigned percent = n * 100 / progress->total;
 		if (percent != progress->last_percent || progress_update) {
+			struct mallinfo m = mallinfo();
 			progress->last_percent = percent;
-			fprintf(stderr, "%s: %3u%% (%u/%u)%s%s",
-				progress->title, percent, n,
-				progress->total, tp, eol);
+			fprintf(stderr, "%s: %3u%% (%u/%u) %u/%uMB%s%s",
+				progress->title, percent, n, progress->total,
+				m.uordblks >> 18, m.fordblks >> 18,
+				tp, eol);
 			fflush(stderr);
 			progress_update = 0;
 			return 1;

This shows that at some point the repack goes into a big memory surge. I don't have enough RAM to see how fragmented memory gets though, since it starts swapping around 50% done with 2 threads.

With only 1 thread, memory usage grows significantly at around 11% with a pretty noticeable slowdown in the progress rate.

So I think the theory goes like this:

There is a block of big objects together in the list somewhere. Initially, all those big objects are assigned to thread #1 out of 4. Because those objects are big, they get really slow to delta compress, and storing them all in a window with 250 slots takes significant memory.

Threads 2, 3, and 4 have "easy" work loads, so they complete fairly quickly compared to thread #1. But since the progress display is global then you won't notice that one thread is actually crawling slowly.

To keep all threads busy until the end, those threads that are done with their work load will steal some work from another thread, choosing the one with the largest remaining work. That is most likely thread #1.
So as threads 2, 3, and 4 complete, they will steal from thread 1 and populate their own window with those big objects too, and get slow too. And because all threads get to work on those big objects towards the end, the progress display will then show a significant slowdown, and memory usage will almost quadruple. Add memory fragmentation to that and you have a clogged system. Solution: pack.deltacachesize=1 pack.windowmemory=16M Limiting the window memory to 16MB will automatically shrink the window size when big objects are encountered, therefore keeping much fewer of those objects at the same time in memory, which in turn means they will be processed much more quickly. And somehow that must help with memory fragmentation as well. Setting pack.deltacachesize to 1 is simply to disable the caching of delta results entirely which will only slow down the writing phase, but I wanted to keep it out of the picture for now. With the above settings, I'm currently repacking the gcc repo with 2 threads, and memory allocation never exceeded 700m virt and 400m res, while the mallinfo shows about 350MB, and progress has reached 90% which has never occurred on this machine with the 300MB source pack so far. Nicolas From a2220333@yahoo.com.tw Wed Dec 12 08:05:00 2007 From: a2220333@yahoo.com.tw (a2220333) Date: Wed, 12 Dec 2007 08:05:00 -0000 Subject: porting gcc to tic54x Message-ID: <984587.48623.qm@web73407.mail.tp2.yahoo.com> Hi, I have been porting gcc to tic54x, using gcc-4.2.2. I wrote a minimal c54x.h and c54x.c and an empty md file, and built the tic54x-gcc compiler from them. But when I run the resulting compiler, I get a segmentation fault. Is there anything that must be defined in c54x.c or c54x.h to get the simplest possible compiler, one that need not produce correct output but runs without errors? I want to add functionality starting from this basic port. Thanks.
here is my files -------------------------------------------------------- /*******************c54x.h************************/ /* number of registers */ #define FIRST_PSEUDO_REGISTER 25 /* number of register classes */ #define N_REG_CLASSES 26 struct cumul_args { int has_varargs; int numarg; }; #define CUMULATIVE_ARGS struct cumul_args /* Node: Register Classes */ /* TODO: get rid of single-register classes? */ enum reg_class { NO_REGS, IMR_REG, IFR_REG, A_REG, B_REG, T_REG, TRN_REG, SP_REG, BK_REG, BRC_REG, RSA_REG, REA_REG, PMST_REG, XPC_REG, DP_REG, ST_REGS, INT_REGS, STAT_REGS, ACC_REGS, BR_REGS, DBL_OP_REGS, AUX_REGS, ARSP_REGS, MMR_REGS, GENERAL_REGS, ALL_REGS, LIM_REG_CLASSES }; #define STRICT_ALIGNMENT 1 /* Nothing is smaller than alignment.. */ #define BYTES_BIG_ENDIAN 0 #define FUNCTION_BOUNDARY BITS_PER_WORD #define UNITS_PER_WORD 1 #define BIGGEST_ALIGNMENT BITS_PER_WORD*2 /* Node: 13.11 Trampolines for Nested Functions */ #define TRAMPOLINE_SIZE 2 /* Just a guess for now */ #define STACK_BOUNDARY BITS_PER_WORD #define Pmode QImode /* Stack pointer */ #define SP_REGNO 16 #define STACK_POINTER_REGNUM SP_REGNO #define AR7_REGNO 15 #define FRAME_POINTER_REGNUM AR7_REGNO /* Fake argument pointer reg */ #define ARG_REGNO 24 #define ARG_POINTER_REGNUM ARG_REGNO #define WORDS_BIG_ENDIAN 0 #define PARM_BOUNDARY BITS_PER_WORD #define FUNCTION_MODE QImode #define BASE_REG_CLASS ARSP_REGS #define MOVE_MAX 1 #define BITS_BIG_ENDIAN 1 /* Node: 10.10.5 Elimination */ #define FRAME_POINTER_REQUIRED 0 /* Node: 13.15 Describing Relative Costs of Operations */ #define SLOW_BYTE_ACCESS 1 #define CASE_VECTOR_MODE QImode /* Node: 13.13 Addressing Modes */ #define MAX_REGS_PER_ADDRESS 2 #define ASM_APP_ON "#APP" #define ASM_APP_OFF "#NO_APP" #define STARTING_FRAME_OFFSET -1 /* Local frame starts just below the frame pointer */ /*sam added start*/ //optabs.c used this... 
#define CODE_FOR_indirect_jump 8 /*sam added end*/ #define DEFAULT_SIGNED_CHAR 0 /* FIXME (ripped from c4x) */ /* FIXME: double check this */ #define INDEX_REG_CLASS NO_REGS #define GO_IF_LEGITIMATE_ADDRESS(MODE, X, ADDR) \ do { \ } while (0) #define GO_IF_MODE_DEPENDENT_ADDRESS(ADDR, LABEL) \ do { \ } while(0); /* registers that have a fixed purpose * * and can't be used for general tasks. */ #define FIXED_REGISTERS \ { \ /* IMR IFR ST0 ST1 A B T TRN AR0 AR1 AR2 */ \ 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, \ /* AR3 AR4 AR5 AR6 AR7 SP BK BRC RSA REA PMST XPC DP ARG*/ \ 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 1, 1, 1 \ } #define CALL_USED_REGISTERS \ { \ /* IMR IFR ST0 ST1 A B T TRN AR0 AR1 AR2 */ \ 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 1, \ /* AR3 AR4 AR5 AR6 AR7 SP BK BRC RSA REA PMST XPC DP ARG */ \ 1, 1, 1, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1 \ } /* Defines which registers are in which classes */ #define REG_CLASS_CONTENTS \ { \ {0x00000000}, /* NO_REGS */ \ {0x00000001}, /* IMR_REG */ \ {0x00000002}, /* IFR_REG */ \ {0x00000010}, /* A_REG */ \ {0x00000020}, /* B_REG */ \ {0x00000040}, /* T_REG */ \ {0x00000080}, /* TRN_REG */ \ {0x00010000}, /* SP_REG */ \ {0x00020000}, /* BK_REG */ \ {0x00040000}, /* BRC_REG */ \ {0x00080000}, /* RSA_REG */ \ {0x00100000}, /* REA_REG */ \ {0x00200000}, /* PMST_REG */ \ {0x00400000}, /* XPC_REG */ \ {0x00800000}, /* DP_REG */ \ {0x0000000c}, /* ST_REGS */ \ {0x00000003}, /* INT_REGS */ \ {0x0020000c}, /* STAT_REGS */ \ {0x00000030}, /* ACC_REGS */ \ {0x001c0000}, /* BR_REGS */ \ {0x00003c00}, /* DBL_OP_REGS */ \ {0x0000ff00}, /* AUX_REGS */ \ {0x0001ff00}, /* ARSP_REGS */ \ {0x007fffcf}, /* MMR_REGS */ \ {0x011efff0}, /* GENERAL_REGS */ \ {0xffffffff} /* ALL_REGS */ \ } #define REGISTER_NAMES { \ "imr", "ifr", "st0", "st1", \ "a", "b", "t", "trn", "ar0", "ar1", "ar2", \ "ar3", "ar4", "ar5", "ar6", "ar7", \ "sp", "bk", "brc", "rsa", "rea", \ "pmst", "xpc", "dp", "arg" } #define REG_CLASS_NAMES \ { \ "NO_REGS", \ "IMR_REG", \ "IFR_REG", \ "A_REG", \ 
"B_REG", \ "T_REG", \ "TRN_REG", \ "SP_REG", \ "BK_REG", \ "BRC_REG", \ "RSA_REG", \ "REA_REG", \ "PMST_REG", \ "XPC_REG", \ "DP_REG", \ "ST_REGS", \ "INT_REGS", \ "STAT_REGS", \ "ACC_REGS", \ "BR_REGS", \ "DBL_OP_REGS", \ "AUX_REGS", \ "ARSP_REGS", \ "MMR_REGS", \ "GENERAL_REGS", \ "ALL_REGS", \ } #define TARGET_CPU_CPP_BUILTINS() \ do { \ builtin_assert ("cpu=c54x"); \ builtin_assert ("machine=c54x"); \ builtin_define_std ("c54x"); \ } while (0) /* Accumulators A and B */ #define A_REGNO 4 #define FUNCTION_ARG_REGNO_P(REGNO) (REGNO == A_REGNO) #define ASM_GENERATE_INTERNAL_LABEL(BUFFER, PREFIX, NUM) \ sprintf((BUFFER), "%s%lu?", (PREFIX), (unsigned long)(NUM)) //#define HARD_REGNO_MODE_OK(REGNO, MODE) c54x_hard_regno_mode_ok(REGNO, MODE) #define HARD_REGNO_MODE_OK(REGNO, MODE) 0 //#define FUNCTION_VALUE_REGNO_P(REGNO) ((REGNO) == A_REGNO) #define FUNCTION_VALUE_REGNO_P(REGNO) ((REGNO) == A_REGNO) //#define INITIALIZE_TRAMPOLINE(TRAMP, FNADDR, CXT) c54x_initialize_trampoline((TRAMP), (FNADDR), (CXT)) #define INITIALIZE_TRAMPOLINE(TRAMP, FNADDR, CXT) 0 /* The caller does all the popping. 
*/ #define RETURN_POPS_ARGS(FUNDECL, FUNTYPE, STACKSIZE) 0 //#define INIT_CUMULATIVE_ARGS(CUM, FNTYPE, LIBNAME, FNDECL, N_NAMED_ARGS) (init_cumulative_args(&(CUM), (FNTYPE), (LIBNAME), (FNDECL))) #define INIT_CUMULATIVE_ARGS(CUM, FNTYPE, LIBNAME, FNDECL, N_NAMED_ARGS) 0 //#define FUNCTION_ARG(CUM, MODE, TYPE, NAMED) (function_arg(&(CUM), (MODE), (TYPE), (NAMED))) #define FUNCTION_ARG(CUM, MODE, TYPE, NAMED) 0 //#define FUNCTION_ARG_ADVANCE(CUM, MODE, TYPE, NAMED) (function_arg_advance(&(CUM), (MODE), (TYPE), (NAMED))) #define FUNCTION_ARG_ADVANCE(CUM, MODE, TYPE, NAMED) 0 //#define REGNO_REG_CLASS(REGNO) (regclass_map[REGNO]) #define REGNO_REG_CLASS(REGNO) 0 #define LEGITIMATE_CONSTANT_P(X) 1 /* Not sure */ #define TRULY_NOOP_TRUNCATION(OUTPREC, INPREC) 1 //#define CONSTANT_ADDRESS_P(X) (GET_CODE (X) == LABEL_REF || GET_CODE (X) == SYMBOL_REF || GET_CODE (X) == CONST_INT || GET_CODE (X) == CONST) #define CONSTANT_ADDRESS_P(X) 0 #define LIBCALL_VALUE(MODE) gen_rtx_REG((MODE), A_REGNO) #define ASM_OUTPUT_ALIGN(STREAM, POWER) fprintf((STREAM), ".align %d", (POWER)) #define FIRST_PARM_OFFSET(FUNCDECL) 0 /* Argument pointer points to the first(lower)argument */ /*sam added start*/ #define CONSTRAINT_LEN(a,b) 0 #define REG_CLASS_FROM_CONSTRAINT(a,b) 0 /*sam added end*/ /* Node: 13.9.12 Generating Code for Profiling */ #define FUNCTION_PROFILER {} //#define PRINT_OPERAND(STREAM, X, CODE) c54x_print_operand((STREAM), (X), (CODE)) #define PRINT_OPERAND(STREAM, X, CODE) 0 //#define PRINT_OPERAND_ADDRESS(STREAM, X) c54x_print_operand_address((STREAM), (X)) #define PRINT_OPERAND_ADDRESS(STREAM, X) 0 //#define MODES_TIEABLE_P(MODE1, MODE2) ((MODE1) == (MODE2) || GET_MODE_CLASS (MODE1) == GET_MODE_CLASS (MODE2)) #define MODES_TIEABLE_P(MODE1, MODE2) 0 /*sam added start*/ #define CONST_DOUBLE_OK_FOR_CONSTRAINT_P(a,b,c) 0 #define CONST_OK_FOR_CONSTRAINT_P(a,b,c) 0 #define CLASS_MAX_NREGS(a,b) 0 #define PREFERRED_RELOAD_CLASS(a,b) 0 #define HARD_REGNO_NREGS(a,b) 0 #define 
REGNO_OK_FOR_BASE_P(a) 0 #define REGNO_OK_FOR_INDEX_P(a) 0 #define ASM_OUTPUT_COMMON(a,b,c,d) 0 #define ASM_OUTPUT_LOCAL(a,b,c,d) 0 #define ASM_OUTPUT_SKIP(a,b) 0 #define INITIAL_FRAME_POINTER_OFFSET(a) 0 /*sam added end*/ -------------------------------------------------------------------- /**************************c54x.c*******************************/ /* #include "target.h" #include "target-def.h" #include "system.h" #include "function.h" */ #include "config.h" #include "system.h" #include "coretypes.h" #include "tm.h" #include "rtl.h" #include "tree.h" #include "tm_p.h" #include "regs.h" #include "hard-reg-set.h" #include "real.h" #include "insn-config.h" #include "conditions.h" #include "output.h" #include "insn-codes.h" #include "insn-modes.h" #include "insn-attr.h" #include "flags.h" #include "except.h" #include "function.h" #include "recog.h" #include "expr.h" #include "optabs.h" #include "toplev.h" #include "basic-block.h" #include "ggc.h" #include "target.h" #include "target-def.h" #include "langhooks.h" #include "cgraph.h" #include "tree-gimple.h" #include "emit-rtl.h" struct gcc_target targetm = TARGET_INITIALIZER; /*sam added start*/ rtx gen_jump (rtx operand0 ATTRIBUTE_UNUSED) { return 0; } rtx gen_indirect_jump (rtx operand0 ATTRIBUTE_UNUSED) { return 0; } void default_globalize_label (FILE *fp, const char *str) { } /*sam added end*/ ----------------------------------------------------------- /**************************compile_install.sh**************/ #!/bin/bash DPATH="/home/sam/download/gcc_installer/SRC/" PREFIX="${DPATH}c54_4.2.2" TARGET="tic54x" gcc="gcc-4.2.2" binutils="binutils-2.18" newlib="newlib-1.15.0" bin_gcc="build_${gcc}" bin_binutils="build_${binutils}" bin_newlib="build_${newlib}"; export CC="gcc" # set bootstrp compiler export PATH=${PREFIX}/bin:$PATH mkdir -p ${PREFIX} cd SRC cd BUILD function installbinutils() { mkdir ${bin_binutils} cd ${bin_binutils} ../../${binutils}/configure --target=${TARGET} --prefix=${PREFIX} 
--disable-nls 2>&1 | tee configure.out make all 2>&1 | tee make.out make install 2>&1 | tee -a install_make.out cd - } function installgcc() { mkdir ${bin_gcc} cd ${bin_gcc} ../../${gcc}/configure --target=${TARGET} --prefix=${PREFIX} --disable-nls --enable-languages=c --disable-libssp --with-headers --with-newlib 2>&1 | tee configure.out make all-gcc 2>&1 | tee make.out make install-gcc 2>&1 | tee install.out cd - } installbinutils installgcc cd $PROJ ------------------------------------------------------- thanks, sam From dak@gnu.org Wed Dec 12 12:02:00 2007 From: dak@gnu.org (David Kastrup) Date: Wed, 12 Dec 2007 12:02:00 -0000 Subject: Something is broken in repack In-Reply-To: (Nicolas Pitre's message of "Wed, 12 Dec 2007 00:12:57 -0500 (EST)") References: <9e4733910712071505y6834f040k37261d65a2d445c4@mail.gmail.com> <9e4733910712101825l33cdc2c0mca2ddbfd5afdb298@mail.gmail.com> <9e4733910712102125w56c70c0cxb8b00a060b62077@mail.gmail.com> <9e4733910712102129v140c2affqf2e73e75855b61ea@mail.gmail.com> <9e4733910712102301p5e6c4165v6afb32d157478828@mail.gmail.com> <9e4733910712110821o7748802ag75d9df4be8b2c123@mail.gmail.com> Message-ID: <85d4tc8hi8.fsf@lola.goethe.zz> Nicolas Pitre writes: > Well... This is weird. > > It seems that memory fragmentation is really really killing us here. > The fact that the Google allocator did manage to waste quite less memory > is a good indicator already. Maybe a malloc/free/mmap wrapper that records the requested sizes and alloc/free order and dumps them to file so that one can make a compact git-free standalone test case for the glibc maintainers might be a good thing. -- David Kastrup, Kriemhildstr.
15, 44793 Bochum From baembel@gmx.de Wed Dec 12 12:13:00 2007 From: baembel@gmx.de (Boris Boesler) Date: Wed, 12 Dec 2007 12:13:00 -0000 Subject: branch delay slots Message-ID: <0375011D-0C51-4101-801E-5AC4217DB0DC@gmx.de> Hi! I "implemented" branch delay slots (define_delay) for my architecture and I use the command line option -fdelayed-branch. But branch delay slot filling is done just for a few candidates. Even for the same rule within the same compilation unit (C file) it is done in a few cases but not in all. How can this happen? Boris From ERES@il.ibm.com Wed Dec 12 12:53:00 2007 From: ERES@il.ibm.com (Revital1 Eres) Date: Wed, 12 Dec 2007 12:53:00 -0000 Subject: Help with another constraint In-Reply-To: <002601c83c7c$b3763960$33160e98@ece.ncsu.edu> Message-ID: Hello, I think you should add the pair of constraints m and I respectively to the description of the instruction in your md file (and a relevant case 8 to handle such instruction), i.e.: (define_insn "movqi" - [(set (match_operand:QI 0 "nonimmediate_operand" "=p,q,m,m,p,q,p,q") - (match_operand:QI 1 "general_operand" "m,m,p,q,p,q,I,I"))] + [(set (match_operand:QI 0 "nonimmediate_operand" "=p,q,m,m,p,q,p,q,m") + (match_operand:QI 1 "general_operand" "m,m,p,q,p,q,I,I,I"))] "" "* switch(which_alternative) @@ -17,6 +17,8 @@ case 6: case 7: return \"l.addi \\t%0,r0,%1\\t # move immediate\";, + case 8: + return ...; default: return \"invalid alternative\"; } It seems that the pair m and I is missing (which indicate the memory = constant instruction). You could look for which_alternative variable in GCC internals for more details on this. Revital From bviyer@ncsu.edu Wed Dec 12 13:01:00 2007 From: bviyer@ncsu.edu (Balaji V. Iyer) Date: Wed, 12 Dec 2007 13:01:00 -0000 Subject: Help with another constraint In-Reply-To: References: <002601c83c7c$b3763960$33160e98@ece.ncsu.edu> Message-ID: <002701c83cbd$f83c0e30$33160e98@ece.ncsu.edu> Hi Revital1, Thank you very much for your help. 
The ISA I am using (OpenRISC) does not provide an alternative for moving a constant into memory. The only way of doing this is to move the constant into a register (which I am doing) and then move that register value into memory. So what can I do in that case? Thanks, Balaji V. Iyer. -- Balaji V. Iyer PhD Student, Center for Efficient, Scalable and Reliable Computing, Department of Electrical and Computer Engineering, North Carolina State University. -----Original Message----- From: Revital1 Eres [mailto:ERES@il.ibm.com] Sent: Wednesday, December 12, 2007 7:14 AM To: Balaji V. Iyer Cc: gcc@gcc.gnu.org; openrisc@opencores.org; 'Rask Ingemann Lambertsen' Subject: RE: Help with another constraint Hello, I think you should add the pair of constraints m and I respectively to the description of the instruction in your md file (and a relevant case 8 to handle such instruction), i.e.: (define_insn "movqi" - [(set (match_operand:QI 0 "nonimmediate_operand" "=p,q,m,m,p,q,p,q") - (match_operand:QI 1 "general_operand" "m,m,p,q,p,q,I,I"))] + [(set (match_operand:QI 0 "nonimmediate_operand" "=p,q,m,m,p,q,p,q,m") + (match_operand:QI 1 "general_operand" "m,m,p,q,p,q,I,I,I"))] "" "* switch(which_alternative) @@ -17,6 +17,8 @@ case 6: case 7: return \"l.addi \\t%0,r0,%1\\t # move immediate\";, + case 8: + return ...; default: return \"invalid alternative\"; } It seems that the pair m and I is missing (which indicate the memory = constant instruction). You could look for which_alternative variable in GCC internals for more details on this. Revital From dave.korn@artimi.com Wed Dec 12 14:35:00 2007 From: dave.korn@artimi.com (Dave Korn) Date: Wed, 12 Dec 2007 14:35:00 -0000 Subject: Help with another constraint In-Reply-To: References: <002601c83c7c$b3763960$33160e98@ece.ncsu.edu> Message-ID: <000601c83cbf$0c887a30$2e08a8c0@CAM.ARTIMI.COM> On 12 December 2007 12:14, Revital1 Eres wrote: > It seems that the pair m and I is missing (which indicate the memory = > constant instruction).
So doesn't the question then become "Why isn't reload reloading the constant into a register"? cheers, DaveK -- Can't think of a witty .sigline today.... From rask@sygehus.dk Wed Dec 12 15:21:00 2007 From: rask@sygehus.dk ('Rask Ingemann Lambertsen') Date: Wed, 12 Dec 2007 15:21:00 -0000 Subject: Help with another constraint In-Reply-To: <002601c83c7c$b3763960$33160e98@ece.ncsu.edu> References: <000c01c83a41$4499e240$33160e98@ece.ncsu.edu> <20071209130740.GI17368@sygehus.dk> <000d01c83a81$85462ca0$33160e98@ece.ncsu.edu> <20071210171542.GL17368@sygehus.dk> <002601c83c7c$b3763960$33160e98@ece.ncsu.edu> Message-ID: <20071212143508.GP17368@sygehus.dk> On Wed, Dec 12, 2007 at 12:06:04AM -0500, Balaji V. Iyer wrote: > Hello Everyone, > I got past that negdi2 and some errors..now I am trying to compile > some linux module, and it says I am not able to find this constraint: > > init/main.c: In function 'start_kernel': > init/main.c:441: error: insn does not satisfy its constraints: > (insn 112 110 478 12 (set (mem:QI (reg/v/f:SI 16 r16 [orig:72 line.183 ] > [72]) [0 S1 A8]) > (const_int 0 [0x0])) 16 {movqi} (nil) > (nil)) > init/main.c:441: internal compiler error: in > reload_cse_simplify_operands, at postreload.c:391 > Please submit a full bug report, > > Here is what I have for movqi: The movxx patterns are special and you'll need to hold the compiler's hands a little. Since your target can't move immediates directly to memory, you have to ask for a secondary reload to an intermediate register. Use the target hook TARGET_SECONDARY_RELOAD. When you've got the secondary reloads working, you can likely improve code quality: 1) Use a movqi expander to expand the instructions correctly to begin with. For example, if operand 0 is in memory and operand 1 is an immediate, use operands[1] = force_reg (QImode, operands[1]); Rename the "movqi" insn to "*movqi". 
> (define_insn "movqi" > [(set (match_operand:QI 0 "nonimmediate_operand" "=p,q,m,m,p,q,p,q") > (match_operand:QI 1 "general_operand" "m,m,p,q,p,q,I,I"))] > "" ^^ 2) Reject operand combinations that aren't supported, such as operand 0 being in memory and operand 1 being an immediate. You can look at other RISC targets (e.g. ARM, PA-RISC, MIPS, SPARC, Alpha or RS6000) for examples. > "* New ports should not use the old-style "* ... " C-blocks. Use { ... } as documented. Then you'll also avoid the \" and \\ sequences. > switch(which_alternative) > { > case 0: > case 1: > return \"l.lbz \\t%0,%1\"; > case 2: > case 3: > return \"l.sb \\t%0,%1\"; > case 4: > case 5: > return \"l.ori \\t%0,%1,0\\t # move reg to reg\"; > case 6: > case 7: > return \"l.addi \\t%0,r0,%1\\t # move immediate\"; > default: > return \"invalid alternative\"; > } Presumably you've temporarily coded it this way for debugging purposes. If not, use the normal way: "@ l.lbz ... l.sb ... ..." > To give a quick explanation: > p = register numbers between 0-31 (inclusive) > q = register numbers between 32-63 (inclusive) You use them in pairs a lot. Define a register class which consists of registers 0-63 and use that in your constraints. -- Rask Ingemann Lambertsen Danish law requires addresses in e-mail to be logged and stored for a year From iant@google.com Wed Dec 12 15:27:00 2007 From: iant@google.com (Ian Lance Taylor) Date: Wed, 12 Dec 2007 15:27:00 -0000 Subject: branch delay slots In-Reply-To: <0375011D-0C51-4101-801E-5AC4217DB0DC@gmx.de> References: <0375011D-0C51-4101-801E-5AC4217DB0DC@gmx.de> Message-ID: Boris Boesler writes: > I "implemented" branch delay slots (define_delay) for my > architecture and I use the command line option -fdelayed-branch. But > branch delay slot filling is done just for a few candidates. Even for > the same rule within the same compilation unit (C file) it is done in > a few cases but not in all. How can this happen? There are many possible reasons.
The first step is to look at the dump file generated by -fdump-rtl-dbr. Ian From ebotcazou@libertysurf.fr Wed Dec 12 15:48:00 2007 From: ebotcazou@libertysurf.fr (Eric Botcazou) Date: Wed, 12 Dec 2007 15:48:00 -0000 Subject: branch delay slots In-Reply-To: <0375011D-0C51-4101-801E-5AC4217DB0DC@gmx.de> References: <0375011D-0C51-4101-801E-5AC4217DB0DC@gmx.de> Message-ID: <200712121628.48748.ebotcazou@libertysurf.fr> > I "implemented" branch delay slots (define_delay) for my > architecture and I use the command line option -fdelayed-branch. But > branch delay slot filling is done just for a few candidates. Even for > the same rule within the same compilation unit (C file) it is done in > a few cases but not in all. How can this happen? You should not need to pass -fdelayed-branch explicitly. If you do, this means you're compiling at -O0, in which case the problem you run into is not very surprising. Just compile with bare -O1 at a minimum. -- Eric Botcazou From nico@cam.org Wed Dec 12 16:14:00 2007 From: nico@cam.org (Nicolas Pitre) Date: Wed, 12 Dec 2007 16:14:00 -0000 Subject: Something is broken in repack In-Reply-To: References: <9e4733910712071505y6834f040k37261d65a2d445c4@mail.gmail.com> <9e4733910712101825l33cdc2c0mca2ddbfd5afdb298@mail.gmail.com> <9e4733910712102125w56c70c0cxb8b00a060b62077@mail.gmail.com> <9e4733910712102129v140c2affqf2e73e75855b61ea@mail.gmail.com> <9e4733910712102301p5e6c4165v6afb32d157478828@mail.gmail.com> <9e4733910712110821o7748802ag75d9df4be8b2c123@mail.gmail.com> Message-ID: On Wed, 12 Dec 2007, Nicolas Pitre wrote: > Add memory fragmentation to that and you have a clogged system. > > Solution: > > pack.deltacachesize=1 > pack.windowmemory=16M > > Limiting the window memory to 16MB will automatically shrink the window > size when big objects are encountered, therefore keeping much fewer of > those objects at the same time in memory, which in turn means they will > be processed much more quickly. 
And somehow that must help with memory > fragmentation as well. OK scrap that. When I returned to the computer this morning, the repack was completed... with a 1.3GB pack instead. So... The gcc repo apparently really needs a large window to efficiently compress those large objects. But when those large objects are already well deltified and you repack again with a large window, somehow the memory allocator is way more involved, probably even more so when there are several threads in parallel amplifying the issue, and things probably get to a point of no return with regard to memory fragmentation after a while. So... my conclusion is that the glibc allocator has fragmentation issues with this work load, given the notable difference with the Google allocator, which itself might not be completely immune to fragmentation issues of its own. And because the gcc repo requires a large window of big objects to get good compression, then you're better off not using 4 threads to repack it with -a -f. The fact that the size of the source pack has such an influence is probably only because the increased usage of the delta base object cache is playing a role in the global memory allocation pattern, allowing for the bad fragmentation issue to occur. If you could run one last test with the mallinfo patch I posted, without the pack.windowmemory setting, and adding the reported values along with those from top, then we could formally conclude that memory fragmentation is the issue. So I don't think Git itself is actually bad. The gcc repo most certainly constitutes a nasty use case for memory allocators, but I don't think there is much we can do about it besides possibly implementing our own memory allocator with active defragmentation where possible (read memcpy) at some point to give glibc's allocator some chance to breathe a bit more. In the meantime you might have to use only one thread and lots of memory to repack the gcc repo, or find the perfect memory allocator to be used with Git.
After all, packing the whole gcc history to around 230MB is quite a
stunt but it requires sufficient resources to achieve it.  Fortunately,
like Linus said, such a wholesale repack is not something that most
users have to do anyway.


Nicolas

From nico@cam.org Wed Dec 12 16:19:00 2007
From: nico@cam.org (Nicolas Pitre)
Date: Wed, 12 Dec 2007 16:19:00 -0000
Subject: Something is broken in repack
In-Reply-To:
References: <9e4733910712071505y6834f040k37261d65a2d445c4@mail.gmail.com>
 <9e4733910712101825l33cdc2c0mca2ddbfd5afdb298@mail.gmail.com>
 <9e4733910712102125w56c70c0cxb8b00a060b62077@mail.gmail.com>
 <9e4733910712102129v140c2affqf2e73e75855b61ea@mail.gmail.com>
 <9e4733910712102301p5e6c4165v6afb32d157478828@mail.gmail.com>
 <9e4733910712110821o7748802ag75d9df4be8b2c123@mail.gmail.com>
Message-ID:

On Wed, 12 Dec 2007, Nicolas Pitre wrote:

> I did modify the progress display to show accounted memory that was
> allocated vs memory that was freed but still not released to the system.
> At least that gives you an idea of memory allocation and fragmentation
> with glibc in real time:
>
> diff --git a/progress.c b/progress.c
> index d19f80c..46ac9ef 100644
> --- a/progress.c
> +++ b/progress.c
> @@ -8,6 +8,7 @@
>   * published by the Free Software Foundation.
>   */
>
> +#include <malloc.h>
>  #include "git-compat-util.h"
>  #include "progress.h"
>
> @@ -94,10 +95,12 @@ static int display(struct progress *progress, unsigned n, const char *done)
>          if (progress->total) {
>                  unsigned percent = n * 100 / progress->total;
>                  if (percent != progress->last_percent || progress_update) {
> +                        struct mallinfo m = mallinfo();
>                          progress->last_percent = percent;
> -                        fprintf(stderr, "%s: %3u%% (%u/%u)%s%s",
> -                                progress->title, percent, n,
> -                                progress->total, tp, eol);
> +                        fprintf(stderr, "%s: %3u%% (%u/%u) %u/%uMB%s%s",
> +                                progress->title, percent, n, progress->total,
> +                                m.uordblks >> 18, m.fordblks >> 18,
> +                                tp, eol);

Note: I didn't know what unit of memory those blocks represent, so the
shift is most probably wrong.


Nicolas

From bonzini@gnu.org Wed Dec 12 16:37:00 2007
From: bonzini@gnu.org (Paolo Bonzini)
Date: Wed, 12 Dec 2007 16:37:00 -0000
Subject: Something is broken in repack
In-Reply-To:
References: <9e4733910712071505y6834f040k37261d65a2d445c4@mail.gmail.com>
 <9e4733910712101825l33cdc2c0mca2ddbfd5afdb298@mail.gmail.com>
 <9e4733910712102125w56c70c0cxb8b00a060b62077@mail.gmail.com>
 <9e4733910712102129v140c2affqf2e73e75855b61ea@mail.gmail.com>
 <9e4733910712102301p5e6c4165v6afb32d157478828@mail.gmail.com>
 <9e4733910712110821o7748802ag75d9df4be8b2c123@mail.gmail.com>
Message-ID:

> When I returned to the computer this morning, the repack was
> completed... with a 1.3GB pack instead.
>
> So... The gcc repo apparently really needs a large window to efficiently
> compress those large objects.

So, am I right that if you have a very well-done pack (such as gcc's),
you might want to repack in two phases:

- first discarding the old deltas and using a small window, thus
  producing a bad pack that can be repacked without humongous amounts of
  memory...

- ... then discarding the old deltas and producing another
  well-compressed pack?
Paolo

From torvalds@linux-foundation.org Wed Dec 12 16:42:00 2007
From: torvalds@linux-foundation.org (Linus Torvalds)
Date: Wed, 12 Dec 2007 16:42:00 -0000
Subject: Something is broken in repack
In-Reply-To:
References: <9e4733910712071505y6834f040k37261d65a2d445c4@mail.gmail.com>
 <9e4733910712101825l33cdc2c0mca2ddbfd5afdb298@mail.gmail.com>
 <9e4733910712102125w56c70c0cxb8b00a060b62077@mail.gmail.com>
 <9e4733910712102129v140c2affqf2e73e75855b61ea@mail.gmail.com>
 <9e4733910712102301p5e6c4165v6afb32d157478828@mail.gmail.com>
 <9e4733910712110821o7748802ag75d9df4be8b2c123@mail.gmail.com>
Message-ID:

On Wed, 12 Dec 2007, Nicolas Pitre wrote:
>
> So... my conclusion is that the glibc allocator has fragmentation issues
> with this work load, given the notable difference with the Google
> allocator, which itself might not be completely immune to fragmentation
> issues of its own.

Yes.

Note that delta following involves patterns something like

    allocate (small) space for delta
    for i in (1..depth) {
        allocate large space for base
        allocate large space for result
        .. apply delta ..
        free large space for base
        free small space for delta
    }

so if you have some stupid heap algorithm that doesn't try to merge and
re-use free'd spaces very aggressively (because that takes CPU time!), you
might have memory usage be horribly inflated by the heap having all those
holes for all the objects that got free'd in the chain that don't get
aggressively re-used.

Threaded memory allocators then make this worse by probably using totally
different heaps for different threads (in order to avoid locking), so they
will *all* have the fragmentation issue.
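The delta-following pattern above can be written out in C to make the allocation sequence concrete. This is only a minimal sketch, not git's code: `alloc_stats`, `tracked_alloc`, `tracked_free` and `walk_delta_chain` are hypothetical names, the buffer sizes are caller-supplied assumptions, one small delta is allocated per step, and "applying" a delta is reduced to a copy plus an XOR over a prefix.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical bookkeeping, not part of git: counts live and peak bytes. */
struct alloc_stats {
    size_t live;          /* bytes currently allocated */
    size_t peak;          /* highest value 'live' has reached */
    unsigned long calls;  /* number of allocations made */
};

static void *tracked_alloc(struct alloc_stats *st, size_t size)
{
    void *p = malloc(size);
    if (!p)
        abort();
    st->live += size;
    if (st->live > st->peak)
        st->peak = st->live;
    st->calls++;
    return p;
}

static void tracked_free(struct alloc_stats *st, void *p, size_t size)
{
    free(p);
    st->live -= size;
}

/*
 * Resolve a chain of 'depth' deltas on top of one full base object.
 * 'large' and 'small' are assumed buffer sizes, with small <= large.
 */
static void walk_delta_chain(struct alloc_stats *st, unsigned depth,
                             size_t large, size_t small)
{
    unsigned char *base;
    unsigned i;

    assert(small <= large);
    base = tracked_alloc(st, large);
    memset(base, 0, large);

    for (i = 0; i < depth; i++) {
        unsigned char *delta = tracked_alloc(st, small);
        unsigned char *result = tracked_alloc(st, large);
        size_t j;

        memset(delta, (int)(i + 1), small);
        /* Stand-in for real delta application: copy the base, patch a prefix. */
        memcpy(result, base, large);
        for (j = 0; j < small; j++)
            result[j] ^= delta[j];

        tracked_free(st, base, large);
        tracked_free(st, delta, small);
        base = result;  /* this result is the base for the next delta */
    }
    tracked_free(st, base, large);
}
```

Calling `walk_delta_chain(&st, 50, 1 << 20, 4096)` on a zeroed `struct alloc_stats` makes 101 allocations while never holding more than two large buffers and one small one at a time. How much memory the process ends up occupying then depends on how well the allocator re-uses the holes left by each free, which is the point being made in the message.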
And if you *really* want to cause trouble for a memory allocator, what you
should try to do is to allocate the memory in one thread, and free it in
another, and then things can really explode (the freeing thread notices
that the allocation is not in its thread-local heap, so instead of really
freeing it, it puts it on a separate list of areas to be freed later by
the original thread when it needs memory - or worse, it adds it to the
local thread list, and makes it effectively totally impossible to then
ever merge different free'd allocations ever again because the freed
things will be on different heap lists!).

I'm not saying that particular case happens in git, I'm just saying that
it's not unheard of. And with the delta cache and the object lookup, it's
not at _all_ impossible that we hit the "allocate in one thread, free in
another" case!

		Linus

From davem@davemloft.net Wed Dec 12 16:54:00 2007
From: davem@davemloft.net (David Miller)
Date: Wed, 12 Dec 2007 16:54:00 -0000
Subject: Something is broken in repack
In-Reply-To:
References:
Message-ID: <20071212.084212.02518392.davem@davemloft.net>

From: Linus Torvalds
Date: Wed, 12 Dec 2007 08:37:10 -0800 (PST)

> I'm not saying that particular case happens in git, I'm just saying that
> it's not unheard of. And with the delta cache and the object lookup, it's
> not at _all_ impossible that we hit the "allocate in one thread, free in
> another" case!

One thing that supports these theories is that, while running
these large repacks, I notice that the RSS is roughly 2/3 of
the amount of virtual address space allocated.

I personally don't think it's unreasonable for GIT to have its
own customized allocator at least for certain object types.
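A customized allocator for a single object type, as suggested above, might look roughly like the following. This is a minimal sketch under stated assumptions rather than anything taken from git: `obj_desc`, `obj_pool`, `pool_alloc` and the chunk size `POOL_BLOCK_NODES` are hypothetical, and the design assumes fixed-size objects that are never freed individually.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define POOL_BLOCK_NODES 1024   /* assumed chunk size, chosen for illustration */

/* Hypothetical fixed-size object: a 20-byte SHA-1 plus some flags. */
struct obj_desc {
    unsigned char sha1[20];
    unsigned flags;
};

struct obj_pool {
    struct obj_desc *block;  /* chunk currently being carved up */
    size_t used;             /* descriptors handed out from 'block' */
    size_t nr_blocks;        /* chunks obtained from calloc so far */
};

/*
 * Hand out one zeroed descriptor.  Descriptors are never freed one by
 * one, so a full chunk is simply left behind and a fresh one started;
 * callers keep the pointers they were given.
 */
static struct obj_desc *pool_alloc(struct obj_pool *pool)
{
    if (!pool->block || pool->used == POOL_BLOCK_NODES) {
        pool->block = calloc(POOL_BLOCK_NODES, sizeof(*pool->block));
        if (!pool->block)
            abort();
        pool->used = 0;
        pool->nr_blocks++;
    }
    return &pool->block[pool->used++];
}
```

Starting from a zero-initialized `struct obj_pool`, each call to `pool_alloc()` returns the next slot of the current chunk, so the general-purpose allocator is asked for memory once per 1024 objects and there is no per-object malloc header. The trade-off is that nothing is returned to the system before the process exits.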
From torvalds@linux-foundation.org Wed Dec 12 17:12:00 2007
From: torvalds@linux-foundation.org (Linus Torvalds)
Date: Wed, 12 Dec 2007 17:12:00 -0000
Subject: Something is broken in repack
In-Reply-To: <20071212.084212.02518392.davem@davemloft.net>
References: <20071212.084212.02518392.davem@davemloft.net>
Message-ID:

On Wed, 12 Dec 2007, David Miller wrote:
>
> I personally don't think it's unreasonable for GIT to have its
> own customized allocator at least for certain object types.

Well, we actually already *do* have a customized allocator, but currently
only for the actual core "object descriptor" that really just has the SHA1
and object flags in it (and a few extra words depending on object type).

Those are critical for certain loads, and small too (so using the standard
allocator wasted a _lot_ of memory). In addition, they're fixed-size and
never free'd, so a specialized allocator really can do a lot better than
any general-purpose memory allocator ever could.

But the actual object *contents* are currently all allocated with whatever
the standard libc malloc/free allocator is that you compile for (or load
dynamically). Having a specialized allocator for them is a much more
involved issue, exactly because we do have interesting allocation patterns
etc.

That said, at least those object allocations are all single-threaded (for
right now, at least), so even when git does multi-threaded stuff, the core
sha1_file.c stuff is always run under a single lock, and a simpler
allocator that doesn't care about threads is likely to be much better than
one that tries to have thread-local heaps etc.

I suspect that is what the google allocator does. It probably doesn't have
per-thread heaps, it just uses locking (and quite possibly things like
per-*size* heaps, which is much more memory-efficient and helps avoid some
of the fragmentation problems).
Locking is much slower than per-thread accesses, but it doesn't have the
issues with per-thread-fragmentation and all the problems with one thread
allocating and another one freeing.

		Linus

From jonsmirl@gmail.com Wed Dec 12 17:29:00 2007
From: jonsmirl@gmail.com (Jon Smirl)
Date: Wed, 12 Dec 2007 17:29:00 -0000
Subject: Something is broken in repack
In-Reply-To:
References: <9e4733910712071505y6834f040k37261d65a2d445c4@mail.gmail.com>
 <9e4733910712102129v140c2affqf2e73e75855b61ea@mail.gmail.com>
 <9e4733910712102301p5e6c4165v6afb32d157478828@mail.gmail.com>
 <9e4733910712110821o7748802ag75d9df4be8b2c123@mail.gmail.com>
Message-ID: <9e4733910712120912l342350f2i1f190c45730108f2@mail.gmail.com>

On 12/12/07, Linus Torvalds wrote:
>
>
> On Wed, 12 Dec 2007, Nicolas Pitre wrote:
> >
> > So... my conclusion is that the glibc allocator has fragmentation issues
> > with this work load, given the notable difference with the Google
> > allocator, which itself might not be completely immune to fragmentation
> > issues of its own.
>
> Yes.
>
> Note that delta following involves patterns something like
>
>    allocate (small) space for delta
>    for i in (1..depth) {
>         allocate large space for base
>         allocate large space for result
>         .. apply delta ..
>         free large space for base
>         free small space for delta
>    }

Is it hard to hack up something that statically allocates a big block
of memory per thread for these two and then just reuses it?

   allocate (small) space for delta
   allocate large space for base

The alternating between long term and short term allocations
definitely aggravates fragmentation.

>
> so if you have some stupid heap algorithm that doesn't try to merge and
> re-use free'd spaces very aggressively (because that takes CPU time!), you
> might have memory usage be horribly inflated by the heap having all those
> holes for all the objects that got free'd in the chain that don't get
> aggressively re-used.
>
> Threaded memory allocators then make this worse by probably using totally
> different heaps for different threads (in order to avoid locking), so they
> will *all* have the fragmentation issue.
>
> And if you *really* want to cause trouble for a memory allocator, what you
> should try to do is to allocate the memory in one thread, and free it in
> another, and then things can really explode (the freeing thread notices
> that the allocation is not in its thread-local heap, so instead of really
> freeing it, it puts it on a separate list of areas to be freed later by
> the original thread when it needs memory - or worse, it adds it to the
> local thread list, and makes it effectively totally impossible to then
> ever merge different free'd allocations ever again because the freed
> things will be on different heap lists!).
>
> I'm not saying that particular case happens in git, I'm just saying that
> it's not unheard of. And with the delta cache and the object lookup, it's
> not at _all_ impossible that we hit the "allocate in one thread, free in
> another" case!
>
>                 Linus
>


--
Jon Smirl
jonsmirl@gmail.com

From rask@sygehus.dk Wed Dec 12 17:50:00 2007
From: rask@sygehus.dk (Rask Ingemann Lambertsen)
Date: Wed, 12 Dec 2007 17:50:00 -0000
Subject: Help with another constraint
In-Reply-To: <000601c83cbf$0c887a30$2e08a8c0@CAM.ARTIMI.COM>
References: <002601c83c7c$b3763960$33160e98@ece.ncsu.edu>
 <000601c83cbf$0c887a30$2e08a8c0@CAM.ARTIMI.COM>
Message-ID: <20071212162844.GQ17368@sygehus.dk>

On Wed, Dec 12, 2007 at 01:01:00PM -0000, Dave Korn wrote:
> On 12 December 2007 12:14, Revital1 Eres wrote:
>
> > It seems that the pair m and I is missing (which indicate the memory =
> > constant instruction).
>
>   So doesn't the question then become "Why isn't reload reloading the constant
> into a register"?

   One possibility is that reload is not run on insns produced by reload
itself. Other usual suspects are post-reload splitters and peepholes.
-- Rask Ingemann Lambertsen Danish law requires addresses in e-mail to be logged and stored for a year From baembel@gmx.de Wed Dec 12 18:13:00 2007 From: baembel@gmx.de (Boris Boesler) Date: Wed, 12 Dec 2007 18:13:00 -0000 Subject: branch delay slots In-Reply-To: References: <0375011D-0C51-4101-801E-5AC4217DB0DC@gmx.de> Message-ID: Am 12.12.2007 um 16:21 schrieb Ian Lance Taylor: > Boris Boesler writes: > >> I "implemented" branch delay slots (define_delay) for my >> architecture and I use the command line option -fdelayed-branch. But >> branch delay slot filling is done just for a few candidates. Even for >> the same rule within the same compilation unit (C file) it is done in >> a few cases but not in all. How can this happen? > > There are many possible reasons. The first step is to look at the > dump file generated by -fdump-rtl-dbr. Could have thought of this myself .. Ok, I think I found it: GCC leaves control-flow operations as they are, if it can not place other operations in branch delay slots (represented as SEQUENCEs in GCC); or in other words: GCC does not represent empty delay slots. Is this correct? Boris From rsandifo@nildram.co.uk Wed Dec 12 18:47:00 2007 From: rsandifo@nildram.co.uk (Richard Sandiford) Date: Wed, 12 Dec 2007 18:47:00 -0000 Subject: branch delay slots In-Reply-To: (Boris Boesler's message of "Wed\, 12 Dec 2007 18\:50\:36 +0100") References: <0375011D-0C51-4101-801E-5AC4217DB0DC@gmx.de> Message-ID: <87d4tblr2r.fsf@firetop.home> Boris Boesler writes: > Ok, I think I found it: GCC leaves control-flow operations as they > are, if it can not place other operations in branch delay slots > (represented as SEQUENCEs in GCC); or in other words: GCC does not > represent empty delay slots. > > Is this correct? Yeah. See e.g. the way the MIPS port deals with this. The jump and call patterns add a nop to the asm string when final_sequence is null. Richard From jcpiza@gmail.com Wed Dec 12 19:41:00 2007 From: jcpiza@gmail.com (J.C. 
Pizarro) Date: Wed, 12 Dec 2007 19:41:00 -0000 Subject: Something is broken in repack. Why not with fork and pipes? Message-ID: <998d0e4a0712121047m3cb09f37qc3157b96e5d171e7@mail.gmail.com> At http://gcc.gnu.org/ml/gcc/2007-12/msg00360.html, Andreas Ericsson wrote: > If it's still an issue next week, we'll have a 16 core (8 dual-core cpu's) > machine with some 32gb of ram in that'll be free for about two days. > You'll have to remind me about it though, as I've got a lot on my mind > these days. > > > -- > Andreas Ericsson andreas.ericsson@op5.se > OP5 AB www.op5.se > Tel: +46 8-230225 Fax: +46 8-230231 It's good idea if it's for 24/365.25 that it does autorepack-compute-again-again-again-those-unexplored-deltas of git repositories in realtime. :D Some body can do "git clone" that it could give smaller that one hour ago :D ----------------------------------------------------------------- To Linus, Why don't you forget the threaded implementation of your repo-pack? To imagine a "buggy bloated threading implementation originated to try it to work only in HyperThreading Intel CPUs and 8 cores x 8 threads/core Niagara Sparcs" IMHO, in multicored machine, multiprocessed implementation of repo-pack perfomes better than multithreaded implementation, although i've not their results. It has not issue, not problem, etc. with memory allocation of threads, so monothreaded memory allocation is simple and fast! You can see "Why not with fork and pipes like in linux?" at http://gcc.gnu.org/ml/gcc/2007-12/msg00203.html http://gcc.gnu.org/ml/gcc/2007-12/msg00209.html For easy implementation, don't use threads due to complicated condition races between threads of multithreaded processes. To use only condition races between monothreaded processes with select/epoll only in the parent process. It's due to the KISS principle works. 
The children processes share almost readed-only memory due to COW (Copy On Write), so, before forking, the parent must to have a large plain data structures in C for children. The children use pipes to realize a complex intercommunication that the parent updates the results computated by the children almost of the time. Another implementation is that the children can realize a locked load-and-store to/from unique filesystem's database if big memory to store data is a big problem. Another implementation is to consider children processes as intensive-CPU slaves and parent process as the master that manipulates the big database. If you want to measure the performance between multiprocessed vs multithreaded implementation of repo-pack then you have to remember that For same data input size and same data output size, to get the seconds of your wall-clock or watch-clock as a measure of the benchmark of this repo-pack. The numeric data posted to mailing list about the timings dependently of # of threads are bad measured because they don't say how is small the result repo. and don't say if the results are the same independently of # of threads. For good measures, we need "to plot the curves", e.g. based in ( # of threads, elapsed time of wall-clock, data input size, data output size ) and we can observe the intersection between above curves. J.C.Pizarro From Johannes.Schindelin@gmx.de Wed Dec 12 20:08:00 2007 From: Johannes.Schindelin@gmx.de (Johannes Schindelin) Date: Wed, 12 Dec 2007 20:08:00 -0000 Subject: Something is broken in repack. Why not with fork and pipes? In-Reply-To: <998d0e4a0712121047m3cb09f37qc3157b96e5d171e7@mail.gmail.com> References: <998d0e4a0712121047m3cb09f37qc3157b96e5d171e7@mail.gmail.com> Message-ID: Hi, On Wed, 12 Dec 2007, J.C. Pizarro wrote: > It's good idea if it's for 24/365.25 that it does > autorepack-compute-again-again-again-those-unexplored-deltas of > git repositories in realtime. :D This sentence does not parse. 
> Some body can do "git clone" that it could give smaller that one hour ago :D Neither does this. > To Linus, Why don't you forget the threaded implementation of your > repo-pack? Please do a little research before you ask such questions: it is neither Linus who did it, nor is it better to use processes than threads. Besides, your proposal has nothing to do with the issue of this thread (memory consumption). Ciao, Dscho From dnovillo@google.com Wed Dec 12 20:54:00 2007 From: dnovillo@google.com (Diego Novillo) Date: Wed, 12 Dec 2007 20:54:00 -0000 Subject: [RFC] WHOPR - A whole program optimizer framework for GCC Message-ID: <47603F3C.2090808@google.com> Over the last few weeks we (Google) have been discussing ideas on how to leverage the LTO work to implement a whole program optimizer that is both fast and scalable. While we do not have everything thought out in detail, we think we have enough to start doing some implementation work. I tried attaching the document, but the mailing list rejected it. I've uploaded it to http://airs.com/dnovillo/pub/whopr.pdf The most important goal we have with this project is the ability to handle Really Large programs (millions of functions, millions of call-graph edges) with some grace. So, the design tries pretty hard to make use of concurrency and clusters to partition the work. At this point we are interested in getting feedback on the general idea. There is some refactoring that will be needed inside the call-graph manager and some aspects of the design may not even need a lot of changes in GCC. But in general, it will require very efficient IR streaming. In terms of implementation, we will likely use the LTO branch as a basis. Many of the features we will need are already being implemented in the branch, so we will keep helping with that implementation. Thanks. Diego. From jcpiza@gmail.com Wed Dec 12 21:15:00 2007 From: jcpiza@gmail.com (J.C. 
Pizarro) Date: Wed, 12 Dec 2007 21:15:00 -0000 Subject: [RFC] WHOPR - A whole program optimizer framework for GCC Message-ID: <998d0e4a0712121254q5e8d48dfxe9afc225e3f28b20@mail.gmail.com> On 2007/12/12, "Diego Novillo" wrote: > Over the last few weeks we (Google) have been discussing ideas on how to > leverage the LTO work to implement a whole program optimizer that is > both fast and scalable. > > While we do not have everything thought out in detail, we think we have > enough to start doing some implementation work. I tried attaching the document, > but the mailing list rejected it. I've uploaded it to > http://airs.com/dnovillo/pub/whopr.pdf > > The most important goal we have with this project is the ability to > handle Really Large programs (millions of functions, millions of > call-graph edges) with some grace. So, the design tries pretty hard to > make use of concurrency and clusters to partition the work. > > At this point we are interested in getting feedback on the general idea. > There is some refactoring that will be needed inside the call-graph > manager and some aspects of the design may not even need a lot of > changes in GCC. But in general, it will require very efficient IR streaming. > > In terms of implementation, we will likely use the LTO branch as a > basis. Many of the features we will need are already being implemented > in the branch, so we will keep helping with that implementation. > > Thanks. Diego. * The googlish user says "i'm using the massive googlecc compiler that uses a lot of tons of libraries distributed in all the world!" * google shutdown => googlecc compiler doesn't work, ended history, byebye. J.C.Pizarro From jcpiza@gmail.com Wed Dec 12 22:42:00 2007 From: jcpiza@gmail.com (J.C. 
Pizarro) Date: Wed, 12 Dec 2007 22:42:00 -0000 Subject: [RFC] WHOPR - A whole program optimizer framework for GCC In-Reply-To: <4348dea50712121315l6d11ebagd96aff02d2e0f9f7@mail.gmail.com> References: <998d0e4a0712121254q5e8d48dfxe9afc225e3f28b20@mail.gmail.com> <4348dea50712121315l6d11ebagd96aff02d2e0f9f7@mail.gmail.com> Message-ID: <998d0e4a0712121442t6194ba98kcb3fa6c72a675e07@mail.gmail.com> On 2007/12/12, Jonathan Wakely wrote: > On 12/12/2007, J.C. Pizarro wrote: > > > > * The googlish user says > > "i'm using the massive googlecc compiler that uses a lot of tons > > of libraries > > distributed in all the world!" > > > > * google shutdown => googlecc compiler doesn't work, ended history, byebye. > > Yet again you've jumped into a thread without understanding the > subject and you've said something completely irrelevant. Bravo! > > I suggest you shut up and read the paper before you embarrass yourself further. > > This is only a suggestion, intended to be helpful. I respect your > right to say anything you want and to make yourself look like a fool > in public. I will not prevent you from doing that. > > Jon > They are gaming or playing with the words of the language for Google. If the world is global then ^^ what means "global optimizer" using the infraestructure for google? ^^. For google: ----------- * "whole program optimizer infrastructure for GCC" means "a whole program (like DoD program?) to build an optimizer infrastructure (physical, it that has Google but want to optimize still more it) for GCC"? * "flexible enough to accommodate" means "like it from google.com"? * "an efficient implementation" means "an efficient construction of the infrastructure"? * "Whole-program analysis" means "to analyze the whole program"? * "massive memory consumption during compilation" means "hey, more memory? => more machines like from google!"? 
* "whole-program optimization framework for GCC" doesn't mean "program whole-optimization framework for GCC" but, means it "a whole-program that optimizes the framework for GCC"? * "Call-graph partitioning to group closely related functions" means "fragmenting the graph of remote calls to group closely related services"? * "Support incremental optimization" means "we can support incrementaly our services of optimizing the infrastructure"? * "impossible (or undesirable) to fit all the function bodies in memory" means "impossible in memory of machines but possible in disks of machines"? * "global call-graph" means "world's graph of remote calls"? * "global call-graph itself can always fit in memory" means "ohh, the world's graph of remote calls always is in memory of machines"? * "1M nodes" means "one million of machines"? * "1M edges" means "one million of cables"? * "< 500 Mb of memory" means "< 500 megabits of memory for remote calls"? * "WHOPR tries to maximize the amount of parallel and independent work to take advantage of clustered and multiprocessor machines" means "WHOPR tries to use ALL that Google has parallelly it taking its advantage due to its machines that Google has"? * "local generation" means "it generated locally by each machine"? * "global call-graph is assembled" means "world's graph of remote calls is assembled"? * "global analysis process makes transformation decisions" means "the process that Google analyzes spying the world make decisions of how to transformate the world"? * "partitioned to facilite optimization" means "fragmented to facilite the reduced computation of Google"? Or * "global call-graph may be partitioned to facilitate optimization" means "partitioned to metropolies, cities and towns the ADSL telephonic branches needed by Google to facilitate its operation of this infrastructure"? * "local transformations" means "the local police will do its action against e-criminals"? 
* "executable" means "it's from an execution to death of the e-prisoner"? * "Indirect call promotion" means "this promotion indirectly ehhhh?"? * "Dead variable elimination" means "elimination variable of R.I.P.s"? * etc. J.C.Pizarro i though that the Apocalypsis is near. From jwakely.gcc@gmail.com Wed Dec 12 22:42:00 2007 From: jwakely.gcc@gmail.com (Jonathan Wakely) Date: Wed, 12 Dec 2007 22:42:00 -0000 Subject: [RFC] WHOPR - A whole program optimizer framework for GCC In-Reply-To: <998d0e4a0712121254q5e8d48dfxe9afc225e3f28b20@mail.gmail.com> References: <998d0e4a0712121254q5e8d48dfxe9afc225e3f28b20@mail.gmail.com> Message-ID: <4348dea50712121315l6d11ebagd96aff02d2e0f9f7@mail.gmail.com> On 12/12/2007, J.C. Pizarro wrote: > > * The googlish user says > "i'm using the massive googlecc compiler that uses a lot of tons > of libraries > distributed in all the world!" > > * google shutdown => googlecc compiler doesn't work, ended history, byebye. Yet again you've jumped into a thread without understanding the subject and you've said something completely irrelevant. Bravo! I suggest you shut up and read the paper before you embarrass yourself further. This is only a suggestion, intended to be helpful. I respect your right to say anything you want and to make yourself look like a fool in public. I will not prevent you from doing that. Jon From gccadmin@gcc.gnu.org Wed Dec 12 22:49:00 2007 From: gccadmin@gcc.gnu.org (gccadmin@gcc.gnu.org) Date: Wed, 12 Dec 2007 22:49:00 -0000 Subject: gcc-4.2-20071212 is now available Message-ID: <20071212224158.16663.qmail@sourceware.org> Snapshot gcc-4.2-20071212 is now available on ftp://gcc.gnu.org/pub/gcc/snapshots/4.2-20071212/ and on various mirrors, see http://gcc.gnu.org/mirrors.html for details. 
This snapshot has been generated from the GCC 4.2 SVN branch
with the following options: svn://gcc.gnu.org/svn/gcc/branches/gcc-4_2-branch revision 130795

You'll find:

gcc-4.2-20071212.tar.bz2              Complete GCC (includes all of below)

gcc-core-4.2-20071212.tar.bz2         C front end and core compiler

gcc-ada-4.2-20071212.tar.bz2          Ada front end and runtime

gcc-fortran-4.2-20071212.tar.bz2      Fortran front end and runtime

gcc-g++-4.2-20071212.tar.bz2          C++ front end and runtime

gcc-java-4.2-20071212.tar.bz2         Java front end and runtime

gcc-objc-4.2-20071212.tar.bz2         Objective-C front end and runtime

gcc-testsuite-4.2-20071212.tar.bz2    The GCC testsuite

Diffs from 4.2-20071205 are available in the diffs/ subdirectory.

When a particular snapshot is ready for public consumption the LATEST-4.2
link is updated and a message is sent to the gcc list.  Please do not use
a snapshot before it has been announced that way.

From drow@false.org Wed Dec 12 22:50:00 2007
From: drow@false.org (Daniel Jacobowitz)
Date: Wed, 12 Dec 2007 22:50:00 -0000
Subject: [RFC] WHOPR - A whole program optimizer framework for GCC
In-Reply-To: <998d0e4a0712121442t6194ba98kcb3fa6c72a675e07@mail.gmail.com>
References: <998d0e4a0712121254q5e8d48dfxe9afc225e3f28b20@mail.gmail.com>
 <4348dea50712121315l6d11ebagd96aff02d2e0f9f7@mail.gmail.com>
 <998d0e4a0712121442t6194ba98kcb3fa6c72a675e07@mail.gmail.com>
Message-ID: <20071212224928.GA11746@caradoc.them.org>

On Wed, Dec 12, 2007 at 11:42:23PM +0100, J.C. Pizarro wrote:
> They are gaming or playing with the words of the language for Google.

This is absurd and off-topic.  Please stop.
-- Daniel Jacobowitz CodeSourcery From sebpop@gmail.com Wed Dec 12 23:29:00 2007 From: sebpop@gmail.com (Sebastian Pop) Date: Wed, 12 Dec 2007 23:29:00 -0000 Subject: [RFC] WHOPR - A whole program optimizer framework for GCC In-Reply-To: <998d0e4a0712121442t6194ba98kcb3fa6c72a675e07@mail.gmail.com> References: <998d0e4a0712121254q5e8d48dfxe9afc225e3f28b20@mail.gmail.com> <4348dea50712121315l6d11ebagd96aff02d2e0f9f7@mail.gmail.com> <998d0e4a0712121442t6194ba98kcb3fa6c72a675e07@mail.gmail.com> Message-ID: Please stop spamming my gcc@gcc.gnu.org email box. All the messages that you sent are just off-topic for this mailing list. Please STOP sending emails. Thank you, Sebastian Pop On Dec 12, 2007 4:42 PM, J.C. Pizarro wrote: > > On 2007/12/12, Jonathan Wakely wrote: > > On 12/12/2007, J.C. Pizarro wrote: > > > > > > * The googlish user says > > > "i'm using the massive googlecc compiler that uses a lot of tons > > > of libraries > > > distributed in all the world!" > > > > > > * google shutdown => googlecc compiler doesn't work, ended history, byebye. > > > > Yet again you've jumped into a thread without understanding the > > subject and you've said something completely irrelevant. Bravo! > > > > I suggest you shut up and read the paper before you embarrass yourself further. > > > > This is only a suggestion, intended to be helpful. I respect your > > right to say anything you want and to make yourself look like a fool > > in public. I will not prevent you from doing that. > > > > Jon > > > > They are gaming or playing with the words of the language for Google. > > If the world is global then > ^^ what means "global optimizer" using the infraestructure for google? ^^. > > For google: > ----------- > * "whole program optimizer infrastructure for GCC" means > "a whole program (like DoD program?) to build an optimizer infrastructure > (physical, it that has Google but want to optimize still more it) for GCC"? 
> > * "flexible enough to accommodate" means "like it from google.com"? > > * "an efficient implementation" means > "an efficient construction of the infrastructure"? > > * "Whole-program analysis" means "to analyze the whole program"? > > * "massive memory consumption during compilation" means > "hey, more memory? => more machines like from google!"? > > * "whole-program optimization framework for GCC" doesn't mean > "program whole-optimization framework for GCC" but, > means it "a whole-program that optimizes the framework for GCC"? > > * "Call-graph partitioning to group closely related functions" means > "fragmenting the graph of remote calls to group closely related services"? > > * "Support incremental optimization" means > "we can support incrementaly our services of optimizing the infrastructure"? > > * "impossible (or undesirable) to fit all the function bodies in memory" > means "impossible in memory of machines but possible in disks of machines"? > > * "global call-graph" means "world's graph of remote calls"? > > * "global call-graph itself can always fit in memory" means > "ohh, the world's graph of remote calls always is in memory of machines"? > > * "1M nodes" means "one million of machines"? > > * "1M edges" means "one million of cables"? > > * "< 500 Mb of memory" means "< 500 megabits of memory for remote calls"? > * "WHOPR tries to maximize the amount of parallel and independent work to > take advantage of clustered and multiprocessor machines" means > "WHOPR tries to use ALL that Google has parallelly it taking its advantage > due to its machines that Google has"? > > * "local generation" means "it generated locally by each machine"? > > * "global call-graph is assembled" means > "world's graph of remote calls is assembled"? > > * "global analysis process makes transformation decisions" means > "the process that Google analyzes spying the world make decisions of how to > transformate the world"? 
> > * "partitioned to facilite optimization" means "fragmented to facilite the > reduced computation of Google"? Or > > * "global call-graph may be partitioned to facilitate optimization" means > "partitioned to metropolies, cities and towns the ADSL telephonic branches > needed by Google to facilitate its operation of this infrastructure"? > > * "local transformations" means "the local police will do its action > against e-criminals"? > > * "executable" means "it's from an execution to death of the e-prisoner"? > > * "Indirect call promotion" means "this promotion indirectly ehhhh?"? > > * "Dead variable elimination" means "elimination variable of R.I.P.s"? > > * etc. > > J.C.Pizarro i though that the Apocalypsis is near. > From tejgcc@westnet.com.au Wed Dec 12 23:32:00 2007 From: tejgcc@westnet.com.au (Tim Josling) Date: Wed, 12 Dec 2007 23:32:00 -0000 Subject: [RFC] WHOPR - A whole program optimizer framework for GCC In-Reply-To: <47603F3C.2090808@google.com> References: <47603F3C.2090808@google.com> Message-ID: <1197502096.6006.14.camel@tim-gcc> On Wed, 2007-12-12 at 15:06 -0500, Diego Novillo wrote: > Over the last few weeks we (Google) have been discussing ideas on how to > leverage the LTO work to implement a whole program optimizer that is > both fast and scalable. > > While we do not have everything thought out in detail, we think we have > enough to start doing some implementation work. I tried attaching the > document, but the mailing list rejected it. I've uploaded it to > http://airs.com/dnovillo/pub/whopr.pdf A few questions: Do you have any thoughts on how this approach would be able to use profiling information, which is very a very powerful source of information for producing good optimisations? Would there be much duplication of code between this and normal GCC processing or would it be possible to share a common code base? 
A few years back there were various suggestions about having files containing intermediate representations and this was criticised because it could make it possible for people to subvert the GPL by connecting to the optimisation phases via such an intermediate file. Arguably the language front end is then a different program and not covered by the GPL. It might be worth thinking about this aspect. This also triggers the thought that if you have this intermediate representation, and it is somewhat robust to GCC patchlevels, you do not actually need source code of proprietary libraries to optimize into them. You only need the intermediate files, which may be easier to get than source code. Tim Josling From lopezibanez@gmail.com Wed Dec 12 23:41:00 2007 From: lopezibanez@gmail.com (Manuel López-Ibáñez) Date: Wed, 12 Dec 2007 23:41:00 -0000 Subject: Poisonous people Message-ID: <6c33472e0712121532l5afd2f39ma763fb5934728253@mail.gmail.com> On 12/12/2007, Daniel Jacobowitz wrote: > On Wed, Dec 12, 2007 at 11:42:23PM +0100, J.C. Pizarro wrote: > > They are gaming or playing with the words of the language for Google. > > This is absurd and off-topic. Please stop. > This time it went too far. We have been infinitely patient. We have tried to understand. We have tried to answer. We have tried to teach. I am Spanish and I understand that language barriers are a problem. But language barriers can't excuse immodest ignorance and pretentiousness up to the point of upsetting actual contributors to this project. Every second that I wasted on reading or replying to him is a second of my life that I wasted. I believe the time and attention that useful contributors are spending replying or reading (or just deleting) his diatribes damages GCC's development. I just kill-filed J.C. Pizarro and I suggest you do the same. I will also support banning him from the main gcc mailing list. If he ever wants to send a patch, he still could use gcc-patches. Plonk! Manuel.
From harvey.harrison@gmail.com Thu Dec 13 00:03:00 2007 From: harvey.harrison@gmail.com (Harvey Harrison) Date: Thu, 13 Dec 2007 00:03:00 -0000 Subject: [RFC] WHOPR - A whole program optimizer framework for GCC In-Reply-To: <47603F3C.2090808@google.com> References: <47603F3C.2090808@google.com> Message-ID: <1197502902.21291.52.camel@brick> On Wed, 2007-12-12 at 15:06 -0500, Diego Novillo wrote: > Over the last few weeks we (Google) have been discussing ideas on how to > leverage the LTO work to implement a whole program optimizer that is > both fast and scalable. > > While we do not have everything thought out in detail, we think we have > enough to start doing some implementation work. I tried attaching the > document, but the mailing list rejected it. I've uploaded it to > http://airs.com/dnovillo/pub/whopr.pdf > > The most important goal we have with this project is the ability to > handle Really Large programs (millions of functions, millions of > call-graph edges) with some grace. So, the design tries pretty hard to > make use of concurrency and clusters to partition the work. > > At this point we are interested in getting feedback on the general idea. > There is some refactoring that will be needed inside the call-graph > manager and some aspects of the design may not even need a lot of > changes in GCC. But in general, it will require very efficient IR streaming. > > In terms of implementation, we will likely use the LTO branch as a > basis. Many of the features we will need are already being implemented > in the branch, so we will keep helping with that implementation. > I'm curious how this interacts/complements with any efforts to using the LLVM IR in LTO. Any pointers to where that discussion ended up? 
Harvey From clattner@apple.com Thu Dec 13 00:11:00 2007 From: clattner@apple.com (Chris Lattner) Date: Thu, 13 Dec 2007 00:11:00 -0000 Subject: [RFC] WHOPR - A whole program optimizer framework for GCC In-Reply-To: <1197502902.21291.52.camel@brick> References: <47603F3C.2090808@google.com> <1197502902.21291.52.camel@brick> Message-ID: On Dec 12, 2007, at 3:41 PM, Harvey Harrison wrote: >> In terms of implementation, we will likely use the LTO branch as a >> basis. Many of the features we will need are already being >> implemented >> in the branch, so we will keep helping with that implementation. >> > > I'm curious how this interacts/complements with any efforts to > using the LLVM IR in LTO. > > Any pointers to where that discussion ended up? There are no plans to integrate LLVM with mainline GCC. LLVM maintains its own permanent fork of GCC, which we periodically sync up with GCC's progress (e.g. LLVM 2.2 will include a GCC 4.2 based front-end). There is also work underway to build llvm-native front- end technology (http://clang.llvm.org). If you want LTO today, feel free to go to http://llvm.org/ :) otherwise LLVM is irrelevant to this discussion. -Chris http://llvm.org/ http://clang.llvm.org/ http://nondot.org/sabre/ From harvey.harrison@gmail.com Thu Dec 13 07:14:00 2007 From: harvey.harrison@gmail.com (Harvey Harrison) Date: Thu, 13 Dec 2007 07:14:00 -0000 Subject: [RFC] WHOPR - A whole program optimizer framework for GCC In-Reply-To: References: <47603F3C.2090808@google.com> <1197502902.21291.52.camel@brick> Message-ID: <1197504667.21291.57.camel@brick> On Wed, 2007-12-12 at 16:02 -0800, Chris Lattner wrote: > On Dec 12, 2007, at 3:41 PM, Harvey Harrison wrote: > >> In terms of implementation, we will likely use the LTO branch as a > >> basis. Many of the features we will need are already being > >> implemented > >> in the branch, so we will keep helping with that implementation. 
> >> > > > > I'm curious how this interacts/complements with any efforts to > > using the LLVM IR in LTO. > > > > Any pointers to where that discussion ended up? > > There are no plans to integrate LLVM with mainline GCC. LLVM > maintains its own permanent fork of GCC, which we periodically sync > up with GCC's progress (e.g. LLVM 2.2 will include a GCC 4.2 based > front-end). There is also work underway to build llvm-native front- > end technology (http://clang.llvm.org). > > If you want LTO today, feel free to go to http://llvm.org/ :) > otherwise LLVM is irrelevant to this discussion. I was more interested in the format of the IR gcc ends up using, I was curious where the discussion had gotten for LTO in gcc-land. The LLVM representation seemed rather sane, and already has at least one implementation of tools using it. Harvey From praveen.gcc@gmail.com Thu Dec 13 07:32:00 2007 From: praveen.gcc@gmail.com (Praveen Raghavan) Date: Thu, 13 Dec 2007 07:32:00 -0000 Subject: [RFC] WHOPR - A whole program optimizer framework for GCC In-Reply-To: <47603F3C.2090808@google.com> References: <47603F3C.2090808@google.com> Message-ID: <738f85410712122314y75cfe3acyb3a7b1d6ef99044@mail.gmail.com> > While we do not have everything thought out in detail, we think we have > enough to start doing some implementation work. I tried attaching the > document, but the mailing list rejected it. I've uploaded it to > http://airs.com/dnovillo/pub/whopr.pdf > Very very interesting proposal indeed! I have a few questions: 1. Are there also plans to extend the global transformation capabilities. I see that the original set of global transformations is limited (rightfully so). 2. Also any thoughts on how you keep the complete GIMPLE representation of millions of functions together? You would have some serious complexity issues inside the WPA engine? Or is it the idea that you start with the minimal information in the wpo1 file and if required read in the GIMPLE section? 3. 
Is there a plan/schedule on when 'a' version of this would be out? Regards, Praveen From ae@op5.se Thu Dec 13 07:39:00 2007 From: ae@op5.se (Andreas Ericsson) Date: Thu, 13 Dec 2007 07:39:00 -0000 Subject: Something is broken in repack In-Reply-To: References: <9e4733910712071505y6834f040k37261d65a2d445c4@mail.gmail.com> <9e4733910712101825l33cdc2c0mca2ddbfd5afdb298@mail.gmail.com> <9e4733910712102125w56c70c0cxb8b00a060b62077@mail.gmail.com> <9e4733910712102129v140c2affqf2e73e75855b61ea@mail.gmail.com> <9e4733910712102301p5e6c4165v6afb32d157478828@mail.gmail.com> <9e4733910712110821o7748802ag75d9df4be8b2c123@mail.gmail.com> Message-ID: <4760E005.6040102@op5.se> Nicolas Pitre wrote: > On Wed, 12 Dec 2007, Nicolas Pitre wrote: > >> I did modify the progress display to show accounted memory that was >> allocated vs memory that was freed but still not released to the system. >> At least that gives you an idea of memory allocation and fragmentation >> with glibc in real time:
>>
>> diff --git a/progress.c b/progress.c
>> index d19f80c..46ac9ef 100644
>> --- a/progress.c
>> +++ b/progress.c
>> @@ -8,6 +8,7 @@
>>   * published by the Free Software Foundation.
>>   */
>>
>> +#include <malloc.h>
>>  #include "git-compat-util.h"
>>  #include "progress.h"
>>
>> @@ -94,10 +95,12 @@ static int display(struct progress *progress, unsigned n, const char *done)
>>      if (progress->total) {
>>          unsigned percent = n * 100 / progress->total;
>>          if (percent != progress->last_percent || progress_update) {
>> +            struct mallinfo m = mallinfo();
>>              progress->last_percent = percent;
>> -            fprintf(stderr, "%s: %3u%% (%u/%u)%s%s",
>> -                progress->title, percent, n,
>> -                progress->total, tp, eol);
>> +            fprintf(stderr, "%s: %3u%% (%u/%u) %u/%uMB%s%s",
>> +                progress->title, percent, n, progress->total,
>> +                m.uordblks >> 18, m.fordblks >> 18,
>> +                tp, eol);
>
> Note: I didn't know what unit of memory those blocks represents, so the > shift is most probably wrong.
> Me neither, but it appears to me as if hblkhd holds the actual memory consumed by the process. It seems to store the information in bytes, which I find a bit dubious unless glibc has some internal multiplier. -- Andreas Ericsson andreas.ericsson@op5.se OP5 AB www.op5.se Tel: +46 8-230225 Fax: +46 8-230231 From aaw@google.com Thu Dec 13 07:53:00 2007 From: aaw@google.com (Ollie Wild) Date: Thu, 13 Dec 2007 07:53:00 -0000 Subject: [RFC] WHOPR - A whole program optimizer framework for GCC In-Reply-To: <1197502096.6006.14.camel@tim-gcc> References: <47603F3C.2090808@google.com> <1197502096.6006.14.camel@tim-gcc> Message-ID: <65dd6fd50712122339h2c856fdbpe127417e6726bd6@mail.gmail.com> On Dec 12, 2007 3:28 PM, Tim Josling wrote: > > Do you have any thoughts on how this approach would be able to use > profiling information, which is a very powerful source of > information for producing good optimisations? The intent is for the WPA phase to utilize profile information, both for making transformation decisions and for assigning call-graph edge weights (which impacts partitioning and, hence, the local optimizations available during LTRANS). We've also discussed the possibility of bypassing LTRANS altogether for portions of the call-graph which profiling determines are insignificant. > A few years back there were various suggestions about having files > containing intermediate representations and this was criticised because > it could make it possible for people to subvert the GPL by connecting > to the optimisation phases via such an intermediate file. Arguably the > language front end is then a different program and not covered by the > GPL. It might be worth thinking about this aspect. The lto branch is already doing this, so presumably that discussion was resolved (Maybe someone in the know should pipe up.). This proposal aims to leverage (and augment) that work in progress.
> This also triggers the thought that if you have this intermediate > representation, and it is somewhat robust to GCC patchlevels, you do not > actually need source code of proprietary libraries to optimize into > them. You only need the intermediate files, which may be easier to get > than source code. I believe a stable representation is an explicit non-goal of the LTO project (Perhaps that was the "resolution" of the discussion above.). It's an interesting idea, though. Maybe this is something to revisit once the representation has had a chance to stabilize. Ollie From aaw@google.com Thu Dec 13 08:10:00 2007 From: aaw@google.com (Ollie Wild) Date: Thu, 13 Dec 2007 08:10:00 -0000 Subject: [RFC] WHOPR - A whole program optimizer framework for GCC In-Reply-To: <738f85410712122314y75cfe3acyb3a7b1d6ef99044@mail.gmail.com> References: <47603F3C.2090808@google.com> <738f85410712122314y75cfe3acyb3a7b1d6ef99044@mail.gmail.com> Message-ID: <65dd6fd50712122353g693ae907ob43221ac3ac2f6e4@mail.gmail.com> On Dec 12, 2007 11:14 PM, Praveen Raghavan wrote: > > 1. Are there also plans to extend the global transformation > capabilities. I see that the original set of global transformations is > limited (rightfully so). This is still at a very early design stage. Additional transformations could (and should) be added where they make sense. The main constraints to consider are the time and memory requirements of the WPA phase. Since this is the only phase which can't be parallelized, its scalability is paramount. > 2. Also any thoughts on how you keep the complete GIMPLE > representation of millions of functions together? You would have some > serious complexity issues inside the WPA engine? > Or is it the idea that you start with the minimal information in the > wpo1 file and if required read in the GIMPLE section? You don't. WPA is the only global phase, and it operates only on summary data. The actual reading of GIMPLE occurs in LTRANS, which partitions the problem.
Depending on how large partitions need to be before reasonable performance benefits are observed, it may be necessary to allow LTRANS to swap functions in and out of memory. That's still an open question. > 3. Is there a plan/schedule on when 'a' version of this would be out? TBD. Ollie From bonzini@gnu.org Thu Dec 13 10:22:00 2007 From: bonzini@gnu.org (Paolo Bonzini) Date: Thu, 13 Dec 2007 10:22:00 -0000 Subject: [RFC] WHOPR - A whole program optimizer framework for GCC In-Reply-To: <998d0e4a0712121442t6194ba98kcb3fa6c72a675e07@mail.gmail.com> References: <998d0e4a0712121254q5e8d48dfxe9afc225e3f28b20@mail.gmail.com> <4348dea50712121315l6d11ebagd96aff02d2e0f9f7@mail.gmail.com> <998d0e4a0712121442t6194ba98kcb3fa6c72a675e07@mail.gmail.com> Message-ID: > They are gaming or playing with the words of the language for Google. > > If the world is global then > ^^ what means "global optimizer" using the infraestructure for google? ^^. Wow, paranoia is a new feature of J.C.'s messages. Seriously, your future employers might search for your name in the future, and this will not be good advertising for you. In the past it has already happened that someone asked his messages to be removed, and this request was impossible to fulfill (not even taking into account the moral issues). You are being ridiculed *forever*. Why can't you understand this? Paolo From njn@csse.unimelb.edu.au Thu Dec 13 13:32:00 2007 From: njn@csse.unimelb.edu.au (Nicholas Nethercote) Date: Thu, 13 Dec 2007 13:32:00 -0000 Subject: [RFC] WHOPR - A whole program optimizer framework for GCC In-Reply-To: <998d0e4a0712121442t6194ba98kcb3fa6c72a675e07@mail.gmail.com> References: <998d0e4a0712121254q5e8d48dfxe9afc225e3f28b20@mail.gmail.com> <4348dea50712121315l6d11ebagd96aff02d2e0f9f7@mail.gmail.com> <998d0e4a0712121442t6194ba98kcb3fa6c72a675e07@mail.gmail.com> Message-ID: On Wed, 12 Dec 2007, J.C. Pizarro wrote: > [...] > > * "executable" means "it's from an execution to death of the e-prisoner"? 
> > * "Indirect call promotion" means "this promotion indirectly ehhhh?"? > > * "Dead variable elimination" means "elimination variable of R.I.P.s"? > > * etc. > > J.C.Pizarro i though that the Apocalypsis is near. My theory is that J.C.Pizarro is an advanced AI chat-bot designed to produce streams of nearly-intelligible programming-related verbiage, and that last email was the result of a malfunction that caused it to dump part of its internal word association database. Nick From dnovillo@google.com Thu Dec 13 14:09:00 2007 From: dnovillo@google.com (Diego Novillo) Date: Thu, 13 Dec 2007 14:09:00 -0000 Subject: [RFC] WHOPR - A whole program optimizer framework for GCC In-Reply-To: <65dd6fd50712122339h2c856fdbpe127417e6726bd6@mail.gmail.com> References: <47603F3C.2090808@google.com> <1197502096.6006.14.camel@tim-gcc> <65dd6fd50712122339h2c856fdbpe127417e6726bd6@mail.gmail.com> Message-ID: <4761334A.2070709@google.com> On 12/13/07 2:39 AM, Ollie Wild wrote: > The lto branch is already doing this, so presumably that discussion > was resolved (Maybe someone in the know should pipe up.). Yes, streaming the IL to/from disk is a resolved issue. > I believe a stable representation is an explicit non-goal of the LTO > project (Perhaps that was the "resolution" of the discussion above.). Right. The on-disk representation will tend to be unstable. Diego. 
From pclouds@gmail.com Thu Dec 13 14:09:00 2007 From: pclouds@gmail.com (Nguyen Thai Ngoc Duy) Date: Thu, 13 Dec 2007 14:09:00 -0000 Subject: Something is broken in repack In-Reply-To: References: <9e4733910712071505y6834f040k37261d65a2d445c4@mail.gmail.com> <9e4733910712102125w56c70c0cxb8b00a060b62077@mail.gmail.com> <9e4733910712102129v140c2affqf2e73e75855b61ea@mail.gmail.com> <9e4733910712102301p5e6c4165v6afb32d157478828@mail.gmail.com> <9e4733910712110821o7748802ag75d9df4be8b2c123@mail.gmail.com> Message-ID: On Dec 12, 2007 10:48 PM, Nicolas Pitre wrote: > In the mean time you might have to use only one thread and lots of > memory to repack the gcc repo, or find the perfect memory allocator to > be used with Git. After all, packing the whole gcc history to around > 230MB is quite a stunt but it requires sufficient resources to > achieve it. Fortunately, like Linus said, such a wholesale repack is not > something that most users have to do anyway. Is there an alternative to "git repack -a -d" that repacks everything but the first pack? -- Duy From dnovillo@google.com Thu Dec 13 14:41:00 2007 From: dnovillo@google.com (Diego Novillo) Date: Thu, 13 Dec 2007 14:41:00 -0000 Subject: [RFC] WHOPR - A whole program optimizer framework for GCC In-Reply-To: <1197502902.21291.52.camel@brick> References: <47603F3C.2090808@google.com> <1197502902.21291.52.camel@brick> Message-ID: <47613457.9070600@google.com> On 12/12/07 6:41 PM, Harvey Harrison wrote: > I'm curious how this interacts/complements with any efforts to > using the LLVM IR in LTO. No. At least not at this moment. GCC uses its own IR (GIMPLE) as the in-core and on-disk representation. > Any pointers to where that discussion ended up? There was some discussion about merging LLVM and GCC a couple of years back but nothing concrete came out of it. Diego. 
From espindola@google.com Thu Dec 13 16:30:00 2007 From: espindola@google.com (Rafael Espindola) Date: Thu, 13 Dec 2007 16:30:00 -0000 Subject: Git and GCC In-Reply-To: <1196897840.10408.57.camel@brick> References: <4aca3dc20712051108s216d3331t8061ef45b9aa324a@mail.gmail.com> <2007-12-05-21-23-14+trackit+sam@rfc1149.net> <1196891451.10408.54.camel@brick> <1196897840.10408.57.camel@brick> Message-ID: <38a0d8450712130640p1b5d74d6nfa124ad0b0110d64@mail.gmail.com> > Yes, everything, by default you only get the more modern branches/tags, > but it's all in there. If there is interest I can work with Bernardo > and get the rest publically exposed. I decided to give it a try, but could not find the tuples branch. Is it too hard to make gimple-tuples-branch and lto visible? > Harvey > > Thanks a lot, -- Rafael Avila de Espindola Google Ireland Ltd. Gordon House Barrow Street Dublin 4 Ireland Registered in Dublin, Ireland Registration Number: 368047 From bonzini@gnu.org Thu Dec 13 16:39:00 2007 From: bonzini@gnu.org (Paolo Bonzini) Date: Thu, 13 Dec 2007 16:39:00 -0000 Subject: Something is broken in repack In-Reply-To: References: <9e4733910712071505y6834f040k37261d65a2d445c4@mail.gmail.com> <9e4733910712102125w56c70c0cxb8b00a060b62077@mail.gmail.com> <9e4733910712102129v140c2affqf2e73e75855b61ea@mail.gmail.com> <9e4733910712102301p5e6c4165v6afb32d157478828@mail.gmail.com> <9e4733910712110821o7748802ag75d9df4be8b2c123@mail.gmail.com> Message-ID: <47615E04.8000400@gnu.org> >> Is there an alternative to "git repack -a -d" that repacks everything >> but the first pack? > > That would be a pretty good idea for big repositories. If I were to > implement it, I would actually add a .git/config option like > pack.permanent so that more than one pack could be made permanent; then > to repack really really everything you'd need "git repack -a -a -d". Actually there is something like this, as seen from the source of git-repack:

for e in `cd "$PACKDIR" && find . -type f -name '*.pack' \
        | sed -e 's/^\.\///' -e 's/\.pack$//'`
do
        if [ -e "$PACKDIR/$e.keep" ]; then
                : keep
        else
                args="$args --unpacked=$e.pack"
                existing="$existing $e"
        fi
done

So, just create a file named as the pack, but with extension ".keep". Paolo From j.sixt@viscovery.net Thu Dec 13 17:40:00 2007 From: j.sixt@viscovery.net (Johannes Sixt) Date: Thu, 13 Dec 2007 17:40:00 -0000 Subject: Something is broken in repack In-Reply-To: References: <9e4733910712071505y6834f040k37261d65a2d445c4@mail.gmail.com> <9e4733910712102125w56c70c0cxb8b00a060b62077@mail.gmail.com> <9e4733910712102129v140c2affqf2e73e75855b61ea@mail.gmail.com> <9e4733910712102301p5e6c4165v6afb32d157478828@mail.gmail.com> <9e4733910712110821o7748802ag75d9df4be8b2c123@mail.gmail.com> Message-ID: <47616044.7070504@viscovery.net> Paolo Bonzini schrieb: > Nguyen Thai Ngoc Duy wrote: >> On Dec 12, 2007 10:48 PM, Nicolas Pitre wrote: >>> In the mean time you might have to use only one thread and lots of >>> memory to repack the gcc repo, or find the perfect memory allocator to >>> be used with Git. After all, packing the whole gcc history to around >>> 230MB is quite a stunt but it requires sufficient resources to >>> achieve it. Fortunately, like Linus said, such a wholesale repack is not >>> something that most users have to do anyway. >> >> Is there an alternative to "git repack -a -d" that repacks everything >> but the first pack? > > That would be a pretty good idea for big repositories. If I were to > implement it, I would actually add a .git/config option like > pack.permanent so that more than one pack could be made permanent; then > to repack really really everything you'd need "git repack -a -a -d". It's already there: If you have a pack .git/objects/pack/pack-foo.pack, then "touch .git/objects/pack/pack-foo.keep" marks the pack as precious.
-- Hannes From clattner@apple.com Thu Dec 13 19:06:00 2007 From: clattner@apple.com (Chris Lattner) Date: Thu, 13 Dec 2007 19:06:00 -0000 Subject: [RFC] WHOPR - A whole program optimizer framework for GCC In-Reply-To: <47613457.9070600@google.com> References: <47603F3C.2090808@google.com> <1197502902.21291.52.camel@brick> <47613457.9070600@google.com> Message-ID: On Dec 13, 2007, at 5:32 AM, Diego Novillo wrote: > On 12/12/07 6:41 PM, Harvey Harrison wrote: >> Any pointers to where that discussion ended up? > > There was some discussion about merging LLVM and GCC a couple of > years back but nothing concrete came out of it. The concrete thing that came out of it is that it isn't going to happen. :) -Chris From harvey.harrison@gmail.com Thu Dec 13 20:31:00 2007 From: harvey.harrison@gmail.com (Harvey Harrison) Date: Thu, 13 Dec 2007 20:31:00 -0000 Subject: Git and GCC In-Reply-To: <38a0d8450712130640p1b5d74d6nfa124ad0b0110d64@mail.gmail.com> References: <4aca3dc20712051108s216d3331t8061ef45b9aa324a@mail.gmail.com> <2007-12-05-21-23-14+trackit+sam@rfc1149.net> <1196891451.10408.54.camel@brick> <1196897840.10408.57.camel@brick> <38a0d8450712130640p1b5d74d6nfa124ad0b0110d64@mail.gmail.com> Message-ID: <1197572755.898.15.camel@brick> On Thu, 2007-12-13 at 14:40 +0000, Rafael Espindola wrote: > > Yes, everything, by default you only get the more modern branches/tags, > > but it's all in there. If there is interest I can work with Bernardo > > and get the rest publically exposed. > > I decided to give it a try, but could not find the tuples branch. Is > it too hard to make gimple-tuples-branch and lto visible? > Here's a suggestion I sent to the git list; it's a bit longer than it needs to be, but I think you'll understand a lot better what's going on this way: After the recent discussions regarding the gcc svn mirror, I'm coming up with a recipe to set up your own git-svn mirror. Suggestions on the following are welcome.
// Create directory and initialize git
mkdir gcc
cd gcc
git init

// add the remote site that currently mirrors gcc
// I have chosen the name gcc.gnu.org *1* as my local name to refer to
// this; choose something else if you like
git remote add gcc.gnu.org git://git.infradead.org/gcc.git

// fetching someone else's remote branches is not a standard thing to do
// so we'll need to edit our .git/config file
// you should have a section that looks like:
[remote "gcc.gnu.org"]
        url = git://git.infradead.org/gcc.git
        fetch = +refs/heads/*:refs/remotes/gcc.gnu.org/*

// infradead's mirror puts the gcc svn branches in its own namespace
// refs/remotes/gcc.gnu.org/*
// change our fetch line accordingly
[remote "gcc.gnu.org"]
        url = git://git.infradead.org/gcc.git
        fetch = +refs/remotes/gcc.gnu.org/*:refs/remotes/gcc.gnu.org/*

// fetch the remote data from the mirror site
git remote update

// set up git-svn
// gcc has the standard trunk/branches/tags naming so use -s
// add a prefix so git-svn uses the metadata we just got from the
// mirror so we don't have to get everything from the svn server
// the --prefix must match whatever you chose in *1*, the trailing
// slash is important.
git svn init -s --prefix=gcc.gnu.org/ svn://gcc.gnu.org/svn/gcc

// your config should look like this now:
[core]
        repositoryformatversion = 0
        filemode = true
        bare = false
        logallrefupdates = true
[remote "gcc.gnu.org"]
        url = git://git.infradead.org/gcc.git
        fetch = +refs/remotes/gcc.gnu.org/*:refs/remotes/gcc.gnu.org/*
[svn-remote "svn"]
        url = svn://gcc.gnu.org/svn/gcc
        fetch = trunk:refs/remotes/gcc.gnu.org/trunk
        branches = branches/*:refs/remotes/gcc.gnu.org/*
        tags = tags/*:refs/remotes/gcc.gnu.org/tags/*

// Try and get more revisions from the svn server
// this may take a little while the first time as git-svn builds
// metadata to allow bi-directional operation
// Note: git-svn has a patch in testing to use a _vastly_ more
//       space efficient mapping from svn rev to git sha, I'd
//       suggest you get it.
//
// This will rebuild the mapping for every svn branch
git svn fetch

// If you only care about one branch
// Check out a local copy of the tuples branch and switch
// to it
git checkout -b tuples remotes/gcc.gnu.org/tuples

// Update the git-svn metadata
git svn rebase

Hope that helps to get you started. Harvey From rask@sygehus.dk Fri Dec 14 01:25:00 2007 From: rask@sygehus.dk (Rask Ingemann Lambertsen) Date: Fri, 14 Dec 2007 01:25:00 -0000 Subject: Status of simulator targets after dataflow merge In-Reply-To: <20070616153102.GS5690@sygehus.dk> References: <20070616153102.GS5690@sygehus.dk> Message-ID: <20071213203051.GU17368@sygehus.dk> On Sat, Jun 16, 2007 at 05:31:02PM +0200, Rask Ingemann Lambertsen wrote: > > The following targets have significantly more unexpected failures than > before: > > mipsisa64-unknown-elf 841 -> 921 g++ (my results) Results have recently improved dramatically to only 3 unexpected g++ failures. I'm guessing it's because of the changes in the binutils tree. -- Rask Ingemann Lambertsen Danish law requires addresses in e-mail to be logged and stored for a year From jnareb@gmail.com Fri Dec 14 02:31:00 2007 From: jnareb@gmail.com (Jakub Narebski) Date: Fri, 14 Dec 2007 02:31:00 -0000 Subject: Something is broken in repack References: <47616044.7070504@viscovery.net> Message-ID: Johannes Sixt wrote: > Paolo Bonzini schrieb: >> Nguyen Thai Ngoc Duy wrote: >>> >>> Is there an alternative to "git repack -a -d" that repacks everything >>> but the first pack? >> >> That would be a pretty good idea for big repositories. If I were to >> implement it, I would actually add a .git/config option like >> pack.permanent so that more than one pack could be made permanent; then >> to repack really really everything you'd need "git repack -a -a -d". > > It's already there: If you have a pack .git/objects/pack/pack-foo.pack, then > "touch .git/objects/pack/pack-foo.keep" marks the pack as precious.
Actually you can (and probably should) put the one line with the _reason_ pack is to be kept in the *.keep file. Hmmm... it is even documented in git-gc(1)... and git-index-pack(1) of all things. -- Jakub Narebski Warsaw, Poland ShadeHawk on #git From petschy@praire-chicken.com Fri Dec 14 02:39:00 2007 From: petschy@praire-chicken.com (Peter A. Felvegi) Date: Fri, 14 Dec 2007 02:39:00 -0000 Subject: ctor style cast vs c style cast Message-ID: <4761EAFF.1030801@praire-chicken.com> hello all, today i've run into this: if i cast a double value to an unsigned int using the C style cast when passing it to printf, it's fine. however, if i use the ctor style cast, i get a compile error. in theory, these two should do the same: create a temporary unsigned int, and assign the double to it after conversion, just the syntax is different. made a little test, see attachment. plain int's are ok, but when qualified with signed/unsigned the error occurs. i can't judge whether this is an error or not, please clarify. $ gcc -v Using built-in specs. Target: x86_64-linux-gnu Configured with: ../src/configure -v --enable-languages=c,c++,fortran,objc,obj-c++,treelang --prefix=/usr --enable-shared --with-system-zlib --libexecdir=/usr/lib --without-included-gettext --enable-threads=posix --enable-nls --program-suffix=-4.1 --enable-__cxa_atexit --enable-clocale=gnu --enable-libstdcxx-debug --enable-mpfr --enable-checking=release x86_64-linux-gnu Thread model: posix gcc version 4.1.2 20061115 (prerelease) (Debian 4.1.1-21) regards, p -------------- next part -------------- A non-text attachment was scrubbed... Name: t.cpp Type: text/x-c++src Size: 949 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... 
Name: signature.asc Type: application/pgp-signature Size: 252 bytes Desc: OpenPGP digital signature URL: From fang@csl.cornell.edu Fri Dec 14 06:00:00 2007 From: fang@csl.cornell.edu (David Fang) Date: Fri, 14 Dec 2007 06:00:00 -0000 Subject: ctor style cast vs c style cast In-Reply-To: <4761EAFF.1030801@praire-chicken.com> References: <4761EAFF.1030801@praire-chicken.com> Message-ID: <20071213213652.N93316@shannon.csl.cornell.edu> > today i've run into this: if i cast a double value to an unsigned int > using the C style cast when passing it to printf, it's fine. however, if > i use the ctor style cast, i get a compile error. in theory, these two > should do the same: create a temporary unsigned int, and assign the > double to it after conversion, just the syntax is different. made a > little test, see attachment. plain int's are ok, but when qualified with > signed/unsigned the error occurs. Hi, I've run into this before, not sure if it's a bug, but what I do is use typedefs to work around it. (gcc has rejected your example on your error-marked lines for a long time) Try the following for kicks:

typedef signed int sint;
typedef unsigned int uint;
typedef const int cint;
typedef const sint csint;
typedef const uint cuint;

and replace them where you use ctor style initializations. David Fang Computer Systems Laboratory Electrical & Computer Engineering Cornell University http://www.csl.cornell.edu/~fang/ -- (2400 baud? Netscape 3.0?? lynx??? No problem!) From webminster@gmail.com Fri Dec 14 06:24:00 2007 From: webminster@gmail.com (=?GB2312?B?xcDFwLPm?=) Date: Fri, 14 Dec 2007 06:24:00 -0000 Subject: help pls, about gcc-3.0.1 on HP unix Message-ID: <23e0013d0712132200x1e6cc510w7099262e53cf0e4f@mail.gmail.com> Hi Tapani Tarvainen, sorry to interrupt you. I found a mail of yours on the gnu-gcc mailing list: http://gcc.gnu.org/ml/gcc/2002-01/msg00062.html You said you have installed gcc 3.0.3 on HP-UX 11.11, but I really wasn't successful with it.
my step: export CC=/opt/hp-gcc/bin/gcc (the gcc is binary gcc-3.3.6 or use "/opt/ansic/bin/cc") export CFLAGS="-D_HPUX_SOURCE" ../configure --disable-nls --disable-threads --prefix=/usr/local/gcc-3.0.1 --with-as=/usr/local/bin/as --with-ld=... --disable-hecking --enable--long-long --host=hppa2.0w-hp-hpux11.11 --enable-languages=c,c++ make bootstrap (the as is from binutils, but haven't ld) waiting for about 20mins. the error is: " /usr/ccs/bin/ld: Unsatisfied symbols: __main (first referenced in gengenrtl.o) (code) collect2: ld returned 1 exit status *** Error exit code 1 Stop. *** Error exit code 1 Stop. *** Error exit code 1 Stop. " it seems, the problem with system's ld, but the ld of binutils don't support hppa2.0w-hp-hpux11.11 . could you pls help me on that. I'm very appreciated to you. -- Thanks and Best Regards, Ma GuoLiang (???) Tel: (86-10)82782244-2392 http://www.papachong.org/ Internet Email: webminster@gmail.com, papa_chong@hotmail.com From pclouds@gmail.com Fri Dec 14 06:39:00 2007 From: pclouds@gmail.com (Nguyen Thai Ngoc Duy) Date: Fri, 14 Dec 2007 06:39:00 -0000 Subject: Something is broken in repack In-Reply-To: References: <47616044.7070504@viscovery.net> Message-ID: On Dec 14, 2007 1:14 PM, Paolo Bonzini wrote: > > Hmmm... it is even documented in git-gc(1)... and git-index-pack(1) of > > all things. > > I found that the .keep file is not transmitted over the network (at > least I tried with git+ssh:// and http:// protocols), however. I'm thinking about "git clone --keep" to mark initial packs precious. But 'git clone' is under rewrite to C. Let's wait until C rewrite is done. -- Duy From fanzier@gmail.com Fri Dec 14 08:42:00 2007 From: fanzier@gmail.com (Fan Zhang) Date: Fri, 14 Dec 2007 08:42:00 -0000 Subject: how to compile gcc4 on cygwin? Message-ID: how to compile gcc4 on cygwin? 
thanks From bonzini@gnu.org Fri Dec 14 09:02:00 2007 From: bonzini@gnu.org (Paolo Bonzini) Date: Fri, 14 Dec 2007 09:02:00 -0000 Subject: Something is broken in repack In-Reply-To: References: <47616044.7070504@viscovery.net> Message-ID: > I'm thinking about "git clone --keep" to mark initial packs precious. > But 'git clone' is under rewrite to C. Let's wait until C rewrite is > done. It should be the default, IMHO. Paolo From harvey.harrison@gmail.com Fri Dec 14 09:39:00 2007 From: harvey.harrison@gmail.com (Harvey Harrison) Date: Fri, 14 Dec 2007 09:39:00 -0000 Subject: Something is broken in repack In-Reply-To: References: <47616044.7070504@viscovery.net> Message-ID: <1197622912.898.53.camel@brick> On Fri, 2007-12-14 at 09:20 +0100, Paolo Bonzini wrote: > > I'm thinking about "git clone --keep" to mark initial packs precious. > > But 'git clone' is under rewrite to C. Let's wait until C rewrite is > > done. > > It should be the default, IMHO. > While it doesn't mark the packs as .keep, git will reuse all of the old deltas you got in the original clone, so you're not losing anything. Harvey From schwab@suse.de Fri Dec 14 10:40:00 2007 From: schwab@suse.de (Andreas Schwab) Date: Fri, 14 Dec 2007 10:40:00 -0000 Subject: ctor style cast vs c style cast In-Reply-To: <4761EAFF.1030801@praire-chicken.com> (Peter A. Felvegi's message of "Fri\, 14 Dec 2007 03\:31\:27 +0100") References: <4761EAFF.1030801@praire-chicken.com> Message-ID: "Peter A. Felvegi" writes: > today i've run into this: if i cast a double value to an unsigned int > using the C style cast when passing it to printf, it's fine. however, if > i use the ctor style cast, i get a compile error. This question is off-topic here, please use gcc-help@gcc.gnu.org in future. Functional cast notation [expr.type.conv] only allows a single simple-type-specifier, but unsigned int isn't. Use static_cast instead. Andreas. 
-- Andreas Schwab, SuSE Labs, schwab@suse.de SuSE Linux Products GmbH, Maxfeldstraße 5, 90409 Nürnberg, Germany PGP key fingerprint = 58CA 54C7 6D53 942B 1756 01D3 44D5 214B 8276 4ED5 "And now for something completely different." From jnareb@gmail.com Fri Dec 14 10:52:00 2007 From: jnareb@gmail.com (Jakub Narebski) Date: Fri, 14 Dec 2007 10:52:00 -0000 Subject: Something is broken in repack In-Reply-To: References: <47616044.7070504@viscovery.net> Message-ID: "Nguyen Thai Ngoc Duy" writes: > On Dec 14, 2007 1:14 PM, Paolo Bonzini wrote: > > > Hmmm... it is even documented in git-gc(1)... and git-index-pack(1) of > > > all things. > > > > I found that the .keep file is not transmitted over the network (at > > least I tried with git+ssh:// and http:// protocols), however. > > I'm thinking about "git clone --keep" to mark initial packs precious. > But 'git clone' is under rewrite to C. Let's wait until C rewrite is > done. But if you clone via network, pack might be network optimized if you use "smart" transport, not disk optimized, at least with current git which regenerates pack also on clone AFAIK. -- Jakub Narebski Poland ShadeHawk on #git From pclouds@gmail.com Fri Dec 14 13:25:00 2007 From: pclouds@gmail.com (Nguyen Thai Ngoc Duy) Date: Fri, 14 Dec 2007 13:25:00 -0000 Subject: Something is broken in repack In-Reply-To: References: <47616044.7070504@viscovery.net> Message-ID: On Dec 14, 2007 4:01 PM, Harvey Harrison wrote: > While it doesn't mark the packs as .keep, git will reuse all of the old > deltas you got in the original clone, so you're not losing anything. There is another reason I want it. I have an ~800MB pack and I don't want git to rewrite the pack every time I repack my changes. So it's kind of disk-wise (don't require 800MB on disk to prepare new pack, and don't write too much).
On Dec 14, 2007 5:40 PM, Jakub Narebski wrote: > But if you clone via network, pack might be network optimized if you > use "smart" transport, not disk optimized, at least with current git > which regenerates pack also on clone AFAIK. Um.. that's ok it just regenerate once. -- Duy From nico@cam.org Fri Dec 14 13:53:00 2007 From: nico@cam.org (Nicolas Pitre) Date: Fri, 14 Dec 2007 13:53:00 -0000 Subject: Something is broken in repack In-Reply-To: References: <47616044.7070504@viscovery.net> Message-ID: On Fri, 14 Dec 2007, Paolo Bonzini wrote: > > Hmmm... it is even documented in git-gc(1)... and git-index-pack(1) of > > all things. > > I found that the .keep file is not transmitted over the network (at least I > tried with git+ssh:// and http:// protocols), however. That is a local policy. Nicolas From tprince@computer.org Fri Dec 14 16:03:00 2007 From: tprince@computer.org (Tim Prince) Date: Fri, 14 Dec 2007 16:03:00 -0000 Subject: how to compile gcc4 on cygwin? In-Reply-To: References: Message-ID: <476288C2.3020105@computer.org> Fan Zhang wrote: > how to compile gcc4 on cygwin? > thanks The generic instructions are here http://gcc.gnu.org/install/ The mailing lists for asking questions are gcc-help http://gcc.gnu.org/lists.html and possibly http://cygwin.com/lists.html You should be able to find useful hints on the archives of those lists. 
From wmglo@dent.med.uni-muenchen.de Fri Dec 14 16:12:00 2007 From: wmglo@dent.med.uni-muenchen.de (Wolfram Gloger) Date: Fri, 14 Dec 2007 16:12:00 -0000 Subject: Something is broken in repack In-Reply-To: <4760E005.6040102@op5.se> (message from Andreas Ericsson on Thu, 13 Dec 2007 08:32:21 +0100) References: <4760E005.6040102@op5.se> Message-ID: <20071214160326.2424.qmail@md.dent.med.uni-muenchen.de> Hi, > >> if (progress->total) { > >> unsigned percent = n * 100 / progress->total; > >> if (percent != progress->last_percent || progress_update) { > >> + struct mallinfo m = mallinfo(); > >> progress->last_percent = percent; > >> - fprintf(stderr, "%s: %3u%% (%u/%u)%s%s", > >> - progress->title, percent, n, > >> - progress->total, tp, eol); > >> + fprintf(stderr, "%s: %3u%% (%u/%u) %u/%uMB%s%s", > >> + progress->title, percent, n, progress->total, > >> + m.uordblks >> 18, m.fordblks >> 18, > >> + tp, eol); > > > > Note: I didn't know what unit of memory those blocks represents, so the > > shift is most probably wrong. > > > > Me neither, but it appears to me as if hblkhd holds the actual memory > consumed by the process. It seems to store the information in bytes, > which I find a bit dubious unless glibc has some internal multiplier. mallinfo() will only give you the used memory for the main arena. When you have separate arenas (likely when concurrent threads have been used), the only way to get the full picture is to call malloc_stats(), which prints to stderr. Regards, Wolfram. 
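The mallinfo() discussion above can be tried out with a minimal sketch like the one below. It assumes glibc, where the mallinfo fields are plain ints holding byte counts, so a shift by 20 (not 18) gives MiB. The helper names heap_bytes_in_use and report_heap are made up for illustration and do not come from git or glibc. Newer glibc releases mark mallinfo() as deprecated in favour of mallinfo2(), so expect a warning there.

```c
#include <assert.h>
#include <malloc.h>   /* glibc-specific: mallinfo(), malloc_stats() */
#include <stdio.h>

/* Hypothetical helper: bytes in in-use heap chunks (uordblks) plus bytes
   in mmap()ed chunks (hblkhd), as reported by mallinfo(). */
long heap_bytes_in_use(void)
{
    struct mallinfo m = mallinfo();
    return (long)m.uordblks + (long)m.hblkhd;
}

/* Hypothetical helper: print a one-line summary in MiB, then let glibc
   dump its own per-arena statistics to stderr via malloc_stats(). */
void report_heap(const char *label)
{
    assert(label != NULL);
    struct mallinfo m = mallinfo();
    fprintf(stderr, "%s: used %d MiB, free %d MiB, mmapped %d MiB\n",
            label, m.uordblks >> 20, m.fordblks >> 20, m.hblkhd >> 20);
    malloc_stats();
}
```

Calling report_heap("repack") from a progress callback would give roughly what the patch quoted above was after, with malloc_stats() covering the additional arenas that, per the message above, mallinfo() does not see.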
From wmglo@dent.med.uni-muenchen.de Fri Dec 14 16:19:00 2007 From: wmglo@dent.med.uni-muenchen.de (Wolfram Gloger) Date: Fri, 14 Dec 2007 16:19:00 -0000 Subject: Something is broken in repack In-Reply-To: (message from Linus Torvalds on Wed, 12 Dec 2007 08:37:10 -0800 (PST)) References: Message-ID: <20071214161236.3080.qmail@md.dent.med.uni-muenchen.de> Hi, > Note that delta following involves patterns something like > > allocate (small) space for delta > for i in (1..depth) { > allocate large space for base > allocate large space for result > .. apply delta .. > free large space for base > free small space for delta > } > > so if you have some stupid heap algorithm that doesn't try to merge and > re-use free'd spaces very aggressively (because that takes CPU time!), ptmalloc2 (in glibc) _per arena_ is basically best-fit. This is the best known general strategy, but it certainly cannot be the best in every case. > you > might have memory usage be horribly inflated by the heap having all those > holes for all the objects that got free'd in the chain that don't get > aggressively re-used. It depends how large 'large' is -- if it exceeds the mmap() threshold (settable with mallopt(M_MMAP_THRESHOLD, ...)) the 'large' spaces will be allocated with mmap() and won't cause any internal fragmentation. It might pay to experiment with this parameter if it is hard to avoid the alloc/free large space sequence. > Threaded memory allocators then make this worse by probably using totally > different heaps for different threads (in order to avoid locking), so they > will *all* have the fragmentation issue. Indeed. Could someone perhaps try ptmalloc3 (http://malloc.de/malloc/ptmalloc3-current.tar.gz) on this case? Thanks, Wolfram. 
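The mmap() threshold experiment suggested above could be sketched as follows. This again assumes glibc; set_mmap_threshold and served_by_mmap are hypothetical helper names, not part of git or glibc. The check relies on the hblks field of mallinfo(), which on glibc counts the chunks currently allocated with mmap().

```c
#include <malloc.h>   /* glibc-specific: mallopt(), mallinfo(), M_MMAP_THRESHOLD */
#include <stddef.h>

/* Hypothetical helper: ask glibc malloc to serve requests of at least
   `bytes` with mmap(). Returns 1 on success and 0 on error, which is
   mallopt()'s own convention. */
int set_mmap_threshold(size_t bytes)
{
    return mallopt(M_MMAP_THRESHOLD, (int)bytes);
}

/* Hypothetical helper: returns nonzero if a malloc(size) request is served
   by mmap(), judged by whether mallinfo() reports one more mmap()ed chunk
   while the allocation is live. */
int served_by_mmap(size_t size)
{
    int before = mallinfo().hblks;
    volatile char *p = malloc(size);
    if (p == NULL)
        return 0;
    p[0] = 0;                     /* touch the block so it is really allocated */
    int after = mallinfo().hblks;
    free((void *)p);
    return after > before;
}
```

Under these assumptions, calling set_mmap_threshold(128 * 1024) before the delta loop would send each "large" base and result buffer through mmap(), so freeing one unmaps its pages instead of leaving a hole in the heap. As far as I understand the glibc documentation, setting the threshold explicitly also switches off glibc's dynamic adjustment of it, so the value is worth measuring rather than guessing.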
From wmglo@dent.med.uni-muenchen.de Fri Dec 14 16:44:00 2007 From: wmglo@dent.med.uni-muenchen.de (Wolfram Gloger) Date: Fri, 14 Dec 2007 16:44:00 -0000 Subject: Something is broken in repack In-Reply-To: <85d4tc8hi8.fsf@lola.goethe.zz> (message from David Kastrup on Wed, 12 Dec 2007 09:05:51 +0100) References: <85d4tc8hi8.fsf@lola.goethe.zz> Message-ID: <20071214161858.3506.qmail@md.dent.med.uni-muenchen.de> Hi, > Maybe an malloc/free/mmap wrapper that records the requested sizes and > alloc/free order and dumps them to file so that one can make a compact > git-free standalone test case for the glibc maintainers might be a good > thing. I already have such a wrapper: http://malloc.de/malloc/mtrace-20060529.tar.gz But note that it does interfere with the thread scheduling, so it can't record the exact same allocation pattern as when not using the wrapper. Regards, Wolfram. From dak@gnu.org Fri Dec 14 16:59:00 2007 From: dak@gnu.org (David Kastrup) Date: Fri, 14 Dec 2007 16:59:00 -0000 Subject: Something is broken in repack In-Reply-To: <20071214161236.3080.qmail@md.dent.med.uni-muenchen.de> (Wolfram Gloger's message of "14 Dec 2007 16:12:36 -0000") References: <20071214161236.3080.qmail@md.dent.med.uni-muenchen.de> Message-ID: <85r6hptecs.fsf@lola.goethe.zz> Wolfram Gloger writes: > Hi, > >> Note that delta following involves patterns something like >> >> allocate (small) space for delta >> for i in (1..depth) { >> allocate large space for base >> allocate large space for result >> .. apply delta .. >> free large space for base >> free small space for delta >> } >> >> so if you have some stupid heap algorithm that doesn't try to merge and >> re-use free'd spaces very aggressively (because that takes CPU time!), > > ptmalloc2 (in glibc) _per arena_ is basically best-fit. This is the > best known general strategy, Uh what? Someone crank out his copy of "The Art of Computer Programming", I think volume 1. 
Best fit is known (analyzed and proven and documented decades ago) to be one of the worst strategies for memory allocation. Exactly because it leads to huge fragmentation problems. -- David Kastrup, Kriemhildstr. 15, 44793 Bochum From wmglo@dent.med.uni-muenchen.de Fri Dec 14 18:49:00 2007 From: wmglo@dent.med.uni-muenchen.de (Wolfram Gloger) Date: Fri, 14 Dec 2007 18:49:00 -0000 Subject: Something is broken in repack In-Reply-To: <85r6hptecs.fsf@lola.goethe.zz> (message from David Kastrup on Fri, 14 Dec 2007 17:45:07 +0100) References: <20071214161236.3080.qmail@md.dent.med.uni-muenchen.de> <85r6hptecs.fsf@lola.goethe.zz> Message-ID: <20071214165937.6405.qmail@md.dent.med.uni-muenchen.de> Hi, > Uh what? Someone crank out his copy of "The Art of Computer > Programming", I think volume 1. Best fit is known (analyzed and proven > and documented decades ago) to be one of the worst strategies for memory > allocation. Exactly because it leads to huge fragmentation problems. Well, quoting http://gee.cs.oswego.edu/dl/html/malloc.html: "As shown by Wilson et al, best-fit schemes (of various kinds and approximations) tend to produce the least fragmentation on real loads compared to other general approaches such as first-fit." See [Wilson 1995] ftp://ftp.cs.utexas.edu/pub/garbage/allocsrv.ps for more details and references. Regards, Wolfram. From joel.sherrill@oarcorp.com Fri Dec 14 19:01:00 2007 From: joel.sherrill@oarcorp.com (Joel Sherrill) Date: Fri, 14 Dec 2007 19:01:00 -0000 Subject: Ada ACATS Failures on SVN Trunk Message-ID: <4762D019.6000402@oarcorp.com> Hi, Even with the large gnat1 compile time issue, I have managed to patiently run the ACATS on powerpc-rtems. This configuration worked well with gcc 4.2.2 (3 failures). I am seeing lot of failures (total of 691) and they do not appear to be RTEMS related. Here is a sample. Do any of these look like know problems? Does anyone see them on other platforms? ,.,. 
C34003C ACATS 2.5 88-01-01 00:00:00 ---- C34003C CHECK THAT ALL VALUES OF THE PARENT (BASE) TYPE ARE PRESENT FOR THE DERIVED (BASE) TYPE WHEN THE DERIVED TYPE DEFINITION IS CONSTRAINED. ALSO CHECK THAT ANY CONSTRAINT IMPOSED ON THE PARENT SUBTYPE IS ALSO IMPOSED ON THE DERIVED SUBTYPE. CHECK FOR DERIVED FLOATING POINT TYPES. raised CONSTRAINT_ERROR : c34003c.adb:104 range check failed ,.,. CXG2013 ACATS 2.5 88-01-01 00:00:00 ---- CXG2013 Check the accuracy of the TAN and COT functions. raised ADA.NUMERICS.ARGUMENT_ERROR : a-ngelfu.adb:969 instantiated at cxg2013.adb:100 instantiated at cxg2013.adb:337 ,.,. CXF3A03 ACATS 2.5 88-01-01 00:00:00 ---- CXF3A03 Check that function Length returns the number of characters in the edited output string produced by function Image, for a particular decimal type, currency string, and radix mark. Check that function Valid returns correct results based on the particular decimal value, and the Picture and Currency string parameters. raised ADA.IO_EXCEPTIONS.LAYOUT_ERROR : a-teioed.adb:300 ,.,. CXF2001 ACATS 2.5 88-01-01 00:00:00 ---- CXF2001 Check that the Divide procedure provides correct results. Check that the Remainder is calculated exactly. raised CONSTRAINT_ERROR : a-decima.adb:59 divide by zero Thanks. --joel From laurent@guerby.net Fri Dec 14 19:41:00 2007 From: laurent@guerby.net (Laurent GUERBY) Date: Fri, 14 Dec 2007 19:41:00 -0000 Subject: Ada ACATS Failures on SVN Trunk In-Reply-To: <4762D019.6000402@oarcorp.com> References: <4762D019.6000402@oarcorp.com> Message-ID: <1197658871.13773.28.camel@pc2> ACATS is clean (0 FAIL) on trunk for x86. CXF3A03 is the only FAIL for hppa-linux. c41328a is the only FAIL for powerpc64-linux. cxb3014/16 are the only FAIL on ia64-linux. 
You'll quickly find up to date ACATS results here: http://gcc.gnu.org/ml/gcc-testresults/2007-12/ Laurent On Fri, 2007-12-14 at 12:48 -0600, Joel Sherrill wrote: > Hi, > > Even with the large gnat1 compile time issue, I > have managed to patiently run the ACATS on > powerpc-rtems. This configuration worked > well with gcc 4.2.2 (3 failures). I am seeing > lot of failures (total of 691) and they do not > appear to be RTEMS related. > > Here is a sample. Do any of these look like > know problems? Does anyone see them on other > platforms? > > > ,.,. C34003C ACATS 2.5 88-01-01 00:00:00 > ---- C34003C CHECK THAT ALL VALUES OF THE PARENT (BASE) TYPE ARE PRESENT > FOR THE DERIVED (BASE) TYPE WHEN THE DERIVED TYPE > DEFINITION IS CONSTRAINED. ALSO CHECK THAT ANY > CONSTRAINT IMPOSED ON THE PARENT SUBTYPE IS ALSO IMPOSED > ON THE DERIVED SUBTYPE. CHECK FOR DERIVED FLOATING > POINT TYPES. > > raised CONSTRAINT_ERROR : c34003c.adb:104 range check failed > > > ,.,. CXG2013 ACATS 2.5 88-01-01 00:00:00 > ---- CXG2013 Check the accuracy of the TAN and COT functions. > > raised ADA.NUMERICS.ARGUMENT_ERROR : a-ngelfu.adb:969 instantiated at > cxg2013.adb:100 instantiated at cxg2013.adb:337 > > ,.,. CXF3A03 ACATS 2.5 88-01-01 00:00:00 > ---- CXF3A03 Check that function Length returns the number of characters > in the edited output string produced by function Image, > for a particular decimal type, currency string, and > radix mark. Check that function Valid returns correct > results based on the particular decimal value, and the > Picture and Currency string parameters. > > raised ADA.IO_EXCEPTIONS.LAYOUT_ERROR : a-teioed.adb:300 > > > ,.,. CXF2001 ACATS 2.5 88-01-01 00:00:00 > ---- CXF2001 Check that the Divide procedure provides correct results. > Check that the Remainder is calculated exactly. > > raised CONSTRAINT_ERROR : a-decima.adb:59 divide by zero > > Thanks. 
> > --joel > From joel.sherrill@oarcorp.com Fri Dec 14 20:13:00 2007 From: joel.sherrill@oarcorp.com (Joel Sherrill) Date: Fri, 14 Dec 2007 20:13:00 -0000 Subject: Ada ACATS Failures on SVN Trunk In-Reply-To: <1197658871.13773.28.camel@pc2> References: <4762D019.6000402@oarcorp.com> <1197658871.13773.28.camel@pc2> Message-ID: <4762DC6F.6060506@oarcorp.com> Laurent GUERBY wrote: > ACATS is clean (0 FAIL) on trunk for x86. CXF3A03 is the > only FAIL for hppa-linux. c41328a is the only FAIL for powerpc64-linux. > cxb3014/16 are the only FAIL on ia64-linux. > > You'll quickly find up to date ACATS results here: > > http://gcc.gnu.org/ml/gcc-testresults/2007-12/ > > > Thanks Laurent. Those look much better than what I am getting. Could this be related to PR34400? --joel > Laurent > > On Fri, 2007-12-14 at 12:48 -0600, Joel Sherrill wrote: > >> Hi, >> >> Even with the large gnat1 compile time issue, I >> have managed to patiently run the ACATS on >> powerpc-rtems. This configuration worked >> well with gcc 4.2.2 (3 failures). I am seeing >> lot of failures (total of 691) and they do not >> appear to be RTEMS related. >> >> Here is a sample. Do any of these look like >> know problems? Does anyone see them on other >> platforms? >> >> >> ,.,. C34003C ACATS 2.5 88-01-01 00:00:00 >> ---- C34003C CHECK THAT ALL VALUES OF THE PARENT (BASE) TYPE ARE PRESENT >> FOR THE DERIVED (BASE) TYPE WHEN THE DERIVED TYPE >> DEFINITION IS CONSTRAINED. ALSO CHECK THAT ANY >> CONSTRAINT IMPOSED ON THE PARENT SUBTYPE IS ALSO IMPOSED >> ON THE DERIVED SUBTYPE. CHECK FOR DERIVED FLOATING >> POINT TYPES. >> >> raised CONSTRAINT_ERROR : c34003c.adb:104 range check failed >> >> >> ,.,. CXG2013 ACATS 2.5 88-01-01 00:00:00 >> ---- CXG2013 Check the accuracy of the TAN and COT functions. >> >> raised ADA.NUMERICS.ARGUMENT_ERROR : a-ngelfu.adb:969 instantiated at >> cxg2013.adb:100 instantiated at cxg2013.adb:337 >> >> ,.,. 
CXF3A03 ACATS 2.5 88-01-01 00:00:00 >> ---- CXF3A03 Check that function Length returns the number of characters >> in the edited output string produced by function Image, >> for a particular decimal type, currency string, and >> radix mark. Check that function Valid returns correct >> results based on the particular decimal value, and the >> Picture and Currency string parameters. >> >> raised ADA.IO_EXCEPTIONS.LAYOUT_ERROR : a-teioed.adb:300 >> >> >> ,.,. CXF2001 ACATS 2.5 88-01-01 00:00:00 >> ---- CXF2001 Check that the Divide procedure provides correct results. >> Check that the Remainder is calculated exactly. >> >> raised CONSTRAINT_ERROR : a-decima.adb:59 divide by zero >> >> Thanks. >> >> --joel >> >> > > From laurent@guerby.net Fri Dec 14 20:45:00 2007 From: laurent@guerby.net (Laurent GUERBY) Date: Fri, 14 Dec 2007 20:45:00 -0000 Subject: Ada ACATS Failures on SVN Trunk In-Reply-To: <4762DC6F.6060506@oarcorp.com> References: <4762D019.6000402@oarcorp.com> <1197658871.13773.28.camel@pc2> <4762DC6F.6060506@oarcorp.com> Message-ID: <1197663114.13773.32.camel@pc2> On Fri, 2007-12-14 at 13:41 -0600, Joel Sherrill wrote: > Laurent GUERBY wrote: > > ACATS is clean (0 FAIL) on trunk for x86. CXF3A03 is the > > only FAIL for hppa-linux. c41328a is the only FAIL for powerpc64-linux. > > cxb3014/16 are the only FAIL on ia64-linux. > > > > You'll quickly find up to date ACATS results here: > > > > http://gcc.gnu.org/ml/gcc-testresults/2007-12/ > > > > > > > Thanks Laurent. Those look much better than what I > am getting. > > Could this be related to PR34400? 34400 is about slow compile but not about wrong code so I doubt it's the issue. Could you send me privately the compressed log of the ACATS run? How are the other language part of the GCC testsuite doing on the target? Laurent PS: I'll be travelling and be mostly offline this week-end. 
From joel.sherrill@oarcorp.com Fri Dec 14 21:22:00 2007 From: joel.sherrill@oarcorp.com (Joel Sherrill) Date: Fri, 14 Dec 2007 21:22:00 -0000 Subject: Ada ACATS Failures on SVN Trunk In-Reply-To: <1197663114.13773.32.camel@pc2> References: <4762D019.6000402@oarcorp.com> <1197658871.13773.28.camel@pc2> <4762DC6F.6060506@oarcorp.com> <1197663114.13773.32.camel@pc2> Message-ID: <4762EB5B.7080705@oarcorp.com> Laurent GUERBY wrote: > On Fri, 2007-12-14 at 13:41 -0600, Joel Sherrill wrote: > >> Laurent GUERBY wrote: >> >>> ACATS is clean (0 FAIL) on trunk for x86. CXF3A03 is the >>> only FAIL for hppa-linux. c41328a is the only FAIL for powerpc64-linux. >>> cxb3014/16 are the only FAIL on ia64-linux. >>> >>> You'll quickly find up to date ACATS results here: >>> >>> http://gcc.gnu.org/ml/gcc-testresults/2007-12/ >>> >>> >>> >>> >> Thanks Laurent. Those look much better than what I >> am getting. >> >> Could this be related to PR34400? >> > > 34400 is about slow compile but not about wrong code so I doubt it's the > issue. Could you send me privately the compressed log of the ACATS run? > > Sure. > How are the other language part of the GCC testsuite doing on the > target? > > We have never figured out how to run them linking with RTEMS. :( The last time I tried passing the extra arguments resulted in the gcc driver core dumping. It is on my long term wish list. RTEMS itself was compiled with it and I don't see any of our tests failing. Doesn't prove perfection but not grossly broken. > Laurent > > PS: I'll be travelling and be mostly offline this week-end. > > Be safe. 
--joel From laurent@guerby.net Fri Dec 14 21:36:00 2007 From: laurent@guerby.net (Laurent GUERBY) Date: Fri, 14 Dec 2007 21:36:00 -0000 Subject: Ada ACATS Failures on SVN Trunk In-Reply-To: <4762EB5B.7080705@oarcorp.com> References: <4762D019.6000402@oarcorp.com> <1197658871.13773.28.camel@pc2> <4762DC6F.6060506@oarcorp.com> <1197663114.13773.32.camel@pc2> <4762EB5B.7080705@oarcorp.com> Message-ID: <1197667316.13773.41.camel@pc2> On Fri, 2007-12-14 at 14:45 -0600, Joel Sherrill wrote: > > 34400 is about slow compile but not about wrong code so I doubt it's the > > issue. Could you send me privately the compressed log of the ACATS run? > Sure. From the log, it looks like most tests FAIL because an expected exception raised in the test is not caught, as it should be, by a local exception handler in the test; instead it propagates to the main handler, which terminates execution. So what looks broken to me is exception propagation, which would explain the good results in the RTEMS/C testsuite, which I assume is exception-free. Could you compile and run the following small program on your target: with Ada.Text_IO; use Ada.Text_IO; procedure P is begin begin raise Constraint_Error; exception when others => Put_Line ("catch1"); end; exception when others => Put_Line ("catch2"); end P; It should just print "catch1". If not, we have a reduced testcase for the problem (open a bugzilla); otherwise ping me privately and I'll send you small variations of this code until we get it.
Laurent From joel.sherrill@oarcorp.com Fri Dec 14 22:24:00 2007 From: joel.sherrill@oarcorp.com (Joel Sherrill) Date: Fri, 14 Dec 2007 22:24:00 -0000 Subject: Ada ACATS Failures on SVN Trunk In-Reply-To: <1197667316.13773.41.camel@pc2> References: <4762D019.6000402@oarcorp.com> <1197658871.13773.28.camel@pc2> <4762DC6F.6060506@oarcorp.com> <1197663114.13773.32.camel@pc2> <4762EB5B.7080705@oarcorp.com> <1197667316.13773.41.camel@pc2> Message-ID: <4762F73C.7050308@oarcorp.com> Laurent GUERBY wrote: > On Fri, 2007-12-14 at 14:45 -0600, Joel Sherrill wrote: > >>> 34400 is about slow compile but not about wrong code so I doubt it's the >>> issue. Could you send me privately the compressed log of the ACATS run? >>> > > >> Sure. >> > > >> >From the log it looks like the reason for most test FAIL is >> > because an expected exception raised in the test is not catched as it > should by a local test exception handler and just propagates to the main > handler which terminates execution. So what looks broken to me is > exception propagation, that would explain good success in RTEMS/C > testsuite which I assume is exception-free. > > Could you compile and run the following small program on your target: > > with Ada.Text_IO; use Ada.Text_IO; > procedure P is > begin > begin > raise Constraint_Error; > exception > when others => > Put_Line ("catch1"); > end; > exception > when others => > Put_Line ("catch2"); > end P; > > It should just print "catch1": if not we have a reduced testcase for the > problem (open a bugzilla), otherwise ping me privately I'll send you > small variations of this code until we get it. > > psim-4.9 p.exe raised CONSTRAINT_ERROR : p.adb:5 explicit raise Looks like the reduced test case. Wow! You are quick. :) PR filed and you are cc'ed on it. I will verify this worked OK on 4.2.2 but I appear to have removed that to get some disk space back. 
--joel > Laurent > > > From toon@moene.indiv.nluug.nl Fri Dec 14 22:30:00 2007 From: toon@moene.indiv.nluug.nl (Toon Moene) Date: Fri, 14 Dec 2007 22:30:00 -0000 Subject: HIRLAM and -ftree-loop-linear Message-ID: <47630282.6090408@moene.indiv.nluug.nl> Sebastian, Here are (attached) results for testing HIRLAM with and without -ftree-loop-linear. As you can see, the results are neutral: 4 loops fewer vectorized, but about 50 fewer recognized. Now I'd like to redo that test with -ftree-loop-distribution. Can you send me a patch against the trunk (otherwise it won't be a fair comparison)? Kind regards, -- Toon Moene - e-mail: toon@moene.indiv.nluug.nl - phone: +31 346 214290 Saturnushof 14, 3738 XG Maartensdijk, The Netherlands At home: http://moene.indiv.nluug.nl/~toon/ GNU Fortran's path to Fortran 2003: http://gcc.gnu.org/wiki/Fortran2003 -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: loop-tests.txt URL: From dougkwan@google.com Fri Dec 14 22:35:00 2007 From: dougkwan@google.com (Doug Kwan (關振德)) Date: Fri, 14 Dec 2007 22:35:00 -0000 Subject: Adding new dwarf encoding formats for complex integers Message-ID: <498552560712141430q75098aaembcda8db5c6edd8fb@mail.gmail.com> Hi, I am working on the gcc LTO project and I found that gcc does not generate sufficient debugging information for complex integer types. Currently gcc uses encoding DW_ATE_lo_user (0x80) for complex integer types but that 1) clashes with an HP extension and 2) does not distinguish between complex signed integer and complex unsigned integer types. I'm thinking about adding DW_ATE_GNU_complex_signed (0x87) and DW_ATE_GNU_complex_unsigned (0x88) encoding formats. Is there anything I need to do in addition to changing gcc? Are there people I should talk to? And what documentation should be updated? Currently gdb (I checked 6.7) does not support complex integer properly. So it needs to be changed anyway.
-Doug From drow@false.org Fri Dec 14 22:47:00 2007 From: drow@false.org (Daniel Jacobowitz) Date: Fri, 14 Dec 2007 22:47:00 -0000 Subject: Adding new dwarf encoding formats for complex integers In-Reply-To: <498552560712141430q75098aaembcda8db5c6edd8fb@mail.gmail.com> References: <498552560712141430q75098aaembcda8db5c6edd8fb@mail.gmail.com> Message-ID: <20071214223532.GA11966@caradoc.them.org> On Fri, Dec 14, 2007 at 02:30:36PM -0800, Doug Kwan (關振德) wrote: > Is there anything I need to do in addition to changing gcc? Are > there people I should talk to? And what documentation should be > updated? Currently gdb (I checked 6.7) does not support complex > integer properly. So it needs to be changed anyway. The DWARF standard has its own mailing list and working group. I recommend contacting them first, to see if there's interest in a general definition. If you just want to add it to the GNU tools, then lo_user seems like the best place to put it - it's a vendor extension - so I don't see the problem. -- Daniel Jacobowitz CodeSourcery From dougkwan@google.com Fri Dec 14 22:47:00 2007 From: dougkwan@google.com (Doug Kwan (關振德)) Date: Fri, 14 Dec 2007 22:47:00 -0000 Subject: Adding new dwarf encoding formats for complex integers In-Reply-To: <20071214223532.GA11966@caradoc.them.org> References: <498552560712141430q75098aaembcda8db5c6edd8fb@mail.gmail.com> <20071214223532.GA11966@caradoc.them.org> Message-ID: <498552560712141447q41c168b7k17c9bde81319feb7@mail.gmail.com> The new encoding formats I am proposing fall between DW_ATE_lo_user and DW_ATE_hi_user. So they are vendor extensions. Currently gcc uses DW_ATE_lo_user, which collides with an HP vendor extension. -Doug 2007/12/14, Daniel Jacobowitz : > On Fri, Dec 14, 2007 at 02:30:36PM -0800, Doug Kwan (關振德) wrote: > > Is there anything I need to do in addition to changing gcc? Are > > there people I should talk to? And what documentation should be > > updated?
Currently gdb (I checked 6.7) does not support complex > > integer properly. So it needs to be changed anyway. > > The DWARF standard has its own mailing list and working group. I > recommend contacting them first, to see if there's interest in a > general definition. > > If you just want to add it to the GNU tools, then lo_user seems like > the best place to put it - it's a vendor extension - so I don't > see the problem. > > -- > Daniel Jacobowitz > CodeSourcery > From gccadmin@gcc.gnu.org Sat Dec 15 00:18:00 2007 From: gccadmin@gcc.gnu.org (gccadmin@gcc.gnu.org) Date: Sat, 15 Dec 2007 00:18:00 -0000 Subject: gcc-4.3-20071214 is now available Message-ID: <20071214224715.5714.qmail@sourceware.org> Snapshot gcc-4.3-20071214 is now available on ftp://gcc.gnu.org/pub/gcc/snapshots/4.3-20071214/ and on various mirrors, see http://gcc.gnu.org/mirrors.html for details. This snapshot has been generated from the GCC 4.3 SVN branch with the following options: svn://gcc.gnu.org/svn/gcc/trunk revision 130946 You'll find: gcc-4.3-20071214.tar.bz2 Complete GCC (includes all of below) gcc-core-4.3-20071214.tar.bz2 C front end and core compiler gcc-ada-4.3-20071214.tar.bz2 Ada front end and runtime gcc-fortran-4.3-20071214.tar.bz2 Fortran front end and runtime gcc-g++-4.3-20071214.tar.bz2 C++ front end and runtime gcc-java-4.3-20071214.tar.bz2 Java front end and runtime gcc-objc-4.3-20071214.tar.bz2 Objective-C front end and runtime gcc-testsuite-4.3-20071214.tar.bz2 The GCC testsuite Diffs from 4.3-20071207 are available in the diffs/ subdirectory. When a particular snapshot is ready for public consumption the LATEST-4.3 link is updated and a message is sent to the gcc list. Please do not use a snapshot before it has been announced that way. 
From nix@esperi.org.uk Sat Dec 15 04:12:00 2007 From: nix@esperi.org.uk (Nix) Date: Sat, 15 Dec 2007 04:12:00 -0000 Subject: Git and GCC In-Reply-To: (Johannes Schindelin's message of "Sat, 8 Dec 2007 12:24:00 +0000 (GMT)") References: <998d0e4a0712071821o520a75c4lbcaae92256071f48@mail.gmail.com> Message-ID: <877ijg6c9u.fsf@hades.wkstn.nix> On 8 Dec 2007, Johannes Schindelin said: > Hi, > > On Sat, 8 Dec 2007, J.C. Pizarro wrote: > >> On 2007/12/07, "Linus Torvalds" wrote: >> >> > SHA1 is almost totally insignificant on x86. It hardly shows up. But >> > we have a good optimized version there. >> >> If SHA1 is slow then why dont he contribute adding Haval160 (3 rounds) >> that it's faster than SHA1? And to optimize still more it with SIMD >> instructions in kernelspace and userland. > > He said SHA-1 is insignificant. Actually davem also said it *is* significant on SPARC. But of course J. C. Pizarro's suggested solution won't work because you can't just go around replacing SHA-1 in git with something else :) you could *add* new hashing methods, but you couldn't avoid SHA-1, and adding a new hashing method would bloat every object and every hash in objects like commits with an indication of which hashing method was in use. (But you know this.) >> 1. "Don't compress this repo but compact this uncompressed repo >> using minimal spanning forest and deltas" ... and then you do a git-gc. Oops, now what? ... or perhaps you want to look something up in the pack. Now you have to unpack a large hunk of the whole damn thing. >> 2. "After, compress this whole repo with LZMA (e.g. 48MiB) from 7zip before >> burning it to DVD for backup reasons or before replicating it to >> internet". > > Patches? ;-) Replicating a pack to the internet is almost invariably replicating *parts* of a pack anyway, which reduces to the problem with option 1 above... -- `The rest is a tale of post and counter-post.' 
--- Ian Rawlings describes USENET From sebpop@gmail.com Sat Dec 15 04:17:00 2007 From: sebpop@gmail.com (Sebastian Pop) Date: Sat, 15 Dec 2007 04:17:00 -0000 Subject: Fails for SPEC2006 using -O3 -ftree-parallelize-loops Message-ID: Hi, I've run a build for spec cpu2006 with -O3 -ftree-parallelize-loops=16 and interestingly there were some fails that I will investigate later. So I'm just reporting these, and asking for somebody who could fix the link options for autopar. I'm attaching a patch, not sure it will build. Before going in the depths of the build machinery, I'm asking if somebody could help with this. Benchmarks ICEing: 400.perlbench 401.bzip2 403.gcc 445.gobmk 456.hmmer 458.sjeng 462.libquantum 464.h264ref 471.omnetpp 473.astar 483.xalancbmk 416.gamess 434.zeusmp 435.gromacs 436.cactusADM 437.leslie3d 444.namd 447.dealII 453.povray 454.calculix 459.GemsFDTD 465.tonto 470.lbm 481.wrf 482.sphinx3 Not ICEing, but failing on the link step, can't find the libgomp functions: 429.mcf 433.milc 450.soplex For these a patch like this would be needed to automatically link the gomp and pthread libs. 
Index: gcc.c =================================================================== --- gcc.c (revision 130927) +++ gcc.c (working copy) @@ -721,6 +721,7 @@ proper position among the other output f %(linker) %l " LINK_PIE_SPEC "%X %{o*} %{A} %{d} %{e*} %{m} %{N} %{n} %{r}\ %{s} %{t} %{u*} %{x} %{z} %{Z} %{!A:%{!nostdlib:%{!nostartfiles:%S}}}\ %{static:} %{L*} %(mfwrap) %(link_libgcc) %o\ + %{ftree-parallelize-loops=:%:include(libgomp.spec)%(link_gomp)}\ %{fopenmp:%:include(libgomp.spec)%(link_gomp)} %(mflib)\ %{fprofile-arcs|fprofile-generate|coverage:-lgcov}\ %{!nostdlib:%{!nodefaultlibs:%(link_ssp) %(link_gcc_c_sequence)}}\ @@ -873,8 +874,13 @@ static const char *const multilib_defaul #define GOMP_SELF_SPECS "%{fopenmp: -pthread}" #endif +#ifndef PARLOOPS_SELF_SPECS +#define PARLOOPS_SELF_SPECS "%{ftree-parallelize-loops=: -pthread}" +#endif + + static const char *const driver_self_specs[] = { - DRIVER_SELF_SPECS, GOMP_SELF_SPECS + DRIVER_SELF_SPECS, GOMP_SELF_SPECS, PARLOOPS_SELF_SPECS }; #ifndef OPTION_DEFAULT_SPECS -- Sebastian AMD - GNU Tools From sebpop@gmail.com Sat Dec 15 04:45:00 2007 From: sebpop@gmail.com (Sebastian Pop) Date: Sat, 15 Dec 2007 04:45:00 -0000 Subject: Fails for SPEC2006 using -O3 -ftree-parallelize-loops In-Reply-To: References: Message-ID: > I've run a build for spec cpu2006 with -O3 -ftree-parallelize-loops=16 This is on amd64-linux. From drow@false.org Sat Dec 15 09:35:00 2007 From: drow@false.org (Daniel Jacobowitz) Date: Sat, 15 Dec 2007 09:35:00 -0000 Subject: Adding new dwarf encoding formats for complex integers In-Reply-To: <498552560712141447q41c168b7k17c9bde81319feb7@mail.gmail.com> References: <498552560712141430q75098aaembcda8db5c6edd8fb@mail.gmail.com> <20071214223532.GA11966@caradoc.them.org> <498552560712141447q41c168b7k17c9bde81319feb7@mail.gmail.com> Message-ID: <20071215044528.GA30384@caradoc.them.org> On Fri, Dec 14, 2007 at 02:47:02PM -0800, Doug Kwan (?????????) 
wrote: > The new encoding format I am proposing fall between DW_ATE_lo_user and > DW_ATE_hi_user. So they are vendor extensions. Currently gcc uses > DW_ATE_lo_user, which collides with an HP vendor extension. If we already have one vendor extension, why switch? The vendor extensions conflict with other vendors... that's by definition. -- Daniel Jacobowitz CodeSourcery From jakub@redhat.com Sat Dec 15 10:46:00 2007 From: jakub@redhat.com (Jakub Jelinek) Date: Sat, 15 Dec 2007 10:46:00 -0000 Subject: Fails for SPEC2006 using -O3 -ftree-parallelize-loops In-Reply-To: References: Message-ID: <20071215094121.GD2947@sunsite.mff.cuni.cz> On Fri, Dec 14, 2007 at 10:12:11PM -0600, Sebastian Pop wrote: > I've run a build for spec cpu2006 with -O3 -ftree-parallelize-loops=16 > and interestingly there were some fails that I will investigate later. > So I'm just reporting these, and asking for somebody who could fix > the link options for autopar. I'm attaching a patch, not sure it will build. > Before going in the depths of the build machinery, I'm asking if > somebody could help with this. There is a bunch of -ftree-parallelize-loops ICEs in gcc bugzilla already. > For these a patch like this would be needed to automatically link the > gomp and pthread libs. 
> > Index: gcc.c > =================================================================== > --- gcc.c (revision 130927) > +++ gcc.c (working copy) > @@ -721,6 +721,7 @@ proper position among the other output f > %(linker) %l " LINK_PIE_SPEC "%X %{o*} %{A} %{d} %{e*} %{m} %{N} %{n} %{r}\ > %{s} %{t} %{u*} %{x} %{z} %{Z} %{!A:%{!nostdlib:%{!nostartfiles:%S}}}\ > %{static:} %{L*} %(mfwrap) %(link_libgcc) %o\ > + %{ftree-parallelize-loops=:%:include(libgomp.spec)%(link_gomp)}\ > %{fopenmp:%:include(libgomp.spec)%(link_gomp)} %(mflib)\ > %{fprofile-arcs|fprofile-generate|coverage:-lgcov}\ > %{!nostdlib:%{!nodefaultlibs:%(link_ssp) %(link_gcc_c_sequence)}}\ This should be: - %{fopenmp:%:include(libgomp.spec)%(link_gomp)} %(mflib)\ + %{fopenmp|ftree-parallelize-loops=:%:include(libgomp.spec)%(link_gomp)} %(mflib)\ instead, so that for -fopenmp -ftree-parallelize-loops=4 you don't get it duplicated on the link line. > @@ -873,8 +874,13 @@ static const char *const multilib_defaul > #define GOMP_SELF_SPECS "%{fopenmp: -pthread}" > #endif > > +#ifndef PARLOOPS_SELF_SPECS > +#define PARLOOPS_SELF_SPECS "%{ftree-parallelize-loops=: -pthread}" > +#endif > + > + > static const char *const driver_self_specs[] = { > - DRIVER_SELF_SPECS, GOMP_SELF_SPECS > + DRIVER_SELF_SPECS, GOMP_SELF_SPECS, PARLOOPS_SELF_SPECS > }; The same, just add |ftree-parallelize-loops= into GOMP_SELF_SPECS. Jakub From mmokrejs@ribosome.natur.cuni.cz Sat Dec 15 20:25:00 2007 From: mmokrejs@ribosome.natur.cuni.cz (=?UTF-8?B?TWFydGluIE1PS1JFSsWg?=) Date: Sat, 15 Dec 2007 20:25:00 -0000 Subject: How to interpret the automaton output during gcc bootstrap and -mcpu=arm926ej-s or --with-cpu=arm926ejs Message-ID: <4763B05E.1040904@ribosome.natur.cuni.cz> Hi, I am trying to build gcc-4.2.2 for this CPU and am surprised or badly *interpreting* that `arm1026ejs' code is maybe faster than `arm926ejs'? I tried to find this in the Documentation and by Google but no luck. 
$ cat /proc/cpuinfo Processor : ARM926EJ-Sid(wb) rev 5 (v5l) BogoMIPS : 99.73 Features : swp half thumb fastmult edsp java CPU implementer : 0x41 CPU architecture: 5TEJ CPU variant : 0x0 CPU part : 0x926 CPU revision : 5 Cache type : write-back Cache clean : cp15 c7 ops Cache lockdown : format C Cache format : Harvard I size : 32768 I assoc : 4 I line length : 32 I sets : 256 D size : 32768 D assoc : 4 D line length : 32 D sets : 256 Hardware : Oxsemi NAS Revision : 0000 Serial : 0000000000000000 $ CFLAGS="-mcpu=arm926ejs -msoft-float -fomit-frame-pointer -pipe -O2" \ ../configure --with-cpu=arm926ej-s --with-float=soft \ --enable-languages=c,c++,objc --disable-nls --with-newlib $ make bootstrap-lean ... build/genautomata ../../gcc/config/arm/arm.md \ insn-conditions.md > tmp-automata.c Automaton `arm' 444 NDFA states, 1168 NDFA arcs 444 DFA states, 1168 DFA arcs 116 minimal DFA states, 482 minimal DFA arcs 123 all insns 16 insn equivalence classes 0 locked states 468 transition comb vector els, 1856 trans table els: use comb vect 1856 min delay table els, compression factor 1 Automaton `arm926ejs' 17 NDFA states, 47 NDFA arcs 17 DFA states, 47 DFA arcs 11 minimal DFA states, 35 minimal DFA arcs 123 all insns 9 insn equivalence classes 0 locked states 39 transition comb vector els, 99 trans table els: use comb vect 99 min delay table els, compression factor 2 Automaton `arm1020e' 3185 NDFA states, 9075 NDFA arcs 3185 DFA states, 9075 DFA arcs 451 minimal DFA states, 2740 minimal DFA arcs 123 all insns 17 insn equivalence classes 0 locked states 2771 transition comb vector els, 7667 trans table els: use comb vect 7667 min delay table els, compression factor 1 Automaton `arm1026ejs' 10 NDFA states, 27 NDFA arcs 10 DFA states, 27 DFA arcs 6 minimal DFA states, 19 minimal DFA arcs 123 all insns 7 insn equivalence classes 0 locked states 18 transition comb vector els, 42 trans table els: use simple vect 42 min delay table els, compression factor 2 Automaton `arm1136jfs' 
19 NDFA states, 53 NDFA arcs 19 DFA states, 53 DFA arcs 9 minimal DFA states, 33 minimal DFA arcs 123 all insns 8 insn equivalence classes 0 locked states 35 transition comb vector els, 72 trans table els: use simple vect 72 min delay table els, compression factor 2 Automaton `armfp' 70 NDFA states, 147 NDFA arcs 70 DFA states, 147 DFA arcs 70 minimal DFA states, 147 minimal DFA arcs 123 all insns 9 insn equivalence classes 0 locked states 150 transition comb vector els, 630 trans table els: use comb vect 630 min delay table els, compression factor 1 Automaton `vfp11' 198 NDFA states, 631 NDFA arcs 198 DFA states, 631 DFA arcs 198 minimal DFA states, 631 minimal DFA arcs 123 all insns 8 insn equivalence classes 0 locked states 749 transition comb vector els, 1584 trans table els: use simple vect 1584 min delay table els, compression factor 1 4331 all allocated states, 9599 all allocated arcs 4316 all allocated alternative states 4230 all transition comb vector els, 11950 all trans table els 11950 all min delay table els 0 all locked states transformation: 0.010000, building DFA: 5.850000 DFA minimization: 0.830000, making insn equivalence: 0.040000 all automaton generation: 7.650000, output: 1.990000 ... 
objext='.o' \ LIB1ASMFUNCS='_udivsi3 _divsi3 _umodsi3 _modsi3 _dvmd_lnx' \ LIB2FUNCS_ST='_eprintf __gcc_bcmp' \ LIB2FUNCS_EXCLUDE='' \ LIBGCOV='_gcov _gcov_merge_add _gcov_merge_single _gcov_merge_delta _gcov_fork _gcov_execl _gcov_execlp _gcov_execle _gcov_execv _gcov_execvp _gcov_execve _gcov_interval_profiler _gcov_pow2_profiler _gcov_one_value_profiler' \ LIB2ADD='' \ LIB2ADD_ST='' \ LIB2ADDEH='../../gcc/unwind-dw2.c ../../gcc/unwind-dw2-fde-glibc.c ../../gcc/unwind-sjlj.c ../../gcc/gthr-gnat.c ../../gcc/unwind-c.c' \ LIB2ADDEHSTATIC='../../gcc/unwind-dw2.c ../../gcc/unwind-dw2-fde-glibc.c ../../gcc/unwind-sjlj.c ../../gcc/gthr-gnat.c ../../gcc/unwind-c.c' \ LIB2ADDEHSHARED='../../gcc/unwind-dw2.c ../../gcc/unwind-dw2-fde-glibc.c ../../gcc/unwind-sjlj.c ../../gcc/gthr-gnat.c ../../gcc/unwind-c.c' \ LIB2ADDEHDEP='unwind.inc unwind-dw2-fde.h unwind-dw2-fde.c' \ LIB2_SIDITI_CONV_FUNCS='' \ LIBUNWIND='' \ LIBUNWINDDEP='' \ SHLIBUNWIND_LINK='' \ SHLIBUNWIND_INSTALL='' \ FPBIT='' \ FPBIT_FUNCS='_pack_sf _unpack_sf _addsub_sf _mul_sf _div_sf _fpcmp_parts_sf _compare_sf _eq_sf _ne_sf _gt_sf _ge_sf _lt_sf _le_sf _unord_sf _si_to_sf _sf_to_si _negate_sf _make_sf _sf_to_df _sf_to_tf _thenan_sf _sf_to_usi _usi_to_sf' \ LIB2_DIVMOD_FUNCS='_divdi3 _moddi3 _udivdi3 _umoddi3 _udiv_w_sdiv _udivmoddi4' \ DPBIT='' \ DPBIT_FUNCS='_pack_df _unpack_df _addsub_df _mul_df _div_df _fpcmp_parts_df _compare_df _eq_df _ne_df _gt_df _ge_df _lt_df _le_df _unord_df _si_to_df _df_to_si _negate_df _make_df _df_to_sf _df_to_tf _thenan_df _df_to_usi _usi_to_df' \ TPBIT='' \ TPBIT_FUNCS='_pack_tf _unpack_tf _addsub_tf _mul_tf _div_tf _fpcmp_parts_tf _compare_tf _eq_tf _ne_tf _gt_tf _ge_tf _lt_tf _le_tf _unord_tf _si_to_tf _tf_to_si _negate_tf _make_tf _tf_to_df _tf_to_sf _thenan_tf _tf_to_usi _usi_to_tf' \ DFP_ENABLE='' \ DFP_CFLAGS='' \ D32PBIT='' \ D32PBIT_FUNCS='_addsub_sd _div_sd _mul_sd _plus_sd _minus_sd _eq_sd _ne_sd _lt_sd _gt_sd _le_sd _ge_sd _sd_to_si _sd_to_di _sd_to_usi _sd_to_udi 
_si_to_sd _di_to_sd _usi_to_sd _udi_to_sd _sd_to_sf _sd_to_df _sd_to_xf _sf_to_sd _df_to_sd _xf_to_sd _sd_to_dd _sd_to_td _unord_sd _conv_sd' \ D64PBIT='' \ D64PBIT_FUNCS='_addsub_dd _div_dd _mul_dd _plus_dd _minus_dd _eq_dd _ne_dd _lt_dd _gt_dd _le_dd _ge_dd _dd_to_si _dd_to_di _dd_to_usi _dd_to_udi _si_to_dd _di_to_dd _usi_to_dd _udi_to_dd _dd_to_sf _dd_to_df _dd_to_xf _sf_to_dd _df_to_dd _xf_to_dd _dd_to_sd _dd_to_td _unord_dd _conv_dd' \ D128PBIT='' \ D128PBIT_FUNCS='_addsub_td _div_td _mul_td _plus_td _minus_td _eq_td _ne_td _lt_td _gt_td _le_td _ge_td _td_to_si _td_to_di _td_to_usi _td_to_udi _si_to_td _di_to_td _usi_to_td _udi_to_td _td_to_sf _td_to_df _td_to_xf _sf_to_td _df_to_td _xf_to_td _td_to_sd _td_to_dd _unord_td _conv_td' \ MULTILIBS=`/scratch/gcc-4.2.2/objdir/./gcc/xgcc -B/scratch/gcc-4.2.2/objdir/./gcc/ -B/usr/local/armv5tejl-unknown-linux-gnu/bin/ -B/usr/local/armv5tejl-unknown-linux-gnu/lib/ -isystem /usr/local/armv5tejl-unknown-linux-gnu/include -isystem /usr/local/armv5tejl-unknown-linux-gnu/sys-include --print-multi-lib` \ EXTRA_MULTILIB_PARTS='' \ SHLIB_LINK='/scratch/gcc-4.2.2/objdir/./gcc/xgcc -B/scratch/gcc-4.2.2/objdir/./gcc/ -B/usr/local/armv5tejl-unknown-linux-gnu/bin/ -B/usr/local/armv5tejl-unknown-linux-gnu/lib/ -isystem /usr/local/armv5tejl-unknown-linux-gnu/include -isystem /usr/local/armv5tejl-unknown-linux-gnu/sys-include -O2 -O2 -mcpu=arm926ejs -msoft-float -fomit-frame-pointer -pipe -O2 -DIN_GCC -W -Wall -Wwrite-strings -Wstrict-prototypes -Wmissing-prototypes -Wold-style-definition -isystem ./include -fomit-frame-pointer -fPIC -g0 -DHAVE_GTHR_DEFAULT -DIN_LIBGCC2 -D__GCC_FLOAT_NOT_NEEDED -Dinhibit_libc -shared -nodefaultlibs -Wl,--soname=@shlib_base_name@.so.1 -Wl,--version-script=@shlib_map_file@ -o @multilib_dir@/@shlib_base_name@.so.1.tmp @multilib_flags@ @shlib_objs@ -lc && rm -f @multilib_dir@/@shlib_base_name@.so && if [ -f @multilib_dir@/@shlib_base_name@.so.1 ]; then mv -f @multilib_dir@/@shlib_base_name@.so.1 
@multilib_dir@/@shlib_base_name@.so.1.backup; else true; fi && mv @multi lib_dir@/@shlib_base_name@.so.1.tmp @multilib_dir@/@shlib_base_name@.so.1 && ln -s @shlib_base_name@.so.1 @multilib_dir@/@shlib_base_name@.so' \ SHLIB_INSTALL='$(mkinstalldirs) $(DESTDIR)$(slibdir)@shlib_slibdir_qual@; /usr/bin/install -c -m 644 @multilib_dir@/@shlib_base_name@.so.1 $(DESTDIR)$(slibdir)@shlib_slibdir_qual@/@shlib_base_name@.so.1; rm -f $(DESTDIR)$(slibdir)@shlib_slibdir_qual@/@shlib_base_name@.so; ln -s @shlib_base_name@.so.1 $(DESTDIR)$(slibdir)@shlib_slibdir_qual@/@shlib_base_name@.so' \ SHLIB_EXT='.so' \ SHLIB_MULTILIB='' \ SHLIB_MKMAP='../../gcc/mkmap-symver.awk' \ SHLIB_MKMAP_OPTS='' \ SHLIB_MAPFILES='../../gcc/libgcc-std.ver ../../gcc/config/libgcc-glibc.ver ../../gcc/config/libgcc-glibc.ver' \ SHLIB_NM_FLAGS='-pg' \ MULTILIB_OSDIRNAMES='' \ ASM_HIDDEN_OP='' \ GCC_FOR_TARGET='/scratch/gcc-4.2.2/objdir/./gcc/xgcc -B/scratch/gcc-4.2.2/objdir/./gcc/ -B/usr/local/armv5tejl-unknown-linux-gnu/bin/ -B/usr/local/armv5tejl-unknown-linux-gnu/lib/ -isystem /usr/local/armv5tejl-unknown-linux-gnu/include -isystem /usr/local/armv5tejl-unknown-linux-gnu/sys-include' \ mkinstalldirs='/bin/sh ../../gcc/../mkinstalldirs' \ /bin/sh mklibgcc > tmp-libgcc.mk mv tmp-libgcc.mk libgcc.mk TARGET_CPU_DEFAULT="" \ HEADERS="auto-host.h ansidecl.h" DEFINES="USED_FOR_TARGET " \ /bin/sh ../../gcc/mkconfig.sh tconfig.h /scratch/gcc-4.2.2/objdir/./gcc/xgcc -B/scratch/gcc-4.2.2/objdir/./gcc/ -B/usr/local/armv5tejl-unknown-linux-gnu/bin/ -B/usr/local/armv5tejl-unknown-linux-gnu/lib/ -isystem /usr/local/armv5tejl-unknown-linux-gnu/include -isystem /usr/local/armv5tejl-unknown-linux-gnu/sys-include -O2 -O2 -mcpu=arm926ejs -msoft-float -fomit-frame-pointer -pipe -O2 -DIN_GCC -W -Wall -Wwrite-strings -Wstrict-prototypes -Wmissing-prototypes -Wold-style-definition -isystem ./include -I. -I. -I../../gcc -I../../gcc/. 
-I../../gcc/../include -I../../gcc/../libcpp/include -I../../gcc/../libdecnumber -I../libdecnumber -g0 -finhibit-size-directive -fno-inline-functions -fno-exceptions -fno-zero-initialized-in-bss -fno-toplevel-reorder -Dinhibit_libc \ -c ../../gcc/crtstuff.c -DCRT_BEGIN \ -o crtbegin.o ../../gcc/crtstuff.c:1: error: bad value (arm926ejs) for -mcpu= switch make[3]: *** [crtbegin.o] Error 1 make[3]: Leaving directory `/scratch/gcc-4.2.2/objdir/gcc' make[2]: *** [all-stage1-gcc] Error 2 make[2]: Leaving directory `/scratch/gcc-4.2.2/objdir' make[1]: *** [stage1-bubble] Error 2 make[1]: Leaving directory `/scratch/gcc-4.2.2/objdir' make: *** [bootstrap-lean] Error 2 $ # gcc -v Using built-in specs. Configured with: /data/releases/v1.18/buildroot/toolchain_build_arm_nofpu/gcc-3.4.2/configure --prefix=/usr --build=i386-pc-linux-gnu --host=arm-linux-uclibc --target=arm-linux-uclibc --enable-languages=c,c++ --enable-shared --with-gxx-include-dir=/usr/include/c++ --disable-__cxa_atexit --enable-target-optspace --with-gnu-ld --disable-nls --enable-threads --enable-multilib --with-float=soft Thread model: posix gcc version 3.4.2 $ /scratch/gcc-4.2.2/objdir/./gcc/xgcc -v Using built-in specs. Target: armv5tejl-unknown-linux-gnu Configured with: ../configure --with-cpu=arm926ej-s --with-float=soft --enable-languages=c,c++,objc --disable-nls --with-newlib Thread model: posix gcc version 4.2.2 $ So which -mcpu values should I pass to configure and as C*FLAGS to bootstrap to yield a compiler optimized and running only at this processor? It seems gcc 3.4.2 and 4.2.2 accept either 'arm926ejs' or 'arm926ej-s', and configure again just one of these? 
Thanks for your help Martin From aoliva@redhat.com Sat Dec 15 20:32:00 2007 From: aoliva@redhat.com (Alexandre Oliva) Date: Sat, 15 Dec 2007 20:32:00 -0000 Subject: Designs for better debug info in GCC In-Reply-To: <4748B43B.8040609@adacore.com> (Robert Dewar's message of "Sat\, 24 Nov 2007 18\:31\:07 -0500") References: <571f6b510711121208m2bf7c77fp884f52d458df118b@mail.gmail.com> <571f6b510711231556o439e7bbek9ab4855079bab51d@mail.gmail.com> <10711240545.AA22279@vlsi1.ultra.nyu.edu> <4748B43B.8040609@adacore.com> Message-ID: On Nov 24, 2007, Robert Dewar wrote: > Alexandre Oliva wrote: >> Besides, the Ada RTS compiles differently with -g than without -g, >> such that compare-debug doesn't pass if you compare sysdep.o. Nobody >> but me seems to care. > We certainly care about this, and appreciate efforts to fix it! Should be fixed now, FWIW. -- Alexandre Oliva http://www.lsd.ic.unicamp.br/~oliva/ FSF Latin America Board Member http://www.fsfla.org/ Red Hat Compiler Engineer aoliva@{redhat.com, gcc.gnu.org} Free Software Evangelist oliva@{lsd.ic.unicamp.br, gnu.org} From dewar@adacore.com Sat Dec 15 21:41:00 2007 From: dewar@adacore.com (Robert Dewar) Date: Sat, 15 Dec 2007 21:41:00 -0000 Subject: Designs for better debug info in GCC In-Reply-To: References: <571f6b510711121208m2bf7c77fp884f52d458df118b@mail.gmail.com> <571f6b510711231556o439e7bbek9ab4855079bab51d@mail.gmail.com> <10711240545.AA22279@vlsi1.ultra.nyu.edu> <4748B43B.8040609@adacore.com> Message-ID: <476439CE.6050907@adacore.com> Alexandre Oliva wrote: > On Nov 24, 2007, Robert Dewar wrote: > >> Alexandre Oliva wrote: > >>> Besides, the Ada RTS compiles differently with -g than without -g, >>> such that compare-debug doesn't pass if you compare sysdep.o. Nobody >>> but me seems to care. > >> We certainly care about this, and appreciate efforts to fix it! > > Should be fixed now, FWIW. Good to hear, definitely worthwhile! That's an important invariant.
> From aoliva@redhat.com Sat Dec 15 22:51:00 2007 From: aoliva@redhat.com (Alexandre Oliva) Date: Sat, 15 Dec 2007 22:51:00 -0000 Subject: Designs for better debug info in GCC In-Reply-To: <4756B02D.9010302@google.com> (Diego Novillo's message of "Wed\, 05 Dec 2007 09\:05\:33 -0500") References: <84fc9c000711050327x74845c78ya18a3329fcf9e4d2@mail.gmail.com> <4733A637.8070004@adacore.com> <4737BF2C.70408@codesourcery.com> <47388599.2040701@codesourcery.com> <4749DE66.1090602@codesourcery.com> <4756B02D.9010302@google.com> Message-ID: On Dec 5, 2007, Diego Novillo wrote: > On 11/25/07 3:43 PM, Mark Mitchell wrote: >> My suggestion (not as a GCC SC member or GCC RM, but just as a fellow >> GCC developer with an interest in improving the compiler in the same way >> that you're trying to do) is that you stop writing code and start >> writing a paper about what you're trying to do. >> >> Ignore the implementation. Describe the problem in detail. Narrow its >> scope if necessary. Describe the success criteria in detail. Ideally, >> the success criteria are mechanically checkable properties: i.e., given >> a C program as input, and optimized code + debug information as output, >> it should be possible to algorithmically prove whether the output is >> correct. > Yes, please. I would very much like to see an abstract design > document on what you are trying to accomplish. Other than the ones I've already posted, here's one: http://dwarfstd.org/Dwarf3Std.php Seriously. There is a standard for this stuff. My ultimate goal in this project is that we comply with it, at least as far as emitting debug information for location of variables is concerned. 
Here are some relevant postings on design strategies, rationales and goals: http://gcc.gnu.org/ml/gcc/2007-11/msg00229.html (goals) http://gcc.gnu.org/ml/gcc-patches/2007-10/msg00160.html (initial plan) http://gcc.gnu.org/ml/gcc/2007-11/msg00261.html (detailed plan) http://gcc.gnu.org/ml/gcc/2007-11/msg00317.html (example) http://gcc.gnu.org/ml/gcc/2007-11/msg00590.html (more example) http://gcc.gnu.org/ml/gcc/2007-11/msg00176.html (design rationale) http://gcc.gnu.org/ml/gcc/2007-11/msg00177.html (clarification) > I would like to see exactly what Mark is asking for. Perhaps a > presentation in next year's Summit? Sure, if there's interest, I could sure plan on doing that. I could use sponsors, BTW; I haven't discussed this with my employer, and writing articles and presenting speeches are not part of this assignment I was given. Anyhow, by the time of the next year's Summit, I hope this is mostly old news. > I don't think I understand the goal of the project. Follow the standard, as in (1) emit debug information that is correct (standard-compliant), as in, if we emit some piece of debug information, it reflects reality, rather than being a sometimes distant approximation of some past reality long destroyed by some optimization pass, and (2) emit debug information that is more complete, as in, we currently fail to emit a lot of debug information that we could, because we lose track of the location of variables as optimization passes fail to maintain the needed information to do so. > "Correct debugging info" means little, particularly if you say that > it's not debuggers that you are thinking about. Thinking of the debuggers is a mistake. We don't think of specific compilers when reading a programming language standard. We don't think of specific processors when reading an ISA or ABI specification. 
Even when we read documentation specific to a processor, we still don't think of its internal implementation details in order to write a compiler for it; even the scheduling properties are abstracted out in the design specification and optimization guidelines. When someone finds that the compiler deviates from one of these standards, we just cite chapter and verse of the relevant standard, and people see there's a bug. Why should debug information standards be treated any differently? > It's certainly worrisome that your implementation seems to be > intrusive to the point of brittleness. What part of intrusiveness are you concerned about? The change of INSN_P such that it covers DEBUG_INSN_P too in the supported range? Or the few changes that revert to the original INSN_P, in the few exceptions in which DEBUG_INSN_P is not to be handled as an INSN? I've heard this "intrusiveness" argument raised so many times, by so many people who claim not to have been able to keep up with the thread, and who claim not to have looked at the patches at all, that I'm more and more convinced it's more fear of the unknown than any actual rational evaluation of the impact of the changes. Seriously. Have a look at the patches and tell me what in them you regard as intrusive. We're talking about infrastructure here, needed to fix GCC's carelessness about maintaining a mapping between source and implementation concepts that went on for years and years, while optimizations were added and debug information was degraded. At some point you have to face reality and see that such information isn't kept around by magic; it takes some effort, and this effort is needed at every location where there are changes that might affect debug information. And that's pretty much everywhere.
Even if we had consistent interfaces to make some changes, such as variable renaming, substitution, etc, this would only cover a small amount of the data a debug info generator would need: it needs higher-level information than that, especially in rtl, where transformations, for historical reasons, are messier than in the tree IL. So, the approach I've taken is to use the strength of the problem against itself: take advantage of the fact that optimizers already know how to perform transformations they need to do in order to keep things consistent, and represent debug information in a way that, to them, will look just like any other use, so they will adjust it likewise. And then, on top of that, handle the few exceptions, in which the optimizer needs to do something cleverer, because the transformation it performs wouldn't work when say there's more than one use or so. > Will every new optimization need to think about debug information > from scratch and refrain from doing certain transformations? Refraining from doing certain transformations would be wrong. We don't want debug information to affect code generation, and we don't want it to reduce the amount of optimization you can make. So, you optimize away, and if you find that you can't keep track of debug information, you mark stuff as unavailable, or, most likely, the safety nets in place will do that for you, rather than taking the current approach, in which we silently corrupt debug information. Sure, this might require a little bit more thinking in some optimizations. But in my experience fixing up the tree and rtl passes that needed tweaking, the additional thinking needed is a no-brainer in most cases; in a few, you have to work a bit harder to keep information around rather than simply noting it as unavailable. But it has never required optimizations to be disabled, and it must not do so. In fact, in a few cases, I noticed we were missing trivial optimizations and fixed them. 
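To make the "adjust it like any other use, and mark it unavailable when you can't" idea concrete, here is a deliberately tiny toy model. It is not GCC code and not taken from the patches: debug_bind, replace_reg, delete_reg, describe and NO_LOCATION are all hypothetical names invented for this sketch, and real debug insns carry far more than a register number.

```c
#include <stdio.h>

/* Toy model only, NOT GCC code: every name below is hypothetical. */

#define NO_LOCATION (-1)

/* A debug binding records "source variable `name` currently lives in
   pseudo-register `reg`", or NO_LOCATION when that is not known.  */
struct debug_bind {
    const char *name;
    int reg;
};

/* An optimization rewrites every use of register `from` into `to`.
   Bindings are treated like any other use, so they follow along.  */
static void replace_reg(struct debug_bind *b, int n, int from, int to)
{
    for (int i = 0; i < n; i++)
        if (b[i].reg == from)
            b[i].reg = to;
}

/* An optimization deletes register `dead` outright.  Any binding that
   referred to it is reset to NO_LOCATION instead of being left stale.  */
static void delete_reg(struct debug_bind *b, int n, int dead)
{
    for (int i = 0; i < n; i++)
        if (b[i].reg == dead)
            b[i].reg = NO_LOCATION;
}

/* What a consumer would be told: a location, or an honest "don't know".  */
static void describe(const struct debug_bind *b, char *out, size_t len)
{
    if (b->reg == NO_LOCATION)
        snprintf(out, len, "%s: <optimized out>", b->name);
    else
        snprintf(out, len, "%s: r%d", b->name, b->reg);
}
```

The point of the sketch is only the split between the two helpers: a substitution keeps the binding correct for free, while a deletion degrades it to "unavailable" rather than leaving a wrong location behind. Neither helper ever tells the optimizer not to do its job.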
> In my simplistic view of this problem, I've always had the idea that > -O0 -g means "full debugging bliss", -O1 -g means "tolerable > debugging" (symbols shouldn't disappear, for instance, though they do > now) and -O2 -g means "you can probably know what line+function you're > executing". I've never seen this documented as such, and we've never worked toward these stated goals. However, I see that, underlying all of this, we should be concerned about emitting debug information that is correct, i.e., never emit information that says the location of FOO is BAR while it's actually at BAZ. I've seen many people (including myself, in a distant past) claiming that imprecise information is better than no information. I've learned better. Debugger information consumers are often equipped with heuristics to fill in common gaps in debug information. But if the information is there, and wrong, the heuristics that might very well have worked are disabled in favor of the incorrect information, and then the whole system (debuggers, monitors, etc, along with the program) misbehaves. And then, even when heuristics don't exist and the information is gone, it's better to tell the user "I don't know how to get you that" than to hand it something other than it needs (e.g., an incorrect variable location). > But you seem to be addressing other problems. And it even seems to me > that you want debugging information that is capable of deconstructing > arbitrary transformations done by the optimizers. No. I don't see where this notion came from, but it appears to be quite widespread. Omitting certain pieces of debug information is almost always correct, since most debug info attributes are optional. But emitting information that doesn't reflect the program is always incorrect. So, if you perform an arbitrary transformation that is too hard to represent in debug information, that's fine, just throw the information away. 
The debug information might become less complete, and therefore less useful, but at least it won't induce errors elsewhere. The parallel I draw is that emitting an optional piece of debug information is like applying an optional optimization. If it's correct, and it's not too expensive, go for it. But if it's going to get you the wrong output, it's broken, so don't do it. -- Alexandre Oliva http://www.lsd.ic.unicamp.br/~oliva/ FSF Latin America Board Member http://www.fsfla.org/ Red Hat Compiler Engineer aoliva@{redhat.com, gcc.gnu.org} Free Software Evangelist oliva@{lsd.ic.unicamp.br, gnu.org} From sebpop@gmail.com Sat Dec 15 22:55:00 2007 From: sebpop@gmail.com (Sebastian Pop) Date: Sat, 15 Dec 2007 22:55:00 -0000 Subject: HIRLAM and -ftree-loop-linear In-Reply-To: <47630282.6090408@moene.indiv.nluug.nl> References: <47630282.6090408@moene.indiv.nluug.nl> Message-ID: On Dec 14, 2007 4:24 PM, Toon Moene wrote: > Here are (attached) results for testing HIRLAM with and without > -ftree-loop-linear. > Thanks Toon for checking this. > Compilation flags: > > CCFLAGS := -g -O3 $(MACHINECPP) -ftree-loop-linear -ffast-math -fno-associative-math -march=native -mtune=native -ftree-vectorizer-verbose=2 > FCFLAGS := -g -O3 -ftree-loop-linear -fbacktrace -ffpe-trap=invalid,zero,overflow -ffast-math -fno-associative-math -march=native -mtune=native -ftree-vectorizer-verbose=2 > > This compilation got one ICE: > > rttov_aitosu.f90: In function 'rttov_aitosu': > rttov_aitosu.f90:4: error: definition in block 262 does not dominate use in block 134 > for SSA_NAME: pretmp.240_59 in statement: > prephitmp.220_58 = PHI > PHI argument > pretmp.240_59 > for PHI node > prephitmp.220_58 = PHI > rttov_aitosu.f90:4: internal compiler error: verify_ssa failed > Please submit a full bug report, > with preprocessed source if appropriate. > See for instructions.
> > Worked around by compiling this file without -ftree-loop-linear > Could you verify that the attached patch fixes also this problem? Thanks again, Sebastian -------------- next part -------------- A non-text attachment was scrubbed... Name: 859_pr34123.diff Type: text/x-diff Size: 8877 bytes Desc: not available URL: From aoliva@redhat.com Sun Dec 16 03:03:00 2007 From: aoliva@redhat.com (Alexandre Oliva) Date: Sun, 16 Dec 2007 03:03:00 -0000 Subject: Rant about ChangeLog entries and commit messages In-Reply-To: <10712031329.AA20246@vlsi1.ultra.nyu.edu> (Richard Kenner's message of "Mon\, 03 Dec 2007 08\:29\:16 EST") References: <2007-12-02-11-05-39+trackit+sam@rfc1149.net> <200712022136.57819.ebotcazou@libertysurf.fr> <4aca3dc20712021240k19f3eae5j66453276179c401a@mail.gmail.com> <200712022355.23871.ebotcazou@libertysurf.fr> <4aca3dc20712021621n39a036d2u21f471f231dfffe@mail.gmail.com> <10712031329.AA20246@vlsi1.ultra.nyu.edu> Message-ID: On Dec 3, 2007, kenner@vlsi1.ultra.nyu.edu (Richard Kenner) wrote: > In my view, ChangeLog is mostly "write-only" from a developer's > perspective. It's a document that the GNU project requires us to produce > for ... a good example of compliance with the GPL: 5. Conveying Modified Source Versions. a) The work must carry prominent notices stating that you modified it, and giving a relevant date. FWIW, I've used ChangeLogs to find problems a number of times in my 14 years of work in GCC, and I find them very useful. When I need more details, web-searching for the author of the patch and some relevant keywords in the ChangeLog will often point at the relevant e-mail, so burdening people with adding a direct URL seems pointless to me. It's pessimizing the common case for a small optimization in far less common cases. 
-- Alexandre Oliva http://www.lsd.ic.unicamp.br/~oliva/ FSF Latin America Board Member http://www.fsfla.org/ Red Hat Compiler Engineer aoliva@{redhat.com, gcc.gnu.org} Free Software Evangelist oliva@{lsd.ic.unicamp.br, gnu.org} From dberlin@dberlin.org Sun Dec 16 06:27:00 2007 From: dberlin@dberlin.org (Daniel Berlin) Date: Sun, 16 Dec 2007 06:27:00 -0000 Subject: Designs for better debug info in GCC In-Reply-To: References: <4733A637.8070004@adacore.com> <4737BF2C.70408@codesourcery.com> <47388599.2040701@codesourcery.com> <4749DE66.1090602@codesourcery.com> <4756B02D.9010302@google.com> Message-ID: <4aca3dc20712151903r46c9eceane35edb92d08240ac@mail.gmail.com> On 12/15/07, Alexandre Oliva wrote: > On Dec 5, 2007, Diego Novillo wrote: > > > On 11/25/07 3:43 PM, Mark Mitchell wrote: > > >> My suggestion (not as a GCC SC member or GCC RM, but just as a fellow > >> GCC developer with an interest in improving the compiler in the same way > >> that you're trying to do) is that you stop writing code and start > >> writing a paper about what you're trying to do. > >> > >> Ignore the implementation. Describe the problem in detail. Narrow its > >> scope if necessary. Describe the success criteria in detail. Ideally, > >> the success criteria are mechanically checkable properties: i.e., given > >> a C program as input, and optimized code + debug information as output, > >> it should be possible to algorithmically prove whether the output is > >> correct. > > > Yes, please. I would very much like to see an abstract design > > document on what you are trying to accomplish. > > Other than the ones I've already posted, here's one: > > http://dwarfstd.org/Dwarf3Std.php > > Seriously. There is a standard for this stuff. My ultimate goal in > this project is that we comply with it Comply with it how? There is no portion of the DWARF3 spec which requires you output information that is correct or useful. 
The same way the C standard does not require you to write correct programs, only valid ones, the DWARF3 spec does not require you to output correct information, only information that is encoded properly. It is certainly a goal of DWARF3 to allow producers to provide correct info (as witness by the one of the listed goals: "Debugging information must provide consumers a way to find the location of program variables, determine the bounds of dynamic arrays and strings, and possibly to find the base address of a subroutine's stack frame or the return address of a subroutine. Furthermore, to meet the needs of recent computer architectures and optimization techniques, debugging information must be able to describe the location of an object whose location changes over the object's lifetime.") If you search the entire spec for the word "correct", you will find it 3 times. If you search for "must", you will discover they all related to encoding or the goals of the standard. It may be entirely useless to output incorrect information, and in fact, worse than useless. It is however, compliant, as long as they are encoded properly. I have to say, this is typical of the argumentation you have used thus far in this thread, and honestly, it's not winning you any points. That said, nobody here believes we should output useless or incorrect info, even though we could. A lot of people appear to disagree with you about the best way to do it, and in fact, about what we should be trying to provide users in what cases. > >What part of instrusiveness are you concerned about? The change of >INSN_P such that it covers DEBUG_INSN_P too in the supported range? >Or the few changes that revert to the original INSN_P, in the few >exceptions in which DEBUG_INSN_P is not to be handled as an INSN? 
>I've heard this "intrusiveness" argument be pointed out so many times, >by so many people that claim to not have been able to keep up with the >thread, and who claim to have not looked at the patches at all, that >I'm more and more convinced it's just fear of the unknown than any >actual rational evaluation of the impact of the changes. Well, no. You yourself have shown it to be intrusive in the extreme, in the very next paragraphs! " At some point you have to face reality and see that such information isn't kept around by magic, it takes some effort, and this effort is needed at every location where there are changes that might affect debug information. And that's pretty much everywhere. " So, everywhere needs to change. That's pretty intrusive, no? "Sure, this might require a little bit more thinking in some optimizations. But in my experience fixing up the tree and rtl passes that needed tweaking, the additional thinking needed is a no-brainer in most cases; in a few, you have to work a bit harder to keep information around rather than simply noting it as unavailable. " Having to stop and think at every point in an optimization about the debug info, having to deal with debug info at every single point of change, and then your other patches: this is intrusive as well (having to stop and think about debug info at every single point of every single optimization). You don't need to be this intrusive to stop outputting the incorrect info we do. >I've never seen this documented as such, and we've never worked toward > these stated goals. Who is we? I certainly have worked exactly towards these goals. As have almost all the authors of the current debugging info framework. The reason it is the way it is is because these, in fact, *were exactly the goals we were working towards*. As for not documented, a lot of gcc is not documented.
If you look in the mailing list archives, you will even discover Diego is not the first one to have exactly the viewpoint about what should and should not be debuggable, and that the community has consistently worked towards exactly the viewpoint Diego describes. Anyway, I give up on reading this thread. It has turned into a mess. You really need to step back and see that you have not achieved any sort of consensus of what levels of optimization should be how debuggable, before you start telling everyone their approach isn't as good as yours. I certainly wouldn't agree that we should take such intrusive steps to make -O2 -g as debuggable as you want; I'd much rather see us do what we can easily, and drop any info that ends up being incorrect. From nightstrike@gmail.com Sun Dec 16 10:25:00 2007 From: nightstrike@gmail.com (NightStrike) Date: Sun, 16 Dec 2007 10:25:00 -0000 Subject: Rant about ChangeLog entries and commit messages In-Reply-To: References: <2007-12-02-11-05-39+trackit+sam@rfc1149.net> <200712022136.57819.ebotcazou@libertysurf.fr> <4aca3dc20712021240k19f3eae5j66453276179c401a@mail.gmail.com> <200712022355.23871.ebotcazou@libertysurf.fr> <4aca3dc20712021621n39a036d2u21f471f231dfffe@mail.gmail.com> <10712031329.AA20246@vlsi1.ultra.nyu.edu> Message-ID: On 12/15/07, Alexandre Oliva wrote: > On Dec 3, 2007, kenner@vlsi1.ultra.nyu.edu (Richard Kenner) wrote: > > > In my view, ChangeLog is mostly "write-only" from a developer's > > perspective. It's a document that the GNU project requires us to produce > > for > > ... a good example of compliance with the GPL: > > 5. Conveying Modified Source Versions. > > a) The work must carry prominent notices stating that you modified > it, and giving a relevant date. > > > FWIW, I've used ChangeLogs to find problems a number of times in my 14 > years of work in GCC, and I find them very useful.
When I need more > details, web-searching for the author of the patch and some relevant > keywords in the ChangeLog will often point at the relevant e-mail, so > burdening people with adding a direct URL seems pointless to me. It's > pessimizing the common case for a small optimization in far less > common cases. Maybe Changelogs should be reserved for important changes. For instance, something like "Fixed a typo" is a complete waste. I doubt anyone looks at a Changelog to see if someone fixed a typo recently or at any point in the past. Perhaps there could be some criteria so that not every single iota gets a log entry. From toon@moene.indiv.nluug.nl Sun Dec 16 12:31:00 2007 From: toon@moene.indiv.nluug.nl (Toon Moene) Date: Sun, 16 Dec 2007 12:31:00 -0000 Subject: HIRLAM with -ftree-loop-distribution. Message-ID: <4764FCCF.4010805@moene.indiv.nluug.nl> Sebastian, Here are, in addition, the numbers for compiling and running HIRLAM with -ftree-loop-distribution (after applying your patch, obviously). There is something weird going on with the count of the "loops not vectorized" - every successfully vectorized loop gets an additional message: note: not vectorized: vectorization may not beprofitable. which rather defeats the purpose of the "not vectorized" messages. In short, almost 1900 more loops are vectorized, but that's of course due to the fact that loop distribution *makes* more loops. In run time it has little (but positive) effect. Kind regards, -- Toon Moene - e-mail: toon@moene.indiv.nluug.nl - phone: +31 346 214290 Saturnushof 14, 3738 XG Maartensdijk, The Netherlands At home: http://moene.indiv.nluug.nl/~toon/ GNU Fortran's path to Fortran 2003: http://gcc.gnu.org/wiki/Fortran2003 -------------- next part -------------- An embedded and charset-unspecified text was scrubbed...
Name: loop-tests.txt URL: From ubizjak@gmail.com Sun Dec 16 12:44:00 2007 From: ubizjak@gmail.com (Uros Bizjak) Date: Sun, 16 Dec 2007 12:44:00 -0000 Subject: HIRLAM with -ftree-loop-distribution. Message-ID: <47651A7C.4090009@gmail.com> Hello! > There something weird going on with the count of the "loops not > vectorized" - every successfully vectorized loop gets an additional > message: > > note: not vectorized: vectorization may not beprofitable. This is due to switching on the vector cost model by default for x86. BTW: The attached patch fixes the message by adding a space between "be" and "profitable.". The patch was committed to SVN after bootstrapping on x86_64. 2007-12-16 Uros Bizjak * tree-vect-transform.c (conservative_cost_threshold): Add missing space to "not vectorized" message. Uros. Index: tree-vect-transform.c =================================================================== --- tree-vect-transform.c (revision 130987) +++ tree-vect-transform.c (working copy) @@ -6552,7 +6552,7 @@ th = (unsigned) min_profitable_iters; if (vect_print_dump_info (REPORT_UNVECTORIZED_LOOPS)) - fprintf (vect_dump, "not vectorized: vectorization may not be" + fprintf (vect_dump, "not vectorized: vectorization may not be " "profitable."); if (th && vect_print_dump_info (REPORT_DETAILS)) From aoliva@redhat.com Sun Dec 16 12:47:00 2007 From: aoliva@redhat.com (Alexandre Oliva) Date: Sun, 16 Dec 2007 12:47:00 -0000 Subject: Designs for better debug info in GCC In-Reply-To: <4aca3dc20712151903r46c9eceane35edb92d08240ac@mail.gmail.com> (Daniel Berlin's message of "Sat\, 15 Dec 2007 22\:03\:36 -0500") References: <4733A637.8070004@adacore.com> <4737BF2C.70408@codesourcery.com> <47388599.2040701@codesourcery.com> <4749DE66.1090602@codesourcery.com> <4756B02D.9010302@google.com> <4aca3dc20712151903r46c9eceane35edb92d08240ac@mail.gmail.com> Message-ID: On Dec 16, 2007, "Daniel Berlin" wrote: > There is no portion of the DWARF3 spec which requires you output > information that is correct or
useful. The same way the C standard > does not require you to write correct programs, only valid ones, the > DWARF3 spec does not require you to output correct information, only > information that is encoded properly. But if a C compiler translated programs to garbage, that would be wrong. By the same reasoning, if a Dwarf producer created garbage, that would be wrong. It's true that most of Dwarf 3 attributes are optional. But when it says "if you output this attribute, its operand must be such and such", if you output the attribute with operands that don't match the specification, that's a bug. > It is certainly a goal of DWARF3 to allow producers to provide correct > info Exactly. And where's the permission to provide incorrect info, rather than merely leaving it out? >> I've heard this "intrusiveness" argument be pointed out so many times, >> by so many people that claim to not have been able to keep up with the >> thread, and who claim to have not looked at the patches at all, that >> I'm more and more convinced it's just fear of the unknown than any >> actual rational evaluation of the impact of the changes. > Well, no. > You yourself have shown it to be intrusiveness in the extreme, in the > very next paragraphs! > " > At some point you have to face reality and see that such information > isn't kept around by magic, it takes some effort, and this effort is > needed at every location where there are changes that might affect > debug information. And that's pretty much everywhere. " > So, everywhere needs to change. That's pretty intrusiveness, no? No. Looks like selective attention, because you're reasoning out the part in which I discussed using the strength of the optimizers against the problem, by letting them do what they are already used to on the debug information too. 
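To make the point about reusing existing passes concrete, here is a toy sketch in C. It is only an illustration under stated assumptions: the types, the instruction kinds and the NONDEBUG_INSN_P name are made up for this example and are not GCC's real rtx definitions or the code from the patches. Only the names INSN_P and DEBUG_INSN_P are borrowed from the discussion.

```c
#include <assert.h>
#include <stddef.h>

/* Toy instruction kinds (hypothetical; GCC's real rtx codes differ). */
enum toy_code {
  TOY_NOTE,
  TOY_INSN,
  TOY_JUMP_INSN,
  TOY_CALL_INSN,
  TOY_DEBUG_INSN
};

struct toy_insn {
  enum toy_code code;
};

/* True only for debug annotations. */
#define DEBUG_INSN_P(x) ((x)->code == TOY_DEBUG_INSN)

/* Widened predicate: debug insns fall inside the range it covers, so
   every loop already written in terms of INSN_P visits them too.  */
#define INSN_P(x) ((x)->code >= TOY_INSN && (x)->code <= TOY_DEBUG_INSN)

/* The old, narrower meaning, for the few passes that must skip
   debug insns (name is an assumption made for this sketch).  */
#define NONDEBUG_INSN_P(x) (INSN_P (x) && !DEBUG_INSN_P (x))

/* A pass left lexically unchanged: it keeps using INSN_P and therefore
   walks (and can update) debug insns along with the real ones.  */
size_t
count_visited (const struct toy_insn *insns, size_t n)
{
  size_t i, count = 0;
  assert (insns != NULL || n == 0);
  for (i = 0; i < n; i++)
    if (INSN_P (&insns[i]))
      count++;
  return count;
}

/* An exception: something like a cost estimate must not let debug
   insns influence the result, so it opts out explicitly.  */
size_t
count_real (const struct toy_insn *insns, size_t n)
{
  size_t i, count = 0;
  assert (insns != NULL || n == 0);
  for (i = 0; i < n; i++)
    if (NONDEBUG_INSN_P (&insns[i]))
      count++;
  return count;
}
```

In this sketch the unchanged loop sees three of four entries in a sequence of note, insn, debug insn, call insn, while the opted-out loop sees two. Whether the real number of opt-out sites is small is exactly what the thread is arguing about; the sketch takes no position on that.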
If we add a new RTL code or a new TREE code, is that intrusive because now every optimization pass will deal with the new node types in very much the same way they've dealt with other similar node types forever? Of course not. And if we have to add a few exceptions here and there to deal with the specifics of this new node type, does that become too intrusive then? I don't think so. Then what's the fuss about the new node types? Do you want to count the number of places in which INSN_P remains there, lexically unchanged, and compare with the number of places in which I've added a !DEBUG_INSN_P after it? > Having to stop and think at every point in an optimization about the > debug info, Well, sorry, writing compilers is hard. You have to think about several things at the same time. Shall we just go shopping instead? I'm trying to make it as simple as possible. The fact that nearly 100% of the code is unchanged seems to indicate to me that it's not such a bad approach, but if you want something that just magically works, you're in for much disappointment. > (having to stop and think about debug info at every single point of > every single optimization). Information doesn't come out of thin air, and thin air doesn't keep information accurate just because we wish it did. We have to work to create and update the information throughout compilation, at every transformation, and my reasoning is precisely that optimizers already do this all the time, so why not use them for what we need? > You don't need to be this intrusive to stop outputting the > incorrect info we do. What do you have to back your statement up? Let me help you: sure we don't. We can just refrain from outputting any debug information whatsoever. Then, it will be compliant with the standard. But it won't be useful. >> I've never seen this documented as such, and we've never worked toward >> these stated goals. > Who is we? > I certainly have worked exactly towards these goals.
> As have almost all the authors of the current debugging info > framework. Oh, wow, I guess I just wasn't welcome into the club, because I didn't get the guidelines book. How unfortunate, now I have to give up my plan of doing better and abide by the unpublished and undocumented goals of some small cabal. Or do I? > If you look in the mailing list archives, you will even discover Diego > is not the first one have exactly the viewpoint about what should and > should not be debuggable, and that the community has consistenly > worked towards exactly the viewpoint diego describes. I've seen several different viewpoints from "the community". > Anyway, I give up on reading this thread. It has turned into a mess. > You really need to step back Oh, do I? Why is that? > and see that you have not achieved any sort of consensus of what > levels of optimization should be how debuggable, Why would I expect to get any consensus on that? I haven't even tried, and I won't. This is not what the issue is about. The issue is about not emitting incorrect information. Better debuggability for all levels of optimization will be a side effect of achieving that, and it will be achievable incrementally once we have an actual framework that enables us to take steps in this direction without introducing further regressions. > I certainly wouldn't agree that we should take such intrusive steps to > make -O2 -g as debuggable as you want, It is obvious that you misunderstood what I want, and how intrusive the approach is. > I'd much rather see us do what we can easily, and drop any info that > ends up being incorrect. So what's your plan to find out what's incorrect? 
-- Alexandre Oliva http://www.lsd.ic.unicamp.br/~oliva/ FSF Latin America Board Member http://www.fsfla.org/ Red Hat Compiler Engineer aoliva@{redhat.com, gcc.gnu.org} Free Software Evangelist oliva@{lsd.ic.unicamp.br, gnu.org} From aoliva@redhat.com Sun Dec 16 13:04:00 2007 From: aoliva@redhat.com (Alexandre Oliva) Date: Sun, 16 Dec 2007 13:04:00 -0000 Subject: Rant about ChangeLog entries and commit messages In-Reply-To: (nightstrike@gmail.com's message of "Sun\, 16 Dec 2007 01\:27\:13 -0500") References: <2007-12-02-11-05-39+trackit+sam@rfc1149.net> <200712022136.57819.ebotcazou@libertysurf.fr> <4aca3dc20712021240k19f3eae5j66453276179c401a@mail.gmail.com> <200712022355.23871.ebotcazou@libertysurf.fr> <4aca3dc20712021621n39a036d2u21f471f231dfffe@mail.gmail.com> <10712031329.AA20246@vlsi1.ultra.nyu.edu> Message-ID: On Dec 16, 2007, NightStrike wrote: > On 12/15/07, Alexandre Oliva wrote: >> ... a good example of compliance with the GPL: >> >> 5. Conveying Modified Source Versions. >> >> a) The work must carry prominent notices stating that you modified >> it, and giving a relevant date. > Maybe Changelogs should be reserved for important changes. For > instance, something like "Fixed a typo" is a complete waste. I doubt > anyone looks ta a Changelog to see if someone fixed a typo recently or > at any point in the past. I've done that, while backporting patches. Oftentimes there are small fixes on top of larger patches, and you want to credit those who made the small fixes, and you want to be sure you caught them next time you look at the patch. ChangeLogs for these are useful for this purpose. > Perhaps there could be some criteria so that not every single iota > gets a log entry. How would leaving changes out comply with 5a above? 
-- Alexandre Oliva http://www.lsd.ic.unicamp.br/~oliva/ FSF Latin America Board Member http://www.fsfla.org/ Red Hat Compiler Engineer aoliva@{redhat.com, gcc.gnu.org} Free Software Evangelist oliva@{lsd.ic.unicamp.br, gnu.org} From toon@moene.indiv.nluug.nl Sun Dec 16 13:34:00 2007 From: toon@moene.indiv.nluug.nl (Toon Moene) Date: Sun, 16 Dec 2007 13:34:00 -0000 Subject: HIRLAM with -ftree-loop-distribution. In-Reply-To: <47651A7C.4090009@gmail.com> References: <47651A7C.4090009@gmail.com> Message-ID: <4765225A.1030803@moene.indiv.nluug.nl> Uros Bizjak wrote: >> note: not vectorized: vectorization may not beprofitable. > > This is due to switching on vector cost model by default for x86. Ah, but my hidden critique of the message was: -ftree-vectorizer-verbose=2 should *only* tell us: 1. Which loops are vectorized. 2. Which are not - and why (in a single sentence). For more detailed logging, one should use -ftree-vectorizer-verbose=n with n>2, IMNSHO. Kind regards, -- Toon Moene - e-mail: toon@moene.indiv.nluug.nl - phone: +31 346 214290 Saturnushof 14, 3738 XG Maartensdijk, The Netherlands At home: http://moene.indiv.nluug.nl/~toon/ GNU Fortran's path to Fortran 2003: http://gcc.gnu.org/wiki/Fortran2003 From toon@moene.indiv.nluug.nl Sun Dec 16 13:59:00 2007 From: toon@moene.indiv.nluug.nl (Toon Moene) Date: Sun, 16 Dec 2007 13:59:00 -0000 Subject: libgfortran, libgomp not compiled with BOOT_CFLAGS. Message-ID: <4765294E.9020203@moene.indiv.nluug.nl> L.S., Recently, I've begun to bootstrap with make BOOT_CFLAGS="flags", basically to get the run time libraries (libgfortran, libgomp) compiled with -mcpu=native -mtune=native (the speed of the compiler doesn't interest me that much). However, I see that almost everything is compiled with -mcpu=native -mtune=native, *except* the run time libraries ... Is that because they're target libraries - if so, how would one get them compiled "optimally" if not building a cross-compiler ? Thanks in advance for any insight. 
-- Toon Moene - e-mail: toon@moene.indiv.nluug.nl - phone: +31 346 214290 Saturnushof 14, 3738 XG Maartensdijk, The Netherlands At home: http://moene.indiv.nluug.nl/~toon/ GNU Fortran's path to Fortran 2003: http://gcc.gnu.org/wiki/Fortran2003 From DORIT@il.ibm.com Sun Dec 16 14:14:00 2007 From: DORIT@il.ibm.com (Dorit Nuzman) Date: Sun, 16 Dec 2007 14:14:00 -0000 Subject: HIRLAM with -ftree-loop-distribution. In-Reply-To: <4765225A.1030803@moene.indiv.nluug.nl> Message-ID: > Uros Bizjak wrote: > > >> note: not vectorized: vectorization may not beprofitable. > > > > This is due to switching on vector cost model by default for x86. > > Ah, but my hidden critique of the message was: > -ftree-vectorizer-verbose=2 should *only* tell us: > > 1. Which loops are vectorized. > 2. Which are not - and why (in a single sentence). > > For more detailed logging, one should use -ftree-vectorizer-verbose=n > with n>2, IMNSHO. > yes, you are right. this printing should be either removed (as it's anyhow already being printed also under REPORT_DETAILS), or we may want to add a new verbosity level (lower than REPORT_DETAILS) for cost-model info ("REPORT_COST"). 
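A minimal sketch of how such an ordered verbosity scheme could behave, written in C. This is an assumption-laden illustration, not the vectorizer's actual code: the helper names set_report_level, should_report and report_cost are hypothetical, and the enum is a shortened stand-in that only borrows the level names mentioned in this thread.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical verbosity levels, ordered from least to most detailed.
   REPORT_COST sits above the "which loops" levels and below
   REPORT_DETAILS, as suggested in the message above.  */
enum report_level {
  REPORT_NONE,
  REPORT_VECTORIZED_LOOPS,
  REPORT_UNVECTORIZED_LOOPS,
  REPORT_COST,
  REPORT_DETAILS
};

static enum report_level current_level = REPORT_NONE;

/* Select the level, much as a -ftree-vectorizer-verbose=n style option
   might (hypothetical helper).  */
void
set_report_level (int n)
{
  assert (n >= REPORT_NONE && n <= REPORT_DETAILS);
  current_level = (enum report_level) n;
}

/* Nonzero if a message tagged LEVEL should be printed at the
   currently selected verbosity.  */
int
should_report (enum report_level level)
{
  return level != REPORT_NONE && level <= current_level;
}

/* Print a cost-model message only at REPORT_COST or above.
   Returns 1 if the message was printed, 0 otherwise.  */
int
report_cost (FILE *out, const char *msg)
{
  if (!should_report (REPORT_COST))
    return 0;
  fprintf (out, "cost model: %s\n", msg);
  return 1;
}
```

With the level set to 2 in this sketch, the vectorized and not-vectorized summaries are enabled while cost-model chatter stays silent; raising the level to 3 turns the cost messages on without pulling in everything tagged REPORT_DETAILS.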
dorit > Kind regards, > > -- > Toon Moene - e-mail: toon@moene.indiv.nluug.nl - phone: +31 346 214290 > Saturnushof 14, 3738 XG Maartensdijk, The Netherlands > At home: http://moene.indiv.nluug.nl/~toon/ > GNU Fortran's path to Fortran 2003: http://gcc.gnu.org/wiki/Fortran2003 From toon@moene.indiv.nluug.nl Sun Dec 16 14:15:00 2007 From: toon@moene.indiv.nluug.nl (Toon Moene) Date: Sun, 16 Dec 2007 14:15:00 -0000 Subject: HIRLAM and -ftree-loop-linear In-Reply-To: References: <47630282.6090408@moene.indiv.nluug.nl> Message-ID: <4765329D.9050907@moene.indiv.nluug.nl> Sebastian Pop wrote: > I wrote: >> rttov_aitosu.f90: In function 'rttov_aitosu': >> rttov_aitosu.f90:4: error: definition in block 262 does not dominate use in block 134 >> >> Worked around by compiling this file without -ftree-loop-linear >> > > Could you verify that the attached patch fixes also this problem? Unfortunately, it doesn't; I get exactly the same error message as before. Kind regards, -- Toon Moene - e-mail: toon@moene.indiv.nluug.nl - phone: +31 346 214290 Saturnushof 14, 3738 XG Maartensdijk, The Netherlands At home: http://moene.indiv.nluug.nl/~toon/ GNU Fortran's path to Fortran 2003: http://gcc.gnu.org/wiki/Fortran2003 From hp@bitrange.com Sun Dec 16 14:24:00 2007 From: hp@bitrange.com (Hans-Peter Nilsson) Date: Sun, 16 Dec 2007 14:24:00 -0000 Subject: Help with another constraint In-Reply-To: <000601c83cbf$0c887a30$2e08a8c0@CAM.ARTIMI.COM> References: <002601c83c7c$b3763960$33160e98@ece.ncsu.edu> <000601c83cbf$0c887a30$2e08a8c0@CAM.ARTIMI.COM> Message-ID: <20071216091307.F82190@dair.pair.com> On Wed, 12 Dec 2007, Dave Korn wrote: > On 12 December 2007 12:14, Revital1 Eres wrote: > > > It seems that the pair m and I is missing (which indicate the memory = > > constant instruction). > > So doesn't the question then become "Why isn't reload reloading the constant > into a register"? Yes. 
And the answer AFAIK is "because it doesn't see a way to move a constant into a register; it understands "r", not "p" and "q". So bviyer, add an "r" alternative. See also the "*" and "#" qualifiers. No need for bogus 0 -to- memory alternatives. brgds, H-P From DORIT@il.ibm.com Sun Dec 16 14:33:00 2007 From: DORIT@il.ibm.com (Dorit Nuzman) Date: Sun, 16 Dec 2007 14:33:00 -0000 Subject: HIRLAM and -ftree-loop-linear In-Reply-To: <47630282.6090408@moene.indiv.nluug.nl> Message-ID: > Sebastian, > > Here are (attached) results for testing HIRLAM with and without > -ftree-loop-linear. > > As you can see, the results are neutral: 4 loops fewer vectorized, but > about 50 fewer recognized. > any chance you kept the dumps and can report which loops were not vectorized/recognized with -ftree-loop-linear (so we could see if these represent missed vectorization opportunities?) thanks, dorit > Now I like to redo that test with -ftree-loop-distribution. Can you > send me a patch against the trunk (otherwise it won't be a fair comparison). 
> > Kind regards, > > -- > Toon Moene - e-mail: toon@moene.indiv.nluug.nl - phone: +31 346 214290 > Saturnushof 14, 3738 XG Maartensdijk, The Netherlands > At home: http://moene.indiv.nluug.nl/~toon/ > GNU Fortran's path to Fortran 2003: http://gcc.gnu.org/wiki/Fortran2003 > Baseline, no source changes: > > Mon Dec 10 17:45:19 UTC 2007 (revision 130746) > > Compilation flags: > > CCFLAGS := -g -O3 $(MACHINECPP) -ffast-math -fno-associative-math - > march=native -mtune=native -ftree-vectorizer-verbose=2 > FCFLAGS := -g -O3 -fbacktrace -ffpe-trap=invalid,zero,overflow - > ffast-math -fno-associative-math -march=native -mtune=native -ftree- > vectorizer-verbose=2 > > Loops vectorized: > 5675 > Loops not vectorized: > 13705 > > Timings: > 20061201_00/HL_Cycle_2006120100.html: FORECAST TOOK 12.7488 SECONDS > 20061201_00/HL_Cycle_2006120100.html: FORECAST TOOK 2445.9609 SECONDS > 20061201_06/HL_Cycle_2006120106.html: FORECAST TOOK 259.3362 SECONDS > 20061201_06/HL_Cycle_2006120106.html: FORECAST TOOK 12.4408 SECONDS > 20061201_06/HL_Cycle_2006120106.html: FORECAST TOOK 305.9351 SECONDS > 20061201_12/HL_Cycle_2006120112.html: FORECAST TOOK 262.1124 SECONDS > 20061201_12/HL_Cycle_2006120112.html: FORECAST TOOK 12.7448 SECONDS > 20061201_12/HL_Cycle_2006120112.html: FORECAST TOOK 2323.3733 SECONDS > 20061201_12r/HL_Cycle_2006120112r.html: FORECAST TOOK 412.7058 SECONDS > 20061201_18/HL_Cycle_2006120118.html: FORECAST TOOK 264.5685 SECONDS > 20061201_18/HL_Cycle_2006120118.html: FORECAST TOOK 12.6648 SECONDS > 20061201_18/HL_Cycle_2006120118.html: FORECAST TOOK 306.7352 SECONDS > 20061202_00/HL_Cycle_2006120200.html: FORECAST TOOK 261.5164 SECONDS > 20061202_00/HL_Cycle_2006120200.html: FORECAST TOOK 12.7688 SECONDS > 20061202_00/HL_Cycle_2006120200.html: FORECAST TOOK 2325.3774 SECONDS > 20061202_00r/HL_Cycle_2006120200r.html: FORECAST TOOK 413.8739 SECONDS > > Baseline, no source changes, with -ftree-loop-linear: > > Mon Dec 10 17:45:19 UTC 2007 (revision 130746) > > 
Compilation flags: > > CCFLAGS := -g -O3 $(MACHINECPP) -ftree-loop-linear -ffast-math -fno- > associative-math -march=native -mtune=native -ftree-vectorizer-verbose=2 > FCFLAGS := -g -O3 -ftree-loop-linear -fbacktrace -ffpe-trap=invalid, > zero,overflow -ffast-math -fno-associative-math -march=native - > mtune=native -ftree-vectorizer-verbose=2 > > This compilation got one ICE: > > rttov_aitosu.f90: In function 'rttov_aitosu': > rttov_aitosu.f90:4: error: definition in block 262 does not dominate > use in block 134 > for SSA_NAME: pretmp.240_59 in statement: > prephitmp.220_58 = PHI > PHI argument > pretmp.240_59 > for PHI node > prephitmp.220_58 = PHI > rttov_aitosu.f90:4: internal compiler error: verify_ssa failed > Please submit a full bug report, > with preprocessed source if appropriate. > See for instructions. > > Worked around by compiling this file without -ftree-loop-linear > > Loops vectorized: > 5671 > Loops not vectorized: > 13655 > > Timings: > 20061201_00/HL_Cycle_2006120100.html: FORECAST TOOK 12.5648 SECONDS > 20061201_00/HL_Cycle_2006120100.html: FORECAST TOOK 2444.1208 SECONDS > 20061201_06/HL_Cycle_2006120106.html: FORECAST TOOK 259.3402 SECONDS > 20061201_06/HL_Cycle_2006120106.html: FORECAST TOOK 12.4728 SECONDS > 20061201_06/HL_Cycle_2006120106.html: FORECAST TOOK 307.8672 SECONDS > 20061201_12/HL_Cycle_2006120112.html: FORECAST TOOK 260.0323 SECONDS > 20061201_12/HL_Cycle_2006120112.html: FORECAST TOOK 12.8608 SECONDS > 20061201_12/HL_Cycle_2006120112.html: FORECAST TOOK 2310.2485 SECONDS > 20061201_12r/HL_Cycle_2006120112r.html: FORECAST TOOK 411.3977 SECONDS > 20061201_18/HL_Cycle_2006120118.html: FORECAST TOOK 261.1283 SECONDS > 20061201_18/HL_Cycle_2006120118.html: FORECAST TOOK 12.7248 SECONDS > 20061201_18/HL_Cycle_2006120118.html: FORECAST TOOK 308.1313 SECONDS > 20061202_00/HL_Cycle_2006120200.html: FORECAST TOOK 262.7564 SECONDS > 20061202_00/HL_Cycle_2006120200.html: FORECAST TOOK 12.6528 SECONDS > 
20061202_00/HL_Cycle_2006120200.html: FORECAST TOOK 2336.5620 SECONDS > 20061202_00r/HL_Cycle_2006120200r.html: FORECAST TOOK 410.6577 SECONDS From DORIT@il.ibm.com Sun Dec 16 14:54:00 2007 From: DORIT@il.ibm.com (Dorit Nuzman) Date: Sun, 16 Dec 2007 14:54:00 -0000 Subject: HIRLAM with -ftree-loop-distribution. In-Reply-To: Message-ID: Here's a tentative patch to do that: - removes the confusing printing "not vectorized: vectorization may not be profitable" from REPORT_UNVECTORIZED_LOOPS - instead print "vectorization may not be profitable" under a new verbosity level REPORT_COST - change (hopefully all) other cost-model printings to be printed under REPORT_COST I'll test it later this week. I assume this kind of thing is ok stage 3 material (it's a regression fix because this confusion in the dump reports was introduced with the cost model patches during 4.3) dorit --- tree-vect-transform.c 2007-12-16 14:09:20.000000000 +0200 +++ tree-vect-transform.cost_verbose.c 2007-12-16 16:07:09.000000000 +0200 @@ -134,7 +134,7 @@ /* Cost model disabled. */ if (!flag_vect_cost_model) { - if (vect_print_dump_info (REPORT_DETAILS)) + if (vect_print_dump_info (REPORT_COST)) fprintf (vect_dump, "cost model disabled."); return 0; } @@ -153,7 +153,7 @@ /* FIXME: Make cost depend on complexity of individual check. */ vec_outside_cost += VEC_length (tree, LOOP_VINFO_MAY_MISALIGN_STMTS (loop_vinfo)); - if (vect_print_dump_info (REPORT_DETAILS)) + if (vect_print_dump_info (REPORT_COST)) fprintf (vect_dump, "cost model: Adding cost of checks for loop " "versioning to treat misalignment.\n"); } @@ -163,7 +163,7 @@ /* FIXME: Make cost depend on complexity of individual check. */ vec_outside_cost += VEC_length (ddr_p, LOOP_VINFO_MAY_ALIAS_DDRS (loop_vinfo));
- if (vect_print_dump_info (REPORT_DETAILS)) + if (vect_print_dump_info (REPORT_COST)) fprintf (vect_dump, "cost model: Adding cost of checks for loop " "versioning aliasing.\n"); } @@ -224,14 +224,14 @@ if (byte_misalign < 0) { peel_iters_prologue = vf/2; - if (vect_print_dump_info (REPORT_DETAILS)) + if (vect_print_dump_info (REPORT_COST)) fprintf (vect_dump, "cost model: " "prologue peel iters set to vf/2."); /* If peeling for alignment is unknown, loop bound of main loop becomes unknown. */ peel_iters_epilogue = vf/2; - if (vect_print_dump_info (REPORT_DETAILS)) + if (vect_print_dump_info (REPORT_COST)) fprintf (vect_dump, "cost model: " "epilogue peel iters set to vf/2 because " "peeling for alignment is unknown ."); @@ -261,7 +261,7 @@ if (!LOOP_VINFO_NITERS_KNOWN_P (loop_vinfo)) { peel_iters_epilogue = vf/2; - if (vect_print_dump_info (REPORT_DETAILS)) + if (vect_print_dump_info (REPORT_COST)) fprintf (vect_dump, "cost model: " "epilogue peel iters set to vf/2 because " "loop iterations are unknown ."); @@ -391,7 +391,7 @@ /* vector version will never be profitable. */ else { - if (vect_print_dump_info (REPORT_DETAILS)) + if (vect_print_dump_info (REPORT_COST)) fprintf (vect_dump, "cost model: vector iteration cost = %d " "is divisible by scalar iteration cost = %d by a factor " "greater than or equal to the vectorization factor = %d .", @@ -399,7 +399,7 @@ return -1; } - if (vect_print_dump_info (REPORT_DETAILS)) + if (vect_print_dump_info (REPORT_COST)) { fprintf (vect_dump, "Cost model analysis: \n"); fprintf (vect_dump, " Vector inside of loop cost: %d\n", @@ -425,7 +425,7 @@ then skip the vectorized loop. 
*/ min_profitable_iters--; - if (vect_print_dump_info (REPORT_DETAILS)) + if (vect_print_dump_info (REPORT_COST)) fprintf (vect_dump, " Profitability threshold = %d\n", min_profitable_iters); @@ -465,7 +465,7 @@ vectype = get_vectype_for_scalar_type (TREE_TYPE (reduction_op)); if (!vectype) { - if (vect_print_dump_info (REPORT_DETAILS)) + if (vect_print_dump_info (REPORT_COST)) { fprintf (vect_dump, "unsupported data-type "); print_generic_expr (vect_dump, TREE_TYPE (reduction_op), TDF_SLIM); @@ -520,7 +520,7 @@ STMT_VINFO_OUTSIDE_OF_LOOP_COST (stmt_info) = outer_cost; - if (vect_print_dump_info (REPORT_DETAILS)) + if (vect_print_dump_info (REPORT_COST)) fprintf (vect_dump, "vect_model_reduction_cost: inside_cost = %d, " "outside_cost = %d .", STMT_VINFO_INSIDE_OF_LOOP_COST (stmt_info), STMT_VINFO_OUTSIDE_OF_LOOP_COST (stmt_info)); @@ -541,7 +541,7 @@ /* prologue cost for vec_init and vec_step. */ STMT_VINFO_OUTSIDE_OF_LOOP_COST (stmt_info) = 2 * TARG_SCALAR_TO_VEC_COST; - if (vect_print_dump_info (REPORT_DETAILS)) + if (vect_print_dump_info (REPORT_COST)) fprintf (vect_dump, "vect_model_induction_cost: inside_cost = %d, " "outside_cost = %d .", STMT_VINFO_INSIDE_OF_LOOP_COST (stmt_info), STMT_VINFO_OUTSIDE_OF_LOOP_COST (stmt_info)); @@ -570,7 +570,7 @@ outside_cost += TARG_SCALAR_TO_VEC_COST; } - if (vect_print_dump_info (REPORT_DETAILS)) + if (vect_print_dump_info (REPORT_COST)) fprintf (vect_dump, "vect_model_simple_cost: inside_cost = %d, " "outside_cost = %d .", inside_cost, outside_cost); @@ -628,7 +628,7 @@ inside_cost = ncopies * exact_log2(group_size) * group_size * TARG_VEC_STMT_COST; - if (vect_print_dump_info (REPORT_DETAILS)) + if (vect_print_dump_info (REPORT_COST)) fprintf (vect_dump, "vect_model_store_cost: strided group_size = %d .", group_size); @@ -637,7 +637,7 @@ /* Costs of the stores.
     */
   inside_cost += ncopies * TARG_VEC_STORE_COST;
-  if (vect_print_dump_info (REPORT_DETAILS))
+  if (vect_print_dump_info (REPORT_COST))
     fprintf (vect_dump, "vect_model_store_cost: inside_cost = %d, "
              "outside_cost = %d .", inside_cost, outside_cost);
@@ -688,7 +688,7 @@
       inside_cost = ncopies * exact_log2(group_size) * group_size
         * TARG_VEC_STMT_COST;
-      if (vect_print_dump_info (REPORT_DETAILS))
+      if (vect_print_dump_info (REPORT_COST))
         fprintf (vect_dump, "vect_model_load_cost: strided group_size = %d .",
                  group_size);
@@ -701,7 +701,7 @@
       {
         inside_cost += ncopies * TARG_VEC_LOAD_COST;
-        if (vect_print_dump_info (REPORT_DETAILS))
+        if (vect_print_dump_info (REPORT_COST))
           fprintf (vect_dump, "vect_model_load_cost: aligned.");
         break;
@@ -711,7 +711,7 @@
         /* Here, we assign an additional cost for the unaligned load. */
         inside_cost += ncopies * TARG_VEC_UNALIGNED_LOAD_COST;
-        if (vect_print_dump_info (REPORT_DETAILS))
+        if (vect_print_dump_info (REPORT_COST))
           fprintf (vect_dump, "vect_model_load_cost: unaligned supported by "
                    "hardware.");
@@ -731,7 +731,7 @@
       }
     case dr_explicit_realign_optimized:
       {
-        if (vect_print_dump_info (REPORT_DETAILS))
+        if (vect_print_dump_info (REPORT_COST))
           fprintf (vect_dump, "vect_model_load_cost: unaligned software "
                    "pipelined.");
@@ -758,7 +758,7 @@
       gcc_unreachable ();
     }
-  if (vect_print_dump_info (REPORT_DETAILS))
+  if (vect_print_dump_info (REPORT_COST))
     fprintf (vect_dump, "vect_model_load_cost: inside_cost = %d, "
              "outside_cost = %d .", inside_cost, outside_cost);
@@ -6552,11 +6552,7 @@
       || min_profitable_iters > min_scalar_loop_bound))
     th = (unsigned) min_profitable_iters;
-  if (vect_print_dump_info (REPORT_UNVECTORIZED_LOOPS))
-    fprintf (vect_dump, "not vectorized: vectorization may not be"
-             "profitable.");
-
-  if (th && vect_print_dump_info (REPORT_DETAILS))
+  if (th && vect_print_dump_info (REPORT_COST))
     fprintf (vect_dump, "Vectorization may not be profitable.");
   return th;
--- tree-vectorizer.h 2007-12-16 16:08:07.000000000 +0200
+++ tree-vectorizer.cost_verbose.h 2007-12-16 16:08:02.000000000 +0200
@@ -73,6 +73,7 @@
   REPORT_NONE,
   REPORT_VECTORIZED_LOOPS,
   REPORT_UNVECTORIZED_LOOPS,
+  REPORT_COST,
   REPORT_ALIGNMENT,
   REPORT_DR_DETAILS,
   REPORT_BAD_FORM_LOOPS,

Dorit Nuzman/Haifa/IBM@IBMIL
Sent by: gcc-owner@gcc.gnu.org
16/12/2007 16:02
To: Toon Moene
cc: GCC, Uros Bizjak
Subject: Re: HIRLAM with -ftree-loop-distribution.

> Uros Bizjak wrote:
>
> >> note: not vectorized: vectorization may not beprofitable.
> >
> > This is due to switching on vector cost model by default for x86.
>
> Ah, but my hidden critique of the message was:
> -ftree-vectorizer-verbose=2 should *only* tell us:
>
> 1. Which loops are vectorized.
> 2. Which are not - and why (in a single sentence).
>
> For more detailed logging, one should use -ftree-vectorizer-verbose=n
> with n>2, IMNSHO.
>

yes, you are right. this printing should be either removed (as it's
anyhow already being printed also under REPORT_DETAILS), or we may want
to add a new verbosity level (lower than REPORT_DETAILS) for cost-model
info ("REPORT_COST").

dorit

> Kind regards,
>
> --
> Toon Moene - e-mail: toon@moene.indiv.nluug.nl - phone: +31 346 214290
> Saturnushof 14, 3738 XG Maartensdijk, The Netherlands
> At home: http://moene.indiv.nluug.nl/~toon/
> GNU Fortran's path to Fortran 2003: http://gcc.gnu.org/wiki/Fortran2003

From hp@bitrange.com Sun Dec 16 14:58:00 2007
From: hp@bitrange.com (Hans-Peter Nilsson)
Date: Sun, 16 Dec 2007 14:58:00 -0000
Subject: porting gcc to tic54x
In-Reply-To: <984587.48623.qm@web73407.mail.tp2.yahoo.com>
References: <984587.48623.qm@web73407.mail.tp2.yahoo.com>
Message-ID: <20071216093432.R82190@dair.pair.com>

On Wed, 12 Dec 2007, a2220333 wrote:
> hi,
> I have been porting tic54x to gcc. I use gcc-4.2.2 version. I write some
> simplest c54x.h and c54x.c and a empty md, and I
I think the answer is right there ^^^^^^^^^^
> compile it to generate the tic54x-gcc compiler.
> > But when I execute the compiler I generate I got a segmentation fault error. Is there anything must be define in c54x.c or > c54x.h that could make the simplest compiler with no correct output and no errors? Because I want to add functions from this > basic port. If that wasn't the bug, I suggest you start up gdb and step through cc1, but I'd be surprised if you get anywhere without the prerequisite move, add, and control flow insns in the .md. brgds, H-P From toon@moene.indiv.nluug.nl Sun Dec 16 17:17:00 2007 From: toon@moene.indiv.nluug.nl (Toon Moene) Date: Sun, 16 Dec 2007 17:17:00 -0000 Subject: HIRLAM and -ftree-loop-linear In-Reply-To: References: Message-ID: <47653D1C.60408@moene.indiv.nluug.nl> Dorit Nuzman wrote: > any chance you kept the dumps and can report which loops were not > vectorized/recognized with -ftree-loop-linear (so we could see if these > represent missed vectorization opportunities?) I haven't, but it wouldn't be too much effort do this. I'll try stage 1 tonight - i.e., to establish a base (with the latest trunk check-out, not using -ftree-loop-linear), and then subsequently using that flag. 
Kind regards,

--
Toon Moene - e-mail: toon@moene.indiv.nluug.nl - phone: +31 346 214290
Saturnushof 14, 3738 XG Maartensdijk, The Netherlands
At home: http://moene.indiv.nluug.nl/~toon/
GNU Fortran's path to Fortran 2003: http://gcc.gnu.org/wiki/Fortran2003

From hp@bitrange.com Sun Dec 16 17:18:00 2007
From: hp@bitrange.com (Hans-Peter Nilsson)
Date: Sun, 16 Dec 2007 17:18:00 -0000
Subject: Help with another constraint
In-Reply-To: <20071216091307.F82190@dair.pair.com>
References: <002601c83c7c$b3763960$33160e98@ece.ncsu.edu> <000601c83cbf$0c887a30$2e08a8c0@CAM.ARTIMI.COM> <20071216091307.F82190@dair.pair.com>
Message-ID: <20071216120328.H62417@dair.pair.com>

On Sun, 16 Dec 2007, Hans-Peter Nilsson wrote:
> On Wed, 12 Dec 2007, Dave Korn wrote:
>
> > On 12 December 2007 12:14, Revital1 Eres wrote:
> >
> > > It seems that the pair m and I is missing (which indicate the memory
> > > constant instruction).
> >
> > So doesn't the question then become "Why isn't reload reloading the constant
> > into a register"?
>
> Yes. And the answer AFAIK is "because it doesn't see a way to
> move a constant into a register; it understands "r", not "p" and
> "q".

I think I have to correct myself; register allocation and
reload *should* understand p and q as register constraints,
given e.g. a correct REG_CLASS_FROM_LETTER definition and
correct regclass macros. The latter were not disclosed and are
usually a source of hard-to-find errors.

Besides, if you can't directly move between p and q (as your
constraints indicate) then as Rask says, you also need to tell
GCC through the secondary-reload mechanisms.

I can't help but think the best suggestion is for bviyer to
let gdb answer the question by stepping through cc1 instead of
relying on indirect debugging. That's what people do. ;)

brgds, H-P

From sebpop@gmail.com Sun Dec 16 17:53:00 2007
From: sebpop@gmail.com (Sebastian Pop)
Date: Sun, 16 Dec 2007 17:53:00 -0000
Subject: HIRLAM with -ftree-loop-distribution.
In-Reply-To: <4764FCCF.4010805@moene.indiv.nluug.nl>
References: <4764FCCF.4010805@moene.indiv.nluug.nl>
Message-ID:

On Dec 16, 2007 4:24 AM, Toon Moene wrote:
> Here are, in addition, the numbers for compiling and
> running HIRLAM with -ftree-loop-distribution (after applying your patch,
> obviously).
>
> In short, almost 1900 more loops are vectorized, but that's of course
> certainly due to the fact that loop distribution *makes* more loops.
>
> In run time it has little (but positive) effect.
>

Wow! Thanks for the numbers. I guess from your message that there
were no ICEs or other problems with the loop distribution patch.

Mark, is the loop distribution patch okay for trunk?

Thanks,
Sebastian
--
AMD - GNU Tools

From ebotcazou@adacore.com Sun Dec 16 18:01:00 2007
From: ebotcazou@adacore.com (Eric Botcazou)
Date: Sun, 16 Dec 2007 18:01:00 -0000
Subject: Problem with SSA inlining and default defs
Message-ID: <200712161854.29543.ebotcazou@adacore.com>

Hi,

How are SSA inlining and default defs for uninitialized variables supposed to
interact? Suppose you have the following situation

        BB0 ...
         |  \
    (ab) |   BB1   s_2 = f(s_1(D))
         |  /
        BB2   s_3 = PHI

in a function that gets inlined into a loop. The liveness of s_1(D) in BB0
will propagate to BB2 along the backwards edge and you get overlapping live
ranges for s_1(D) and s_3. If s_1(D) is SSA_NAME_OCCURS_IN_ABNORMAL_PHI, the
compilation will abort during SSA coalescing because they must be coalesced.

This is on the mainline, Ada testcase attached, run 'gnatchop' on it and
compile at -O -gnatp.

Thanks in advance.
-- Eric Botcazou -------------- next part -------------- package Q is procedure Read(S : out Integer); procedure Restore(S : in out Integer); end Q; package P is type Int_Ptr is access all Integer; procedure Exec(P : Int_Ptr); end P; with Q; use Q; package body P is procedure Lock is S : Integer; begin Read(S); Restore(S); exception when others => Restore(S); end; procedure Exec(P : Int_Ptr) is begin while P /= NULL loop Lock; end loop; end; end P; From jakub@redhat.com Sun Dec 16 18:10:00 2007 From: jakub@redhat.com (Jakub Jelinek) Date: Sun, 16 Dec 2007 18:10:00 -0000 Subject: Problem with SSA inlining and default defs In-Reply-To: <200712161854.29543.ebotcazou@adacore.com> References: <200712161854.29543.ebotcazou@adacore.com> Message-ID: <20071216180653.GE2947@sunsite.mff.cuni.cz> On Sun, Dec 16, 2007 at 06:54:29PM +0100, Eric Botcazou wrote: > How SSA inlining and default defs for uninitialized variables are supposed to > interact? Suppose you have the following situation > > BB0 ... > | \ > (ab) | BB1 s_2 = f(s_1(D)) > | / > BB2 s_3 = PHI > > in a function that gets inlined into a loop. The liveness of s_1(D) in BB0 > will propagate to BB2 along the backwards edge and you get overlapping live > ranges for s_1(D) and s_3. If s_1(D) is SSA_NAME_OCCURS_IN_ABNORMAL_PHI, the > compilation will abort during SSA coalescing because they must be coalesced. This sounds like PR31081. Jakub From ebotcazou@adacore.com Sun Dec 16 18:27:00 2007 From: ebotcazou@adacore.com (Eric Botcazou) Date: Sun, 16 Dec 2007 18:27:00 -0000 Subject: Problem with SSA inlining and default defs In-Reply-To: <20071216180653.GE2947@sunsite.mff.cuni.cz> References: <200712161854.29543.ebotcazou@adacore.com> <20071216180653.GE2947@sunsite.mff.cuni.cz> Message-ID: <200712161911.49593.ebotcazou@adacore.com> > This sounds like PR31081. Indeed, the C++ testcase is the exact translation of my Ada testcase. 
:-)

The problem seems to arise relatively often in Ada, I think the PR should be
made "critical".

--
Eric Botcazou

From belyshev@depni.sinp.msu.ru Sun Dec 16 18:51:00 2007
From: belyshev@depni.sinp.msu.ru (Serge Belyshev)
Date: Sun, 16 Dec 2007 18:51:00 -0000
Subject: libgfortran, libgomp not compiled with BOOT_CFLAGS.
In-Reply-To: <4765294E.9020203@moene.indiv.nluug.nl> (Toon Moene's message of "Sun\, 16 Dec 2007 14\:34\:06 +0100")
References: <4765294E.9020203@moene.indiv.nluug.nl>
Message-ID: <878x3uts09.fsf@depni.sinp.msu.ru>

Toon Moene writes:

> L.S.,
>
> Recently, I've begun to bootstrap with make BOOT_CFLAGS="flags",
> basically to get the run time libraries (libgfortran, libgomp)
> compiled with -mcpu=native -mtune=native (the speed of the compiler
> doesn't interest me that much).
>
> However, I see that almost everything is compiled with -mcpu=native
> -mtune=native, *except* the run time libraries ...
>
> Is that because they're target libraries - if so, how would one get
> them compiled "optimally" if not building a cross-compiler ?

Yeah, try adding appropriate FCFLAGS into environment. Also CXXFLAGS,
GCJFLAGS affect some of the resulting binaries or libraries when
bootstrapping, and CFLAGS (don't remember about this one for sure).

From nightstrike@gmail.com Sun Dec 16 18:52:00 2007
From: nightstrike@gmail.com (NightStrike)
Date: Sun, 16 Dec 2007 18:52:00 -0000
Subject: Rant about ChangeLog entries and commit messages
In-Reply-To:
References: <2007-12-02-11-05-39+trackit+sam@rfc1149.net> <200712022136.57819.ebotcazou@libertysurf.fr> <4aca3dc20712021240k19f3eae5j66453276179c401a@mail.gmail.com> <200712022355.23871.ebotcazou@libertysurf.fr> <4aca3dc20712021621n39a036d2u21f471f231dfffe@mail.gmail.com> <10712031329.AA20246@vlsi1.ultra.nyu.edu>
Message-ID:

On 12/16/07, Alexandre Oliva wrote:
> On Dec 16, 2007, NightStrike wrote:
>
> > On 12/15/07, Alexandre Oliva wrote:
> >> ... a good example of compliance with the GPL:
> >>
> >> 5.
Conveying Modified Source Versions. > >> > >> a) The work must carry prominent notices stating that you modified > >> it, and giving a relevant date. > > > Maybe Changelogs should be reserved for important changes. For > > instance, something like "Fixed a typo" is a complete waste. I doubt > > anyone looks ta a Changelog to see if someone fixed a typo recently or > > at any point in the past. > > I've done that, while backporting patches. Oftentimes there are small > fixes on top of larger patches, and you want to credit those who made > the small fixes, and you want to be sure you caught them next time you > look at the patch. ChangeLogs for these are useful for this purpose. > > > Perhaps there could be some criteria so that not every single iota > > gets a log entry. > > How would leaving changes out comply with 5a above? It wouldn't without some "creative interpretations". From toon@moene.indiv.nluug.nl Sun Dec 16 19:43:00 2007 From: toon@moene.indiv.nluug.nl (Toon Moene) Date: Sun, 16 Dec 2007 19:43:00 -0000 Subject: HIRLAM with -ftree-loop-distribution. In-Reply-To: References: <4764FCCF.4010805@moene.indiv.nluug.nl> Message-ID: <476573D9.1060301@moene.indiv.nluug.nl> Sebastian Pop wrote: > Wow! Thanks for the numbers. I guess from your message that there > were no ICEs or other problems with the loop distribution patch. Exactly. 
-- Toon Moene - e-mail: toon@moene.indiv.nluug.nl - phone: +31 346 214290 Saturnushof 14, 3738 XG Maartensdijk, The Netherlands At home: http://moene.indiv.nluug.nl/~toon/ GNU Fortran's path to Fortran 2003: http://gcc.gnu.org/wiki/Fortran2003 From jakub@redhat.com Sun Dec 16 19:45:00 2007 From: jakub@redhat.com (Jakub Jelinek) Date: Sun, 16 Dec 2007 19:45:00 -0000 Subject: Problem with SSA inlining and default defs In-Reply-To: <200712161911.49593.ebotcazou@adacore.com> References: <200712161854.29543.ebotcazou@adacore.com> <20071216180653.GE2947@sunsite.mff.cuni.cz> <200712161911.49593.ebotcazou@adacore.com> Message-ID: <20071216194830.GF2947@sunsite.mff.cuni.cz> On Sun, Dec 16, 2007 at 07:11:49PM +0100, Eric Botcazou wrote: > > This sounds like PR31081. > > Indeed, the C++ testcase is the exact translation of my Ada testcase. :-) > > The problem seems to arise relatively often in Ada, I think the PR should be > made "critical". Yeah, to me this looks like the most worrisome P1 4.3 regression. Jakub From lukepadawan@gmail.com Sun Dec 16 20:00:00 2007 From: lukepadawan@gmail.com (Lucas Prado Melo) Date: Sun, 16 Dec 2007 20:00:00 -0000 Subject: Problem with posix threads In-Reply-To: <9f4be2240712161142r49e09784o9a3154901072d900@mail.gmail.com> References: <9f4be2240712161142r49e09784o9a3154901072d900@mail.gmail.com> Message-ID: <9f4be2240712161144h49e5b3e8r20b673702727274d@mail.gmail.com> Please forgive me if this is off-topic: I've written a simple test program with posix threads and a 'glibc' attempt was detected. 
The code: -----main.c------- #include #include #include #include #include #include #include #include "stack.c" /* * THREAD EXPERIMENT * * There are various threads: * I, II and main * thread I repeatly pushes to 'stck' a value (NTIMES times) * thread II then pop repeatly and show the value (NTIMES-1 times) * main thread then pushes the last values and quit */ #define NTIMES 10000000 #define die(msg) do { perror(msg); exit(1); } while(0) void * doPush(void * data); void * doPop(void * data); struct pack{ Stack stck; pthread_mutex_t mutex; }; int main(int argc, char *argv[]){ void *trashbin; pthread_t peer[2]; struct pack pck; //initialize pck mutex if( pthread_mutex_init(&( pck.mutex), NULL) != 0 ) die("pthread_mutex_init"); //and make it multi-threaded if( pthread_create(&peer[0], NULL, doPush, (void*)&pck) != 0 ) die("pthread_create"); if( pthread_create(&peer[1], NULL, doPop, (void*)&pck) != 0 ) die("pthread_create"); //wait all threads do their stuff pthread_join(peer[0],&trashbin); pthread_join(peer[1],&trashbin); pthread_mutex_lock( &(pck.mutex) ); printf("Last one: %c\n", pop( &(pck.stck) )); pthread_mutex_unlock( &(pck.mutex) ); //destroy pck mutex if( pthread_mutex_destroy( &( pck.mutex) ) != 0 ) die("pthread_mutex_destroy"); return 0; } void * doPush(void * data){ struct pack * pck = (struct pack *)data; int x; for(x=0;xmutex) ); push( &(pck->stck), (void*)chr ); pthread_mutex_unlock( &(pck->mutex) ); } pthread_exit( NULL ); } void * doPop(void * data){ struct pack * pck = (struct pack *)data; int x; for(x=0;x<(NTIMES-1);x++){ pthread_mutex_lock( &(pck->mutex) ); printf("%c ", pop( &(pck->stck) ) ); pthread_mutex_unlock( &(pck->mutex) ); } printf("\n"); pthread_exit( NULL ); } ---------------------- -----stack.c------- #ifndef STACK_C #define STACK_C 1 #include struct stack { void * el; struct stack *next; }; typedef struct stack * Stack; void stackInit(Stack * stck){ *stck = NULL; return; } //pushes an element to the stack void push(Stack * stck, void * 
el){
  struct stack * ne;
  ne = malloc(sizeof(struct stack));
  ne->el = el;
  ne->next = *stck;
  *stck = ne;
  return;
}

//pops an element from stack
//return NULL if there's no element in the stack
void * pop(Stack * stck){
  struct stack * de;
  void * el;
  de = *stck;
  *stck = (*stck)->next;
  if(de != NULL ){
    el = de->el;
    free(de);
  }
  else el = NULL;
  return el;
}
#endif
----------------------

From rgalberquilla@fis.ucm.es Sun Dec 16 21:28:00 2007
From: rgalberquilla@fis.ucm.es ("Rodrigo González Alberquilla")
Date: Sun, 16 Dec 2007 21:28:00 -0000
Subject: unable to find a register to spill in class ‘MD_REGS’
Message-ID:

Dear GCC Developers/Users,

I am working on a port of a target backend to PISA architecture (a
MIPS-IV like ISA used by the SimpleScalar simulator). When compiling
libgcc2 for __muldi3:

#ifdef L_muldi3
DWtype
__muldi3 (DWtype u, DWtype v)
{
  const DWunion uu = {.ll = u};
  const DWunion vv = {.ll = v};
  DWunion w = {.ll = __umulsidi3 (uu.s.low, vv.s.low)};

  w.s.high += ((UWtype) uu.s.low * (UWtype) vv.s.high
               + (UWtype) uu.s.high * (UWtype) vv.s.low);

  return w.ll;
}
#endif

I get the following error:

../.././gcc/libgcc2.c: In function ‘__muldi3’:
../.././gcc/libgcc2.c:542: error: unable to find a register to spill in class ‘MD_REGS’
../.././gcc/libgcc2.c:542: error: this is the insn:
(insn 37 36 38 2 (set (reg:DI 116)
        (mult:DI (zero_extend:DI (reg:SI 3 v1 [orig:117 __ul ] [117]))
            (zero_extend:DI (reg:SI 2 v0 [orig:118 __vl ] [118])))) 14 {umulsidi3_32bit_internal} (nil)
    (expr_list:REG_DEAD (reg:SI 3 v1 [orig:117 __ul ] [117])
        (expr_list:REG_DEAD (reg:SI 2 v0 [orig:118 __vl ] [118])
            (nil))))

I have compiled it with -da and the dumps are:

greg:
Spilling for insn 37.
Using reg 4 for reload 0
reload failure for reload 1

Reloads for insn # 37
Reload 0: GR_REGS, RELOAD_FOR_OUTPUT_ADDRESS (opnum = 0), can't combine, secondary_reload_p
Reload 1: reload_out (DI) = (reg:DI 116)
        MD_REGS, RELOAD_FOR_OUTPUT (opnum = 0)
        reload_out_reg: (reg:DI 116)
        secondary_out_reload = 0

lreg:
[...]
Register 116 costs: LEA_REGS:10000000 GR_REGS:10000000 MEM:8000
[...]
Register 116 used 2 times across 2 insns in block 2; set 1 time; 8 bytes; NO_REGS or none.
[...]
(insn 37 36 38 2 (set (reg:DI 116)
        (mult:DI (zero_extend:DI (reg:SI 117 [ __ul ]))
            (zero_extend:DI (reg:SI 118 [ __vl ])))) 14 {umulsidi3_32bit_internal} (nil)
    (expr_list:REG_DEAD (reg:SI 117 [ __ul ])
        (expr_list:REG_DEAD (reg:SI 118 [ __vl ])
            (nil))))
[...]

I have read a thread of a guy having a problem like this but it has not
helped me. If anybody would tell me where the problem could be, I would
be very pleased.

Regards,
Rodrigo González

From mark@codesourcery.com Sun Dec 16 22:20:00 2007
From: mark@codesourcery.com (Mark Mitchell)
Date: Sun, 16 Dec 2007 22:20:00 -0000
Subject: Designs for better debug info in GCC
In-Reply-To:
References: <84fc9c000711050327x74845c78ya18a3329fcf9e4d2@mail.gmail.com> <4733A637.8070004@adacore.com> <4737BF2C.70408@codesourcery.com> <47388599.2040701@codesourcery.com> <4749DE66.1090602@codesourcery.com> <4756B02D.9010302@google.com>
Message-ID: <4765986F.90904@codesourcery.com>

Alexandre Oliva wrote:

>> Yes, please. I would very much like to see an abstract design
>> document on what you are trying to accomplish.
>
> Other than the ones I've already posted, here's one:
>
> http://dwarfstd.org/Dwarf3Std.php
>
> Seriously. There is a standard for this stuff.

That's the specification for the encoding format.

I agree with you that emitting incorrect debugging information, in the
sense of declaring that the location of a variable is in one place, even
though its value is not available in that place, is bad. In -O0 code, I
consider it a serious bug.
In -O2 code, I think it's still a bug, but with our current infrastructure, we may have little choice: we either deny all knowledge of the variable's location, or give one that's sometimes incorrect. Which alternative is better depends on what you're trying to do with the information; for interactive debugging, mostly-right is probably better than nothing, whereas for some programmatic activities, the opposite may be true. If your goal is to avoid the information ever being wrong -- without worrying about whether it is complete -- there is of course a trivial solution: do not emit the information. That is not a serious suggestion, but it does provide a path to a serious suggestion, which I gave earlier: conservatively emit location information you provide based on what you can prove at the time you generate debugging information. For example, if the value of "x" is in a register, and you cross a call which might clobber that register value, then emit debugging information that says that at that point the value is unavailable. You could probably do this kind of thing with relatively few changes to the GCC internal representation; you would run a pass before debug-information generation that attempted to prove dataflow properties about variables and told you where values could reliably be found. Your earlier messages, however, suggest that you are trying to do something harder: emit information that is essentially both complete (in the sense of providing as much information as possible about the locations and values of variables) and correct (in the sense of never giving incorrect information). If you want to do that, you're going to have to answer the harder questions, like "what line number corresponds to this address?" and "what should the debugging information say that the value of a variable is when it has been optimized away?" If that's still your goal, then pointing at the DWARF3 specification doesn't help. 
Diego and I are asking you to confront these fundamental questions about what information you want to provide and what the correctness criteria are. -- Mark Mitchell CodeSourcery mark@codesourcery.com (650) 331-3385 x713 From lukepadawan@gmail.com Sun Dec 16 22:44:00 2007 From: lukepadawan@gmail.com (Lucas Prado Melo) Date: Sun, 16 Dec 2007 22:44:00 -0000 Subject: Problem with posix threads In-Reply-To: <9f4be2240712161144h49e5b3e8r20b673702727274d@mail.gmail.com> References: <9f4be2240712161142r49e09784o9a3154901072d900@mail.gmail.com> <9f4be2240712161144h49e5b3e8r20b673702727274d@mail.gmail.com> Message-ID: <9f4be2240712161420n4682881vffecad702d616373@mail.gmail.com> Why does it happen? From drow@false.org Mon Dec 17 01:12:00 2007 From: drow@false.org (Daniel Jacobowitz) Date: Mon, 17 Dec 2007 01:12:00 -0000 Subject: Problem with posix threads In-Reply-To: <9f4be2240712161420n4682881vffecad702d616373@mail.gmail.com> References: <9f4be2240712161142r49e09784o9a3154901072d900@mail.gmail.com> <9f4be2240712161144h49e5b3e8r20b673702727274d@mail.gmail.com> <9f4be2240712161420n4682881vffecad702d616373@mail.gmail.com> Message-ID: <20071216224402.GA8080@caradoc.them.org> On Sun, Dec 16, 2007 at 07:20:37PM -0300, Lucas Prado Melo wrote: > Why does it happen? This list is for the development of GCC. Try gcc-help or some other programming forum, please. -- Daniel Jacobowitz CodeSourcery From dberlin@dberlin.org Mon Dec 17 01:27:00 2007 From: dberlin@dberlin.org (Daniel Berlin) Date: Mon, 17 Dec 2007 01:27:00 -0000 Subject: Designs for better debug info in GCC In-Reply-To: References: <4737BF2C.70408@codesourcery.com> <47388599.2040701@codesourcery.com> <4749DE66.1090602@codesourcery.com> <4756B02D.9010302@google.com> <4aca3dc20712151903r46c9eceane35edb92d08240ac@mail.gmail.com> Message-ID: <4aca3dc20712161712w1133fb96qd66be0e9a0bb1716@mail.gmail.com> > It is obvious that you misunderstood what I want, and how intrusive > the approach is. 
> Yes Alexandre, everyone who disagrees with you must not understand! That's really the problem here. None of us understand but you. From Joe.Buck@synopsys.COM Mon Dec 17 05:38:00 2007 From: Joe.Buck@synopsys.COM (Joe Buck) Date: Mon, 17 Dec 2007 05:38:00 -0000 Subject: Designs for better debug info in GCC In-Reply-To: <4aca3dc20712161712w1133fb96qd66be0e9a0bb1716@mail.gmail.com> References: <4737BF2C.70408@codesourcery.com> <47388599.2040701@codesourcery.com> <4749DE66.1090602@codesourcery.com> <4756B02D.9010302@google.com> <4aca3dc20712151903r46c9eceane35edb92d08240ac@mail.gmail.com> <4aca3dc20712161712w1133fb96qd66be0e9a0bb1716@mail.gmail.com> Message-ID: <20071217012735.GA9275@synopsys.com> On Sun, Dec 16, 2007 at 08:12:07PM -0500, Daniel Berlin wrote: > > It is obvious that you misunderstood what I want, and how intrusive > > the approach is. > > > > Yes Alexandre, everyone who disagrees with you must not understand! > That's really the problem here. > None of us understand but you. I have some sympathy for going in Alexandre's direction, in that it would be nice to have a mode that provided optimization as well as accurate debugging. However, since preserving accurate debug information has a cost, I think it would be better to turn -O1, not -O2, into the mode that Alexandre wants, where debug information is preserved. Trying to rework all optimizations to keep perfect debug information is going to take forever and make the compiler worse. 
From bosch@adacore.com Mon Dec 17 08:20:00 2007
From: bosch@adacore.com (Geert Bosch)
Date: Mon, 17 Dec 2007 08:20:00 -0000
Subject: Designs for better debug info in GCC
In-Reply-To: <20071217012735.GA9275@synopsys.com>
References: <4737BF2C.70408@codesourcery.com> <47388599.2040701@codesourcery.com> <4749DE66.1090602@codesourcery.com> <4756B02D.9010302@google.com> <4aca3dc20712151903r46c9eceane35edb92d08240ac@mail.gmail.com> <4aca3dc20712161712w1133fb96qd66be0e9a0bb1716@mail.gmail.com> <20071217012735.GA9275@synopsys.com>
Message-ID: <8DF14649-456C-40D6-94C5-DC3285EFD7C2@adacore.com>

On Dec 16, 2007, at 20:27, Joe Buck wrote:
> I have some sympathy for going in Alexandre's direction, in that it
> would be nice to have a mode that provided optimization as well as
> accurate debugging. However, since preserving accurate debug
> information has a cost, I think it would be better to turn -O1, not
> -O2, into the mode that Alexandre wants, where debug information is
> preserved. Trying to rework all optimizations to keep perfect debug
> information is going to take forever and make the compiler worse.

Right, at the moment -O1 is far too much like -O2. There is room
for an optimization mode that is mostly local, scales well for large
programs and allows for high-quality debug information.
Fortunately, these goals all seem to match.

We could conceptually have inspection points between each source
statement and declaration, which would roughly correspond to a use
of all memory and all source variables, whether in memory or in
registers. These inspection points would be considered potentially
trapping.

This approach would still allow some scheduling. For example, loads
and arithmetic operations that are known not to trap could still be
done early. On the other hand, when breaking at any statement,
all variables can be printed.
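
As a rough illustration of the inspection-point idea above, here is a minimal sketch in GNU C. It is not taken from any GCC patch: the macro name INSPECTION_POINT and the function sum_to are made up for this example. It relies only on GCC's extended asm syntax: an empty asm statement that lists a variable as an input and clobbers "memory", so the compiler has to treat that point as a use of the variable and of all memory.

```c
/* Hypothetical macro (not part of GCC): an empty asm that claims to read
   `var` and to touch all of memory.  No instructions are emitted for it,
   but the optimizers must keep `var` computed and memory up to date at
   this point.  */
#define INSPECTION_POINT(var) \
  __asm__ __volatile__ ("" : : "g" (var) : "memory")

/* Illustrative function: sums 1..n, with an inspection point after each
   statement in the loop body.  */
int
sum_to (int n)
{
  int total = 0;
  int i;

  for (i = 1; i <= n; i++)
    {
      total += i;
      /* Both values are treated as used here, so the store to `total`
         cannot be treated as dead at this point.  */
      INSPECTION_POINT (total);
      INSPECTION_POINT (i);
    }

  return total;
}
```

This only approximates the proposal: it covers the variables named explicitly, and whether the debug information then describes their locations correctly at that point is a separate matter.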
Also, since no user-visible state can be modified by speculatively executed instructions such as loads, such instructions should not be tagged with their original source location information. This would prevent the very annoying and unhelpful jumping around the program during debugging. The method I describe here, which roughly corresponds to the semantics of Ada's "pragma Inspection_Point", seems relatively easy to implement using an empty "asm" or similar. -Geert PS. For convenience, I'm including a snippet of the Ada 2005 standard, the full version of which is freely available on the web. H.3.2 Pragma Inspection_Point 1 An occurrence of a pragma Inspection_Point identifies a set of objects each of whose values is to be available at the point(s) during program execution corresponding to the position of the pragma in the compilation unit. The purpose of such a pragma is to facilitate code validation. Syntax 2 The form of a pragma Inspection_Point is as follows: 3 pragma Inspection_Point[(object_name {, object_name})]; Legality Rules 4 A pragma Inspection_Point is allowed wherever a declarative_item or statement is allowed. Each object_name shall statically denote the declaration of an object. Static Semantics 5/2 An inspection point is a point in the object code corresponding to the occurrence of a pragma Inspection_Point in the compilation unit. An object is inspectable at an inspection point if the corresponding pragma Inspection_Point either has an argument denoting that object, or has no arguments and the declaration of the object is visible at the inspection point. Dynamic Semantics 6 Execution of a pragma Inspection_Point has no effect. Implementation Requirements 7 Reaching an inspection point is an external interaction with respect to the values of the inspectable objects at that point (see 1.1.3). 
Documentation Requirements 8 For each inspection point, the implementation shall identify a mapping between each inspectable object and the machine resources (such as memory locations or registers) from which the object's value can be obtained. NOTES 9/2 7 The implementation is not allowed to perform "dead store elimination" on the last assignment to a variable prior to a point where the variable is inspectable. Thus an inspection point has the effect of an implicit read of each of its inspectable objects. 10 8 Inspection points are useful in maintaining a correspondence between the state of the program in source code terms, and the machine state during the program's execution. Assertions about the values of program objects can be tested in machine terms at inspection points. Object code between inspection points can be processed by automated tools to verify programs mechanically. 11 9 The identification of the mapping from source program objects to machine resources is allowed to be in the form of an annotated object listing, in human-readable or tool-processable form. From hailijuan@gmail.com Mon Dec 17 14:24:00 2007 From: hailijuan@gmail.com (Lijuan Hai) Date: Mon, 17 Dec 2007 14:24:00 -0000 Subject: gcc don't allow commas between clauses for openmp In-Reply-To: <48353bf60712162327s581a0bc4p3eb127907e9605a4@mail.gmail.com> References: <48353bf60712162327s581a0bc4p3eb127907e9605a4@mail.gmail.com> Message-ID: <48353bf60712170020q51b15dd7i421161c372e36a21@mail.gmail.com> gcc-4.3-20070912 doesn't allow commas between clauses. Details given following. I have just scanned c-parser.c and found we could change c_parser_omp_clause_name () to enable it. But I want to know more before making any changes on it myself. "openmp implementation in gcc" in GCC SUMMIT 2006 seems not covering the details, e.g. what kind of features gcc doesn't presently support for openmp. 
Thanks,
> --lijuan
>
> micro# /import/dr3/s10/gcc-4.3/bin/gcc a.c -fopenmp
> a.c: In function 'main':
> a.c:11: error: expected '#pragma omp' clause before ',' token
> micro# cat a.c
> #include <omp.h>
> #include <stdio.h>
>
> int main(void)
> {
>   int i = 1, j = 2;
>
>   omp_set_dynamic(0);
>   omp_set_num_threads(4);
>
> #pragma omp parallel shared(i), private(j)
>   {
>     j = omp_get_thread_num();
>     printf("t#: %i i: %i j: %i\n", omp_get_thread_num(), i, j);
>   }
>
>   return 0;
> }
> micro# /import/dr3/s10/gcc-4.3/bin/gcc -v
> Using built-in specs.
> Target: sparc-sun-solaris2.10
> Configured with: /import/dr2/starlex/orig/trunk/configure --prefix=/import/dr3/s10/gcc-4.3/ --enable-languages=c,c++,fortran --disable-gnattools --with-mpfr=/ws/gccfss/tools --with-gmp=/ws/gccfss/tools
> Thread model: posix
> gcc version 4.3.0 20070912 (experimental) (GCC)
>

From dnovillo@google.com Mon Dec 17 17:54:00 2007
From: dnovillo@google.com (Diego Novillo)
Date: Mon, 17 Dec 2007 17:54:00 -0000
Subject: gcc don't allow commas between clauses for openmp
In-Reply-To: <48353bf60712162327s581a0bc4p3eb127907e9605a4@mail.gmail.com>
References: <48353bf60712162327s581a0bc4p3eb127907e9605a4@mail.gmail.com>
Message-ID: <4766869B.6020407@google.com>

On 12/17/07 02:27, Lijuan Hai wrote:

> gcc-4.3-20070912 doesn't allow commas between clauses. Details given
> following. I have just scanned c-parser.c and found we could change
> c_parser_omp_clause_name () to enable it.

Thanks for the report. Jakub submitted a patch to fix this problem which I recently approved. It should be available in 4.3 and the 4.2 branch (if backported).

> before making any changes on it myself. "openmp implementation in gcc"
> in GCC SUMMIT 2006 seems not covering the details, e.g. what kind of
> features gcc doesn't presently support for openmp. Thanks,

GCC should support the whole OpenMP 2.5 standard. Support for 3.0 is being implemented by Jakub. Anything not supported is considered a bug and we'd ask you to submit it to bugzilla.

Thanks.
Diego. From aoliva@redhat.com Mon Dec 17 17:59:00 2007 From: aoliva@redhat.com (Alexandre Oliva) Date: Mon, 17 Dec 2007 17:59:00 -0000 Subject: Designs for better debug info in GCC In-Reply-To: <4aca3dc20712161712w1133fb96qd66be0e9a0bb1716@mail.gmail.com> (Daniel Berlin's message of "Sun\, 16 Dec 2007 20\:12\:07 -0500") References: <4737BF2C.70408@codesourcery.com> <47388599.2040701@codesourcery.com> <4749DE66.1090602@codesourcery.com> <4756B02D.9010302@google.com> <4aca3dc20712151903r46c9eceane35edb92d08240ac@mail.gmail.com> <4aca3dc20712161712w1133fb96qd66be0e9a0bb1716@mail.gmail.com> Message-ID: On Dec 16, 2007, "Daniel Berlin" wrote: >> It is obvious that you misunderstood what I want, and how intrusive >> the approach is. > Yes Alexandre, everyone who disagrees with you must not understand! My conclusion is not based on disagreement, but rather on the faulty arguments presented during the discussion. For example, when you took the argument that every transformation had effects on debug information, and used that to conclude that every transformation would need difficult changes to generate correct debug information, you left out from your reasoning a major strength of the design, that I had mentioned in the e-mail you responded to: that the optimizers already perform the transformations we need to keep debug information accurate. So, by missing or misunderstanding an essential part of the thought process that went into the design, you came to a false conclusion about it. > That's really the problem here. > None of us understand but you. I guess I'm to blame, for having naïvely put the code out without as much as a design and goals document, such that people started looking at it without actually understanding what it was about, and at the same time taking conclusions about it based on hunches rather than on solid logical grounds.
At this point, we have a scenario in which people have already jumped to their conclusions, and whatever I say requires a much higher threshold to be listened to and accepted. It's quite unfortunate that psychological factors take such a large role in the making of technical decisions, and I naïvely assumed this wouldn't raise so much rejection, for being such a simple and well thought-out design. Oh, well... Something to avoid next time... -- Alexandre Oliva http://www.lsd.ic.unicamp.br/~oliva/ FSF Latin America Board Member http://www.fsfla.org/ Red Hat Compiler Engineer aoliva@{redhat.com, gcc.gnu.org} Free Software Evangelist oliva@{lsd.ic.unicamp.br, gnu.org} From dnovillo@google.com Mon Dec 17 18:02:00 2007 From: dnovillo@google.com (Diego Novillo) Date: Mon, 17 Dec 2007 18:02:00 -0000 Subject: Designs for better debug info in GCC In-Reply-To: References: <4737BF2C.70408@codesourcery.com> <47388599.2040701@codesourcery.com> <4749DE66.1090602@codesourcery.com> <4756B02D.9010302@google.com> <4aca3dc20712151903r46c9eceane35edb92d08240ac@mail.gmail.com> <4aca3dc20712161712w1133fb96qd66be0e9a0bb1716@mail.gmail.com> Message-ID: <4766B8E5.60500@google.com> On 12/17/07 12:51, Alexandre Oliva wrote: > I guess I'm to blame, for having naïvely put the code out without as > much as a design and goals document Yes, you are. You need to provide such a document now. I can't see how you'll be able to incorporate your implementation without a convincing design. The barrier is probably going to be higher. You raised too much controversy, so I have my doubts about your simplicity claims. Diego.
From aoliva@redhat.com Mon Dec 17 18:33:00 2007 From: aoliva@redhat.com (Alexandre Oliva) Date: Mon, 17 Dec 2007 18:33:00 -0000 Subject: Designs for better debug info in GCC In-Reply-To: <20071217012735.GA9275@synopsys.com> (Joe Buck's message of "Sun\, 16 Dec 2007 17\:27\:35 -0800") References: <4737BF2C.70408@codesourcery.com> <47388599.2040701@codesourcery.com> <4749DE66.1090602@codesourcery.com> <4756B02D.9010302@google.com> <4aca3dc20712151903r46c9eceane35edb92d08240ac@mail.gmail.com> <4aca3dc20712161712w1133fb96qd66be0e9a0bb1716@mail.gmail.com> <20071217012735.GA9275@synopsys.com> Message-ID: On Dec 16, 2007, Joe Buck wrote: > However, since preserving accurate debug information > has a cost, I think it would be better to turn -O1, not -O2, into the > mode that Alexandre wants, where debug information is preserved. In terms of memory, that's true, it does have a cost, for we have to keep more information around. That's one of the reasons why I'm implementing this all under the control of a command-line option: you can selectively enable or disable it, regardless of the level of optimization. If we want to make it default for -O1, but not for -O2, sure, that works. But this won't make much of a difference in terms of code change. Except for the fact that we could simply leave alone the passes that are only executed at -O2 or higher (which is not worth it, given that I've already done the small work needed for them to keep debug info accurate), most of the passes will still keep the information accurate, nearly all of them without any code changes whatsoever. So, doing this only for -O1 seems like a waste, given that -O2 is the most common optimization level, and it's most often accompanied by -g. > Trying to rework all optimizations to keep perfect debug information > is going to take forever and make the compiler worse. This statement is easy to make and to believe, but my approach is proving it false, given a design that took this concern into account. 
-- Alexandre Oliva http://www.lsd.ic.unicamp.br/~oliva/ FSF Latin America Board Member http://www.fsfla.org/ Red Hat Compiler Engineer aoliva@{redhat.com, gcc.gnu.org} Free Software Evangelist oliva@{lsd.ic.unicamp.br, gnu.org} From rask@sygehus.dk Mon Dec 17 18:38:00 2007 From: rask@sygehus.dk (Rask Ingemann Lambertsen) Date: Mon, 17 Dec 2007 18:38:00 -0000 Subject: Help with another constraint In-Reply-To: <20071212143508.GP17368@sygehus.dk> References: <000c01c83a41$4499e240$33160e98@ece.ncsu.edu> <20071209130740.GI17368@sygehus.dk> <000d01c83a81$85462ca0$33160e98@ece.ncsu.edu> <20071210171542.GL17368@sygehus.dk> <002601c83c7c$b3763960$33160e98@ece.ncsu.edu> <20071212143508.GP17368@sygehus.dk> Message-ID: <20071217183242.GE17368@sygehus.dk> On Wed, Dec 12, 2007 at 03:35:09PM +0100, 'Rask Ingemann Lambertsen' wrote: > > The movxx patterns are special and you'll need to hold the compiler's > hands a little. Since your target can't move immediates directly to memory, > you have to ask for a secondary reload to an intermediate register. Use the > target hook TARGET_SECONDARY_RELOAD. Actually, how do you do that? I can't see any place in the documentation that says how TARGET_SECONDARY_RELOAD can be used for that purpose. -- Rask Ingemann Lambertsen Danish law requires addresses in e-mail to be logged and stored for a year From bviyer@ncsu.edu Mon Dec 17 20:31:00 2007 From: bviyer@ncsu.edu (Balaji V. Iyer) Date: Mon, 17 Dec 2007 20:31:00 -0000 Subject: Help with another constraint In-Reply-To: <20071217183242.GE17368@sygehus.dk> References: <000c01c83a41$4499e240$33160e98@ece.ncsu.edu> <20071209130740.GI17368@sygehus.dk> <000d01c83a81$85462ca0$33160e98@ece.ncsu.edu> <20071210171542.GL17368@sygehus.dk> <002601c83c7c$b3763960$33160e98@ece.ncsu.edu> <20071212143508.GP17368@sygehus.dk> <20071217183242.GE17368@sygehus.dk> Message-ID: <007701c840db$f4b92910$33160e98@ece.ncsu.edu> Hi Rask, First, Thank you very much for all help you have provided me. 
It really helped me finish my project.

This is what I did: I capture all the moves regardless of the operand, and to move an immediate into a register, I force it through a register. Here is the code for this:

  if (!no_new_pseudos)
    {
      /* taking care of moving constant integers */
      if (GET_CODE (operands[1]) == CONST_INT)
        {
          rtx reg = gen_reg_rtx (SImode);
          emit_insn (gen_movsi (reg, operands[1]));
          operands[1] = gen_lowpart (QImode, reg);
        }
      /* moving memory operands */
      if (GET_CODE (operands[1]) == MEM)
        {
          rtx reg = gen_reg_rtx (SImode);
          emit_insn (gen_rtx_SET (SImode, reg,
                                  gen_rtx_ZERO_EXTEND (SImode, operands[1])));
          operands[1] = gen_lowpart (QImode, reg);
        }
      /* moving register operands */
      if (GET_CODE (operands[0]) != REG)
        operands[1] = force_reg (QImode, operands[1]);
    }

I hope this helps.

-Balaji V. Iyer.

--
Balaji V. Iyer
PhD Student, Center for Efficient, Scalable and Reliable Computing,
Department of Electrical and Computer Engineering,
North Carolina State University.

-----Original Message-----
From: Rask Ingemann Lambertsen [mailto:rask@sygehus.dk]
Sent: Monday, December 17, 2007 1:33 PM
To: Balaji V. Iyer
Cc: gcc@gcc.gnu.org; openrisc@opencores.org
Subject: Re: Help with another constraint

On Wed, Dec 12, 2007 at 03:35:09PM +0100, 'Rask Ingemann Lambertsen' wrote:
>
> The movxx patterns are special and you'll need to hold the
> compiler's hands a little. Since your target can't move immediates
> directly to memory, you have to ask for a secondary reload to an
> intermediate register. Use the target hook TARGET_SECONDARY_RELOAD.

Actually, how do you do that? I can't see any place in the documentation that says how TARGET_SECONDARY_RELOAD can be used for that purpose.
-- Rask Ingemann Lambertsen Danish law requires addresses in e-mail to be logged and stored for a year From aoliva@redhat.com Mon Dec 17 20:43:00 2007 From: aoliva@redhat.com (Alexandre Oliva) Date: Mon, 17 Dec 2007 20:43:00 -0000 Subject: Designs for better debug info in GCC In-Reply-To: <4766B8E5.60500@google.com> (Diego Novillo's message of "Mon\, 17 Dec 2007 12\:59\:01 -0500") References: <4737BF2C.70408@codesourcery.com> <47388599.2040701@codesourcery.com> <4749DE66.1090602@codesourcery.com> <4756B02D.9010302@google.com> <4aca3dc20712151903r46c9eceane35edb92d08240ac@mail.gmail.com> <4aca3dc20712161712w1133fb96qd66be0e9a0bb1716@mail.gmail.com> <4766B8E5.60500@google.com> Message-ID: On Dec 17, 2007, Diego Novillo wrote: > On 12/17/07 12:51, Alexandre Oliva wrote: >> I guess I'm to blame, for having naïvely put the code out without as >> much as a design and goals document > Yes, you are. Wow, thanks. At least we agree on something! ;-) > You need to provide such a document now. Can't I instead provide it when it's ready? You know, it wasn't me who asked to have the thing developed in the open. I didn't push it out just so that people who didn't want to understand it could beat on it before it was ready to defend itself. I put it out because there was an offer for contribution. > I can't see how you'll be able to incorporate your implementation > without a convincing design. Agreed, I don't see how this would be doable for any but the most trivial patches. > The barrier is probably going to be higher. > You raised too much controversy, so I have my doubts about your > simplicity claims. Oh, nice! *I* raised too much controversy.
So people first ask me to put the code out such that they can peek at it and help, then most refrain from peeking at it because it's not ready and some who do raise some concerns that are not reflected by the code, and then everyone doubts I've taken those concerns into account and demand a design document that will no more than just repeat the information that's already out there but that people fail to take into account. And then, this is a technical discussion, so historical controversy shouldn't play any role in it, if people were rational about it. Now, can you please explain to me how the efforts of repeating myself one more time, rather than completing the implementation, are going to make it any more likely that people who have already made up their minds based on groundless fears will be convinced? If you really think it would be worth it, can you point out at what you feel to be missing in the consolidated documentation I posted upthread, in response to your request? I'd be happy to fill in the blanks, if you're willing to listen. But I wouldn't be happy to waste more time. (This is not to say that the document won't ever be produced; it's to say that I'm to work on it right now. I have other deliverables ahead of it.) 
-- Alexandre Oliva http://www.lsd.ic.unicamp.br/~oliva/ FSF Latin America Board Member http://www.fsfla.org/ Red Hat Compiler Engineer aoliva@{redhat.com, gcc.gnu.org} Free Software Evangelist oliva@{lsd.ic.unicamp.br, gnu.org} From dnovillo@google.com Mon Dec 17 21:20:00 2007 From: dnovillo@google.com (Diego Novillo) Date: Mon, 17 Dec 2007 21:20:00 -0000 Subject: Designs for better debug info in GCC In-Reply-To: References: <4737BF2C.70408@codesourcery.com> <47388599.2040701@codesourcery.com> <4749DE66.1090602@codesourcery.com> <4756B02D.9010302@google.com> <4aca3dc20712151903r46c9eceane35edb92d08240ac@mail.gmail.com> <4aca3dc20712161712w1133fb96qd66be0e9a0bb1716@mail.gmail.com> <4766B8E5.60500@google.com> Message-ID: <4766DF5C.1020802@google.com> On 12/17/07 15:28, Alexandre Oliva wrote: >> You need to provide such a document now. > > Can't I instead provide it when it's ready? Of course. Diego. From toon@moene.indiv.nluug.nl Mon Dec 17 21:24:00 2007 From: toon@moene.indiv.nluug.nl (Toon Moene) Date: Mon, 17 Dec 2007 21:24:00 -0000 Subject: HIRLAM and -ftree-loop-linear In-Reply-To: References: Message-ID: <4766E7B5.80404@moene.indiv.nluug.nl> Dorit Nuzman wrote: > any chance you kept the dumps and can report which loops were not > vectorized/recognized with -ftree-loop-linear (so we could see if these > represent missed vectorization opportunities?) It's a bit much to send you everything, so I'll just send you the diff + two routines (from blas/lapack) that had differences (one positive, one negative). 
The sum total of 'LOOP VECTORIZED' messages: $ wc -l vect.* 5651 vect.lin # -ftree-loop-linear 5673 vect.nolin # no -ftree-loop-linear $ diff -cp vect.nolin vect.lin # attached Hope this is useful, -- Toon Moene - e-mail: toon@moene.indiv.nluug.nl - phone: +31 346 214290 Saturnushof 14, 3738 XG Maartensdijk, The Netherlands At home: http://moene.indiv.nluug.nl/~toon/ GNU Fortran's path to Fortran 2003: http://gcc.gnu.org/wiki/Fortran2003 -------------- next part -------------- A non-text attachment was scrubbed... Name: vect.diff Type: text/x-diff Size: 12457 bytes Desc: not available URL: -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: sposv.f URL: -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: ssyev.f URL: From toon@moene.indiv.nluug.nl Mon Dec 17 22:45:00 2007 From: toon@moene.indiv.nluug.nl (Toon Moene) Date: Mon, 17 Dec 2007 22:45:00 -0000 Subject: HIRLAM and -ftree-loop-linear In-Reply-To: <4766E7B5.80404@moene.indiv.nluug.nl> References: <4766E7B5.80404@moene.indiv.nluug.nl> Message-ID: <4766E8DC.9090104@moene.indiv.nluug.nl> I wrote: > Dorit Nuzman wrote: > >> any chance you kept the dumps and can report which loops were not >> vectorized/recognized with -ftree-loop-linear (so we could see if these >> represent missed vectorization opportunities?) > > It's a bit much to send you everything, so I'll just send you the diff + > two routines (from blas/lapack) that had differences (one positive, one > negative). Sigh, dgeev.f is the positive one (i.e., -ftree-loop-linear vectorized more loops - attached). -- Toon Moene - e-mail: toon@moene.indiv.nluug.nl - phone: +31 346 214290 Saturnushof 14, 3738 XG Maartensdijk, The Netherlands At home: http://moene.indiv.nluug.nl/~toon/ GNU Fortran's path to Fortran 2003: http://gcc.gnu.org/wiki/Fortran2003 -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... 
Name: dgeev.f URL:

From gccadmin@gcc.gnu.org Tue Dec 18 00:06:00 2007
From: gccadmin@gcc.gnu.org (gccadmin@gcc.gnu.org)
Date: Tue, 18 Dec 2007 00:06:00 -0000
Subject: gcc-4.1-20071217 is now available
Message-ID: <20071217224450.8504.qmail@sourceware.org>

Snapshot gcc-4.1-20071217 is now available on
  ftp://gcc.gnu.org/pub/gcc/snapshots/4.1-20071217/
and on various mirrors, see http://gcc.gnu.org/mirrors.html for details.

This snapshot has been generated from the GCC 4.1 SVN branch
with the following options: svn://gcc.gnu.org/svn/gcc/branches/gcc-4_1-branch revision 131021

You'll find:

gcc-4.1-20071217.tar.bz2            Complete GCC (includes all of below)
gcc-core-4.1-20071217.tar.bz2       C front end and core compiler
gcc-ada-4.1-20071217.tar.bz2        Ada front end and runtime
gcc-fortran-4.1-20071217.tar.bz2    Fortran front end and runtime
gcc-g++-4.1-20071217.tar.bz2        C++ front end and runtime
gcc-java-4.1-20071217.tar.bz2       Java front end and runtime
gcc-objc-4.1-20071217.tar.bz2       Objective-C front end and runtime
gcc-testsuite-4.1-20071217.tar.bz2  The GCC testsuite

Diffs from 4.1-20071210 are available in the diffs/ subdirectory.

When a particular snapshot is ready for public consumption the LATEST-4.1 link is updated and a message is sent to the gcc list. Please do not use a snapshot before it has been announced that way.

From trevor_smigiel@playstation.sony.com Tue Dec 18 00:52:00 2007
From: trevor_smigiel@playstation.sony.com (trevor_smigiel@playstation.sony.com)
Date: Tue, 18 Dec 2007 00:52:00 -0000
Subject: __builtin_expect for indirect function calls
Message-ID: <20071218000552.GV3656@playstation.sony.com>

Hi,

I'm looking for comments on a possible GCC extension described below.

For the target I'm interested in, Cell SPU, taken branches are only predicted correctly by explicitly inserting a specific instruction (a hint instruction) that says "the branch at address A is branching to address B". This allows the processor to prefetch the instructions at B, potentially with no penalty.

For indirect function calls, the ideal case is we know the target soon enough at run-time that the hint instruction simply specifies the real target. Soon enough means about 18 cycles before the execution of the branch. I don't have any numbers as to how often this happens, but there are enough cases where it doesn't.

When we can't hint the real target, we want to hint the most common target. There are potentially clever ways for the compiler to do this automatically, but I'm most interested in giving the user some way to do it explicitly. One possibility is to have something similar to __builtin_expect, but for functions. For example, I propose:

  __builtin_expect_call (FP, PFP)

which returns the value of FP with the same type as FP, and tells the compiler that PFP is the expected target of FP. Trivial examples:

  typedef void (*fptr_t)(void);

  extern void foo(void);

  void
  call_fp (fptr_t fp)
  {
    /* Call the function pointed to by fp, but predict it as if it is
       calling foo() */
    __builtin_expect_call (fp, foo)();
  }

  void
  call_fp_predicted (fptr_t fp, fptr_t predicted)
  {
    /* same as above but the function we are calling doesn't have to be
       known at compile time */
    __builtin_expect_call (fp, predicted)();
  }

I believe I can add this just for the SPU target without affecting anything else, but it could be useful for other targets.

Are there any comments about the name, semantics, or usefulness of this extension?
Thanks, Trevor From aoliva@redhat.com Tue Dec 18 01:01:00 2007 From: aoliva@redhat.com (Alexandre Oliva) Date: Tue, 18 Dec 2007 01:01:00 -0000 Subject: Designs for better debug info in GCC In-Reply-To: <4766DF5C.1020802@google.com> (Diego Novillo's message of "Mon\, 17 Dec 2007 15\:43\:08 -0500") References: <4737BF2C.70408@codesourcery.com> <47388599.2040701@codesourcery.com> <4749DE66.1090602@codesourcery.com> <4756B02D.9010302@google.com> <4aca3dc20712151903r46c9eceane35edb92d08240ac@mail.gmail.com> <4aca3dc20712161712w1133fb96qd66be0e9a0bb1716@mail.gmail.com> <4766B8E5.60500@google.com> <4766DF5C.1020802@google.com> Message-ID: On Dec 17, 2007, Diego Novillo wrote: > On 12/17/07 15:28, Alexandre Oliva wrote: >>> You need to provide such a document now. >> >> Can't I instead provide it when it's ready? > Of course. Thanks, Now, since you're so interested in it and you've already read the various perspectives on the issue that I listed in my yesterday's e-mail to you, would you help me improve this document, by letting me know what you believe to be missing from the selected postings on design strategies, rationales and goals: http://gcc.gnu.org/ml/gcc/2007-11/msg00229.html (goals) http://gcc.gnu.org/ml/gcc-patches/2007-10/msg00160.html (initial plan) http://gcc.gnu.org/ml/gcc/2007-11/msg00261.html (detailed plan) http://gcc.gnu.org/ml/gcc/2007-11/msg00317.html (example) http://gcc.gnu.org/ml/gcc/2007-11/msg00590.html (more example) http://gcc.gnu.org/ml/gcc/2007-11/msg00176.html (design rationale) http://gcc.gnu.org/ml/gcc/2007-11/msg00177.html (clarification) I could then focus on these missing aspects too, in addition to the ones I already have, while designing the best form to present the ideas. 
Thanks in advance, -- Alexandre Oliva http://www.lsd.ic.unicamp.br/~oliva/ FSF Latin America Board Member http://www.fsfla.org/ Red Hat Compiler Engineer aoliva@{redhat.com, gcc.gnu.org} Free Software Evangelist oliva@{lsd.ic.unicamp.br, gnu.org} From dnovillo@google.com Tue Dec 18 01:14:00 2007 From: dnovillo@google.com (Diego Novillo) Date: Tue, 18 Dec 2007 01:14:00 -0000 Subject: Designs for better debug info in GCC In-Reply-To: References: <4737BF2C.70408@codesourcery.com> <47388599.2040701@codesourcery.com> <4749DE66.1090602@codesourcery.com> <4756B02D.9010302@google.com> <4aca3dc20712151903r46c9eceane35edb92d08240ac@mail.gmail.com> <4aca3dc20712161712w1133fb96qd66be0e9a0bb1716@mail.gmail.com> <4766B8E5.60500@google.com> <4766DF5C.1020802@google.com> Message-ID: <47671BF4.5050704@google.com> On 12/17/07 19:50, Alexandre Oliva wrote: > Now, since you're so interested in it and you've already read the > various perspectives on the issue that I listed in my yesterday's > e-mail to you, would you help me improve this document, by letting me > know what you believe to be missing from the selected postings on > design strategies, rationales and goals: No. I am not interested in organizing your thoughts for you. I am interested in reading a single, concise and well organized design document that you produce for all of us to understand what you want to do. Take your time. It doesn't need to be now. Diego. 
From aoliva@redhat.com Tue Dec 18 01:24:00 2007 From: aoliva@redhat.com (Alexandre Oliva) Date: Tue, 18 Dec 2007 01:24:00 -0000 Subject: Designs for better debug info in GCC In-Reply-To: <8DF14649-456C-40D6-94C5-DC3285EFD7C2@adacore.com> (Geert Bosch's message of "Mon\, 17 Dec 2007 00\:38\:04 -0500") References: <4737BF2C.70408@codesourcery.com> <47388599.2040701@codesourcery.com> <4749DE66.1090602@codesourcery.com> <4756B02D.9010302@google.com> <4aca3dc20712151903r46c9eceane35edb92d08240ac@mail.gmail.com> <4aca3dc20712161712w1133fb96qd66be0e9a0bb1716@mail.gmail.com> <20071217012735.GA9275@synopsys.com> <8DF14649-456C-40D6-94C5-DC3285EFD7C2@adacore.com> Message-ID: On Dec 17, 2007, Geert Bosch wrote: > We could conceptually have inspection points between each source > statement and declaration, which would roughly correspond to a > use of all memory and all source variables, wether in memory or > in registers. > These inspections points would be considered potentially trapping. Yes, I've considered something along these lines, but decided against it, for we can't afford for debug information to affect executable code generation in any way whatsoever, and we don't want to pessimize optimized code when compiling without -g just so that compiling with -g would get us the same code. > Also, since no user-visible state can be modified by speculatively > executed instructions such as loads, such instructions should not > be tagged with their original source location information. Line number information has a well-defined meaning: it ought to represent the source code line that best represents the source-code construct that ended up implemented using that instruction. To address what we have in mind, there's an additional annotation on top of line number information: the is_stmt flag. 
This is what we should use to tell debuggers what the best instruction is to set a breakpoint at a certain line number or so, and for debuggers to be able to step line by line more seamlessly.

--
Alexandre Oliva http://www.lsd.ic.unicamp.br/~oliva/
FSF Latin America Board Member http://www.fsfla.org/
Red Hat Compiler Engineer aoliva@{redhat.com, gcc.gnu.org}
Free Software Evangelist oliva@{lsd.ic.unicamp.br, gnu.org}

From Joe.Buck@synopsys.COM Tue Dec 18 02:02:00 2007
From: Joe.Buck@synopsys.COM (Joe Buck)
Date: Tue, 18 Dec 2007 02:02:00 -0000
Subject: Designs for better debug info in GCC
In-Reply-To:
References: <4749DE66.1090602@codesourcery.com> <4756B02D.9010302@google.com> <4aca3dc20712151903r46c9eceane35edb92d08240ac@mail.gmail.com> <4aca3dc20712161712w1133fb96qd66be0e9a0bb1716@mail.gmail.com> <20071217012735.GA9275@synopsys.com> <8DF14649-456C-40D6-94C5-DC3285EFD7C2@adacore.com>
Message-ID: <20071218012438.GD2908@synopsys.com>

On Mon, Dec 17, 2007 at 11:11:46PM -0200, Alexandre Oliva wrote:
> Line number information has a well-defined meaning: it ought to
> represent the source code line that best represents the source-code
> construct that ended up implemented using that instruction.

You implicitly assume that such a source code line exists. Consider something like

  int func(bool cond, int a, int b, int c)
  {
    int out;
    if (cond)
      out = a + b;
    else
      out = a + b + c;
    return out;
  }

The optimizer might produce something that structurally resembles

  out = a + b;
  if (!cond)
    out += c;
  return out;

If you set a breakpoint on the addition of a and b, it will trigger regardless of the value of cond. Furthermore, there isn't a place to put a breakpoint that will trigger only for the case where cond is true, as you can on unoptimized code.

So you need to choose between natural debugging and optimization.
From jadamcze@utas.edu.au Tue Dec 18 02:27:00 2007
From: jadamcze@utas.edu.au (Jonathan Adamczewski)
Date: Tue, 18 Dec 2007 02:27:00 -0000
Subject: __builtin_expect for indirect function calls
In-Reply-To: <20071218000552.GV3656@playstation.sony.com>
References: <20071218000552.GV3656@playstation.sony.com>
Message-ID: <47672A45.3090006@utas.edu.au>

trevor_smigiel@playstation.sony.com wrote:
> Are there any comments about the name, semantics, or usefulness of this
> extension?
>

Sounds very useful for SPU code. I look forward to trying it out.

Toying with the idea, the following seems like a potentially useful C++ form of the proposed extension:

  struct A {
    virtual void foo();
  };

  struct B : public A {
    virtual void foo();
  };

  A* a;
  ...
  __builtin_expect_call (a->foo, B::foo)();

jonathan.

From joey.ye@intel.com Tue Dec 18 04:25:00 2007
From: joey.ye@intel.com (Ye, Joey)
Date: Tue, 18 Dec 2007 04:25:00 -0000
Subject: A proposal to align GCC stack
Message-ID:

-- 0. MOTIVATION --

Some local variables (such as those of __m128 type or marked with an alignment attribute) require the stack to be aligned at a boundary larger than the default stack boundary. Current GCC partially supports this, with limitations. We are proposing a new design to fully solve the problem.

-- 1. CURRENT IMPLEMENTATION --

There are two ways current GCC supports bigger than default stack alignment. One is to make sure that the stack is aligned at the program entry point, and then ensure that for each non-leaf function, its frame size is aligned. This approach doesn't work when linking with libs or objects compiled by other psABI conforming compilers. Some problems are logged as PR 33721.

Another is to adjust stack alignment at the entry point of a function if it is marked with __attribute__ ((force_align_arg_pointer)) or -mstackrealign option is provided.
This method guarantees the alignment in most cases, but has the following problems and limitations:

* Only 16 bytes alignment is supported
* Adjusting stack alignment at each function prologue hurts performance unnecessarily, because not all functions need bigger alignment. In fact, commonly only those functions which have SSE variables defined locally (either declared by the user or compiler generated internal temporary variables) need corresponding alignment.
* Doesn't support x86_64 for the cases when required stack alignment is > 16 bytes
* Emits inefficient and complicated prologue/epilogue code to adjust stack alignment
* Doesn't work with nested functions
* Has a bug handling register parameters, which resulted in a cpu2006 failure. A patch is available as a workaround.

-- 2. NEW PROPOSAL: DESIGN --

Here, we propose a new design to fully support stack alignment while overcoming the above problems. The new design will

* Support arbitrary alignment value, including 4,8,16,32...
* Adjust function stack alignment only when necessary
* Be developed initially on i386 and x86_64, but be extensible to other platforms
* Emit more efficient prologue/epilogue code
* Coexist with special features like dynamic stack allocation (alloca), nested functions, register parameter passing, PIC code and tail call optimization
* Be able to debug and unwind stack

2.1 Support arbitrary alignment value

Different source code and optimizations require different stack alignment, as in the following table:

  Feature            Alignment (bytes)
  i386_ABI           4
  x86_64_ABI         16
  char               1
  short              2
  int                4
  long               4/8*
  long long          8
  __m64              8
  __m128             16
  float              4
  double             8
  long double        4/16*
  user specified     any power of 2

  *Note: 4 for i386, 8/16 for x86_64

The new design will support any alignment value in this table.

2.2 Adjust function stack alignment only when necessary

Current GCC defines the following macros related to stack alignment:

i. STACK_BOUNDARY in bits, which is enforced by hardware, 32 for i386 and 64 for x86_64. It is the minimum stack boundary. It is fixed.

ii. PREFERRED_STACK_BOUNDARY. It sets the stack alignment when calling a function. It may be set at command line and has no impact on stack alignment at function entry. This proposal requires PREFERRED >= STACK, and by default sets it to ABI_STACK_BOUNDARY.

This design will define a few more macros, or concepts not explicitly defined in code:

iii. ABI_STACK_BOUNDARY in bits, which is the stack boundary specified by psABI, 32 for i386 and 128 for x86_64. ABI_STACK_BOUNDARY >= STACK_BOUNDARY. It is fixed for a given psABI.

iv. LOCAL_STACK_BOUNDARY in bits. Each function stack has its own stack alignment requirement, which depends on the alignment of its stack variables: LOCAL_STACK_BOUNDARY = MAX (alignment of each effective stack variable).

v. INCOMING_STACK_BOUNDARY in bits, which is the stack boundary at function entry. If a function is marked with __attribute__ ((force_align_arg_pointer)) or -mstackrealign option is provided, INCOMING = STACK_BOUNDARY. Otherwise, INCOMING == MIN(ABI_STACK_BOUNDARY, PREFERRED_STACK_BOUNDARY), because a function can be called via psABI externally or called locally with PREFERRED_STACK_BOUNDARY.

vi. REQUIRED_STACK_ALIGNMENT in bits, which is the stack alignment required by local variables and by calling other functions. REQUIRED_STACK_ALIGNMENT == MAX(LOCAL_STACK_BOUNDARY,PREFERRED_STACK_BOUNDARY) in case of a non-leaf function. For a leaf function, REQUIRED_STACK_ALIGNMENT == LOCAL_STACK_BOUNDARY.

This proposal won't adjust stack when INCOMING_STACK_BOUNDARY >= REQUIRED_STACK_ALIGNMENT. Only when INCOMING_STACK_BOUNDARY < REQUIRED_STACK_ALIGNMENT, it will adjust stack to REQUIRED_STACK_ALIGNMENT at prologue.

2.3 Initial development on i386 and x86_64

We initially support i386 and x86_64. In this document we focus more on i386, because it is harder to implement there given its small register file. But all that we discuss can be easily applied to x86_64.
2.4 Emit more efficient prologue/epilogue

When a function needs to adjust stack alignment and has no dynamic stack
allocation, this design will generate the following example
prologue/epilogue code:

IA32 example Prologue:
        pushl   %ebp
        movl    %esp, %ebp
        andl    $-16, %esp
        subl    $4, %esp        ; is $-4 the local stack size?

Epilogue:
        movl    %ebp, %esp
        popl    %ebp
        ret

Locals will be addressed as esp + offset and parameters as ebp + offset.

Add x86_64 example here.

Thus BP points to the parameter frame and SP points to the local frame.

2.5 Coexist with special features

Stack alignment adjustment will coexist with various GCC features that
have special calling conventions and frame layout, such as dynamic stack
allocation (alloca), nested functions and parameter passing via registers
to local functions.

I386 hard register usage is the major obstacle to making the proposal
friendly to various GCC features. This design requires an additional hard
register in prologue/epilogue in case of dynamic stack allocation.
Because I386 PIC requires BX as GOT pointer and I386 may use AX, DX
and CX as parameter passing registers, there are limited candidates for
this proposal to choose. Current proposal suggests EDI, because it won't
conflict with i386 PIC or regparm.

X86_64 is much easier. This proposal just chooses RBX.

2.5.1 When stack alignment adjustment comes together with alloca, the
following example prologue/epilogue will be emitted:

Prologue:
        pushl   %edi            // Save callee save reg edi
        leal    8(%esp), %edi   // Save address of parameter frame
        andl    $-16, %esp      // Align local stack

        // Reserve two stack slots and save return address
        // and previous frame pointer into them. By
        // pointing new ebp to them, we build a pseudo
        // stack for unwinding.
        pushl   -4(%edi)        // Save return address
        pushl   %ebp            // Save old ebp
        movl    %esp, %ebp      // Point ebp to pseudo frame start
        subl    $24, %esp       // Adjust local frame size
        movl    %edi, vreg1

Epilogue:
        movl    vreg1, %edi
        movl    %ebp, %esp      // Restore esp to pseudo frame start
        popl    %ebp
        leal    -8(%edi), %esp  // Restore esp to real frame start
        popl    %edi            // Restore edi
        ret

Locals will be addressed as ebp - offset, parameters as vreg1 + offset.

Here DI is used to set up the virtual parameter frame pointer, BP points
to the local frame and SP points to the dynamic allocation frame.

2.5.2 Nested functions will automatically work because they use CX as
static pointer, which won't conflict with any registers used by stack
alignment adjustment, even when nested functions are called via function
pointer and a function stub on stack.

2.5.3 GCC may optimize to use registers to pass parameters. At most AX,
DX and CX will be used. Such optimization won't conflict with stack
alignment adjustment, thus it should automatically work.

2.5.4 I386 PIC uses EBX as GOT pointer. This design works well under i386
PIC. For example:

i686 Prologue:
        pushl   %edi
        leal    8(%esp), %edi
        andl    $-16, %esp
        pushl   -4(%edi)
        pushl   %ebp
        movl    %esp, %ebp
        subl    $24, %esp
        call    .L1
.L1:
        popl    %ebx
        movl    %edi, vreg1

Body:   // code for alloca
        movl    (vreg1), %eax
        subl    %eax, %esp
        andl    $-16, %esp
        movl    %esp, %eax

i686 Epilogue:
        movl    %ebp, %esp
        popl    %ebp
        leal    -8(%edi), %esp
        popl    %edi
        ret

Locals will be addressed as ebp - offset, parameters as vreg1 + offset,
ebx has the GOT pointer.

2.6 Debug and unwind will work since DWARF2 has the flexibility to define
different frame pointers.

2.7 Some intrinsics rely on stack layout and need to be handled
accordingly. They are __builtin_return_address and
__builtin_frame_address. This proposal will set up a pseudo frame slot to
help the unwinder find the return address and parent frame address by
emitting the following prologue code after adjusting alignment:

        pushl   -4(%edi)
        pushl   %ebp

-- 3. NEW PROPOSAL: IMPLEMENTATION --

The proposed implementation can be partitioned into the following
subtasks.

* Alignment requirement collection
* Frames addressing
* Alignment code generation
* Debug and unwind information

3.1 Collect alignment requirement

Collect each function's alignment requirement from the frontend or from
optimization passes like the vectorizer, and inform the backend. Current
GCC uses cfun->stack_alignment_needed to store MIN(largest stack variable
alignment, PREFERRED_STACK_BOUNDARY). We will reuse this field and define
its value only as "largest stack variable alignment".

3.2 Frames addressing

Add parameter frame, local frame, static frame and dynamic frame with
appropriate pointers, either hard registers or virtual registers. The
backend will customize the CAN_ELIMINATE hook to assign hard registers to
the corresponding virtual registers.

3.3 Alignment code generation

Emit prologue/epilogue code to guarantee correct stack alignment based on
each function's alignment requirement collected previously. Modification
should happen in ix86_expand_prologue and ix86_expand_epilogue. Code to
be emitted can follow the above design in a straightforward manner.

3.4 Debug information

Emit debug and unwind information for aligned stacks. This also happens
in ix86_expand_prologue and ix86_expand_epilogue, corresponding to the
prologue/epilogue code emitted.

4. Code Example

Simple function:

void foo()
{
    volatile int local;
    ...
}

i686 Prologue:
        pushl   %ebp
        movl    %esp, %ebp
        subl    $4, %esp        // Adjust local frame size by 4

i686 Epilogue:
        movl    %ebp, %esp
        popl    %ebp
        ret

x86_64 Prologue:
        pushq   %rbp
        movq    %rsp, %rbp
        subq    $16, %rsp

x86_64 Epilogue:
        movq    %rbp, %rsp
        popq    %rbp
        ret

Pure 16 bytes align:

void foo()
{
    volatile __m128 m = _mm_set_ps1(0.f);
}

i686 Prologue:
        pushl   %ebp
        movl    %esp, %ebp
        andl    $-16, %esp
        subl    $16, %esp       // this is space for m, 16 byte aligned

i686 Epilogue:
        movl    %ebp, %esp
        popl    %ebp
        ret

x86_64 Prologue:
        pushq   %rbp
        movq    %rsp, %rbp
        andq    $-16, %rsp
        subq    $16, %rsp

x86_64 Epilogue:
        movq    %rbp, %rsp
        popq    %rbp
        ret

16 bytes align with alloca:

void foo(int size)
{
    char * ptr=alloca(size);
    volatile int __attribute__((aligned(32))) m = 0;
    ...
}

i686 Prologue:
        pushl   %edi
        leal    8(%esp), %edi
        andl    $-32, %esp
        pushl   -4(%edi)
        pushl   %ebp
        movl    %esp, %ebp
        subl    $24, %esp

Body:   // code for alloca
        movl    %edi, vreg1
        movl    (vreg1), %eax
        subl    %eax, %esp
        andl    $-16, %esp
        movl    %esp, %eax

i686 Epilogue:
        movl    %ebp, %esp
        popl    %ebp
        leal    -8(%edi), %esp
        popl    %edi
        ret

void foo(int dummy1, int dummy2, int dummy3, int dummy4, int dummy5,
         int dummy6, int size)
{
    char * ptr=alloca(size);
    volatile int __attribute__((aligned(32))) m = 0;
    ...
}

x86_64 Prologue:
        pushq   %rbx
        leaq    16(%rsp), %rbx
        andq    $-32, %rsp
        pushq   -8(%rbx)
        pushq   %rbp
        movq    %rsp, %rbp
        subq    $24, %rsp

Body:
        movq    %rbx, vreg1
        movl    (vreg1), %eax
        subq    %rax, %rsp
        andq    $-16, %rsp
        movq    %rsp, %rax

x86_64 Epilogue:
        movq    %rbp, %rsp
        popq    %rbp
        leaq    -16(%rbx), %rsp
        popq    %rbx
        ret

m128 and PIC

int g_i;
void foo()
{
    volatile __m128 m = _mm_set_ps1(0.f);
    g_i = 123;
    ...
}

i686 Prologue:
        pushl   %ebp
        movl    %esp, %ebp
        andl    $-16, %esp
        pushl   %ebx
        subl    $16, %esp
        call    .L1
.L1:
        popl    %ebx
        ...

i686 Epilogue:
        addl    $16, %esp
        popl    %ebx
        movl    %ebp, %esp
        popl    %ebp
        ret

m128 + alloca + PIC

void foo(int size)
{
    char * ptr=alloca(size);
    volatile __m128 m = _mm_set_ps1(0.f);
    ...
}

i686 Prologue:
        pushl   %edi
        leal    8(%esp), %edi
        andl    $-16, %esp
        pushl   -4(%edi)
        pushl   %ebp
        movl    %esp, %ebp
        subl    $24, %esp
        call    .L1
.L1:
        popl    %ebx

Body:
        movl    %edi, vreg1
        movl    (vreg1), %eax
        subl    %eax, %esp
        andl    $-16, %esp
        movl    %esp, %eax

i686 Epilogue:
        movl    %ebp, %esp
        popl    %ebp
        leal    -8(%edi), %esp
        popl    %edi
        ret

m128 + alloca + PIC + library call

void foo(int size)
{
    char * ptr=alloca(size);
    volatile __m128 m = _mm_set_ps1(0.f);
    printf("Hello\n");
    ...
}

i686 Prologue:
        pushl   %edi
        leal    8(%esp), %edi
        andl    $-16, %esp
        pushl   -4(%edi)
        pushl   %ebp
        movl    %esp, %ebp
        subl    $24, %esp
        call    .L1
.L1:
        popl    %ebx

i686 Body:
        movl    %edi, vreg1
        movl    (vreg1), %eax
        subl    %eax, %esp
        andl    $-16, %esp
        movl    %esp, %eax

Body:
        call    printf@PLT

i686 Epilogue:
        movl    %ebp, %esp
        popl    %ebp
        leal    -8(%edi), %esp
        popl    %edi
        ret

m128 and nested function and PIC

void foo()
{
    void bar(int arg1, int arg2)
    {
        volatile __m128 m = _mm_set_ps1(0.f);
        ...
    }
    bar(1,2);
}

i686:
foo:
        ...
        movl    %ebp, %ecx
        call    bar@PLT
        ...
bar:
        pushl   %edi
        leal    8(%esp), %edi
        andl    $-16, %esp
        pushl   -4(%edi)
        pushl   %ebp
        movl    %esp, %ebp
        subl    $24, %esp
        call    .L1
.L1:
        popl    %ebx
        movl    %edi, vreg1
        movl    (vreg1), %eax
        subl    %eax, %esp
        andl    $-16, %esp
        movl    %esp, %eax
        ...
        movl    %ebp, %esp
        popl    %ebp
        leal    -8(%edi), %esp
        popl    %edi
        ret

m128, dynamic stack alloc and register parameter function call

static void bar(int arg1, int arg2, int arg3)
{
    char * ptr=alloca(size);
    volatile __m128 m = _mm_set_ps1(0.f);
    ...
}

void foo()
{
    bar(1,2,3);
}

i686
foo:
        movl    $1, %eax
        movl    $2, %edx
        movl    $3, %ecx
        call    bar
        ...
bar:
        pushl   %edi
        leal    8(%esp), %edi
        andl    $-16, %esp
        pushl   -4(%edi)
        pushl   %ebp
        movl    %esp, %ebp
        subl    $24, %esp
        movl    %edi, vreg1
        movl    (vreg1), %eax
        subl    %eax, %esp
        andl    $-16, %esp
        movl    %esp, %eax
        ...
        movl    %ebp, %esp
        popl    %ebp
        leal    -8(%edi), %esp
        popl    %edi
        ret

Thanks - Joey

From rridge@csclub.uwaterloo.ca Tue Dec 18 04:29:00 2007
From: rridge@csclub.uwaterloo.ca (Ross Ridge)
Date: Tue, 18 Dec 2007 04:29:00 -0000
Subject: A proposal to align GCC stack
Message-ID: <20071218042535.3FB1F73CC4@caffeine.csclub.uwaterloo.ca>

Ye, Joey writes:
>i. STACK_BOUNDARY in bits, which is enforced by hardware, 32 for i386
>and 64 for x86_64. It is the minimum stack boundary. It is fixed.

Strictly speaking by the above definition it would be 8 for i386.
The hardware doesn't force the stack to be 32-bit aligned, it just
performs poorly if it isn't.

>v. INCOMING_STACK_BOUNDARY in bits, which is the stack boundary
>at function entry. If a function is marked with __attribute__
>((force_align_arg_pointer)) or -mstackrealign option is provided,
>INCOMING = STACK_BOUNDARY. Otherwise, INCOMING == MIN(ABI_STACK_BOUNDARY,
>PREFERRED_STACK_BOUNDARY) because a function can be called via psABI
>externally or called locally with PREFERRED_STACK_BOUNDARY.

This section doesn't make sense to me. The force_align_arg_pointer
attribute and -mstackrealign assume that the ABI is being
followed, while the -fpreferred-stack-boundary option effectively
changes the ABI. According to your definitions, I would think
that INCOMING should be ABI_STACK_BOUNDARY in the first case,
and MAX(ABI_STACK_BOUNDARY, PREFERRED_STACK_BOUNDARY) in the second.
(Or just PREFERRED_STACK_BOUNDARY because a boundary less than the ABI's
should be rejected during command line processing.)

>vi. REQUIRED_STACK_ALIGNMENT in bits, which is stack alignment required
>by local variables and calling other function. REQUIRED_STACK_ALIGNMENT
>== MAX(LOCAL_STACK_BOUNDARY,PREFERRED_STACK_BOUNDARY) in case of a
>non-leaf function. For a leaf function, REQUIRED_STACK_ALIGNMENT ==
>LOCAL_STACK_BOUNDARY.

Hmm... I think you should define STACK_BOUNDARY as the minimum
alignment that the ABI requires the stack pointer to keep at all times.
ABI_STACK_BOUNDARY should be defined as the stack alignment the
ABI requires at function entry. In that case a leaf function's
REQUIRED_STACK_ALIGNMENT should be MAX(LOCAL_STACK_BOUNDARY,
STACK_BOUNDARY).

>Because I386 PIC requires BX as GOT pointer and I386 may use AX, DX
>and CX as parameter passing registers, there are limited candidates for
>this proposal to choose. Current proposal suggests EDI, because it won't
>conflict with i386 PIC or regparm.

Could you pick a call-clobbered register in cases where one is available?

>// Reserve two stack slots and save return address
>// and previous frame pointer into them. By
>// pointing new ebp to them, we build a pseudo
>// stack for unwinding

Hmmm... I don't know much about the DWARF unwind information, but
couldn't it handle this case without creating the "pseudo frame"?
Or at least be extended so it could?

					Ross Ridge

From aoliva@redhat.com Tue Dec 18 04:40:00 2007
From: aoliva@redhat.com (Alexandre Oliva)
Date: Tue, 18 Dec 2007 04:40:00 -0000
Subject: Designs for better debug info in GCC
In-Reply-To: <20071218012438.GD2908@synopsys.com> (Joe Buck's message of "Mon\, 17 Dec 2007 17\:24\:38 -0800")
References: <4749DE66.1090602@codesourcery.com> <4756B02D.9010302@google.com> <4aca3dc20712151903r46c9eceane35edb92d08240ac@mail.gmail.com> <4aca3dc20712161712w1133fb96qd66be0e9a0bb1716@mail.gmail.com> <20071217012735.GA9275@synopsys.com> <8DF14649-456C-40D6-94C5-DC3285EFD7C2@adacore.com> <20071218012438.GD2908@synopsys.com>
Message-ID:

On Dec 17, 2007, Joe Buck wrote:

> On Mon, Dec 17, 2007 at 11:11:46PM -0200, Alexandre Oliva wrote:
>> Line number information has a well-defined meaning: it ought to
>> represent the source code line that best represents the source-code
>> construct that ended up implemented using that instruction.

> You implicitly assume that such a source code line exists.

Actually, no.
I'm not sure where you got that impression, and how you came to the
conclusion that I'd assign line numbers the way you have.

To me, when you hoist something that is present in both blocks of a
conditional, it probably makes more sense to give it the line number
of the conditional, rather than that of either block. But I won't
pretend to have thought very hard about this particular issue. For
the time being, I'm focusing my efforts on local variable locations.

Anyhow, very clearly you don't want to mark such hoisted-out
computation as is_stmt. This should eliminate at least the solvable
problem you're worried about.

> out = a + b;
> if (!cond)
>   out += c;
> return out;

> Furthermore, there isn't a place to put a breakpoint that will
> trigger only for the case where cond is true, as you can on
> unoptimized code.

Yep. Sometimes code just is optimized away. Can't stop that without
harming optimizations.

If dwarf line number programs were smarter, we could perhaps encode
multiple lines for the same instruction, along with conditions to tell
when the instruction applies to such or such lines, and even more
fancy stuff like that. But line number programs don't let us express
this in Dwarf3.
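
To make the hoisting case concrete, here is a minimal C sketch. The
unoptimized source is not shown in this thread, so the first function is
only an assumption about what it may have looked like (a + b computed in
both arms of the conditional); both function names are hypothetical.

```c
/* Assumed unoptimized form: each arm has its own source line, so a
   breakpoint can be placed on the cond-is-true arm alone. */
int unoptimized(int cond, int a, int b, int c)
{
    int out;
    if (cond)
        out = a + b;
    else
        out = a + b + c;
    return out;
}

/* Form after hoisting, matching the quoted snippet: the shared a + b no
   longer belongs to either arm, and the cond-is-true arm has no code of
   its own left to stop on. */
int hoisted(int cond, int a, int b, int c)
{
    int out = a + b;    /* candidate for the conditional's line number */
    if (!cond)
        out += c;
    return out;
}
```

Both functions compute the same result for every input that does not
overflow; only the mapping from instructions to source lines differs.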
-- Alexandre Oliva http://www.lsd.ic.unicamp.br/~oliva/ FSF Latin America Board Member http://www.fsfla.org/ Red Hat Compiler Engineer aoliva@{redhat.com, gcc.gnu.org} Free Software Evangelist oliva@{lsd.ic.unicamp.br, gnu.org} From aoliva@redhat.com Tue Dec 18 05:17:00 2007 From: aoliva@redhat.com (Alexandre Oliva) Date: Tue, 18 Dec 2007 05:17:00 -0000 Subject: Designs for better debug info in GCC In-Reply-To: <47671BF4.5050704@google.com> (Diego Novillo's message of "Mon\, 17 Dec 2007 20\:01\:40 -0500") References: <4737BF2C.70408@codesourcery.com> <47388599.2040701@codesourcery.com> <4749DE66.1090602@codesourcery.com> <4756B02D.9010302@google.com> <4aca3dc20712151903r46c9eceane35edb92d08240ac@mail.gmail.com> <4aca3dc20712161712w1133fb96qd66be0e9a0bb1716@mail.gmail.com> <4766B8E5.60500@google.com> <4766DF5C.1020802@google.com> <47671BF4.5050704@google.com> Message-ID: On Dec 17, 2007, Diego Novillo wrote: > On 12/17/07 19:50, Alexandre Oliva wrote: >> Now, since you're so interested in it and you've already read the >> various perspectives on the issue that I listed in my yesterday's >> e-mail to you, would you help me improve this document, by letting me >> know what you believe to be missing from the selected postings on >> design strategies, rationales and goals: > No. I am not interested in organizing your thoughts for you. Wow, nice shot! So tell me, what part of what you've read in the selected bibliography seemed not organized for you? Maybe that's what I have to work on first. > I am interested in reading a single, concise and well organized design > document that you produce for all of us to understand what you want to > do. You got that already, except now I'm no longer sure you've actually read it. Have you? You got the goals. You got the way I intend to get there, in two levels of detail. You got examples that show why the goals can't be achieved in other simpler ways. You got various justifications for the representation I've chosen. 
Would reformatting these and stamping a title on top make it worthy of your interest? I really don't see what else you might want, and if the above isn't enough, then my rephrasing it all into a single document still wouldn't be enough. I'd be just wasting my time, and yours. So, please do tell me, what is it that you're still missing? Note that I can't promise to deliver, but I can't possibly give you what you want unless you help me figure out what it is. -- Alexandre Oliva http://www.lsd.ic.unicamp.br/~oliva/ FSF Latin America Board Member http://www.fsfla.org/ Red Hat Compiler Engineer aoliva@{redhat.com, gcc.gnu.org} Free Software Evangelist oliva@{lsd.ic.unicamp.br, gnu.org} From hjl@lucon.org Tue Dec 18 06:15:00 2007 From: hjl@lucon.org (H.J. Lu) Date: Tue, 18 Dec 2007 06:15:00 -0000 Subject: A proposal to align GCC stack In-Reply-To: <20071218042535.3FB1F73CC4@caffeine.csclub.uwaterloo.ca> References: <20071218042535.3FB1F73CC4@caffeine.csclub.uwaterloo.ca> Message-ID: <20071218051711.GA10166@lucon.org> On Mon, Dec 17, 2007 at 11:25:35PM -0500, Ross Ridge wrote: > Ye, Joey writes: > >i. STACK_BOUNDARY in bits, which is enforced by hardware, 32 for i386 > >and 64 for x86_64. It is the minimum stack boundary. It is fixed. > > Strictly speaking by the above definition it would be 8 for i386. > The hardware doesn't force the stack to be 32-bit aligned, it just > performs poorly if it isn't. We can change the wording. > > >v. INCOMING_STACK_BOUNDARY in bits, which is the stack boundary > >at function entry. If a function is marked with __attribute__ > >((force_align_arg_pointer)) or -mstackrealign option is provided, > >INCOMING = STACK_BOUNDARY. Otherwise, INCOMING == MIN(ABI_STACK_BOUNDARY, > >PREFERRED_STACK_BOUNDARY) because a function can be called via psABI > >externally or called locally with PREFERRED_STACK_BOUNDARY. > > This section doesn't make sense to me. 
The force_align_arg_pointer > attribute and -mstackrealign assume that the ABI is being > followed, while the -fpreferred-stack-boundary option effectively According to Apple engineer who implemented the -mstackrealign, on MacOS/ia32, psABI is 16byte, but -mstackrealign will assume 4byte, which is STACK_BOUNDARY. > changes the ABI. According your defintions, I would think > that INCOMING should be ABI_STACK_BOUNDARY in the first case, > and MAX(ABI_STACK_BOUNDARY, PREFERRED_STACK_BOUNDARY) in the second. That isn't true since some .o files may not be compiled with -fpreferred-stack-boundary or with a different value of -fpreferred-stack-boundary. > (Or just PREFERRED_STACK_BOUNDARY because a boundary less than the ABI's > should be rejected during command line processing.) On x86-64, ABI_STACK_BOUNDARY is 16byte, but the Linux kernel may want to use 8 byte for PREFERRED_STACK_BOUNDARY. > > >vi. REQUIRED_STACK_ALIGNMENT in bits, which is stack alignment required > >by local variables and calling other function. REQUIRED_STACK_ALIGNMENT > >== MAX(LOCAL_STACK_BOUNDARY,PREFERRED_STACK_BOUNDARY) in case of a > >non-leaf function. For a leaf function, REQUIRED_STACK_ALIGNMENT == > >LOCAL_STACK_BOUNDARY. > > Hmm... I think you should define STACK_BOUNDARY as the minimum > alignment that ABI requires the stack pointer to keep at all times. > ABI_STACK_BOUNDARY should be defined as the stack alignment the > ABI requires at function entry. In that case a leaf function's > REQUIRED_STACK_ALIGMENT should be MAX(LOCAL_STACK_BOUNDARY, > STACK_BOUNDARY). That is true since if the only local variable is char, LOCAL_STACK_BOUNDARY will be 1. But we want the stack to be aligned at STACK_BOUNDARY. We will update our proposal. > > >Because I386 PIC requires BX as GOT pointer and I386 may use AX, DX > >and CX as parameter passing registers, there are limited candidates for > >this proposal to choose. Current proposal suggests EDI, because it won't > >conflict with i386 PIC or regparm. 
> > Could you pick a call-clobbered register in cases where one is availale? Joey, Xuepeng, is that doable? > > >// Reserve two stack slots and save return address > >// and previous frame pointer into them. By > >// pointing new ebp to them, we build a pseudo > >// stack for unwinding > > Hmmm... I don't know much about the DWARF unwind information, but > couldn't it handle this case without creating the "pseudo frame"? > Or at least be extended so it could? Joey, Xuepeng, what do you think? H.J. From dewar@adacore.com Tue Dec 18 06:16:00 2007 From: dewar@adacore.com (Robert Dewar) Date: Tue, 18 Dec 2007 06:16:00 -0000 Subject: Designs for better debug info in GCC In-Reply-To: References: <4737BF2C.70408@codesourcery.com> <47388599.2040701@codesourcery.com> <4749DE66.1090602@codesourcery.com> <4756B02D.9010302@google.com> <4aca3dc20712151903r46c9eceane35edb92d08240ac@mail.gmail.com> <4aca3dc20712161712w1133fb96qd66be0e9a0bb1716@mail.gmail.com> <20071217012735.GA9275@synopsys.com> <8DF14649-456C-40D6-94C5-DC3285EFD7C2@adacore.com> Message-ID: <4767656C.3000608@adacore.com> Alexandre Oliva wrote: > Yes, I've considered something along these lines, but decided against > it, for we can't afford for debug information to affect executable > code generation in any way whatsoever, and we don't want to pessimize > optimized code when compiling without -g just so that compiling with > -g would get us the same code. I disagree, I think it would be fine to degrade -O1 slightly to achieve full debuggability, and of course -g cannot affect the generated code. If indeed a) it is possible to get perfect debuggability without any pessimization b) that includes unexpected jumping around c) everyone agrees on how to achieve a) and b) d) this is implemented then fine, but in the absence of these conditions, if we need to pessimize -O1 code slightly to achieve this, that's OK by me. If it really worries people, introduce a -Og that achieves this. 
In my experience people use -O1 not because they are very performance sensitive (those folk use -O2), but because -O0 is so horrible that they need something better than that for production delivery. From dewar@adacore.com Tue Dec 18 07:45:00 2007 From: dewar@adacore.com (Robert Dewar) Date: Tue, 18 Dec 2007 07:45:00 -0000 Subject: Designs for better debug info in GCC In-Reply-To: References: <4749DE66.1090602@codesourcery.com> <4756B02D.9010302@google.com> <4aca3dc20712151903r46c9eceane35edb92d08240ac@mail.gmail.com> <4aca3dc20712161712w1133fb96qd66be0e9a0bb1716@mail.gmail.com> <20071217012735.GA9275@synopsys.com> <8DF14649-456C-40D6-94C5-DC3285EFD7C2@adacore.com> <20071218012438.GD2908@synopsys.com> Message-ID: <476765C7.5050106@adacore.com> Alexandre Oliva wrote: > Yep. Sometimes code just is optimized away. Can't stop that without > harming optimizations. OK, so you are agreeing that good debuggability is impossible with all the optimizations in place, so once again, let's have an optimization level that optimizes as far as possible without harming debuggability. > > If dwarf line number programs were smarter, we could perhaps encode > multiple lines for the same instruction, along with conditions to tell > when the instruction applies to such or such lines, and even more > fancy stuff like that. But line number programs don't let us express > this in Dwarf3. So, that's not an option. From joey.ye@intel.com Tue Dec 18 07:50:00 2007 From: joey.ye@intel.com (Ye, Joey) Date: Tue, 18 Dec 2007 07:50:00 -0000 Subject: A proposal to align GCC stack In-Reply-To: <20071218051711.GA10166@lucon.org> References: <20071218042535.3FB1F73CC4@caffeine.csclub.uwaterloo.ca> <20071218051711.GA10166@lucon.org> Message-ID: Ross, HJ, > > >Because I386 PIC requires BX as GOT pointer and I386 may use AX, DX > >and CX as parameter passing registers, there are limited candidates for > >this proposal to choose.
Current proposal suggests EDI, because it won't > >conflict with i386 PIC or regparm. > > Could you pick a call-clobbered register in cases where one is available? I think it is doable. In the Apple engineer's current code to support -mstackrealign, hard register ECX is used. We need to add code to find which caller-saved register is not used to pass parameters. If none of them is available, we still have to use a callee-saved register like EDI. > > >// Reserve two stack slots and save return address > >// and previous frame pointer into them. By > >// pointing new ebp to them, we build a pseudo > >// stack for unwinding > > Hmmm... I don't know much about the DWARF unwind information, but > couldn't it handle this case without creating the "pseudo frame"? > Or at least be extended so it could? I haven't spent time investigating it yet. I agree it would be much more beautiful without the "pseudo frame". I will be happy if a solution can be found or suggested here. But I doubt it is a worthwhile effort. Remember that the "pseudo frame" is generated only when stack adjustment and alloca are both present. That may not be common enough to impact performance. -----Original Message----- From: gcc-owner@gcc.gnu.org [mailto:gcc-owner@gcc.gnu.org] On Behalf Of H.J. Lu Sent: December 18, 2007 13:17 To: Ross Ridge Cc: gcc@gcc.gnu.org Subject: Re: A proposal to align GCC stack On Mon, Dec 17, 2007 at 11:25:35PM -0500, Ross Ridge wrote: > Ye, Joey writes: > >i. STACK_BOUNDARY in bits, which is enforced by hardware, 32 for i386 > >and 64 for x86_64. It is the minimum stack boundary. It is fixed. > > Strictly speaking by the above definition it would be 8 for i386. > The hardware doesn't force the stack to be 32-bit aligned, it just > performs poorly if it isn't. We can change the wording. > > >v. INCOMING_STACK_BOUNDARY in bits, which is the stack boundary > >at function entry.
If a function is marked with __attribute__ > >((force_align_arg_pointer)) or -mstackrealign option is provided, > >INCOMING = STACK_BOUNDARY. Otherwise, INCOMING == MIN(ABI_STACK_BOUNDARY, > >PREFERRED_STACK_BOUNDARY) because a function can be called via psABI > >externally or called locally with PREFERRED_STACK_BOUNDARY. > > This section doesn't make sense to me. The force_align_arg_pointer > attribute and -mstackrealign assume that the ABI is being > followed, while the -fpreferred-stack-boundary option effectively According to Apple engineer who implemented the -mstackrealign, on MacOS/ia32, psABI is 16byte, but -mstackrealign will assume 4byte, which is STACK_BOUNDARY. > changes the ABI. According your defintions, I would think > that INCOMING should be ABI_STACK_BOUNDARY in the first case, > and MAX(ABI_STACK_BOUNDARY, PREFERRED_STACK_BOUNDARY) in the second. That isn't true since some .o files may not be compiled with -fpreferred-stack-boundary or with a different value of -fpreferred-stack-boundary. > (Or just PREFERRED_STACK_BOUNDARY because a boundary less than the ABI's > should be rejected during command line processing.) On x86-64, ABI_STACK_BOUNDARY is 16byte, but the Linux kernel may want to use 8 byte for PREFERRED_STACK_BOUNDARY. > > >vi. REQUIRED_STACK_ALIGNMENT in bits, which is stack alignment required > >by local variables and calling other function. REQUIRED_STACK_ALIGNMENT > >== MAX(LOCAL_STACK_BOUNDARY,PREFERRED_STACK_BOUNDARY) in case of a > >non-leaf function. For a leaf function, REQUIRED_STACK_ALIGNMENT == > >LOCAL_STACK_BOUNDARY. > > Hmm... I think you should define STACK_BOUNDARY as the minimum > alignment that ABI requires the stack pointer to keep at all times. > ABI_STACK_BOUNDARY should be defined as the stack alignment the > ABI requires at function entry. In that case a leaf function's > REQUIRED_STACK_ALIGMENT should be MAX(LOCAL_STACK_BOUNDARY, > STACK_BOUNDARY). 
That is true since if the only local variable is char, LOCAL_STACK_BOUNDARY will be 1. But we want the stack to be aligned at STACK_BOUNDARY. We will update our proposal. H.J. From aoliva@redhat.com Tue Dec 18 07:56:00 2007 From: aoliva@redhat.com (Alexandre Oliva) Date: Tue, 18 Dec 2007 07:56:00 -0000 Subject: Designs for better debug info in GCC In-Reply-To: <476765C7.5050106@adacore.com> (Robert Dewar's message of "Tue\, 18 Dec 2007 01\:16\:39 -0500") References: <4749DE66.1090602@codesourcery.com> <4756B02D.9010302@google.com> <4aca3dc20712151903r46c9eceane35edb92d08240ac@mail.gmail.com> <4aca3dc20712161712w1133fb96qd66be0e9a0bb1716@mail.gmail.com> <20071217012735.GA9275@synopsys.com> <8DF14649-456C-40D6-94C5-DC3285EFD7C2@adacore.com> <20071218012438.GD2908@synopsys.com> <476765C7.5050106@adacore.com> Message-ID: On Dec 18, 2007, Robert Dewar wrote: > Alexandre Oliva wrote: >> Yep. Sometimes code just is optimized away. Can't stop that without >> harming optimizations. > OK, so you are agreeing that good debuggability is impossible > with all the optimizations in place, so once again, let's have > an optimziation level that optimizes as far as possible without > harming debuggability. I don't oppose such an optimization level, even though I don't know that we agree on what "good debuggability" stands for. It's just that changing optimizations is precisely *against* the goals of my current project. So, don't expect significant efforts to this end from me at this time. >> If dwarf line number programs were smarter, we could perhaps encode >> multiple lines for the same instruction, along with conditions to tell >> when the instruction applies to such or such lines, and even more >> fancy stuff like that. But line number programs don't let us express >> this in Dwarf3. > So, that's not an option. Yup. Best we can do right now is to emit the condition line number. 
-- Alexandre Oliva http://www.lsd.ic.unicamp.br/~oliva/ FSF Latin America Board Member http://www.fsfla.org/ Red Hat Compiler Engineer aoliva@{redhat.com, gcc.gnu.org} Free Software Evangelist oliva@{lsd.ic.unicamp.br, gnu.org} From kai@desktop.khms.westfalen.de Tue Dec 18 08:06:00 2007 From: kai@desktop.khms.westfalen.de (Kai Henningsen) Date: Tue, 18 Dec 2007 08:06:00 -0000 Subject: Designs for better debug info in GCC In-Reply-To: References: <4aca3dc20712151903r46c9eceane35edb92d08240ac@mail.gmail.com> <4aca3dc20712161712w1133fb96qd66be0e9a0bb1716@mail.gmail.com> <4766B8E5.60500@google.com> <4766DF5C.1020802@google.com> <47671BF4.5050704@google.com> Message-ID: <20071218075311.GA9101@desktop.khms.westfalen.de> On Tue, Dec 18, 2007 at 02:38:31AM -0200, Alexandre Oliva wrote: > Would reformatting these and stamping a title on top make it worthy of > your interest? Actually, I think that *would* help (though, of course, it's impossible to predict if it would help *enough*). I've noticed before (though this thread is a particularly extreme example) that GCC developers seem no more immune than other people, from being able to ignore what's in a mail message (or news article) they're replying to, even up to ignoring the carefully-selected part they're quoting. I don't claim to understand it (nor to be completely immune to it myself), but I'm no longer surprised by it. Disappointed, but not surprised. Anyway, the point is that this seems much rarer when the subject is *not* in the inbox or a newsgroup. For whatever reason, people apply their reading skills differently in different situations. So, my advice would be: 1. Wait a while, so people have time to calm down. 2. Reformat and reorganize the stuff. 3. Put it in an obviously different format - say, give a link to a PDF, instead of putting it in a mail to this list. 
Oh, and it probably wouldn't hurt to give a short summary of what you did to the various optimizers, including mentioning "no change", *after* you know that that actually works. (For a work in progress, people seem to often disbelieve such claims, however well justified ... at least, if they're already looking hard for arguments against it, however spurious.) And no, I have no idea why this particular discussion degenerated so badly, and similar others didn't. Your style of argumentation may not have been perfect, but the same can be said for many other people here, and it doesn't always seem to lead to a meltdown. Maybe it depends on unpredictable factors like the mood people are in when they go reading their mail. From aoliva@redhat.com Tue Dec 18 08:09:00 2007 From: aoliva@redhat.com (Alexandre Oliva) Date: Tue, 18 Dec 2007 08:09:00 -0000 Subject: Designs for better debug info in GCC In-Reply-To: <4767656C.3000608@adacore.com> (Robert Dewar's message of "Tue\, 18 Dec 2007 01\:15\:08 -0500") References: <4737BF2C.70408@codesourcery.com> <47388599.2040701@codesourcery.com> <4749DE66.1090602@codesourcery.com> <4756B02D.9010302@google.com> <4aca3dc20712151903r46c9eceane35edb92d08240ac@mail.gmail.com> <4aca3dc20712161712w1133fb96qd66be0e9a0bb1716@mail.gmail.com> <20071217012735.GA9275@synopsys.com> <8DF14649-456C-40D6-94C5-DC3285EFD7C2@adacore.com> <4767656C.3000608@adacore.com> Message-ID: On Dec 18, 2007, Robert Dewar wrote: > Alexandre Oliva wrote: >> Yes, I've considered something along these lines, but decided against >> it, for we can't afford for debug information to affect executable >> code generation in any way whatsoever, and we don't want to pessimize >> optimized code when compiling without -g just so that compiling with >> -g would get us the same code. > I disagree, I think it would be fine to degrade -O1 slightly to achieve > full debuggability, Sure. 
But this is just not relevant to my project of getting GCC to emit correct (and, ideally, as complete as possible) variable location information, no matter what the optimization level. My goal is not so much about aiming at a perfect debugging experience, but rather at making sure that what the compiler encodes in debug information actually reflects the code it produced. This will surely benefit a future full debuggability project, of course. But, as much as I see value in perfect debuggability at some new optimization level, my current task is to get correct and more complete variable location information at vanilla-build optimization levels, i.e., at -O2 -g. It is possible to do much better than what we do now, and it appears to me that it's even possible to do much better than my current plan. But I need to get this task wrapped up before I can spend further time figuring out how to make it even better. In either case, it probably won't be like -O0, for optimizations are performed that make it impossible, and I'm not supposed to sacrifice them for the sake of better debug information. 
-- Alexandre Oliva http://www.lsd.ic.unicamp.br/~oliva/ FSF Latin America Board Member http://www.fsfla.org/ Red Hat Compiler Engineer aoliva@{redhat.com, gcc.gnu.org} Free Software Evangelist oliva@{lsd.ic.unicamp.br, gnu.org} From aoliva@redhat.com Tue Dec 18 08:39:00 2007 From: aoliva@redhat.com (Alexandre Oliva) Date: Tue, 18 Dec 2007 08:39:00 -0000 Subject: Designs for better debug info in GCC In-Reply-To: (Alexandre Oliva's message of "Tue\, 18 Dec 2007 02\:38\:31 -0200") References: <4737BF2C.70408@codesourcery.com> <47388599.2040701@codesourcery.com> <4749DE66.1090602@codesourcery.com> <4756B02D.9010302@google.com> <4aca3dc20712151903r46c9eceane35edb92d08240ac@mail.gmail.com> <4aca3dc20712161712w1133fb96qd66be0e9a0bb1716@mail.gmail.com> <4766B8E5.60500@google.com> <4766DF5C.1020802@google.com> <47671BF4.5050704@google.com> Message-ID: On Dec 18, 2007, Alexandre Oliva wrote: > On Dec 17, 2007, Diego Novillo wrote: >> On 12/17/07 19:50, Alexandre Oliva wrote: >>> Now, since you're so interested in it and you've already read the >>> various perspectives on the issue that I listed in my yesterday's >>> e-mail to you, would you help me improve this document, by letting me >>> know what you believe to be missing from the selected postings on >>> design strategies, rationales and goals: >> No. I am not interested in organizing your thoughts for you. > Wow, nice shot! Rats, this below-the-waistline attack really got me annoyed. So annoyed that I spent the night writing up this consolidated design document. So, what do you say now? -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... 
Name: debug-var-loc.txt URL: -------------- next part -------------- -- Alexandre Oliva http://www.lsd.ic.unicamp.br/~oliva/ FSF Latin America Board Member http://www.fsfla.org/ Red Hat Compiler Engineer aoliva@{redhat.com, gcc.gnu.org} Free Software Evangelist oliva@{lsd.ic.unicamp.br, gnu.org} From rridge@csclub.uwaterloo.ca Tue Dec 18 11:55:00 2007 From: rridge@csclub.uwaterloo.ca (Ross Ridge) Date: Tue, 18 Dec 2007 11:55:00 -0000 Subject: A proposal to align GCC stack Message-ID: <20071218083942.78BA073CC4@caffeine.csclub.uwaterloo.ca> Ross Ridge writes: > This section doesn't make sense to me. The force_align_arg_pointer > attribute and -mstackrealign assume that the ABI is being > followed, while the -fpreferred-stack-boundary option effectively "H.J. Lu" writes > According to the Apple engineer who implemented -mstackrealign, > on MacOS/ia32, psABI is 16byte, but -mstackrealign will assume > 4byte, which is STACK_BOUNDARY. Ok. The important thing is that for backwards compatibility it needs to continue to assume 4-byte alignment on entry and align the stack to a 16-byte alignment on x86 targets, so that makes more sense. >> changes the ABI. According to your definitions, I would think >> that INCOMING should be ABI_STACK_BOUNDARY in the first case, >> and MAX(ABI_STACK_BOUNDARY, PREFERRED_STACK_BOUNDARY) in the second. > > That isn't true since some .o files may not be compiled with > -fpreferred-stack-boundary or with a different value of > -fpreferred-stack-boundary. As with any ABI-changing flag, that's not supported: ... Further, every function must be generated such that it keeps the stack aligned. Thus calling a function compiled with a higher preferred stack boundary from a function compiled with a lower preferred stack boundary will most likely misalign the stack. The -fpreferred-stack-boundary flag currently generates code that assumes the stack is aligned to the preferred alignment on function entry.
If you assume a worse incoming alignment you'll be aligning the stack unnecessarily and generating code that this flag doesn't require. > On x86-64, ABI_STACK_BOUNDARY is 16byte, but the Linux kernel may > want to use 8 byte for PREFERRED_STACK_BOUNDARY. Ok, if people are using this flag to change the alignment to something smaller than used by the standard ABI, then INCOMING should be MAX(STACK_BOUNDARY, PREFERRED_STACK_BOUNDARY). Ross Ridge From dnovillo@google.com Tue Dec 18 13:15:00 2007 From: dnovillo@google.com (Diego Novillo) Date: Tue, 18 Dec 2007 13:15:00 -0000 Subject: Designs for better debug info in GCC In-Reply-To: References: <4737BF2C.70408@codesourcery.com> <47388599.2040701@codesourcery.com> <4749DE66.1090602@codesourcery.com> <4756B02D.9010302@google.com> <4aca3dc20712151903r46c9eceane35edb92d08240ac@mail.gmail.com> <4aca3dc20712161712w1133fb96qd66be0e9a0bb1716@mail.gmail.com> <4766B8E5.60500@google.com> <4766DF5C.1020802@google.com> <47671BF4.5050704@google.com> Message-ID: <4767B4ED.6010208@google.com> On 12/18/07 03:07, Alexandre Oliva wrote: > Rats, this below-the-waistline attack really got me annoyed. I'm sorry you feel that way, it was not meant as a personal attack, though it was rather brusque. I was getting tired of asking for the same thing over and over again. > So, what do you say now? Thank you. Now I have something concrete to read and comment on. Diego. 
From dewar@adacore.com Tue Dec 18 13:29:00 2007 From: dewar@adacore.com (Robert Dewar) Date: Tue, 18 Dec 2007 13:29:00 -0000 Subject: Designs for better debug info in GCC In-Reply-To: References: <4749DE66.1090602@codesourcery.com> <4756B02D.9010302@google.com> <4aca3dc20712151903r46c9eceane35edb92d08240ac@mail.gmail.com> <4aca3dc20712161712w1133fb96qd66be0e9a0bb1716@mail.gmail.com> <20071217012735.GA9275@synopsys.com> <8DF14649-456C-40D6-94C5-DC3285EFD7C2@adacore.com> <20071218012438.GD2908@synopsys.com> <476765C7.5050106@adacore.com> Message-ID: <4767C7E1.8030707@adacore.com> Alexandre Oliva wrote: > On Dec 18, 2007, Robert Dewar wrote: >> OK, so you are agreeing that good debuggability is impossible >> with all the optimizations in place, so once again, let's have >> an optimization level that optimizes as far as possible without >> harming debuggability. > > I don't oppose such an optimization level, even though I don't know > that we agree on what "good debuggability" stands for. My definition is that it should be indistinguishable from -O0 except that I could live without being able to modify variables. > > It's just that changing optimizations is precisely *against* the goals > of my current project. So, don't expect significant efforts to this > end from me at this time. But you can't achieve the above criterion with your approach. From hubicka@ucw.cz Tue Dec 18 13:48:00 2007 From: hubicka@ucw.cz (Jan Hubicka) Date: Tue, 18 Dec 2007 13:48:00 -0000 Subject: [RFC] WHOPR - A whole program optimizer framework for GCC In-Reply-To: <47603F3C.2090808@google.com> References: <47603F3C.2090808@google.com> Message-ID: <20071218132914.GB12527@atrey.karlin.mff.cuni.cz> Hi, thanks for writing the proposal. It seems that at least in general terms we are all in sync. > At this point we are interested in getting feedback on the general idea.
> There is some refactoring that will be needed inside the call-graph > manager and some aspects of the design may not even need a lot of > changes in GCC. But in general, it will require very efficient IR streaming. Doing call graph changes should not be that hard (I was trying to keep a similar design in mind when implementing it, even if we stepped away from the plan in some cases, like reorganizing passes from vertical to horizontal order). The nearest problem I see is merging different declarations from units read back. I have a prototype implementation of a DECL merging pass done from my trip this week and hope to have it working at least for --combine and C during Christmas. Honza > > In terms of implementation, we will likely use the LTO branch as a > basis. Many of the features we will need are already being implemented > in the branch, so we will keep helping with that implementation. > > > Thanks. Diego. From drow@false.org Tue Dec 18 13:52:00 2007 From: drow@false.org (Daniel Jacobowitz) Date: Tue, 18 Dec 2007 13:52:00 -0000 Subject: A proposal to align GCC stack In-Reply-To: <20071218042535.3FB1F73CC4@caffeine.csclub.uwaterloo.ca> References: <20071218042535.3FB1F73CC4@caffeine.csclub.uwaterloo.ca> Message-ID: <20071218134735.GA5524@caradoc.them.org> On Mon, Dec 17, 2007 at 11:25:35PM -0500, Ross Ridge wrote: > >// Reserve two stack slots and save return address > >// and previous frame pointer into them. By > >// pointing new ebp to them, we build a pseudo > >// stack for unwinding > > Hmmm... I don't know much about the DWARF unwind information, but > couldn't it handle this case without creating the "pseudo frame"? > Or at least be extended so it could? In practice, there are non-DWARF unwinders scattered all over that work on i386 and folks want to keep them working. DWARF has no trouble handling this sort of thing.
-- Daniel Jacobowitz CodeSourcery From dewar@adacore.com Tue Dec 18 14:41:00 2007 From: dewar@adacore.com (Robert Dewar) Date: Tue, 18 Dec 2007 14:41:00 -0000 Subject: A proposal to align GCC stack In-Reply-To: <20071218042535.3FB1F73CC4@caffeine.csclub.uwaterloo.ca> References: <20071218042535.3FB1F73CC4@caffeine.csclub.uwaterloo.ca> Message-ID: <4767D08B.1080104@adacore.com> Ross Ridge wrote: > Ye, Joey writes: >> i. STACK_BOUNDARY in bits, which is enforced by hardware, 32 for i386 >> and 64 for x86_64. It is the minimum stack boundary. It is fixed. > > Strictly speaking by the above definition it would be 8 for i386. > The hardware doesn't force the stack to be 32-bit aligned, it just > performs poorly if it isn't. This seems wrong to me. First, although for some types the accesses may work, the optimizer is allowed to assume that data is properly aligned, and could possibly generate incorrect code (in Ada it is formally erroneous to have any variable that is not properly aligned to its type's alignment, unless the alignment is specifically set to some other value). Second, I am pretty sure there are SSE types that require alignment at the hardware level, even on the i386.
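A minimal sketch, in C, of the alignment arithmetic this thread keeps returning to. It assumes boundaries are given in bytes and are powers of two; the helper names (is_aligned, align_down, needs_realign) are invented for illustration and are not GCC internals or anything proposed in the thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* True if `boundary` is a nonzero power of two. */
bool is_power_of_two(uintptr_t boundary)
{
    return boundary != 0 && (boundary & (boundary - 1)) == 0;
}

/* True if address `addr` is a multiple of `boundary` (in bytes). */
bool is_aligned(uintptr_t addr, uintptr_t boundary)
{
    assert(is_power_of_two(boundary));
    return (addr & (boundary - 1)) == 0;
}

/* Round `addr` down to a multiple of `boundary`.  The stack grows
   downward on x86, so realigning a stack pointer rounds it down. */
uintptr_t align_down(uintptr_t addr, uintptr_t boundary)
{
    assert(is_power_of_two(boundary));
    return addr & ~(boundary - 1);
}

/* Hypothetical helper: a function has to realign only if it requires
   more alignment than it is assumed to receive on entry. */
bool needs_realign(uintptr_t incoming, uintptr_t required)
{
    return required > incoming;
}
```

For instance, a function that wants a 16-byte-aligned local but is entered with only 4-byte alignment has needs_realign(4, 16) true, and align_down(sp, 16) is the adjusted stack pointer; with a 16-byte incoming boundary no adjustment is needed.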
From aoliva@redhat.com Tue Dec 18 15:06:00 2007 From: aoliva@redhat.com (Alexandre Oliva) Date: Tue, 18 Dec 2007 15:06:00 -0000 Subject: Designs for better debug info in GCC In-Reply-To: <4767B4ED.6010208@google.com> (Diego Novillo's message of "Tue\, 18 Dec 2007 06\:54\:21 -0500") References: <4737BF2C.70408@codesourcery.com> <47388599.2040701@codesourcery.com> <4749DE66.1090602@codesourcery.com> <4756B02D.9010302@google.com> <4aca3dc20712151903r46c9eceane35edb92d08240ac@mail.gmail.com> <4aca3dc20712161712w1133fb96qd66be0e9a0bb1716@mail.gmail.com> <4766B8E5.60500@google.com> <4766DF5C.1020802@google.com> <47671BF4.5050704@google.com> <4767B4ED.6010208@google.com> Message-ID: On Dec 18, 2007, Diego Novillo wrote: > On 12/18/07 03:07, Alexandre Oliva wrote: >> Rats, this below-the-waistline attack really got me annoyed. > I'm sorry you feel that way, it was not meant as a personal attack, > though it was rather brusque. I was getting tired of asking for the > same thing over and over again. >> So, what do you say now? > Thank you. Now I have something concrete to read and comment on. You already had it. Really. You just didn't feel like reading and commenting on it, for whatever reason I can't understand, which is why you kept asking for what you already had over and over again. Anyhow... I expect your feedback, err... "now" ;-P :-D -- Alexandre Oliva http://www.lsd.ic.unicamp.br/~oliva/ FSF Latin America Board Member http://www.fsfla.org/ Red Hat Compiler Engineer aoliva@{redhat.com, gcc.gnu.org} Free Software Evangelist oliva@{lsd.ic.unicamp.br, gnu.org} From hjl@lucon.org Tue Dec 18 16:14:00 2007 From: hjl@lucon.org (H.J. 
Lu) Date: Tue, 18 Dec 2007 16:14:00 -0000 Subject: A proposal to align GCC stack In-Reply-To: <20071218083942.78BA073CC4@caffeine.csclub.uwaterloo.ca> References: <20071218083942.78BA073CC4@caffeine.csclub.uwaterloo.ca> Message-ID: <20071218150626.GA12651@lucon.org> On Tue, Dec 18, 2007 at 03:39:42AM -0500, Ross Ridge wrote: > >> changes the ABI. According to your definitions, I would think > >> that INCOMING should be ABI_STACK_BOUNDARY in the first case, > >> and MAX(ABI_STACK_BOUNDARY, PREFERRED_STACK_BOUNDARY) in the second. > > > > That isn't true since some .o files may not be compiled with > > -fpreferred-stack-boundary or with a different value of > > -fpreferred-stack-boundary. > > As with any ABI-changing flag, that's not supported: > > ... Further, every function must be generated such that it keeps > the stack aligned. Thus calling a function compiled with a higher > preferred stack boundary from a function compiled with a lower > preferred stack boundary will most likely misalign the stack. > > The -fpreferred-stack-boundary flag currently generates code that > assumes the stack is aligned to the preferred alignment on function entry. > If you assume a worse incoming alignment you'll be aligning the stack > unnecessarily and generating code that this flag doesn't require. That is how we get into trouble in the first place. The only place I can think of where you can guarantee everything is compiled with the same -fpreferred-stack-boundary is the kernel. Our proposal will align the stack only when needed. PREFERRED_STACK_BOUNDARY > ABI_STACK_BOUNDARY will generate a larger stack unnecessarily. We have considered adding a new option, -fincoming-stack-boundary. But we need to consider local/global functions as well as function pointers. If a function can only be called locally, its incoming stack boundary will be PREFERRED_STACK_BOUNDARY. Otherwise, its incoming stack boundary will be MIN(INCOMING_STACK_BOUNDARY, PREFERRED_STACK_BOUNDARY).
We aren't sure if its benefit will be worth its complexity. > > > On x86-64, ABI_STACK_BOUNDARY is 16byte, but the Linux kernel may > > want to use 8 byte for PREFERRED_STACK_BOUNDARY. > > Ok, if people are using this flag to change the alignment to something > smaller than used by the standard ABI, then INCOMING should be > MAX(STACK_BOUNDARY, PREFERRED_STACK_BOUNDARY). On x86-64, ABI_STACK_BOUNDARY is 16byte, but the Linux kernel may want to use 8 byte for PREFERRED_STACK_BOUNDARY. INCOMING will be MIN(STACK_BOUNDARY, PREFERRED_STACK_BOUNDARY) == 8 byte. H.J. From iant@google.com Tue Dec 18 16:22:00 2007 From: iant@google.com (Ian Lance Taylor) Date: Tue, 18 Dec 2007 16:22:00 -0000 Subject: Designs for better debug info in GCC In-Reply-To: References: <4737BF2C.70408@codesourcery.com> <47388599.2040701@codesourcery.com> <4749DE66.1090602@codesourcery.com> <4756B02D.9010302@google.com> <4aca3dc20712151903r46c9eceane35edb92d08240ac@mail.gmail.com> <4aca3dc20712161712w1133fb96qd66be0e9a0bb1716@mail.gmail.com> <4766B8E5.60500@google.com> <4766DF5C.1020802@google.com> <47671BF4.5050704@google.com> Message-ID: Alexandre Oliva writes: > A plan to fix local variable debug information in GCC > > by Alexandre Oliva > > 2007-12-18 draft Thank you for writing this. It makes an enormous difference. > == Goals I note that you don't say anything about the other big problem with debugging optimized code, which is that the debugger jumps around all over the place. That is fine, of course. > Once this is established, a possible representation becomes almost > obvious: statements (in trees) or instructions (in rtl) that assert, > to the variable tracker, that a user variable or member is represented > by a given expression: > > # DEBUG var expr > > By var, we mean a tree expression that denotes a user variable, for > now. We envision trivially extending it to support components of > variables in the future. 
While you say that this is almost obvious, it still isn't obvious at all to me. You consider trees and RTL together, but I don't see why that is appropriate. My biggest concern at the tree level is the significantly increased memory usage and the introduction of a sort of a weak pointer to values. Since DEBUG statements shouldn't interfere with optimizations, we need to explicitly ignore them in things like has_single_use. But since our data structures need to be coherent, we can not ignore them when we actually eliminate SSA names. That seems sort of complicated. In SSA form it seems very natural to provide a set of associations with user variables for each GIMPLE variable. Since the GIMPLE variables never change, these associations never change. We have to get them right when we create a new GIMPLE variable and when we eliminate a GIMPLE variable. While this obviously requires some work, to me it seems less intrusive than the notion of weak references. Of course this means that we are keeping the debug information in a reversed form. Instead of saying that a user variable is associated with an expression in terms of GIMPLE variables, we will say that a GIMPLE variable is associated with an expression in terms of user variables. We will have to reverse the latter expression to get the correct debug information. Of course in some cases this will be impossible, as when a GIMPLE variable is associated with a sum of user variables; presumably in those cases you would have to drop the DEBUG statement anyhow. By the way, we shouldn't confuse the source code live range of the variable with the annotations on the GIMPLE variables. That will get us into the mapping of source code lines to optimized code. It is of course true that optimized code will move around unpredictably, and your proposal doesn't handle that. I don't see it as a flaw that it will be possible to view user variables outside of their source code range. In any case, RTL is different. 
We can't reasonably associate annotations with pseudo-registers, because they change during the function. The obvious choices are to annotate SET statements, or to annotate insns, or to introduce a DEBUG insn as you suggest. It's not obvious to me why a DEBUG insn is superior to a REG_NOTE attached to an insn. The problem with DEBUG insns is of course that the RTL code is very sensitive to new insns, and also the additional memory usage. You discuss those, but it's not obvious to me why your proposed solution is the best one. > Testing for accuracy and completeness of debug information can be best > accomplished using a debugging environment. Of course this is very unsatisfactory without an automated testsuite. Ian From dewar@adacore.com Tue Dec 18 16:28:00 2007 From: dewar@adacore.com (Robert Dewar) Date: Tue, 18 Dec 2007 16:28:00 -0000 Subject: Designs for better debug info in GCC In-Reply-To: References: <4737BF2C.70408@codesourcery.com> <47388599.2040701@codesourcery.com> <4749DE66.1090602@codesourcery.com> <4756B02D.9010302@google.com> <4aca3dc20712151903r46c9eceane35edb92d08240ac@mail.gmail.com> <4aca3dc20712161712w1133fb96qd66be0e9a0bb1716@mail.gmail.com> <4766B8E5.60500@google.com> <4766DF5C.1020802@google.com> <47671BF4.5050704@google.com> Message-ID: <4767F3B4.4020709@adacore.com> Ian Lance Taylor wrote: > Alexandre Oliva writes: > >> A plan to fix local variable debug information in GCC >> >> by Alexandre Oliva >> >> 2007-12-18 draft > > Thank you for writing this. It makes an enormous difference. > > >> == Goals > > I note that you don't say anything about the other big problem with > debugging optimized code, which is that the debugger jumps around all > over the place. That is fine, of course. I don't think it is fine, we have constant complaints from our users about this. I think we definitely need an optimization level that avoids this.
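To make the "# DEBUG var expr" notation from the draft quoted earlier in this thread a little more concrete, here is a small C function with that kind of annotation written out as comments. This is only an illustrative sketch: the function scale_sum, the temporaries t1 and t2, and the placement of the annotations are assumptions made for the example, not actual GCC output.

```c
/* Illustration only.  The three-address form in the comments is
   hand-written; t1 and t2 are hypothetical compiler temporaries. */
int scale_sum(int a, int b)
{
    int x = a + b;   /* t1 = a + b;   # DEBUG x t1 */
    int y = x * 2;   /* t2 = t1 * 2;  # DEBUG y t2 */
    return y;        /* return t2 */
}
```

If an optimizer were to compute t2 directly from a and b and drop t1, the annotation for x would presumably have to be rewritten in terms of a and b or dropped, which is the kind of bookkeeping the messages above are debating.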
From aph@redhat.com Tue Dec 18 16:31:00 2007 From: aph@redhat.com (Andrew Haley) Date: Tue, 18 Dec 2007 16:31:00 -0000 Subject: Designs for better debug info in GCC In-Reply-To: <4767F3B4.4020709@adacore.com> References: <4737BF2C.70408@codesourcery.com> <47388599.2040701@codesourcery.com> <4749DE66.1090602@codesourcery.com> <4756B02D.9010302@google.com> <4aca3dc20712151903r46c9eceane35edb92d08240ac@mail.gmail.com> <4aca3dc20712161712w1133fb96qd66be0e9a0bb1716@mail.gmail.com> <4766B8E5.60500@google.com> <4766DF5C.1020802@google.com> <47671BF4.5050704@google.com> <4767F3B4.4020709@adacore.com> Message-ID: <18279.62756.166164.494778@zebedee.pink> Robert Dewar writes: > Ian Lance Taylor wrote: > > Alexandre Oliva writes: > > > >> A plan to fix local variable debug information in GCC > >> > >> by Alexandre Oliva > >> > >> 2007-12-18 draft > > > > Thank you for writing this. It makes an enormous difference. > > > > > >> == Goals > > > > I note that you don't say anything about the other big problem with > > debugging optimized code, which is that the debugger jumps around all > > over the place. That is fine, of course. > > I don't think it is fine, we have constant complaints from our > users about this. I think we definitely need an optimization > level that avoids this. Short of putting a barrier at every sequence point, how would you stop the debugger from jumping all over the place? I'm assuming that you do want the debugger to show what is actually going on, not fake it. Andrew. -- Red Hat UK Ltd, Amberley Place, 107-111 Peascod Street, Windsor, Berkshire, SL4 1TE, UK Registered in England and Wales No. 
3798903 From drow@false.org Tue Dec 18 16:32:00 2007 From: drow@false.org (Daniel Jacobowitz) Date: Tue, 18 Dec 2007 16:32:00 -0000 Subject: Designs for better debug info in GCC In-Reply-To: <4767F3B4.4020709@adacore.com> References: <4766B8E5.60500@google.com> <4766DF5C.1020802@google.com> <47671BF4.5050704@google.com> <4767F3B4.4020709@adacore.com> Message-ID: <20071218163142.GA32482@caradoc.them.org> On Tue, Dec 18, 2007 at 11:22:12AM -0500, Robert Dewar wrote: >>> == Goals >> >> I note that you don't say anything about the other big problem with >> debugging optimized code, which is that the debugger jumps around all >> over the place. That is fine, of course. > > I don't think it is fine, we have constant complaints from our > users about this. I think we definitely need an optimization > level that avoids this. It's fine because it's not the problem he's working on. We don't have to fix everything at once! -- Daniel Jacobowitz CodeSourcery From dewar@adacore.com Tue Dec 18 16:42:00 2007 From: dewar@adacore.com (Robert Dewar) Date: Tue, 18 Dec 2007 16:42:00 -0000 Subject: Designs for better debug info in GCC In-Reply-To: <18279.62756.166164.494778@zebedee.pink> References: <4737BF2C.70408@codesourcery.com> <47388599.2040701@codesourcery.com> <4749DE66.1090602@codesourcery.com> <4756B02D.9010302@google.com> <4aca3dc20712151903r46c9eceane35edb92d08240ac@mail.gmail.com> <4aca3dc20712161712w1133fb96qd66be0e9a0bb1716@mail.gmail.com> <4766B8E5.60500@google.com> <4766DF5C.1020802@google.com> <47671BF4.5050704@google.com> <4767F3B4.4020709@adacore.com> <18279.62756.166164.494778@zebedee.pink> Message-ID: <4767F60A.5060505@adacore.com> Andrew Haley wrote: > > I don't think it is fine, we have constant complaints from our > > users about this. I think we definitely need an optimization > > level that avoids this. > > Short of putting a barrier at every sequence point, how would you stop > the debugger from jumping all over the place?
I'm assuming that you > do want the debugger to show what is actually going on, not fake it. Note that putting a barrier at every sequence point is exactly what Geert proposed, and I think we really need an optimization level that does the equivalent of this. It is also needed for effective source-object traceability for certification purposes. Yes, you can use -O0, but the trouble is that we generate so much rubbish at this level, much worse than competitive compilers with "optimization off", and the sheer amount of object code makes the traceability analysis harder (and makes executables unnecessarily huge). > > Andrew. > From dewar@adacore.com Tue Dec 18 16:44:00 2007 From: dewar@adacore.com (Robert Dewar) Date: Tue, 18 Dec 2007 16:44:00 -0000 Subject: Designs for better debug info in GCC In-Reply-To: <20071218163142.GA32482@caradoc.them.org> References: <4766B8E5.60500@google.com> <4766DF5C.1020802@google.com> <47671BF4.5050704@google.com> <4767F3B4.4020709@adacore.com> <20071218163142.GA32482@caradoc.them.org> Message-ID: <4767F844.6020803@adacore.com> Daniel Jacobowitz wrote: > On Tue, Dec 18, 2007 at 11:22:12AM -0500, Robert Dewar wrote: >> I don't think it is fine, we have constant complaints from our >> users about this. I think we definitely need an optimization >> level that avoids this. > > It's fine because it's not the problem he's working on. We don't have > to fix everything at once!
Fair enough, I am all in favor of improving all aspects of debuggability :-) > From aph@redhat.com Tue Dec 18 17:04:00 2007 From: aph@redhat.com (Andrew Haley) Date: Tue, 18 Dec 2007 17:04:00 -0000 Subject: Designs for better debug info in GCC In-Reply-To: <4767F60A.5060505@adacore.com> References: <4737BF2C.70408@codesourcery.com> <47388599.2040701@codesourcery.com> <4749DE66.1090602@codesourcery.com> <4756B02D.9010302@google.com> <4aca3dc20712151903r46c9eceane35edb92d08240ac@mail.gmail.com> <4aca3dc20712161712w1133fb96qd66be0e9a0bb1716@mail.gmail.com> <4766B8E5.60500@google.com> <4766DF5C.1020802@google.com> <47671BF4.5050704@google.com> <4767F3B4.4020709@adacore.com> <18279.62756.166164.494778@zebedee.pink> <4767F60A.5060505@adacore.com> Message-ID: <18279.63703.961312.470807@zebedee.pink> Robert Dewar writes: > Andrew Haley wrote: > > > I don't think it is fine, we have constant complaints from our > > > users about this. I think we definitely need an optimization > > > level that avoids this. > > > > Short of putting a barrier at every sequence point, how would you stop > > the debugger from jumping all over the place? I'm assuming that you > > do want the debugger to show what is actually going on, not fake it. > > Note that putting a barrier at every sequence point is exactly what > Geert proposed, and I think we really need an optimization level > that does the equivalent of this. It is also needed for effective > source-object traceability for certification purposes. Yes, you > can use -O0, but the trouble is that we generate so much rubbish > at this level, much worse than competitive compilers with "optimization > off", and the sheer amount of object code makes the traceability > analysis harder (and makes executables unnecessarily huge). I agree. It's a really interesting idea and should be fairly easy to prototype. Andrew. -- Red Hat UK Ltd, Amberley Place, 107-111 Peascod Street, Windsor, Berkshire, SL4 1TE, UK Registered in England and Wales No.
3798903 From kenner@vlsi1.ultra.nyu.edu Tue Dec 18 17:12:00 2007 From: kenner@vlsi1.ultra.nyu.edu (Richard Kenner) Date: Tue, 18 Dec 2007 17:12:00 -0000 Subject: Designs for better debug info in GCC In-Reply-To: <18279.62756.166164.494778@zebedee.pink> References: <4737BF2C.70408@codesourcery.com> <47388599.2040701@codesourcery.com> <4749DE66.1090602@codesourcery.com> <4756B02D.9010302@google.com> <4aca3dc20712151903r46c9eceane35edb92d08240ac@mail.gmail.com> <4aca3dc20712161712w1133fb96qd66be0e9a0bb1716@mail.gmail.com> <4766B8E5.60500@google.com> <4766DF5C.1020802@google.com> <47671BF4.5050704@google.com> <4767F3B4.4020709@adacore.com> <18279.62756.166164.494778@zebedee.pink> Message-ID: <10712181704.AA18381@vlsi1.ultra.nyu.edu> > Short of putting a barrier at every sequence point, how would you stop > the debugger from jumping all over the place? I'm assuming that you > do want the debugger to show what is actually going on, not fake it. You could, for example, add a -Og option that says "don't do any optimizations that will move instructions between lines". From hjl@lucon.org Tue Dec 18 18:05:00 2007 From: hjl@lucon.org (H.J. Lu) Date: Tue, 18 Dec 2007 18:05:00 -0000 Subject: A proposal to align GCC stack In-Reply-To: <20071218134735.GA5524@caradoc.them.org> References: <20071218042535.3FB1F73CC4@caffeine.csclub.uwaterloo.ca> <20071218134735.GA5524@caradoc.them.org> Message-ID: <20071218171222.GA29254@lucon.org> On Tue, Dec 18, 2007 at 08:47:35AM -0500, Daniel Jacobowitz wrote: > On Mon, Dec 17, 2007 at 11:25:35PM -0500, Ross Ridge wrote: > > >// Reserve two stack slots and save return address > > >// and previous frame pointer into them. By > > >// pointing new ebp to them, we build a pseudo > > >// stack for unwinding > > > > Hmmm... I don't know much about the DWARF unwind information, but > > couldn't it handle this case without creating the "pseudo frame"? > > Or at least be extended so it could? 
> > In practice, there are non-DWARF unwinders scattered all over that > work on i386 and folks want to keep them working. DWARF has no > trouble handling this sort of thing. > Another thing is we may need to update prolog analyzer in gdb to support the new prolog. H.J. From mrs@apple.com Tue Dec 18 18:08:00 2007 From: mrs@apple.com (Mike Stump) Date: Tue, 18 Dec 2007 18:08:00 -0000 Subject: Objective-C 2.0 in GCC In-Reply-To: <1197956746.8411.6.camel@wendolene> References: <1197956746.8411.6.camel@wendolene> Message-ID: On Dec 17, 2007, at 9:45 PM, Sven Herzberg wrote: > I was just browsing the gcc-list to see if there are any updates on > the Objective-C 2.0 extensions. Can you please send an email to the > gcc-list with the current state? I hope to be able to contribute them in the next year, but exactly when remains uncertain. I hope to know more about timing toward the front of next year. Sorry I don't have anything more specific to say. From ismail@pardus.org.tr Tue Dec 18 18:47:00 2007 From: ismail@pardus.org.tr (Ismail Dönmez) Date: Tue, 18 Dec 2007 18:47:00 -0000 Subject: Objective-C 2.0 in GCC In-Reply-To: References: <1197956746.8411.6.camel@wendolene> Message-ID: <200712182008.14859.ismail@pardus.org.tr> Hi Mike, On Tuesday 18 December 2007 20:04:45, Mike Stump wrote: > On Dec 17, 2007, at 9:45 PM, Sven Herzberg wrote: > > I was just browsing the gcc-list to see if there are any updates on > > the Objective-C 2.0 extensions. Can you please send an email to the > > gcc-list with the current state? > > I hope to be able to contribute them in the next year, but exactly > when remains uncertain. I hope to know more about timing toward the > front of next year. Sorry I don't have anything more specific to say. Any schedule for fixing Obj-C++ regressions on mainline? Regards, ismail -- Never learn by your mistakes, if you do you may never dare to try again.
From mrs@apple.com Tue Dec 18 19:07:00 2007 From: mrs@apple.com (Mike Stump) Date: Tue, 18 Dec 2007 19:07:00 -0000 Subject: Objective-C 2.0 in GCC In-Reply-To: <200712182008.14859.ismail@pardus.org.tr> References: <1197956746.8411.6.camel@wendolene> <200712182008.14859.ismail@pardus.org.tr> Message-ID: On Dec 18, 2007, at 10:08 AM, Ismail Dönmez wrote: > Any schedule for fixing Obj-C++ regressions on mainline? Same answer. My hope would be that people that introduce regressions would fix them... From ismail@pardus.org.tr Tue Dec 18 20:15:00 2007 From: ismail@pardus.org.tr (Ismail Dönmez) Date: Tue, 18 Dec 2007 20:15:00 -0000 Subject: Objective-C 2.0 in GCC In-Reply-To: References: <1197956746.8411.6.camel@wendolene> <200712182008.14859.ismail@pardus.org.tr> Message-ID: <200712182106.25115.ismail@pardus.org.tr> On Tuesday 18 December 2007 20:47:29, Mike Stump wrote: > On Dec 18, 2007, at 10:08 AM, Ismail Dönmez wrote: > > Any schedule for fixing Obj-C++ regressions on mainline? > > Same answer. My hope would be that people that introduce regressions > would fix them... We were talking about it the other day, and it was mentioned that some regressions have existed since the introduction of Obj-C++ support. Anyway, I hope you resolve the non-technical issues without much sacrifice. Regards, ismail -- Never learn by your mistakes, if you do you may never dare to try again.
From aoliva@redhat.com Tue Dec 18 22:15:00 2007 From: aoliva@redhat.com (Alexandre Oliva) Date: Tue, 18 Dec 2007 22:15:00 -0000 Subject: Designs for better debug info in GCC In-Reply-To: <4767C7E1.8030707@adacore.com> (Robert Dewar's message of "Tue\, 18 Dec 2007 08\:15\:13 -0500") References: <4749DE66.1090602@codesourcery.com> <4756B02D.9010302@google.com> <4aca3dc20712151903r46c9eceane35edb92d08240ac@mail.gmail.com> <4aca3dc20712161712w1133fb96qd66be0e9a0bb1716@mail.gmail.com> <20071217012735.GA9275@synopsys.com> <8DF14649-456C-40D6-94C5-DC3285EFD7C2@adacore.com> <20071218012438.GD2908@synopsys.com> <476765C7.5050106@adacore.com> <4767C7E1.8030707@adacore.com> Message-ID: On Dec 18, 2007, Robert Dewar wrote: > Alexandre Oliva wrote: >> On Dec 18, 2007, Robert Dewar wrote: >>> OK, so you are agreeing that good debuggability is impossible >>> with all the optimizations in place, so once again, let's have >>> an optimziation level that optimizes as far as possible without >>> harming debuggability. >> It's just that changing optimizations is precisely *against* the goals >> of my current project. So, don't expect significant efforts to this >> end from me at this time. > But you can't achieve the above criterion with your approach. Actually, you can. My approach is about ensuring the mapping between the location of source and implementation variables is correct. This is orthogonal to how much optimization you make. If you optimize more, more values or locations may become unavailable, but this is not about correctness (what fraction of the annotations point at locations that hold the correct value), and it's not even about completeness (what fraction of the source variables are represented at all locations they are available), it's just about theoretical completeness (what fraction of the source variables are represented at all locations they would be available without optimization). 
-- Alexandre Oliva http://www.lsd.ic.unicamp.br/~oliva/ FSF Latin America Board Member http://www.fsfla.org/ Red Hat Compiler Engineer aoliva@{redhat.com, gcc.gnu.org} Free Software Evangelist oliva@{lsd.ic.unicamp.br, gnu.org} From dberlin@dberlin.org Tue Dec 18 23:19:00 2007 From: dberlin@dberlin.org (Daniel Berlin) Date: Tue, 18 Dec 2007 23:19:00 -0000 Subject: Designs for better debug info in GCC In-Reply-To: References: <4aca3dc20712161712w1133fb96qd66be0e9a0bb1716@mail.gmail.com> <4766B8E5.60500@google.com> <4766DF5C.1020802@google.com> <47671BF4.5050704@google.com> Message-ID: <4aca3dc20712181415y3d5c3717s6d73b1335b311313@mail.gmail.com> On 12/18/07, Alexandre Oliva wrote: > Then, we let tree optimizers do their jobs. Whenever they rename, > renumber, coalesce, combine or otherwise optimize a variable, they > will automatically update debug statements that mention them as well. > Speaking only about the tree level, in this entire email I make no representations about the RTL level ;) This is much harder than you give it credit for, unless you plan on throwing out all the info at elimination points. Consider PRE alone, which makes new statements that are combinations of old ones, and eliminate tons of variables in favor of it. If your debug statement strategy is "move debug statements when we insert code that is equivalent", it won't work, because our equivalence is based on value equivalence, not location equivalence. We only guarantee it has the same value as the whatever it is a copy of at that point, not that it has the same location. So you will lose info every time PRE makes an insertion, unless you make serious modifications to PRE. This is not to mention the data you lose if you just throw it away at elimination points. Let's take another problem. How do i say debug info for some variable is now dead, we have no idea what it is right now? How do I figure out which debug statements need to be modified when you introduce new memory operations? 
When you pass something by address, you get vops. The vops are not
variables, and have no relation to the original variable (they can be
partitions containing more variables).

If i have

  DEBUG(x, x_3)
  x_3 = x;   // Read from global
  y = x_3;
  ....

If i insert a new call

  DEBUG(x, x_3): 1
  x_3 = x
  foo()   // May modify x and *&x)
  y = x_3

Now you have two problems.
It is no longer true that at the point of y = x_3, that DEBUG (x, x_3)
is true. In fact, x_3 may no longer have any relation to x.

You have three choices:
1. Either destroy the DEBUG(x, x_3) losing valuable and correct info
2. Add a new DEBUG (x, unknown)
3. Figure out which debug statements are reached by your call

#3 is a dataflow problem, and not something you want to do every time
you insert a call.

If your answer is #1 or #2, then what you are really doing is computing
roughly the same dataflow problem var-location does, except on trees and
with a different meet-operation.

var-location generates incorrect info not because it represents
something fundamentally different than you are (it doesn't), it falls
down because it uses union as the meet operation. It says "oh, i don't
know which of these locations is right, it must be both of them".

If you changed the meet operation to "oh, i don't know which of these
locations is right, it must be none of them", and did a little more work
you would infer the same info as yours *at the tree level*

Nothing you have proposed is fundamentally going to give you better
info. All you have done is annotated the IR in some places to make
explicit some bits in the dataflow problem that you could infer anyway.
It is provable you can infer them with a simple lattice and associated
value, *unless you are going to start guessing* (which you have said
you don't want to do because it can generate incorrect info).
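
To make the difference between the two meet operations concrete, here
is a minimal C sketch. It is not GCC code: the loc_set bitmask and the
names meet_union and meet_agree are made up for illustration, and reading
"none of them" as set intersection is one plausible interpretation
rather than what var-location or any patch actually does.

```c
/* Hypothetical representation: one bit per candidate location of a
   single user variable (bit 0 = some register, bit 1 = a stack slot).
   LOC_UNKNOWN (no bits set) means "we cannot say where it lives".  */
typedef unsigned int loc_set;
#define LOC_UNKNOWN 0u

/* Union meet: at a join point, claim every location that any
   predecessor claimed.  This can report a location that is wrong
   on some incoming path.  */
static loc_set
meet_union (loc_set a, loc_set b)
{
  return a | b;
}

/* Conservative meet: keep only the locations on which both
   predecessors agree; full disagreement degrades to LOC_UNKNOWN.  */
static loc_set
meet_agree (loc_set a, loc_set b)
{
  return a & b;
}
```

With one predecessor saying "register" (1u) and the other saying "stack
slot" (2u), meet_union claims both locations while meet_agree gives up
and returns LOC_UNKNOWN, which loses information but never states
something false.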
There is absolutely no reason what you are trying to do needs to modify
the tree IR at all to achieve exactly the same accuracy of debug info as
your design proposes at the tree level. You could simply compute the
global dataflow problem.

The RTL level is harder, of course.

From rridge@csclub.uwaterloo.ca Tue Dec 18 23:31:00 2007
From: rridge@csclub.uwaterloo.ca (Ross Ridge)
Date: Tue, 18 Dec 2007 23:31:00 -0000
Subject: A proposal to align GCC stack
Message-ID: <20071218233125.5586173D25@caffeine.csclub.uwaterloo.ca>

Ye, Joey writes:
>i. STACK_BOUNDARY in bits, which is enforced by hardware, 32 for i386
>and 64 for x86_64. It is the minimum stack boundary. It is fixed.

Ross Ridge wrote:
>Strictly speaking by the above definition it would be 8 for i386.
>The hardware doesn't force the stack to be 32-bit aligned, it just
>performs poorly if it isn't.

Robert Dewar writes:
>First, although for some types, the accesses may work, the optimizer
>is allowed to assume that data is properly aligned, and could possibly
>generate incorrect code ...

That's not enforced by hardware.

>Second, I am pretty sure there are SSE types that require
>alignment at the hardware level, even on the i386

This isn't a restriction on stack alignment. It's a restriction on what
kinds of machine types can be accessed on the stack.

As I mentioned later in my message STACK_BOUNDARY shouldn't be defined
in terms of hardware, but in terms of the ABI. While the i386 allows the
stack pointer to be set to any value, by convention the stack pointer is
always kept 4-byte aligned at all times. GCC should never generate code
that would violate this requirement, even in leaf-functions or
transitorily during the prologue/epilogue.

This is different than the proposed ABI_STACK_BOUNDARY macro which
defines the possibly stricter alignment the ABI requires at function
entry. Since most i386 ABIs don't require a stricter alignment, that has
meant that SSE types couldn't be located on the stack.
Currently you can get around this problem by changing the ABI using -fperferred-stack-boundary or by forcing an SSE compatible alignment using -mstackrealign or __attribute__ ((force_align_arg_pointer)). Joey Ye's proposal is another solution to this problem where GCC would automatically force an SSE compatible aligment when SSE types are used on the stack. Ross Ridge From dberlin@dberlin.org Tue Dec 18 23:31:00 2007 From: dberlin@dberlin.org (Daniel Berlin) Date: Tue, 18 Dec 2007 23:31:00 -0000 Subject: Designs for better debug info in GCC In-Reply-To: References: <4aca3dc20712161712w1133fb96qd66be0e9a0bb1716@mail.gmail.com> <4766B8E5.60500@google.com> <4766DF5C.1020802@google.com> <47671BF4.5050704@google.com> Message-ID: <4aca3dc20712181519rb637c16oea78bbcc18abe097@mail.gmail.com> > > It is desirable to be able to represent constants and other > optimized-away values, rather than stating variables have values they > can no longer have: > > int > x1 (int x) > { > int i; > > i = 2; > f(i); > i = x; > h(); > i = 7; > g(i); > } > > Even if variable i is completely optimized away, a debugger can still > print the correct values for i if we keep annotations such as: > > (debug (var_location i (const_int 2))) > (set (reg arg0) (const_int 2)) > (call (mem (symbol_ref f))) > (debug (var_location i unknown)) > (call (mem (symbol_ref h))) > (debug (var_location i (const_int 7))) > (set (reg arg0) (const_int 7)) > (call (mem (symbol_ref g))) > > In this case, before the call to h, not only the assignment to i was > dead, but also the value of the incoming argument x had already been > clobbered. If i had been assigned to another constant instead, debug > information could easily represent this. 
>
> Another example that covers PHI nodes and conditionals:
>
> int
> x2 (int x, int y, int z)
> {
>   int c = z;
>   whatever0(c);
>   c = x;
>   whatever1();
>   if (some_condition)
>     {
>       whatever2();
>       c = y;
>       whatever3();
>     }
>   whatever4(c);
> }
>
> With SSA infrastructure, this program can be optimized to:
>
> int
> x2 (int x, int y, int z)
> {
>   int c;
>   # bb 1
>   whatever0(z_0(D));
>   whatever1();
>   if (some_condition)
>     {
>       # bb 2
>       whatever2();
>       whatever3();
>     }
>   # bb 3
>   # c_1 = PHI ;
>   whatever4(c_1);
> }
>
> Note how, without debug annotations, c is only initialized just before
> the call to whatever4. At all other points, the value of c would be
> unavailable to the debugger, possibly even wrong.
>
> If we were to annotate the SSA definitions forward-propagated into c
> versions as applying to c, we'd end up with all of x_2, y_3 and z_0

If you forward propagate any annotations, ever,

> applied to c throughout the entire function, in the absence of
> additional markers.
>
> Now, with the annotations proposed in this paper, what is initially:
>
> int
> x2 (int x, int y, int z)
> {
>   int c;
>   # bb 1
>   c_4 = z_0(D);
>   # DEBUG c z_0(D)
>   whatever0(z_0(D));
>   # DEBUG c x_2(D)
>   whatever1();
> and then, at every one of the inspection points, we get the correct
> value for variable c.

Because you have added information you have no way of knowing.
How exactly did you compute that the call *definitely sets c to the
value of z_0*, and definitely sets the value of c to x_2?

This must be "may-information", because we don't know what the call does.

Ignoring this (the solution is to not assume anything at calls, because
you run the risk of getting the wrong answer at meet points later on!)
your scheme is sufficient to get correct values, but not correct
locations.

However, value equivalence does not imply location equivalence, and all
of our debug formats deal with locations of variables, except for
constants.
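
A small C sketch of that distinction (the function and variable names
are invented for illustration and are not taken from the proposal): two
variables can hold the same value while living in different places, so a
store through one location must not be reported as changing the other.

```c
/* Hypothetical example: 'c' and 'x' are separate objects that happen
   to hold the same value right after the copy.  */
static int
store_through_x (void)
{
  int x = 1;
  int c = x;     /* value equivalence: c == x at this point */
  int *p = &x;   /* the location of x, which is not the location of c */

  *p = 5;        /* modifies x only */
  return c;      /* c still holds the value it was given */
}
```

If debug info claimed that c lives where x lives, a debugger user
writing through that location would believe c had changed, while the
program itself still sees the old value of c.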
IE If you translate this directly into DWARF3, as written, you will claim that c and x_4 has the same location (since dwarf does not let you say "it has the same value as x, but not the same location), and thus incorrectly represent that p *x_4=5 modifies c if i were to do it in the debugger. Because of the may-problem, you will also claim the same value/location for c and x_2, which you can't prove is right, because you don't know what whatever1/2 actually does. if all you want is the values you compute above, on SSA, you can easily use a lattice to compute the same values you are going to compute as you update the annotations on the fly. (This is because it is a flow sensitive problem, and you want the flow answers at each unique definition point, which SSA neatly provides, except for calls, where you could hang it off the vops). Tracking which values *definitely represent user values* is actually quite easy at the tree level, and doesn't require any IR modification. It may be worth doing at the RTL level, however, where the solution requires making up program points at each definition site and computing the dataflow problem in terms of them. --Dan From rridge@csclub.uwaterloo.ca Wed Dec 19 01:00:00 2007 From: rridge@csclub.uwaterloo.ca (Ross Ridge) Date: Wed, 19 Dec 2007 01:00:00 -0000 Subject: A proposal to align GCC stack Message-ID: <20071218233126.3EA1973D2A@caffeine.csclub.uwaterloo.ca> Ross Ridge wrote: > The -fpreferrred-stack-boundary flag currently generates code that > assumes the stack aligned to the preferred alignment on function entry. > If you assume a worse incoming alignment you'll be aligning the stack > unnecessarily and generating code that this flag doesn't require. H.J. Lu writes: > That is how we get into trouble in the first place. The only place I > think of where you can guarantee everything is compiled with the same > -fpreferrred-stack-boundary is kernel. Our proposal will align stack > only when needed. 
PREFERRED_STACK_BOUNDARY > ABI_STACK_BOUNDARY will > generate a largr stack unnecessarily. I'm currently using -fpreferred-stack-boundary without any trouble. Your proposal would in fact generate code to align stack when it's not necessary. This would change the behaviour of -fpreferred-stack-boundary, hurting performance and that's unacceptable to me. >> Ok, if people are using this flag to change the alignment to something >> smaller than used by the standard ABI, then INCOMING should be >> MAX(STACK_BOUNDARY, PREFERRED_STACK_BOUNDARY). > > On x86-64, ABI_STACK_BOUNDARY is 16byte, but the Linux kernel may > want to use 8 byte for PREFERRED_STACK_BOUNDARY. INCOMING will > be MIN(STACK_BOUNDARY, PREFERRED_STACK_BOUNDARY) == 8 byte. Using MAX(STACK_BOUNDARY, PREFERRED_STACK_BOUNDARY) also equals 8 in that case and preserves the behaviour -fpreferred-stack-boundary in every case. Ross Ridge From stevenb.gcc@gmail.com Wed Dec 19 01:11:00 2007 From: stevenb.gcc@gmail.com (Steven Bosscher) Date: Wed, 19 Dec 2007 01:11:00 -0000 Subject: Regression count, and how to keep bugs around forever Message-ID: <571f6b510712181659w64b16ae5ndc32b38de6f5c56c@mail.gmail.com> Hello, This is a complaint about how the bug database is being managed. It is getting harder and harder to find bug reports to work on, because too many old bug reports are being kept open even though there is no sign of intent to ever resolve the report. For example, PR18346 is a bug report in Bugzilla, and it is apparently a regression. It's not a high-priority regression, because it is for a target that is neither a secondary nor a primary platform. But the bug report of course does show up in the overview of regressions (such as the "All regressions" link on the homepage). Some facts about this bug report: * the bug was reported by the only listed maintainer for this target * the bug was confirmed by the reporter himself * the bug has seen *no* activity at all in 3.5 years. 
* the target maintainer insists that the bug report should be kept open

Literally no action has been taken for three and a half years! Not by
the target maintainer and not by anyone else. This is an extreme case,
but it is not entirely unusual that regression PRs (even some P3 and P2
ones) have not seen any activity for two years or more.

The bigger issue here is that people seem to be using Bugzilla as a
kind-of TODO list for things they may some day work on, but probably
will not. The result is that Bugzilla becomes increasingly hard to use
over time: The number of open reports in Bugzilla keeps increasing, the
list of so-called low-priority regressions becomes less useful with each
release, and identifying regressions that people actually care about
becomes practically impossible. (See also my earlier complaint about a
similar ages-old bug report that isn't going anywhere:
http://gcc.gnu.org/ml/gcc/2007-11/msg00659.html).

The current list of "All regressions" should be a list of bugs that
people are actively trying to resolve, preferably before the release of
GCC 4.3. Instead, it is a mix of high-activity bug reports and bug
reports where even the target maintainer has been unwilling for 3.5
years to spend some time on resolving the bug report. So to pick a bug
report to work on, I need to go through the bug report summaries of a
long list, trying to pick out new regressions between the old
no-one-cares P4 and P5 regressions.

Maybe it is just me, but I find it very annoying to have to wade through
long bug lists, so I just don't do this. Instead I just don't look at
P4/P5 regressions anymore at all. It's just too much trouble to find a
bug report where the reporter or the target maintainer cares as much as
you do about resolving the bug.

The victims, in the end, are GCC's users. Once a regression report from
a user has been downgraded to P4 or P5, it disappears from the radar
into the grey mess of older reports.
Is this really how this community wants to manage its bug database? To me, the situation is quite clear: If a bug report is open for so long, and even the reporter and the responsible maintainer show no sign of motivation to work on resolving the bug, I think this tells us something about how important this bug is: Not important enough to fix. IMOH we should close such reports as WONTFIX or SUSPENDED to make them less visible, so that other bug reports don't fall through the cracks. So I'm asking for a policy here that says when it is OK to resolve old bug without progress as WONTFIX or SUSPENDED. Start shooting. Gr. Steven From paul@codesourcery.com Wed Dec 19 01:14:00 2007 From: paul@codesourcery.com (Paul Brook) Date: Wed, 19 Dec 2007 01:14:00 -0000 Subject: Regression count, and how to keep bugs around forever In-Reply-To: <571f6b510712181659w64b16ae5ndc32b38de6f5c56c@mail.gmail.com> References: <571f6b510712181659w64b16ae5ndc32b38de6f5c56c@mail.gmail.com> Message-ID: <200712190111.11959.paul@codesourcery.com> > So I'm asking for a policy here that says when it is OK to resolve old > bug without progress as WONTFIX or SUSPENDED. Start shooting. I think this would be a big mistake to reuse an existing state for this. If/when someone does start caring about that particular feature it'll be impossible for them to distinguish between bugs that have been deliberately closed (typically because the cure is worse than the disease), and those that you've closed through apathy. Paul From howarth@bromo.msbb.uc.edu Wed Dec 19 01:21:00 2007 From: howarth@bromo.msbb.uc.edu (Jack Howarth) Date: Wed, 19 Dec 2007 01:21:00 -0000 Subject: darwin libgomp build oddity Message-ID: <20071219011440.GA22738@bromo.msbb.uc.edu> Does anyone understand why we get the following when configure is run in the libgomp directory on darwin? 
configure:17711: checking for shared libgcc configure:17731: /sw/src/fink.build/gcc43-4.2.999-20071214/darwin_objdir/./gcc/xgcc -B/sw/src/fink.build/gcc43-4.2.999-20071214/darwin_objdir/./gcc/ - B/sw/lib/gcc4.3/i686-apple-darwin9/bin/ -B/sw/lib/gcc4.3/i686-apple-darwin9/lib/ -isystem /sw/lib/gcc4.3/i686-apple-darwin9/include -isystem /sw/lib/g cc4.3/i686-apple-darwin9/sys-include -o conftest -lgcc_s conftest.c >&5 ld: library not found for -lgcc_s collect2: ld returned 1 exit status configure:17737: $? = 1 configure: failed program was: | /* confdefs.h. */ | | #define PACKAGE_NAME "GNU OpenMP Runtime Library" | #define PACKAGE_TARNAME "libgomp" | #define PACKAGE_VERSION "1.0" | #define PACKAGE_STRING "GNU OpenMP Runtime Library 1.0" | #define PACKAGE_BUGREPORT "" | #define PACKAGE "libgomp" | #define VERSION "1.0" | #define STDC_HEADERS 1 | #define HAVE_SYS_TYPES_H 1 | #define HAVE_SYS_STAT_H 1 | #define HAVE_STDLIB_H 1 | #define HAVE_STRING_H 1 | #define HAVE_MEMORY_H 1 | #define HAVE_STRINGS_H 1 | #define HAVE_INTTYPES_H 1 | #define HAVE_STDINT_H 1 | #define HAVE_UNISTD_H 1 | #define HAVE_DLFCN_H 1 | #define LT_OBJDIR ".libs/" | #define STDC_HEADERS 1 | #define TIME_WITH_SYS_TIME 1 | #define HAVE_UNISTD_H 1 | #define HAVE_SEMAPHORE_H 1 | #define HAVE_SYS_TIME_H 1 | #define HAVE_GETLOADAVG 1 | #define HAVE_BROKEN_POSIX_SEMAPHORES 1 | #define HAVE_ATTRIBUTE_VISIBILITY 1 | /* end confdefs.h. */ | | int | main () | { | return 0; | ; | return 0; | } configure:17789: /sw/src/fink.build/gcc43-4.2.999-20071214/darwin_objdir/./gcc/xgcc -B/sw/src/fink.build/gcc43-4.2.999-20071214/darwin_objdir/./gcc/ - B/sw/lib/gcc4.3/i686-apple-darwin9/bin/ -B/sw/lib/gcc4.3/i686-apple-darwin9/lib/ -isystem /sw/lib/gcc4.3/i686-apple-darwin9/include -isystem /sw/lib/g cc4.3/i686-apple-darwin9/sys-include -o conftest -lgcc_s.10.5 conftest.c >&5 configure:17795: $? = 0 configure:17799: test -z || test ! -s conftest.err configure:17802: $? 
= 0 configure:17805: test -s conftest configure:17808: $? = 0 configure:17821: result: yes configure:17862: WARNING: === You have requested some kind of symbol versioning, but configure:17864: WARNING: === either you are not using a supported linker, or you are configure:17866: WARNING: === not building a shared libgcc_s (which is required). configure:17868: WARNING: === Symbol versioning will be disabled. configure:17884: versioning on shared library symbols is no I don't seem to see this warning about symbol versioning in config.log for any of the other shared libraries built for gcc trunk on darwin. Jack From Joe.Buck@synopsys.COM Wed Dec 19 01:21:00 2007 From: Joe.Buck@synopsys.COM (Joe Buck) Date: Wed, 19 Dec 2007 01:21:00 -0000 Subject: Regression count, and how to keep bugs around forever In-Reply-To: <200712190111.11959.paul@codesourcery.com> References: <571f6b510712181659w64b16ae5ndc32b38de6f5c56c@mail.gmail.com> <200712190111.11959.paul@codesourcery.com> Message-ID: <20071219012050.GM2908@synopsys.com> On Wed, Dec 19, 2007 at 01:11:11AM +0000, Paul Brook wrote: > > So I'm asking for a policy here that says when it is OK to resolve old > > bug without progress as WONTFIX or SUSPENDED. Start shooting. > > I think this would be a big mistake to reuse an existing state for this. But this is pretty much what SUSPENDED means; it means that there's no intent to work on the bug in the near term. > If/when someone does start caring about that particular feature it'll be > impossible for them to distinguish between bugs that have been deliberately > closed (typically because the cure is worse than the disease), and those that > you've closed through apathy. That's an argument for not using WONTFIX. On the other hand, bugs could just be dropped to P5, and people doing queries would exclude that. 
From dewar@adacore.com Wed Dec 19 01:25:00 2007 From: dewar@adacore.com (Robert Dewar) Date: Wed, 19 Dec 2007 01:25:00 -0000 Subject: A proposal to align GCC stack In-Reply-To: <20071218233125.5586173D25@caffeine.csclub.uwaterloo.ca> References: <20071218233125.5586173D25@caffeine.csclub.uwaterloo.ca> Message-ID: <476871ED.1070601@adacore.com> Ross Ridge wrote: > Ye, Joey writes: >> i. STACK_BOUNDARY in bits, which is enforced by hardware, 32 for i386 >> and 64 for x86_64. It is the minimum stack boundary. It is fixed. > > Ross Ridge wrote: >> Strictly speaking by the above definition it would be 8 for i386. >> The hardware doesn't force the stack to be 32-bit aligned, it just >> performs poorly if it isn't. > > Robert Dewar writes: >> First, although for some types, the accesses may work, the optimizer >> is allowed to assume that data is properly aligned, and could possibly >> generate incorrect code ... > > That's not enforced by hardware. But suppose we have something like int(&k) & 1. The optimizer is permitted to replace this with 0 if it knows that the type of k is four byte aligned. > >> Second, I am pretty sure there are SSE types that require >> alignment at the hardware levell, even on the i386 > > This isn't a restriction on stack aligment. It's a restriction on what > kinds of machine types can be accessed on the stack. Well if we have local variables of type float (and we have specified use of SSE), we are in trouble, no? > This is different than the proposed ABI_STACK_BOUNDARY macro which defines > the possibily stricter aligment the ABI requires at function entry. Since > most i386 ABIs don't require a stricter alignment, that has ment that > SSE types couldn't be located on the stack. Currently you can get around > this problem by changing the ABI using -fperferred-stack-boundary or by > forcing an SSE compatible alignment using -mstackrealign or __attribute__ > ((force_align_arg_pointer)). 
Joey Ye's proposal is another solution > to this problem where GCC would automatically force an SSE compatible > aligment when SSE types are used on the stack. right ... > > Ross Ridge From paul@codesourcery.com Wed Dec 19 01:35:00 2007 From: paul@codesourcery.com (Paul Brook) Date: Wed, 19 Dec 2007 01:35:00 -0000 Subject: Regression count, and how to keep bugs around forever In-Reply-To: <20071219012050.GM2908@synopsys.com> References: <571f6b510712181659w64b16ae5ndc32b38de6f5c56c@mail.gmail.com> <200712190111.11959.paul@codesourcery.com> <20071219012050.GM2908@synopsys.com> Message-ID: <200712190125.21231.paul@codesourcery.com> On Wednesday 19 December 2007, Joe Buck wrote: > On Wed, Dec 19, 2007 at 01:11:11AM +0000, Paul Brook wrote: > > > So I'm asking for a policy here that says when it is OK to resolve old > > > bug without progress as WONTFIX or SUSPENDED. Start shooting. > > > > I think this would be a big mistake to reuse an existing state for this. > > But this is pretty much what SUSPENDED means; it means that there's no > intent to work on the bug in the near term. Ok. I did check the GCC bugzilla help pages, and they don't mention SUSPENDED at all :-) Paul From rridge@csclub.uwaterloo.ca Wed Dec 19 01:46:00 2007 From: rridge@csclub.uwaterloo.ca (Ross Ridge) Date: Wed, 19 Dec 2007 01:46:00 -0000 Subject: A proposal to align GCC stack Message-ID: <20071219014621.11CB773CC4@caffeine.csclub.uwaterloo.ca> Robert Dewar writes: >Well if we have local variables of type float (and we have specified >use of SSE), we are in trouble, no? Non-vector SSE instructions, like the ones that operate on scalar floats, don't require memory operands to be aligned. 
Ross Ridge From Joe.Buck@synopsys.COM Wed Dec 19 01:46:00 2007 From: Joe.Buck@synopsys.COM (Joe Buck) Date: Wed, 19 Dec 2007 01:46:00 -0000 Subject: Regression count, and how to keep bugs around forever In-Reply-To: <200712190125.21231.paul@codesourcery.com> References: <571f6b510712181659w64b16ae5ndc32b38de6f5c56c@mail.gmail.com> <200712190111.11959.paul@codesourcery.com> <20071219012050.GM2908@synopsys.com> <200712190125.21231.paul@codesourcery.com> Message-ID: <20071219013529.GN2908@synopsys.com> On Wed, Dec 19, 2007 at 01:25:19AM +0000, Paul Brook wrote: > On Wednesday 19 December 2007, Joe Buck wrote: > > On Wed, Dec 19, 2007 at 01:11:11AM +0000, Paul Brook wrote: > > > > So I'm asking for a policy here that says when it is OK to resolve old > > > > bug without progress as WONTFIX or SUSPENDED. Start shooting. > > > > > > I think this would be a big mistake to reuse an existing state for this. > > > > But this is pretty much what SUSPENDED means; it means that there's no > > intent to work on the bug in the near term. > > Ok. I did check the GCC bugzilla help pages, and they don't mention SUSPENDED > at all :-) Patches welcome, as they say. We have 91 bugs in the SUSPENDED state. Many of them are odd corners of the standard where we're waiting for the appropriate committee to decide what the correct behavior is; others are ancient bugs that we've just agreed to live with. The oldest one relates to the issue of extra FPU precision on x86. From joey.ye@intel.com Wed Dec 19 01:53:00 2007 From: joey.ye@intel.com (Ye, Joey) Date: Wed, 19 Dec 2007 01:53:00 -0000 Subject: A proposal to align GCC stack In-Reply-To: <20071218233126.3EA1973D2A@caffeine.csclub.uwaterloo.ca> References: <20071218233126.3EA1973D2A@caffeine.csclub.uwaterloo.ca> Message-ID: Ross Ridge wrote: > I'm currently using -fpreferred-stack-boundary without any trouble. > Your proposal would in fact generate code to align stack when it's not > necessary. 
This would change the behaviour of -fpreferred-stack-boundary,
> hurting performance and that's unacceptable to me.

This proposal puts correctness first. So when the compiler can't make
sure a function is only called from functions with the same or bigger
preferred-stack-boundary, it will conservatively align the stack. One
optimization is to set INCOMING = PREFERRED for local functions. Do you
think that is more acceptable?

>> Ok, if people are using this flag to change the alignment to something
>> smaller than used by the standard ABI, then INCOMING should be
>> MAX(STACK_BOUNDARY, PREFERRED_STACK_BOUNDARY).
>
> On x86-64, ABI_STACK_BOUNDARY is 16byte, but the Linux kernel may
> want to use 8 byte for PREFERRED_STACK_BOUNDARY. INCOMING will
> be MIN(STACK_BOUNDARY, PREFERRED_STACK_BOUNDARY) == 8 byte.

> Using MAX(STACK_BOUNDARY, PREFERRED_STACK_BOUNDARY) also equals 8 in that
> case and preserves the behaviour -fpreferred-stack-boundary in every case.

I think HJ means MIN(ABI_STACK_BOUNDARY, PREFERRED_STACK_BOUNDARY).
MAX(ABI, PREFERRED) == 16 in this case.

Thanks - Joey

From joseph@codesourcery.com Wed Dec 19 01:58:00 2007
From: joseph@codesourcery.com (Joseph S. Myers)
Date: Wed, 19 Dec 2007 01:58:00 -0000
Subject: Regression count, and how to keep bugs around forever
In-Reply-To: <571f6b510712181659w64b16ae5ndc32b38de6f5c56c@mail.gmail.com>
References: <571f6b510712181659w64b16ae5ndc32b38de6f5c56c@mail.gmail.com>
Message-ID:

On Wed, 19 Dec 2007, Steven Bosscher wrote:

> The bigger issue here, is that people seem to be using Bugzilla as a
> kind-of TODO list for things may some day work on, but probably will

I don't see any problem with that. It records known issues. We can then
use the existing fields, or create new ones, to track how important we
think those issues are. I'd rather have known testsuite failures filed
centrally in Bugzilla than scattered elsewhere.
(Once a testsuite failure is filed, it can be XFAILed with a comment referencing the PR, and the PR remain open until the bug is fixed and the XFAIL removed. In principle I'd like every XFAIL relating to a GCC bug (as opposed to a bug in system libraries etc.) to have an open PR, and any failure present for any length of time to be XFAILed and filed in Bugzilla so the normal test results show no "expected unexpected" FAILs.) > The current list of "All regressions" should be a list of bugs that > people are actively trying to resolve, preferably before the release No, "All regressions" should be a list of all regressions known. A list of bugs people are actively trying to resolve is useful - indeed more useful - but should be called something else. > To me, the situation is quite clear: If a bug report is open for so > long, and even the reporter and the responsible maintainer show no > sign of motivation to work on resolving the bug, I think this tells us > something about how important this bug is: Not important enough to > fix. IMOH we should close such reports as WONTFIX or SUSPENDED to > make them less visible, so that other bug reports don't fall through > the cracks. > > So I'm asking for a policy here that says when it is OK to resolve old > bug without progress as WONTFIX or SUSPENDED. Start shooting. I objected to the notion of WONTFIX six years ago when we were experimenting with Bugzilla, and I object to it now, with only the modification: if a bug relates to a feature that has been removed from trunk, then WONTFIX is reasonable, and likewise if it requests a fix in an old branch for a bug fixed on trunk where we think such a fix is too risky. If a bug requests a feature we decide would be a bad idea, I think INVALID is right but don't really care if you use WONTFIX. If a bug simply hasn't attracted attention, but is still a bug present in the current toolchain, it should be left open. 
Just because one developer won't fix it doesn't justify a declaration that no-one will ever be interested in fixing it. By design, SUSPENDED is an open status rather than a closed one. Using it within the terms of the description in bugs/management.html makes sense - go ahead. -- Joseph S. Myers joseph@codesourcery.com From hjl@lucon.org Wed Dec 19 02:07:00 2007 From: hjl@lucon.org (H.J. Lu) Date: Wed, 19 Dec 2007 02:07:00 -0000 Subject: A proposal to align GCC stack In-Reply-To: <20071218233126.3EA1973D2A@caffeine.csclub.uwaterloo.ca> References: <20071218233126.3EA1973D2A@caffeine.csclub.uwaterloo.ca> Message-ID: <20071219015833.GA15046@lucon.org> On Tue, Dec 18, 2007 at 06:31:26PM -0500, Ross Ridge wrote: > Ross Ridge wrote: > > The -fpreferrred-stack-boundary flag currently generates code that > > assumes the stack aligned to the preferred alignment on function entry. > > If you assume a worse incoming alignment you'll be aligning the stack > > unnecessarily and generating code that this flag doesn't require. > > H.J. Lu writes: > > That is how we get into trouble in the first place. The only place I > > think of where you can guarantee everything is compiled with the same > > -fpreferrred-stack-boundary is kernel. Our proposal will align stack > > only when needed. PREFERRED_STACK_BOUNDARY > ABI_STACK_BOUNDARY will > > generate a largr stack unnecessarily. > > I'm currently using -fpreferred-stack-boundary without any trouble. BTW, it is -mpreferred-stack-boundary. What value did you use for -mpreferred-stack-boundary? The x86 backend defaults to 16byte. The x86-64 psABI specifies 16byte stack alignment. But the ia32 psABI only specifies 4byte stack alignment. That means that the object files generated by gcc may be incompatible with libs or objects compiled by other ia32 psABI confirming compilers. > Your proposal would in fact generate code to align stack when it's not > necessary. 
This would change the behaviour of -fpreferred-stack-boundary, > hurting performance and that's unacceptable to me. > > >> Ok, if people are using this flag to change the alignment to something > >> smaller than used by the standard ABI, then INCOMING should be > >> MAX(STACK_BOUNDARY, PREFERRED_STACK_BOUNDARY). > > > > On x86-64, ABI_STACK_BOUNDARY is 16byte, but the Linux kernel may > > want to use 8 byte for PREFERRED_STACK_BOUNDARY. INCOMING will > > be MIN(STACK_BOUNDARY, PREFERRED_STACK_BOUNDARY) == 8 byte. A typo, I meant "INCOMING will be MIN(ABI_STACK_BOUNDARY, PREFERRED_STACK_BOUNDARY) == 8". > > Using MAX(STACK_BOUNDARY, PREFERRED_STACK_BOUNDARY) also equals 8 in that > case and preserves the behaviour -fpreferred-stack-boundary in every case. STACK_BOUNDARY is the minimum stack boundary. MAX(STACK_BOUNDARY, PREFERRED_STACK_BOUNDARY) == PREFERRED_STACK_BOUNDARY. So the question is if we should assume INCOMING == PREFERRED_STACK_BOUNDARY in all cases: Pros: 1. Keep the current behaviour of -mpreferred-stack-boundary. Cons: 1. The generated code is incompatible with the other object files. I guess we can live with that. H.J. From hjl@lucon.org Wed Dec 19 02:18:00 2007 From: hjl@lucon.org (H.J. Lu) Date: Wed, 19 Dec 2007 02:18:00 -0000 Subject: A proposal to align GCC stack In-Reply-To: <20071218233125.5586173D25@caffeine.csclub.uwaterloo.ca> References: <20071218233125.5586173D25@caffeine.csclub.uwaterloo.ca> Message-ID: <20071219020722.GB15046@lucon.org> On Tue, Dec 18, 2007 at 06:31:25PM -0500, Ross Ridge wrote: > Ye, Joey writes: > >i. STACK_BOUNDARY in bits, which is enforced by hardware, 32 for i386 > >and 64 for x86_64. It is the minimum stack boundary. It is fixed. > > Ross Ridge wrote: > >Strictly speaking by the above definition it would be 8 for i386. > >The hardware doesn't force the stack to be 32-bit aligned, it just > >performs poorly if it isn't. 
>
> Robert Dewar writes:
> >First, although for some types, the accesses may work, the optimizer
> >is allowed to assume that data is properly aligned, and could possibly
> >generate incorrect code ...
>
> That's not enforced by hardware.
>
> >Second, I am pretty sure there are SSE types that require
> >alignment at the hardware level, even on the i386
>
> This isn't a restriction on stack alignment. It's a restriction on what
> kinds of machine types can be accessed on the stack.
>
> As I mentioned later in my message STACK_BOUNDARY shouldn't be defined in
> terms of hardware, but in terms of the ABI. While the i386 allows the
> stack pointer to be set to any value, by convention the stack pointer
> is always kept 4-byte aligned at all times. GCC should never generate
> code that would violate this requirement, even in leaf-functions
> or transitorily during the prologue/epilogue.

From the gcc internals manual:

 -- Macro: STACK_BOUNDARY
     Define this macro to the minimum alignment enforced by hardware
     for the stack pointer on this machine. The definition is a C
     expression for the desired alignment (measured in bits). This
     value is used as a default if `PREFERRED_STACK_BOUNDARY' is not
     defined. On most machines, this should be the same as
     `PARM_BOUNDARY'.

Since x86 always push/pop stack by decrementing/incrementing address
size, it makes sense to define STACK_BOUNDARY as address size. It
has nothing to do with application binary interface (ABI).

>
> This is different than the proposed ABI_STACK_BOUNDARY macro which defines

The proposed ABI_STACK_BOUNDARY defines the value specified by the
various psABIs to which gcc conforms.

> the possibly stricter alignment the ABI requires at function entry. Since
> most i386 ABIs don't require a stricter alignment, that has meant that
> SSE types couldn't be located on the stack.
Currently you can get around > this problem by changing the ABI using -fperferred-stack-boundary or by No, gcc works around by setting ix86_preferred_stack_boundary = 128; by default. > forcing an SSE compatible alignment using -mstackrealign or __attribute__ > ((force_align_arg_pointer)). Joey Ye's proposal is another solution > to this problem where GCC would automatically force an SSE compatible > aligment when SSE types are used on the stack. > Our proposal isn't just "another" solution. It is a solution for generic stack alignment problems. H.J. From Joe.Buck@synopsys.COM Wed Dec 19 03:16:00 2007 From: Joe.Buck@synopsys.COM (Joe Buck) Date: Wed, 19 Dec 2007 03:16:00 -0000 Subject: Regression count, and how to keep bugs around forever In-Reply-To: <20071219013529.GN2908@synopsys.com> References: <571f6b510712181659w64b16ae5ndc32b38de6f5c56c@mail.gmail.com> <200712190111.11959.paul@codesourcery.com> <20071219012050.GM2908@synopsys.com> <200712190125.21231.paul@codesourcery.com> <20071219013529.GN2908@synopsys.com> Message-ID: <20071219021753.GP2908@synopsys.com> On Wed, Dec 19, 2007 at 01:25:19AM +0000, Paul Brook wrote: > > Ok. I did check the GCC bugzilla help pages, and they don't mention SUSPENDED > > at all :-) I wrote: > Patches welcome, as they say. Never mind; see http://gcc.gnu.org/bugs/management.html for when to use SUSPENDED. From rridge@csclub.uwaterloo.ca Wed Dec 19 03:51:00 2007 From: rridge@csclub.uwaterloo.ca (Ross Ridge) Date: Wed, 19 Dec 2007 03:51:00 -0000 Subject: A proposal to align GCC stack Message-ID: <20071219031617.02C7073CC4@caffeine.csclub.uwaterloo.ca> Ross Ridge wrote: > I'm currently using -fpreferred-stack-boundary without any trouble. > Your proposal would in fact generate code to align stack when it's > not necessary. This would change the behaviour of > -fpreferred-stack-boundary, hurting performance and that's unacceptable > to me. Ye, Joey writes: > This proposal values correctness at first place. 
> So when the compiler can't
> make sure a function is only called from functions with the same or bigger
> preferred-stack-boundary, it will conservatively align the stack. One
> optimization is to set INCOMING = PREFERRED for local functions. Do you
> think it more acceptable?

Not really. It might reduce the amount of unnecessary stack adjustment,
but the performance regression would remain. Changing the behaviour of
-fpreferred-stack-boundary doesn't make it more correct. It's supposed to
change the ABI, it works as documented and, yes, if it's misused it will
cause problems. So will any number of GCC's ABI changing options.

Look at it another way. Let's say you were compiling x86_64 code with
-fpreferred-stack-boundary=3, an 8-byte PREFERRED alignment. As you know,
this is different from the standard x86_64 ABI which requires a 16-byte
alignment. Now with your proposal, GCC's behaviour won't change, because
it's safe to assume that incoming stack is at least 8-byte aligned. There
should be no change in the code GCC generates, with or without your
proposal. However, the outgoing stack won't be 16-byte aligned as the
x86_64 ABI requires. In this case, what also doesn't change is the fact
that mixing code compiled with different -fpreferred-stack-boundary values
doesn't work. It's just as problematic and unsafe as it was before.

So when you said "this proposal values correctness at first place", that
really isn't true. The proposal only addresses safety when preferred
alignment is raised from the standard ABI's alignment. You're
conservatively aligning the incoming stack, but not the outgoing stack.
You don't seem to be concerned about the problems that can arise when the
preferred is lowered below the ABI's. Why? My guess is that it's because
"correctness" in this case would cause unacceptable regressions when
compiling the x86_64 Linux kernel.
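For reference, the two candidate rules for INCOMING argued over in this thread can be put side by side. What follows is only a sketch of the arithmetic, not GCC code: the helper names are made up for illustration, and the boundary values (in bits) are the ones quoted in the thread.

```c
#include <assert.h>

/* All boundaries are in bits, as GCC's STACK_BOUNDARY is.  */
static unsigned min_u (unsigned a, unsigned b) { return a < b ? a : b; }
static unsigned max_u (unsigned a, unsigned b) { return a > b ? a : b; }

/* -mpreferred-stack-boundary=N asks for a 2**N byte boundary.  */
static unsigned
preferred_bits (unsigned n)
{
  assert (n < 28);		/* keep the shift inside unsigned range */
  return 8u << n;
}

/* Rule from the proposal as corrected by H.J. Lu:
   INCOMING = MIN (ABI_STACK_BOUNDARY, PREFERRED_STACK_BOUNDARY).  */
static unsigned
incoming_min_rule (unsigned abi, unsigned preferred)
{
  return min_u (abi, preferred);
}

/* Rule suggested by Ross Ridge:
   INCOMING = MAX (STACK_BOUNDARY, PREFERRED_STACK_BOUNDARY).  */
static unsigned
incoming_max_rule (unsigned stack_boundary, unsigned preferred)
{
  return max_u (stack_boundary, preferred);
}
```

With the x86-64 kernel settings (STACK_BOUNDARY 64, ABI 128, -mpreferred-stack-boundary=3) both rules give 64. They only disagree when the preferred boundary exceeds the ABI's, as with the i386 default of 128 against a 32-bit psABI boundary: the MIN rule assumes 32 on entry, the MAX rule assumes 128.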
If you can understand why it would be unacceptable to change how -fpreferred-stack-boundary behaves when compiling the Linux kernel, then maybe you can understand why I don't find it acceptable for it to change when compiling my code. Ross Ridge From aoliva@redhat.com Wed Dec 19 04:30:00 2007 From: aoliva@redhat.com (Alexandre Oliva) Date: Wed, 19 Dec 2007 04:30:00 -0000 Subject: Designs for better debug info in GCC In-Reply-To: (Ian Lance Taylor's message of "18 Dec 2007 08\:13\:55 -0800") References: <4737BF2C.70408@codesourcery.com> <47388599.2040701@codesourcery.com> <4749DE66.1090602@codesourcery.com> <4756B02D.9010302@google.com> <4aca3dc20712151903r46c9eceane35edb92d08240ac@mail.gmail.com> <4aca3dc20712161712w1133fb96qd66be0e9a0bb1716@mail.gmail.com> <4766B8E5.60500@google.com> <4766DF5C.1020802@google.com> <47671BF4.5050704@google.com> Message-ID: On Dec 18, 2007, Ian Lance Taylor wrote: > Alexandre Oliva writes: >> A plan to fix local variable debug information in GCC >> >> by Alexandre Oliva >> >> 2007-12-18 draft > Thank you for writing this. It makes an enormous difference. NP. Thanks for the encouragement. >> == Goals > I note that you don't say anything about the other big problem with > debugging optimized code, which is that the debugger jumps around all > over the place. Yep, it's a separate project, that I'm somewhat interested in, and maybe somewhat easy to fix with judicious use of is_stmt notes, but it's not my top priority ATM. >> Once this is established, a possible representation becomes almost >> obvious: statements (in trees) or instructions (in rtl) that assert, >> to the variable tracker, that a user variable or member is represented >> by a given expression: >> >> # DEBUG var expr >> >> By var, we mean a tree expression that denotes a user variable, for >> now. We envision trivially extending it to support components of >> variables in the future. > While you say that this is almost obvious, it still isn't obvious at > all to me. 
You consider trees and RTL together, but I don't see why > that is appropriate. You snipped (skipped?) one aspect of the reasoning on why it is appropriate. Of course this doesn't prove it's the best possibility, but I haven't seen evidence of why it isn't. > My biggest concern at the tree level is the significantly increased > memory usage One of the first measurements we had from my code was from Richi, who said it didn't increase it too much. > and the introduction of a sort of a weak pointer to > values. Since DEBUG statements shouldn't interfere with > optimizations, we need to explicitly ignore them in things like > has_single_use. That's probably the easiest part, and it's already done. > But since our data structures need to be coherent, we can not ignore > them when we actually eliminate SSA names. That seems sort of > complicated. It's not. The code to do this is ready. After I got bootstrap-debug to pass on x86_64-linux-gnu, I don't recall needing any further changes in the tree passes for i386-linux-gnu, and none of the ia64-linux-gnu or ppc64-linux-gnu fixes I've made so far (most to their machine-dependent schedulers) required changes in the tree passes either. So, we can safely count that as easy and maintainable. Looking at the patches in the vta branch for the tree infrastructure will give you a very good idea of the involved effort. > In SSA form it seems very natural to provide a set of associations > with user variables for each GIMPLE variable. Yes. This provides for a simple AND WRONG representation (but not hopeless, see below, after the sample code). We went through some of this already. You can't recover the information with something that throws away information about the point of assignment. Even the basic block of assignment is lost. You can't generate correct debug information with this. 
The limitation of approaches like this is addressed in passing in the
examples, but I didn't want to carry discussions about broken designs
that I thought we'd already left behind into the concise design
document.

> Since the GIMPLE variables never change, these associations never
> change. We have to get them right when we create a new GIMPLE
> variable and when we eliminate a GIMPLE variable.

Maybe you can show us how to represent the annotations for the two
trivial examples I've chosen in the paper, to show that the compiler
can stand a chance of generating correct debug information.

> Of course this means that we are keeping the debug information in a
> reversed form.

This is not such a big deal; it would just lose some completeness,
and it would probably carry around lots of useless notes. The real
problem is that it loses essential information for correct debug
information generation.

> Instead of saying that a user variable is associated with an
> expression in terms of GIMPLE variables, we will say that a GIMPLE
> variable is associated with an expression in terms of user
> variables.

Let me see if I understand what you have in mind. Given:

int f(int x, int y) {
  int i, j;

  probe1();
  i = x;
  j = y;
  probe2();
  if (x < y)
    i += y;
  else
    j -= x;
  probe3();
  return g (i, j);
}

we'd SSAify it into something like:

int f(int x, int y) {
  int i;
  int j;
  int T;

  probe1();
  i_0 = x_1(D); /* i */
  j_2 = y_3(D); /* j */
  probe2();
  if (x_1(D) < y_3(D))
    i_4 = i_0 + y_3(D); /* i */
  else
    j_5 = j_2 - x_1(D); /* j */
  i_6 = PHI /* i */
  j_7 = PHI /* j */
  probe3();
  T_8 = g (i_6, j_7);
  return T_8;
}

And I can see that setting breakpoints at the probe points would get
you correct values for i and j. In fact, these annotations, so far,
are no different from what we already have today.

But then, if we optimize this just a little bit, I can't quite tell
what we'd get to enable correct debug information:

int f(int x, int y) {
  int i;
  int j;
  int T;

  probe1();
  /* p1: ??? i, j */
  probe2();
  if (x_1(D) < y_3(D))
    i_4 = x_1(D) + y_3(D); /* i */
  else
    j_5 = y_3(D) - x_1(D); /* j */
  i_6 = PHI /* i */
  j_7 = PHI /* j */
  probe3();
  T_8 = g (i_6, j_7);
  return T_8;
}

Now, if you tell me that information about i_0 and j_2 is
backward-propagated to the top of the function, where x and y are set
up, I introduce say zero-initialization for i and j before probe1()
(an actual function call, mind you), and then this representation is
provably broken.

And, if you tell me that you just discard that information, then at
probe2() the variables will appear to be uninitialized (or
zero-initialized after the change), and again the representation is
wrong.

If you tell me that you keep notes at those points to tell debug
information that at probe2() both variables have unknown values, then
you may get correct debug information, but you're willfully making it
incomplete for an extremely common scenario (this example is
intentionally made similar to a scenario after one pass of inlining
into f, where i and j were former arguments to the inlined function).

If you tell me that you keep notes at that point that indicate the
expected values of i and j, then you've reached the representation I
propose.

If you tell me you keep different notes between probe1() and probe2(),
that just tell the point at which i and j receive the values of x and
y, but the annotations are still attached to the SSA assignment, then
this stands a chance of generating correct debug information.
Something like:

  x_1(D) /* x starting at entry point, and also i starting at p1 */
  y_3(D) /* y starting at entry point, and also j starting at p1 */

Maybe these annotations interspersed in the code might be easier to
handle. I hadn't considered this before. It's worth investigating.

But I still haven't got your proposal entirely clear. I don't quite
see how this would handle transformations other than trivial
substitutions.
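One way to pin down what "correct values for i and j" means at each probe is to write the expectation out explicitly. The sketch below is an illustration only: the `binding` and `view` types and `expected_view` are hypothetical names, and treating i and j as unavailable at probe1() is an assumption (in the unoptimized program they are merely uninitialized there).

```c
#include <assert.h>
#include <stdbool.h>

/* What a debugger should report for one user variable at a probe.  */
struct binding
{
  bool available;	/* false: no value is bound at this point */
  int value;		/* meaningful only when available is true */
};

struct view
{
  struct binding i;
  struct binding j;
};

/* Expected bindings of i and j at probe1(), probe2() and probe3() of
   f (x, y), derived from the unoptimized source.  PROBE is 1, 2 or 3.  */
static struct view
expected_view (int probe, int x, int y)
{
  struct view v = { { false, 0 }, { false, 0 } };

  assert (probe >= 1 && probe <= 3);

  if (probe >= 2)
    {
      /* After "i = x; j = y;".  */
      v.i = (struct binding) { true, x };
      v.j = (struct binding) { true, y };
    }
  if (probe >= 3)
    {
      /* After the conditional: exactly one of i, j is updated.  */
      if (x < y)
	v.i.value += y;
      else
	v.j.value -= x;
    }
  return v;
}
```

Whatever representation is chosen, the optimized f should still let a debugger report these bindings at probe2() and probe3(), or report the variable as unavailable, but never report a different value.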
Can you perhaps give examples of how you'd get from trivial
annotations to more complex, potentially ambiguous expressions, as
optimization passes make complex transformations? Maybe what you have
in mind is something along the lines of induction variables, that loop
optimizers would have to annotate explicitly, is that so?

> It is of course true that optimized code will move around
> unpredictably, and your proposal doesn't handle that.

It handles that in that a variable will be regarded as being assigned
to a value when execution crosses the debug stmt/insn originally
inserted right after the assignment. This is by design, but I realize
now I forgot to mention this in the design document.

The idea is that debug insns get high priority in scheduling.
However, since they mention the assignment just before them, if the
assignment is just moved earlier, without an intervening scheduling
barrier, then the debug instruction will follow it. If the assignment
is removed, then the debug insn can legitimately be moved up to the
point where the assignment, if remaining, might have been moved up to.

However, if the assignment is moved to a separate basic block, say out
of a loop or a conditional, then we don't want the debug insn to move
with it, so that hoisting and commonizing are regarded as setting
temporaries, and the value is only "committed" to the variable if we
get to the point where the assignment would take place. Neat, eh?

I'll add something to this effect to the design document.

> I don't see it as a flaw that it will be possible to view user
> variables outside of their source code range.

Agreed. Extending the range of a (variable value) binding to a point
in which the variable wouldn't exist (yet or any more) without
optimization is fine, but extending the range of such a binding across
an assignment, even an optimized-away one, isn't.

> It's not obvious to me why a DEBUG insn is superior to a REG_NOTE
> attached to an insn.
Mainly because we won't want to always move the note along with the insn. A REG_NOTE isn't unambiguous for parallel sets, but there are ways around that. As written in the document, combining the debug annotation with an assignment is doable and not discarded from the plan, but at some point the note may need to be detached, and then it's not clear to me that the potential memory savings of this combination are worth the additional maintenance burden of splitting them out on demand, which is my greatest concern. On top of that, after splitting, all the maintenance burden (no matter how small) of dealing with stand-alone debug annotations would have to be undertaken anyway, so it appears to me that the combination would just add complexity. But then again, I'm not sure about it, so I haven't ruled it out; the design is open to it. > The problem with DEBUG insns is of course that the RTL code > is very sensitive to new insns, and also the additional memory usage. > You discuss those, but it's not obvious to me why your proposed > solution is the best one. I can't assert it's the best, no matter how hard I've worked on this design. I've presented my thoughts (or at least as many of them as I could remember; I may have forgotten some along the way ;-), and I've shown why other designs presented before didn't solve the problem I had to solve, as far as I could tell. Your annotations along with the point-marking notes are an approach I hadn't considered before, and I'm pretty sure I don't quite follow how this would work to the fullest extect, but on first sight it appears to me that it might work. So let's look further into it. >> Testing for accuracy and completeness of debug information can be best >> accomplished using a debugging environment. > Of course this is very unsatisfactory without an automated testsuite. Err... I didn't say the testing through a debugging environment wouldn't be automated. 
My plan is to use something along the lines of the GDB testsuite scripts, but whether to use GDB or some other debugging or monitoring infrastructure is a tiny implementation detail that I haven't worried about at all. The basic idea is to script the inspection of variables and verify that the obtained values are the expected ones, or that variables are defensibly unavailable at the inspection points. -- Alexandre Oliva http://www.lsd.ic.unicamp.br/~oliva/ FSF Latin America Board Member http://www.fsfla.org/ Red Hat Compiler Engineer aoliva@{redhat.com, gcc.gnu.org} Free Software Evangelist oliva@{lsd.ic.unicamp.br, gnu.org} From aoliva@redhat.com Wed Dec 19 04:35:00 2007 From: aoliva@redhat.com (Alexandre Oliva) Date: Wed, 19 Dec 2007 04:35:00 -0000 Subject: Designs for better debug info in GCC In-Reply-To: <4aca3dc20712181519rb637c16oea78bbcc18abe097@mail.gmail.com> (Daniel Berlin's message of "Tue\, 18 Dec 2007 18\:19\:31 -0500") References: <4aca3dc20712161712w1133fb96qd66be0e9a0bb1716@mail.gmail.com> <4766B8E5.60500@google.com> <4766DF5C.1020802@google.com> <47671BF4.5050704@google.com> <4aca3dc20712181519rb637c16oea78bbcc18abe097@mail.gmail.com> Message-ID: On Dec 18, 2007, "Daniel Berlin" wrote: >> int c = z; >> whatever0(c); >> c = x; > Because you have added information you have no way of knowing. > How exactly did you compute that the call *definitely sets c to the > value of z_0*, and definitely sets the value of c to x_2. Err... I guess you're thinking memory, global variables, alias analysis and that sort of stuff. None of this applies to gimple registers, which is all the annotations are about. Yes, aliasing, memory references and must- and may-alias do play a role at the time of turning the annotations into equivalence classes, when memory locations that are not stack slots allocated to gimple regs that couldn't get hardware registers show up in the equivalence classes. 
These don't seem too hard to handle conservatively (removing even may-alias assignment destinations from equivalence classes, as well as non-local memory references at function calls and volatile asms), at the expense of incompleteness in debug information, or in a more lax way, at the potential expense of correctness. I still don't know exactly where to draw the line here, this note-propagation algorithm is one that I haven't completely figured out yet. > However, value equivalene does not imply location equivalence, and all > of our debug formats deal with locations of variables, except for > constants. Dwarf enables arbitrary value expressions too. There's some discussion about lvalue vs rvalue in the document, and this is also something that will take some experimenting. I'm not entirely sure where to draw the line, and I'm not entirely sure there is a perfect answer. For example, consider that a variable's home is a stack slot, but for a loop in which it's not modified, it's held in a register. Clearly in this case the correct representation is for the variable to be in both locations, both as lvalues. But if the variable is further copied to other variables or locations, these additoinal locations probably shouldn't be regarded as the same variable any more; at most, as rvalues, but maybe not even that. And then, if for some particular instruction, the variable in the register needs to be copied to a different register class, then it is correct to state that, between the copy and the use, the variable is held in all three locations. I'm still trying to figure out how to deal with overlaps between variables, deciding whether locations are to be handled as lvalues or rvalues, this sort of stuff. It is indeed a difficult problem. > IE If you translate this directly into DWARF3, as written, you will > claim that c and x_4 has the same location (since dwarf does not let > you say "it has the same value as x, but not the same location), Yeah. 
The $1M question is, when two variables are coalesced into one, does this mean we now have two variables sharing the same location, or do we just use the rvalue of one (which?) for the other? Isn't this like talking about body and spirit of variables? After optimization, I'm not even sure that talking about location (body) of variables make much sense. An important part of the design process was to distinguish between source-level variables and implementation-level variables. Our naming of stack slots or pseudos as variables is just a mnemonic artifact for us compiler engineers, to simplify debugging. Which variables they actually represent depends a lot on optimization decisions, perhaps even more than on the original code. So I talk about binding a source-level variable to a value, rather than to a location. Then, we figure out the locations that hold the value, what other variables do, how they overlap, maybe how they're used, and then figure out which locations should be assigned to each source variable. Tricky. The only certainty I have right now is that the annotations I've proposed enable us to keep track of values. Distributing locations in equivalence classes to different user variables is an open problem, and there are various possible solutions that could make sense, and that would be arguably correct. > if all you want is the values you compute above, on SSA, you can > easily use a lattice to compute the same values you are going to > compute as you update the annotations on the fly. This sounds interesting, but I don't quite follow what you mean. Can you elaborate, maybe give some examples? > Tracking which values *definitely represent user values* is actually > quite easy at the tree level, and doesn't require any IR modification. But is the binding of user variables to user values for specified ranges part of this representation too? I don't see that it is, and this is the gap I'm trying to fill with the debug annotations. 
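As a rough picture of what "binding user variables to values for specified ranges" could look like once the annotations have been processed, here is a minimal sketch loosely modelled on a DWARF location list. The structure, the program-point numbering and the table for c are all hypothetical; only the names z_0 and x_2 come from the example quoted earlier in this message.

```c
#include <assert.h>
#include <stddef.h>

/* A half-open range of program points [begin, end) over which a user
   variable is bound to VALUE, written here as a printable expression.  */
struct range_binding
{
  unsigned begin;
  unsigned end;
  const char *value;
};

/* Hypothetical program points for the quoted fragment:
     0: before "int c = z;"   1: after "int c = z;"
     2: at "whatever0(c);"    3: after "c = x;"   4: end of scope.  */
static const struct range_binding bindings_for_c[] = {
  { 1, 3, "z_0" },
  { 3, 4, "x_2" },
};

/* Return the expression bound at program point PC, or NULL when no
   range covers PC, meaning the variable is unavailable there.  */
static const char *
lookup_binding (const struct range_binding *list, size_t n, unsigned pc)
{
  assert (list != NULL || n == 0);

  for (size_t k = 0; k < n; k++)
    if (list[k].begin <= pc && pc < list[k].end)
      return list[k].value;
  return NULL;
}
```

A lookup at point 2 yields "z_0" and at point 3 yields "x_2", while points 0 and 4 fall outside every range and yield NULL. That mirrors the idea that a range with no entry simply means no value is known for the variable there.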
> It may be worth doing at the RTL level, however, where the solution > requires making up program points at each definition site and > computing the dataflow problem in terms of them. /me mumbles something about RTL-SSA, that Jeff Law started working on before we took this turn into Tree-SSA. I'm sort of having to introduce some limited form of SSA in RTL to infer global equivalence classes out of the annotations, in the RTL var-tracking pass. Fun... If only we had sticked to a single IR... (No personal preference, I like both, but I'd rather not have to duplicate work so as to deal with both) -- Alexandre Oliva http://www.lsd.ic.unicamp.br/~oliva/ FSF Latin America Board Member http://www.fsfla.org/ Red Hat Compiler Engineer aoliva@{redhat.com, gcc.gnu.org} Free Software Evangelist oliva@{lsd.ic.unicamp.br, gnu.org} From eweddington@cso.atmel.com Wed Dec 19 05:15:00 2007 From: eweddington@cso.atmel.com (Weddington, Eric) Date: Wed, 19 Dec 2007 05:15:00 -0000 Subject: Regression count, and how to keep bugs around forever In-Reply-To: <571f6b510712181659w64b16ae5ndc32b38de6f5c56c@mail.gmail.com> References: <571f6b510712181659w64b16ae5ndc32b38de6f5c56c@mail.gmail.com> Message-ID: <258DDD1F44B6ED4AAFD4370847CF58D55034F5@csomb01.corp.atmel.com> > -----Original Message----- > From: Steven Bosscher [mailto:stevenb.gcc@gmail.com] > Sent: Tuesday, December 18, 2007 6:00 PM > To: GCC > Cc: hp@gcc.gnu.org > Subject: Regression count, and how to keep bugs around forever > > Maybe it is just me, but I find it very annoying to have to wade > through long bug lists, so I just don't do this. Instead I just don't > look at P4/P5 regressions anymore at all. It's just too much trouble > to find a bug report where the reporter or the target maintainer cares > as much as you do about resolving the bug. I understand your frustration. I know that some AVR target bugs might fit in your category that you describe. 
However a few of us are working to resolve the situation at least for the AVR target. If you any problems, issues, complaints, or patches ;-) about/for the AVR target, please let me know so they can be addressed. Thanks. Eric Weddington Product Manager Atmel From ddaney@avtrex.com Wed Dec 19 05:16:00 2007 From: ddaney@avtrex.com (David Daney) Date: Wed, 19 Dec 2007 05:16:00 -0000 Subject: Regression count, and how to keep bugs around forever In-Reply-To: <20071219021753.GP2908@synopsys.com> References: <571f6b510712181659w64b16ae5ndc32b38de6f5c56c@mail.gmail.com> <200712190111.11959.paul@codesourcery.com> <20071219012050.GM2908@synopsys.com> <200712190125.21231.paul@codesourcery.com> <20071219013529.GN2908@synopsys.com> <20071219021753.GP2908@synopsys.com> Message-ID: <4768A8FB.2020904@avtrex.com> Joe Buck wrote: > On Wed, Dec 19, 2007 at 01:25:19AM +0000, Paul Brook wrote: > >>> Ok. I did check the GCC bugzilla help pages, and they don't mention SUSPENDED >>> at all :-) >>> > > I wrote: > >> Patches welcome, as they say. >> > > Never mind; see > http://gcc.gnu.org/bugs/management.html > > for when to use SUSPENDED. > > Perhaps we need a new state like: DORMANT -- Like new, but nobody cares enough to do anything about it. Really this is like NEW P5, so perhaps no new state is needed. 
David Daney From aoliva@redhat.com Wed Dec 19 06:07:00 2007 From: aoliva@redhat.com (Alexandre Oliva) Date: Wed, 19 Dec 2007 06:07:00 -0000 Subject: Designs for better debug info in GCC In-Reply-To: <4aca3dc20712181415y3d5c3717s6d73b1335b311313@mail.gmail.com> (Daniel Berlin's message of "Tue\, 18 Dec 2007 17\:15\:48 -0500") References: <4aca3dc20712161712w1133fb96qd66be0e9a0bb1716@mail.gmail.com> <4766B8E5.60500@google.com> <4766DF5C.1020802@google.com> <47671BF4.5050704@google.com> <4aca3dc20712181415y3d5c3717s6d73b1335b311313@mail.gmail.com> Message-ID: On Dec 18, 2007, "Daniel Berlin" wrote: > Consider PRE alone, > If your debug statement strategy is "move debug statements when we > insert code that is equivalent" Move? Debug statements don't move, in general. I'm not sure what you have in mind, but I sense some disconnect here. > because our equivalence is based on value equivalence, not location > equivalence. We only guarantee it has the same value as the > whatever it is a copy of at that point, not that it has the same > location. This sounds perfect to me. I'm concerned about values. Locations are an implementation detail. The thing to keep in mind is that what was originally a single user variable may end up mangloptimized into multiple stack slots, registers, with multiple simultaneously-live versions. Trying to pretend that any of these represent the user variable sounds like a recipe for madness to me. So I focus on values instead, and then on trying to recover locations based on binding and sharing of values. > How do i say debug info for some variable is now dead, we have no idea > what it is right now? For annotations, look for VAR_DEBUG_VALUE_NOVALUE in tree.h and VAR_LOC_UNKNOWN_P in rtl.h, in the VTA branch. For dwarf location lists, you just refrain from emitting locations for a given range. > How do I figure out which debug statements need to be modified when > you introduce new memory operations? None. 
By definition, debug annotations are only about variables that are not addressable. Those that are are fixed at a single location, so there's no reason to track them in a fancy way. > If i insert a new call > DEBUG(x, x_3): 1 > x_3 = x > foo() // May modify x and *&x) > y = x_3 > Now you have two problems. You're talking about a real problem, but your example is misguided. Let me give you a real problem scenario. (set (reg ) ()) (var_location x (reg )) (set (mem ) (reg )) (set (reg ) ()) (call (mem (symbol_ref foo))) So, at the var_location debug_insn, we know that x is in reg . That's stored at *addr, so now we might be able to use it as an additional location for x. And then, when reg is modified, we remove T from the equivalence class, and then only location holding the value of x is *addr. Then, a function call, that might modify *addr. So, do we decide that x is no longer available after the call, or do we hope *addr still represents it? The thing to remember is that the annotations are only about gimple regs. This means calls don't modify them, ever. But we still have to decide whether *addr represents x or not. My thoughts are leaning towards looking at the memory address or other memory attributes to tell whether it's an addressable stack slot or not. If it's addressable, remove it from the equivalence class at the call, so the equivalence class becomes empty, and the variable is regarded as dead. If it's not addressable (a pseudo assigned to memory), then we can keep it, even if x is actually dead past the call. What we'll see is that, if x is not dead after the call, the compiler will arrange to preserve its value in one such local non-addressable stack slot, and it will probably extend the equivalence class again after the call, as the pseudo is restored. Or the pseudo will be temporarily assigned to a call-saved register, which, for being call-saved, won't be removed from equivalence classes at call instructions. 
Whereas, if x is dead and its value was just copied to some random memory location, then we may as well flag it as dead at the call site, where the memory location may be modified.

So, it all works out nicely, because we know we're only dealing with gimple regs.

volatile asms make this slightly trickier, because they're totally unpredictable. I'm thinking it's safe to simply remove addressable memory locations from equivalence classes at them, just for safety, but I don't have it completely figured out.

> #3 is a dataflow problem, and not something you want to do every time
> you insert a call.

I'm not sure what you mean by "inserting calls". We don't do that. Calls are present in the source code (even when implied by stuff like TLS, OpenMP or builtins such as memcpy), and they're either kept around, eliminated or inlined.

(digression intended to be funny: this "inserting a call" discussion reminds me of those impossible initial conditions in electromagnetism textbook exercises, such as uniform magnetic fields in which charged particles suddenly appear ;-)

> If your answer is #1 or #2, then what you are really doing is
> computing roughly the same dataflow problem var-location does, except
> on trees and with a different meet-operation.

I am actually computing the same dataflow problem of var-tracking. That's the whole point. But I'm giving it more information, to enable it to track more variables. And it needs to deal with multiple concurrent locations for the same variable, and multiple variables in the same locations, which are "slight" complications. But you're right, in the end it's the same problem.

But I'm not computing that in trees. I'm just collecting and maintaining data points for var-tracking, all the way from the tree level.

> var-location generates incorrect info not because it represents
> something fundamentally different than you are (it doesn't), it falls
> down because it uses union as the meet operation.
> It says "oh, i don't know which of these locations is right, it must > be both of them". However, it can't deal with parallel locations, so this is at odds with your statement. I haven't got 'round to studying the exact dataflow algorithm var-tracking uses, I just figured I needed to do something along these lines. Maybe it does need tweaking, if I end up using it. I'm not sure yet it's going to make sense to use it for the more detailed tracking of copying that I'm going to have to do. > If you changed the meet operation to "oh, i don't know which of these > locations is right, it must be none of them", and did a little more > work you would inference the same info as yours *at the tree level* Intersection sounds like the right approach to me. I assumed var-tracking did this, except for unknowns. It's a bit trickier than this because var-tracking has to deal with a lot of incomplete information. But at least for vta values, we are going to have a complete picture, so we can be stricter when it comes to gimple reg variables. Now, whether the fact that we could infer the very same values at the tree level is relevant, I don't know. The tree level is neither source level nor the final executable code, so unless we can establish useful mappings from the tree level to both source level and final executable code, this information is of little use, no matter how true it is. > Nothing you have proposed is fundamentally going to give you better info. Except for what tree transformations currently discard, such as the points of the program in which variables are bound to values. This is indeed the one of the elements that the annotations are trying to preserve, that the compiler has not cared about preserving. 
(The other being expressions that end up not computed at run time, but that could still be computed by a debugger based on state available elsewhere) > All you have done is annotated the IR in some places to make explicit > some bits in the dataflow problem that you could inference anyway. Now, this is not true. I could infer values, yes, but I couldn't infer the variables they relate to, nor the point of binding. And debug information is not just about the values, it's about mapping variables to values and locations. So, we can't infer all the information we need. > There is absolutely no reason what you are trying to do needs to > modify the tree IR at all to achieve exactly the same accuracy of > debug info as your design proposes at the tree level. So far these claims have been unconvincing. I still get the feeling that you're missing some aspects of the problem, but I invite you to show me how the information available in the current IR could be used to generate accurate debug information for the two examples in the design document. Even if we leave the RTL aspect of it aside for a moment. I certainly wouldn't mind having to generate annotations only when we move from Trees to RTL, but I can't imagine how we'd reintroduce bindings at points that are not marked in the tree level, for variables that are (partially or entirely) gone from the tree IR. 
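The union-versus-intersection point about the meet operation, discussed earlier in this message, can be sketched in a few lines of C. This is only an illustration under stated assumptions: the bit-set encoding and the names loc_set, loc_bit, meet_union and meet_intersection are hypothetical and are not var-tracking's actual data structures.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical encoding: one bit per candidate location of a variable. */
typedef uint32_t loc_set;

/* Return the set containing only location number `index`. */
loc_set loc_bit(unsigned index)
{
    assert(index < 32);
    return (loc_set)1 << index;
}

/* Meet by union: a location survives the join if it was valid on
   either incoming path, so it may be wrong on the other path. */
loc_set meet_union(loc_set a, loc_set b)
{
    return a | b;
}

/* Meet by intersection: a location survives the join only if it was
   valid on every incoming path. */
loc_set meet_intersection(loc_set a, loc_set b)
{
    return a & b;
}
```

If one path says the variable lives in a register and a stack slot, and the other path says only the register, the union keeps both locations while the intersection keeps only the register. An empty result corresponds to "no known location" at the join.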
-- Alexandre Oliva http://www.lsd.ic.unicamp.br/~oliva/ FSF Latin America Board Member http://www.fsfla.org/ Red Hat Compiler Engineer aoliva@{redhat.com, gcc.gnu.org} Free Software Evangelist oliva@{lsd.ic.unicamp.br, gnu.org} From dberlin@dberlin.org Wed Dec 19 06:18:00 2007 From: dberlin@dberlin.org (Daniel Berlin) Date: Wed, 19 Dec 2007 06:18:00 -0000 Subject: Designs for better debug info in GCC In-Reply-To: References: <4766B8E5.60500@google.com> <4766DF5C.1020802@google.com> <47671BF4.5050704@google.com> <4aca3dc20712181415y3d5c3717s6d73b1335b311313@mail.gmail.com> Message-ID: <4aca3dc20712182207y648f7bbhab9e0af8ad2ff832@mail.gmail.com> On 12/19/07, Alexandre Oliva wrote: > On Dec 18, 2007, "Daniel Berlin" wrote: > > > Consider PRE alone, > > > If your debug statement strategy is "move debug statements when we > > insert code that is equivalent" > > Move? Debug statements don't move, in general. I'm not sure what you > have in mind, but I sense some disconnect here. OKay, so if you aren't going to move them, you have to erase them when you move statements around. > > > because our equivalence is based on value equivalence, not location > > equivalence. We only guarantee it has the same value as the > > whatever it is a copy of at that point, not that it has the same > > location. This is just a problem with an initial state and some propagation at each statement. How were you going to generate the initial set of debug annotations? This is how you get your initial state for your dataflow problem How were you going to update it if you saw a statement was updated to say x_5 = x_4 instead of x_5 = x_3 + x_2. The same operation you perform to update your annotations when you see x_5 = x_4 works whether you started with x_5 = x_3 + x_2 or not (it better, or else your updating will give different results for the same IR depending on how you got there, which is *incredibly* bad). 
So then how will using your debug annotations and updating them come out any different than say performing a value numbering pass where you also associate user variables with the ssa names (IE alongside our value numbers), and propagate them around as well?

If you want to associate multiple user variables with a single SSA definition point, you can do that as well (use union instead of copy). You can do whatever you think is best at phi nodes (empty set if user var sets are not equal, or union them or intersect them).

At the end, you could emit DEBUG(user var, ssa name) right after each SSA_NAME_DEF_STMT for all user vars in the user var set for ssa name.

The right DEBUG statements would then appear at the points you can guarantee the user variable has the same *value* as the gimple register you've said it does.

From there, it is up to you to do what you like with the result.

(it's late, so i may have described/calculated the dataflow problem backwards, but you get the idea)

This is, after all, more or less what PRE does for its value numbering. It computes which things have the same value at what points in the program, then uses this after computing some more dataflow problems that say where this implies reuse.

I don't see why you believe user variables/bindings are special and can't be propagated in this manner, given that you can't depend on the type of statement change that has occurred, only what the IR looks like after the statement change.

Otherwise, again, the same IR and source may have different debug annotations depending on the set of changes you applied to get that IR from the initial IR, which is not good for the standard reasons [maintainability, determinism, reproducibility, etc].

>
> > #3 is a dataflow problem, and not something you want to do every time
> > you insert a call.
>
> I'm not sure what you mean by "inserting calls". We don't do that.

Sure we do.
We will definitely insert new calls when we PRE const/pure calls, or calls we determine to be movable to the point we want to move them (using call clobbered results, etc).

This will insert calls in latch blocks, above loops, in branch conditions.

This is not just movement. It is insertion of calls that did not exist in the source code at a given point, but are allowed to be executed at that point in the source code anyway.

> Calls are present in the source code (even when implied by stuff like
> TLS, OpenMP or builtins such as memcpy), and they're either kept
> around, eliminated or inlined.

No, we can and will insert new calls. Not just for PRE, but for profiling, devirtualization, struct reorg, SRA, etc.

struct reorg inserts new mallocs and frees
profiling inserts profiling calls
devirt will insert branches and new calls to replace virtual function calls
SRA will insert memcpys to and from structures that were not there in user source before.

i could go on if you like.

I'm not sure why you believe all the calls that we end up with in the IR are actually in the source (or even implied by it).

>
> But I'm not computing that in trees. I'm just collecting and
> maintaining data points for var-tracking, all the way from the tree
> level.

Okay, then for trees, why bother tracking it when you can compute it right before translation with the same accuracy you can if you update it every time you make statement changes?

>
> > All you have done is annotated the IR in some places to make explicit
> > some bits in the dataflow problem that you could inference anyway.
>
> Now, this is not true. I could infer values, yes, but I couldn't
> infer the variables they relate to, nor the point of binding

See above.

> And
> debug information is not just about the values, it's about mapping
> variables to values and locations.

You have no locations at the tree level, and i've explicitly said what i said applies to the tree level :)

> So, we can't infer all the
> information we need.
Again, i believe we can at the tree level.

From rridge@csclub.uwaterloo.ca Wed Dec 19 09:13:00 2007
From: rridge@csclub.uwaterloo.ca (Ross Ridge)
Date: Wed, 19 Dec 2007 09:13:00 -0000
Subject: A proposal to align GCC stack
Message-ID: <20071219091255.400BF73CC4@caffeine.csclub.uwaterloo.ca>

Ross Ridge writes:
> As I mentioned later in my message STACK_BOUNDARY shouldn't be defined in
> terms of hardware, but in terms of the ABI. While the i386 allows the
> stack pointer to be set to any value, by convention the stack pointer
> is always kept 4-byte aligned at all times. GCC should never generate
> code that would violate this requirement, even in leaf-functions
> or transitorily during the prologue/epilogue.

H.J. Lu writes:
> From gcc internal manual

I'm suggesting a different definition of STACK_BOUNDARY which wouldn't, if strictly followed, result in STACK_BOUNDARY being defined as 8 on the i386. The i386 hardware doesn't enforce a minimum alignment on the stack pointer.

> Since x86 always push/pop stack by decrementing/incrementing address
> size, it makes sense to define STACK_BOUNDARY as address size.

The i386 PUSH and POP instructions adjust the stack pointer by the operand size of the instruction. The address size of the instruction has no effect. For example, GCC should never generate code like this:

        pushw $0
        pushw %ax

because the stack is temporarily misaligned. This could result in a signal, trap, interrupt or other asynchronous handler using a misaligned stack.

In the context of your proposal, defining STACK_BOUNDARY this way, as a requirement imposed on GCC by an ABI (or at least by convention), not the hardware, is important. Without an ABI requirement, there's nothing that would prohibit an i386 leaf function from adjusting the stack in a way that leaves the stack 1- or 2-byte aligned.
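To make the pushw example above concrete, here is a small C model of how the operand size of a push moves the stack pointer. It is a sketch only, not GCC code; the function names model_push and is_aligned and the starting address used below are assumptions made for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Model of PUSH: the stack pointer drops by the operand size in bytes
   (2 for pushw, 4 for pushl). The address size plays no part. */
uint32_t model_push(uint32_t sp, unsigned operand_bytes)
{
    return sp - operand_bytes;
}

/* True if sp is a multiple of boundary_bytes, which must be a
   nonzero power of two. */
bool is_aligned(uint32_t sp, unsigned boundary_bytes)
{
    assert(boundary_bytes != 0 && (boundary_bytes & (boundary_bytes - 1)) == 0);
    return (sp & (boundary_bytes - 1)) == 0;
}
```

Starting from a 4-byte aligned stack pointer such as 0x1000, one model_push(sp, 2) leaves the stack only 2-byte aligned, and a second one restores 4-byte alignment. An asynchronous handler that runs between the two pushes sees the misaligned value.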
Ross Ridge

From ramana.r@gmail.com Wed Dec 19 09:13:00 2007
From: ramana.r@gmail.com (Ramana Radhakrishnan)
Date: Wed, 19 Dec 2007 09:13:00 -0000
Subject: porting gcc to tic54x
In-Reply-To: <20071216093432.R82190@dair.pair.com>
References: <984587.48623.qm@web73407.mail.tp2.yahoo.com> <20071216093432.R82190@dair.pair.com>
Message-ID: <67ea2eb0712182218g297c9d3bi6f2ad6770e9f63da@mail.gmail.com>

> > I have been porting tic54x to gcc. I use gcc-4.2.2 version. I write some simplest c54x.h and c54x.c and a empty md, and I
>
> I think the answer is right there ^^^^^^^^^^

IIRC what you need as a bare minimum is a whole bunch of macros in the header file and a jump insn. Take a look at some of the tutorials on writing machine descriptions off the gcc wiki. There was a workshop about writing incremental machine descriptions held sometime back that can help you write one from scratch.

http://gcc.gnu.org/wiki/GettingStarted
http://www.cse.iitb.ac.in/~uday/gcc-workshop/?file=downloads

HTH

cheers
Ramana

On Dec 16, 2007 8:24 PM, Hans-Peter Nilsson wrote:
> On Wed, 12 Dec 2007, a2220333 wrote:
>
> > hi,
> > I have been porting tic54x to gcc. I use gcc-4.2.2 version. I write some simplest c54x.h and c54x.c and a empty md, and I
>
> I think the answer is right there ^^^^^^^^^^
>
> > compile it to generate the tic54x-gcc compiler.
> >
> > But when I execute the compiler I generate I got a segmentation fault error. Is there anything must be define in c54x.c or
> > c54x.h that could make the simplest compiler with no correct output and no errors? Because I want to add functions from this
> > basic port.
>
> If that wasn't the bug, I suggest you start up gdb and step
> through cc1, but I'd be surprised if you get anywhere without
> the prerequisite move, add, and control flow insns in the .md.
>
> brgds, H-P
>

--
Ramana Radhakrishnan

From rridge@csclub.uwaterloo.ca Wed Dec 19 10:06:00 2007
From: rridge@csclub.uwaterloo.ca (Ross Ridge)
Date: Wed, 19 Dec 2007 10:06:00 -0000
Subject: A proposal to align GCC stack
Message-ID: <20071219091259.B7FDD73D10@caffeine.csclub.uwaterloo.ca>

H.J. Lu writes:
> What value did you use for -mpreferred-stack-boundary? The x86 backend
> defaults to 16byte.

On Windows the 16-byte default pretty much just wastes space, so I use -mpreferred-stack-boundary=2 where it might make a difference. In the case where I wanted to use SSE vector instructions, I explicitly used -mpreferred-stack-boundary=4 (16-byte alignment).

>STACK_BOUNDARY is the minimum stack boundary. MAX(STACK_BOUNDARY,
>PREFERRED_STACK_BOUNDARY) == PREFERRED_STACK_BOUNDARY. So the question is
>if we should assume INCOMING == PREFERRED_STACK_BOUNDARY in all cases:

Doing this would also remove the need for ABI_STACK_BOUNDARY in your proposal.

>Pros:
> 1. Keep the current behaviour of -mpreferred-stack-boundary.
>
>Cons:
> 1. The generated code is incompatible with the other object files.

Well, your proposal wouldn't completely solve that problem, either. You can't guarantee compatibility with object files compiled with different values of -mpreferred-stack-boundary, including those compiled with the current implementation, unless you assume the incoming stack is aligned to the lowest value the flag can have and align the outgoing stack to the highest value that the flag can have.

Ross Ridge

From pinskia@gmail.com Wed Dec 19 10:33:00 2007
From: pinskia@gmail.com (Andrew Pinski)
Date: Wed, 19 Dec 2007 10:33:00 -0000
Subject: A proposal to align GCC stack
In-Reply-To: <20071219031617.02C7073CC4@caffeine.csclub.uwaterloo.ca>
References: <20071219031617.02C7073CC4@caffeine.csclub.uwaterloo.ca>
Message-ID:

On 12/18/07, Ross Ridge wrote:
> Look at it another way. Let's say you were compiling x86_64 code with
> -mpreferred-stack-boundary=3, an 8-byte PREFERRED alignment.
Can we stop talking about x86/x86_64-specific issues here? I have a use case for the PowerPC side of the Cell BE for variables greater than the normal stack boundary alignment of 16 bytes. They need to be 128-byte aligned for DMA transferring to the SPUs.

I already proposed a patch [1] to fix this use case but I have not seen many replies yet.

Thanks,
Andrew Pinski

[1] http://gcc.gnu.org/ml/gcc-patches/2007-05/msg01167.html

From rridge@csclub.uwaterloo.ca Wed Dec 19 11:52:00 2007
From: rridge@csclub.uwaterloo.ca (Ross Ridge)
Date: Wed, 19 Dec 2007 11:52:00 -0000
Subject: A proposal to align GCC stack
Message-ID: <20071219103318.85B9273CC4@caffeine.csclub.uwaterloo.ca>

Andrew Pinski writes:
> Can we stop talking about x86/x86_64-specific issues here?

No.

>I have a use case for the PowerPC side of the Cell BE for variables
>greater than the normal stack boundary alignment of 16 bytes. They need
>to be 128-byte aligned for DMA transferring to the SPUs.
>
>I already proposed a patch [1] to fix this use case but I have not
>seen many replies yet.

Complaining about someone talking about x86/x86_64 specific issues and then bringing up a PowerPC/Cell specific issue is probably not the best way to go about getting your patch approved.

Ross Ridge

From hariharans@picochip.com Wed Dec 19 13:16:00 2007
From: hariharans@picochip.com (Hariharan Sandanagobalane)
Date: Wed, 19 Dec 2007 13:16:00 -0000
Subject: vliw scheduling - TImode bug?
Message-ID: <47690607.7010304@picochip.com>

Hello,

I see quite a few instances when i get the following RTL. A conditional branch, followed by a BASIC_BLOCK note, followed by a non-TImode instruction. Theoretically, i should be allowed to package the non-TI instruction along with the conditional branch, but doing so seems to produce incorrect results.

Am i supposed to consider the NOTE_INSN_BASIC_BLOCK as a cycle-breaker? Or, is it a genuine bug in the way TImodes are set on instructions?

(jump_insn:TI 144 225 17 2 /home/hariharans5/gcc-4.2.2/gcc/testsuite/gcc.c-torture/execute/931004-8.c:15 (parallel [
            (set (pc)
                (if_then_else (le:HI (reg:CC 17 pseudoCC)
                        (const_int 0 [0x0]))
                    (label_ref 109)
                    (pc)))
            (use (const_int 77 [0x4d]))
        ]) 10 {*branch} (nil)
    (expr_list:REG_DEAD (reg:CC 17 pseudoCC)
        (expr_list:REG_BR_PROB (const_int 500 [0x1f4])
            (nil))))

(note 17 144 124 3 [bb 3] NOTE_INSN_BASIC_BLOCK)

(note 124 17 21 3 ("/home/hariharans5/gcc-4.2.2/gcc/testsuite/gcc.c-torture/execute/931004-8.c") 17)

(insn 21 124 196 3 /home/hariharans5/gcc-4.2.2/gcc/testsuite/gcc.c-torture/execute/931004-8.c:15 (set (reg:HI 3 R3)
        (plus:HI (reg/f:HI 13 FP)
            (const_int 12 [0xc]))) 31 {*lea_move} (nil)
    (nil))

Thanks and regards
Hari

From joey.ye@intel.com Wed Dec 19 13:23:00 2007
From: joey.ye@intel.com (Ye, Joey)
Date: Wed, 19 Dec 2007 13:23:00 -0000
Subject: A proposal to align GCC stack - update
Message-ID:

Thanks for Ross and HJ's comments. Here is the updated proposal:

Changes:
- value of REQUIRED_STACK_BOUNDARY of leaf function
- value of INCOMING_STACK_BOUNDARY

-- 0. MOTIVATION --

Some local variables (such as of __m128 type or marked with alignment attribute) require stack aligned at a boundary larger than the default stack boundary. Current GCC partially supports this with limitations. We are proposing a new design to fully solve the problem.

-- 1. CURRENT IMPLEMENTATION --

There are two ways current GCC supports bigger than default stack alignment. One is to make sure that stack is aligned at program entry point, and then ensure that for each non-leaf function, its frame size is aligned. This approach doesn't work when linking with libs or objects compiled by other psABI conforming compilers. Some problems are logged as PR 33721.

Another is to adjust stack alignment at the entry point of a function if it is marked with __attribute__ ((force_align_arg_pointer)) or -mstackrealign option is provided.
This method guarantees the alignment in most of the cases but with the following problems and limitations:

* Only 16 bytes alignment is supported
* Adjusting stack alignment at each function prologue hurts performance unnecessarily, because not all functions need bigger alignment. In fact, commonly only those functions which have SSE variables defined locally (either declared by the user or compiler generated internal temporary variables) need corresponding alignment.
* Doesn't support x86_64 for the cases when required stack alignment is > 16 bytes
* Emits inefficient and complicated prologue/epilogue code to adjust stack alignment
* Doesn't work with nested functions
* Has a bug handling register parameters, which resulted in a cpu2006 failure. A patch is available as a workaround.

-- 2. NEW PROPOSAL: DESIGN --

Here, we propose a new design to fully support stack alignment while overcoming the above problems. The new design will

* Support arbitrary alignment value, including 4,8,16,32...
* Adjust function stack alignment only when necessary
* Initial development will be on i386 and x86_64, but can be extended to other platforms
* Emit more efficient prologue/epilogue code
* Coexist with special features like dynamic stack allocation (alloca), nested functions, register parameter passing, PIC code and tail call optimization
* Be able to debug and unwind stack

2.1 Support arbitrary alignment value

Different source code and optimizations require different stack alignment, as in the following table:

    Feature          Alignment (bytes)
    i386_ABI         4
    x86_64_ABI       16
    char             1
    short            2
    int              4
    long             4/8*
    long long        8
    __m64            8
    __m128           16
    float            4
    double           8
    long double      16
    user specified   any power of 2

    *Note: 4 for i386, 8 for x86_64

The new design will support any alignment value in this table.

2.2 Adjust function stack alignment only when necessary

Current GCC defines the following macros related to stack alignment:

i. STACK_BOUNDARY in bits, which is preferred by hardware, 32 for i386 and 64 for x86_64. It is the minimum stack boundary. It is fixed.

ii. PREFERRED_STACK_BOUNDARY. It sets the stack alignment when calling a function. It may be set at command line and has no impact on stack alignment at function entry. This proposal requires PREFERRED >= STACK, and by default it is set to ABI_STACK_BOUNDARY.

This design will define a few more macros, or concepts not explicitly defined in code:

iii. ABI_STACK_BOUNDARY in bits, which is the stack boundary specified by psABI, 32 for i386 and 128 for x86_64. ABI_STACK_BOUNDARY >= STACK_BOUNDARY. It is fixed for a given psABI.

iv. LOCAL_STACK_BOUNDARY in bits. Each function stack has its own stack alignment requirement, which depends on the alignment of its stack variables, LOCAL_STACK_BOUNDARY = MAX (alignment of each effective stack variable).

v. INCOMING_STACK_BOUNDARY in bits, which is the stack boundary at function entry. If a function is marked with __attribute__ ((force_align_arg_pointer)) or -mstackrealign option is provided, INCOMING = STACK_BOUNDARY. Otherwise, INCOMING == PREFERRED_STACK_BOUNDARY. For those functions whose PREFERRED is larger than ABI, it is the caller's responsibility to invoke them with appropriate PREFERRED.

vi. REQUIRED_STACK_ALIGNMENT in bits, which is stack alignment required by local variables and calling other functions. REQUIRED_STACK_ALIGNMENT == MAX(LOCAL_STACK_BOUNDARY,PREFERRED_STACK_BOUNDARY) in case of a non-leaf function. For a leaf function, REQUIRED_STACK_ALIGNMENT == MAX(LOCAL_STACK_BOUNDARY,STACK_BOUNDARY).

This proposal won't adjust stack when INCOMING_STACK_BOUNDARY >= REQUIRED_STACK_ALIGNMENT. Only when INCOMING_STACK_BOUNDARY < REQUIRED_STACK_ALIGNMENT, it will adjust stack to REQUIRED_STACK_ALIGNMENT at prologue.

2.3 Initial development on i386 and x86_64

We initially support i386 and x86_64. In this document we focus more on i386 because it is harder to implement there, owing to the restriction of having a small register file. But all that we discuss can be easily applied to x86_64.
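
The decision rule in 2.2 (items v and vi plus the final comparison) can be written down as a short C sketch. This only restates the rule from the text; the function names and parameters are hypothetical, not existing GCC code, and all boundaries are in bits as in the proposal.

```c
#include <assert.h>
#include <stdbool.h>

/* Sanity check: a boundary must be a nonzero power of two. */
void check_boundary(unsigned bits)
{
    assert(bits != 0 && (bits & (bits - 1)) == 0);
}

unsigned max_boundary(unsigned a, unsigned b)
{
    return a > b ? a : b;
}

/* Item v: the boundary a function may assume at entry. With
   force_align_arg_pointer or -mstackrealign only STACK_BOUNDARY is
   assumed, otherwise PREFERRED_STACK_BOUNDARY. */
unsigned incoming_stack_boundary(bool force_realign,
                                 unsigned stack_boundary,
                                 unsigned preferred_boundary)
{
    check_boundary(stack_boundary);
    check_boundary(preferred_boundary);
    return force_realign ? stack_boundary : preferred_boundary;
}

/* Item vi: a leaf function only has to satisfy its locals and
   STACK_BOUNDARY; a non-leaf function must also provide
   PREFERRED_STACK_BOUNDARY to its callees. */
unsigned required_stack_alignment(bool is_leaf,
                                  unsigned local_boundary,
                                  unsigned stack_boundary,
                                  unsigned preferred_boundary)
{
    check_boundary(local_boundary);
    return max_boundary(local_boundary,
                        is_leaf ? stack_boundary : preferred_boundary);
}

/* The prologue realigns only when the incoming boundary is too small. */
bool needs_realign(unsigned incoming, unsigned required)
{
    return incoming < required;
}
```

For example, on i386 with STACK = PREFERRED = 32, a non-leaf function holding an __m128 local (128 bits) requires 128 and must realign, while a leaf function with only int locals does not.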

2.4 Emit more efficient prologue/epilogue

When a function needs to adjust stack alignment and has no dynamic stack allocation, this design will generate the following example prologue/epilogue code:

IA32 example Prologue:
        pushl     %ebp
        movl      %esp, %ebp
        andl      $-16, %esp
        subl      $4, %esp              ; is $-4 the local stack size?
Epilogue:
        movl      %ebp, %esp
        popl      %ebp
        ret

Locals will be addressed as esp + offset and parameters as ebp + offset.

Add x86_64 example here.

Thus BP points to parameter frame and SP points to local frame.

2.5 Coexist with special features

Stack alignment adjustment will coexist with varying GCC features that have special calling conventions and frame layout, such as dynamic stack allocation (alloca), nested functions and parameter passing via registers to local functions.

I386 hard register usage is the major problem to make the proposal friendly to various GCC features. This design requires an additional hard register in prologue/epilogue in case of dynamic stack allocation. Because I386 PIC requires BX as GOT pointer and I386 may use AX, DX and CX as parameter passing registers, there are limited candidates for this proposal to choose. Current proposal suggests EDI, because it won't conflict with i386 PIC or regparm.

X86_64 is much easier. This proposal just chooses RBX.

2.5.1 When stack alignment adjustment comes together with alloca, the following example prologue/epilogue will be emitted:

Prologue:
        pushl     %edi                  // Save callee save reg edi
        leal      8(%esp), %edi         // Save address of parameter frame
        andl      $-16, %esp            // Align local stack

        // Reserve two stack slots and save return address
        // and previous frame pointer into them. By
        // pointing new ebp to them, we build a pseudo
        // stack for unwinding.
        pushl     4(%edi)               // save return address
        pushl     %ebp                  // save old ebp
        movl      %esp, %ebp            // point ebp to pseudo frame start
        subl      $24, %esp             // adjust local frame size
        movl      %edi, vreg1

epilogue:
        movl      vreg1, %edi
        movl      %ebp, %esp            // Restore esp to pseudo frame start
        popl      %ebp
        leal      -8(%edi), %esp        // restore esp to real frame start
        popl      %edi                  // Restore edi
        ret

Locals will be addressed as ebp - offset, parameters as vreg1 + offset

Where EDI is used to set up the virtual parameter frame pointer, BP points to local frame and SP points to dynamic allocation frame.

2.5.2 Nested functions will automatically work because they use CX as static pointer, which won't conflict with any registers used by stack alignment adjustment, even when nested functions are called via function pointer and a function stub on stack.

2.5.3 GCC may optimize to use registers to pass parameters. At most AX, DX and CX will be used. Such optimization won't conflict with stack alignment adjustment thus it should automatically work.

2.5.4 I386 PIC uses EBX as GOT pointer. This design works well under i386 PIC:

For example:

i686 Prologue:
        pushl     %edi
        leal      8(%esp), %edi
        andl      $-16, %esp
        pushl     4(%edi)
        pushl     %ebp
        movl      %esp, %ebp
        subl      $24, %esp
        call      .L1
.L1:
        popl      %ebx
        movl      %edi, vreg1

Body:   // code for alloca
        movl      (vreg1), %eax
        subl      %eax, %esp
        andl      $-16, %esp
        movl      %esp, %eax

i686 Epilogue:
        movl      %ebp, %esp
        popl      %ebp
        leal      -8(%edi), %esp
        popl      %edi
        ret

Locals will be addressed as ebp - offset, parameters as vreg1 + offset, ebx has the GOT pointer.

2.6 Debug and unwind will work since DWARF2 has the flexibility to define different frame pointers.

2.7 Some intrinsics rely on stack layout. Need to handle them accordingly. They are __builtin_return_address, __builtin_frame_address. This proposal will set up a pseudo frame slot to help the unwinder find the return address and parent frame address by emitting the following prologue code after adjusting alignment:

        pushl     4(%edi)
        pushl     %ebp

-- 3. NEW PROPOSAL: IMPLEMENTATION --

The proposed implementation can be partitioned into the following subtasks.

* Alignment requirement collection
* Frames addressing
* Alignment code generation
* Debug and unwind information

3.1 Collect alignment requirement

Collecting each function's alignment requirement from frontend or from optimization passes like vectorizer, and informing backend. Current GCC uses cfun->stack_alignment_needed to store MIN(largest stack variable alignment, PREFERRED_STACK_BOUNDARY). We will reuse this field and define its value only as "largest stack variable alignment"

3.2 Frames addressing

Adding parameter frame, local frame, static frame and dynamic frame with appropriate pointers, either hard registers or virtual registers. Backend will customize CAN_ELIMINATE hook to assign hard registers to corresponding virtual registers.

3.3 Alignment code generation

Emit prologue/epilogue code to guarantee correct stack alignment based on each function's alignment requirement collected previously. Modification should happen in ix86_expand_prologue and ix86_expand_epilogue. Code to be emitted can follow the above design in a straightforward manner.

3.4 Debug information

Emit debug and unwind information for aligned stacks. It also happens in ix86_expand_prologue and ix86_expand_epilogue, corresponding to the prologue/epilogue code emitted.

4. Code Example

Simple function:

void foo()
{
        volatile int local;
        ...
} i686 Prologue: pushl %ebp movl %esp, %ebp subl $4, %esp // Adjust local frame size by 4 i686 Epilogue: movl %ebp, %esp popl %ebp ret x86_64 Prologue: pushq %rbp movq %rsp, %rbp subq $16, %rsp x86_64 Epilogue: movl %rbp, %rsp popl %rbp ret Pure 16 bytes align: void foo() { volatile __m128 m = _mm_set_ps1(0.f); } i686 Prologue: pushl %ebp movl %esp, %ebp andl $-16, %esp subl $16, %esp // this is space for m, 16 byte aligned i686 Epilogue: movl %ebp, %esp popl %ebp ret x86_64 Prologue: pushq %rbp movq %rsp, %rbp andq $-16, %rsp subq $16, %rsp x86_64 Epilogue: movl %rbp, %rsp popl %rbp ret 16 bytes align with alloca: void foo(int size) { char * ptr=alloca(size); volatile int __attribute((aligned(32)) m = 0; ... } i686 Prologue: pushl %edi leal 8(%esp), %edi andl $-32, %esp pushl $4(%edi) pushl %ebp movl %esp, %ebp subl $24, %esp Body: // code for alloca movl %edi, vreg1 movl (vreg1), %eax subl %eax, %esp andl $-16, %esp movl %esp, %eax i686 Epilogue: movl %ebp, %esp popl %ebp leal -8(%edi), %esp popl %edi ret void foo(int dummy1, int dummy2, int dummy3, int dummy4, int dummy5, int dummy6, int size) { char * ptr=alloca(size); volatile int __attribute((aligned(32)) m = 0; ... } x86_64 Prologue: pushq %rbx leaq $16(%rsp), %rbx andq $-32, %rsp pushq 8(%rbx) pushq %rbp movq %rsp, %rbp subq $24, %rsp Body: movq %rbx, vreg1 movl (vreg1), %eax subq %rax, %rsp andq $-16, %rsp movq %rsp, %rax x86_64 Epilogue: movl %rbp, %rsp popl %rbp movl %rbx, %rsp popl %rbx ret m128 and PIC int g_i; void foo() { volatile __m128 m = _mm_set_ps1(0.f); g_i = 123; ... } i686 Prologue: pushl %ebp movl %esp, %ebp andl $-16, %esp pushl %ebx subl $16, %esp call .L1 .L1: popl %ebx ... i686 Epilogue: addl $16, %esp popl %ebx movl %ebp, %esp popl %ebp ret m128 + alloca + PIC void foo(int size) { char * ptr=alloca(size); volatile __m128 m = _mm_set_ps1(0.f); ... 
}

i686 Prologue:
    pushl %edi
    leal 8(%esp), %edi
    andl $-16, %esp
    pushl 4(%edi)
    pushl %ebp
    movl %esp, %ebp
    subl $24, %esp
    call .L1
.L1:
    popl %ebx

Body:
    movl %edi, vreg1
    movl (vreg1), %eax
    subl %eax, %esp
    andl $-16, %esp
    movl %esp, %eax

i686 Epilogue:
    movl %ebp, %esp
    popl %ebp
    leal -8(%edi), %esp
    popl %edi
    ret

m128 + alloca + PIC + library call

void foo(int size)
{
    char * ptr=alloca(size);
    volatile __m128 m = _mm_set_ps1(0.f);
    printf("Hello\n");
    ...
}

i686 Prologue:
    pushl %edi
    leal 8(%esp), %edi
    andl $-16, %esp
    pushl 4(%edi)
    pushl %ebp
    movl %esp, %ebp
    subl $24, %esp
    call .L1
.L1:
    popl %ebx

i686 Body:
    movl %edi, vreg1
    movl (vreg1), %eax
    subl %eax, %esp
    andl $-16, %esp
    movl %esp, %eax

Body:
    call printf@PLT

i686 Epilogue:
    movl %ebp, %esp
    popl %ebp
    leal -8(%edi), %esp
    popl %edi
    ret

m128 and nested function and PIC

void foo()
{
    void bar(int arg1, int arg2)
    {
        volatile __m128 m = _mm_set_ps1(0.f);
        ...
    }
    bar(1,2);
}

i686:
foo:
    ...
    movl %ebp, %ecx
    call bar@PLT
    ...
bar:
    pushl %edi
    leal 8(%esp), %edi
    andl $-16, %esp
    pushl 4(%edi)
    pushl %ebp
    movl %esp, %ebp
    subl $24, %esp
    call .L1
.L1:
    popl %ebx
    movl %edi, vreg1
    movl (vreg1), %eax
    subl %eax, %esp
    andl $-16, %esp
    movl %esp, %eax
    ...
    movl %ebp, %esp
    popl %ebp
    leal -8(%edi), %esp
    popl %edi
    ret

m128, dynamic stack alloc and register parameter function call

static void bar(int arg1, int arg2, int arg3)
{
    char * ptr=alloca(size);
    volatile __m128 m = _mm_set_ps1(0.f);
    ...
}

void foo()
{
    bar(1,2,3);
}

i686
foo:
    movl $1, %eax
    movl $2, %edx
    movl $3, %ecx
    call bar
    ...
bar:
    pushl %edi
    leal 8(%esp), %edi
    andl $-16, %esp
    pushl 4(%edi)
    pushl %ebp
    movl %esp, %ebp
    subl $24, %esp
    movl %edi, vreg1
    movl (vreg1), %eax
    subl %eax, %esp
    andl $-16, %esp
    movl %esp, %eax
    ...
    movl %ebp, %esp
    popl %ebp
    leal -8(%edi), %esp
    popl %edi
    ret

Thanks - Joey

From lopezibanez@gmail.com Wed Dec 19 14:22:00 2007
From: lopezibanez@gmail.com (Manuel López-Ibáñez)
Date: Wed, 19 Dec 2007 14:22:00 -0000
Subject: Regression count, and how to keep bugs around forever
In-Reply-To: <571f6b510712181659w64b16ae5ndc32b38de6f5c56c@mail.gmail.com>
References: <571f6b510712181659w64b16ae5ndc32b38de6f5c56c@mail.gmail.com>
Message-ID: <6c33472e0712190521w5d995eeer8aa1947582992fe2@mail.gmail.com>

On 19/12/2007, Steven Bosscher wrote:
>
> The current list of "All regressions" should be a list of bugs that
> people are actively trying to resolve, preferably before the release
> of GCC 4.3. Instead, it is a mix of high-activity bug reports and bug
> reports where even the target maintainer has been unwilling for 3.5
> years to spend some time on resolving the bug report. So to pick a bug
> report to work on, I need to go through the bug report summaries of a
> long list, trying to pick out new regressions between the old
> no-one-cares P4 and P5 regressions.
>

I am sorry but I don't understand how this can be possible. Old
no-one-cares PRs have a lower ID than new ones. So if you go through the
list backwards you should always get the newer ones first. Also, PRs that
are regressions for 4.3 only cannot be that old (but perhaps they are
no-one-cares).

On the other hand, there are around 1003 UNCONFIRMED PRs. Those are
annoying.

> Maybe it is just me, but I find it very annoying to have to wade
> through long bug lists, so I just don't do this. Instead I just don't
> look at P4/P5 regressions anymore at all. It's just too much trouble
> to find a bug report where the reporter or the target maintainer cares
> as much as you do about resolving the bug.

Well, perhaps instead of 2 lists (Serious regressions and All
regressions), we should have 3 lists: High priority, Medium priority, Low
priority.
High priority is the same as Serious regressions, Medium is P4 and P5,
and Low priority is those that you just described (P6?).

Anyway, I don't typically look at those lists. I create my own customized
searches and save them.

Cheers,

Manuel.

From hjl@lucon.org Wed Dec 19 14:30:00 2007
From: hjl@lucon.org (H.J. Lu)
Date: Wed, 19 Dec 2007 14:30:00 -0000
Subject: A proposal to align GCC stack
In-Reply-To: <20071219091255.400BF73CC4@caffeine.csclub.uwaterloo.ca>
References: <20071219091255.400BF73CC4@caffeine.csclub.uwaterloo.ca>
Message-ID: <20071219142214.GA19661@lucon.org>

On Wed, Dec 19, 2007 at 04:12:55AM -0500, Ross Ridge wrote:
>
> I'm suggesting a different definition of STACK_BOUNDARY which wouldn't,
> if strictly followed, result in STACK_BOUNDARY being defined as 8 on
> the i386. The i386 hardware doesn't enforce a minimum alignment on the
> stack pointer.

On i386, you can only push/pop 2 or 4 bytes. On x86-64, you can only
push/pop 2 or 8 bytes.

> stack. In context of your proposal, defining STACK_BOUNDARY this way,
> as a requirement imposed on GCC by an ABI (or at least by convention),
> not the hardware, is important. Without an ABI requirement, there's
> nothing that would prohibit an i386 leaf function from adjusting the
> stack in a way that leaves the stack 1- or 2-byte aligned.
>

I don't mind changing the definition of STACK_BOUNDARY. It won't affect
our proposal. However, please don't use the ABI when defining
STACK_BOUNDARY, since a given hardware can have more than one ABI but only
one STACK_BOUNDARY.

H.J.

From hjl@lucon.org Wed Dec 19 15:32:00 2007
From: hjl@lucon.org (H.J. Lu)
Date: Wed, 19 Dec 2007 15:32:00 -0000
Subject: A proposal to align GCC stack
In-Reply-To: <20071219091259.B7FDD73D10@caffeine.csclub.uwaterloo.ca>
References: <20071219091259.B7FDD73D10@caffeine.csclub.uwaterloo.ca>
Message-ID: <20071219142955.GB19661@lucon.org>

On Wed, Dec 19, 2007 at 04:12:59AM -0500, Ross Ridge wrote:
>
> >STACK_BOUNDARY is the minimum stack boundary.
> >MAX(STACK_BOUNDARY,
> >PREFERRED_STACK_BOUNDARY) == PREFERRED_STACK_BOUNDARY. So the question is
> >if we should assume INCOMING == PREFERRED_STACK_BOUNDARY in all cases:
>
> Doing this would also remove need for ABI_STACK_BOUNDARY in your proposal.

In our proposal, ABI_STACK_BOUNDARY provides the default value for
PREFERRED_STACK_BOUNDARY. It can be different for different OSes. For a
given OS, you can change PREFERRED_STACK_BOUNDARY, but you can't change
ABI_STACK_BOUNDARY. You can think of it as a software STACK_BOUNDARY.

>
> >Pros:
> > 1. Keep the current behaviour of -mpreferred-stack-boundary.
> >
> >Cons:
> > 1. The generated code is incompatible with the other object files.
>
> Well, your proposal wouldn't completely solve that problem, either.
> You can't guarantee compatibility with object files compiled with different
> values of -mpreferred-stack-boundary, including those compiled with current
> implementation, unless you assume the incoming stack is aligned to
> the lowest value the flag can have and align the outgoing stack to the
> highest value that the flag can have.

We can align the outgoing stack to PREFERRED_STACK_BOUNDARY and assume
INCOMING = MIN (ABI_STACK_BOUNDARY, PREFERRED_STACK_BOUNDARY), which is
our original proposal.

H.J.

From rask@sygehus.dk Wed Dec 19 15:59:00 2007
From: rask@sygehus.dk (Rask Ingemann Lambertsen)
Date: Wed, 19 Dec 2007 15:59:00 -0000
Subject: Regression count, and how to keep bugs around forever
In-Reply-To: <571f6b510712181659w64b16ae5ndc32b38de6f5c56c@mail.gmail.com>
References: <571f6b510712181659w64b16ae5ndc32b38de6f5c56c@mail.gmail.com>
Message-ID: <20071219153201.GH17368@sygehus.dk>

On Wed, Dec 19, 2007 at 01:59:51AM +0100, Steven Bosscher wrote:

> The current list of "All regressions" should be a list of bugs that
> people are actively trying to resolve, preferably before the release
> of GCC 4.3.

No, it should be exactly what it says it is.
If you want an additional list of bugs that are being actively worked on
(and labelled as such), that's fine (although I have no idea how that list
would be useful).

> Instead, it is a mix of high-activity bug reports and bug
> reports where even the target maintainer has been unwilling for 3.5
> years to spend some time on resolving the bug report.

That may be an indication that maintainership should be passed on to
someone else. I don't see how it can be an indication that the bug should
not be fixed.

> So to pick a bug
> report to work on, I need to go through the bug report summaries of a
> long list, trying to pick out new regressions between the old
> no-one-cares P4 and P5 regressions.

PR numbers are assigned in ascending order. The newest regressions have
the highest numbers. What exactly is the problem you're facing when
starting with the highest-numbered PRs?

> To me, the situation is quite clear: If a bug report is open for so
> long, and even the reporter and the responsible maintainer show no
> sign of motivation to work on resolving the bug, I think this tells us
> something about how important this bug is: Not important enough to
> fix. IMHO we should close such reports as WONTFIX or SUSPENDED to
> make them less visible, so that other bug reports don't fall through
> the cracks.
>
> So I'm asking for a policy here that says when it is OK to resolve old
> bugs without progress as WONTFIX or SUSPENDED. Start shooting.

Having assigned myself to and/or posted patches for some of the bugs you
want to close as WONTFIX, including four which have four-digit PR numbers,
my response is predictable: No way.
--
Rask Ingemann Lambertsen
Danish law requires addresses in e-mail to be logged and stored for a year

From dberlin@dberlin.org Wed Dec 19 16:01:00 2007
From: dberlin@dberlin.org (Daniel Berlin)
Date: Wed, 19 Dec 2007 16:01:00 -0000
Subject: Designs for better debug info in GCC
In-Reply-To: <4aca3dc20712182207y648f7bbhab9e0af8ad2ff832@mail.gmail.com>
References: <4766DF5C.1020802@google.com> <47671BF4.5050704@google.com> <4aca3dc20712181415y3d5c3717s6d73b1335b311313@mail.gmail.com> <4aca3dc20712182207y648f7bbhab9e0af8ad2ff832@mail.gmail.com>
Message-ID: <4aca3dc20712190759g748d6e15pa0e5146c3f5ca0ba@mail.gmail.com>

On 12/19/07, Daniel Berlin wrote:
> On 12/19/07, Alexandre Oliva wrote:
> > On Dec 18, 2007, "Daniel Berlin" wrote:
> >
> > > Consider PRE alone,
> >
> > > If your debug statement strategy is "move debug statements when we
> > > insert code that is equivalent"
> >
> > Move? Debug statements don't move, in general. I'm not sure what you
> > have in mind, but I sense some disconnect here.
>
> Okay, so if you aren't going to move them, you have to erase them when
> you move statements around.
>

Besides this, how do you plan on handling the following situations (both
of which reassoc performs *right now*). These are the relatively easy ones.

Here is the easy one:

z_5 = a_3 + b_3
x_4 = z_5 + c_3

DEBUG(x, x_4)

Reassoc may transform this into:

z_5 = c_3 + b_3
x_4 = z_5 + a_3

DEBUG(x, x_4)

Now x has the wrong value.

At least in this case, you can tell which DEBUG statement to eliminate
easily (it is an immediate use of x_4).

It gets worse, however:

c_3 = a_1 + b_2
z_5 = c_3 + d_9
x_4 = z_5 + e_10
DEBUG(x, x_4)
y_7 = x_4 + f_11
z_8 = y_7 + g_12

->

c_3 = a_1 + b_2
z_5 = c_3 + g_12
x_4 = z_5 + e_10
DEBUG(x, x_4)
y_7 = x_4 + f_11
z_8 = y_7 + d_9

x_4 now no longer represents the value of x, but we haven't directly
changed x_4, its immediate users, or the statements that immediately make
up its defining values.
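
To make the second transformation above concrete, here is a small C sketch
that simply evaluates both orderings on plain integers. It is not GCC code:
the names `struct result`, `before_reassoc` and `after_reassoc` are made up
for illustration, and the variable names only mirror the SSA names used in
the example. It shows that the final sum z_8 is the same either way, while
the intermediate x_4 that DEBUG(x, x_4) refers to is not.

```c
/* Values of interest: what DEBUG(x, x_4) would see, and the final sum. */
struct result {
    int x_4;  /* intermediate value bound to the user variable x */
    int z_8;  /* end of the addition chain */
};

/* Original order: d_9 is added early, g_12 is added last. */
static struct result before_reassoc(int a_1, int b_2, int d_9,
                                    int e_10, int f_11, int g_12)
{
    struct result r;
    int c_3 = a_1 + b_2;
    int z_5 = c_3 + d_9;
    int y_7;

    r.x_4 = z_5 + e_10;
    y_7 = r.x_4 + f_11;
    r.z_8 = y_7 + g_12;
    return r;
}

/* Reassociated order: d_9 and g_12 have swapped places in the chain. */
static struct result after_reassoc(int a_1, int b_2, int d_9,
                                   int e_10, int f_11, int g_12)
{
    struct result r;
    int c_3 = a_1 + b_2;
    int z_5 = c_3 + g_12;
    int y_7;

    r.x_4 = z_5 + e_10;
    y_7 = r.x_4 + f_11;
    r.z_8 = y_7 + d_9;
    return r;
}
```

Calling both functions with the same arguments (for instance 1, 2, 3, 4, 5,
6) gives equal z_8 but different x_4, even though the statement defining
x_4 is textually untouched, which is the situation the question below is
about.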
How do you propose we figure out which DEBUG statements we may have affected without doing all kinds of walks? (This is of course, a more general problem of how do i find which debug statements are reached by my transformation without doing linear walks) From dberlin@dberlin.org Wed Dec 19 16:12:00 2007 From: dberlin@dberlin.org (Daniel Berlin) Date: Wed, 19 Dec 2007 16:12:00 -0000 Subject: Designs for better debug info in GCC In-Reply-To: References: <4766B8E5.60500@google.com> <4766DF5C.1020802@google.com> <47671BF4.5050704@google.com> <4aca3dc20712181519rb637c16oea78bbcc18abe097@mail.gmail.com> Message-ID: <4aca3dc20712190801o16d5ff60i72c9407fee026a09@mail.gmail.com> On 12/18/07, Alexandre Oliva wrote: > On Dec 18, 2007, "Daniel Berlin" wrote: > > >> int c = z; > >> whatever0(c); > >> c = x; > > > Because you have added information you have no way of knowing. > > How exactly did you compute that the call *definitely sets c to the > > value of z_0*, and definitely sets the value of c to x_2. > > Err... I guess you're thinking memory, global variables, alias > analysis and that sort of stuff. > Yes, i mixed your examples up, i apologize. > None of this applies to gimple registers, which is all the annotations > are about. > > > > However, value equivalene does not imply location equivalence, and all > > of our debug formats deal with locations of variables, except for > > constants. > > Dwarf enables arbitrary value expressions too. Well, uh, no. The only way to directly specify the value of a variable is for constants. DW_AT_const_value does not allow location descriptions. "An entry describing a variable or formal parameter whose value is constant and not represented by an object in the address space of the program, or an entry describing a named constant, does not have a location attribute. 
Such entries have a DW_AT_const_value attribute, whose value may be a string or any of the constant data or data block forms, as appropriate for the representation of the variable's value. The value of this attribute is the actual constant value of the variable, represented as it would be on the target architecture. " There are no other provisions in DWARF for describing the value of a variable, it is expected you describe their locations using DW_AT_location (which gives you the full power of location descriptions, but requires you to return a location, not a value) > There's some > discussion about lvalue vs rvalue in the document, and this is also > something that will take some experimenting. I'm not entirely sure > where to draw the line, and I'm not entirely sure there is a perfect > answer. I'm still curious where you think it describes value expressions for variables other than constants (which again, can't use the location description language) Again, i'd support such an extension, but it does not currently exist. Rest answers in other message. From amacleod@redhat.com Wed Dec 19 16:29:00 2007 From: amacleod@redhat.com (Andrew MacLeod) Date: Wed, 19 Dec 2007 16:29:00 -0000 Subject: Designs for better debug info in GCC In-Reply-To: <4aca3dc20712190759g748d6e15pa0e5146c3f5ca0ba@mail.gmail.com> References: <4766DF5C.1020802@google.com> <47671BF4.5050704@google.com> <4aca3dc20712181415y3d5c3717s6d73b1335b311313@mail.gmail.com> <4aca3dc20712182207y648f7bbhab9e0af8ad2ff832@mail.gmail.com> <4aca3dc20712190759g748d6e15pa0e5146c3f5ca0ba@mail.gmail.com> Message-ID: <47694256.8000303@redhat.com> Daniel Berlin wrote: > > Here is the easy one: > > z_5 = a_3 + b_3 > x_4 = z_5 + c_3 > > DEBUG(x, x_4) > > > Reassoc may transform this into: > > > z_5 = c_3 + b_3 > x_4 = z_5 + a_3 > > DEBUG(x, x_4) > > Now x has the wrong value. > ?? x_4 looks like it has the value 'a_3 + b_3 + c_3' in both examples to me, although computed in different orders... 
so isn't that still the right value? Andrew From dnovillo@google.com Wed Dec 19 17:22:00 2007 From: dnovillo@google.com (Diego Novillo) Date: Wed, 19 Dec 2007 17:22:00 -0000 Subject: [RFC] WHOPR - A whole program optimizer framework for GCC In-Reply-To: <20071218132914.GB12527@atrey.karlin.mff.cuni.cz> References: <47603F3C.2090808@google.com> <20071218132914.GB12527@atrey.karlin.mff.cuni.cz> Message-ID: <476946B8.9030409@google.com> On 12/18/07 08:29, Jan Hubicka wrote: > Doing call graph changes should not be that hard (I was trying to keep > similar deisgn in mind when implementing it, even if we stepped away > from the plan in some cases, like reorganizing passes from vertical to > horisontal order). Nearest problem I see is merging different > declarations of units read back, I have prototype implementation of DECL > merging pass done from my trip this week and hope to have it working at > least for --combine and C during christmas. Cool. Yeah, that is going to be one of the main things we need to continue. For the next little while I will be working on finishing tuples, most of what remains are mechanical changes to get bootstraps going. I will then work on tuning RTL generation. Since we have these two ongoing branches (LTO and tuples) that will be used by whole program optimizer, I think we need to coordinate a little bit. I wrote up a wiki page to keep all these things linked from one place. http://gcc.gnu.org/wiki/whopr I started a very incomplete implementation plan that I would like folks to help fill in. Ken/Nathan, what are the major issues still missing in LTO? I wrote up a couple, but I'm sure you guys have a much more complete list. Jan, wrt the optimization plan coming out of the analysis phase, and the various pieces of header/summary information, what do you think are the major pieces we need? In terms of branch mechanics, I'm initially tempted to do this implementation on a branch separate from tuples and lto. 
This will allow us to merge both lto and tuples separately, as the rest of
the optimizer is still a long ways away. What do folks think?

Thanks. Diego.

From aph@redhat.com Wed Dec 19 17:29:00 2007
From: aph@redhat.com (Andrew Haley)
Date: Wed, 19 Dec 2007 17:29:00 -0000
Subject: Strange error message from gdb
Message-ID: <18281.21294.238761.442229@zebedee.pink>

Die: DW_TAG_interface_type (abbrev = 23, offset = 4181)
    has children: FALSE
    attributes:
        DW_AT_declaration (DW_FORM_flag) flag: TRUE
Dwarf Error: Cannot find type of die [in module /home/aph/a.out]

I suppose this means that gcj is generating bad debug info, but I
don't know what it's complaining about exactly, so I don't know how to
fix it.

Here's the abbrev in question:

 <1><1055>: Abbrev Number: 23 (DW_TAG_interface_type)
    <1056> DW_AT_declaration : 1

Clues welcome...

Thanks,
Andrew.

--
Red Hat UK Ltd, Amberley Place, 107-111 Peascod Street, Windsor, Berkshire, SL4 1TE, UK
Registered in England and Wales No. 3798903

From drow@false.org Wed Dec 19 17:54:00 2007
From: drow@false.org (Daniel Jacobowitz)
Date: Wed, 19 Dec 2007 17:54:00 -0000
Subject: Strange error message from gdb
In-Reply-To: <18281.21294.238761.442229@zebedee.pink>
References: <18281.21294.238761.442229@zebedee.pink>
Message-ID: <20071219172943.GA5939@caradoc.them.org>

On Wed, Dec 19, 2007 at 05:21:50PM +0000, Andrew Haley wrote:
> Die: DW_TAG_interface_type (abbrev = 23, offset = 4181)
> has children: FALSE
> attributes:
> DW_AT_declaration (DW_FORM_flag) flag: TRUE
> Dwarf Error: Cannot find type of die [in module /home/aph/a.out]
>
> I suppose this means that gcj is generating bad debug info, but I
> don't know what it's complaining about exactly, so I don't know how to
> fix it.
>
> Here's the abbrev in question:
>
> <1><1055>: Abbrev Number: 23 (DW_TAG_interface_type)
> <1056> DW_AT_declaration : 1

That DIE doesn't have any content. It says "I am a declaration of an
interface".
But not which interface or what it's called or what the type is. I'd need a backtrace to be more specific, but in addition to bad debug info this may be a limitation in GDB; it does not know anything about DW_TAG_interface_type. -- Daniel Jacobowitz CodeSourcery From aph@redhat.com Wed Dec 19 18:01:00 2007 From: aph@redhat.com (Andrew Haley) Date: Wed, 19 Dec 2007 18:01:00 -0000 Subject: Strange error message from gdb In-Reply-To: <20071219172943.GA5939@caradoc.them.org> References: <18281.21294.238761.442229@zebedee.pink> <20071219172943.GA5939@caradoc.them.org> Message-ID: <18281.23234.151870.816362@zebedee.pink> Daniel Jacobowitz writes: > On Wed, Dec 19, 2007 at 05:21:50PM +0000, Andrew Haley wrote: > > Die: DW_TAG_interface_type (abbrev = 23, offset = 4181) > > has children: FALSE > > attributes: > > DW_AT_declaration (DW_FORM_flag) flag: TRUE > > Dwarf Error: Cannot find type of die [in module /home/aph/a.out] > > > > I suppose this means that gcj is generating bad debug info, but I > > don't know what it's complaining about exactly, so I don't know how to > > fix it. > > > > Here's the abbrev in question: > > > > <1><1055>: Abbrev Number: 23 (DW_TAG_interface_type) > > <1056> DW_AT_declaration : 1 > > That DIE doesn't have any content. It says "I am a declartion of an > interface". But not which interface or what it's called or what the > type is. Well, the type is the interface: there's nothing else it might be. > I'd need a backtrace to be more specific, but in addition to bad > debug info this may be a limitation in GDB; it does not know > anything about DW_TAG_interface_type. OK, thanks. Anyway, on inspection it seems like read_type_die() in dwarf2read.c doesn't know how to handle DW_TAG_interface_type. This is rather odd, given that dwarf_tag_name() does know about interface types. Maybe I should just fix gcj not to use DW_TAG_interface_type. Andrew. 
--
Red Hat UK Ltd, Amberley Place, 107-111 Peascod Street, Windsor, Berkshire, SL4 1TE, UK
Registered in England and Wales No. 3798903

From iant@google.com Wed Dec 19 18:41:00 2007
From: iant@google.com (Ian Lance Taylor)
Date: Wed, 19 Dec 2007 18:41:00 -0000
Subject: Designs for better debug info in GCC
In-Reply-To:
References: <4737BF2C.70408@codesourcery.com> <47388599.2040701@codesourcery.com> <4749DE66.1090602@codesourcery.com> <4756B02D.9010302@google.com> <4aca3dc20712151903r46c9eceane35edb92d08240ac@mail.gmail.com> <4aca3dc20712161712w1133fb96qd66be0e9a0bb1716@mail.gmail.com> <4766B8E5.60500@google.com> <4766DF5C.1020802@google.com> <47671BF4.5050704@google.com>
Message-ID:

Alexandre Oliva writes:

> You snipped (skipped?) one aspect of the reasoning on why it is
> appropriate. Of course this doesn't prove it's the best possibility,
> but I haven't seen evidence of why it isn't.

You will find it easier to demonstrate the worth of your proposal if
you act publicly as though your interlocutors are people of good
will, even when it doesn't seem that way to you, and omit
interjections like "(skipped?)". Assuming the goal is to get this
into mainline gcc, you have to convince us, not the other way around.
The first step in convincing people in this forum is not to irritate
them.

> Now, if you tell me that information about i_0 and j_2 is
> backward-propagated to the top of the function, where x and y are set
> up, I introduce say zero-initialization for i and j before probe1()
> (an actual function call, mind you), and then this representation is
> provably broken.

To be sure we are on the same page, I think your argument here is that
with this code:

int f(int x, int y) {
  int i = 0, j = 0;

  probe1();
  i = x;
  j = y;
  probe2();
  if (x < y)
    i += y;
  else
    j -= x;
  probe3();
  return g (i, j);
}

if I set a breakpoint just before the call to probe2(), and I print
the values of 'i' and 'j', I should get the values of 'x' and 'y'.
That is, you want to emit a DWARF variable note at that point that the
value of 'i' can be found in the location corresponding to 'x'.

Of course there are no actual instructions between the calls to
probe1() and probe2(). If I use gdb's "finish" command out of
probe1(), what values should I see for 'i' and 'j' at that point?
Arguably I am now before the assignment statements, and should see '0'
and '0', the values that 'i' and 'j' have before they are changed. Of
course, this is the same location as the breakpoint before probe2(),
and we can't see both '0'/'0' and 'x'/'y'.

So it seems to me that this situation is actually somewhat ambiguous.
I don't see an obviously correct answer.

Setting that aside, seeing the values 'x' and 'y' would probably be
more useful in practice, even if the other possibility is not wrong.

I think the general issue you are describing is how to handle an
assignment which appears in user code but which has been eliminated
during optimization. You are certainly correct: the scheme I was
outlining does not address deleted assignments.

It seems to me that such eliminated assignments are inherently
ambiguous. If the assignment is gone, then there is a point in the
generated code where the variable logically has both the old and the
new values. I assume that the debugger can only display one value.
Which one should it be?

Your representation clearly makes a choice. What makes it a
principled choice? Consider a series of assignments to a local
variable, and suppose that all the assignments are deleted because
they are unused. Are there dependencies between the DEBUG notes which
keep them in the right order?

One way to make a principled choice is to consider the line notes we
are going to emit with the debugging information. Presumably we do
not have the goal of emitting correct debug information in between
line notes--e.g., when using the "stepi" command in gdb.
Our goal is to emit correct debug information at the points where a debugger would naturally stop--the notes for where a line starts. I wonder whether it would be feasible for the debug info generation to work from the assignments in the source code as generated by the frontend. For each assignment, we would find the corresponding line note. Then we would look at the right hand side, and try to identify where that value could be found at that point in the program. This would be a variant of our current variable tracking pass. I haven't thought about this enough to know whether it would really work. > > It is of course true that optimized code will move around > > unpredictably, and your proposal doesn't handle that. > > It handles that in that a variable will be regarded as being assigned > to a value when execution crosses the debug stmt/insn originally > inserted right after the assignment. This is by design, but I realize > now I forgot to mention this in the design document. > > The idea is that, debug insns get high priority in scheduling. > However, since they mention the assignment just before them, if the > assignment is just moved earlier, without an intervening scheduling > barrier, then the debug instruction will follow it. If the assignment > is removed, then the debug insn can be legitimately be move up to the > point where the assignment, if remaining, might have been moved up to. > However, if the assignment is moved to a separate basic block, say out > of a loop or a conditional, then we don't want the debug insn to move > with it: such that hoisting and commonizing are regarded as setting > temporaries, and the value is only "committed" to the variable if we > get to the point where the assignment would take place. That will only work correctly if sched-deps.c introduces dependencies between debug insns and real insns. Otherwise, debug insns will move ahead of real insns which change their values. 
If you introduce those dependencies, I don't understand how you will avoid changing the schedulers behaviour in the presence of debug insns. How did you work around that problem? > >> Testing for accuracy and completeness of debug information can be best > >> accomplished using a debugging environment. > > > Of course this is very unsatisfactory without an automated testsuite. > > Err... I didn't say the testing through a debugging environment > wouldn't be automated. My plan is to use something along the lines of > the GDB testsuite scripts, but whether to use GDB or some other > debugging or monitoring infrastructure is a tiny implementation detail > that I haven't worried about at all. The basic idea is to script the > inspection of variables and verify that the obtained values are the > expected ones, or that variables are defensibly unavailable at the > inspection points. Personally, I would like to see that testsuite first. That will give us an operational definition to aim for, rather than a theoretical discussion which I find to be ambiguous. And it will avoid the problem of turning the testsuite into a regression testsuite rather than an accuracy testsuite. But of course I'm not doing the work. Ian From zadeck@naturalbridge.com Wed Dec 19 18:55:00 2007 From: zadeck@naturalbridge.com (Kenneth Zadeck) Date: Wed, 19 Dec 2007 18:55:00 -0000 Subject: [RFC] WHOPR - A whole program optimizer framework for GCC In-Reply-To: <476946B8.9030409@google.com> References: <47603F3C.2090808@google.com> <20071218132914.GB12527@atrey.karlin.mff.cuni.cz> <476946B8.9030409@google.com> Message-ID: <476965CC.5050301@naturalbridge.com> Diego Novillo wrote > On 12/18/07 08:29, Jan Hubicka wrote: > >> Doing call graph changes should not be that hard (I was trying to keep >> similar deisgn in mind when implementing it, even if we stepped away >> from the plan in some cases, like reorganizing passes from vertical to >> horisontal order). 
Nearest problem I see is merging different >> declarations of units read back, I have prototype implementation of DECL >> merging pass done from my trip this week and hope to have it working at >> least for --combine and C during christmas. > > Cool. Yeah, that is going to be one of the main things we need to > continue. For the next little while I will be working on finishing > tuples, most of what remains are mechanical changes to get bootstraps > going. I will then work on tuning RTL generation. > > Since we have these two ongoing branches (LTO and tuples) that will be > used by whole program optimizer, I think we need to coordinate a > little bit. I wrote up a wiki page to keep all these things linked > from one place. > > http://gcc.gnu.org/wiki/whopr > > I started a very incomplete implementation plan that I would like > folks to help fill in. > > Ken/Nathan, what are the major issues still missing in LTO? I wrote > up a couple, but I'm sure you guys have a much more complete list. > > Jan, wrt the optimization plan coming out of the analysis phase, and > the various pieces of header/summary information, what do you think > are the major pieces we need? > > In terms of branch mechanics, I'm initially tempted to do this > implementation on a branch separate from tuples and lto. This will > allow us to merge both lto and tuples separately, as the rest of the > optimizer is still a long ways away. What do folks think? > > > Thanks. Diego. I am hoping that in the next couple of days, Nathan and I will be able to say that we have completed to work that Codesourcery/NaturalBridge contracted to do with IBM. Completion means that we are able to compile and run the C language spec 2000 benchmarks in LTO mode, as well as compile all of the gcc compiler itself (this does not include the runtime). 
There are still many open issues that we are hoping that the community
would address. (The next four items are considered general
cleanups/improvements independent of LTO and would be welcomed as changes
to the trunk when stage I opens. However, a complete LTO implementation
depends on them being completed.)

1) Removal of the rest of the lang hooks.
2) Removal of support for not file at time mode (I believe that IanT has
   a patch for this.)
3) Removal of any remaining places where the front ends directly generate
   rtl.
4) Gimplifying static initializers at the same time as everything else.

When these 4 items are done, it will be possible to consider making lto
work with other front ends.

There are still LTO items that do not work with the C front ends. Most of
these support extensions to C.

1) We do not handle types that reference local variables, such as arrays
   that are sized by the parameter to a function.
2) Nested functions.
3) Attributes associated with types, like packed.

(1) may be hard. The rest are a simple matter of programming.

There is still the matter that it is difficult to separate the LTO type
information from the debugging information.

There are a large number of things that need to change to make lto/whopr
a reality. Many of them are addressed in the google document. I
personally was planning to start restructuring the ipa passes and
serializing the cgraph. I was waiting for Honza to get back to being
regularly available so that we could work on that together. The current
code does not need to serialize the cgraph since it loads all functions
into memory; the call graph is just rebuilt as each function is loaded.
This obviously needs to be changed before we can at all talk about
distributing the compilation.

I personally think that the most pressing problems are

1) making lto/whopr work in the presence of modules that do not fit
perfectly together, because of type or function argument mismatches.
I think that this will be a challenging problem that will require a lot
of thought and code. The easy case of just dying when things do not
match up is ok, but it is unlikely that lto/whopr will be a generally
useful tool without at least being able to swallow any existing program
and at least try to do something good.

2) making the front end specific aliasing information available in some
   language independent manner to the back ends.

Gcc is basically a C compiler with a bunch of other front ends grafted
onto it. While it makes many accommodations to the requirements of
other languages, it rarely does things to take advantage of the "higher
level" information that is available in non C languages. Toon's paper
at last year's summit is a good example of exactly how badly we do, and
the problem is likely to only get worse with LTO/whopr as the lang
hooks go away. While the last section of the whopr pays some lip
service to this, we as a community have never really addressed the
issues of how we could expand/change our internal representation to
accommodate the high level features supported by the non C front ends.

I have not looked at the code for Diego's tuplization. The wiki does
not indicate that there is any semantic difference between gimple trees
and gimple tuples; it just appears to be a well designed data structure
cleanup. It would be nice if this next round of intermediate code was
able to represent some of the information that we will be throwing away
with the lang hooks.

Both of these are very hard problems and they are likely to require the
same level of commitment that will be required to make Whopr work. It
is not that I think that making lto/whopr work in a distributed
environment is not an important problem, it is just that I think that
we need to make LTO produce good code on real programs first.
Kenny

From drow@false.org Wed Dec 19 19:00:00 2007
From: drow@false.org (Daniel Jacobowitz)
Date: Wed, 19 Dec 2007 19:00:00 -0000
Subject: Strange error message from gdb
In-Reply-To: <18281.23234.151870.816362@zebedee.pink>
References: <18281.21294.238761.442229@zebedee.pink> <20071219172943.GA5939@caradoc.them.org> <18281.23234.151870.816362@zebedee.pink>
Message-ID: <20071219185517.GA10986@caradoc.them.org>

On Wed, Dec 19, 2007 at 05:54:10PM +0000, Andrew Haley wrote:
> > That DIE doesn't have any content.  It says "I am a declaration of an
> > interface".  But not which interface or what it's called or what the
> > type is.
>
> Well, the type is the interface: there's nothing else it might be.

From the DWARF standard:

  Interface types are represented by debugging information entries with
  the tag DW_TAG_interface_type.

  An interface type entry has a DW_AT_name attribute, whose value is a
  null-terminated string containing the type name as it appears in the
  source program.

  The members of an interface are represented by debugging information
  entries that are owned by the interface type entry and that appear in
  the same order as the corresponding declarations in the source
  program.

So this is a declaration of an interface, but without a name.  GDB is
doing the wrong thing with it, but it still seems wrong to me.  Or do
Java interfaces have no name?

> Anyway, on inspection it seems like read_type_die() in dwarf2read.c
> doesn't know how to handle DW_TAG_interface_type.  This is rather odd,
> given that dwarf_tag_name() does know about interface types.

That's just a complete transcription of the DWARF tags (at some point
in history).
-- 
Daniel Jacobowitz
CodeSourcery

From drow@false.org Wed Dec 19 19:00:00 2007
From: drow@false.org (Daniel Jacobowitz)
Date: Wed, 19 Dec 2007 19:00:00 -0000
Subject: Designs for better debug info in GCC
In-Reply-To:
References: <4766B8E5.60500@google.com> <4766DF5C.1020802@google.com> <47671BF4.5050704@google.com>
Message-ID: <20071219190023.GB10986@caradoc.them.org>

On Wed, Dec 19, 2007 at 10:00:38AM -0800, Ian Lance Taylor wrote:
> int f(int x, int y) {
>   int i = 0, j = 0;
>
>   probe1();
>   i = x;
>   j = y;
>   probe2();
> Of course there are no actual instructions between the calls to
> probe1() and probe2().  If I use gdb's "finish" command out of
> probe1(), what values should I see for 'i' and 'j' at that point?
> Arguably I am now before the assignment statements, and should see '0'
> and '0', the values that 'i' and 'j' have before they are changed.  Of
> course, this is the same location as the breakpoint before probe2(),
> and we can't see both '0'/'0' and 'x'/'y'.  So it seems to me that
> this situation is actually somewhat ambiguous.  I don't see an
> obviously correct answer.

For once, I do.  As far as a debugger dares to distinguish, any
location is always the beginning of the next instruction, not the end
of the preceding instruction.  If you want to see the zeroes, stop in
probe1 and say "up" instead of "finish".

A hypothetical -Og which placed observation points between statements
would probably need a minimum of one nop per source line.  Similarly
for observation points at sequence points.
-- Daniel Jacobowitz CodeSourcery From aph@redhat.com Wed Dec 19 19:05:00 2007 From: aph@redhat.com (Andrew Haley) Date: Wed, 19 Dec 2007 19:05:00 -0000 Subject: Strange error message from gdb In-Reply-To: <20071219185517.GA10986@caradoc.them.org> References: <18281.21294.238761.442229@zebedee.pink> <20071219172943.GA5939@caradoc.them.org> <18281.23234.151870.816362@zebedee.pink> <20071219185517.GA10986@caradoc.them.org> Message-ID: <18281.27225.249982.220171@zebedee.pink> Daniel Jacobowitz writes: > On Wed, Dec 19, 2007 at 05:54:10PM +0000, Andrew Haley wrote: > > > That DIE doesn't have any content. It says "I am a declartion of an > > > interface". But not which interface or what it's called or what the > > > type is. > > > > Well, the type is the interface: there's nothing else it might be. > > >From the DWARF standard: > > Interface types are represented by debugging information entries with > the tag DW_TAG_interface_type. > > An interface type entry has a DW_AT_name attribute, whose value is a > null-terminated string containing the type name as it appears in the > source program. > > The members of an interface are represented by debugging information > entries that are owned by the interface type entry and that appear in > the same order as the corresponding declarations in the source > program. OK, so the name is missing, and that's wrong. I should find out why. > So this is a declaration of an interface, but without a name. GDB is > doing the wrong thing with it, but it still seems wrong to me. Or do > Java interfaces have no name? > > > Anyway, on inspection it seems like read_type_die() in dwarf2read.c > > doesn't know how to handle DW_TAG_interface_type. This is rather odd, > > given that dwarf_tag_name() does know about interface types. > > That's just a complete transcription of the DWARF tags (at some point > in history). Right, so read_type_die() doesn't know how to handle DW_TAG_interface_type. 
The weird thing is that I have never seen this error message before
today, and AFAIAA gcj has been generating these interface types for a
long while.

It seems to me that even if gcj did generate the name for the
interface, gdb would still die because it doesn't have any handlers for
DW_TAG_interface_type in dwarf2read.c.

Andrew.

-- 
Red Hat UK Ltd, Amberley Place, 107-111 Peascod Street, Windsor, Berkshire, SL4 1TE, UK
Registered in England and Wales No. 3798903

From drow@false.org Wed Dec 19 19:05:00 2007
From: drow@false.org (Daniel Jacobowitz)
Date: Wed, 19 Dec 2007 19:05:00 -0000
Subject: Strange error message from gdb
In-Reply-To: <18281.27225.249982.220171@zebedee.pink>
References: <18281.21294.238761.442229@zebedee.pink> <20071219172943.GA5939@caradoc.them.org> <18281.23234.151870.816362@zebedee.pink> <20071219185517.GA10986@caradoc.them.org> <18281.27225.249982.220171@zebedee.pink>
Message-ID: <20071219190450.GA11655@caradoc.them.org>

On Wed, Dec 19, 2007 at 07:00:41PM +0000, Andrew Haley wrote:
> It seems to me that even if gcj did generate the name for the
> interface, gdb would still die because it doesn't have any handlers
> for DW_TAG_interface_type in dwarf2read.c

Yes, you're probably right.  It thinks it's some kind of symbol,
probably.  There's a default: in the DIE processing that, strictly
speaking, ought not to be there.
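
The effect of a catch-all default: in tag dispatch can be sketched as
follows. This is a hypothetical illustration only, not the code in
GDB's dwarf2read.c: the enum die_kind and both classify functions are
invented for the example, and only the tag encodings (taken from the
DWARF 3 tag table) are real.

```c
/* Tag encodings as listed in the DWARF 3 specification. */
enum dwarf_tag {
  DW_TAG_class_type     = 0x02,
  DW_TAG_structure_type = 0x13,
  DW_TAG_variable       = 0x34,
  DW_TAG_interface_type = 0x38
};

/* Hypothetical classification of a DIE; not GDB's internal types. */
enum die_kind { DIE_IS_TYPE, DIE_IS_SYMBOL, DIE_IS_UNHANDLED };

/* Reader with a catch-all default: any tag without its own case is
   silently treated as a symbol, so an interface type is misread. */
static enum die_kind classify_with_default(unsigned tag)
{
  switch (tag) {
  case DW_TAG_class_type:
  case DW_TAG_structure_type:
    return DIE_IS_TYPE;
  default:
    return DIE_IS_SYMBOL;
  }
}

/* Reader that names every tag it understands and reports the rest
   instead of guessing. */
static enum die_kind classify_strict(unsigned tag)
{
  switch (tag) {
  case DW_TAG_class_type:
  case DW_TAG_structure_type:
  case DW_TAG_interface_type:
    return DIE_IS_TYPE;
  case DW_TAG_variable:
    return DIE_IS_SYMBOL;
  default:
    return DIE_IS_UNHANDLED;
  }
}
```

With the first form, a new tag that nobody has written a case for is
processed as if it were something else; with the second, it at least
surfaces as unhandled.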
-- Daniel Jacobowitz CodeSourcery From aoliva@redhat.com Wed Dec 19 19:13:00 2007 From: aoliva@redhat.com (Alexandre Oliva) Date: Wed, 19 Dec 2007 19:13:00 -0000 Subject: Designs for better debug info in GCC In-Reply-To: <4aca3dc20712190801o16d5ff60i72c9407fee026a09@mail.gmail.com> (Daniel Berlin's message of "Wed\, 19 Dec 2007 11\:01\:06 -0500") References: <4766B8E5.60500@google.com> <4766DF5C.1020802@google.com> <47671BF4.5050704@google.com> <4aca3dc20712181519rb637c16oea78bbcc18abe097@mail.gmail.com> <4aca3dc20712190801o16d5ff60i72c9407fee026a09@mail.gmail.com> Message-ID: On Dec 19, 2007, "Daniel Berlin" wrote: > On 12/18/07, Alexandre Oliva wrote: >> Dwarf enables arbitrary value expressions too. > Well, uh, no. > The only way to directly specify the value of a variable is for > constants. DW_AT_const_value does not allow location descriptions. DW_AT_const_value is irrelevant for location lists. It's DW_OP_* that I'm talking about. That said... I can't find any more the equivalent of DW_CFA_val_expression in DW_OP_*s that could be used in location expressions. I just *knew* it was there, but I guess I just imagined it. This is embarrassing. At this point, there are three options available: - go back to the drawing board - discard altogether expressions that don't represent lvalues (maybe don't even keep track of them) - introduce a DWARF extension that enables value expressions to be used in location lists (say DW_OP_value, DW_OP_temp_location, or even DW_OP_self_location (*)) (*) maps value to a virtual location that, if dereferenced, evaluates to the value. Could be "easily" implemented through a virtual out-of-range base address, plus the offset that represents the value on dereference, but there are many other ways to implement this in debug information consumers. 
> I'm still curious where you think it describes value expressions for > variables other than constants Me too :-) :-( Thanks for drawing my attention to this incorrect assumption I made about DWARF location lists. > i'd support such an extension Cool. Do you happen to know the procedure to propose DWARF standard extensions? -- Alexandre Oliva http://www.lsd.ic.unicamp.br/~oliva/ FSF Latin America Board Member http://www.fsfla.org/ Red Hat Compiler Engineer aoliva@{redhat.com, gcc.gnu.org} Free Software Evangelist oliva@{lsd.ic.unicamp.br, gnu.org} From dberlin@dberlin.org Wed Dec 19 19:25:00 2007 From: dberlin@dberlin.org (Daniel Berlin) Date: Wed, 19 Dec 2007 19:25:00 -0000 Subject: Designs for better debug info in GCC In-Reply-To: <47694256.8000303@redhat.com> References: <47671BF4.5050704@google.com> <4aca3dc20712181415y3d5c3717s6d73b1335b311313@mail.gmail.com> <4aca3dc20712182207y648f7bbhab9e0af8ad2ff832@mail.gmail.com> <4aca3dc20712190759g748d6e15pa0e5146c3f5ca0ba@mail.gmail.com> <47694256.8000303@redhat.com> Message-ID: <4aca3dc20712191112q31fee70atf1c92567787f1b82@mail.gmail.com> On 12/19/07, Andrew MacLeod wrote: > Daniel Berlin wrote: > > > > Here is the easy one: > > > > z_5 = a_3 + b_3 > > x_4 = z_5 + c_3 > > > > DEBUG(x, x_4) > > > > > > Reassoc may transform this into: > > > > > > z_5 = c_3 + b_3 > > x_4 = z_5 + a_3 > > > > DEBUG(x, x_4) > > > > Now x has the wrong value. > > > ?? > > x_4 looks like it has the value 'a_3 + b_3 + c_3' in both examples to > me, although computed in different orders... > > so isn't that still the right value? 
Yes, sorry, you have to add one more set of adds below and move one so
you can make it have a different value.

You get the general idea though :)

Reassoc knows they are all only used in each other, and that it is okay
to change their intermediate value as long as the last thing in the
chain retains its value (which it does since they are all commutative
operations).

>
> Andrew
>

From janis187@us.ibm.com Wed Dec 19 19:53:00 2007
From: janis187@us.ibm.com (Janis Johnson)
Date: Wed, 19 Dec 2007 19:53:00 -0000
Subject: Designs for better debug info in GCC
In-Reply-To:
References: <4737BF2C.70408@codesourcery.com> <47388599.2040701@codesourcery.com> <4749DE66.1090602@codesourcery.com> <4756B02D.9010302@google.com> <4aca3dc20712151903r46c9eceane35edb92d08240ac@mail.gmail.com> <4aca3dc20712161712w1133fb96qd66be0e9a0bb1716@mail.gmail.com> <4766B8E5.60500@google.com> <4766DF5C.1020802@google.com> <47671BF4.5050704@google.com>
Message-ID: <1198092296.6413.5.camel@janis-laptop>

On Wed, 2007-12-19 at 10:00 -0800, Ian Lance Taylor wrote:
> One way to make a principled choice is to consider the line notes we
> are going to emit with the debugging information.  Presumably we do
> not have the goal of emitting correct debug information in between
> line notes--e.g., when using the "stepi" command in gdb.  Our goal is
> to emit correct debug information at the points where a debugger would
> naturally stop--the notes for where a line starts.

Debugging in between line notes is important for core files and when
moving up and down the call stack, so at such locations the debugger
needs to at least know whether debug information is reliable or not.
Janis From aoliva@redhat.com Wed Dec 19 20:00:00 2007 From: aoliva@redhat.com (Alexandre Oliva) Date: Wed, 19 Dec 2007 20:00:00 -0000 Subject: Designs for better debug info in GCC In-Reply-To: <4aca3dc20712190759g748d6e15pa0e5146c3f5ca0ba@mail.gmail.com> (Daniel Berlin's message of "Wed\, 19 Dec 2007 10\:59\:41 -0500") References: <4766DF5C.1020802@google.com> <47671BF4.5050704@google.com> <4aca3dc20712181415y3d5c3717s6d73b1335b311313@mail.gmail.com> <4aca3dc20712182207y648f7bbhab9e0af8ad2ff832@mail.gmail.com> <4aca3dc20712190759g748d6e15pa0e5146c3f5ca0ba@mail.gmail.com> Message-ID: On Dec 19, 2007, "Daniel Berlin" wrote: > Here is the easy one: > z_5 = a_3 + b_3 > x_4 = z_5 + c_3 > DEBUG(x, x_4) > Reassoc may transform this into: > z_5 = c_3 + b_3 > x_4 = z_5 + a_3 > DEBUG(x, x_4) > Now x has the wrong value. As Andrew said, no, it doesn't. Now, if z_5 were present in a debug expression, then it would need adjusting. No different from the adjusting need for any other instruction in which z_5 was present, though. That's what I mean when I talk about letting the optimizers do their job on debug instructions too. 
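
The easy example quoted above can be checked by writing both statement
orders out as plain C. This is only a sketch: the struct and function
names are invented for illustration, and ordinary int arithmetic stands
in for the GIMPLE statements.

```c
/* The two SSA names that the easy example defines. */
struct easy_names { int z_5; int x_4; };

/* Original order: z_5 = a_3 + b_3; x_4 = z_5 + c_3. */
static struct easy_names easy_original(int a_3, int b_3, int c_3)
{
  struct easy_names r;
  r.z_5 = a_3 + b_3;
  r.x_4 = r.z_5 + c_3;
  return r;
}

/* After reassociation: z_5 = c_3 + b_3; x_4 = z_5 + a_3. */
static struct easy_names easy_reassociated(int a_3, int b_3, int c_3)
{
  struct easy_names r;
  r.z_5 = c_3 + b_3;
  r.x_4 = r.z_5 + a_3;
  return r;
}
```

For inputs 1, 2 and 4, x_4 is 7 in both versions, while z_5 goes from 3
to 6. A DEBUG(x, x_4) statement therefore stays valid, whereas a debug
expression that mentioned z_5 is the one that would need adjusting.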
-- 
Alexandre Oliva         http://www.lsd.ic.unicamp.br/~oliva/
FSF Latin America Board Member         http://www.fsfla.org/
Red Hat Compiler Engineer   aoliva@{redhat.com, gcc.gnu.org}
Free Software Evangelist  oliva@{lsd.ic.unicamp.br, gnu.org}

From amacleod@redhat.com Wed Dec 19 20:00:00 2007
From: amacleod@redhat.com (Andrew MacLeod)
Date: Wed, 19 Dec 2007 20:00:00 -0000
Subject: Designs for better debug info in GCC
In-Reply-To: <4aca3dc20712190759g748d6e15pa0e5146c3f5ca0ba@mail.gmail.com>
References: <4766DF5C.1020802@google.com> <47671BF4.5050704@google.com> <4aca3dc20712181415y3d5c3717s6d73b1335b311313@mail.gmail.com> <4aca3dc20712182207y648f7bbhab9e0af8ad2ff832@mail.gmail.com> <4aca3dc20712190759g748d6e15pa0e5146c3f5ca0ba@mail.gmail.com>
Message-ID: <47697618.3090807@redhat.com>

> It gets worse, however
>
> c_3 = a_1 + b_2
> z_5 = c_3 + d_9
> x_4 = z_5 + e_10
> DEBUG(x, x_4)
> y_7 = x_4 + f_11
> z_8 = y_7 + g_12
> ->
>
> c_3 = a_1 + b_2
> z_5 = c_3 + g_12
> x_4 = z_5 + e_10
> DEBUG(x, x_4)
> y_7 = x_4 + f_11
> z_8 = y_7 + d_9
>
>
> x_4 now no longer represents the value of x, but we haven't directly
> changed x_4, its immediate users, or the statements that immediately
> make up its defining values.
>

This does seem more troublesome.  Reassociation shuffles things around
without changing the LHS, presumably because it has looked at the uses
and knows there are no uses outside the expression, so it can
manipulate them however it wants.  It elects not to create new temps
since it knows the old ones aren't being used elsewhere, so why waste
new entries?

So if it was aware of the debug stmt, there would be a use of x_4
outside the expression, and it would no longer do the same
reassociation.

Is that the gist of it?
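
The longer chain quoted above can be written out the same way. Again
this is a sketch under stated assumptions: the struct and function
names are invented, and plain int arithmetic stands in for the GIMPLE
statements.

```c
/* The SSA names of interest in the longer chain. */
struct chain_names { int x_4; int z_8; };

/* Original chain: d_9 is added early, g_12 last. */
static struct chain_names chain_original(int a_1, int b_2, int d_9,
                                         int e_10, int f_11, int g_12)
{
  struct chain_names r;
  int c_3 = a_1 + b_2;
  int z_5 = c_3 + d_9;
  int y_7;
  r.x_4 = z_5 + e_10;   /* DEBUG(x, x_4) observes this value */
  y_7 = r.x_4 + f_11;
  r.z_8 = y_7 + g_12;
  return r;
}

/* After reassociation: d_9 and g_12 have swapped places. */
static struct chain_names chain_reassociated(int a_1, int b_2, int d_9,
                                             int e_10, int f_11, int g_12)
{
  struct chain_names r;
  int c_3 = a_1 + b_2;
  int z_5 = c_3 + g_12;
  int y_7;
  r.x_4 = z_5 + e_10;   /* same statement text, different value */
  y_7 = r.x_4 + f_11;
  r.z_8 = y_7 + d_9;
  return r;
}
```

With inputs 1, 2, 4, 8, 16 and 32, z_8 is 63 either way, but x_4 is 15
in the original chain and 43 after the swap, even though the statement
defining x_4 reads the same in both versions.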
Andrew From aoliva@redhat.com Wed Dec 19 20:03:00 2007 From: aoliva@redhat.com (Alexandre Oliva) Date: Wed, 19 Dec 2007 20:03:00 -0000 Subject: Designs for better debug info in GCC In-Reply-To: <4aca3dc20712182207y648f7bbhab9e0af8ad2ff832@mail.gmail.com> (Daniel Berlin's message of "Wed\, 19 Dec 2007 01\:07\:29 -0500") References: <4766B8E5.60500@google.com> <4766DF5C.1020802@google.com> <47671BF4.5050704@google.com> <4aca3dc20712181415y3d5c3717s6d73b1335b311313@mail.gmail.com> <4aca3dc20712182207y648f7bbhab9e0af8ad2ff832@mail.gmail.com> Message-ID: On Dec 19, 2007, "Daniel Berlin" wrote: > On 12/19/07, Alexandre Oliva wrote: >> On Dec 18, 2007, "Daniel Berlin" wrote: >> >> > Consider PRE alone, >> >> > If your debug statement strategy is "move debug statements when we >> > insert code that is equivalent" >> >> Move? Debug statements don't move, in general. I'm not sure what you >> have in mind, but I sense some disconnect here. > OKay, so if you aren't going to move them, you have to erase them when > you move statements around. Why? They still represent the point of binding between user variable and value. > How were you going to generate the initial set of debug annotations? It's in the document: after each assignment to user variable, and at PHI nodes for user variables. The debug statement means the variable holds that value from that point on until conflicting information arises (i.e., another debug statement for the same variable, or a control flow merge with different values for the same variable) > How were you going to update it if you saw a statement was updated to > say x_5 = x_4 instead of x_5 = x_3 + x_2. No update needed, if x_5 is the value of interest. I'm not sure that's what you're asking, though. 
> So then how will using your debug annotations and updating them come
> out any different than say performing a value numbering pass where you
> also associate user variables with the ssa names (IE alongside our
> value numbers), and propagate them around as well?

First, debug annotations may be at different points than the
corresponding SSA definitions, because the same SSA definition may be
bound to different variables at different ranges.

Second, debug annotations may contain more complex expressions than a
single SSA name, and there may not be any SSA name that represents the
value of these expressions left.  For example, given:

  x_3 = a_1 + b_2;
  # DEBUG x => x_3
  foo();

if we find that x_3 is unused elsewhere, we can drop it without
discarding debug information about the value of x at that point

  # DEBUG x => a_1 + b_2
  foo();

such that, if we stop at the call and print x, we get the expected
value, even though the actual variable was optimized away.

> At the end, you could emit DEBUG(user var, ssa name) right after each
> SSA_NAME_DEF_STMT for all user vars in the user var set for ssa name.

This doesn't work.  Consider:

  a_2 = whatever1;
  b_4 = whatever2;
  x_1 = a_2;
  probe();
  if (condition) {
    probe();
    x_3 = b_4;
    probe();
  }
  x_5 = PHI <x_1, x_3>;
  probe();

Now, if you optimize it and apply the debug stmt generation technique
you suggested, this is what you get:

  T_2 = whatever1;
  # DEBUG a => T_2
  # DEBUG x => T_2
  T_4 = whatever2;
  # DEBUG b => T_4
  # DEBUG x => T_4
  probe();
  if (condition) {
    probe();
    probe();
  }
  T_5 = PHI <T_2, T_4>
  # DEBUG x => T_5
  probe();

What do you get if you print x at each of the probe points?

> I don't see why you believe user variables/bindings are special and
> can't be propagated in this manner,

It's not that I don't believe it, it's just that just being able to
propagate them is not enough.  We must also take the binding point into
account.
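
The probe question above can be worked through with a small simulation.
This is a sketch under stated assumptions, not a model of GCC itself:
the function names and the NOT_REACHED sentinel are invented, plain
ints stand in for whatever1 and whatever2 (assumed non-negative so the
sentinel is unambiguous), and "a later DEBUG statement for the same
variable overrides the earlier one" is taken from the description of
debug statements earlier in this message.

```c
#define NOT_REACHED (-1)  /* hypothetical marker for probes not executed */

/* Value of x the source program has at each of the four probes. */
static void x_in_source(int a, int b, int condition, int x_at[4])
{
  int x = a;                      /* x_1 = a_2 */
  x_at[0] = x;
  x_at[1] = x_at[2] = NOT_REACHED;
  if (condition) {
    x_at[1] = x;                  /* probe before the assignment */
    x = b;                        /* x_3 = b_4 */
    x_at[2] = x;                  /* probe after the assignment */
  }
  x_at[3] = x;                    /* x_5 = PHI */
}

/* Value a debugger would report if each DEBUG statement sits right
   after the definition of the SSA name it refers to. */
static void x_with_annotations_at_defs(int a, int b, int condition,
                                       int x_at[4])
{
  int T_2 = a;
  int x = T_2;                    /* # DEBUG x => T_2 */
  int T_4 = b;
  int T_5;
  x = T_4;                        /* # DEBUG x => T_4 overrides it */
  x_at[0] = x;
  x_at[1] = x_at[2] = NOT_REACHED;
  if (condition) {
    x_at[1] = x;
    x_at[2] = x;
  }
  T_5 = condition ? T_4 : T_2;    /* T_5 = PHI */
  x = T_5;                        /* # DEBUG x => T_5 */
  x_at[3] = x;
}
```

With a = 10, b = 20 and the condition true, the source program shows
10, 10, 20, 20 at the four probes, while the annotations placed at the
definitions yield 20, 20, 20, 20. Only the last two probes agree.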
Now, as I wrote to Ian last night, if we just add a binding point
annotation to this mix, then we have sufficient information:

  T_2 = whatever1;
  # DEBUG a => T_2 here
  # DEBUG x => T_2 at P1
  T_4 = whatever2;
  # DEBUG b => T_4 here
  # DEBUG x => T_4 at P2
  probe();
  # DEBUG point P1
  if (condition) {
    probe();
    # DEBUG point P2
    probe();
  }
  T_5 = PHI <T_2, T_4>
  # DEBUG x => T_5
  probe();

I still don't see how, in this notation, we'd represent something like
"at this point, the value of this user variable is unknown".  Any
ideas?

Also, this strategy works for the nice and well-behaved Tree SSA
optimization passes.  For RTL, that is far less abstract, especially
after register allocation, I don't see that we can rely on such a
simple strategy.  But, in a way, I hope I'm wrong ;-)

>> > #3 is a dataflow problem, and not something you want to do every time
>> > you insert a call.

>> I'm not sure what you mean by "inserting calls".  We don't do that.

> Sure we do.
> We will definitely insert new calls when we PRE const/pure calls, or
> calls we determine to be movable to the point we want to move them

I think of that as moving, rather than inserting.  That said, I still
don't quite see what you're getting at.  Calls don't mess with gimple
registers of their callers, ever, so it appears to me that inserting a
call in the tree level is a NOP in terms of debug information
annotations.

> I'm not sure why you believe all the calls that we end up with in the
> IR are actually in the source (or even implied by it).

Conceptually, they are, kind-a sort of :-)  Except perhaps for
profiling calls, that are meant to be fully transparent anyway.  Others
are more akin to inlining, or using a call for convenience rather than
expanding a copy or something to that effect.

>> But I'm not computing that in trees.  I'm just collecting and
>> maintaining data points for var-tracking, all the way from the tree
>> level.
> Okay, then for trees, why bother tracking it when you can compute it
> right before translation with the same accuracy you can if you update
> it every time you make statement changes?

Just because we still haven't found a reliable way to do so that
doesn't drop essential information for correct debug info.  If we do,
I'll be delighted to immediately drop the proposed debug annotations in
the tree level.  And in the RTL level as well.

>> And debug information is not just about the values, it's about
>> mapping variables to values and locations.

> You have no locations at the tree level,

?!?  Locations as in point of execution, rather than DWARF locations,
is what I mean.

> and i've explicitly said what
> i said applies to the tree level :)

Indeed ;-)

>> So, we can't infer all the
>> information we need.

> Again, i believe we can at the tree level.

Good, let's keep on it.  How about you use something like the example
above to explain how to accomplish it?

-- 
Alexandre Oliva         http://www.lsd.ic.unicamp.br/~oliva/
FSF Latin America Board Member         http://www.fsfla.org/
Red Hat Compiler Engineer   aoliva@{redhat.com, gcc.gnu.org}
Free Software Evangelist  oliva@{lsd.ic.unicamp.br, gnu.org}

From jklowden@freetds.org Wed Dec 19 20:07:00 2007
From: jklowden@freetds.org (jklowden@freetds.org)
Date: Wed, 19 Dec 2007 20:07:00 -0000
Subject: -Wparentheses lumps too much together
Message-ID: <20071219200235.GA21525@oak.schemamania.org>

Hello Gentlemen,

Much as I'm a fan of the GCC and rely on -Wall, I would like to suggest
to you that -Wparentheses should be split up, and things it
checks/suggests be moved out of -Wall.  If this is not the right forum
or if you'd rather see this as a bug report, I'm happy to go where I'm
pointed.

The basic problem with this warning is that it includes some helpful
advice about some subtle bugs with some "dear beginner" advice about
"operators ... whose precedence people often get confused about."
My assertion is that operator precedence is fundamental to C.  No one
can read C without knowing it.  If you're unsure about precedence,
you're only guessing.  Moreover, to interpret operator precedence, it's
not necessary to, say, remember the variables' declarations 100 lines
earlier and it's impossible to be misled by non-syntactic indentation.
Everything you need to know to understand the statement is contained in
the statement itself.

My specific candidate for exclusion from -Wall is this one:

	if (a && b || c && d)

which yields (as you know) advice to parenthesize the two && pairs.

I very much think this is unhelpful, counterproductive advice.  Yes, I
know beginners get confused by and/or precedence.  But *every* language
that I know of that has operator precedence places 'and' before 'or'.
More important, a C programmer will encounter many thousands such
expressions in his dealings with the language.  To "help" him is merely
to retard his education.

But -Wall isn't really concerned with helping beginners, right?  It's
really about avoiding errors and pitfalls.  That's why I suggest this
and other precedence warnings be moved to -Woperator (notional).  If
you need help learning C, GCC will help you.  But if you're maintaining
100,000 lines of portable, standards-compliant C (as I do), you don't
want warnings to parenthesize things that don't need it.

At the risk of overstaying my welcome, let me answer the commonest
reply, "what's a few parentheses among friends?"  The problem is that,
for the experienced programmer, "extra" parentheses are a signal:
something unusual is going on here.  By forcing your most careful users
-- those who bother with -Wall -- to use parentheses to enforce what
the compiler would do anyway where the meaning was clear(er) without
them, you're removing a signalling tool by "insisting" on white noise.
I don't think that's a good outcome and I doubt it was the intention,
but there it is.
By the way, I distinguish the 'and/or' advice from the more helpful concern: while ((erc=foo()) != 0) because there the programmer's intent is likely impossible to express without parentheses; the idiom requires the override. One last point. In looking for the rationale behind this warning, I searched for examples of it. I didn't find any discussion on this list. What I did find were many examples of people rototilling perfectly fine code, "improving" it by adding unneeded parenthesis specifically so that it would compile cleanly with -Wall. I think that's a shame: a waste of effort at best. I ask you, please, to consider splitting advice about operator precedence from advice about mismatched if/else branches, and to exclude advice about making sure && is parenthesized ahead of || from -Wall. -Wall is the standard for "good, clean code" in many projects. This warning doesn't belong there. Thank you for your time and consideration. Regards, --jkl From drow@false.org Wed Dec 19 20:11:00 2007 From: drow@false.org (Daniel Jacobowitz) Date: Wed, 19 Dec 2007 20:11:00 -0000 Subject: Designs for better debug info in GCC In-Reply-To: References: <4766DF5C.1020802@google.com> <47671BF4.5050704@google.com> <4aca3dc20712181519rb637c16oea78bbcc18abe097@mail.gmail.com> <4aca3dc20712190801o16d5ff60i72c9407fee026a09@mail.gmail.com> Message-ID: <20071219200720.GA14956@caradoc.them.org> On Wed, Dec 19, 2007 at 05:02:52PM -0200, Alexandre Oliva wrote: > That said... I can't find any more the equivalent of > DW_CFA_val_expression in DW_OP_*s that could be used in location > expressions. I just *knew* it was there, but I guess I just imagined > it. This is embarrassing. I am pretty sure such an extension has already been proposed. Might want to check with the committee (see dwarf.org). 
-- 
Daniel Jacobowitz
CodeSourcery

From doug.gregor@gmail.com Wed Dec 19 20:14:00 2007
From: doug.gregor@gmail.com (Doug Gregor)
Date: Wed, 19 Dec 2007 20:14:00 -0000
Subject: -Wparentheses lumps too much together
In-Reply-To: <20071219200235.GA21525@oak.schemamania.org>
References: <20071219200235.GA21525@oak.schemamania.org>
Message-ID: <24b520d20712191211w70c6544fxfc3954a4fffa023@mail.gmail.com>

On Dec 19, 2007 3:02 PM, wrote:
> One last point.  In looking for the rationale behind this warning, I searched
> for examples of it.  I didn't find any discussion on this list.  What I did
> find were many examples of people rototilling perfectly fine code, "improving"
> it by adding unneeded parentheses specifically so that it would compile
> cleanly with -Wall.  I think that's a shame: a waste of effort at best.
>
> I ask you, please, to consider splitting advice about operator precedence from
> advice about mismatched if/else branches, and to exclude advice about
> making sure && is parenthesized ahead of || from -Wall.  -Wall is the
> standard for "good, clean code" in many projects.  This warning doesn't
> belong there.

For what it is worth, I completely agree with everything you have said
here.  This warning oversteps the bounds of what -Wall should do, and
forces people to change perfectly good, clean code.  Operator
precedence is an important concept that any C or C++ programmer should
know, and we're not helping anyone by pretending that programmers won't
understand this concept.

We should certainly remove the warning from -Wall, and perhaps remove
it entirely.
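
For reference, the grouping of the expression from the original post
can be checked mechanically. The sketch below is illustrative only: the
function names are invented, and the diagnostic pragmas are there just
so the unparenthesized form compiles quietly under -Wall.

```c
#pragma GCC diagnostic push
#pragma GCC diagnostic ignored "-Wparentheses"

/* Returns 1 if "a && b || c && d" and "(a && b) || (c && d)" agree
   for all sixteen combinations of truth values, 0 otherwise. */
static int and_or_forms_agree(void)
{
  int bits;
  for (bits = 0; bits < 16; bits++) {
    int a = bits & 1, b = (bits >> 1) & 1;
    int c = (bits >> 2) & 1, d = (bits >> 3) & 1;
    int unparenthesized = a && b || c && d;   /* what -Wparentheses flags */
    int parenthesized = (a && b) || (c && d); /* what the warning suggests */
    if (unparenthesized != parenthesized)
      return 0;
  }
  return 1;
}

/* Counts the inputs on which the other plausible reading,
   "a && (b || c) && d", gives a different answer. */
static int other_grouping_mismatches(void)
{
  int bits, mismatches = 0;
  for (bits = 0; bits < 16; bits++) {
    int a = bits & 1, b = (bits >> 1) & 1;
    int c = (bits >> 2) & 1, d = (bits >> 3) & 1;
    if ((a && b || c && d) != (a && (b || c) && d))
      mismatches++;
  }
  return mismatches;
}

#pragma GCC diagnostic pop
```

The first function confirms that the suggested parentheses do not
change the meaning; the second shows that the two readings of the bare
expression are genuinely different, which is the ambiguity the warning
is aimed at.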
- Doug

From drow@false.org Wed Dec 19 20:15:00 2007
From: drow@false.org (Daniel Jacobowitz)
Date: Wed, 19 Dec 2007 20:15:00 -0000
Subject: -Wparentheses lumps too much together
In-Reply-To: <20071219200235.GA21525@oak.schemamania.org>
References: <20071219200235.GA21525@oak.schemamania.org>
Message-ID: <20071219201426.GA15270@caradoc.them.org>

On Wed, Dec 19, 2007 at 03:02:35PM -0500, jklowden@freetds.org wrote:
> My specific candidate for exclusion from -Wall is this one:
>
> if (a && b || c && d)
>
> which yields (as you know) advice to parenthesize the two && pairs.
>
> I very much think this is unhelpful, counterproductive advice.
> Yes, I know beginners get confused by and/or precedence.  But
> *every* language that I know of that has operator precedence places
> 'and' before 'or'.  More important, a C programmer will encounter
> many thousands such expressions in his dealings with the language.
> To "help" him is merely to retard his education.

I am happy to stand as a counterexample; I am an experienced C
programmer and I greatly appreciate this warning.  And I loathe reading
code which cavalierly assumes you remember the precedence.  + and *,
sure, you learn that in grade school.  && and || is trickier because
there are sensible arguments for both directions; it is harder to
derive from first principles.

If you are more bothered by any clarifying parentheses than I am, use
-Wno-parentheses.

-- 
Daniel Jacobowitz
CodeSourcery

From ismail@pardus.org.tr Wed Dec 19 20:39:00 2007
From: ismail@pardus.org.tr (Ismail Dönmez)
Date: Wed, 19 Dec 2007 20:39:00 -0000
Subject: -Wparentheses lumps too much together
In-Reply-To: <24b520d20712191211w70c6544fxfc3954a4fffa023@mail.gmail.com>
References: <20071219200235.GA21525@oak.schemamania.org> <24b520d20712191211w70c6544fxfc3954a4fffa023@mail.gmail.com>
Message-ID: <200712192214.56822.ismail@pardus.org.tr>

On Wednesday 19 December 2007 22:11:22, Doug Gregor wrote:
> On Dec 19, 2007 3:02 PM, wrote:
> > One last point.  In looking for the rationale behind this warning, I
> > searched for examples of it.  I didn't find any discussion on this list.
> > What I did find were many examples of people rototilling perfectly fine
> > code, "improving" it by adding unneeded parentheses specifically so that
> > it would compile cleanly with -Wall.  I think that's a shame: a waste of
> > effort at best.
> >
> > I ask you, please, to consider splitting advice about operator precedence
> > from advice about mismatched if/else branches, and to exclude advice
> > about making sure && is parenthesized ahead of || from -Wall.  -Wall is
> > the standard for "good, clean code" in many projects.  This warning
> > doesn't belong there.
>
> For what it is worth, I completely agree with everything you have said
> here.  This warning oversteps the bounds of what -Wall should do, and
> forces people to change perfectly good, clean code.  Operator
> precedence is an important concept that any C or C++ programmer should
> know, and we're not helping anyone by pretending that programmers
> won't understand this concept.
>
> We should certainly remove the warning from -Wall, and perhaps remove
> it entirely.

Agreed, for example it has lots of useless warnings when compiling
ffmpeg.

Regards,
ismail

-- 
Never learn by your mistakes, if you do you may never dare to try again.
From dberlin@dberlin.org Wed Dec 19 20:40:00 2007 From: dberlin@dberlin.org (Daniel Berlin) Date: Wed, 19 Dec 2007 20:40:00 -0000 Subject: Designs for better debug info in GCC In-Reply-To: <47697618.3090807@redhat.com> References: <47671BF4.5050704@google.com> <4aca3dc20712181415y3d5c3717s6d73b1335b311313@mail.gmail.com> <4aca3dc20712182207y648f7bbhab9e0af8ad2ff832@mail.gmail.com> <4aca3dc20712190759g748d6e15pa0e5146c3f5ca0ba@mail.gmail.com> <47697618.3090807@redhat.com> Message-ID: <4aca3dc20712191238t490cda9di9eac38cfc1d1eeb8@mail.gmail.com> On 12/19/07, Andrew MacLeod wrote: > > > It gets worse, however > > > > c_3 = a_1 + b_2 > > z_5 = c_3 + d_9 > > x_4 = z_5 + e_10 > > DEBUG(x, x_4) > > y_7 = x_4 + f_11 > > z_8 = y_7 + g_12 > > -> > > > > c_3 = a_1 + b_2 > > z_5 = c_3 + g_12 > > x_4 = z_5 + e_10 > > DEBUG(x, x_4) > > y_7 = x_4 + f_11 > > z_8 = y_7 + d_9 > > > > > > x_4 now no longer represents the value of x, but we haven't directly > > changed x_4, it's immediate users, or the statements that immediately > > make up it's defining values. > > > > > > This does seem more troublesome. Reassociation shuffles things around > without changing the LHS presumably because it has looked at the uses > and knows there are no uses outside the expression, so it can manipulate > them however it wants. It elects not to create new temps since it knows > the old ones aren't being used elsewhere, so why wast new entries. Yes. > > So if it was aware of the debug stmt, there would be a use of x_4 > outside the expression, and it would no longer do the same reassociation. Either that, or you would have to hunt all the uses of every single thing in the chain to see if any were debug expressions, and if the value is going to change. > > Is that the jist of it? 
Yes > > Andrew > From dberlin@dberlin.org Wed Dec 19 21:11:00 2007 From: dberlin@dberlin.org (Daniel Berlin) Date: Wed, 19 Dec 2007 21:11:00 -0000 Subject: Designs for better debug info in GCC In-Reply-To: References: <47671BF4.5050704@google.com> <4aca3dc20712181415y3d5c3717s6d73b1335b311313@mail.gmail.com> <4aca3dc20712182207y648f7bbhab9e0af8ad2ff832@mail.gmail.com> <4aca3dc20712190759g748d6e15pa0e5146c3f5ca0ba@mail.gmail.com> Message-ID: <4aca3dc20712191240r472ee46ekacf32886a1abd63@mail.gmail.com> On 12/19/07, Alexandre Oliva wrote: > On Dec 19, 2007, "Daniel Berlin" wrote: > > > Here is the easy one: > > > z_5 = a_3 + b_3 > > x_4 = z_5 + c_3 > > > DEBUG(x, x_4) > > > > Reassoc may transform this into: > > > > z_5 = c_3 + b_3 > > x_4 = z_5 + a_3 > > > DEBUG(x, x_4) > > > Now x has the wrong value. > > As Andrew said, no, it doesn't. > Yes, I corrected it later. You didn't address the other one, which is much harder and does require addressing by you. > Now, if z_5 were present in a debug expression, then it would need > adjusting. No different from the adjusting need for any other > instruction in which z_5 was present, though. Uh, but if you don't adjust in the fixed examples, DEBUG(x, x_4) will give an invalid value. You can cause this value to change without ever changing x_4, and do so legally. How do I know I need to change this DEBUG expression?
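[Editorial illustration, not part of the original message.] The harder example quoted earlier in this thread can be replayed in plain C to make the concern concrete. This is only a sketch: the struct, the function names and the sample inputs are hypothetical, and ordinary int arithmetic stands in for the GIMPLE statements. It shows that swapping d_9 and g_12 leaves the final sum z_8 intact while the value held in x_4, which DEBUG(x, x_4) binds to the user variable x, changes.

```c
#include <assert.h>

/* Intermediate "SSA names" produced by one evaluation order of the sum. */
struct chain {
    int c_3, z_5, x_4, y_7, z_8;
};

/* Statement order before reassociation, as in the quoted example. */
struct chain before_reassoc(int a, int b, int d, int e, int f, int g)
{
    struct chain s;
    s.c_3 = a + b;
    s.z_5 = s.c_3 + d;
    s.x_4 = s.z_5 + e;   /* DEBUG(x, x_4) is attached here */
    s.y_7 = s.x_4 + f;
    s.z_8 = s.y_7 + g;
    return s;
}

/* Statement order after reassociation: d_9 and g_12 have traded places. */
struct chain after_reassoc(int a, int b, int d, int e, int f, int g)
{
    struct chain s;
    s.c_3 = a + b;
    s.z_5 = s.c_3 + g;
    s.x_4 = s.z_5 + e;   /* same statement text, different value */
    s.y_7 = s.x_4 + f;
    s.z_8 = s.y_7 + d;
    return s;
}

/* Self-check with arbitrary sample inputs (an assumption of this sketch). */
void check_reassoc(void)
{
    struct chain p = before_reassoc(1, 2, 3, 4, 5, 6);
    struct chain q = after_reassoc(1, 2, 3, 4, 5, 6);

    assert(p.z_8 == q.z_8);          /* the final result is preserved */
    assert(p.x_4 != q.x_4);          /* x_4 no longer matches source-level x */
    assert(p.x_4 - q.x_4 == 3 - 6);  /* it is off by d - g */
}
```

Calling check_reassoc() from a small driver exercises both orders. Note that the statement defining x_4 reads the same in both versions; only an operand two statements upstream changed, which is why inspecting x_4's own definition is not enough to detect the problem.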
From iant@google.com Wed Dec 19 21:17:00 2007 From: iant@google.com (Ian Lance Taylor) Date: Wed, 19 Dec 2007 21:17:00 -0000 Subject: Designs for better debug info in GCC In-Reply-To: <1198092296.6413.5.camel@janis-laptop> References: <4737BF2C.70408@codesourcery.com> <47388599.2040701@codesourcery.com> <4749DE66.1090602@codesourcery.com> <4756B02D.9010302@google.com> <4aca3dc20712151903r46c9eceane35edb92d08240ac@mail.gmail.com> <4aca3dc20712161712w1133fb96qd66be0e9a0bb1716@mail.gmail.com> <4766B8E5.60500@google.com> <4766DF5C.1020802@google.com> <47671BF4.5050704@google.com> <1198092296.6413.5.camel@janis-laptop> Message-ID: Janis Johnson writes: > On Wed, 2007-12-19 at 10:00 -0800, Ian Lance Taylor wrote: > > One way to make a principled choice is to consider the line notes we > > are going to emit with the debugging information. Presumably we do > > not have the goal of emitting correct debug information in between > > line notes--e.g., when using the "stepi" command in gdb. Our goal is > > to emit correct debug information at the points where a debugger would > > naturally stop--the notes for where a line starts. > > Debugging in between line notes is important for core files and > when moving up and down the call stack, so at such locations the > debugger needs to at least know whether debug information is > reliable or not. For some things, sure, but we are just talking about the values in user visible variables stored in registers. There is no way we can make that information be correct between line notes. 
Ian From stevenb.gcc@gmail.com Wed Dec 19 22:19:00 2007 From: stevenb.gcc@gmail.com (Steven Bosscher) Date: Wed, 19 Dec 2007 22:19:00 -0000 Subject: Regression count, and how to keep bugs around forever In-Reply-To: <20071219153201.GH17368@sygehus.dk> References: <571f6b510712181659w64b16ae5ndc32b38de6f5c56c@mail.gmail.com> <20071219153201.GH17368@sygehus.dk> Message-ID: <571f6b510712191317y58d1131dmaa7a00d32c753c56@mail.gmail.com> On Dec 19, 2007 4:32 PM, Rask Ingemann Lambertsen wrote: > On Wed, Dec 19, 2007 at 01:59:51AM +0100, Steven Bosscher wrote: > > The current list of "All regressions" should be a list of bugs that > > people are actively trying to resolve, preferably before the release > > of GCC 4.3. > > No, it should be exactly what it says it is. I don't mind renaming it ;-) > If you want an additional > list of bugs that are being actively worked on (and labelled as such), > that's fine (although I have no idea how that list would be useful). Let's take a bug as an example case: http://gcc.gnu.org/23835 Here, there is a bug report about a huge compile time increase. The release manager decided that this was not a release blocker for GCC 4.2. So it was marked P4, and it disappeared from the radar for GCC 4.3 for everyone who only looks at the "Serious regressions". Gr. Steven From tejgcc@westnet.com.au Wed Dec 19 22:29:00 2007 From: tejgcc@westnet.com.au (Tim Josling) Date: Wed, 19 Dec 2007 22:29:00 -0000 Subject: [RFC] WHOPR - A whole program optimizer framework for GCC In-Reply-To: <4761334A.2070709@google.com> References: <47603F3C.2090808@google.com> <1197502096.6006.14.camel@tim-gcc> <65dd6fd50712122339h2c856fdbpe127417e6726bd6@mail.gmail.com> <4761334A.2070709@google.com> Message-ID: <1198102778.11411.8.camel@tim-gcc> On Thu, 2007-12-13 at 08:27 -0500, Diego Novillo wrote: > On 12/13/07 2:39 AM, Ollie Wild wrote: > > > The lto branch is already doing this, so presumably that discussion > > was resolved (Maybe someone in the know should pipe up.). 
> > Yes, streaming the IL to/from disk is a resolved issue. > ... > > Diego. I found this thread http://gcc.gnu.org/ml/gcc/2005-11/msg00735.html >> From: Mark Mitchell >> To: gcc mailing list >> Date: Wed, 16 Nov 2005 14:26:28 -0800 >> Subject: Link-time optimzation ________________________________________________________________________ >> The GCC community has talked about link-time optimization for some time. >> ... >> We would prefer not to have this thread devolve into a discussion about >> legal and "political" issues relating to reading and writing GCC's >> internal representation. I've said publicly for a couple of years that >> GCC would need to have this ability, and, more constructively, David >> Edelsohn has talked with the FSF (both RMS and Eben Moglen) about it. >> The FSF has indicated that GCC now can explore adding this feature, >> although there are still some legal details to resolve. >> ... >> http://gcc.gnu.org/projects/lto/lto.pdf >> ... Was there any more about this? I have restarted work on my COBOL front end. Based on my previous experiences writing a GCC front end I want to have as little code as possible in the same process as the GCC back end. This means passing over a file. So I would like to understand how to avoid getting into political/legal trouble when doing this. Thanks, Tim Josling From dnovillo@google.com Wed Dec 19 22:29:00 2007 From: dnovillo@google.com (Diego Novillo) Date: Wed, 19 Dec 2007 22:29:00 -0000 Subject: [RFC] WHOPR - A whole program optimizer framework for GCC In-Reply-To: <1198102778.11411.8.camel@tim-gcc> References: <47603F3C.2090808@google.com> <1197502096.6006.14.camel@tim-gcc> <65dd6fd50712122339h2c856fdbpe127417e6726bd6@mail.gmail.com> <4761334A.2070709@google.com> <1198102778.11411.8.camel@tim-gcc> Message-ID: On Dec 19, 2007 5:19 PM, Tim Josling wrote: > This means passing over a file. So I would like to understand how to > avoid getting into political/legal trouble when doing this. 
Passing over a file in what format? If you are writing a COBOL to C translator, that will certainly be fine. If you are emitting GENERIC or GIMPLE, you are much better off implementing your FE like any other GCC FE. Fortran is a good example of an FE that is totally independent from the rest of GCC. It hands out a GENERIC representation built out of its internal data structures. Otherwise, you will need to translate to GIMPLE, stream it out from the COBOL FE and feed it into the LTO FE. Diego. From clattner@apple.com Wed Dec 19 22:34:00 2007 From: clattner@apple.com (Chris Lattner) Date: Wed, 19 Dec 2007 22:34:00 -0000 Subject: [RFC] WHOPR - A whole program optimizer framework for GCC In-Reply-To: <1198102778.11411.8.camel@tim-gcc> References: <47603F3C.2090808@google.com> <1197502096.6006.14.camel@tim-gcc> <65dd6fd50712122339h2c856fdbpe127417e6726bd6@mail.gmail.com> <4761334A.2070709@google.com> <1198102778.11411.8.camel@tim-gcc> Message-ID: On Dec 19, 2007, at 2:19 PM, Tim Josling wrote: > >>> ... >>> http://gcc.gnu.org/projects/lto/lto.pdf >>> ... > > Was there any more about this? > > I have restarted work on my COBOL front end. Based on my previous > experiences writing a GCC front end I want to have as little code as > possible in the same process as the GCC back end. > > This means passing over a file. So I would like to understand how to > avoid getting into political/legal trouble when doing this. While it is possible to make this work once LTO is finished, it seems unlikely that it will be pleasant. Doing so will basically mean reimplementing a 'writer' for the LTO format to interoperate with the GCC code. This seems to be a hard task, as there is no document on the structure and contents of the LTO file. OTOH, you can get this by reverse engineering the code to find out what it does. Further, it has been publicly stated that the format will evolve and is not going to be stable (though I don't recall where). 
Unless you want to fight to keep up with the format, this sounds like a major pain. You might be interested in checking out LLVM: http://llvm.org/ which has well defined and well specified file formats (one text and one binary), and preserves backwards compatibility with them across releases within a major version (i.e. 1.0 -> 1.9 and 2.0 -> 2.x). http://llvm.org/docs/LangRef.html I'd be interested to hear if keeping the LTO format stable is something the GCC community plans to do, -Chris From dnovillo@google.com Wed Dec 19 22:49:00 2007 From: dnovillo@google.com (Diego Novillo) Date: Wed, 19 Dec 2007 22:49:00 -0000 Subject: [RFC] WHOPR - A whole program optimizer framework for GCC In-Reply-To: References: <47603F3C.2090808@google.com> <1197502096.6006.14.camel@tim-gcc> <65dd6fd50712122339h2c856fdbpe127417e6726bd6@mail.gmail.com> <4761334A.2070709@google.com> <1198102778.11411.8.camel@tim-gcc> Message-ID: On Dec 19, 2007 5:29 PM, Chris Lattner wrote: > I'd be interested to hear if keeping the LTO format stable is > something the GCC community plans to do, I doubt it. We may end up doing it for practical reasons within a release, but I'm not sure if it's high on anyone's priority list. Diego. From gccadmin@gcc.gnu.org Wed Dec 19 22:59:00 2007 From: gccadmin@gcc.gnu.org (gccadmin@gcc.gnu.org) Date: Wed, 19 Dec 2007 22:59:00 -0000 Subject: gcc-4.2-20071219 is now available Message-ID: <20071219224932.21826.qmail@sourceware.org> Snapshot gcc-4.2-20071219 is now available on ftp://gcc.gnu.org/pub/gcc/snapshots/4.2-20071219/ and on various mirrors, see http://gcc.gnu.org/mirrors.html for details.
This snapshot has been generated from the GCC 4.2 SVN branch with the following options: svn://gcc.gnu.org/svn/gcc/branches/gcc-4_2-branch revision 131091

You'll find:

gcc-4.2-20071219.tar.bz2            Complete GCC (includes all of below)
gcc-core-4.2-20071219.tar.bz2       C front end and core compiler
gcc-ada-4.2-20071219.tar.bz2        Ada front end and runtime
gcc-fortran-4.2-20071219.tar.bz2    Fortran front end and runtime
gcc-g++-4.2-20071219.tar.bz2        C++ front end and runtime
gcc-java-4.2-20071219.tar.bz2       Java front end and runtime
gcc-objc-4.2-20071219.tar.bz2       Objective-C front end and runtime
gcc-testsuite-4.2-20071219.tar.bz2  The GCC testsuite

Diffs from 4.2-20071212 are available in the diffs/ subdirectory. When a particular snapshot is ready for public consumption the LATEST-4.2 link is updated and a message is sent to the gcc list. Please do not use a snapshot before it has been announced that way. From iant@google.com Wed Dec 19 23:11:00 2007 From: iant@google.com (Ian Lance Taylor) Date: Wed, 19 Dec 2007 23:11:00 -0000 Subject: -Wparentheses lumps too much together In-Reply-To: <20071219200235.GA21525@oak.schemamania.org> References: <20071219200235.GA21525@oak.schemamania.org> Message-ID: jklowden@freetds.org writes: > Much as I'm a fan of the GCC and rely on -Wall, I would like to suggest > to you that -Wparentheses should be split up, and things it checks/suggests > be moved out of -Wall. If this is not the right forum or if you'd rather > see this as a bug report, I'm happy to go where I'm pointed. I have no objection to splitting -Wparentheses into separate warnings controlled by separate options. > My specific candidate for exclusion from -Wall is this one: > > if (a && b || c && d) > > which yields (as you know) advice to parenthesize the two && pairs. That particular warning happened to find dozens of real errors when I ran it over a large code base. It may be noise for you, but I know from personal experience that it is very useful.
Ian From rask@sygehus.dk Thu Dec 20 00:10:00 2007 From: rask@sygehus.dk (Rask Ingemann Lambertsen) Date: Thu, 20 Dec 2007 00:10:00 -0000 Subject: Regression count, and how to keep bugs around forever In-Reply-To: <571f6b510712191317y58d1131dmaa7a00d32c753c56@mail.gmail.com> References: <571f6b510712181659w64b16ae5ndc32b38de6f5c56c@mail.gmail.com> <20071219153201.GH17368@sygehus.dk> <571f6b510712191317y58d1131dmaa7a00d32c753c56@mail.gmail.com> Message-ID: <20071219231135.GJ17368@sygehus.dk> On Wed, Dec 19, 2007 at 10:17:00PM +0100, Steven Bosscher wrote: > On Dec 19, 2007 4:32 PM, Rask Ingemann Lambertsen wrote: > > > If you want an additional > > list of bugs that are being actively worked on (and labelled as such), > > that's fine (although I have no idea how that list would be useful). > > Let's take a bug as an example case: http://gcc.gnu.org/23835 > > Here, there is a bug report about a huge compile time increase. The > release manager decided that this was not a release blocker for GCC > 4.2. So it was marked P4, and it disappeared from the radar for GCC > 4.3 for everyone who only looks at the "Serious regressions". Right. I tend to look at the list of "All regressions" for that reason. I have also bookmarked about 20 PRs that I'm likely to work on. It does not come as a surprise to me that one size doesn't fit all. The list of "Serious regressions" just gives you a peek over the release manager's shoulders. I use it mainly as an indication of how far away regressions-only mode is.
-- Rask Ingemann Lambertsen Danish law requires addresses in e-mail to be logged and stored for a year From nightstrike@gmail.com Thu Dec 20 01:25:00 2007 From: nightstrike@gmail.com (NightStrike) Date: Thu, 20 Dec 2007 01:25:00 -0000 Subject: Regression count, and how to keep bugs around forever In-Reply-To: <571f6b510712191317y58d1131dmaa7a00d32c753c56@mail.gmail.com> References: <571f6b510712181659w64b16ae5ndc32b38de6f5c56c@mail.gmail.com> <20071219153201.GH17368@sygehus.dk> <571f6b510712191317y58d1131dmaa7a00d32c753c56@mail.gmail.com> Message-ID: On 12/19/07, Steven Bosscher wrote: > Let's take a bug as an example case: http://gcc.gnu.org/23835 > > Here, there is a bug report about a huge compile time increase. The > release manager decided that this was not a release blocker for GCC > 4.2. So it was marked P4, and it disappeared from the radar for GCC > 4.3 for everyone who only looks at the "Serious regressions". Under this system, do P4's and P5's ever get fixed? From sjackman@gmail.com Thu Dec 20 01:29:00 2007 From: sjackman@gmail.com (Shaun Jackman) Date: Thu, 20 Dec 2007 01:29:00 -0000 Subject: Disabling the heuristic inliner Message-ID: <7f45d9390712191725k74bf0fc0w77bb3eae735e6b10@mail.gmail.com> Is it possible to disable the heuristic inline function logic? I would like to select the following behaviour: * all static inline functions are always inlined * all static functions that are called once are inlined (-finline-functions-called-once) * no other functions are inlined I'm using -Os and I'm trying to find the right combination of -f switches to enable this behaviour. I tried -finline-limit=1 -finline-functions-called-once but it caused static inline functions to not be inlined. 
Thanks, Shaun From sjackman@gmail.com Thu Dec 20 03:09:00 2007 From: sjackman@gmail.com (Shaun Jackman) Date: Thu, 20 Dec 2007 03:09:00 -0000 Subject: Disabling the heuristic inliner In-Reply-To: <7f45d9390712191725k74bf0fc0w77bb3eae735e6b10@mail.gmail.com> References: <7f45d9390712191725k74bf0fc0w77bb3eae735e6b10@mail.gmail.com> Message-ID: <7f45d9390712191729w12b9d119q6c8326c806f7ee01@mail.gmail.com> Is there a switch to never inline a non-static function? Thanks, Shaun On Dec 19, 2007 6:25 PM, Shaun Jackman wrote: > Is it possible to disable the heuristic inline function logic? I would > like to select the following behaviour: > > * all static inline functions are always inlined > * all static functions that are called once are inlined > (-finline-functions-called-once) > * no other functions are inlined > > I'm using -Os and I'm trying to find the right combination of -f > switches to enable this behaviour. I tried > -finline-limit=1 -finline-functions-called-once > but it caused static inline functions to not be inlined. > > Thanks, > Shaun From dnovillo@google.com Thu Dec 20 03:48:00 2007 From: dnovillo@google.com (Diego Novillo) Date: Thu, 20 Dec 2007 03:48:00 -0000 Subject: [RFC] WHOPR - A whole program optimizer framework for GCC In-Reply-To: <476965CC.5050301@naturalbridge.com> References: <47603F3C.2090808@google.com> <20071218132914.GB12527@atrey.karlin.mff.cuni.cz> <476946B8.9030409@google.com> <476965CC.5050301@naturalbridge.com> Message-ID: On Dec 19, 2007 1:41 PM, Kenneth Zadeck wrote: > I am hoping that in the next couple of days, Nathan and I will be able > to say that we have completed to work that Codesourcery/NaturalBridge > contracted to do with IBM. Completion means that we are able to compile > and run the C language spec 2000 benchmarks in LTO mode, as well as > compile all of the gcc compiler itself (this does not include the runtime). Sounds good. Do you folks have some criteria for merging into mainline? 
I ran the C testsuite today by forcing every single test to use -flto, there were about <8,000 testsuite failures (FAIL + UNRESOLVED), which is about a 16% failure rate. We (Google) plan to keep working on those failures and getting the C++ front end in shape. We (GCC) should probably figure out a set of criteria to consider merging the branch into mainline. Should we shoot for being able to bootstrap with -flto enabled? I would at least be able to pass all the testsuites with -flto enabled. > There are still many open issues that we are hoping that the community > would address Thanks. I've added some of these items to the implementation plan on the wiki page. The rest were already there, please take a look and add/modify to the list. > I personally was planning to start restructuring the ipa passes and > serializing the cgraph. Great. Those are items under the WPA phase. If you have a list of things to be done besides the ones that are already there, could you add them? The more specific we are in this list, the easier it will be for folks to pick up stuff to do. > I personally think that the most pressing problems are > > 1) making lto/whopr work in the presence of modules that do not fit > perfectly together, because of type or function argument mismatches. Agreed. > that is available in non C languages. Toon's paper at last year's > summit is a good example of exactly how badly we do, and the problem is > likely to only get worse with LTO/whopr as the lang hooks go away. Are you talking about aliasing or things like high-level array operations? > While the last section of the whopr pays some lip service to this Well, no. That only addresses some of the aliasing problems. Representing high-level concepts like array/vector arithmetic or class hierarchies is not something we have done well in GIMPLE. In terms of whole program optimization, we will be interested in addressing class hierarchy optimizations. 
> a community have never really addressed the issues of how we could > expand/change our internal representation to accomodate the high level > features supported by the non c frontends. We have for concurrency with the extensions to support OpenMP which are useful in contexts like auto-parallelism. But in general, we don't transfer some things like array syntax or class hierarchies very well. Now, adding high-level concepts to an IL is usually expensive in several ways. Beyond arrays and class hierarchies, do you see any other high-level concept worth transferring into GIMPLE? I wouldn't want to represent very many high-level concepts in GIMPLE. > The wiki does not indicate that there is any semantic difference between gimple trees > and gimple tuples Right, there isn't. The work on tuples is orthogonal and can go in/out at any time. It's just mechanically big, as it changes data structures used by most of the compiler. All this work can proceed in parallel. > Both of these are very hard problems and they are likely to require the > same level of commitment that will be required to make Whopr work. It > is not that i think that making lto/whopr work in a distributed > environment is not an important problem, it is just that i think that we > need to make LTO produce good code on real programs first. Oh, absolutely. The design simply allows the first (LGEN) and last stage (LTRANS) to operate in a distributed environment. The initial implementation can simply assume a shared file system. Distribution can be added later. The only important parameter is to avoid implementation decisions that will prevent processing massively large applications. Thanks. Diego. 
From zadeck@naturalbridge.com Thu Dec 20 04:59:00 2007 From: zadeck@naturalbridge.com (Kenneth Zadeck) Date: Thu, 20 Dec 2007 04:59:00 -0000 Subject: [RFC] WHOPR - A whole program optimizer framework for GCC In-Reply-To: References: <47603F3C.2090808@google.com> <20071218132914.GB12527@atrey.karlin.mff.cuni.cz> <476946B8.9030409@google.com> <476965CC.5050301@naturalbridge.com> Message-ID: <4769E5F8.50905@naturalbridge.com> Diego Novillo wrote: > On Dec 19, 2007 1:41 PM, Kenneth Zadeck wrote: > > >> I am hoping that in the next couple of days, Nathan and I will be able >> to say that we have completed to work that Codesourcery/NaturalBridge >> contracted to do with IBM. Completion means that we are able to compile >> and run the C language spec 2000 benchmarks in LTO mode, as well as >> compile all of the gcc compiler itself (this does not include the runtime). >> > > Sounds good. Do you folks have some criteria for merging into > mainline? I ran the C testsuite today by forcing every single test to > use -flto, there were about <8,000 testsuite failures (FAIL + > UNRESOLVED), which is about a 16% failure rate. > I never tried such a test. My listing of things to do was based on the execute tests. It is hard to say what that 16% really means without going thur case by case. I did this for the execute tests. That was what I based my list on. It would not surprise me if there are other issues, but it could also be the case that the listed failures are just over expressed in the test suite. The "last" bug that nathan and I are working on is that local statics are not done correctly. The hope is that will be fixed tomorrow. An idea that has been kicked around is that when lto is good enough to replace the old --combine, then we should remove --combine and replace it with lto. I have not really thought thru the details of this, but given that --combine is (i believe) a c only thing, having lto be c only is not that big a deal. 
Certainly no extra regressions in the c testsuite is required. > We (Google) plan to keep working on those failures and getting the C++ > front end in shape. We (GCC) should probably figure out a set of > criteria to consider merging the branch into mainline. Should we > shoot for being able to bootstrap with -flto enabled? I would at > least be able to pass all the testsuites with -flto enabled. > > My guess is that you are not going to get C++ working until all of the lang hooks are properly resolved. Some of the ways that some of these langhooks were resolved in the lto branch was to assume c. >> There are still many open issues that we are hoping that the community >> would address >> > > Thanks. I've added some of these items to the implementation plan on > the wiki page. The rest were already there, please take a look and > add/modify to the list. > > >> I personally was planning to start restructuring the ipa passes and >> serializing the cgraph. >> > > Great. Those are items under the WPA phase. If you have a list of > things to be done besides the ones that are already there, could you > add them? The more specific we are in this list, the easier it will > be for folks to pick up stuff to do. > > sure >> I personally think that the most pressing problems are >> >> 1) making lto/whopr work in the presence of modules that do not fit >> perfectly together, because of type or function argument mismatches. >> > > Agreed. > > >> that is available in non C languages. Toon's paper at last year's >> summit is a good example of exactly how badly we do, and the problem is >> likely to only get worse with LTO/whopr as the lang hooks go away. >> > > Are you talking about aliasing or things like high-level array operations? > > Arrays and type heirarchy games are certainly the two that come to mind. Strings are also a possibility. Many languages do magical things with strings that go way beyond what one can do with arrays. 
>> While the last section of the whopr pays some lip service to this >> > > Well, no. That only addresses some of the aliasing problems. > Representing high-level concepts like array/vector arithmetic or class > hierarchies is not something we have done well in GIMPLE. In terms of > whole program optimization, we will be interested in addressing class > hierarchy optimizations. > > >> a community have never really addressed the issues of how we could >> expand/change our internal representation to accomodate the high level >> features supported by the non c frontends. >> > > We have for concurrency with the extensions to support OpenMP which > are useful in contexts like auto-parallelism. But in general, we > don't transfer some things like array syntax or class hierarchies very > well. > > Now, adding high-level concepts to an IL is usually expensive in > several ways. Beyond arrays and class hierarchies, do you see any > other high-level concept worth transferring into GIMPLE? I wouldn't > want to represent very many high-level concepts in GIMPLE. > > >> The wiki does not indicate that there is any semantic difference between gimple trees >> and gimple tuples >> > > Right, there isn't. The work on tuples is orthogonal and can go > in/out at any time. It's just mechanically big, as it changes data > structures used by most of the compiler. All this work can proceed in > parallel. > > >> Both of these are very hard problems and they are likely to require the >> same level of commitment that will be required to make Whopr work. It >> is not that i think that making lto/whopr work in a distributed >> environment is not an important problem, it is just that i think that we >> need to make LTO produce good code on real programs first. >> > > Oh, absolutely. The design simply allows the first (LGEN) and last > stage (LTRANS) to operate in a distributed environment. The initial > implementation can simply assume a shared file system. Distribution > can be added later. 
The only important parameter is to avoid > implementation decisions that will prevent processing massively large > applications. > > > Thanks. Diego. > kenny From ddaney@avtrex.com Thu Dec 20 05:05:00 2007 From: ddaney@avtrex.com (David Daney) Date: Thu, 20 Dec 2007 05:05:00 -0000 Subject: Regression count, and how to keep bugs around forever In-Reply-To: References: <571f6b510712181659w64b16ae5ndc32b38de6f5c56c@mail.gmail.com> <20071219153201.GH17368@sygehus.dk> <571f6b510712191317y58d1131dmaa7a00d32c753c56@mail.gmail.com> Message-ID: <4769F67F.8010609@avtrex.com> NightStrike wrote: > On 12/19/07, Steven Bosscher wrote: > >> Let's take a bug as an example case: http://gcc.gnu.org/23835 >> >> Here, there is a bug report about a huge compile time increase. The >> release manager decided that this was not a release blocker for GCC >> 4.2. So it was marked P4, and it disappeared from the radar for GCC >> 4.3 for everyone who only looks at the "Serious regressions". >> > > Under this system, do P4's and P5's ever get fixed? > Under the existing system *no* bugs get fixed unless someone wants to fix them. 
And to answer your question: http://gcc.gnu.org/bugzilla/buglist.cgi?query_format=advanced&short_desc_type=allwordssubstr&short_desc=&known_to_fail_type=allwordssubstr&known_to_work_type=allwordssubstr&long_desc_type=substring&long_desc=&bug_file_loc_type=allwordssubstr&bug_file_loc=&gcchost_type=allwordssubstr&gcchost=&gcctarget_type=allwordssubstr&gcctarget=&gccbuild_type=allwordssubstr&gccbuild=&keywords_type=allwords&keywords=&bug_status=RESOLVED&bug_status=VERIFIED&bug_status=CLOSED&priority=P4&priority=P5 From aoliva@redhat.com Thu Dec 20 05:16:00 2007 From: aoliva@redhat.com (Alexandre Oliva) Date: Thu, 20 Dec 2007 05:16:00 -0000 Subject: Designs for better debug info in GCC In-Reply-To: <4aca3dc20712191240r472ee46ekacf32886a1abd63@mail.gmail.com> (Daniel Berlin's message of "Wed\, 19 Dec 2007 15\:40\:11 -0500") References: <47671BF4.5050704@google.com> <4aca3dc20712181415y3d5c3717s6d73b1335b311313@mail.gmail.com> <4aca3dc20712182207y648f7bbhab9e0af8ad2ff832@mail.gmail.com> <4aca3dc20712190759g748d6e15pa0e5146c3f5ca0ba@mail.gmail.com> <4aca3dc20712191240r472ee46ekacf32886a1abd63@mail.gmail.com> Message-ID: On Dec 19, 2007, "Daniel Berlin" wrote: >> Now, if z_5 were present in a debug expression, then it would need >> adjusting. No different from the adjusting need for any other >> instruction in which z_5 was present, though. > uh, but if you don't adjust in the fixed examples, DEBUG(x, x_4) will > give an invalid value. My point was that optimizers already had to know how to adjust things such that it doesn't break code. Now, in this optimization, it takes additional liberties with existing variables because it sees they're only used within the sequence. IMHO, it would be more appropriate to introduce alternate temporaries, rather than reusing SSA names for different purposes, in this case. 
If this approach was taken, the debug annotations referring to a no-longer-defined SSA name would be recognized as invalid, and the variable binding would be removed (i.e., turned into a "value unknown" annotation). Or, if we left the definitions in place, even though they're dead, the same code that cleans up undefined SSA names could recognize these SSA names as unused except in debug information and substitute them for their values, maintaining accurate and complete debug information. But can we do better without introducing more SSA names and keeping assignments around that are known to be dead? Yes, with some additional effort, see below. > How do i know i need to change this DEBUG expression. As reassoc looks for sets of variables it can freely mess with, it should take note of variables that are used in debug annotations in addition to the kind of single (?) non-debug uses it's interested in, such that, when it modifies these variables, the annotations can be compensated for. OTOH, if the compiler performs reassoc on user variables today, it means we do get mangled debug information for such variables already, and they get incorrect values. So, even if we didn't address this problem right away, it wouldn't be much of a regression. But, of course, not dealing with it breaks the goal of having correct debug information, so it ought to be dealt with properly. Do you happen to have a yummy testcase handy that I could use to trigger this kind of transformation in ways that affect the value of user variables? 
Thanks in advance, -- Alexandre Oliva http://www.lsd.ic.unicamp.br/~oliva/ FSF Latin America Board Member http://www.fsfla.org/ Red Hat Compiler Engineer aoliva@{redhat.com, gcc.gnu.org} Free Software Evangelist oliva@{lsd.ic.unicamp.br, gnu.org} From aoliva@redhat.com Thu Dec 20 05:51:00 2007 From: aoliva@redhat.com (Alexandre Oliva) Date: Thu, 20 Dec 2007 05:51:00 -0000 Subject: Strange error message from gdb In-Reply-To: <18281.27225.249982.220171@zebedee.pink> (Andrew Haley's message of "Wed\, 19 Dec 2007 19\:00\:41 +0000") References: <18281.21294.238761.442229@zebedee.pink> <20071219172943.GA5939@caradoc.them.org> <18281.23234.151870.816362@zebedee.pink> <20071219185517.GA10986@caradoc.them.org> <18281.27225.249982.220171@zebedee.pink> Message-ID: On Dec 19, 2007, Andrew Haley wrote: > Right, so read_type_die() doesn't know how to handle > DW_TAG_interface_type. The weird thing is that I have never seen this > error message before today, and AFAIAA gcj has been generating these > interface types for a long while. For very small values of "long while" :-) This was added by: 2007-12-15 Alexandre Oliva PR debug/7081 * lang.c (java_classify_record): New. (LANG_HOOKS_CLASSIFY_RECORD): Override. Sorry, I didn't check whether GDB or other debug information consumers supported this tag. I just ASSumed they did, given how long it has been specified (today Dwarf 3 turns 2 :-) and how noisy a failure would be should one run into such a tag without supporting it. What now: revert until GDB et al are fixed, or leave it in, for it's the right thing to do, and it serves as an additional incentive for debug information consumers to support new Dwarf 3 features? Or introduce -gdwarf-3 and get -gdwarf-2 (and -g?) to avoid using Dwarf 3 features that are not backward compatible?
-- Alexandre Oliva http://www.lsd.ic.unicamp.br/~oliva/ FSF Latin America Board Member http://www.fsfla.org/ Red Hat Compiler Engineer aoliva@{redhat.com, gcc.gnu.org} Free Software Evangelist oliva@{lsd.ic.unicamp.br, gnu.org} From jklowden@freetds.org Thu Dec 20 06:08:00 2007 From: jklowden@freetds.org (James K. Lowden) Date: Thu, 20 Dec 2007 06:08:00 -0000 Subject: -Wparentheses lumps too much together In-Reply-To: References: <20071219200235.GA21525@oak.schemamania.org> Message-ID: <20071220005030.4971a442.jklowden@freetds.org> Ian Lance Taylor wrote: > I have no objection to splitting -Wparentheses into separate warnings > controlled by separate options. Thank you, Ian. > > which yields (as you know) advice to parenthesize the two && pairs. > > That particular warning happened to find dozens of real errors when I > ran it over a large code base. It may be noise for you, but I know > from personal experience that it is very useful. I would like to hear more about that, if you wouldn't mind. I'm really quite surprised. Honestly. I don't claim to be the last arbiter in good taste. It sounds like you're saying that this warning, when applied to code of uncertain quality, turned up errors -- cases when the code didn't reflect (what must have been) the programmer's intentions. My untested (and consequently firmly held) hypothesis is that 1) most combinations of && and || don't need parentheses because (a && b) || (c && d) is by far more common than a && (b || c) && d and, moreover, broken code fails at runtime, and 2) Most programmers know (because they need to know) that && comes before ||. I'm sure a few years spent working with the GCC and fielding questions about it would lower my opinion of the average programmer, so I won't try to convince you. But I would like to know more about what you found, because that's at least objective evidence. I was unable to find any metrics supporting the inclusion of this particular warning in -Wall. 
I would hold to this, though: the warnings about precedence are of a different order than warnings about nesting. I suggest that well vetted code doesn't benefit from the kind of false positives that -Wparentheses can generate. I very much appreciate your time and effort. Kind regards, --jkl From aoliva@redhat.com Thu Dec 20 06:10:00 2007 From: aoliva@redhat.com (Alexandre Oliva) Date: Thu, 20 Dec 2007 06:10:00 -0000 Subject: Designs for better debug info in GCC In-Reply-To: (Ian Lance Taylor's message of "19 Dec 2007 13\:10\:49 -0800") References: <4749DE66.1090602@codesourcery.com> <4756B02D.9010302@google.com> <4aca3dc20712151903r46c9eceane35edb92d08240ac@mail.gmail.com> <4aca3dc20712161712w1133fb96qd66be0e9a0bb1716@mail.gmail.com> <4766B8E5.60500@google.com> <4766DF5C.1020802@google.com> <47671BF4.5050704@google.com> <1198092296.6413.5.camel@janis-laptop> Message-ID: On Dec 19, 2007, Ian Lance Taylor wrote: > For some things, sure, but we are just talking about the values in > user visible variables stored in registers. There is no way we can > make that information be correct between line notes. Err... I think there is, and one way to do it is with the design I've proposed. Do you have anything to back up your implied assertion that the design can't accomplish this?
-- Alexandre Oliva http://www.lsd.ic.unicamp.br/~oliva/ FSF Latin America Board Member http://www.fsfla.org/ Red Hat Compiler Engineer aoliva@{redhat.com, gcc.gnu.org} Free Software Evangelist oliva@{lsd.ic.unicamp.br, gnu.org} From aoliva@redhat.com Thu Dec 20 08:00:00 2007 From: aoliva@redhat.com (Alexandre Oliva) Date: Thu, 20 Dec 2007 08:00:00 -0000 Subject: Designs for better debug info in GCC In-Reply-To: (Ian Lance Taylor's message of "19 Dec 2007 10\:00\:38 -0800") References: <47388599.2040701@codesourcery.com> <4749DE66.1090602@codesourcery.com> <4756B02D.9010302@google.com> <4aca3dc20712151903r46c9eceane35edb92d08240ac@mail.gmail.com> <4aca3dc20712161712w1133fb96qd66be0e9a0bb1716@mail.gmail.com> <4766B8E5.60500@google.com> <4766DF5C.1020802@google.com> <47671BF4.5050704@google.com> Message-ID: On Dec 19, 2007, Ian Lance Taylor wrote: > Alexandre Oliva writes: >> You snipped (skipped?) one aspect of the reasoning on why it is >> appropriate. Of course this doesn't prove it's the best possibility, >> but I haven't seen evidence of why it isn't. > You will find it easier to demonstrate the worth of your proposal if > you act publically as though your interlocutors are people of good > will, even when it doesn't seem that way to you, and omit > interjections like "(skipped?)". Sorry, I didn't mean it in a demeaning tone. I realize I should have been more careful, given the heat of the debate, for which I apologize. It just so happens that I'm just used to having texts I write skimmed through rather than read in detail, so, when someone makes a point that appears to disregard something that I write about, I tend to assume that the person missed the portion in which I discussed it. That was what the 'skipped?' was about. I know I tend to pack too much information in small spaces when I write (and I'm not proud of it, mind you :-), so having readers miss points I did try to address is unfortunately quite common. 
Again, I apologize for not realizing this could be interpreted in a different way than the one I meant. It was indeed inappropriate. > To be sure we are on the same page, I think your argument here is that > with this code: > int f(int x, int y) { > int i = 0, j = 0; > probe1(); > i = x; > j = y; > probe2(); > if (x < y) > i += y; > else > j -= x; > probe3(); > return g (i ,j); > } > if I set a breakpoint just before the call to probe2(), and I print > the values of 'i' and 'j', I should get the values of 'x' and 'y'. > That is, you want to emit a DWARF variable note at that point that the > value of 'i' can be found in the location corresponding to 'x'. Yep. That would be correct and complete. It would also be acceptable, but undesirable, to emit information to the effect that the locations of 'i' and 'j' are unknown at those points; for this would be correct, even if incomplete. > Of course there are no actual instructions between the calls to > probe1() and probe2(). If I use gdb's "finish" command out of > probe1(), what values should I see for 'i' and 'j' at that point? > Arguably I am now before the assignment statements, and should see '0' > and '0', the values that 'i' and 'j' have before they are changed. Of > course, this is the same location as the breakpoint before probe2(), > and we can't see both '0'/'0' and 'x'/'y'. So it seems to me that > this situation is actually somewhat ambiguous. I don't see an > obviously correct answer. Dan has dealt with this point, but, if it floats your boat, you can disregard any hope of getting it right between probe1() and probe2(), since there aren't instructions in between them, and focus on getting it right at probe2() or while probe2() is active in a lower stack frame. > I think the general issue you are describing is how to handle an > assignment which appears in user code but which has been eliminated > during optimization. Yes, this is a way to describe it. 
I'm addressing this in a bit more detail in a revised version of the spec, that I intend to publish in the GCC wiki RSN. > It seems to me that such eliminated assignments are inherently > ambiguous. If the assignment is gone, then there is a point in the > generated code where the variable logically has both the old and the > new values. I assume that the debugger can only display one value. > Which one should it be? I don't think this characterization is correct. There are points that are logically before the removed assignment, and there are points that are logically after it. If we actually emitted a nop for the removed assignment, then we could single-step through it and observe the change in the logical variable even though no observable change occurred in the program state (other than the advance of the PC past this nop). Except that, in the implementation plan I have in mind, the observable change would quite often be from "unknown value" to "assigned value", because the location holding the previous value will likely have already been overwritten when we reach the debug insn. > Consider a series of assignments to a local variable, and suppose > that all the assignments are deleted because they are unused. Are > there dependencies between the DEBUG notes which keep them in the > right order? There ought to be, for sure, such that the last one prevails. > Presumably we do not have the goal of emitting correct debug > information in between line notes I do. Stack traces, for one, are seldom taken at line note boundaries, for stack frames other than the top active one. If we didn't have correct debug information at those points, monitors wouldn't be able to do a correct job. Going from that to backtraces that cross signal handling frames makes it only slightly more complex, from a theoretical standpoint.
I.e., I don't see that solving the problem such that it addresses the apparently-simpler requirement would take significantly less implementation effort than solving the apparently-more-complex requirement. > I wonder whether it would be feasible for the debug info generation to > work from the assignments in the source code as generated by the > frontend. For each assignment, we would find the corresponding line > note. Then we would look at the right hand side, and try to identify > where that value could be found at that point in the program. This > would be a variant of our current variable tracking pass. I haven't > thought about this enough to know whether it would really work. I've been giving something along these lines some thought, but it's a bit more elaborate, and I'm not ready to present even a draft of my thoughts on this topic. And I unfortunately may have to discuss it with lawyers before I can do anything concrete about it. > That will only work correctly if sched-deps.c introduces dependencies > between debug insns and real insns. Yep, it does, have a look at the vta branch. In fact, sched is the pass that has given me the most headaches to get bootstrap-debug to pass. > If you introduce those dependencies, I don't understand how you will > avoid changing the schedulers behaviour in the presence of debug > insns. How did you work around that problem? Debug insns don't use any actual machine resources, and they sort of always fit, so the scheduler can accept them as soon as they become ready, without changing any other internal state. I haven't introduced explicit deps among debug insns, because I get the impression that they're implied by the original instruction order and the fact that, if two debug insns become simultaneously ready, there's nothing that would reorder them (sorting is stable). That said, I'm pretty sure I still have some scheduling issues to sort out. 
Trying to get bootstrap-debug to pass on ppc64 and ia64 has exposed a number of scheduling issues, but IIRC almost all of them were in the machine-specific scheduling code, that needed adjusting to tolerate debug insns without internal state changes. But I may still be missing additional tweaks to the machine-independent scheduling code. > Personally, I would like to see that testsuite first. That will give > us an operational definition to aim for, rather than a theoretical > discussion which I find to be ambiguous. The two examples at the end of the design document are sort of meant as a starting point for the testsuite. As we discuss further interesting examples, I'll probably add them, if not to the document, to some collection of interesting debug info testcases. I'm not ready to spend time figuring out the precise incantations to automate these tests yet, but contributions along these lines would obviously be welcome. As for myself, I need to complete the design of the GVN-like algorithm to turn RTL debug insns into var tracking notes, that's currently underspecified. Once that's done, we'll be able to start testing things more seriously, and polishing the heuristics that are going to be needed to decide between lvalue location or rvalue for variables, partitioning lvalues that happen to be in the same value equivalence classes into different user variables, this sort of stuff. I think this will take some experimentation to get a reasonable idea of what is right, or at least reasonable. > And it will avoid the problem of turning the testsuite into a > regression testsuite rather than an accuracy testsuite. Sorry, I don't understand what you mean here. 
-- Alexandre Oliva http://www.lsd.ic.unicamp.br/~oliva/ FSF Latin America Board Member http://www.fsfla.org/ Red Hat Compiler Engineer aoliva@{redhat.com, gcc.gnu.org} Free Software Evangelist oliva@{lsd.ic.unicamp.br, gnu.org} From aoliva@redhat.com Thu Dec 20 08:01:00 2007 From: aoliva@redhat.com (Alexandre Oliva) Date: Thu, 20 Dec 2007 08:01:00 -0000 Subject: Designs for better debug info in GCC In-Reply-To: (Alexandre Oliva's message of "Thu\, 20 Dec 2007 04\:05\:18 -0200") References: <47388599.2040701@codesourcery.com> <4749DE66.1090602@codesourcery.com> <4756B02D.9010302@google.com> <4aca3dc20712151903r46c9eceane35edb92d08240ac@mail.gmail.com> <4aca3dc20712161712w1133fb96qd66be0e9a0bb1716@mail.gmail.com> <4766B8E5.60500@google.com> <4766DF5C.1020802@google.com> <47671BF4.5050704@google.com> Message-ID: On Dec 20, 2007, Alexandre Oliva wrote: > I'm addressing this in a bit more detail in a revised version of the > spec, that I intend to publish in the GCC wiki RSN. http://gcc.gnu.org/wiki/Var_Tracking_Assignments Here's a diff between the version I posted a couple of days ago and the one before I started adjusting formatting for better rendering in the wiki. Index: debug-var-loc.txt =================================================================== RCS file: /home/aoliva/.cvs/txt/free-software/gcc/debug-var-loc.txt,v retrieving revision 1.2 retrieving revision 1.4 diff -u -p -d -u -p -r1.2 -r1.4 --- debug-var-loc.txt 18 Dec 2007 08:03:42 -0000 1.2 +++ debug-var-loc.txt 20 Dec 2007 07:32:56 -0000 1.4 @@ -34,54 +34,66 @@ optimization passes discard information emit correct and complete variable location lists. Coalescing, scalarizing, substituting, propagating, and many other -transformations prevent the late-running variable tracker from doing -an accurate job. By the time it runs, many variables no longer show -up in the retained annotations, although they're still conceptually -available. 
+transformations prevent the late-running variable tracker from doing a +complete or even accurate job. By the time it runs, many variables no +longer show up in the retained annotations, although they're still +conceptually available. -The variable tracker can't tell when a user variable overlaps with -another, and it can't tell when a variable is overwritten, if the -assignment is optimized away. These limitations are inherent to a -model based on inspecting actual code and trying to make inferences -from that. In order to be able to represent not only what remained in -the code, but also what was optimized, combined or otherwise -apparently-removed, additional information needs to be kept around. +The variable tracker can't handle sharing of a location by multiple +user variables, multiple active locations for the same variable, and +it can't tell when a variable is overwritten, if the assignment is +optimized away. This last limitations is inherent to a model based on +inspecting only actual code, and trying to make inferences from that. +In order to be able to represent not only what remained in the code, +but also what was optimized, combined or otherwise apparently-removed, +additional information needs to be kept around. This paper describes an approach to maintain this information. == Goals -* Ensure that, for every user variable for which we emit debug -information, the information is correct, i.e., if it says the value of -a variable at a certain instruction is at certain locations, or is a -known constant, then the variable must not be at any other location at -that point, and the locations or values must match reasonable -expectations based on source code inspection. +=== Correctness -* Defining "reasonable expectations" is tricky, for code reordering -typical of optimization can make room for numerous surprises. 
I don't -have a precise definition for this yet, but very clearly to me saying -that a variable holds a value that it couldn't possibly hold (e.g., -because it is only assigned that value in a code path that is -knowingly not taken) is a very clear indication that something is -amiss. The general guiding rule is, if we aren't sure the information -is correct (or we're sure it isn't), we shouldn't pretend that it is. +Ensure that, for every user variable for which we emit debug +information, the information is correct, i.e., if it provides +locations or value expressions for a variable in a certain range of +instructions, then, for all instructions in that range, the values +specified in the debug information must match the value the user +variable is bound to. -* Try to ensure that, if the value of a variable is a known constant +We say a variable is bound to a value when control flow crosses a +theoretical instruction placed at the point of the program in which +the user variable is, or should be have been, assigned that value. +This theoretical instruction is maintained roughly in place regardless +of optimizations that move, remove or otherwise optimize any code +generated to implement the source-level variable modification. More +details below, in the "scheduling and reordering" section. + + +=== Completeness + +Try to ensure that, if the value of a variable is a known constant at +a certain point in the program, this information is present in debug +information. + +Try to ensure that, if the value of a variable is available at any +location, or computable from values available at any other locations at a certain point in the program, this information is present in debug information. -* Try to ensure that, if the value of a variable is available or -computable at any location at a certain point in the program, this -information is present in debug information. -* Stop missing optimizations for the sake of preserving debug -information. 
+=== Run-time efficienty -* Avoid using additional memory and CPU cycles that would be needed -only for debug information when compiling without generating debug -information +Stop missing optimizations for the sake of preserving variable +location debug information. + + +=== Compile-time efficienty + +Avoid using additional memory and CPU cycles that would be needed only +to generate debug information when compiling without generating debug +information. == Internal Representation @@ -118,9 +130,9 @@ most optimization passes, be handled jus Once this is established, a possible representation becomes almost obvious: statements (in trees) or instructions (in rtl) that assert, to the variable tracker, that a user variable or member is represented -by a given expression: +by a given expression, or that bind a user variable to a value: - # DEBUG var expr + # DEBUG var => expr By var, we mean a tree expression that denotes a user variable, for now. We envision trivially extending it to support components of @@ -128,103 +140,204 @@ variables in the future. By expr, we mean a tree or rtl expression that computes the value of the variable at the point in which the statement or instruction -appears in the program. A special value needs to be specified for -each representation that denotes a location or value that cannot be -determined or represented in debug information, for example, the -location of a variable that was completely optimized away. It might -be useful to represent the expression as a list of expressions, and to -distinguish lvalues from rvalues, but for now let's keep this simple. +appears in the program, and that the variable is expected to hold +until (i) execution crosses another such annotation for that variable, +or (ii) the value becomes no longer computable, because all locations +containing it or usable to compute it are no longer provably usable to +compute it. 
For example, if the variable is bound to the value of a +certain hardware register, and the register is subsequently modified, +but the bound value is not known to be available elsewhere, then the +variable is regarded as unavailable at that point. + +A special value needs to be specified for each debug annotation +representation that denotes an unavailable variable. Although in some +cases this condition can be detected implicitly, as described above, +in others we must be able to describe that, at the point of the +binding, the value that should be bound to the variable is not +available, for example, because it was completely optimized away and +it's not even computable any more, or because the compiler has been +unable to represent or to keep track of the expected value of the +variable at that point. + +Also, it might be useful to represent the expression as a list of +expressions, to establish larger equivalence classes to begin with and +to get better resistance against complete loss of values. It may also +be useful distinguish lvalues from rvalues in the representation, but +for now we're keeping it simpler, to see if we can make do without the +additional complexity. == Generating debug information Generating initial annotations when entering SSA is early enough in the translation that the program will still reflect very reliably the -original source code. Annotations are only generated for user -variables that are GIMPLE registers, i.e., variables that represent -scalar values and that never have their address taken. Other kinds of -variables don't have varying locations, so we don't need to worry -about them. +original source code. We will only emit such annotations for user +variables that are GIMPLE registers, i.e., variables that present in +the source code, that are not addressable and that hold scalar values. 
+Addressable or non-scalar user variables don't have varying locations, +so we don't need these annotations to generate correct debug +information for them. -After every assignment to such a variable, we emit a DEBUG statement -that will preserve, throughout compilation, the information that, at -that point, the assigned variable was represented by that expression. -So, after turning an assignment such as the following into SSA form, -we emit the debug statement below right after it: +As optimizations transform the code, the initially-trivial mapping +between such user variables and implementation locations gets more and +more fuzzy. Even when the compiler retains mnemonic names that +resemble user variable names for such implementation locations (GIMPLE +registers, RTL pseudos, hardware registers and stack slots), it is +important to keep in mind that source- and implementation concepts are +in different name spaces, and that the implementation locations cannot +be assumed to remain associated with the user variables they were +initially named after. + +The purpose of the annotations is precisely to establish a mapping +from user variables to implementation concepts without preventing +optimizations. The choice of focusing not so much on locations, but +rather on values, is intended to minimize the impact of optimizations +on the ability to represent the value a variable holds, which is what +debug information consumers are most often interested in. Actual +locations are a slightly secondary issue, that we expect to be able to +infer from the value binding annotations, but that may require more +explicit annotations, as mentioned above. + +After every assignment to user variables that are GIMPLE registers, we +emit a DEBUG statement that will preserve, throughout compilation, the +information that, at that point, the user variable was bound to the +value of that expression. 
So, after putting an assignment such as the +following in SSA form, we emit the debug statement below right after +it: x_1 = whatever; - # DEBUG x x_1 + # DEBUG x => x_1 -Likewise, at control flow merge points, for each PHI node we introduce -in the SSA representation, we emit an annotation: +Likewise, at control flow merge points, for each PHI node associated +with a user variable we introduce in the SSA representation, we emit +an annotation: # x_4 = PHI ; - # DEBUG x x_4 + # DEBUG x => x_4 Then, we let tree optimizers do their jobs. Whenever they rename, renumber, coalesce, combine or otherwise optimize a variable, they -will automatically update debug statements that mention them as well. +will most likely automatically update debug statements that mention +them as well. In the rare cases in which the presence of such a statement might prevent an optimization, we need to adjust the optimizer code such that the optimization is not prevented. This most often amounts to -skipping or otherwise ignoring debug statements. In a few very rare -cases, special code might be needed to adjust debug statements -manually. +skipping or otherwise ignoring debug statements. In a few rare cases, +additional code might be needed specifically to adjust debug +statements. -After transformation to RTL, the representation needs translation, but -conceptually it's still the same: a mapping from variable to -expression. Again, optimizers will most often adjust debug -instructions automatically. +During conversion to RTL, the debug statements also decay to debug +instructions, and the tree value expressions are trivially converted +to RTL. Conceptually, however, it's still the same representation: a +binding from user variable to expression. RTL optimizers will most +often adjust debug instructions automatically. 
-The exceptions can be handled at no cost: the test for whether an -element of the instruction stream is an instruction or some kind of -note, that never needs updating, is a range test, in its optimized +The exceptions can be handled often at no cost: the test for whether +an element of the instruction stream is an instruction or some kind of +note (that never needs updating) is a range test, in its optimized form. By placing the identifier for a debug instruction at one of the -limits of this range, testing for both ranges requires identical code, -except for the constants. +limits of this range, testing for ranges that include or exclude debug +instructions requires identical code, except for the constants. Since most code that tests for INSN_P and handles instructions can and should match debug instructions as well, in order to keep them up to date, we extend INSN_P so as to match debug instructions, and modify -the exceptions, that need to skip debug instructions, by using an -alternate test, with the same meaning as the original definition of -INSN_P. These simple and non-intrusive changes are relatively common, -but still, by far, the exception rather than the rule. +the code in the exceptions, that need to skip debug instructions, by +using an alternate test, with the same meaning as the original +definition of INSN_P. These simple and non-intrusive changes are +relatively common, but still, by far, the exception rather than the +rule. As in tree level, there are transformations that require +special handling of debug annotations, but these are even rarer. When optimizations are completed, including register allocation and -scheduling, it is time to pick up the debug instructions and emit -debug information out of them. Conceptually, the debug instructions -represent points of assignment, at which a user variable ought to -evaluate to the annotated expression, maintained throughout -compilation. 
However, when the value of a variable is live at more -than one location, it is important to note it, such that, if a -debugging session attempts to modify the variable, all copies are -modified. +scheduling, it is time to take the data collected in debug +instructions and emit debug information out of them. Conceptually, +the debug instructions represent points of assignment, at which a user +variable ought to evaluate to the annotated expression, maintained +throughout compilation. However, when the value of a user variable is +available at more than one location (think, for example, stack +variable temporarily held also in a register), it is important to note +it, such that, if a debugging session attempts to modify the variable, +all copies are modified. The idea is to use some mechanism to determine equivalent expressions throughout a function (say some variant of Global Value Numbering). At debug instructions, we assert that the value of the named variable -is in the equivalence class represented by the expression. As we scan +is in the equivalence class the expression belongs to. As we scan basic blocks forward and find that expressions in an equivalence class are modified, we remove them from the equivalence class, and thus from -the list of available locations for the variable. When such -expressions are further copied, we add them to equivalence classes. -At function calls and volatile asm statements, we remove -non-function-private memory slots from equivalence classes. At -function calls, we also remove call-clobbered registers from -equivalence classes. When no live expression remains in the -equivalence class that represents a variable, it is understood that -its value is no longer available. At basic block confluences, we -combine information from the end states of the incoming blocks and the -debug statements added as a side effect of PHI nodes. +the list of available locations for the variables that hold that +value. 
When members of an equivalence class are copied, we add the +copies to equivalence class. At function calls and volatile asm +statements, we remove non-function-private memory slots from +equivalence classes. At function calls, we also remove call-clobbered +registers from all equivalence classes. When no live expression +remains in the equivalence class that represents a variable, it is +understood that its value is no longer available. At basic block +confluences, we combine information from the end states of the +incoming blocks and the block-entry debug statements that had been +added after PHI nodes earlier. -The end result is accurate debug information. Also, except for -transformations that require special handling to update debug -annotations properly, debug information should come out as complete as +When multiple variables are held in the same equivalence class, some +care must be taken to determine which locations can be used as +modifiable copies of a variable and which hold incidental copies. +More investigation is needed to design strategies to make this +partitioning, such that the end result is accurate debug information. + +Also, except for transformations that require special handling to +update debug annotations properly but that haven't been improved +accordingly, debug information should come out as complete as possible. +== Scheduling and reordering + +Optimizing code involves a lot of moving code around. Basic block +reordering, loop unrolling, and other forms of code duplication, +movement or removal that affect placement of sequences of +instructions, but not so much the instructions to be executed in a +given execution path, have no effect on the debug information +annotations presented in this article. When moving, duplicating or +removing code along these lines, debug annotations can be regarded +just like regular instructions. 
+ +Other than that, debug annotations should generally remain in place, +serving as guides for what would amount to the natural execution order +of the program, regardless of optimizations that reorder instructions, +move instructions out of loops or conditionals. + +For example, if we move to an unconditional block a computation that +was only to be performed inside a conditional, the debug annotation +that binds the variable to the conditionally-computed value should +remain in the conditional block, unless it is completely eliminated. +Likewise, if some computation is hoisted out of a loop, the debug +annotation should remain in the loop, where the user expects the +assignment to take place. + +Moving a computation to an earlier point shouldn't require +modification in subsequent debug annotations, but moving it to a later +point may, especially when the move crosses the annotation. For +example, if an assignment instruction, say x = y, is moved past the +end of a loop, debug annotations that refer to x in their expressions +probably need to have it replaced with y, such that the binding +remains with the same value in spite of the assignment move. + +Transformations that reorder instructions within a single block, such +as instruction scheduling, don't require modification of annotations. +Debug annotations should be maintained after the assignments they +refer to, if the assignments are still nearby, and this is trivially +accomplished through scheduling dependencies. Other than that, debug +annotations should generally have high scheduling priority, such that +they are kept right after the corresponding assignment, or moved early +when an assignment was hoisted out of a loop. That said, reordering +debug annotations may be undesirable and surprising at times. 
Also, +care must be taken to not schedule too early bindings for values that +are completely optimized away: because these have no dependencies, +they might be moved too early, to the point of making the range of the +previous binding an empty range. + + == Testability Since debug annotations are added early, and, in most cases, @@ -240,9 +353,9 @@ maintaining debug annotations throughout them away at the end. This is undesirable, for it would slow down compilation without debug information and waste memory while at that. -Therefore, we've built testing mechanisms into the compiler to detect -cases in which the presence of debug annotations would cause code -changes. +Therefore, we've built testing mechanisms into the compiler build +machinery to detect cases in which the presence of debug annotations +would cause code changes. The bootstrap-debug Makefile target, by default, compiles the second bootstrap stage without debug information, and the third bootstrap @@ -285,11 +398,13 @@ or whether the value is available or com missing, is a harder problem, but it's not part of the accuracy test, but rather of the completeness test. -The completeness score for an unoptimized program might very often be +A completeness score for an unoptimized program might very often be unachievable for optimized programs, not because the compiler is doing a poor job at maintaining debug information, but rather because the -compiler is doing a good job at optimizing it, to the point that it is -no longer possible to determine the value of the inspected variable. +compiler is doing a good job at optimizing it, to the point that no +possibility remains of computing the value of certain variables at +certain points in the program. This should be taken into account when +designing completeness tests. == Concerns @@ -303,14 +418,16 @@ bit. In order to generate correct debug information, more information needs to be retained throughout compilation.
The only way to arrange for debug information to not require any additional memory is to waste -memory when not generating debug information. But this is -undesirable. +memory when not generating debug information. But this is probably +undesirable, even if it would minimize the risks of debug annotations +affecting optimizations and modifying the generated code. Therefore, the better debug information we want, the more memory overhead we're going to have to tolerate. Of course at times we can trade memory for efficiency, using more -computationally expensive representations that are more compact. +computationally expensive representations that are more compact, when +we can't have both compactness and efficiency. At other times, we may trade memory for maintainability. For example, instead of emitting annotations as soon as we enter SSA mode, we could @@ -319,29 +436,31 @@ modified an SSA assignment for which we annotation. Additional memory would be needed to mark assignments that should have gained annotations but haven't, and care must be taken to make sure that transformations aren't made without leaving a -correct debug statement in place. It is not clear that this would -save significant memory, for a large fraction of relevant assignments -are modified or moved anyway, so it might very well be a -maintainability loss and a performance penalty for no measurable -memory gains. +correct (even if still implied) debug annotation in place. It is not +clear that this would save significant memory, for a large fraction of +relevant assignments are probably modified or moved anyway, so it +might turn out to be a maintainability and performance loss for small +memory gains. More investigation is required to determine whether +this is indeed the case. 
-Worst case, we may trade memory for debug information quality: if -memory use of this scheme is too high for some scenario, one can -disable debug information annotations through a command line option, -or disable debug information altogether. +Worst case, a user may trade memory for debug information quality: if +the memory use of this scheme turns out to be too high for some +scenario, the user can disable debug information annotations through a +command line option, or disable debug information altogether. === Intrusiveness Given that nearly all compiler transformations would require reflection in debug information, any solution that doesn't take -advantage of this fact is bound to require changes all over the place. +advantage of this fact is bound to require changes all over the +compiler. Perhaps not so much for Tree-SSA passes, that are relatively well-behaved and use a narrow API to make transformations, but very clearly so for RTL passes, that very often modify instructions in place, and at times even reuse locations assigned to user variables as -temporaries. +temporaries (the same is true of tree-ssa-reassoc, FWIW). Even when we do use the strength of optimizers to maintain debug information up to date, there are exceptions in which detailed @@ -378,7 +497,8 @@ below. Worrying about the representation of debug annotations as statements or instructions, rather than notes, is missing the fact that, most of the time, we do want them to be updated just like statements and -instructions. +instructions, rather than handled like notes, that never need +updating. Worrying about the representation of debug annotations in-line, rather than an on-the-side representation, is a valid concern, but it's @@ -400,19 +520,21 @@ generates actually matches the executabl complete as viable. 
The goal is not to disable optimizations so as to preserve variables -or code, such that it can be represented in debug information and +or code, such that they could be represented in debug information and provide for a debugging experience more like that of code that is not -optimized. - -If debug information disables any optimization, that's a bug that -needs fixing. +optimized. If debug information disables any optimization, that's a +bug that needs fixing. Preventing optimizations that lower the +quality of debug information is a separate feature, and one that will +benefit from this work, but that won't be accomplished through this +work. -Now, while testing this design, a number of opportunities for -optimization that GCC missed were detected and fixed, others were -merely detected, and at least one optimization shortcoming kept in -place in order to get better debug information could be removed, for -the new debug information infrastructure enables the optimization to -be applied in its fullest extent. +It is worth mentioning that, while testing the implementation of this +design, a number of opportunities for optimization that GCC missed +were detected and fixed, others were merely detected so far, and at +least one optimization shortcoming kept in place in order to get +better debug information could be removed, for the new debug +information infrastructure enables the optimization to be applied in +its fullest extent. == Examples @@ -449,7 +571,9 @@ print the correct values for i if we kee In this case, before the call to h, not only the assignment to i was dead, but also the value of the incoming argument x had already been clobbered. If i had been assigned to another constant instead, debug -information could easily represent this. +information could easily represent this, through an extension to DWARF +version 3 that enables location lists to contain value expressions, in +addition to location expressions.
Another example that covers PHI nodes and conditionals: @@ -491,7 +615,8 @@ x2 (int x, int y, int z) Note how, without debug annotations, c is only initialized just before the call to whatever4. At all other points, the value of c would be -unavailable to the debugger, possibly even wrong. +unavailable to the debugger, possibly even wrong, if prior assignments +to c had survived optimization. If we were to annotate the SSA definitions forward-propagated into c versions as applying to c, we'd end up with all of x_2, y_3 and z_0 @@ -506,23 +631,23 @@ x2 (int x, int y, int z) int c; # bb 1 c_4 = z_0(D); - # DEBUG c c_4 + # DEBUG c => c_4 whatever0(c_4); c_5 = x_2(D); - # DEBUG c c_5 + # DEBUG c => c_5 whatever1(); if (some_condition) { # bb 2 whatever2(); c_6 = y_3(D); - # DEBUG c c_6 + # DEBUG c => c_6 whatever3(); } # bb 3 # c_1 = PHI - # DEBUG c c_1 + # DEBUG c => c_1 whatever4(c_1); } @@ -533,20 +658,20 @@ x2 (int x, int y, int z) { int c; # bb 1 - # DEBUG c z_0(D) + # DEBUG c => z_0(D) whatever0(z_0(D)); - # DEBUG c x_2(D) + # DEBUG c => x_2(D) whatever1(); if (some_condition) { # bb 2 whatever2(); - # DEBUG y_3(D) + # DEBUG c => y_3(D) whatever3(); } # bb 3 # c_1 = PHI ; - # DEBUG c c_1 + # DEBUG c => c_1 whatever4(c_1); } -- Alexandre Oliva http://www.lsd.ic.unicamp.br/~oliva/ FSF Latin America Board Member http://www.fsfla.org/ Red Hat Compiler Engineer aoliva@{redhat.com, gcc.gnu.org} Free Software Evangelist oliva@{lsd.ic.unicamp.br, gnu.org} From aoliva@redhat.com Thu Dec 20 08:04:00 2007 From: aoliva@redhat.com (Alexandre Oliva) Date: Thu, 20 Dec 2007 08:04:00 -0000 Subject: DEBUG_INSN that is not an insn In-Reply-To: <571f6b510711050129u349da3c8tbfa5ce9050a7e13b@mail.gmail.com> (Steven Bosscher's message of "Mon\, 5 Nov 2007 10\:29\:22 +0100") References: <571f6b510711050129u349da3c8tbfa5ce9050a7e13b@mail.gmail.com> Message-ID: On Nov 5, 2007, "Steven Bosscher" wrote: > Alex, maybe you could add a Wiki page about this project Done, at last! 
:-) http://gcc.gnu.org/wiki/Var_Tracking_Assignments -- Alexandre Oliva http://www.lsd.ic.unicamp.br/~oliva/ FSF Latin America Board Member http://www.fsfla.org/ Red Hat Compiler Engineer aoliva@{redhat.com, gcc.gnu.org} Free Software Evangelist oliva@{lsd.ic.unicamp.br, gnu.org} From aoliva@redhat.com Thu Dec 20 08:18:00 2007 From: aoliva@redhat.com (Alexandre Oliva) Date: Thu, 20 Dec 2007 08:18:00 -0000 Subject: [www] get VTA branch entry to refer to design document Message-ID: I'm checking this in... --- svn.html 16 Nov 2007 11:11:59 -0200 1.69 +++ svn.html 20 Dec 2007 06:02:39 -0200 @@ -156,7 +156,8 @@ list therefore provides only some repres
-
var-tracking-assignments-branch
+
+ var-tracking-assignments-branch
This branch aims at improving debug information by annotating assignments early in the compilation, and carrying over such annotations throughout optimization passes and representations. -- Alexandre Oliva http://www.lsd.ic.unicamp.br/~oliva/ FSF Latin America Board Member http://www.fsfla.org/ Red Hat Compiler Engineer aoliva@{redhat.com, gcc.gnu.org} Free Software Evangelist oliva@{lsd.ic.unicamp.br, gnu.org} From stevenb.gcc@gmail.com Thu Dec 20 08:25:00 2007 From: stevenb.gcc@gmail.com (Steven Bosscher) Date: Thu, 20 Dec 2007 08:25:00 -0000 Subject: DEBUG_INSN that is not an insn In-Reply-To: References: <571f6b510711050129u349da3c8tbfa5ce9050a7e13b@mail.gmail.com> Message-ID: <571f6b510712200017h14384d21x555bc01e46e8156a@mail.gmail.com> On Dec 20, 2007 9:00 AM, Alexandre Oliva wrote: > On Nov 5, 2007, "Steven Bosscher" wrote: > > > Alex, maybe you could add a Wiki page about this project > > Done, at last! :-) > > http://gcc.gnu.org/wiki/Var_Tracking_Assignments Very nice. Thanks! Gr. Steven From jh@suse.cz Thu Dec 20 08:53:00 2007 From: jh@suse.cz (Jan Hubicka) Date: Thu, 20 Dec 2007 08:53:00 -0000 Subject: [RFC] WHOPR - A whole program optimizer framework for GCC In-Reply-To: <476946B8.9030409@google.com> References: <47603F3C.2090808@google.com> <20071218132914.GB12527@atrey.karlin.mff.cuni.cz> <476946B8.9030409@google.com> Message-ID: <20071220082508.GA25454@kam.mff.cuni.cz> Hi, > > Jan, wrt the optimization plan coming out of the analysis phase, and the > various pieces of header/summary information, what do you think are the > major pieces we need? The cgraph used to be organized in separate analysis, propagation and modification stages, and the passes implemented fall quite naturally into those pieces. Naturally this is a limiting element, but not that bad, and for some fancier passes I am sure we can arrange for interesting things to be compiled at once or something like that.
As discussed with Kenny in the past, there are ordering issues - ie one pass might invalidate info of other so there is a lot of interference that makes it more challenging to order the IPA pass queue than for all-in-memory IPA consisting of couple of completely independent passes. We will need better memory management for Gimple, have effective way to load function body into memory and actually cheaply release it when it is no longer needed that is quite challenging with GGC. I would hope that in longer term we can move tupleized gimple into pools. I also have sort of implementation plan TODO, I never got around putting it into wiki well, but I will try to check how much ours differ and update it. One area that is quite nasty WRT LTO is debug info. Honza > > In terms of branch mechanics, I'm initially tempted to do this > implementation on a branch separate from tuples and lto. This will > allow us to merge both lto and tuples separately, as the rest of the > optimizer is still a long ways away. What do folks think? > > > Thanks. Diego. From joey.ye@intel.com Thu Dec 20 09:11:00 2007 From: joey.ye@intel.com (Ye, Joey) Date: Thu, 20 Dec 2007 09:11:00 -0000 Subject: A proposal to align GCC stack In-Reply-To: <20071219031617.02C7073CC4@caffeine.csclub.uwaterloo.ca> References: <20071219031617.02C7073CC4@caffeine.csclub.uwaterloo.ca> Message-ID: Ye, Joey writes: >> This proposal values correctness at first place. So when compile can't >> make sure a function is only called from functions with the same or bigger >> preferred-stack-boundary, it will conservatively align the stack. One >> optimization is to set INCOMING = PREFERRED for local functions. Do you >> think it more acceptable? Ross Ridge wrote: > Not really. It might reduce the amount of unnecessary stack adjustment, > but the performance regression would remain. Changing the behaviour of > -fpreferred-stack-boundary doesn't make it more correct. 
It supposed > to change the ABI, it works as documented and, yes, if it's misused it > will cause problems. So will any number of GCC's ABI changing options. > Look at it another way. Lets say you were compiling x86_64 code with > -fpreferred-stack-boundary=3, an 8-byte PREFERRED alignment. As you > know, this is different from the standard x86_64 ABI which requires a > 16-byte alignment. Now with your proposal, GCC's behaviour of won't > change, because it's safe to assume that incoming stack is at least > 8-byte aligned. There should be no change in the code GCC generates, > with or without your proposal. However, the outgoing stack won't be > 16-byte aligned as the x86_64 ABI requires. In this case, what also > doesn't change is the fact that mixing code compiled with different > -fpreferred-stack-boundary values doesn't work. It's just as problematic > and unsafe as it was before. > So when you said "this proposal values correctness at first place", > that really isn't true. The proposal only addresses safety when > preferred alignment is raised from the standard ABI's alignment. You're > conservatively aligning the incoming stack, but not the outgoing stack. > You don't seem to be concerned about the problems that can arise when > the preferred is raised above the ABI's. Why? My guess is that because > "correctness" in this case would cause unacceptable regressions when > compiling the x86_64 Linux kernel. You are right. My proposal doesn't guarantee 100% correctness. In case of PREFERRED < ABI, we hope the author knows what will happen. > If you can understand why it would be unacceptable to change how > -fpreferred-stack-boundary behaves when compiling the Linux kernel, > then maybe you can understand why I don't find it acceptable for it to > change when compiling my code. I think I understand now. My updated version proposal sets INCOMING == PREFERRED, and -fpreferred-stack-boundary works the same as before. 
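To make the alignment arithmetic in this exchange concrete, here is a minimal C sketch. It is not GCC code: the helper names boundary_bytes, is_aligned and align_down are made up for illustration, and it assumes, as stated above, that -fpreferred-stack-boundary=N means a 2^N-byte boundary and that the stack grows downward, so realigning means rounding the stack pointer down.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: bytes implied by -fpreferred-stack-boundary=N.  */
static uint64_t
boundary_bytes (unsigned n)
{
  return (uint64_t) 1 << n;
}

/* Hypothetical helper: nonzero if SP sits on a BYTES boundary.
   BYTES must be a power of two.  */
static int
is_aligned (uint64_t sp, uint64_t bytes)
{
  assert (bytes != 0 && (bytes & (bytes - 1)) == 0);
  return (sp & (bytes - 1)) == 0;
}

/* Hypothetical helper: round SP down to a BYTES boundary, which is
   what a prologue would do on a downward-growing stack.  */
static uint64_t
align_down (uint64_t sp, uint64_t bytes)
{
  assert (bytes != 0 && (bytes & (bytes - 1)) == 0);
  return sp & ~(bytes - 1);
}
```

Taking an arbitrary stack pointer such as 0x7fffffffe4b8, is_aligned (sp, boundary_bytes (3)) holds while is_aligned (sp, boundary_bytes (4)) does not: a stack kept only 8-byte aligned is not necessarily 16-byte aligned as the x86_64 ABI requires, and align_down (sp, 16) would move it to 0x7fffffffe4b0.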
Thanks - Joey From joey.ye@intel.com Thu Dec 20 09:32:00 2007 From: joey.ye@intel.com (Ye, Joey) Date: Thu, 20 Dec 2007 09:32:00 -0000 Subject: A proposal to align GCC stack In-Reply-To: References: <20071219031617.02C7073CC4@caffeine.csclub.uwaterloo.ca> Message-ID: Andrew, My proposal is not supposed to be limited to i386/x86_64. Would you please spend some time reviewing it to see if it can really solve the problem on PowerPC? Your comments are welcome. Thanks - Joey -----Original Message----- From: gcc-owner@gcc.gnu.org [mailto:gcc-owner@gcc.gnu.org] On Behalf Of Andrew Pinski Sent: 2007-12-19 18:07 To: Ross Ridge Cc: gcc@gcc.gnu.org Subject: Re: A proposal to align GCC stack On 12/18/07, Ross Ridge wrote: > Look at it another way. Lets say you were compiling x86_64 code with > -fpreferred-stack-boundary=3, an 8-byte PREFERRED alignment. Can we stop talking about x86/x86_64-specific issues here? I have a use case for the PowerPC side of the Cell BE for variables that need more than the normal stack boundary alignment of 16 bytes. They need to be 128-byte aligned for DMA transfers to the SPUs. I already proposed a patch [1] to fix this use case but I have not seen many replies yet. Thanks, Andrew Pinski [1] http://gcc.gnu.org/ml/gcc-patches/2007-05/msg01167.html From aph@redhat.com Thu Dec 20 14:24:00 2007 From: aph@redhat.com (Andrew Haley) Date: Thu, 20 Dec 2007 14:24:00 -0000 Subject: Strange error message from gdb In-Reply-To: References: <18281.21294.238761.442229@zebedee.pink> <20071219172943.GA5939@caradoc.them.org> <18281.23234.151870.816362@zebedee.pink> <20071219185517.GA10986@caradoc.them.org> <18281.27225.249982.220171@zebedee.pink> Message-ID: <18282.13956.454306.676164@zebedee.pink> Alexandre Oliva writes: > On Dec 19, 2007, Andrew Haley wrote: > > > Right, so read_type_die() doesn't know how to handle > > DW_TAG_interface_type.
The weird thing is that I have never seen > > this error mesage before today, and AFAIAA gcj has been > > generating these interface types for a long while. > > For very small values of "long while" :-) > > This was added by: > > 2007-12-15 Alexandre Oliva > > PR debug/7081 > * lang.c (java_classify_record): New. > (LANG_HOOKS_CLASSIFY_RECORD): Override. Yeah, I discovered this today. Because your patch hadn't been flagged as affecting Java and no Java maintainer approved it, I hadn't noticed. > Sorry, I didn't check whether GDB or other debug information > consumers supported this tag. I just ASSumed they did, given how > long they've been specified (today Dwarf 3 turns 2 :-) and how > noisy a failure would be should one run into such a tag without > supporting it. Well, that was a bad thing to do. > What now, revert until GDB et al are fixed, or leave it in, for > it's the right thing to do, and it serves as an additional > incentive for debug information consumers to support new Dwarf 3 > features? Please revert it, right now. It is impossible for anyone to debug gcj code at the moment. Once gdb support is in and widely distributed, then we can change gcc. Realistically, at least a year or two. Andrew. -- Red Hat UK Ltd, Amberley Place, 107-111 Peascod Street, Windsor, Berkshire, SL4 1TE, UK Registered in England and Wales No. 
3798903 From aoliva@redhat.com Thu Dec 20 14:33:00 2007 From: aoliva@redhat.com (Alexandre Oliva) Date: Thu, 20 Dec 2007 14:33:00 -0000 Subject: Strange error message from gdb In-Reply-To: <18282.13956.454306.676164@zebedee.pink> (Andrew Haley's message of "Thu\, 20 Dec 2007 09\:31\:48 +0000") References: <18281.21294.238761.442229@zebedee.pink> <20071219172943.GA5939@caradoc.them.org> <18281.23234.151870.816362@zebedee.pink> <20071219185517.GA10986@caradoc.them.org> <18281.27225.249982.220171@zebedee.pink> <18282.13956.454306.676164@zebedee.pink> Message-ID: On Dec 20, 2007, Andrew Haley wrote: > Alexandre Oliva writes: >> PR debug/7081 >> * lang.c (java_classify_record): New. >> (LANG_HOOKS_CLASSIFY_RECORD): Override. > Yeah, I discovered this today. Because your patch hadn't been flagged > as affecting Java and no Java maintainer approved it, I hadn't > noticed. Sorry. >> What now, revert until GDB et al are fixed, or leave it in, for >> it's the right thing to do, and it serves as an additional >> incentive for debug information consumers to support new Dwarf 3 >> features? > Please revert it, right now. How about this patch, instead? It will restore debuggability to Java while at the same time maintaining the progress of using the long-supported-by-GDB DW_TAG_class_type in both C++ and Java. -------------- next part -------------- A non-text attachment was scrubbed... 
Name: gcc-dwarf-record-types-java-revert.patch Type: text/x-patch Size: 735 bytes Desc: not available URL: -------------- next part -------------- -- Alexandre Oliva http://www.lsd.ic.unicamp.br/~oliva/ FSF Latin America Board Member http://www.fsfla.org/ Red Hat Compiler Engineer aoliva@{redhat.com, gcc.gnu.org} Free Software Evangelist oliva@{lsd.ic.unicamp.br, gnu.org} From aph@redhat.com Thu Dec 20 15:09:00 2007 From: aph@redhat.com (Andrew Haley) Date: Thu, 20 Dec 2007 15:09:00 -0000 Subject: Strange error message from gdb In-Reply-To: References: <18281.21294.238761.442229@zebedee.pink> <20071219172943.GA5939@caradoc.them.org> <18281.23234.151870.816362@zebedee.pink> <20071219185517.GA10986@caradoc.them.org> <18281.27225.249982.220171@zebedee.pink> <18282.13956.454306.676164@zebedee.pink> Message-ID: <18282.32042.419094.175944@zebedee.pink> Alexandre Oliva writes: > > How about this patch, instead? It will restore debuggability to Java > while at the same time maintaining the progress of using the > long-supported-by-GDB DW_TAG_class_type in both C++ and Java. > > for gcc/java/ChangeLog > from Alexandre Oliva > > * lang.c (java_classify_record): Don't return > RECORD_IS_INTERFACE for now. > OK, thanks. Andrew. -- Red Hat UK Ltd, Amberley Place, 107-111 Peascod Street, Windsor, Berkshire, SL4 1TE, UK Registered in England and Wales No. 
3798903 From paul@codesourcery.com Thu Dec 20 16:41:00 2007 From: paul@codesourcery.com (Paul Brook) Date: Thu, 20 Dec 2007 16:41:00 -0000 Subject: -Wparentheses lumps too much together In-Reply-To: <20071220005030.4971a442.jklowden@freetds.org> References: <20071219200235.GA21525@oak.schemamania.org> <20071220005030.4971a442.jklowden@freetds.org> Message-ID: <200712201509.20582.paul@codesourcery.com> > My untested (and consequently firmly > held) hypothesis is that > > 1) most combinations of && and || don't need parentheses because > > (a && b) || (c && d) > > is by far more common than > > a && (b || c) && d > > and, moreover, broken code fails at runtime, and I dispute these claims. The former may be statistically more common, but I'd be surprised if the difference is that big. I can think of several fairly common situations where both would be used. Any time you've got any sort of nontrivial condition, I always find it better to include the explicit parentheses. Especially if a, b, c, and d are relatively complex relational expressions rather than simple variables. Code failing at runtime is way too late. By that time it's already been burned onto the device, and may be half way to the moon :-) Coverage testing never tests everything, and there's a fair chance that your complex condition will only break in an exceptional case which is, by definition, hard to predict, test and reproduce. > 2) Most programmers know (because they need to know) that && comes > before ||. I don't really believe that either. Most *good* programmers know operator precedence rules (or will at least look it up). However there are a lot of distinctly average programmers, and even the good programmers get confused or have bad days. As someone else mentioned precedence of arithmetic operators is taught in school from a fairly early age. Precedence of logical operators is (to me at least) much less well conditioned. I have no objection to splitting -Wparentheses into finer grained options. 
I just think they should remain enabled by -Wall. Paul From iant@google.com Thu Dec 20 16:44:00 2007 From: iant@google.com (Ian Lance Taylor) Date: Thu, 20 Dec 2007 16:44:00 -0000 Subject: Designs for better debug info in GCC In-Reply-To: References: <47671BF4.5050704@google.com> <4aca3dc20712181415y3d5c3717s6d73b1335b311313@mail.gmail.com> <4aca3dc20712182207y648f7bbhab9e0af8ad2ff832@mail.gmail.com> <4aca3dc20712190759g748d6e15pa0e5146c3f5ca0ba@mail.gmail.com> <4aca3dc20712191240r472ee46ekacf32886a1abd63@mail.gmail.com> Message-ID: Alexandre Oliva writes: > > How do i know i need to change this DEBUG expression. > > As reassoc looks for sets of variables it can freely mess with, it > should take note of variables that are used in debug annotations in > addition to the kind of single (?) non-debug uses it's interested in, > such that, when it modifies these variables, the annotations can be > compensated for. The question is how it finds them efficiently, without doing a scan of all instructions. Ian From iant@google.com Thu Dec 20 16:52:00 2007 From: iant@google.com (Ian Lance Taylor) Date: Thu, 20 Dec 2007 16:52:00 -0000 Subject: Designs for better debug info in GCC In-Reply-To: References: <4749DE66.1090602@codesourcery.com> <4756B02D.9010302@google.com> <4aca3dc20712151903r46c9eceane35edb92d08240ac@mail.gmail.com> <4aca3dc20712161712w1133fb96qd66be0e9a0bb1716@mail.gmail.com> <4766B8E5.60500@google.com> <4766DF5C.1020802@google.com> <47671BF4.5050704@google.com> <1! 198092296.6413.5.camel@janis-laptop> Message-ID: Alexandre Oliva writes: > On Dec 19, 2007, Ian Lance Taylor wrote: > > > For some things, sure, but we are just talking about the values in > > user visible variables stored in registers. There is no way we can > > make that information be correct between line notes. > > Err... I think there is, and one way to do it is with the design I've > proposed. Do you have anything to back up your implied assertion that > the design can't accomplish this? 
It is technically feasible but problematic for other reasons. i = i * m + ((i / j) + k) / n; On a two register machine like the x86 i will change several times during that calculation. You could issue debug notes making it correct at every machine instruction. But that would balloon the amount of debug info that we generate, for near-zero gain in real usability of the debugger. We already generate huge amounts of debug info--a typical C++ executable has more debug info than text and data combined. Increasing the amount of debug info significantly, for little gain, is contraindicated. Ian From iant@google.com Thu Dec 20 17:02:00 2007 From: iant@google.com (Ian Lance Taylor) Date: Thu, 20 Dec 2007 17:02:00 -0000 Subject: Designs for better debug info in GCC In-Reply-To: References: <47388599.2040701@codesourcery.com> <4749DE66.1090602@codesourcery.com> <4756B02D.9010302@google.com> <4aca3dc20712151903r46c9eceane35edb92d08240ac@mail.gmail.com> <4aca3dc20712161712w1133fb96qd66be0e9a0bb1716@mail.gmail.com> <4766B8E5.60500@google.com> <4766DF5C.1020802@google.com> <47671BF4.5050704@google.com> Message-ID: Alexandre Oliva writes: > > And it will avoid the problem of turning the testsuite into a > > regression testsuite rather than an accuracy testsuite. > > Sorry, I don't understand what you mean here. It's not a major point. When one adds a testsuite to working code, it is natural to write tests that expect to see what the code generates. The risk is that any change to the code causes the test to fail. This is the essence of a regression testsuite. For an example, see the linker testsuite in the binutils. Practically any change to the linker, correct or not, causes some tests to fail. An accuracy testsuite is one written independently of the code. It tests for the specific features that are desired, rather than testing for what the code currently does. Of course you can write an accuracy testsuite with working code. 
It's just much easier to write a regression testsuite, and it's easy to backslide into that. Ian From iant@google.com Thu Dec 20 18:01:00 2007 From: iant@google.com (Ian Lance Taylor) Date: Thu, 20 Dec 2007 18:01:00 -0000 Subject: -Wparentheses lumps too much together In-Reply-To: <20071220005030.4971a442.jklowden@freetds.org> References: <20071219200235.GA21525@oak.schemamania.org> <20071220005030.4971a442.jklowden@freetds.org> Message-ID: "James K. Lowden" writes: > > That particular warning happened to find dozens of real errors when I > > ran it over a large code base. It may be noise for you, but I know > > from personal experience that it is very useful. > > I would like to hear more about that, if you wouldn't mind. I'm really > quite surprised. Honestly. I'm not free to share actual details, and I don't have the real percentages anyhow. The warning triggered many false positives. But there were also a number of true positives, far more than I expected. A typical true positive looked more or less like if (a && b || c) often with a more complex condition. The indentation would express the intent, so it was easy to read the code and assume that it meant what it appeared to mean. But, of course, it didn't. Ian From ddaney@avtrex.com Thu Dec 20 20:16:00 2007 From: ddaney@avtrex.com (David Daney) Date: Thu, 20 Dec 2007 20:16:00 -0000 Subject: RFC: New mangling for java resources. Message-ID: <476AADF1.7040001@avtrex.com> In: http://gcc.gnu.org/ml/gcc-patches/2007-12/msg00979.html I propose a new mangling for embedded java resource files. Quoting from that message: > The mangling is as follows: > > The resource name is broken into path components by '/' characters. Each > component then has an '_' prepended and all '.' -> "$_" and '$' -> > "$$". The length of each component is then prepended to this and all > are concatenated together and preceeded by "_ZGr". "Gr" being an unused > special-name designator that could be thought of as representing > GNU-resource. 
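As a hedged illustration of the mangling rules quoted above, the following C sketch builds the mangled symbol for a resource name. It is not the code from the patch: the function name mangle_resource, its return convention and the fixed component buffer are assumptions made for this example. It covers only the rules quoted here (split on '/', prepend '_', map '.' to "$_" and '$' to "$$", prefix each component with its length, and start with "_ZGr"); any further escaping of other characters is left out.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper, not the patch's code: mangle NAME into OUT,
   which has room for OUT_SIZE bytes.  Returns 0 on success, -1 if a
   buffer is too small.  */
static int
mangle_resource (const char *name, char *out, size_t out_size)
{
  size_t used;
  int n = snprintf (out, out_size, "_ZGr");

  if (n < 0 || (size_t) n >= out_size)
    return -1;
  used = (size_t) n;

  while (*name != '\0')
    {
      /* One path component: everything up to the next '/' or the end.  */
      size_t comp_len = strcspn (name, "/");
      char comp[256];
      size_t clen = 0;
      size_t i;

      comp[clen++] = '_';	/* Each component gets a leading '_'.  */
      for (i = 0; i < comp_len; i++)
	{
	  char c = name[i];

	  if (clen + 2 >= sizeof comp)
	    return -1;
	  if (c == '.')
	    {
	      comp[clen++] = '$';
	      comp[clen++] = '_';
	    }
	  else if (c == '$')
	    {
	      comp[clen++] = '$';
	      comp[clen++] = '$';
	    }
	  else
	    comp[clen++] = c;
	}
      comp[clen] = '\0';

      /* Emit the transformed component, preceded by its length.  */
      n = snprintf (out + used, out_size - used, "%zu%s", clen, comp);
      if (n < 0 || (size_t) n >= out_size - used)
	return -1;
      used += (size_t) n;

      name += comp_len;
      if (*name == '/')
	name++;
    }
  return 0;
}
```

Because every transformed component starts with '_', the decimal length prefix always ends at a non-digit, so a demangler can split components unambiguously even when the original name begins with digits.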
For example: > "java/util/iso4217.properties" mangles as: > "_ZGr5_java5_util20_iso4217$_properties" > > These symbols seem to pass through the demangler unaffected (GNU nm > 2.17.50.0.6-5.fc6 20061020 from FC6). > Java resource names differ from identifiers in languages like C, C++, and Java in that there is no restriction on the position of digits in the name; really, any character can appear anywhere in the name. One other thing I didn't mention in the original message is that all characters that are not ISALNUM() are encoded with a __U_XXXX sequence. I am looking for feedback from mangling gurus on this. Does this seem acceptable? Are there changes that you would recommend? I will prepare a demangler patch to accompany the java patch when the mangling scheme is deemed to be good. Thanks, David Daney From aoliva@redhat.com Thu Dec 20 20:42:00 2007 From: aoliva@redhat.com (Alexandre Oliva) Date: Thu, 20 Dec 2007 20:42:00 -0000 Subject: Designs for better debug info in GCC In-Reply-To: (Ian Lance Taylor's message of "20 Dec 2007 08\:40\:00 -0800") References: <47671BF4.5050704@google.com> <4aca3dc20712181415y3d5c3717s6d73b1335b311313@mail.gmail.com> <4aca3dc20712182207y648f7bbhab9e0af8ad2ff832@mail.gmail.com> <4aca3dc20712190759g748d6e15pa0e5146c3f5ca0ba@mail.gmail.com> <4aca3dc20712191240r472ee46ekacf32886a1abd63@mail.gmail.com> Message-ID: On Dec 20, 2007, Ian Lance Taylor wrote: > Alexandre Oliva writes: >> > How do i know i need to change this DEBUG expression. >> >> As reassoc looks for sets of variables it can freely mess with, it >> should take note of variables that are used in debug annotations in >> addition to the kind of single (?) non-debug uses it's interested in, >> such that, when it modifies these variables, the annotations can be >> compensated for. > The question is how it finds them efficiently, without doing a scan of > all instructions.

It must keep track of variables it can mess with, so it might as well
take notes about those it has to be more careful about.

*Or* it can just introduce new temporaries, rename the uses and leave
the original sets behind for "garbage collection" AKA dead code
elimination, like I said.

One is more implementation work, the other is potentially more
wasteful in terms of memory use.  Neither looks particularly hard to
me.

-- 
Alexandre Oliva         http://www.lsd.ic.unicamp.br/~oliva/
FSF Latin America Board Member         http://www.fsfla.org/
Red Hat Compiler Engineer   aoliva@{redhat.com, gcc.gnu.org}
Free Software Evangelist  oliva@{lsd.ic.unicamp.br, gnu.org}

From aoliva@redhat.com Thu Dec 20 21:38:00 2007
From: aoliva@redhat.com (Alexandre Oliva)
Date: Thu, 20 Dec 2007 21:38:00 -0000
Subject: Designs for better debug info in GCC
In-Reply-To: (Ian Lance Taylor's message of "20 Dec 2007 08\:44\:20 -0800")
References: <4756B02D.9010302@google.com> <4aca3dc20712151903r46c9eceane35edb92d08240ac@mail.gmail.com> <4aca3dc20712161712w1133fb96qd66be0e9a0bb1716@mail.gmail.com> <4766B8E5.60500@google.com> <4766DF5C.1020802@google.com> <47671BF4.5050704@google.com> <1198092296.6413.5.camel@janis-laptop>
Message-ID:

On Dec 20, 2007, Ian Lance Taylor wrote:

> It is technically feasible but problematic for other reasons.

>   i = i * m + ((i / j) + k) / n;

> On a two register machine like the x86 i will change several times
> during that calculation.

No.  The register used to hold its initial value will.  Keep in mind
the separation between user variables and implementation locations.
The user variable 'i' is only supposed to change when the assignment
operation is performed (even if only at a theoretical level), when
the final value of the RHS is available and stored in the location
then assigned to hold the value of variable 'i'.

Now, it is possible that the previous value of 'i' becomes unavailable
while the expression is evaluated.
Then, in order to represent this correctly, we just have to note that
'i' is no longer available as soon as all locations holding its
original value are clobbered, and that it's available again when its
new location holds the assigned value.

-- 
Alexandre Oliva         http://www.lsd.ic.unicamp.br/~oliva/
FSF Latin America Board Member         http://www.fsfla.org/
Red Hat Compiler Engineer   aoliva@{redhat.com, gcc.gnu.org}
Free Software Evangelist  oliva@{lsd.ic.unicamp.br, gnu.org}

From iant@google.com Fri Dec 21 01:54:00 2007
From: iant@google.com (Ian Lance Taylor)
Date: Fri, 21 Dec 2007 01:54:00 -0000
Subject: Designs for better debug info in GCC
In-Reply-To:
References: <4756B02D.9010302@google.com> <4aca3dc20712151903r46c9eceane35edb92d08240ac@mail.gmail.com> <4aca3dc20712161712w1133fb96qd66be0e9a0bb1716@mail.gmail.com> <4766B8E5.60500@google.com> <4766DF5C.1020802@google.com> <47671BF4.5050704@google.com> <1198092296.6413.5.camel@janis-laptop>
Message-ID:

Alexandre Oliva writes:

> On Dec 20, 2007, Ian Lance Taylor wrote:
>
> > It is technically feasible but problematic for other reasons.
> >   i = i * m + ((i / j) + k) / n;
> > On a two register machine like the x86 i will change several times
> > during that calculation.
>
> No.  The register used to hold its initial value will.  Keep in mind
> the separation between user variables and implementation locations.
> The user variable 'i' is only supposed to change when the assignment
> operation is performed (even if only at a theoretical level), when
> the final value of the RHS is available and stored in the location
> then assigned to hold the value of variable 'i'.

OK, fair enough.

> Now, it is possible that the previous value of 'i' becomes unavailable
> while the expression is evaluated.
> Then, in order to represent this
> correctly, we just have to note that 'i' is no longer available as
> soon as all locations holding its original value are clobbered, and
> that it's available again when its new location holds the assigned
> value.

Right, which will significantly increase debugging size as you add two
more notes around many lines.

Ian

From aoliva@redhat.com Fri Dec 21 02:11:00 2007
From: aoliva@redhat.com (Alexandre Oliva)
Date: Fri, 21 Dec 2007 02:11:00 -0000
Subject: Designs for better debug info in GCC
In-Reply-To: (Ian Lance Taylor's message of "20 Dec 2007 13\:37\:26 -0800")
References: <4aca3dc20712151903r46c9eceane35edb92d08240ac@mail.gmail.com> <4aca3dc20712161712w1133fb96qd66be0e9a0bb1716@mail.gmail.com> <4766B8E5.60500@google.com> <4766DF5C.1020802@google.com> <47671BF4.5050704@google.com> <1198092296.6413.5.camel@janis-laptop>
Message-ID:

On Dec 20, 2007, Ian Lance Taylor wrote:

> Right, which will significantly increase debugging size as you add two
> more notes around many lines.

If that's the price to avoid debug information consumers getting
incorrect values...

Would you argue for a position such as:

  we can't go on expanding C++ templates for every conceivable type
  users instantiate them, this would make applications too large.
  let's try to figure out some way to reuse template expansions, even
  if some programs break, because it's more important to keep programs
  small than to enable them to behave correctly

?

Why would code, essential for debug information consumers that are
part of larger systems to work correctly, deserve any less attention
to correctness?

-- 
Alexandre Oliva         http://www.lsd.ic.unicamp.br/~oliva/
FSF Latin America Board Member         http://www.fsfla.org/
Red Hat Compiler Engineer   aoliva@{redhat.com, gcc.gnu.org}
Free Software Evangelist  oliva@{lsd.ic.unicamp.br, gnu.org}

From dewar@adacore.com Fri Dec 21 03:16:00 2007
From: dewar@adacore.com (Robert Dewar)
Date: Fri, 21 Dec 2007 03:16:00 -0000
Subject: Designs for better debug info in GCC
In-Reply-To:
References: <4aca3dc20712161712w1133fb96qd66be0e9a0bb1716@mail.gmail.com> <4766B8E5.60500@google.com> <4766DF5C.1020802@google.com> <47671BF4.5050704@google.com> <1198092296.6413.5.camel@janis-laptop>
Message-ID: <476B20B1.6050608@adacore.com>

Alexandre Oliva wrote:
> On Dec 20, 2007, Ian Lance Taylor wrote:
>
>> Right, which will significantly increase debugging size as you add two
>> more notes around many lines.
>
> If that's the price to avoid debug information consumers getting
> incorrect values...

It may be an unacceptable price.  The cost of an executable going
from 50 megabytes to 80 megabytes can be the difference between
handling the situation being practical and impractical.

From paulur@gmail.com Fri Dec 21 04:37:00 2007
From: paulur@gmail.com (Paul Li)
Date: Fri, 21 Dec 2007 04:37:00 -0000
Subject: compiling failed.
Message-ID: <20d714960712201916g567ba67bpf62b3ec88d033569@mail.gmail.com>

I tried to compile gcc4.2.x on Ubuntu, and got the following error
message:

"make[3]: *** No rule to make target
`../../../srcdir/fixincludes/../gcc/BASE-VER', needed by `mkheaders'.
Stop.
make[3]: Leaving directory
`/home/paul/gcc/gcc-4.2.0/objdir/build-i686-pc-linux-gnu/fixincludes'
make[2]: *** [all-build-fixincludes] Error 2
make[2]: Leaving directory `/home/paul/gcc/gcc-4.2.0/objdir'
make[1]: *** [stage1-bubble] Error 2
make[1]: Leaving directory `/home/paul/gcc/gcc-4.2.0/objdir'
make: *** [all] Error 2"

(srcdir is the directory where the source code is.)

Thanks,
Paul

From iant@google.com Fri Dec 21 05:10:00 2007
From: iant@google.com (Ian Lance Taylor)
Date: Fri, 21 Dec 2007 05:10:00 -0000
Subject: Designs for better debug info in GCC
In-Reply-To:
References: <4aca3dc20712151903r46c9eceane35edb92d08240ac@mail.gmail.com> <4aca3dc20712161712w1133fb96qd66be0e9a0bb1716@mail.gmail.com> <4766B8E5.60500@google.com> <4766DF5C.1020802@google.com> <47671BF4.5050704@google.com> <1198092296.6413.5.camel@janis-laptop>
Message-ID:

Alexandre Oliva writes:

> On Dec 20, 2007, Ian Lance Taylor wrote:
>
> > Right, which will significantly increase debugging size as you add two
> > more notes around many lines.
>
> If that's the price to avoid debug information consumers getting
> incorrect values...
>
> Would you argue for a position such as:
>
>   we can't go on expanding C++ templates for every conceivable type
>   users instantiate them, this would make applications too large.
>   let's try to figure out some way to reuse template expansions, even
>   if some programs break, because it's more important to keep programs
>   small than to enable them to behave correctly
>
> ?

No, that would be an obviously stupid position to take.  I don't
understand why you even say such a thing.

> Why would code, essential for debug information consumers that are
> part of larger systems to work correctly, deserve any less attention
> to correctness?

Because for most people the use of debug information is to use it in a
debugger.  And for those people, correct information at line positions
suffices.

Even the use you mentioned of doing backtraces only requires adding
the notes around function calls, not around every line, unless you
enable -fnon-call-exceptions.

If you want to work on supporting this controlled by an option (-g4?),
that is fine with me.

Ian

From jklowden@freetds.org Fri Dec 21 05:28:00 2007
From: jklowden@freetds.org (James K. Lowden)
Date: Fri, 21 Dec 2007 05:28:00 -0000
Subject: -Wparentheses lumps too much together
In-Reply-To:
References: <20071219200235.GA21525@oak.schemamania.org> <20071220005030.4971a442.jklowden@freetds.org>
Message-ID: <20071221001038.71ef6ec3.jklowden@freetds.org>

Ian Lance Taylor wrote:
> A typical true positive looked more or less like
>
>   if (a &&
>       b || c)

http://www.jetcafe.org/jim/c-style.html

It's funny you should mention that.  A warning about whitespace
indentation that's inconsistent with the expressed logic *would* be
helpful (and consistent with the if/else part of -Wparentheses).

--jkl

From dragonylffly@gmail.com Fri Dec 21 07:54:00 2007
From: dragonylffly@gmail.com (Qing Wei)
Date: Fri, 21 Dec 2007 07:54:00 -0000
Subject: Where to find the sources implementing GCC DFA pipeline hazard recognizer
Message-ID: <476B4EFA.9030906@gmail.com>

Hi,

Could someone give some hints on where to find the sources and
algorithms implementing the DFA pipeline hazard recognizer in GCC:
which files and functions?  Thanks in advance.

Qing

From iant@google.com Fri Dec 21 08:36:00 2007
From: iant@google.com (Ian Lance Taylor)
Date: Fri, 21 Dec 2007 08:36:00 -0000
Subject: Where to find the sources implementing GCC DFA pipeline hazard recognizer
In-Reply-To: <476B4EFA.9030906@gmail.com>
References: <476B4EFA.9030906@gmail.com>
Message-ID:

Qing Wei writes:

> Could someone give some hints on where to find the sources and
> algorithms implementing the DFA pipeline hazard recognizer in GCC:
> which files and functions?

The files haifa-sched.c, sched-*.[ch].

Ian

From Ralf.Wildenhues@gmx.de Fri Dec 21 09:27:00 2007
From: Ralf.Wildenhues@gmx.de (Ralf Wildenhues)
Date: Fri, 21 Dec 2007 09:27:00 -0000
Subject: -Wparentheses lumps too much together
References: <20071219200235.GA21525@oak.schemamania.org>
Message-ID:

freetds.org> writes:
>
> Yes, I know beginners get confused by and/or precedence.
> But
> *every* language that I know of that has operator precedence places
> 'and' before 'or'.

FWIW, Bourne shell doesn't, && and || have equal precedence there.
That's a bit off-topic though, as it's not an argument against your
actual proposition, but rather one for `sh -Wall'.  ;-)

Cheers,
Ralf

From nightstrike@gmail.com Fri Dec 21 17:16:00 2007
From: nightstrike@gmail.com (NightStrike)
Date: Fri, 21 Dec 2007 17:16:00 -0000
Subject: -Wparentheses lumps too much together
In-Reply-To:
References: <20071219200235.GA21525@oak.schemamania.org>
Message-ID:

On 12/20/07, Ralf Wildenhues wrote:
> freetds.org> writes:
> >
> > Yes, I know beginners get confused by and/or precedence.  But
> > *every* language that I know of that has operator precedence places
> > 'and' before 'or'.
>
> FWIW, Bourne shell doesn't, && and || have equal precedence there.
> That's a bit off-topic though, as it's not an argument against your
> actual proposition, but rather one for `sh -Wall'.  ;-)

It's not entirely off-topic.  Not all programmers are dedicated to a
specific language.  It's customary to work on several different
languages, and keeping things like operator precedence straight in
your head between languages is not always easy.  Things like -Wall are
a great help in making sure that you don't miss any of those
inter-language oddities.

As long as there are options to go either way, for instance:

o -Wall checks by default, -Wno-parentheses disables
o -Wall doesn't check by default, -Wparentheses enables

then it's really just a question of what should be enabled by default,
not what should be checked for at all.  The point is... does it really
matter, as long as everyone can go either way?

From aoliva@redhat.com Fri Dec 21 17:31:00 2007
From: aoliva@redhat.com (Alexandre Oliva)
Date: Fri, 21 Dec 2007 17:31:00 -0000
Subject: sub-optimal stack alignment with __builtin_alloca()
Message-ID:

WRT http://gbenson.livejournal.com/2007/12/21/

I see where the problem is.
GCC is being overzealous because a default that was local to one file
was made global on 2003-10-07, and this changed the behavior of the
#if statement in explow.c's allocate_dynamic_stack_space():

#if defined (STACK_DYNAMIC_OFFSET) || defined (STACK_POINTER_OFFSET)
#define MUST_ALIGN 1
#else
#define MUST_ALIGN (PREFERRED_STACK_BOUNDARY < BIGGEST_ALIGNMENT)
#endif

Unfortunately, STACK_POINTER_OFFSET isn't a preprocessor constant on
all ports.  We could change the above to:

#if defined (STACK_DYNAMIC_OFFSET)
#define MUST_ALIGN 1
#else
#define MUST_ALIGN (STACK_POINTER_OFFSET || PREFERRED_STACK_BOUNDARY < BIGGEST_ALIGNMENT)
#endif

but on at least one port (pa), STACK_POINTER_OFFSET depends on the
size of the outgoing arguments of a function, which we don't
necessarily know yet at the point we expand alloca builtins.  For pa,
it's never zero, but for other ports it might be, and then this would
break.

Thoughts, anyone?

BTW, function.c still provides a no-longer-necessary default for
STACK_POINTER_OFFSET.

-- 
Alexandre Oliva         http://www.lsd.ic.unicamp.br/~oliva/
FSF Latin America Board Member         http://www.fsfla.org/
Red Hat Compiler Engineer   aoliva@{redhat.com, gcc.gnu.org}
Free Software Evangelist  oliva@{lsd.ic.unicamp.br, gnu.org}

From aoliva@redhat.com Fri Dec 21 18:12:00 2007
From: aoliva@redhat.com (Alexandre Oliva)
Date: Fri, 21 Dec 2007 18:12:00 -0000
Subject: Designs for better debug info in GCC
In-Reply-To: (Ian Lance Taylor's message of "20 Dec 2007 20\:37\:31 -0800")
References: <4aca3dc20712161712w1133fb96qd66be0e9a0bb1716@mail.gmail.com> <4766B8E5.60500@google.com> <4766DF5C.1020802@google.com> <47671BF4.5050704@google.com> <1198092296.6413.5.camel@janis-laptop>
Message-ID:

On Dec 21, 2007, Ian Lance Taylor wrote:

>> Why would code, essential for debug information consumers that are
>> part of larger systems to work correctly, deserve any less attention
>> to correctness?

> Because for most people the use of debug information is to use it in a
> debugger.

Emitting incorrect debug information that most people wouldn't use
anyway is like breaking only the template instantiations that most
people wouldn't use anyway.

Would you defend the latter position?

> Even the use you mentioned of doing backtraces only requires adding
> the notes around function calls, not around every line, unless you
> enable -fnon-call-exceptions.

Asynchronous signals, anyone?

Asynchronous attachment to processes for inspection?

Inspection at random points in time?

Debugging is changing.  Please stop assuming the only use for debug
information is for interactive debugging sessions like those provided
by GDB.

Debug information specifications/standards should be on par with
language, ABI and ISA specifications/standards.

> If you want to work on supporting this controlled by an option (-g4?),
> that is fine with me.

So, how would you document -g2?  Generate debug information that is
thoroughly broken, but that is hopefully good enough for some limited
and dated scenarios of debugging?

And, more importantly, how would you go about introducing something
that provides more meaningful information than the current
(non-?)design does, but that discards just the right amount of
information so as to keep debug information just barely enough for
debugging, but without discarding too much?

In other words, how do you draw the line, algorithmically speaking?

-- 
Alexandre Oliva         http://www.lsd.ic.unicamp.br/~oliva/
FSF Latin America Board Member         http://www.fsfla.org/
Red Hat Compiler Engineer   aoliva@{redhat.com, gcc.gnu.org}
Free Software Evangelist  oliva@{lsd.ic.unicamp.br, gnu.org}

From iant@google.com Fri Dec 21 19:32:00 2007
From: iant@google.com (Ian Lance Taylor)
Date: Fri, 21 Dec 2007 19:32:00 -0000
Subject: Designs for better debug info in GCC
In-Reply-To:
References: <4aca3dc20712161712w1133fb96qd66be0e9a0bb1716@mail.gmail.com> <4766B8E5.60500@google.com> <4766DF5C.1020802@google.com> <47671BF4.5050704@google.com> <1198092296.6413.5.camel@janis-laptop>
Message-ID:

Alexandre Oliva writes:

> On Dec 21, 2007, Ian Lance Taylor wrote:
>
> >> Why would code, essential for debug information consumers that are
> >> part of larger systems to work correctly, deserve any less attention
> >> to correctness?
>
> > Because for most people the use of debug information is to use it in a
> > debugger.
>
> Emitting incorrect debug information that most people wouldn't use
> anyway is like breaking only the template instantiations that most
> people wouldn't use anyway.
>
> Would you defend the latter position?

Alexandre, I have to say that in my opinion absurd arguments like this
do not strengthen your position.  I think they make it weaker, because
they encourage people like me--the people you have to convince--to
write you off as somebody more interested in rhetoric than in actual
thought.

> > Even the use you mentioned of doing backtraces only requires adding
> > the notes around function calls, not around every line, unless you
> > enable -fnon-call-exceptions.
>
> Asynchronous signals, anyone?
>
> Asynchronous attachment to processes for inspection?
>
> Inspection at random points in time?

What we sacrifice in these cases is the ability to sometimes get a
correct view of at most two or three local variables being modified in
the exact statement being executed at the time of the signal.  When I
say "correct view" here I mean that sometimes the tools will see the
wrong value for a variable, when the truth is that they should see
that the variable's value is unavailable.

We do not sacrifice anything about the ability to look at variables
declared in functions higher up in the stack frame.

Programmers can reasonably select a trade-off between larger debug
information size and the ability to correctly inspect local variables
when they asynchronously examine a program.

Moreover, a tool which reads the debug information can determine that
it is looking at instructions in the middle of the statement, and that
therefore the known locations of local variables need not be correct.
So in fact we don't even lose the ability to get a correct view.  What
we lose is the ability to in some cases see a value which actually is
available, but which the debugging tool can not prove to be available.

> > If you want to work on supporting this controlled by an option (-g4?),
> > that is fine with me.
>
> So, how would you document -g2?  Generate debug information that is
> thoroughly broken, but that is hopefully good enough for some limited
> and dated scenarios of debugging?
>
> And, more importantly, how would you go about introducing something
> that provides more meaningful information than the current
> (non-?)design does, but that discards just the right amount of
> information so as to keep debug information just barely enough for
> debugging, but without discarding too much?
>
> In other words, how do you draw the line, algorithmically speaking?

I already told you one perfectly good place to draw the line: make
variable location information correct at line notes.  That suffices
for many practical uses.  And I already said that I'm willing to see
an option to permit more precise debugging information.

It appears to me that you think that there is a binary choice between
debugging information that is correct by your definition and debugging
information that is incorrect.  That is a false dichotomy.  There are
many gradations of debugging information that are useful.  For
example, I don't know what your position on -g1 is, but certainly many
people find it to be useful and practical, just as many people find
-g0 and -g2 to be useful and practical.  Presumably some people also
find -g3 to be useful, although I don't know any of them myself.
Correctness of debugging information is not a binary characteristic.

Ian

From r-smith@ihug.co.nz Fri Dec 21 20:19:00 2007
From: r-smith@ihug.co.nz (Ross Smith)
Date: Fri, 21 Dec 2007 20:19:00 -0000
Subject: -Wparentheses lumps too much together
In-Reply-To: <200712201509.20582.paul@codesourcery.com>
References: <20071219200235.GA21525@oak.schemamania.org> <20071220005030.4971a442.jklowden@freetds.org> <200712201509.20582.paul@codesourcery.com>
Message-ID: <476C1461.2050107@ihug.co.nz>

Paul Brook wrote:
>James K. Lowden wrote:
>>
>> 1) most combinations of && and || don't need parentheses because
>>
>> (a && b) || (c && d)
>>
>> is by far more common than
>>
>> a && (b || c) && d
>>
>> and, moreover, broken code fails at runtime, and
>
> I dispute these claims.
>
> The former may be statistically more common, but I'd be surprised if the
> difference is that big. I can think of several fairly common situations where
> both would be used.
>
> Any time you've got any sort of nontrivial condition, I always find it better
> to include the explicit parentheses. Especially if a, b, c, and d are
> relatively complex relational expressions rather than simple variables.

I second Paul's points.  The precedence of && and || is not widely
enough known that warning about it should be off by default (for
people sane enough to use -Wall).

A couple of data points:

First, I've been writing C and C++ for a living for nearly 20 years
now, and I didn't know that && had higher precedence than ||.  I
vaguely recalled that they had different precedence, but I couldn't
have told you which came first without looking it up.  I'd happily bet
that the same is true of the overwhelming majority of developers who
aren't compiler hackers.  Giving && and || different precedence is one
of those things that feels so totally counterintuitive that I have
trouble remembering it no matter how many times I look it up.  I have
a firm coding rule of always parenthesising them when they're used
together.  (Likewise &, |, and ^, which have similar issues.
I can't remember whether -Wparentheses warns about those too.)

Second, I just grepped the codebase I'm currently working on (about
60k lines of C++) for anywhere && and || appear on the same line.  I
got 49 hits, 29 where && was evaluated before ||, 20 the other way
around.  (All of them, I'm happy to say, properly parenthesised.)  So
while &&-inside-|| seems to be slightly more common, I'd certainly
dispute James's claim that it's "far more common".

-- Ross Smith

From cschueler@gmx.de Fri Dec 21 20:25:00 2007
From: cschueler@gmx.de (Christian Schüler)
Date: Fri, 21 Dec 2007 20:25:00 -0000
Subject: A proposal to align GCC stack
References:
Message-ID:

Ye, Joey intel.com> writes:
>

Please go forward with this idea!

The current implementation of force_align_arg_pointer has never worked
for me.  I have a DLL which may be called by code out of my control,
and I already have manual stub functions to align the stack.  I would
love to rely on compiler facilities for this, but if I do, the host
program crashes when my DLL is loaded.

From aoliva@redhat.com Fri Dec 21 22:46:00 2007
From: aoliva@redhat.com (Alexandre Oliva)
Date: Fri, 21 Dec 2007 22:46:00 -0000
Subject: Designs for better debug info in GCC
In-Reply-To: (Ian Lance Taylor's message of "21 Dec 2007 10\:12\:00 -0800")
References: <4766B8E5.60500@google.com> <4766DF5C.1020802@google.com> <47671BF4.5050704@google.com> <1198092296.6413.5.camel@janis-laptop>
Message-ID:

On Dec 21, 2007, Ian Lance Taylor wrote:

> Alexandre Oliva writes:
>> On Dec 21, 2007, Ian Lance Taylor wrote:
>>
>> >> Why would code, essential for debug information consumers that are
>> >> part of larger systems to work correctly, deserve any less attention
>> >> to correctness?
>>
>> > Because for most people the use of debug information is to use it in a
>> > debugger.
>>
>> Emitting incorrect debug information that most people wouldn't use
>> anyway is like breaking only the template instantiations that most
>> people wouldn't use anyway.
>>
>> Would you defend the latter position?

> Alexandre, I have to say that in my opinion absurd arguments like this
> do not strengthen your position.

I'm sorry that you feel that way, but I don't understand why you and
so many others apply different compliance standards to debug
information.  Why do you regard compiler output that causes systems to
fail because they process incorrect debug information as any more
acceptable than compiler output that causes systems to fail because
they process incorrect instructions?

Do you just not see how serious the problem is, or just not care about
the growing number of tools and people who need the information to be
standard-compliant?

> What we sacrifice in these cases is the ability to sometimes get a
> correct view of at most two or three local variables being modified in
> the exact statement being executed at the time of the signal.

Aren't you forgetting that complex statements and scheduling can make
it much worse than this?  In fact, that there can be very many "active
statements" at any single point in the code (and this is even more
critical on some architectures such as IA64), and that, in these
cases, your suggested notion of "line notes" is pretty much
meaningless, for they will be present between pretty much every pair
of statements anyway?

> Programmers can reasonably select a trade-off between larger debug
> information size and the ability to correctly inspect local
> variables when they asynchronously examine a program.

I don't have a problem with permitting people to make this trade-off,
as long as the information we generate is still arguably correct
(i.e., not necessarily in what I understand as correct), even if it is
incomplete.  I just don't see where to draw a line that makes sense to
me.

> Moreover, a tool which reads the debug information can determine that
> it is looking at instructions in the middle of the statement, and that
> therefore the known locations of local variables need not be correct.
> So in fact we don't even lose the ability to get a correct view.  What
> we lose is the ability to in some cases see a value which actually is
> available, but which the debugging tool can not prove to be available.

Feel like proposing this "relaxed mode" to the DWARF standardization
committee?  At least an annotation that tells debug info consumers not
to trust fully the information encoded there, because it's only valid
at instructions marked with the "is_stmt" flag, or some such.

> It appears to me that you think that there is a binary choice between
> debugging information that is correct by your definition and debugging
> information that is incorrect.  That is a false dichotomy.  There are
> many gradations of debugging information that are useful.  For
> example, I don't know what your position on -g1 is, but certainly many
> people find it to be useful and practical, just as many people find
> -g0 and -g2 to be useful and practical.  Presumably some people also
> find -g3 to be useful, although I don't know any of them myself.
> Correctness of debugging information is not a binary characteristic.

But this paragraph above is not about correctness, it's about
completeness.  -g0 is less complete than -g1 is less complete than -g2
is less complete than -g3.  They all have their uses, but they can all
be compliant with the debug information standards, because what they
leave out is optional information.

What you're proposing is something else.  It's not about leaving out
information that is specified as optional in the standard.
It's about emitting information, rather than leaving it out, and
emitting it in a way that is non-compliant with the standard, which
makes it misleading and error-prone to debug information consumers
that have no reason to suspect it might be wrong.  And all this just
because emitting correct and more complete information would make it
larger, but we don't even know by how much.

What are you trying to accomplish?

Why do you want -g to generate incorrect debug information, and force
debug information consumers that have use cases different than yours,
and distributors of such debug information, to decide between changing
their build procedures to get what the compiler should have long given
them, or living with unreliable information?

Just so that you, who don't care so much about the correctness of this
information yet, can shave off some bytes from your object files?  Why
shouldn't you use an option such as -gimme-just-what-I-need-no-more or
-fsck-up-my-debug-info-I-dont-care-about-standards instead?

-- 
Alexandre Oliva         http://www.lsd.ic.unicamp.br/~oliva/
FSF Latin America Board Member         http://www.fsfla.org/
Red Hat Compiler Engineer   aoliva@{redhat.com, gcc.gnu.org}
Free Software Evangelist  oliva@{lsd.ic.unicamp.br, gnu.org}

From gccadmin@gcc.gnu.org Sat Dec 22 00:02:00 2007
From: gccadmin@gcc.gnu.org (gccadmin@gcc.gnu.org)
Date: Sat, 22 Dec 2007 00:02:00 -0000
Subject: gcc-4.3-20071221 is now available
Message-ID: <20071221224543.14910.qmail@sourceware.org>

Snapshot gcc-4.3-20071221 is now available on
  ftp://gcc.gnu.org/pub/gcc/snapshots/4.3-20071221/
and on various mirrors, see http://gcc.gnu.org/mirrors.html for details.

This snapshot has been generated from the GCC 4.3 SVN branch
with the following options: svn://gcc.gnu.org/svn/gcc/trunk revision 131125

You'll find:

gcc-4.3-20071221.tar.bz2              Complete GCC (includes all of below)

gcc-core-4.3-20071221.tar.bz2         C front end and core compiler

gcc-ada-4.3-20071221.tar.bz2          Ada front end and runtime

gcc-fortran-4.3-20071221.tar.bz2      Fortran front end and runtime

gcc-g++-4.3-20071221.tar.bz2          C++ front end and runtime

gcc-java-4.3-20071221.tar.bz2         Java front end and runtime

gcc-objc-4.3-20071221.tar.bz2         Objective-C front end and runtime

gcc-testsuite-4.3-20071221.tar.bz2    The GCC testsuite

Diffs from 4.3-20071214 are available in the diffs/ subdirectory.

When a particular snapshot is ready for public consumption the LATEST-4.3
link is updated and a message is sent to the gcc list.  Please do not use
a snapshot before it has been announced that way.

From iant@google.com Sat Dec 22 00:07:00 2007
From: iant@google.com (Ian Lance Taylor)
Date: Sat, 22 Dec 2007 00:07:00 -0000
Subject: Designs for better debug info in GCC
In-Reply-To:
References: <4766B8E5.60500@google.com> <4766DF5C.1020802@google.com> <47671BF4.5050704@google.com> <1198092296.6413.5.camel@janis-laptop>
Message-ID:

Alexandre Oliva writes:

> > Alexandre, I have to say that in my opinion absurd arguments like this
> > do not strengthen your position.
>
> I'm sorry that you feel that way, but I don't understand why you and
> so many others apply different compliance standards to debug
> information.  Why do you regard compiler output that causes systems to
> fail because they process incorrect debug information as any more
> acceptable than compiler output that causes system to fail because
> they process incorrect instructions?

Because a compiler that generates incorrect instructions is completely
useless for all users.
A compiler that generates incorrect debug information, or no debug
information at all, or debug information which is randomly correct
and incorrect, is still quite useful for many users.  Evidence: gcc
today.

I have to say that I find your arguments along these lines to be so
absurd as to be nearly incomprehensible.  gcc does not exist to adhere
to standards.  It exists to provide a service to its users.  I and so
many others apply different compliance standards to debug information
because that is appropriate for our user base.

> Do you just not see how serious the problem is, or just not care about
> the growing number of tools and people who need the information to be
> standard-compliant?

Do you just not see that your false dichotomies have nothing to do
with the real usage of gcc in the real world?

Is anybody out there saying that we should absolutely not improve the
debug information?  No, of course not.  All serious people are in
favor of improving the debug information.  We are just saying that for
debug information it is appropriate to weigh different user needs.
Those needs include compilation time and size of generated files.

This is not true for correctness of generated code.  There is no such
weighing in that area; the generated code must be correct or the
compiler is completely useless.

> > What we sacrifice in these cases is the ability to sometimes get a
> > correct view of at most two or three local variables being modified in
> > the exact statement being executed at the time of the signal.
>
> Aren't you forgetting that complex statements and scheduling can make
> it much worse than this?  In fact, that there can be very many "active
> statements" at any single point in the code (and this is even more
> critical on some architectures such as IA64), and that, in these
> cases, your suggested notion of "line notes" is pretty much
> meaningless, for they will be present between pretty much every pair
> of statements anyway?
Fortunately not every single instruction is going to change a user visible variable. But, yes, that is a potential issue. We will have to see what the effect is on debug information size. > > Moreover, a tool which reads the debug information can determine that > > it is looking at instructions in the middle of the statement, and that > > therefore the known locations of local variables need not be correct. > > So in fact we don't even lose the ability to get a correct view. What > > we lose is the ability to in some cases see a value which actually is > > available, but which the debugging tool can not prove to be available. > > Feel like proposing this "relaxed mode" to the DWARF standardization > committee? At least an annotation that tells debug info consumers not > to trust fully the information encoded there, because it's only valid > at instructions marked with the "is_stmt" flag, or some such. No, my personal interest in standardization of debugging information is near-zero. > Why do you want -g to generate incorrect debug information, and force > debug information consumers that have use cases different than yours, > and distributors of such debug information, to decide between changing > their build procedures to get what the compiler should have long given > them, or living with unreliable information? I guess it must be because I'm an extremist who can only cares about one thing, and I have no interest in considering issues that other people might care about. What other possible explanation could there be? > Just so that you, who don't care so much about the correctness of this > information yet, can shave off some bytes from your object files? Why > shouldn't you use an option such as -gimme-just-what-I-need-no-more or > -fsck-up-my-debug-info-I-dont-care-about-standards instead? First, we add the option. Second, we see what the results look like. Third, we decide what the default should be. 
Like it or not, the large size of debug information is a serious issue
for many people.

Ian

From pinskia@gmail.com Sat Dec 22 00:09:00 2007
From: pinskia@gmail.com (Andrew Pinski)
Date: Sat, 22 Dec 2007 00:09:00 -0000
Subject: Designs for better debug info in GCC
In-Reply-To:
References:
Message-ID:

On 21 Dec 2007 16:02:38 -0800, Ian Lance Taylor wrote:
> Like it or not, the large size of debug information is a serious issue
> for many people.

Link times are hurt by the large size of debugging information. I have
had many, many complaints from users of the PS3 toolchain that link
times are huge, and my investigation found that the size of the
debugging info contributed to most (if not all) of the increased link
times.

Thanks,
Andrew Pinski

From pinskia@gmail.com Sat Dec 22 03:16:00 2007
From: pinskia@gmail.com (Andrew Pinski)
Date: Sat, 22 Dec 2007 03:16:00 -0000
Subject: Designs for better debug info in GCC
In-Reply-To:
References:
Message-ID:

On 12/21/07, Andrew Pinski wrote:
> On 21 Dec 2007 16:02:38 -0800, Ian Lance Taylor wrote:
> > Like it or not, the large size of debug information is a serious issue
> > for many people.
>
> Link times are hurt by large size of debugging information. I have
> many many complaints from some users of the PS3 toolchain that link
> times are huge and from my investigation, found the size of the
> debugging info contributed to most (if not all) of the increased link
> times.

I forgot to mention that the increase in debugging information about
prologue and epilogue (made by RTH) between 4.0.2 and 4.1.1 made the
link time increase a huge amount. This is just an example of where
increased debugging information hurts development time.
Thanks,
Andrew Pinski

From hp@bitrange.com Sat Dec 22 03:42:00 2007
From: hp@bitrange.com (Hans-Peter Nilsson)
Date: Sat, 22 Dec 2007 03:42:00 -0000
Subject: __builtin_expect for indirect function calls
In-Reply-To: <20071218000552.GV3656@playstation.sony.com>
References: <20071218000552.GV3656@playstation.sony.com>
Message-ID: <20071221220630.Y67443@dair.pair.com>

On Mon, 17 Dec 2007, trevor_smigiel@playstation.sony.com wrote:
> When we can't hint the real target, we want to hint the most common
> target. There are potentially clever ways for the compiler to do this
> automatically, but I'm most interested in giving the user some way to do
> it explicitly. One possibility is to have something similar to
> __builtin_expect, but for functions. For example, I propose:
>
> __builtin_expect_call (FP, PFP)

Is there a hidden benefit? I mean, isn't this really expressible
using builtin_expect as-is, at least when it comes to the syntax?
Like:

> which returns the value of FP with the same type as FP, and tells the
> compiler that PFP is the expected target of FP. Trivial examples:
>
> typedef void (*fptr_t)(void);
>
> extern void foo(void);
>
> void
> call_fp (fptr_t fp)
> {
>   /* Call the function pointed to by fp, but predict it as if it is
>      calling foo() */
>   __builtin_expect_call (fp, foo)();

  __builtin_expect (fp, foo); /* alt __builtin_expect (fp == foo, 1); */
  fp ();

> }
>
> void
> call_fp_predicted (fptr_t fp, fptr_t predicted)
> {
>   /* same as above but the function we are calling doesn't have to be
>      known at compile time */
>   __builtin_expect_call (fp, predicted)();

  __builtin_expect (fp, predicted);
  fp();

I guess the information just isn't readily available in the
preferred form when needed and *that* part could more or less
simply be fixed?
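For readers following along, the "alt" form in the reply above can be written out as a small self-contained sketch. It assumes only GCC's existing __builtin_expect (long, long); the counters and the second function bar are hypothetical names added purely so the behaviour can be observed, and nothing here is the proposed __builtin_expect_call.

```c
#include <assert.h>

typedef void (*fptr_t)(void);

/* Hypothetical counters, used only to observe which function ran. */
static int foo_calls;
static int bar_calls;

static void foo(void) { foo_calls++; }
static void bar(void) { bar_calls++; }

/* Hint that fp most likely points to foo by applying the existing
   __builtin_expect to the comparison.  Either branch calls the same
   function that a plain fp() would have called. */
static void call_fp(fptr_t fp)
{
    assert(fp != 0);
    if (__builtin_expect(fp == foo, 1))
        foo();   /* expected case: a direct call to the predicted target */
    else
        fp();    /* unexpected case: ordinary indirect call */
}
```

Calling call_fp (foo) takes the direct branch and call_fp (bar) falls back to the indirect call. Whether a given target turns this hint into a useful branch hint is up to the back end, which seems to be the gap the original proposal is about.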
brgds, H-P From dewar@adacore.com Sat Dec 22 07:38:00 2007 From: dewar@adacore.com (Robert Dewar) Date: Sat, 22 Dec 2007 07:38:00 -0000 Subject: Designs for better debug info in GCC In-Reply-To: References: <4766B8E5.60500@google.com> <4766DF5C.1020802@google.com> <47671BF4.5050704@google.com> <1! 198092296.6413.5.camel@janis-laptop> Message-ID: <476C8799.8030309@adacore.com> Alexandre Oliva wrote: > I'm sorry that you feel that way, but I don't understand why you and > so many others apply different compliance standards to debug > information. Why do you regard compiler output that causes systems to > fail because they process incorrect debug information as any more > acceptable than compiler output that causes system to fail because > they process incorrect instructions? Incorrect debug output does not cause systems to fail in any reasonable development methodology. It is simply a nuisance. After all you can perfectly well develop an application without a debugger at all if you have to, but you have to have correct code being generated or things are MUCH harder. I am all in favor of getting the debug information as accurate as possible, but I agree with others who feel that this excessive rhetoric is damaging the cause of achieving this. If you don't understand why different compliance standards are applied in the two cases, then there is something major you are missing. > Just so that you, who don't care so much about the correctness of this > information yet, can shave off some bytes from your object files? Why > shouldn't you use an option such as -gimme-just-what-I-need-no-more or > -fsck-up-my-debug-info-I-dont-care-about-standards instead? I am beginning to think this is a lost cause if you persist in taking this flippant attitude, and fail to understand the basis of the real concerns about what you propose. 
From clattner@apple.com Sat Dec 22 11:44:00 2007 From: clattner@apple.com (Chris Lattner) Date: Sat, 22 Dec 2007 11:44:00 -0000 Subject: Designs for better debug info in GCC In-Reply-To: References: Message-ID: <7C283DB3-9716-4B2C-9721-D1F503B91CC4@apple.com> On Dec 21, 2007, at 4:09 PM, Andrew Pinski wrote: > On 12/21/07, Andrew Pinski wrote: >> On 21 Dec 2007 16:02:38 -0800, Ian Lance Taylor >> wrote: >>> Like it or not, the large size of debug information is a serious >>> issue >>> for many people. >> >> Link times are hurt by large size of debugging information. I have >> many many complaints from some users of the PS3 toolchain that link >> times are huge and from my investigation, found the size of the >> debugging info contributed to most (if not all) of the increased link >> times. > > I forgot to mention the increase in debugging information about > prologue and eplogue (made by RTH) between 4.0.2 and 4.1.1 made the > link time increase a huge amount. It's worth noting that not all systems store debug information in executables. On Mac OS 10.5, the linker leaves debug info in the .o files instead of copying it into the executable. As such, size of debug info doesn't significantly affect link-time or executable size (but it can obviously affect time to launch the debugger). I'm sure there are other systems that do similar things. If debug info size and link time is really such a serious problem for so many users, perhaps people developing the gnu toolchain should investigate an extension like this. -Chris From aph@redhat.com Sat Dec 22 13:33:00 2007 From: aph@redhat.com (Andrew Haley) Date: Sat, 22 Dec 2007 13:33:00 -0000 Subject: Designs for better debug info in GCC In-Reply-To: References: <4766B8E5.60500@google.com> <4766DF5C.1020802@google.com> <47671BF4.5050704@google.com> <1! 
198092296.6413.5.camel@janis-laptop> Message-ID: <18284.63659.225269.214216@zebedee.pink> Alexandre Oliva writes: > On Dec 21, 2007, Ian Lance Taylor wrote: > > > > Alexandre, I have to say that in my opinion absurd arguments like this > > do not strengthen your position. > > I'm sorry that you feel that way, but I don't understand why you and > so many others apply different compliance standards to debug > information. We know you don't understand, but that isn't likely to change. Would it not surely be better to cease this pointless argument and get on with the job of improving debuginfo? This absolutist position you seem to have adopted isn't helping. If we could talk about "better" and "worse" rather than "correct" and "incorrrect" we'd get much further. Andrew. -- Red Hat UK Ltd, Amberley Place, 107-111 Peascod Street, Windsor, Berkshire, SL4 1TE, UK Registered in England and Wales No. 3798903 From dewar@adacore.com Sat Dec 22 17:11:00 2007 From: dewar@adacore.com (Robert Dewar) Date: Sat, 22 Dec 2007 17:11:00 -0000 Subject: Designs for better debug info in GCC In-Reply-To: <18284.63659.225269.214216@zebedee.pink> References: <4766DF5C.1020802@google.com> <47671BF4.5050704@google.com> <1! 198092296.6413.5.camel@janis-laptop> <18284.63659.225269.214216@zebedee.pink> Message-ID: <476D11FE.6020404@adacore.com> Andrew Haley wrote: > We know you don't understand, but that isn't likely to change. Would > it not surely be better to cease this pointless argument and get on > with the job of improving debuginfo? This absolutist position you > seem to have adopted isn't helping. > > If we could talk about "better" and "worse" rather than "correct" and > "incorrrect" we'd get much further. I very much agree. Everyone is in favor of better debug information if it is not too costly, we won't really see whether it is too costly until we get some real data. 
But trying to argue for this in terms of standards and conformance is a
real red herring; the proper argument for any improvement to debug
information is utility.

From ismail@pardus.org.tr Sat Dec 22 18:14:00 2007
From: ismail@pardus.org.tr (Ismail Dönmez)
Date: Sat, 22 Dec 2007 18:14:00 -0000
Subject: glibc 2.7 complex functions are possibly miscompiled by gcc 4.3 trunk
Message-ID: <200712221911.32786.ismail@pardus.org.tr>

Hi all,

I am running the glibc 2.7 regression tests using gcc 4.3 trunk nearly
every day and I see 3 tests fail:

math/test-float
math/test-ildoubl
math/test-ifloat

The errors are all similar:

Failure: Test: Imaginary part of: cacosh (-0 + 0 i) == 0.0 + pi/2 i
Result:
 is:         0.00000000000000000000e+00   0x0.00000000000000000000p+0
 should be:  1.57079637050628662109e+00   0x1.921fb600000000000000p+0
 difference: 1.57079637050628662109e+00   0x1.921fb600000000000000p+0
 ulp       : 13176795.0000
 max.ulp   : 0.0000

All of the imaginary part checks fail. With the help of the GFortran
maintainers we identified 2 testcases which fail with glibc 2.7 when
compiled with gcc 4.3 trunk [0]. This problem also causes 32 unexpected
failures in the gfortran regression tests.

So I wonder if you guys can help me debug this. I checked out the libc
sources but it's mostly assembly for the math functions. Maybe Jakub has
an idea, not sure.

Any help/comment appreciated.

[0] http://sourceware.org/bugzilla/show_bug.cgi?id=5490

Regards,
ismail

--
Never learn by your mistakes, if you do you may never dare to try again.
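The "ulp" figure in the failure report above can be cross-checked by hand. The sketch below assumes that one ULP means the spacing between adjacent floats at the magnitude of the expected value; that is an assumption about how the test harness counts, not something taken from the glibc sources. The helper name ulp_error_float is made up for this illustration, and the code avoids libm so it does not depend on the functions under suspicion.

```c
#include <assert.h>
#include <float.h>

/* Error of `result` in units in the last place (ULPs) of `expected`.
   Assumption: one ULP is the spacing between adjacent floats at the
   magnitude of `expected`, which must be nonzero, finite and normal. */
static double ulp_error_float(float result, float expected)
{
    double mag = expected < 0 ? -(double)expected : (double)expected;
    double ulp = FLT_EPSILON;   /* spacing of floats in [1, 2) */
    double diff;

    assert(mag > 0.0 && mag <= FLT_MAX);

    /* Scale the spacing to the binade that contains |expected|. */
    while (mag >= 2.0) { mag /= 2.0; ulp *= 2.0; }
    while (mag < 1.0)  { mag *= 2.0; ulp /= 2.0; }

    diff = (double)result - (double)expected;
    if (diff < 0)
        diff = -diff;
    return diff / ulp;
}
```

With result 0.0f and expected 1.57079637050628662109f (pi/2 rounded to float), the expected value lies in [1, 2), so the spacing is 2^-23 and the quotient is 13176795, the same number the log prints. In other words the imaginary part is simply zero rather than slightly off.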
From iant@google.com Sat Dec 22 21:27:00 2007 From: iant@google.com (Ian Lance Taylor) Date: Sat, 22 Dec 2007 21:27:00 -0000 Subject: Designs for better debug info in GCC In-Reply-To: <7C283DB3-9716-4B2C-9721-D1F503B91CC4@apple.com> References: <7C283DB3-9716-4B2C-9721-D1F503B91CC4@apple.com> Message-ID: Chris Lattner writes: > If debug info size and link time is really such a serious problem for > so many users, perhaps people developing the gnu toolchain should > investigate an extension like this. I'm in favor of implementing this. As I'm sure you know, the GNU binutils and gdb already support using a single separate file for debugging information. But these approaches do not solve all problems. The technique works well during development for a program which is normally run on the same system on which it is developed. It doesn't help much when the program must be run on a different system--it's possible to use gdbserver, but awkward. And it doesn't help at all when it is sometimes necessary to debug executables which have been built and distributed widely. Ian From jan.kratochvil@redhat.com Sat Dec 22 22:49:00 2007 From: jan.kratochvil@redhat.com (Jan Kratochvil) Date: Sat, 22 Dec 2007 22:49:00 -0000 Subject: Strange error message from gdb In-Reply-To: <18282.32042.419094.175944@zebedee.pink> References: <18281.21294.238761.442229@zebedee.pink> <20071219172943.GA5939@caradoc.them.org> <18281.23234.151870.816362@zebedee.pink> <20071219185517.GA10986@caradoc.them.org> <18281.27225.249982.220171@zebedee.pink> <18282.13956.454306.676164@zebedee.pink> <18282.32042.419094.175944@zebedee.pink> Message-ID: <20071222212726.GA21386@host0.dyn.jankratochvil.net> On Thu, 20 Dec 2007 15:33:14 +0100, Andrew Haley wrote: > Alexandre Oliva writes: > > > > How about this patch, instead? It will restore debuggability to Java > > while at the same time maintaining the progress of using the > > long-supported-by-GDB DW_TAG_class_type in both C++ and Java. 
> > > > for gcc/java/ChangeLog > > from Alexandre Oliva > > > > * lang.c (java_classify_record): Don't return > > RECORD_IS_INTERFACE for now. > > > > OK, thanks. FYI GDB HEAD supports it now (just the backward compatible way, no new info extracted from it so far). http://sourceware.org/ml/gdb-cvs/2007-12/msg00123.html Regards, Jan From andi@firstfloor.org Sun Dec 23 00:52:00 2007 From: andi@firstfloor.org (Andi Kleen) Date: Sun, 23 Dec 2007 00:52:00 -0000 Subject: Designs for better debug info in GCC In-Reply-To: (Ian Lance Taylor's message of "Sat\, 22 Dec 2007 18\:15\:04 +0000 \(UTC\)") References: <7C283DB3-9716-4B2C-9721-D1F503B91CC4@apple.com.suse.lists.egcs> Message-ID: Ian Lance Taylor writes: > I'm in favor of implementing this. Yes it would be great. > As I'm sure you know, the GNU > binutils Actually binutils only barely supports debuginfo. AFAIK objcopy is the tool tool that knows anything about them. > and gdb already support using a single separate file for > debugging information. That does not solve that problem because all that data still has to be copied. In the current setup even two times (.o -> exe -> objcopy to debuginfo and then another strip which is another partial write). I assume that copying phase is the problem people are complaining about and debuginfo makes it even worse now. > well during development for a program which is normally run on the > same system on which it is developed. It doesn't help much when the > program must be run on a different system--it's possible to use > gdbserver, but awkward. And it doesn't help at all when it is > sometimes necessary to debug executables which have been built and > distributed widely. The Linux distributions have debuginfo rpms that work fine for that. But it does not solve the link time IO problem. 
-Andi From pkambadu@cs.indiana.edu Sun Dec 23 01:20:00 2007 From: pkambadu@cs.indiana.edu (Prabhanjan Kambadur) Date: Sun, 23 Dec 2007 01:20:00 -0000 Subject: Regarding WITH_CLEANUP_EXPR Message-ID: Dear All, This is Anju from IU, Bloomington. I am trying to inject some code into the program and encountered a strange error enroute. Essentially, I am trying to create an std::vector and then call "resize()" on it. "T" is program dependent. When "T" is a primitive type, everything seems to work fine. However, when "T" is a record such as an std::string, I get an error during gimplification: internal compiler error: in lower_stmt, at gimple-low.c:282. The gimple ouput reads thus: <<< Unknown tree: with_cleanup_expr __comp_dtor (&D.109215) >>> Any suggestions as to why this happens? I am working on a branch off of 4.3. Thanks, Anju From drow@false.org Sun Dec 23 01:32:00 2007 From: drow@false.org (Daniel Jacobowitz) Date: Sun, 23 Dec 2007 01:32:00 -0000 Subject: Designs for better debug info in GCC In-Reply-To: References: <7C283DB3-9716-4B2C-9721-D1F503B91CC4@apple.com.suse.lists.egcs> Message-ID: <20071223012025.GA15196@caradoc.them.org> On Sat, Dec 22, 2007 at 11:49:23PM +0100, Andi Kleen wrote: > > As I'm sure you know, the GNU > > binutils > > Actually binutils only barely supports debuginfo. AFAIK > objcopy is the tool tool that knows anything about them. I don't know why you say that. ld knows a bit about debugging sections, and how to read .debug_line for errors; objdump knows how to decode debug info, as does readelf; strip knows how to remove it; objcopy how to copy and separate it. > The Linux distributions have debuginfo rpms that work > fine for that. But it does not solve the link time IO problem. FWIW, in the paragraph you were responding to Ian was talking about the Darwin system, not the GNU one. 
-- Daniel Jacobowitz CodeSourcery From andi@firstfloor.org Sun Dec 23 01:36:00 2007 From: andi@firstfloor.org (Andi Kleen) Date: Sun, 23 Dec 2007 01:36:00 -0000 Subject: Designs for better debug info in GCC In-Reply-To: <20071223012025.GA15196@caradoc.them.org> References: <7C283DB3-9716-4B2C-9721-D1F503B91CC4@apple.com.suse.lists.egcs> <20071223012025.GA15196@caradoc.them.org> Message-ID: <20071223013344.GA21843@one.firstfloor.org> > I don't know why you say that. ld knows a bit about debugging > sections, and how to read .debug_line for errors; objdump knows how to > decode debug info, as does readelf; strip knows how to remove it; > objcopy how to copy and separate it. Sorry I mean separate debuginfo, as Ian was refering too. I actually had a patch once to hack it into objdump for -S and also into addr2line but it was somewhat ugly and still had some problems and I didn't submit it. -Andi From drow@false.org Sun Dec 23 05:55:00 2007 From: drow@false.org (Daniel Jacobowitz) Date: Sun, 23 Dec 2007 05:55:00 -0000 Subject: Designs for better debug info in GCC In-Reply-To: <20071223013344.GA21843@one.firstfloor.org> References: <7C283DB3-9716-4B2C-9721-D1F503B91CC4@apple.com.suse.lists.egcs> <20071223012025.GA15196@caradoc.them.org> <20071223013344.GA21843@one.firstfloor.org> Message-ID: <20071223013603.GA16448@caradoc.them.org> On Sun, Dec 23, 2007 at 02:33:44AM +0100, Andi Kleen wrote: > > I don't know why you say that. ld knows a bit about debugging > > sections, and how to read .debug_line for errors; objdump knows how to > > decode debug info, as does readelf; strip knows how to remove it; > > objcopy how to copy and separate it. > > Sorry I mean separate debuginfo, as Ian was refering too. > > I actually had a patch once to hack it into objdump for -S and > also into addr2line but it was somewhat ugly and still > had some problems and I didn't submit it. Oh, I see. Yes, only BFD and GDB know much about it. 
--
Daniel Jacobowitz
CodeSourcery

From fche@redhat.com Sun Dec 23 17:40:00 2007
From: fche@redhat.com (Frank Ch. Eigler)
Date: Sun, 23 Dec 2007 17:40:00 -0000
Subject: Designs for better debug info in GCC
In-Reply-To: (Ian Lance Taylor's message of "21 Dec 2007 16:02:38 -0800")
References: <4766DF5C.1020802@google.com> <47671BF4.5050704@google.com> <1198092296.6413.5.camel@janis-laptop>
Message-ID:

Ian Lance Taylor writes:

> [...] Because a compiler that generates incorrect instructions is
> completely useless for all users.

Surely you overstate this: gcc has always included a generous serving
of incorrect-code-generation bugs.

> A compiler that generates incorrect debug information, or no debug
> information at all, or debug information which is randomly correct
> and incorrect, is still quite useful for many users. Evidence: gcc
> today.

Indeed.

> [...] Like it or not, the large size of debug information is a
> serious issue for many people.

It is profoundly ironic that, despite the great bulk of this data, its
quality has severe enough blemishes that people can't justify
installing/using it. If it were a little larger but significantly more
complete/correct, perhaps the cost/benefit judgement would swing
around.

Coincidentally, we (several RH engineers) are working on dwarf data
compression.

- FChE

From aran@100acres.us Mon Dec 24 07:53:00 2007
From: aran@100acres.us (Aran Clauson)
Date: Mon, 24 Dec 2007 07:53:00 -0000
Subject: Successful Build
Message-ID: <200712230940.08781.aran@100acres.us>

i386-unknown-netbsdelf4.99.35
Using built-in specs.
Target: i686-pc-netbsdelf4.99.35
Configured with: ../gcc-4.2.2/configure --prefix=/usr/local/opt/gcc-4.2.2 --target=i686-pc-netbsdelf4.99.35 --build=i686-pc-netbsdelf4.99.35 --host=i686-pc-netbsdelf4.99.35 --enable-threads --enable-tls --with-cpu=prescott --enable-languages=ada,c,c++,fortran --disable-nls --with-mpfr-lib=/usr/local/lib --with-gmp-lib=/usr/local/lib
Thread model: posix
gcc version 4.2.2

Enabled Languages: C, C++, Ada, and Fortran

Build, host, and target are NetBSD 4.99.35 (Tracking Current).

Aran Clauson

From joey.ye@intel.com Mon Dec 24 13:55:00 2007
From: joey.ye@intel.com (Ye, Joey)
Date: Mon, 24 Dec 2007 13:55:00 -0000
Subject: A proposal to align GCC stack
In-Reply-To: A
References: A
Message-ID:

Christian Schüler writes:
> Please go forward with this idea!
> The current implementation of force_align_arg_pointer has never worked for me.

This proposal should solve your problem. But to confirm, I'd like to
know the root cause. force_align_arg_pointer should have guaranteed
16-byte alignment. Are you using a data structure that requires
alignment larger than 16? Or maybe you didn't specify
force_align_arg_pointer for all of your functions?

Thanks - Joey

From froydnj@codesourcery.com Mon Dec 24 22:42:00 2007
From: froydnj@codesourcery.com (Nathan Froyd)
Date: Mon, 24 Dec 2007 22:42:00 -0000
Subject: [lto] preliminary SPECint benchmark numbers
Message-ID: <20071224202148.GU14579@codesourcery.com>

In one of my recent messages about a patch to the LTO branch, I
mentioned that we could compile and successfully run all of the C
SPECint benchmarks except 176.gcc. Chris Lattner asked if I had done
any benchmarking now that real programs could be run; I said that I
hadn't but would try to do some soon. This is the result of that.

I don't have numbers on what compile times look like, but I don't think
they're good. 176.gcc takes several minutes to compile (basically -flto
*.o, not counting the time to compile individual .o files); the other
benchmarks are all a minute or more apiece.

Executive summary: LTO is currently *not* a win.

In the table below, runtimes are in seconds. I ran the tests on an
8-core 1.6GHz machine with 8 GB RAM. I believe the machine was
relatively idle; I ran the tests over a weekend evening. The last merge
from mainline to the LTO branch was mainline r130155, so that's about
what the -O2 numbers correspond to--I don't think we've changed too much
core code on the branch. The % change are just in-my-head estimates,
using -O2 as a baseline.

                -O2     -flto   % change
164.gzip        174     176     + 1
175.vpr         139     143     + 3
181.mcf         162     166     + 3
186.crafty      65.2    66.6    + < 1
197.parser      240     261     + 9
253.perlbmk     119     133     + 13
254.gap         84.4    87      + 4
256.bzip2       131     145     + 11
300.twolf       202     193     - 4 (!)

176.gcc doesn't run correctly with LTO yet; 255.vortex didn't run
correctly with "mainline", but it did with -flto, which is curious. We
don't do C++ yet, so 252.eon is not included.

In general, things get worse with LTO, sometimes much worse. I can
think of at least three possible reasons off the top of my head:

- Alias information. We don't have any type-based alias information in
  -flto, which hurts.

- We don't merge types between compilation units, which could account
  for poor optimization behavior.

- I believe we lose some information in the LTO write/read process; edge
  probabilities, estimated # instructions in functions, etc. get lost.
  This hurts inlining decisions, block layout, alignment of jump
  targets, etc. So there's information we need to write out or
  recompute.

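Since the "% change" column above is described as an in-the-head estimate, here is a small sketch of the exact arithmetic for anyone who wants to recompute it from the two runtime columns. The function name percent_change is made up for this illustration and is not part of any benchmark tooling.

```c
#include <assert.h>

/* Percentage change of `measured` relative to `baseline`.
   For run times, a positive result means the measured build is slower. */
static double percent_change(double baseline, double measured)
{
    assert(baseline != 0.0);
    return (measured - baseline) / baseline * 100.0;
}
```

For 164.gzip, percent_change (174, 176) comes out at roughly 1.15, in line with the "+ 1" estimate; for 300.twolf, percent_change (202, 193) is roughly -4.46, the one case where -flto is faster.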
-Nathan From gccadmin@gcc.gnu.org Tue Dec 25 18:58:00 2007 From: gccadmin@gcc.gnu.org (gccadmin@gcc.gnu.org) Date: Tue, 25 Dec 2007 18:58:00 -0000 Subject: gcc-4.1-20071224 is now available Message-ID: <20071224224216.13338.qmail@sourceware.org> Snapshot gcc-4.1-20071224 is now available on ftp://gcc.gnu.org/pub/gcc/snapshots/4.1-20071224/ and on various mirrors, see http://gcc.gnu.org/mirrors.html for details. This snapshot has been generated from the GCC 4.1 SVN branch with the following options: svn://gcc.gnu.org/svn/gcc/branches/gcc-4_1-branch revision 131161 You'll find: gcc-4.1-20071224.tar.bz2 Complete GCC (includes all of below) gcc-core-4.1-20071224.tar.bz2 C front end and core compiler gcc-ada-4.1-20071224.tar.bz2 Ada front end and runtime gcc-fortran-4.1-20071224.tar.bz2 Fortran front end and runtime gcc-g++-4.1-20071224.tar.bz2 C++ front end and runtime gcc-java-4.1-20071224.tar.bz2 Java front end and runtime gcc-objc-4.1-20071224.tar.bz2 Objective-C front end and runtime gcc-testsuite-4.1-20071224.tar.bz2 The GCC testsuite Diffs from 4.1-20071217 are available in the diffs/ subdirectory. When a particular snapshot is ready for public consumption the LATEST-4.1 link is updated and a message is sent to the gcc list. Please do not use a snapshot before it has been announced that way. 
From tejgcc@westnet.com.au Tue Dec 25 19:35:00 2007 From: tejgcc@westnet.com.au (Tim Josling) Date: Tue, 25 Dec 2007 19:35:00 -0000 Subject: Rant about ChangeLog entries and commit messages In-Reply-To: References: <2007-12-02-11-05-39+trackit+sam@rfc1149.net> <200712022136.57819.ebotcazou@libertysurf.fr> <4aca3dc20712021240k19f3eae5j66453276179c401a@mail.gmail.com> <200712022355.23871.ebotcazou@libertysurf.fr> <4aca3dc20712021621n39a036d2u21f471f231dfffe@mail.gmail.com> <10712031329.AA20246@vlsi1.ultra.nyu.edu> Message-ID: <1198609072.25084.14.camel@tim-gcc> On Sat, 2007-12-15 at 20:54 -0200, Alexandre Oliva wrote: > On Dec 3, 2007, kenner@vlsi1.ultra.nyu.edu (Richard Kenner) wrote: > > > In my view, ChangeLog is mostly "write-only" from a developer's > > perspective. It's a document that the GNU project requires us to > produce > > for > > ... a good example of compliance with the GPL: > > 5. Conveying Modified Source Versions. > > a) The work must carry prominent notices stating that you modified > it, and giving a relevant date. > (Minor quibble) As copyright owner of GCC, the FSF is not bound by the conditions of the licence it grants in the same way as licencees are bound. So I don't think this provision in itself would mandate that those who have copyright assignments to the FSF record their changes. I don't hear anyone arguing that people should not record what they changes and when. The question is whether it is sufficient. I just started using git locally, and I keep thinking it would be really great to have something like "git blame" for gcc. The command "git blame" gives you a listing of who changed each line of the file and when, and also gives the commit id. From that all can be revealed. > > FWIW, I've used ChangeLogs to find problems a number of times in my 14 > years of work in GCC, and I find them very useful. 
When I need more > details, web-searching for the author of the patch and some relevant > keywords in the ChangeLog will often point at the relevant e-mail, so > burdening people with adding a direct URL seems pointless to me. It's > pessimizing the common case for a small optimization in far less > common cases. > This may possibly work when the mailing list entries exist and are accessible. However they are only available AFAIK from 1998. GCC has been going for 2-3 times as long as that. And there is at least one significant gap: February 2004 up to and including this message http://gcc.gnu.org/ml/gcc-patches/2004-02/msg02288.html. In my experience, when documentation is not stored with the source code, it often gets lost. When a person is offline the mailing list htmls are not available. I have an idea to resolve this that I am working on... more in due course if it comes to anything. Tim Josling From dberlin@dberlin.org Wed Dec 26 01:05:00 2007 From: dberlin@dberlin.org (Daniel Berlin) Date: Wed, 26 Dec 2007 01:05:00 -0000 Subject: Rant about ChangeLog entries and commit messages In-Reply-To: <1198609072.25084.14.camel@tim-gcc> References: <2007-12-02-11-05-39+trackit+sam@rfc1149.net> <200712022136.57819.ebotcazou@libertysurf.fr> <4aca3dc20712021240k19f3eae5j66453276179c401a@mail.gmail.com> <200712022355.23871.ebotcazou@libertysurf.fr> <4aca3dc20712021621n39a036d2u21f471f231dfffe@mail.gmail.com> <10712031329.AA20246@vlsi1.ultra.nyu.edu> <1198609072.25084.14.camel@tim-gcc> Message-ID: <4aca3dc20712251135r37774ec9w9010fe3cfa965668@mail.gmail.com> On Dec 25, 2007 1:57 PM, Tim Josling wrote: > On Sat, 2007-12-15 at 20:54 -0200, Alexandre Oliva wrote: > > On Dec 3, 2007, kenner@vlsi1.ultra.nyu.edu (Richard Kenner) wrote: > > > > > In my view, ChangeLog is mostly "write-only" from a developer's > > > perspective. It's a document that the GNU project requires us to > > produce > > > for > > > > ... a good example of compliance with the GPL: > > > > 5. 
Conveying Modified Source Versions.
> >
> > a) The work must carry prominent notices stating that you modified
> > it, and giving a relevant date.
> >
>
> (Minor quibble) As copyright owner of GCC, the FSF is not bound by the
> conditions of the licence it grants in the same way as licencees are
> bound. So I don't think this provision in itself would mandate that
> those who have copyright assignments to the FSF record their changes.
>
> I don't hear anyone arguing that people should not record what they
> changes and when. The question is whether it is sufficient.
>
> I just started using git locally, and I keep thinking it would be really
> great to have something like "git blame" for gcc.

svn already has blame :)

From vmakarov@redhat.com Wed Dec 26 01:16:00 2007
From: vmakarov@redhat.com (Vladimir N. Makarov)
Date: Wed, 26 Dec 2007 01:16:00 -0000
Subject: [lto] preliminary SPECint benchmark numbers
In-Reply-To: <20071224202148.GU14579@codesourcery.com>
References: <20071224202148.GU14579@codesourcery.com>
Message-ID: <4771A80D.4010203@redhat.com>

Here is my benchmarking of the current LTO branch on a 2.66GHz Core2
under RHEL 5 in 64- and 32-bit mode.

Vortex violates type aliasing rules, therefore it should be compiled
with -fno-strict-aliasing. Perlbmk crashed in tree.c::build2_stat in
32-bit mode when LTO was used. LTO currently generates wrong code for
176.gcc.

I've also checked the SpecFp2000 benchmarks written in C.

In brief,

o the code size (text segment) with LTO is much smaller (2.7% and 2.4%
  for SpecInt and 0.16% and 0.6% for SpecFp, respectively, in 64- and
  32-bit mode). That is very promising.

o the compilation is 2 times slower with LTO.

o the generated code is 3.6% and 2.2% slower for SPECint2000 and
  SpecFp2000 in 64-bit mode. It is also 6.7% slower for SpecInt2000 in
  32-bit mode. But the SpecFp2000 code generated with LTO in 32-bit mode
  is 20% faster! That is because art is almost 2.5 times faster with
  LTO.

More details can be found below.


--------------------------64-bit mode----------------------------
base: -O2 -mtune=generic
peak: -O2 -mtune=generic -flto

                    base    peak
164.gzip           1363*   1340*
175.vpr            1600*   1571*
176.gcc              X       X
181.mcf            1658*   1531*
186.crafty         2576*   2569*
197.parser         1269*   1158*
252.eon              X       X
253.perlbmk        2546*   2373*
254.gap            1987*   1965*
255.vortex         2259*   2208*
256.bzip2          1874*   1721*
300.twolf          2548*   2627*
SPECint2000 mean   1910    1841    -3.6%

Compilation time of SPECInt2000 (except for eon and gcc):
base: 65.02user 6.25system 1:15.41elapsed 94%CPU
peak: 130.62user 9.68system 2:45.20elapsed 84%CPU

                    base    peak
168.wupwise          X       X
171.swim             X       X
172.mgrid            X       X
173.applu            X       X
177.mesa           2426*   2314*
178.galgel           X       X
179.art            6276*   5519*
183.equake         1826*   1808*
187.facerec          X       X
188.ammp           1770*   1666*
189.lucas            X       X
191.fma3d            X       X
200.sixtrack         X       X
301.apsi             X       X
SPECfp_base2000    2649    2491    -2.2%

Compilation time of SPECFp2000 (only mesa, art, equake ammp):
17.32user 1.74system 0:20.42elapsed 93%CPU (0avgtext+0avgdata 0maxresident)k
35.52user 2.88system 0:42.86elapsed 89%CPU (0avgtext+0avgdata 0maxresident)k

text segment:
----------------CINT2000-----------------
 -6.144%     38962     36568   164.gzip
 -3.500%    147426    142266   175.vpr
 -4.313%     12613     12069   181.mcf
 -2.544%    172319    167935   186.crafty
 -5.566%    108797    102741   197.parser
 -5.436%    575443    544160   253.perlbmk
 -5.214%    494375    468599   254.gap
 -5.617%    556589    525325   255.vortex
 -3.209%     32532     31488   256.bzip2
  1.132%    198639    200887   300.twolf
Average = -2.69418%
----------------CFP2000-----------------
 -5.093%    522117    495526   177.mesa
  2.542%     16362     16778   179.art
  2.745%     19778     20321   183.equake
 -2.919%    142532    138372   188.ammp
Average = -0.160212%

--------------------------32-bit mode----------------------------
base: -m32 -O2 -mtune=generic
peak: -m32 -O2 -mtune=generic -flto

                    base    peak
164.gzip           1261*   1125*
175.vpr            1603*   1483*
176.gcc              X       X
181.mcf            3057*   2801*
186.crafty         1764*   1691*
197.parser         1397*   1224*
252.eon              X       X
253.perlbmk          X       X
254.gap            1981*   1778*
255.vortex         2013*   1914*
256.bzip2          1666*   1580*
300.twolf          2376*   2484*
SPECint2000 mean   1839    1716    -6.7%

Compilation time of SPECInt2000 (except for eon, gcc, and perlbmk):
49.36user 5.13system 0:58.57elapsed 93%CPU (0avgtext+0avgdata 0maxresident)k
99.32user 7.90system 1:56.63elapsed 91%CPU (0avgtext+0avgdata 0maxresident)k

                    base    peak
168.wupwise          X       X
171.swim             X       X
172.mgrid            X       X
173.applu            X       X
177.mesa           1362*   1325*
178.galgel           X       X
179.art            2786*   6197*
183.equake         1784*   1772*
187.facerec          X       X
188.ammp           1144*   1102*
189.lucas            X       X
191.fma3d            X       X
200.sixtrack         X       X
301.apsi             X       X
SPECfp2000 mean    1668    2001    +20%

Compilation time of SPECFp2000 (only mesa, art, equake ammp):
17.88user 1.85system 0:21.17elapsed 93%CPU (0avgtext+0avgdata 0maxresident)k
36.76user 2.83system 0:43.81elapsed 90%CPU (0avgtext+0avgdata 0maxresident)k

text segment:
----------------CINT2000-----------------
 -5.936%     35005     32927   164.gzip
 -5.125%    137683    130627   175.vpr
 -3.739%     10270      9886   181.mcf
 -1.379%    195472    192776   186.crafty
 -5.192%     94770     89850   197.parser
 -5.436%    575443    544160   253.perlbmk
 -4.400%    449316    429544   254.gap
 -2.219%    564982    552446   255.vortex
 -2.884%     30515     29635   256.bzip2
  0.167%    193748    194072   300.twolf
Average = -2.40954%
----------------CFP2000-----------------
 -5.796%    499738    470775   177.mesa
  0.458%     13971     14035   179.art
  0.303%     17467     17520   183.equake
 -5.176%    111429    105661   188.ammp
Average = -0.600618%

Nathan Froyd wrote:

>In one of my recent messages about a patch to the LTO branch, I
>mentioned that we could compile and successfully run all of the C
>SPECint benchmarks except 176.gcc. Chris Lattner asked if I had done
>any benchmarking now that real programs could be run; I said that I
>hadn't but would try to do some soon. This is the result of that.
>
>I don't have numbers on what compile times look like, but I don't think
>they're good. 176.gcc takes several minutes to compile (basically -flto
>*.o, not counting the time to compile individual .o files); the other
>benchmarks are all a minute or more apiece.
>
>Executive summary: LTO is currently *not* a win.
>
>In the table below, runtimes are in seconds. I ran the tests on an
>8-core 1.6GHz machine with 8 GB RAM. I believe the machine was
>relatively idle; I ran the tests over a weekend evening. The last merge
>from mainline to the LTO branch was mainline r130155, so that's about
>what the -O2 numbers correspond to--I don't think we've changed too much
>core code on the branch. The % change are just in-my-head estimates,
>using -O2 as a baseline.
>
>              -O2     -flto   % change
>164.gzip      174     176     + 1
>175.vpr       139     143     + 3
>181.mcf       162     166     + 3
>186.crafty    65.2    66.6    + < 1
>197.parser    240     261     + 9
>253.perlbmk   119     133     + 13
>254.gap       84.4    87      + 4
>256.bzip2     131     145     + 11
>300.twolf     202     193     - 4 (!)
>
>176.gcc doesn't run correctly with LTO yet; 255.vortex didn't run
>correctly with "mainline", but it did with -flto, which is curious. We
>don't do C++ yet, so 252.eon is not included.
>
>In general, things get worse with LTO, sometimes much worse. I can
>think of at least three possible reasons off the top of my head:
>
>- Alias information. We don't have any type-based alias information in
>  -flto, which hurts.
>
>- We don't merge types between compilation units, which could account
>  for poor optimization behavior.
>
>- I believe we lose some information in the LTO write/read process; edge
>  probabilities, estimated # instructions in functions, etc. get lost.
>  This hurts inlining decisions, block layout, alignment of jump
>  targets, etc. So there's information we need to write out or
>  recompute.
>
>-Nathan
>
>

From clattner@apple.com Wed Dec 26 02:11:00 2007
From: clattner@apple.com (Chris Lattner)
Date: Wed, 26 Dec 2007 02:11:00 -0000
Subject: [lto] preliminary SPECint benchmark numbers
In-Reply-To: <4771A80D.4010203@redhat.com>
References: <20071224202148.GU14579@codesourcery.com>
    <4771A80D.4010203@redhat.com>
Message-ID:

On Dec 25, 2007, at 5:02 PM, Vladimir N.
Makarov wrote: > Here is mine benchmarking of the current LTO branch on 2.66Ghz Core2 > under RHEL 5 in 64- and 32-bits mode. The vortex violates type > aliasing rules, therefore it should be compiled with > -fno-strict-aliasing. Perlbmk crashed in tree.c::build2_stat in > 32-bits mode when LTO used. LTO currently generates wrong code for > 176.gcc. I've also checked Specfp2000 benchmarks written in C. > > In brief, > > o the code size (text segment) with LTO is much smaller (2.7% and > 2.4% for SpecInt and 0.16% and 0.6% for SpecFp correspondingly in > 64- > and 32-bit mode). That is very promising. > o the compilation is 2 times slower with LTO. > o The generated code is slower 3.6% and 2.2% for SPECint2000 and > SpecFp2000 in 64-bit mode. It is also 6.7% slower for SpecInt2000 > in > 32-bit mode. But SpecFp2000 in 32-bit mode code generated with LTO > is 20% faster! It is because art is almost 2.5 times faster with > LTO. Wow, nice numbers! Is it possible to compare this to -combine, or does -combine work anymore? In theory, lto and IMA should yield the same codegen, lto should just be usable with normal makefiles. -Chris From kenner@vlsi1.ultra.nyu.edu Wed Dec 26 02:17:00 2007 From: kenner@vlsi1.ultra.nyu.edu (Richard Kenner) Date: Wed, 26 Dec 2007 02:17:00 -0000 Subject: Rant about ChangeLog entries and commit messages In-Reply-To: <1198609072.25084.14.camel@tim-gcc> References: <2007-12-02-11-05-39+trackit+sam@rfc1149.net> <200712022136.57819.ebotcazou@libertysurf.fr> <4aca3dc20712021240k19f3eae5j66453276179c401a@mail.gmail.com> <200712022355.23871.ebotcazou@libertysurf.fr> <4aca3dc20712021621n39a036d2u21f471f231dfffe@mail.gmail.com> <10712031329.AA20246@vlsi1.ultra.nyu.edu> <1198609072.25084.14.camel@tim-gcc> Message-ID: <10712260210.AA25957@vlsi1.ultra.nyu.edu> > (Minor quibble) As copyright owner of GCC, the FSF is not bound by the > conditions of the licence it grants in the same way as licencees are > bound. 
So I don't think this provision in itself would mandate that > those who have copyright assignments to the FSF record their changes. I was hoping nobody would notice that. ;-) > I don't hear anyone arguing that people should not record what they > changes and when. The question is whether it is sufficient. I think we all agree that it isn't, but figuring out where and what to put elsewhere has been tricky. From dewar@adacore.com Wed Dec 26 02:31:00 2007 From: dewar@adacore.com (Robert Dewar) Date: Wed, 26 Dec 2007 02:31:00 -0000 Subject: Rant about ChangeLog entries and commit messages In-Reply-To: <10712260210.AA25957@vlsi1.ultra.nyu.edu> References: <2007-12-02-11-05-39+trackit+sam@rfc1149.net> <200712022136.57819.ebotcazou@libertysurf.fr> <4aca3dc20712021240k19f3eae5j66453276179c401a@mail.gmail.com> <200712022355.23871.ebotcazou@libertysurf.fr> <4aca3dc20712021621n39a036d2u21f471f231dfffe@mail.gmail.com> <10712031329.AA20246@vlsi1.ultra.nyu.edu> <1198609072.25084.14.camel@tim-gcc> <10712260210.AA25957@vlsi1.ultra.nyu.edu> Message-ID: <4771B975.9020602@adacore.com> Richard Kenner wrote: >> (Minor quibble) As copyright owner of GCC, the FSF is not bound by the >> conditions of the licence it grants in the same way as licencees are >> bound. So I don't think this provision in itself would mandate that >> those who have copyright assignments to the FSF record their changes. > > I was hoping nobody would notice that. ;-) Actually I think this is wrong, the FSF holds the copyright by virtue of a copyright assignment which contains the guarantee that the software will be distributed under the GPL (at least that's my recollection of the assignment document). 
From kenner@vlsi1.ultra.nyu.edu Wed Dec 26 04:53:00 2007 From: kenner@vlsi1.ultra.nyu.edu (Richard Kenner) Date: Wed, 26 Dec 2007 04:53:00 -0000 Subject: Rant about ChangeLog entries and commit messages In-Reply-To: <4771B975.9020602@adacore.com> References: <2007-12-02-11-05-39+trackit+sam@rfc1149.net> <200712022136.57819.ebotcazou@libertysurf.fr> <4aca3dc20712021240k19f3eae5j66453276179c401a@mail.gmail.com> <200712022355.23871.ebotcazou@libertysurf.fr> <4aca3dc20712021621n39a036d2u21f471f231dfffe@mail.gmail.com> <10712031329.AA20246@vlsi1.ultra.nyu.edu> <1198609072.25084.14.camel@tim-gcc> <10712260210.AA25957@vlsi1.ultra.nyu.edu> <4771B975.9020602@adacore.com> Message-ID: <10712260231.AA26683@vlsi1.ultra.nyu.edu> > >> (Minor quibble) As copyright owner of GCC, the FSF is not bound by the > >> conditions of the licence it grants in the same way as licencees are > >> bound. So I don't think this provision in itself would mandate that > >> those who have copyright assignments to the FSF record their changes. > > > > I was hoping nobody would notice that. ;-) > > > Actually I think this is wrong, the FSF holds the copyright by virtue of > a copyright assignment which contains the guarantee that the software > will be distributed under the GPL (at least that's my recollection of the > assignment document). That part's true, but the cited part of the GPL applies only to somebody who makes a *modification* to the work of the copyright holder and redistributes that work. Such a condition doesn't apply to the FSF, who *is* the copyright holder. 

From aoliva@redhat.com Wed Dec 26 06:00:00 2007
From: aoliva@redhat.com (Alexandre Oliva)
Date: Wed, 26 Dec 2007 06:00:00 -0000
Subject: Rant about ChangeLog entries and commit messages
In-Reply-To: <1198609072.25084.14.camel@tim-gcc> (Tim Josling's message of "Wed\, 26 Dec 2007 05\:57\:52 +1100")
References: <2007-12-02-11-05-39+trackit+sam@rfc1149.net>
    <200712022136.57819.ebotcazou@libertysurf.fr>
    <4aca3dc20712021240k19f3eae5j66453276179c401a@mail.gmail.com>
    <200712022355.23871.ebotcazou@libertysurf.fr>
    <4aca3dc20712021621n39a036d2u21f471f231dfffe@mail.gmail.com>
    <10712031329.AA20246@vlsi1.ultra.nyu.edu>
    <1198609072.25084.14.camel@tim-gcc>
Message-ID:

On Dec 25, 2007, Tim Josling wrote:

> On Sat, 2007-12-15 at 20:54 -0200, Alexandre Oliva wrote:
>> ... a good example of compliance with the GPL:

>> 5. Conveying Modified Source Versions.
>>
>> a) The work must carry prominent notices stating that you modified
>> it, and giving a relevant date.

> (Minor quibble) As copyright owner of GCC, the FSF is not bound by the
> conditions of the licence it grants in the same way as licencees are
> bound.

Of course. That's exactly why I wrote "good example". It wouldn't be
nice if the FSF itself didn't set the example for others who modify
the code to follow.

On top of that, I believe whoever modifies the code and publishes the
modification, even if just to contribute it to the FSF, is bound by
the terms of the GPL, and therefore the code modification must carry
the required prominent notices. Of course the FSF, being copyright
holder, could choose to throw them all away.

> This may possibly work when the mailing list entries exist and are
> accessible.

True, off-line access to resources that are on-line only or even
permanently inaccessible doesn't work. Been there, done that, it's a
pain to deal with lack of information.
-- Alexandre Oliva http://www.lsd.ic.unicamp.br/~oliva/ FSF Latin America Board Member http://www.fsfla.org/ Red Hat Compiler Engineer aoliva@{redhat.com, gcc.gnu.org} Free Software Evangelist oliva@{lsd.ic.unicamp.br, gnu.org} From dragonylffly@gmail.com Wed Dec 26 07:26:00 2007 From: dragonylffly@gmail.com (Qing Wei) Date: Wed, 26 Dec 2007 07:26:00 -0000 Subject: How to describe a FMAC insn Message-ID: <4771EDE8.9030709@gmail.com> Hi, Could someone give some hints of how to describe a FMAC (float mult and add) insn in machine description, it matches d = b*c+a, which is a four operands float instrution. With a glimp through the array optabs[] in genopinit.c, it seems no OP handler could match FMAC operation? And I found a function gen_add_mult() in loops.c, but it also seems not very helpful. And another my question is, the element of optabs[] are arrays indexed by machine code, for example, add_optab[] indexed by SI, DI, QI, FI machine mode, not by number of operands, it seems it only matches 3 operands add operation,if I want to add a four operands add operation, what should I do? Qing From tprince@computer.org Wed Dec 26 09:08:00 2007 From: tprince@computer.org (Tim Prince) Date: Wed, 26 Dec 2007 09:08:00 -0000 Subject: How to describe a FMAC insn In-Reply-To: <4771EDE8.9030709@gmail.com> References: <4771EDE8.9030709@gmail.com> Message-ID: <477201FF.10802@computer.org> Qing Wei wrote: > Could someone give some hints of how to describe a FMAC (float mult and > add) insn in machine description, it matches d = b*c+a, which is a four > operands float instrution. There are plenty of examples in ia64.md and rs6000.md. From tbm@cyrius.com Wed Dec 26 13:12:00 2007 From: tbm@cyrius.com (Martin Michlmayr) Date: Wed, 26 Dec 2007 13:12:00 -0000 Subject: Status of GCC 4.3 on Alpha (Debian) Message-ID: <20071226090801.GA15225@deprecation.cyrius.com> I recently compiled the Debian archive on Alpha using trunk to identify new issues before GCC 4.3 is released. 
I actually started a first attempt in the middle of November but had to
stop after about 2500 packages because of hardware problems. During my
first attempt, I found the following Alpha-related compiler bugs:

 - PR34132: internal consistency failure (invalid rtl sharing found in
   the insn)
   Fixed by Jakub Jelinek
 - PR34171: Segfault in df_chain_remove_problem with -O3 on alpha
   Fixed by Seongbae Park

My second attempt was more successful and I compiled the entire Debian
archive (over 7000 packages that need to be compiled). I compiled the
archive with optimization set to -O3 and found the following ICEs with
trunk from 20071212:

 - PR34571: Segfault in alpha_expand_mov at -O3 (1 failure)
   Filed two days ago so no progress yet
 - PR33410: ICE in iv_analyze_expr, at loop-iv.c (14 failures)
   This is actually a known regression from 4.1 that became a problem
   for us when we moved to GCC 4.2. Unfortunately, nobody seems to be
   investigating this issue.
 - PR34467: ICE in lookup_subvars_for_var, at tree-flow-inline.h:1586
   (5 failures)
   Aldy has a patch but I'm not sure what the status is.
 - PR34585: ICE in remove_useless_stmts_1, at tree-cfg.c:1863 (1 failure)
   Probably the same issue
 - PR34465: verify_stmts failed (incorrect sharing of tree nodes)
   (1 failure)
   I believe this is fixed by Aldy's patch too

If someone can look at PR34571 and particularly at PR33410, Alpha will
be in pretty good shape for 4.3.

I should also note that a couple of software packages failed to build
because of problems with their testsuites. I still have to investigate
these failures.

The testing was done with 4.3.0 20071212 r130789 from 2007-12-12 to
2007-12-24.

Thanks to Alexander Wong of St Hugh's College, Oxford for giving me
remote access to an Alpha, and to Florian Lohoff for hosting two DS25
Alpha servers that Richard Higson kindly donated to Debian.
--
Martin Michlmayr
http://www.cyrius.com/

From kenner@vlsi1.ultra.nyu.edu Wed Dec 26 15:00:00 2007
From: kenner@vlsi1.ultra.nyu.edu (Richard Kenner)
Date: Wed, 26 Dec 2007 15:00:00 -0000
Subject: How to describe a FMAC insn
In-Reply-To: <4771EDE8.9030709@gmail.com>
References: <4771EDE8.9030709@gmail.com>
Message-ID: <10712261312.AA27810@vlsi1.ultra.nyu.edu>

> Could someone give some hints of how to describe a FMAC (float mult and
> add) insn in machine description, it matches d = b*c+a, which is a four
> operands float instruction. With a glimpse through the array optabs[] in
> genopinit.c, it seems no OP handler could match FMAC operation?

Correct. It isn't generated directly during RTL generation, but instead
created by the optimizer (combine) from add and multiply insns.

From dragonylffly@gmail.com Wed Dec 26 17:38:00 2007
From: dragonylffly@gmail.com (Qing Wei)
Date: Wed, 26 Dec 2007 17:38:00 -0000
Subject: How to describe a FMAC insn
In-Reply-To: <477201FF.10802@computer.org>
References: <4771EDE8.9030709@gmail.com> <477201FF.10802@computer.org>
Message-ID: <47726C5B.4050701@gmail.com>

I tried following ia64.md, but unfortunately it does not work. The
insn I wrote for FMAC is as follows:

(define_insn "maddsi4"
  [(set (match_operand:SI 0 "register_operand" "=r")
        (plus:SI (mult:SI (match_operand:SI 1 "register_operand" "r")
                          (match_operand:SI 2 "register_operand" "r"))
                 (match_operand:SI 3 "register_operand" "r")))]
  ""
  "fma %0, %1, %2, %3")

Besides this, I defined two other insns for the dedicated add and mult
operations as follows:

(define_insn "addsi3"
  [(set (match_operand:SI 0 "register_operand" "=r")
        (plus:SI (match_operand:SI 1 "register_operand" "r")
                 (match_operand:SI 2 "register_operand" "r")))]
  ""
  "add %0, %1, %2")

(define_insn "mulsi3"
  [(set (match_operand:SI 0 "register_operand" "=r")
        (mult:SI (match_operand:SI 1 "register_operand" "r")
                 (match_operand:SI 2 "register_operand" "r")))]
  ""
  "mul %0, %1, %2")

It seems trivial.
But after I rebuilt GCC for this new target, I found that no optabs
entry is initialized for maddsi4 in insn-opinit.c, which is generated
by genopinit. However, add_optab and smul_optab are initialized with
CODE_FOR_addsi3/mulsi3. As a result, when I test the following simple
program, cc1 produces separate add and mul instructions rather than
fma. Where is the problem? Thanks.

void f(int s1[], int s2[], int s3[], int s4[])
{
  int j;

  for (j = 0; j < 16; j++)
    s4[j] = s1[j]*s2[j]+s3[j];
}

Qing

>> Could someone give some hints of how to describe a FMAC (float mult and
>> add) insn in machine description, it matches d = b*c+a, which is a four
>> operands float instruction.
>>
> There are plenty of examples in ia64.md and rs6000.md.
>

From rask@sygehus.dk Wed Dec 26 18:54:00 2007
From: rask@sygehus.dk (Rask Ingemann Lambertsen)
Date: Wed, 26 Dec 2007 18:54:00 -0000
Subject: How to describe a FMAC insn
In-Reply-To: <47726C5B.4050701@gmail.com>
References: <4771EDE8.9030709@gmail.com> <477201FF.10802@computer.org>
    <47726C5B.4050701@gmail.com>
Message-ID: <20071226173755.GP17368@sygehus.dk>

On Wed, Dec 26, 2007 at 06:59:39AM -0800, Qing Wei wrote:
> I tried by referring the ia64.md, unfortunately it does not work. The
> insn I wrote for FMAC is as follows,
>
> (define_insn "maddsi4"
>   [(set (match_operand:SI 0 "register_operand" "=r")
>         (plus:SI (mult:SI (match_operand:SI 1 "register_operand" "r")
>                           (match_operand:SI 2 "register_operand" "r"))
>                  (match_operand:SI 3 "register_operand" "r")))]
>   ""
>   "fma %0, %1, %2, %3")
[...]
> It seems trivial. But after I rebuilt GCC for this new target, I found
> that no optabs entry is initialized for maddsi4 in insn-opinit.c which
> is generated by genopinit.

It would be called maddhisi4, maddsidi4 or so for a sign-extending
instruction. If your instruction is a plain multiply-add instruction
(which is how you've defined it above), then there is no optab for it.
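
A minimal C sketch of the distinction drawn above, added as an editorial illustration and not part of the original message. It assumes a 32-bit int (SImode) and a 16-bit short (HImode); the function names madd_plain and madd_widening are hypothetical. A plain multiply-add keeps every operand in one mode, while the widening form that a name such as maddhisi4 describes sign-extends two narrow operands before multiplying and accumulates in the wider mode.

```c
#include <stdint.h>

/* Plain multiply-add: all four operands share one mode (SImode here).
   This is the shape of the "maddsi4" pattern quoted above.
   The function name is hypothetical. */
int32_t madd_plain(int32_t b, int32_t c, int32_t a)
{
    return b * c + a;
}

/* Widening multiply-add: two 16-bit (HImode) inputs are sign-extended
   to 32 bits (SImode) before the multiply; the addend and the result
   are 32 bits wide.  This is the shape a name such as maddhisi4
   describes.  The function name is hypothetical. */
int32_t madd_widening(int16_t b, int16_t c, int32_t a)
{
    return (int32_t)b * (int32_t)c + a;
}
```

For example, madd_widening(-300, 200, 7) multiplies the sign-extended inputs to get -60000 and returns -59993. Whether a given target turns either function into a single instruction depends on its machine description and on what combine manages to match.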
> However, the add_optab and smul_optab do be > initialized with Code_for_addsi3/mulsi3. As a result, when I test the > following simple program, cc1 produces separate add and mul instructions > rather than fma, where the problem is? Thanks. Look at the dump file produced by -fdump-rtl-combine-details. -- Rask Ingemann Lambertsen Danish law requires addresses in e-mail to be logged and stored for a year From mark@codesourcery.com Wed Dec 26 19:10:00 2007 From: mark@codesourcery.com (Mark Mitchell) Date: Wed, 26 Dec 2007 19:10:00 -0000 Subject: __builtin_expect for indirect function calls In-Reply-To: <20071221220630.Y67443@dair.pair.com> References: <20071218000552.GV3656@playstation.sony.com> <20071221220630.Y67443@dair.pair.com> Message-ID: <4772A350.6030907@codesourcery.com> Hans-Peter Nilsson wrote: > On Mon, 17 Dec 2007, trevor_smigiel@playstation.sony.com wrote: >> When we can't hint the real target, we want to hint the most common >> target. There are potentially clever ways for the compiler to do this >> automatically, but I'm most interested in giving the user some way to do >> it explicitly. One possiblity is to have something similar to >> __builtin_expect, but for functions. For example, I propose: >> >> __builtin_expect_call (FP, PFP) > > Is there a hidden benefit? I mean, isn't this really > expressable using builtin_expect as-is, at least when it comes > to the syntax? That was my first thought as well. Before we add __builtin_expect_call, I think there needs to be a justification of why this can't be done with __builtin_expect as-is. -- Mark Mitchell CodeSourcery mark@codesourcery.com (650) 331-3385 x713 From mark@codesourcery.com Wed Dec 26 19:25:00 2007 From: mark@codesourcery.com (Mark Mitchell) Date: Wed, 26 Dec 2007 19:25:00 -0000 Subject: Regression count, and how to keep bugs around forever In-Reply-To: References: <571f6b510712181659w64b16ae5ndc32b38de6f5c56c@mail.gmail.com> Message-ID: <4772A710.8070500@codesourcery.com> Joseph S. 
Myers wrote: > On Wed, 19 Dec 2007, Steven Bosscher wrote: > >> The bigger issue here, is that people seem to be using Bugzilla as a >> kind-of TODO list for things may some day work on, but probably will > > I don't see any problem with that. Me neither. In fact, I think there's a lot of value in a central database of "all known bugs"; it helps users figure out whether something they're running into is something they've run into before. However, I am sympathetic to the idea that we need ways to see what's important to work on now. The problem there is that importance is in the eye of the beholder. The PN system expressions something about regressions that's moderately useful, but nothing else. I suspect that we need more database fields, so that people could run more interesting searches. -- Mark Mitchell CodeSourcery mark@codesourcery.com (650) 331-3385 x713 From mark@codesourcery.com Wed Dec 26 20:05:00 2007 From: mark@codesourcery.com (Mark Mitchell) Date: Wed, 26 Dec 2007 20:05:00 -0000 Subject: Problem with ARM_DOUBLEWORD_ALIGN on ARM In-Reply-To: <20071121233222.B5EF748CC02@nile.gnat.com> References: <20071121233222.B5EF748CC02@nile.gnat.com> Message-ID: <4772AA79.1060603@codesourcery.com> Geert Bosch wrote: > Nested functions aren't used that much in C indeed... :) Paul, would you please review this patch? > --- arm.c.orig 2007-11-20 16:27:04.000000000 -0500 > +++ arm.c 2007-11-21 18:15:18.000000000 -0500 > @@ -10448,6 +10448,14 @@ arm_get_frame_offsets (void) > /* Saved registers include the stack frame. */ > offsets->saved_regs = offsets->saved_args + saved; > offsets->soft_frame = offsets->saved_regs + CALLER_INTERWORKING_SLOT_SIZE; > + > + /* Allow for storage of static chain when it needs its own space in the > + frame. */ > + if (IS_NESTED (arm_current_func_type ()) > + && regs_ever_live[3] > + && offsets->saved_args == 0) > + offsets->soft_frame += 4; > + > /* A leaf function does not need any stack alignment if it has nothing > on the stack. 
*/ > if (leaf && frame_size == 0) > @@ -10569,6 +10577,7 @@ arm_expand_prologue (void) > unsigned long live_regs_mask; > unsigned long func_type; > int fp_offset = 0; > + int static_chain_size = 0; > int saved_pretend_args = 0; > int saved_regs = 0; > unsigned HOST_WIDE_INT args_to_push; > @@ -10643,6 +10652,7 @@ arm_expand_prologue (void) > insn = emit_insn (insn); > > fp_offset = 4; > + static_chain_size = 4; > > /* Just tell the dwarf backend that we adjusted SP. */ > dwarf = gen_rtx_SET (VOIDmode, stack_pointer_rtx, > @@ -10836,14 +10846,15 @@ arm_expand_prologue (void) > } > > offsets = arm_get_frame_offsets (); > - if (offsets->outgoing_args != offsets->saved_args + saved_regs) > + if (offsets->outgoing_args != offsets->saved_args + saved_regs > + + static_chain_size) > { > /* This add can produce multiple insns for a large constant, so we > need to get tricky. */ > rtx last = get_last_insn (); > > amount = GEN_INT (offsets->saved_args + saved_regs > - - offsets->outgoing_args); > + + static_chain_size - offsets->outgoing_args); > > insn = emit_insn (gen_addsi3 (stack_pointer_rtx, stack_pointer_rtx, > amount)); > -- Mark Mitchell CodeSourcery mark@codesourcery.com (650) 331-3385 x713 From bosch@gnat.com Wed Dec 26 22:44:00 2007 From: bosch@gnat.com (Geert Bosch) Date: Wed, 26 Dec 2007 22:44:00 -0000 Subject: Problem with ARM_DOUBLEWORD_ALIGN on ARM In-Reply-To: <4772AA79.1060603@codesourcery.com> References: <20071121233222.B5EF748CC02@nile.gnat.com> <4772AA79.1060603@codesourcery.com> Message-ID: <9FDC17D3-43DE-4D5F-898E-1C6CD14EE5B1@gnat.com> On Dec 26, 2007, at 14:24, Mark Mitchell wrote: > Geert Bosch wrote: > >> Nested functions aren't used that much in C indeed... :) > > Paul, would you please review this patch? This patch isn't good. While it addressed the alignment issue, it didn't correctly adjust all necessary offset computations. One of the problems is that this port has grown many magic adjustments and mostly similar recalculations. 
Olivier Hainque has refactored some of the code and has a proper
patch that has had a good amount of testing now. We also have a
testcase in C. We'll submit this for review soon.

In the meantime, please consider this patch withdrawn.

  -Geert

From gccadmin@gcc.gnu.org Thu Dec 27 23:05:00 2007
From: gccadmin@gcc.gnu.org (gccadmin@gcc.gnu.org)
Date: Thu, 27 Dec 2007 23:05:00 -0000
Subject: gcc-4.2-20071226 is now available
Message-ID: <20071226224401.4964.qmail@sourceware.org>

Snapshot gcc-4.2-20071226 is now available on
  ftp://gcc.gnu.org/pub/gcc/snapshots/4.2-20071226/
and on various mirrors, see http://gcc.gnu.org/mirrors.html for details.

This snapshot has been generated from the GCC 4.2 SVN branch
with the following options: svn://gcc.gnu.org/svn/gcc/branches/gcc-4_2-branch revision 131189

You'll find:

gcc-4.2-20071226.tar.bz2             Complete GCC (includes all of below)
gcc-core-4.2-20071226.tar.bz2        C front end and core compiler
gcc-ada-4.2-20071226.tar.bz2         Ada front end and runtime
gcc-fortran-4.2-20071226.tar.bz2     Fortran front end and runtime
gcc-g++-4.2-20071226.tar.bz2         C++ front end and runtime
gcc-java-4.2-20071226.tar.bz2        Java front end and runtime
gcc-objc-4.2-20071226.tar.bz2        Objective-C front end and runtime
gcc-testsuite-4.2-20071226.tar.bz2   The GCC testsuite

Diffs from 4.2-20071219 are available in the diffs/ subdirectory.

When a particular snapshot is ready for public consumption the LATEST-4.2
link is updated and a message is sent to the gcc list. Please do not use
a snapshot before it has been announced that way.

From dominiq@lps.ens.fr Fri Dec 28 11:22:00 2007
From: dominiq@lps.ens.fr (Dominique Dhumieres)
Date: Fri, 28 Dec 2007 11:22:00 -0000
Subject: 10 to 20% speedup with -m64 on Intel Core2Duo
Message-ID: <20071227230546.EFFF75BB9E@mailhost.lps.ens.fr>

Some time ago I had a look at pr30388 and got the following results:

            g77 -O2   g95 -O2   gfc -O2   gfc -m64 -O2
MFLOPS:      1063      1061       858        1129
ref. g77                         -19%        +6%

Since the evening is quite calm I decided to check if this speedup with
-m64 is generic or not and I got the following timings for the
Polyhedron test suite:

================================================================================
Date & Time     : 27 Dec 2007 22:24:03
Test Name       : pbharness
Compile Command : gfc %n.f90 -m64 -O3 -ffast-math -funroll-loops -finline-limit=600 --param min-vect-loop-bound=2 -o %n
Benchmarks      : ac aermod air capacita channel doduc fatigue gas_dyn induct linpk mdbx nf protein rnflow test_fpu tfft
Maximum Times   :    300.0
Target Error %  :    0.200
Minimum Repeats :    2
Maximum Repeats :    5

   Benchmark   Compile  Executable   Ave Run  Number   Estim
        Name    (secs)     (bytes)    (secs) Repeats   Err %
   ---------   -------  ----------   -------  -------  ------
          ac      4.27       50712     13.10        2  0.0420
      aermod    100.72     1200712     30.19        2  0.0066
         air      6.68       73204      9.37        2  0.0267
    capacita      3.92       64520     56.49        2  0.0628
     channel      2.43       42752      2.29        2  0.0437
       doduc     14.42      179504     48.66        2  0.0021
     fatigue      5.69       76696     11.17        5  0.3700
     gas_dyn      6.32      700392     10.24        5  0.7605
      induct     12.79      160672     66.27        2  0.0053
       linpk      1.53       38400     27.54        2  0.0000
        mdbx      3.77       68856     15.16        2  0.0099
          nf     11.69      112312     31.63        2  0.0174
     protein     10.71      110048     46.78        2  0.0064
      rnflow     10.95      163144     37.28        2  0.0268
    test_fpu     10.08      150080     12.72        2  0.0314
        tfft      1.37       30488      2.79        2  0.1074

Geometric Mean Execution Time =      18.20 seconds

================================================================================
Date & Time     : 27 Dec 2007 22:44:36
Test Name       : pbharness
Compile Command : gfc %n.f90 -O3 -ffast-math -funroll-loops -finline-limit=600 --param min-vect-loop-bound=2 -o %n
Benchmarks      : ac aermod air capacita channel doduc fatigue gas_dyn induct linpk mdbx nf protein rnflow test_fpu tfft
Maximum Times   :    300.0
Target Error %  :    0.200
Minimum Repeats :    2
Maximum Repeats :    5

   Benchmark   Compile  Executable   Ave Run  Number   Estim
        Name    (secs)     (bytes)    (secs) Repeats   Err %
   ---------   -------  ----------   -------  -------  ------
          ac      4.48       46532     16.88        2  0.0207
      aermod    104.92     1288460     37.09        2  0.0081
         air      6.67       80956     11.36        5  0.0849
    capacita      3.79       68332     62.40        2  0.0048
     channel      2.65       50780      2.51        4  0.1828
       doduc     14.27      183264     57.41        2  0.0009
     fatigue      6.11       84564     14.02        2  0.0642
     gas_dyn      5.93      699872     12.01        5  0.2754
      induct     11.83      160132     73.59        2  0.0177
       linpk      1.67       46512     27.57        2  0.0145
        mdbx      3.84       72672     16.78        2  0.0149
          nf     16.73      157220     31.86        2  0.0016
     protein     11.62      113868     54.90        2  0.0337
      rnflow     11.87      187316     45.56        2  0.0889
    test_fpu     11.38      182544     14.56        2  0.0653
        tfft      1.44       34420      3.03        5  0.2973

Geometric Mean Execution Time =      20.86 seconds

================================================================================
Polyhedron Benchmark Validator
Copyright (C) Polyhedron Software Ltd - 2004 - All rights reserved

The results were obtained on an Intel Core2Duo 2.16Ghz with 2Gb of RAM
under Darwin9.1 with gfortran 4.3 at revision 131206.

Is this 10 to 20% speedup with -m64 expected? And how generic is it?

In the assembly code of the inner loop of the test case in PR30388, the
main differences I can see are at the level of the addressing:
%eax, %ebp, ... in 32 bit mode and %rn, ... in 64 bit mode.

TIA

Dominique

From taren@earthlink.net Fri Dec 28 14:20:00 2007
From: taren@earthlink.net (Taren)
Date: Fri, 28 Dec 2007 14:20:00 -0000
Subject: gcc-4.3-20071221 compilation problems
Message-ID: <4774DC4C.2080600@earthlink.net>

I've been trying to compile gcc-4.3-20071221 on a sunblade 1000, running
Solaris 10 (SunOS faile 5.10 Generic_127111-05 sun4u sparc
SUNW,Sun-Blade-1000), and have been running into memory issues. I keep
getting the error message that my system is running out of virtual
memory (virtual memory exhausted: Not enough space). I have 4GB of swap
and 4GB of RAM, so I shouldn't be having this problem.
The error I'm getting is shown below:

/source/gcc/gcc-4.3-20071221/host-sparc-sun-solaris2.10/prev-gcc/xgcc -B/source/gcc/gcc-4.3-20071221/host-sparc-sun-solaris2.10/prev-gcc/ -B/usr/local/sparc-sun-solaris2.10/bin/ -c -g -O2 -DIN_GCC -W -Wall -Wwrite-strings -Wstrict-prototypes -Wmissing-prototypes -Wold-style-definition -Wmissing-format-attribute -pedantic -Wno-long-long -Wno-variadic-macros -Wno-overlength-strings -Werror -fno-common -DHAVE_CONFIG_H -I. -I. -I../.././gcc -I../.././gcc/. -I../.././gcc/../include -I./../intl -I../.././gcc/../libcpp/include -I/usr/local/include -I/usr/local/include -I/usr/local/include/ -I../.././gcc/../libdecnumber -I../.././gcc/../libdecnumber/dpd -I../libdecnumber -I/usr/local/include ../.././gcc/c-decl.c -o c-decl.o

virtual memory exhausted: Not enough space
make[3]: *** [c-decl.o] Error 1
make[3]: Leaving directory `/source/gcc/gcc-4.3-20071221/host-sparc-sun-solaris2.10/gcc'
make[2]: *** [all-stage2-gcc] Error 2
make[2]: Leaving directory `/source/gcc/gcc-4.3-20071221'
make[1]: *** [stage2-bubble] Error 2
make[1]: Leaving directory `/source/gcc/gcc-4.3-20071221'
make: *** [all] Error 2

The output of top shows (when the compilation isn't in progress):

last pid: 17852; load avg: 0.61, 1.25, 1.41; up 2+00:03:27 06:13:41
143 processes: 141 sleeping, 1 zombie, 1 on cpu
CPU states: 84.2% idle, 10.4% user, 5.4% kernel, 0.0% iowait, 0.0% swap
Memory: 4096M phys mem, 2835M free mem, 4097M total swap, 4097M free swap

Richard Paynter
taren@earthlink.net

From ismail@pardus.org.tr Fri Dec 28 22:47:00 2007
From: ismail@pardus.org.tr (Ismail =?utf-8?q?D=C3=B6nmez?=)
Date: Fri, 28 Dec 2007 22:47:00 -0000
Subject: glibc 2.7 complex functions are possibly miscompiled by gcc 4.3 trunk
In-Reply-To: <200712221911.32786.ismail@pardus.org.tr>
References: <200712221911.32786.ismail@pardus.org.tr>
Message-ID: <200712281620.52479.ismail@pardus.org.tr>

On Saturday 22 December 2007 19:11:32, Ismail Dönmez wrote:
> Hi all,
>
> I am doing glibc 4.3 regression tests using gcc 4.3 trunk nearly every day
> and I see 3 tests fail :
>
> math/test-float
> math/test-ildoubl
> math/test-ifloat
>
> The errors are all similar :
>
> Failure: Test: Imaginary part of: cacosh (-0 + 0 i) == 0.0 + pi/2 i
> Result:
>  is:         0.00000000000000000000e+00 0x0.00000000000000000000p+0
>  should be:  1.57079637050628662109e+00 0x1.921fb600000000000000p+0
>  difference: 1.57079637050628662109e+00 0x1.921fb600000000000000p+0
>  ulp       : 13176795.0000
>  max.ulp   : 0.0000

All these failures are gone when glibc is compiled with -O2 instead of -O3, but there are still 4 regressions:

math/test-ildoubl

Usual math problem :

testing long double (inline functions)
Failure: Test: expm1 (1) == M_El - 1.0
Result:
 is:         1.71828182845904523532e+00 0xd.bf0a8b14576953500000p-3
 should be:  1.71828182845904523543e+00 0xd.bf0a8b14576953600000p-3
 difference: 1.08420217248550443401e-19 0x8.00000000000000000000p-66
 ulp       : 1.0000
 max.ulp   : 0.0000
Maximal error of `expm1'
 is      : 1 ulp
 accepted: 0 ulp

libio/tst-fopenloc2
libio/tst-fopenloc

These two seem to be a new gcc regression; they crash when compiled with gcc trunk.

elf/check-localplt

This one seems to be less harmful; it shows that memalign is missing from the expected output.

Any ideas appreciated.

Regards,
ismail

-- 
Never learn by your mistakes, if you do you may never dare to try again.

From gccadmin@gcc.gnu.org Sat Dec 29 01:02:00 2007
From: gccadmin@gcc.gnu.org (gccadmin@gcc.gnu.org)
Date: Sat, 29 Dec 2007 01:02:00 -0000
Subject: gcc-4.3-20071228 is now available
Message-ID: <20071228224731.1261.qmail@sourceware.org>

Snapshot gcc-4.3-20071228 is now available on ftp://gcc.gnu.org/pub/gcc/snapshots/4.3-20071228/ and on various mirrors, see http://gcc.gnu.org/mirrors.html for details.
This snapshot has been generated from the GCC 4.3 SVN branch with the following options: svn://gcc.gnu.org/svn/gcc/trunk revision 131213

You'll find:

gcc-4.3-20071228.tar.bz2            Complete GCC (includes all of below)
gcc-core-4.3-20071228.tar.bz2       C front end and core compiler
gcc-ada-4.3-20071228.tar.bz2        Ada front end and runtime
gcc-fortran-4.3-20071228.tar.bz2    Fortran front end and runtime
gcc-g++-4.3-20071228.tar.bz2        C++ front end and runtime
gcc-java-4.3-20071228.tar.bz2       Java front end and runtime
gcc-objc-4.3-20071228.tar.bz2       Objective-C front end and runtime
gcc-testsuite-4.3-20071228.tar.bz2  The GCC testsuite

Diffs from 4.3-20071221 are available in the diffs/ subdirectory.

When a particular snapshot is ready for public consumption the LATEST-4.3 link is updated and a message is sent to the gcc list. Please do not use a snapshot before it has been announced that way.

From ismail@pardus.org.tr Sat Dec 29 06:32:00 2007
From: ismail@pardus.org.tr (Ismail =?utf-8?q?D=C3=B6nmez?=)
Date: Sat, 29 Dec 2007 06:32:00 -0000
Subject: glibc 2.7 complex functions are possibly miscompiled by gcc 4.3 trunk
In-Reply-To: <200712281620.52479.ismail@pardus.org.tr>
References: <200712221911.32786.ismail@pardus.org.tr> <200712281620.52479.ismail@pardus.org.tr>
Message-ID: <200712290303.12064.ismail@pardus.org.tr>

On Friday 28 December 2007 16:20:52, Ismail Dönmez wrote:
> libio/tst-fopenloc2
> libio/tst-fopenloc
>
> These two seems to be a new gcc regression, they crash when compiled with
> gcc trunk.

OK, I identified that commit 130788 [0] broke these testcases; the same commit seems to be the cause of PR34465.

[0] http://gcc.gnu.org/viewcvs?view=rev&revision=130788

Regards,
ismail

-- 
Never learn by your mistakes, if you do you may never dare to try again.
From tbptbp@gmail.com Sat Dec 29 15:35:00 2007
From: tbptbp@gmail.com (tbp)
Date: Sat, 29 Dec 2007 15:35:00 -0000
Subject: censored naked SSE reciprocals, -mrecip
Message-ID: <4fc48eb10712282232j7ec01502qdca0c64dd2710532@mail.gmail.com>

Merry xmas,

i lately had some use for -mrecip but it turned out to come with all sorts of strings attached and apparently no opt-out. Briefly, barring inline asm, i can't get gcc to emit those ops without a Newton-Raphson (NR) fixup.

# cat src/pr-recip.c
#include <xmmintrin.h>  /* SSE intrinsics: _mm_sqrt_ps, _mm_rsqrt_ps, _mm_rcp_ps */
typedef float v4sf_t __attribute__ ((__vector_size__ (16)));
/* intrinsic versions */
__m128 foo(__m128 a) { return _mm_sqrt_ps(a); }
__m128 bar(__m128 a) { return _mm_rsqrt_ps(a); }
__m128 baz(__m128 a) { return _mm_rcp_ps(a); }
/* same operations through the raw builtins */
v4sf_t nope1(v4sf_t a) { return __builtin_ia32_sqrtps(a); }
v4sf_t nope2(v4sf_t a) { return __builtin_ia32_rsqrtps(a); }
v4sf_t allright(v4sf_t a) { return __builtin_ia32_rcpps(a); }
int main() { return 0; }

# /usr/local/gcc-4.3-20071221/bin/gcc -march=native -ffast-math -mrecip -O2 src/pr-recip.c

... and as can be witnessed in the attached asm dump, foo, bar, nope1 and nope2 get mangled (at least on x86-64 linux).

While i can somehow understand the logic behind the automatic transformation of _mm_sqrt_ps - it can be argued that's what the user has asked for - there's no obvious way to opt out. But then i really don't understand why gcc feels the urge to tinker when i specifically ask for an rsqrt. To add insult to injury, -mrecip, unlike fast-math, doesn't set any macro, so kludging around is a cat / mouse game.

Questions:
a) is that really by design?
b) what's the official way to dodge fixups when -mrecip is active?
c) any chance for -mrecip to set __FAST_MATH_NONE_SHALL_PASS__ or something?

-------------- next part --------------
A non-text attachment was scrubbed...
Name: dump.asm Type: application/octet-stream Size: 3705 bytes Desc: not available URL: From ubizjak@gmail.com Sat Dec 29 16:11:00 2007 From: ubizjak@gmail.com (Uros Bizjak) Date: Sat, 29 Dec 2007 16:11:00 -0000 Subject: censored naked SSE reciprocals, -mrecip Message-ID: <47766939.2030505@gmail.com> Hello! > i lately had some use for -mrecip but it turned out to come with all > sorts of strings attached and apparently no opt-out. Briefly, barring > inline asm, i can't get gcc to emit those ops without a NR fixup. > > Questions: > a) is that really by design? No. Attached patch fixes these problems by using correct shortcuts when generating intrinsic functions. 2007-12-29 Uros Bizjak * config/i386/sse.md ("*divv4sf3"): Rename to "sse_divv4sf3". ("*sse_rsqrtv4sf2"): Export. ("*sse_sqrtv4sf2"): Ditto. * config/i386/i386.c (enum ix86_builtins) [IX86_BUILTIN_RSQRTPS_NR, IX86_BUILTIN_SQRTPS_NR]: New constants. (struct builtin_description) [IX86_BUILTIN_DIVPS]: Use CODE_FOR_sse_divv4sf3. [IX86_BUILTIN_SQRTPS]: Use CODE_FOR_sse_sqrtv4sf2. [IX86_BUILTIN_SQRTPS_NR]: New. [IX86_BUILTIN_RSQRTPS_NR]: Ditto. (ix86_init_mmx_sse_builtins): Initialize __builtin_ia32_rsqrtps_nr and __builtin_ia32_sqrtps_nr. (ix86_builtin_vectorized_function): Convert BUILT_IN_SQRTF to IX86_BUILTIN_SQRTPS_NR. (ix86_builtin_reciprocal): Convert IX86_BUILTIN_SQRTPS_NR to IX86_BUILTIN_RSQRTPS_NR. Patch was bootstrapped and regression tested with {,-m32} on x86_64-pc-linux-gnu. Patch is committed to SVN. Thanks a lot for your report, Uros. -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... 
Name: r.diff.txt
URL:

From ismail@pardus.org.tr Sat Dec 29 17:49:00 2007
From: ismail@pardus.org.tr (Ismail =?utf-8?q?D=C3=B6nmez?=)
Date: Sat, 29 Dec 2007 17:49:00 -0000
Subject: glibc 2.7 complex functions are possibly miscompiled by gcc 4.3 trunk
In-Reply-To: <200712221911.32786.ismail@pardus.org.tr>
References: <200712221911.32786.ismail@pardus.org.tr>
Message-ID: <200712291811.47936.ismail@pardus.org.tr>

On Saturday 22 December 2007 19:11:32, Ismail Dönmez wrote:
> Hi all,
>
> I am doing glibc 4.3 regression tests using gcc 4.3 trunk nearly every day
> and I see 3 tests fail :
>
> math/test-float
> math/test-ildoubl
> math/test-ifloat
>
> The errors are all similar :
>
> Failure: Test: Imaginary part of: cacosh (-0 + 0 i) == 0.0 + pi/2 i
> Result:
>  is:         0.00000000000000000000e+00 0x0.00000000000000000000p+0
>  should be:  1.57079637050628662109e+00 0x1.921fb600000000000000p+0
>  difference: 1.57079637050628662109e+00 0x1.921fb600000000000000p+0
>  ulp       : 13176795.0000
>  max.ulp   : 0.0000

Replying to myself once again: these failures are due to the -fgcse-after-reload flag; -O3 -fno-gcse-after-reload cures this. Any tips on how to debug this?

Regards,
ismail

-- 
Never learn by your mistakes, if you do you may never dare to try again.

From iant@google.com Sat Dec 29 18:12:00 2007
From: iant@google.com (Ian Lance Taylor)
Date: Sat, 29 Dec 2007 18:12:00 -0000
Subject: glibc 2.7 complex functions are possibly miscompiled by gcc 4.3 trunk
In-Reply-To: <200712291811.47936.ismail@pardus.org.tr>
References: <200712221911.32786.ismail@pardus.org.tr> <200712291811.47936.ismail@pardus.org.tr>
Message-ID:

Ismail Dönmez writes:

> On Saturday 22 December 2007 19:11:32, Ismail Dönmez wrote:
> > Hi all,
> >
> > I am doing glibc 4.3 regression tests using gcc 4.3 trunk nearly every day
> > and I see 3 tests fail :
> >
> > math/test-float
> > math/test-ildoubl
> > math/test-ifloat
> >
> > The errors are all similar :
> >
> > Failure: Test: Imaginary part of: cacosh (-0 + 0 i) == 0.0 + pi/2 i
> > Result:
> >  is:         0.00000000000000000000e+00 0x0.00000000000000000000p+0
> >  should be:  1.57079637050628662109e+00 0x1.921fb600000000000000p+0
> >  difference: 1.57079637050628662109e+00 0x1.921fb600000000000000p+0
> >  ulp       : 13176795.0000
> >  max.ulp   : 0.0000
>
> Replying to myself once again, these failures are due to -fgcse-after-reload
> flag, -O3 -fno-gcse-after-reload cures this. Any tips on how to debug this?

Generic advice to start: compile with -da to get all the RTL dump files. Compare the dump files immediately before and after gcse-after-reload and see what changed. Set breakpoints on validate_change or make_insn_raw as appropriate for the changes. Walk up the stack and see what the code is doing. The file in question is postreload-gcse.c, which is relatively self-contained.

Of course it is moderately likely that the bug is not in gcse-after-reload, and is in some other pass generating incorrect information. Still, finding the problem in gcse-after-reload is a good start.

Ian

From ismail@pardus.org.tr Sat Dec 29 18:23:00 2007
From: ismail@pardus.org.tr (Ismail =?utf-8?q?D=C3=B6nmez?=)
Date: Sat, 29 Dec 2007 18:23:00 -0000
Subject: glibc 2.7 complex functions are possibly miscompiled by gcc 4.3 trunk
In-Reply-To:
References: <200712221911.32786.ismail@pardus.org.tr> <200712291811.47936.ismail@pardus.org.tr>
Message-ID: <200712292012.28182.ismail@pardus.org.tr>

On Saturday 29 December 2007 19:49:13, Ian Lance Taylor wrote:
> Ismail Dönmez writes:
> > On Saturday 22 December 2007 19:11:32, Ismail Dönmez wrote:
> > > Hi all,
> > >
> > > I am doing glibc 4.3 regression tests using gcc 4.3 trunk nearly every
> > > day and I see 3 tests fail :
> > >
> > > math/test-float
> > > math/test-ildoubl
> > > math/test-ifloat
> > >
> > > The errors are all similar :
> > >
> > > Failure: Test: Imaginary part of: cacosh (-0 + 0 i) == 0.0 + pi/2 i
> > > Result:
> > >  is:         0.00000000000000000000e+00 0x0.00000000000000000000p+0
> > >  should be:  1.57079637050628662109e+00 0x1.921fb600000000000000p+0
> > >  difference: 1.57079637050628662109e+00 0x1.921fb600000000000000p+0
> > >  ulp       : 13176795.0000
> > >  max.ulp   : 0.0000
> >
> > Replying to myself once again, these failures are due to
> > -fgcse-after-reload flag, -O3 -fno-gcse-after-reload cures this. Any tips
> > on how to debug this?
>
> Generic advice to start: compile with -da to get all the RTL dump
> files. Compare the dump files immediately before and after
> gcse-after-reload and see what changed. Set breakpoints on
> validate_change or make_insn_raw as appropriate for the changes. Walk
> up the stack and see what the code is doing. The file in question is
> postreload-gcse.c, which is relatively self-contained.
>
> Of course it is moderately likely that the bug is not in
> gcse-after-reload, and is in some other pass generating incorrect
> information. Still, finding the problem in gcse-after-reload is a
> good start.

Thanks, I'll try. This is all new to me, so progress is likely to be slow.

Regards,
ismail

-- 
Never learn by your mistakes, if you do you may never dare to try again.

From vincent@vinc17.org Sat Dec 29 20:07:00 2007
From: vincent@vinc17.org (Vincent Lefevre)
Date: Sat, 29 Dec 2007 20:07:00 -0000
Subject: MPFR 2.3.1 Release Candidate
Message-ID: <20071229182306.GC6502@ay.vinc17.org>

The release of MPFR 2.3.1 is imminent.
Please help to make this release as good as possible by downloading and testing this release candidate:

http://www.mpfr.org/mpfr-2.3.1/mpfr-2.3.1-rc1.tar.bz2
http://www.mpfr.org/mpfr-2.3.1/mpfr-2.3.1-rc1.tar.gz
http://www.mpfr.org/mpfr-2.3.1/mpfr-2.3.1-rc1.zip

The MD5's:
3a029172c380fc28f17db9c727d244e5  mpfr-2.3.1-rc1.tar.bz2
59f3523b93ec6674241110512b932f22  mpfr-2.3.1-rc1.tar.gz
ec69f43ad4bf00c3ce28467f0650bcb8  mpfr-2.3.1-rc1.zip

Changes from version 2.3.0 to version 2.3.1:
- Bug fixes; see .
- Improved MPFR manual.

Please send success and failure reports to .

If no problems are found, MPFR 2.3.1 should be released around 2008-01-12.

Happy New Year,

-- 
Vincent Lefèvre - Web:
100% accessible validated (X)HTML - Blog:
Work: CR INRIA - computer arithmetic / Arenaire project (LIP, ENS-Lyon)

From dclarke@blastwave.org Sat Dec 29 20:10:00 2007
From: dclarke@blastwave.org (Dennis Clarke)
Date: Sat, 29 Dec 2007 20:10:00 -0000
Subject: MPFR 2.3.1 Release Candidate
In-Reply-To: <20071229182306.GC6502@ay.vinc17.org>
References: <20071229182306.GC6502@ay.vinc17.org>
Message-ID: <32991.72.39.216.186.1198958833.squirrel@mail.blastwave.org>

> The release of MPFR 2.3.1 is imminent. Please help to make this
> release as good as possible by downloading and testing this
> release candidate:
>
> http://www.mpfr.org/mpfr-2.3.1/mpfr-2.3.1-rc1.tar.bz2
> http://www.mpfr.org/mpfr-2.3.1/mpfr-2.3.1-rc1.tar.gz
> http://www.mpfr.org/mpfr-2.3.1/mpfr-2.3.1-rc1.zip
>
> The MD5's:
> 3a029172c380fc28f17db9c727d244e5  mpfr-2.3.1-rc1.tar.bz2
> 59f3523b93ec6674241110512b932f22  mpfr-2.3.1-rc1.tar.gz
> ec69f43ad4bf00c3ce28467f0650bcb8  mpfr-2.3.1-rc1.zip
>
> Changes from version 2.3.0 to version 2.3.1:
> - Bug fixes; see .
> - Improved MPFR manual.
>
> Please send success and failure reports to .
>
> If no problems are found, MPFR 2.3.1 should be released around
> 2008-01-12.

Do you have a testsuite ?
Some battery of tests that can be thrown at the code to determine correct responses to various calculations, error conditions, underflows and rounding errors etc etc ?

Dennis Clarke

From dave.korn@artimi.com Sat Dec 29 20:14:00 2007
From: dave.korn@artimi.com (Dave Korn)
Date: Sat, 29 Dec 2007 20:14:00 -0000
Subject: MPFR 2.3.1 Release Candidate
In-Reply-To: <32991.72.39.216.186.1198958833.squirrel@mail.blastwave.org>
References: <20071229182306.GC6502@ay.vinc17.org> <32991.72.39.216.186.1198958833.squirrel@mail.blastwave.org>
Message-ID: <076001c84a56$c9155340$2e08a8c0@CAM.ARTIMI.COM>

On 29 December 2007 20:07, Dennis Clarke wrote:

>
> Do you have a testsuite ? Some battery of tests that can be thrown at the
> code to determine correct responses to various calculations, error
> conditions, underflows and rounding errors etc etc ?

There's a "make check" target in the tarball. I don't know how thorough it is.

cheers,
DaveK
-- 
Can't think of a witty .sigline today....

From dclarke@blastwave.org Sat Dec 29 22:01:00 2007
From: dclarke@blastwave.org (Dennis Clarke)
Date: Sat, 29 Dec 2007 22:01:00 -0000
Subject: MPFR 2.3.1 Release Candidate
In-Reply-To: <076001c84a56$c9155340$2e08a8c0@CAM.ARTIMI.COM>
References: <20071229182306.GC6502@ay.vinc17.org> <32991.72.39.216.186.1198958833.squirrel@mail.blastwave.org> <076001c84a56$c9155340$2e08a8c0@CAM.ARTIMI.COM>
Message-ID: <32998.72.39.216.186.1198959248.squirrel@mail.blastwave.org>

> On 29 December 2007 20:07, Dennis Clarke wrote:
>
>>
>> Do you have a testsuite ? Some battery of tests that can be thrown at the
>> code to determine correct responses to various calculations, error
>> conditions, underflows and rounding errors etc etc ?
>
> There's a "make check" target in the tarball. I don't know how thorough
> it is.

That is what scares me.
Dennis

From vincent+gcc@vinc17.org Sat Dec 29 23:03:00 2007
From: vincent+gcc@vinc17.org (Vincent Lefevre)
Date: Sat, 29 Dec 2007 23:03:00 -0000
Subject: MPFR 2.3.1 Release Candidate
In-Reply-To: <076001c84a56$c9155340$2e08a8c0@CAM.ARTIMI.COM>
References: <20071229182306.GC6502@ay.vinc17.org> <32991.72.39.216.186.1198958833.squirrel@mail.blastwave.org> <076001c84a56$c9155340$2e08a8c0@CAM.ARTIMI.COM>
Message-ID: <20071229222539.GA4660@ay.vinc17.org>

On 2007-12-29 20:09:58 -0000, Dave Korn wrote:
> On 29 December 2007 20:07, Dennis Clarke wrote:
> > Do you have a testsuite ? Some battery of tests that can be thrown at the
> > code to determine correct responses to various calculations, error
> > conditions, underflows and rounding errors etc etc ?
>
> There's a "make check" target in the tarball. I don't know how thorough it
> is.

The testsuite has been improved, but much remains to be done. Here are the generic improvements since the release of MPFR 2.3.0 (though not everything is in the 2.3 branch, the tests from the trunk can now be run against the 2.3 branch):

* In the generic tests (based on random inputs), far fewer cases that yield an exception are generated, i.e. more interesting cases are tested on average.

* Generic bad cases for the correct rounding are now tested for functions that have an inverse function implemented (r4817). For the other functions, this should also be possible with a Newton iteration, but this isn't implemented yet.

* Some functions were failing when some global flag was set before the call, and unfortunately most tests were done with all flags cleared. Now, in the generic tests, all the global flags are set before a test with a probability 1/2 (part of r5115).

* The exponent range is now checked at the end of each test file (r5136).
  So, if a function doesn't restore the exponent range in some cases, this will probably be detected.

We also test worst cases in double precision for some elementary functions. Then there are very useful tests provided by users (in particular Kevin Rauch, who found many bugs in special cases).

The concept of bad cases should be extended to the underflow and overflow thresholds, but this isn't done yet.

-- 
Vincent Lefèvre - Web:
100% accessible validated (X)HTML - Blog:
Work: CR INRIA - computer arithmetic / Arenaire project (LIP, ENS-Lyon)

From tbptbp@gmail.com Sun Dec 30 02:06:00 2007
From: tbptbp@gmail.com (tbp)
Date: Sun, 30 Dec 2007 02:06:00 -0000
Subject: censored naked SSE reciprocals, -mrecip
In-Reply-To: <47766939.2030505@gmail.com>
References: <47766939.2030505@gmail.com>
Message-ID: <4fc48eb10712291503i602f2dd1s9ae3454107720a4b@mail.gmail.com>

On Dec 29, 2007 4:35 PM, Uros Bizjak wrote:
> Attached patch fixes these problems by using correct shortcuts when
> generating intrinsic functions.
>
> Patch was bootstrapped and regression tested with {,-m32} on
> x86_64-pc-linux-gnu. Patch is committed to SVN.
>
> Thanks a lot for your report,

Now that's blazing fast after-sales service. And i get no less than two undocumented but functional builtins (as opposed to, say __builtin_ia32_movddup, which is documented but dysfunctional) for the same price.

As an extremely satisfied customer, i want to nominate you for the 2007 man of the year short list.

From dave.korn@artimi.com Sun Dec 30 09:16:00 2007
From: dave.korn@artimi.com (Dave Korn)
Date: Sun, 30 Dec 2007 09:16:00 -0000
Subject: censored naked SSE reciprocals, -mrecip
In-Reply-To: <4fc48eb10712291503i602f2dd1s9ae3454107720a4b@mail.gmail.com>
References: <47766939.2030505@gmail.com> <4fc48eb10712291503i602f2dd1s9ae3454107720a4b@mail.gmail.com>
Message-ID: <078001c84a88$a43e1980$2e08a8c0@CAM.ARTIMI.COM>

On 29 December 2007 23:04, tbp wrote:

> Now that's blazing fast after-sales service.
> As an extremely satisfied customer, i want to nominate you for the
> 2007 man of the year short list.

Hear hear! Uros works very hard and contributes a lot. Thank you, Uros!

cheers,
DaveK
-- 
Can't think of a witty .sigline today....

From gerald@pfeifer.com Sun Dec 30 16:44:00 2007
From: gerald@pfeifer.com (Gerald Pfeifer)
Date: Sun, 30 Dec 2007 16:44:00 -0000
Subject: Status of the DLX backend for GCC?
Message-ID:

At http://www.gnu.org/software/gcc/extensions.html we have a reference to the DLX port of GCC, which corresponds to the DLX machine described in "Computer Architecture: A Quantitative Approach" by Hennessy and Patterson.

Sadly, the link to http://www-mount.ee.umn.edu/~okeefe/mcerg/gcc-dlx.html doesn't work and I failed to find a replacement page. Any pointers?

On the binutils side I see that Nikolaos has been listed as a maintainer, so there may be hope? :-) Otherwise I'm afraid we'll have to remove our reference to the DLX port.

Gerald

From nkavv@physics.auth.gr Sun Dec 30 23:43:00 2007
From: nkavv@physics.auth.gr (nkavv@physics.auth.gr)
Date: Sun, 30 Dec 2007 23:43:00 -0000
Subject: Status of the DLX backend for GCC?
In-Reply-To:
References:
Message-ID: <1199033077.4777caf58c090@mail.physics.auth.gr>

Hi Gerald and friends

(what follows is a copy of a message sent earlier today to Gerald)

> At http://www.gnu.org/software/gcc/extensions.html we have a reference
> to the DLX port of GCC, which corresponds to the DLX machine described
> in "Computer Architecture: A Quantitative Approach" by Hennessy and
> Patterson.

First of all, i will be happy to help and, if applicable, to submit my own DLX backend.

Over the previous years, I had downloaded and used both a really archaic gcc-1.09 DLX backend as well as the one you refer to. They are both in a sad state of affairs, but the gcc-2.7.2.1 (AFAICR) was usable.
Since i wanted to use a DLX cross-compiler for embedded system development (and to produce objects for the ArchC -- http://www.archc.org -- simulator infrastructure), I coded my own DLX backend for GCC. I developed it around September-October 2006, first for the 3.3.1 release, and then updated it for 3.4.4.

This backend is usable (i used it quite a lot) and has the following features and non-features :)
- no proper handling of 64-bit moves
- no support for soft-float
- a couple of additions to the "standard" DLX ISA, notably a select (conditional move) instruction for partial predication, inspired by the Machine-SUIF select IR instruction. This one really works well (it was adapted from MIPS32 movn, movz).

If there is interest, i can submit the backend (where exactly in the cvs tree?) and, with the help of the community, fix the 64-bit moves issue and add soft-float support.

> On the binutils side I see that Nikolaos has been listed as a maintainer,
> so there may be hope? :-) Otherwise I'm afraid we'll have to remove our
> reference to the DLX port.

Actually, I had stepped up as a binutils DLX maintainer for more or less the same reasons. I work on my own soft processors for FPGA-based embedded systems and DLX is a reference for comparisons.

I'm having a small sabbatical from late Feb. to May 10, during which i will mostly polish my software projects. I can submit the backend in the next few days (before 5-6 January) and fix most issues before May 10 (when i join the army for 9 months of military service :)

Kind regards
Nikolaos Kavvadias

PS: My own version of the binutils DLX port supports the "select" instruction. I should submit this as well.

From belyshev@depni.sinp.msu.ru Mon Dec 31 03:31:00 2007
From: belyshev@depni.sinp.msu.ru (Serge Belyshev)
Date: Mon, 31 Dec 2007 03:31:00 -0000
Subject: Status of the DLX backend for GCC?
In-Reply-To: <1199033077.4777caf58c090@mail.physics.auth.gr> (nkavv@physics.auth.gr's message of "Sun\, 30 Dec 2007 18\:44\:37 +0200")
References: <1199033077.4777caf58c090@mail.physics.auth.gr>
Message-ID: <874pdzlpf0.fsf@depni.sinp.msu.ru>

nkavv@physics.auth.gr writes:

> Over the previous years, I had downloaded and used both a really archaic
> gcc-1.09 DLX backend as well as the one you refer to. They are both in a sad
> state of affairs, but the gcc-2.7.2.1 (AFAICR) was usable.
>

Offtopic: if you still have such an old gcc-1.09 (?) release around, please make it available so it can be uploaded here: ftp://sourceware.org/pub/gcc/old-releases/

From nkavv@physics.auth.gr Mon Dec 31 13:39:00 2007
From: nkavv@physics.auth.gr (nkavv@physics.auth.gr)
Date: Mon, 31 Dec 2007 13:39:00 -0000
Subject: Status of the DLX backend for GCC?
In-Reply-To: <874pdzlpf0.fsf@depni.sinp.msu.ru>
References: <1199033077.4777caf58c090@mail.physics.auth.gr> <874pdzlpf0.fsf@depni.sinp.msu.ru>
Message-ID: <1199071850.4778626af3e99@mail.physics.auth.gr>

> Offtopic: if you still have such an old gcc-1.09 (?) release around, please
> make it available so it can be uploaded here:
> ftp://sourceware.org/pub/gcc/old-releases/

Oops, reality check :) My glibc version for DLX is 1.09. The GCC version seems to be either 1.37.1 or, more probably, 1.39, which i think you already have ^_^

Nikolaos Kavvadias

PS: Nice repo of old releases!

From baembel@gmx.de Mon Dec 31 14:25:00 2007
From: baembel@gmx.de (Boris Boesler)
Date: Mon, 31 Dec 2007 14:25:00 -0000
Subject: BITS_PER_UNIT less than 8
In-Reply-To:
References: <20071207203726.4690573D41@caffeine.csclub.uwaterloo.ca>
Message-ID: <85F72766-66FB-4C83-9C08-DDA30DEBD036@gmx.de>

On 08.12.2007 at 02:49, Joseph S. Myers wrote:

> On Fri, 7 Dec 2007, Ross Ridge wrote:
>
>> Boris Boesler writes:
>>> Ok, so what have I to do to write a back-end where all addresses are
>>> given in bits? Memory is addressed in bits, not bytes.
>>> So I set:
>>>
>>> #define BITS_PER_UNIT 1
>>> #define UNITS_PER_WORD 32
>>
>> I don't know if it's useful to define the size of a byte to be less than
>> 8-bits, even if that more accurately reflects the hardware. Standard C
>> requires that the char type both be at least 8 bits (UCHAR_MAX >= 255)
>> and the same size as a byte (sizeof(char) == 1). You can't define any
>> types that are smaller than a char and have sizeof work correctly.

I don't want to change sizes. It's addressing!

> In theory GCC supports CHAR_TYPE_SIZE > BITS_PER_UNIT, so sizeof (char) is
> still 1 (sizeof counts in units of CHAR_TYPE_SIZE not BITS_PER_UNIT) but a
> char is not the hardware addressing unit. I expect this is even more
> broken in practice than BITS_PER_UNIT > 8.

Hm, ok. So I patched some source code and one generated file, and it seems to work for int(eger) operations. But if I want to add chars, GCC runs into an endless loop during conversion (in the functions convert and convert_to_integer). In convert.c, around line 526, the parameters are:

inprec:32 outprec:1 mode bitsize:8

I'm wondering about the output precision "1". In tree.def it is documented that a type precision is given in bits. Any idea?

Boris

From richard.guenther@gmail.com Mon Dec 31 14:45:00 2007
From: richard.guenther@gmail.com (Richard Guenther)
Date: Mon, 31 Dec 2007 14:45:00 -0000
Subject: Designs for better debug info in GCC
In-Reply-To:
References: <4749DE66.1090602@codesourcery.com> <4756B02D.9010302@google.com> <4aca3dc20712151903r46c9eceane35edb92d08240ac@mail.gmail.com> <4aca3dc20712161712w1133fb96qd66be0e9a0bb1716@mail.gmail.com> <4766B8E5.60500@google.com>
Message-ID: <84fc9c000712310624i7aa01e1btc0489f71d33522cd@mail.gmail.com>

On Dec 17, 2007 9:28 PM, Alexandre Oliva wrote:
> On Dec 17, 2007, Diego Novillo wrote:
>
> > On 12/17/07 12:51, Alexandre Oliva wrote:
> >> I guess I'm to blame, for having naïvely put the code out without as
> >> much as a design and goals document
>
> > Yes, you are.
> > Wow, thanks. At least we agree on something! ;-) > > > You need to provide such a document now. > > Can't I instead provide it when it's ready? > > You know, it wasn't me who asked to have the thing developed in the > open. I didn't push it out just so that people who didn't want to > understand it could beat on it before it was ready to defend itself. > I put it out because there was an offer for contribution. Yeah - that was me... Fact is we had a discussion about debug information earlier this year from which I took the conclusion that most people would appreciate an on-the-side representation to address the most limiting design issue of GCCs tree representation (only one variable per SSA_NAME to track). So I had the impression you worked in that direction and offered help. Now, you seemed to have come to the conclusion that this approach would not help your goal and started on a different route. Now the "mistake" maybe was to before starting this not to revive the former discussion based on your findings and elaborate on your goals. (I realize this is the way development for GCC works most of the time, but this is not what I consider good practice for open source development) Now - I think your goal is valid, and the choice of implementation might even be the best one for it. But we (the GCC community) have not yet decided if the combination of "your goal" and "this best implementation" is what we want. (I haven't decided myself either ;)) So my suggestion for you is to continue with your implementation and produce a white paper about your design (which you ideally would present during the next GCC summit, where we should do a discussion on this topic in some form). We (myself and Matz) will continue to implement what is "our goal" (because we internally committed to it, and to see limitations or problems with the approach) and possibly also will present about its outcome at the summit. Thanks, Richard. 
From richard.guenther@gmail.com Mon Dec 31 16:55:00 2007 From: richard.guenther@gmail.com (Richard Guenther) Date: Mon, 31 Dec 2007 16:55:00 -0000 Subject: Designs for better debug info in GCC In-Reply-To: References: <4766B8E5.60500@google.com> <4766DF5C.1020802@google.com> <47671BF4.5050704@google.com> Message-ID: <84fc9c000712310645i28ba29c9s56b9f01b78dbdec@mail.gmail.com> On 18 Dec 2007 08:13:55 -0800, Ian Lance Taylor wrote: > Alexandre Oliva writes: > > > A plan to fix local variable debug information in GCC > > > > by Alexandre Oliva > > > > 2007-12-18 draft > > Thank you for writing this. It makes an enormous difference. Indeed. > > > == Goals > > I note that you don't say anything about the other big problem with > debugging optimized code, which is that the debugger jumps around all > over the place. That is fine, of course. > > > > Once this is established, a possible representation becomes almost > > obvious: statements (in trees) or instructions (in rtl) that assert, > > to the variable tracker, that a user variable or member is represented > > by a given expression: > > > > # DEBUG var expr > > > > By var, we mean a tree expression that denotes a user variable, for > > now. We envision trivially extending it to support components of > > variables in the future. > > While you say that this is almost obvious, it still isn't obvious at > all to me. You consider trees and RTL together, but I don't see why > that is appropriate. > > My biggest concern at the tree level is the significantly increased > memory usage and the introduction of a sort of a weak pointer to > values. Since DEBUG statements shouldn't interfere with > optimizations, we need to explicitly ignore them in things like > has_single_use. But since our data structures need to be coherent, we > can not ignore them when we actually eliminate SSA names. That seems > sort of complicated. 
>
> In SSA form it seems very natural to provide a set of associations
> with user variables for each GIMPLE variable.  Since the GIMPLE
> variables never change, these associations never change.  We have to
> get them right when we create a new GIMPLE variable and when we
> eliminate a GIMPLE variable.  While this obviously requires some work,
> to me it seems less intrusive than the notion of weak references.

This is what we do on the var-mappings-branch.  One obvious thing is
that SSA form doesn't help you to track the reverse of VAR = cst.  But
of course it's easy to do another reverse mapping from constants to
vars.  A similar reverse mapping can be done for SET insns on RTL.

> By the way, we shouldn't confuse the source code live range of the
> variable with the annotations on the GIMPLE variables.  That will get
> us into the mapping of source code lines to optimized code.  It is of
> course true that optimized code will move around unpredictably, and
> your proposal doesn't handle that.  I don't see it as a flaw that it
> will be possible to view user variables outside of their source code
> range.

I think this is where Alexandre's approach _might_ work.  (It at least
produces loads of funny DEBUG_INSNs ...)  I chose to ignore this problem
and say we debug the optimized program, not the source, as far as live
ranges are concerned.

> In any case, RTL is different.  We can't reasonably associate
> annotations with pseudo-registers, because they change during the
> function.  The obvious choices are to annotate SET statements, or to
> annotate insns, or to introduce a DEBUG insn as you suggest.  It's not

We "annotate" SET insns by adding a bitmap argument to track the user
variables they set.  That seems to work nicely.

> obvious to me why a DEBUG insn is superior to a REG_NOTE attached to
> an insn.  The problem with DEBUG insns is of course that the RTL code
> is very sensitive to new insns, and also the additional memory usage.
> You discuss those, but it's not obvious to me why your proposed
> solution is the best one.
>
>
> > Testing for accuracy and completeness of debug information can be best
> > accomplished using a debugging environment.
>
> Of course this is very unsatisfactory without an automated testsuite.

I was thinking of pulling in the gdb testsuite harness into gcc...

Richard.

From aoliva@redhat.com Mon Dec 31 19:39:00 2007
From: aoliva@redhat.com (Alexandre Oliva)
Date: Mon, 31 Dec 2007 19:39:00 -0000
Subject: Designs for better debug info in GCC
In-Reply-To: (Ian Lance Taylor's message of "20 Dec 2007 13\:37\:26 -0800")
References: <4aca3dc20712151903r46c9eceane35edb92d08240ac@mail.gmail.com> <4aca3dc20712161712w1133fb96qd66be0e9a0bb1716@mail.gmail.com> <4766B8E5.60500@google.com> <4766DF5C.1020802@google.com> <47671BF4.5050704@google.com> <1198092296.6413.5.camel@janis-laptop>
Message-ID:

On Dec 20, 2007, Ian Lance Taylor wrote:

> Right, which will significantly increase debugging size as you add two
> more notes around many lines.

FWIW, I've just got powerpc64-linux-gnu to pass bootstrap-debug and
bootstrap4-debug/-g0 (i.e., all host and target libraries pass
compare-debug when compiled with -g0 and -g2
-fvar-tracking-assignments).

I did bootstrap4-debug/-fno-var-tracking-assignments as well, for
comparison purposes.  Here are the total sizes:

1487400 target libs at -g0
2239140 target libs with -g2 -fno-var-tracking-assignments
2190176 target libs with -g2 -fvar-tracking-assignments

So, with the new infrastructure in place, debug info gets smaller.  I
haven't evaluated its quality yet (e.g., the compiler may be losing
track of where variables are too often).  Also, the compiler is still
missing the improved version of var-tracking to keep track of all copies
of user variable values, which is expected to grow debug info.  But at
least at this point it doesn't seem like the approach is hopeless.
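
To put the three totals above in proportion, here is a minimal sketch of
the arithmetic.  It assumes that everything beyond the -g0 total can be
counted as debug info, and that all three totals are in the same unit
(the message does not say which).  The names SIZES, debug_overhead and
percent_change are made up for this illustration.

```python
# Totals quoted in the message; the unit is not stated there.
SIZES = {
    "-g0": 1487400,
    "-g2 -fno-var-tracking-assignments": 2239140,
    "-g2 -fvar-tracking-assignments": 2190176,
}


def debug_overhead(total, baseline):
    """Size attributed to debug info: total minus the -g0 baseline.

    Assumption: the whole difference from -g0 is debug info.
    """
    return total - baseline


def percent_change(new, old):
    """Relative change from old to new, in percent (negative = smaller)."""
    return 100.0 * (new - old) / old


baseline = SIZES["-g0"]
total_no_vta = SIZES["-g2 -fno-var-tracking-assignments"]
total_vta = SIZES["-g2 -fvar-tracking-assignments"]

overhead_no_vta = debug_overhead(total_no_vta, baseline)
overhead_vta = debug_overhead(total_vta, baseline)

print(f"debug overhead without VTA: {overhead_no_vta}")
print(f"debug overhead with VTA:    {overhead_vta}")
print(f"change in debug overhead:   {percent_change(overhead_vta, overhead_no_vta):.1f}%")
print(f"change in total size:       {percent_change(total_vta, total_no_vta):.1f}%")
```

Under those assumptions the debug overhead drops from 751740 to 702776,
roughly 6.5% less debug info, or about 2.2% of the total size.  This
says nothing about the quality of the debug info, which the message
notes has not been evaluated yet.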
--
Alexandre Oliva         http://www.lsd.ic.unicamp.br/~oliva/
FSF Latin America Board Member         http://www.fsfla.org/
Red Hat Compiler Engineer   aoliva@{redhat.com, gcc.gnu.org}
Free Software Evangelist  oliva@{lsd.ic.unicamp.br, gnu.org}

From ghazi@caip.rutgers.edu Mon Dec 31 22:44:00 2007
From: ghazi@caip.rutgers.edu (Kaveh R. GHAZI)
Date: Mon, 31 Dec 2007 22:44:00 -0000
Subject: MPFR 2.3.1 Release Candidate
In-Reply-To: <20071229182306.GC6502@ay.vinc17.org>
References: <20071229182306.GC6502@ay.vinc17.org>
Message-ID:

On Sat, 29 Dec 2007, Vincent Lefevre wrote:

> The release of MPFR 2.3.1 is imminent. Please help to make this
> release as good as possible by downloading and testing this
> release candidate:
> [...]
> Changes from version 2.3.0 to version 2.3.1:
> - Bug fixes; see .
> - Improved MPFR manual.

Hi Vincent,

I read through the bugs in 2.3.0 from the above link.  I'm trying to see
if I can write a GCC testcase that exposes one of those bugs when GCC is
linked with mpfr-2.3.0, but passes when I use 2.3.1-rc1.

The bug would need to be exposed using a mantissa size of a C type, like
53 for double, and the default exponent range.  All the global mpfr
flags are cleared beforehand, and the input precision is the same as the
output precision.  These circumstances seem to eliminate many (all?) of
the potential failures.

I tried several things through gcc+mpfr-2.3.0 like asin(-0.0), but that
folds to -0.0 correctly.  I tried a call to sqrt(2.0) with
-frounding-math.  But the inexact flag is apparently set and gcc
appropriately does not fold this case, instead relying on the library
call to get the rounding correct.

I'd rather not test for inefficiencies or infinite loops, because then
the testcase will take too long to time out and slow down everyone's
testsuite runs.  Often the bug says it will fail on "huge" inputs, but
doesn't say exactly what they are.

Rather than further guessing on my part, would you please suggest
something?

Thanks,
--Kaveh
--
Kaveh R. Ghazi			ghazi@caip.rutgers.edu

From gccadmin@gcc.gnu.org Mon Dec 31 23:40:00 2007
From: gccadmin@gcc.gnu.org (gccadmin@gcc.gnu.org)
Date: Mon, 31 Dec 2007 23:40:00 -0000
Subject: gcc-4.1-20071231 is now available
Message-ID: <20071231224313.31156.qmail@sourceware.org>

Snapshot gcc-4.1-20071231 is now available on
  ftp://gcc.gnu.org/pub/gcc/snapshots/4.1-20071231/
and on various mirrors, see http://gcc.gnu.org/mirrors.html for details.

This snapshot has been generated from the GCC 4.1 SVN branch
with the following options: svn://gcc.gnu.org/svn/gcc/branches/gcc-4_1-branch revision 131239

You'll find:

gcc-4.1-20071231.tar.bz2              Complete GCC (includes all of below)
gcc-core-4.1-20071231.tar.bz2         C front end and core compiler
gcc-ada-4.1-20071231.tar.bz2          Ada front end and runtime
gcc-fortran-4.1-20071231.tar.bz2      Fortran front end and runtime
gcc-g++-4.1-20071231.tar.bz2          C++ front end and runtime
gcc-java-4.1-20071231.tar.bz2         Java front end and runtime
gcc-objc-4.1-20071231.tar.bz2         Objective-C front end and runtime
gcc-testsuite-4.1-20071231.tar.bz2    The GCC testsuite

Diffs from 4.1-20071224 are available in the diffs/ subdirectory.

When a particular snapshot is ready for public consumption the LATEST-4.1
link is updated and a message is sent to the gcc list.  Please do not use
a snapshot before it has been announced that way.