Sometime after 20191101, my mainline bootstraps on Mac OS X 10.7/Darwin 11 began to fail completely. Initially it seemed the Mac minis I've been using remotely had just been turned off willy-nilly, but even after I had been assured that this wasn't the case, the machines still stopped in the middle of make check without any indication of what had happened.

Only after I'd run such a bootstrap in a VirtualBox VM (with Mac OS X 10.7.5) did I see that the VM (and evidently the bare-metal machines too) crashed with a kernel panic in some asan tests (I've seen alloca_big_alignment.exe, alloca_detect_custom_size. and bitfield-1.exe). Only asan tests seem to be affected (I didn't try any more given the tedious nature of the failure) and probably only 64-bit ones (I do run multilib tests on Darwin if possible). As expected, the ubsan tests still work.

Here's the gist of the panics (I do have screen shots if need be):

panic(cpu 0 caller 0xffffff8002c4794): Kernel trap at 0xffffff800053ae2, type 14=page fault, registers:
[...]
Debugger called: <panic>
Backtrace (CPU 0),Frame : Return Address
[...]
mach_kernel :
_panic + 0x252
_kernel_trap + 0x6a4
_return_from_trap + 0xcd
_fdexec + 0x172
_kco_ma_addsample + 0x162c
_kco_ma_addsample + 0x2a80
_posix_spawn + 0xab6
_unix_syscall64 + 0x1fb
_hndl_unix_scall64 + 0x13
BSD process name corresponding to current thread: alloca_big_align

The obvious immediate fix is to disable libsanitizer on Darwin 11. While in theory one could keep the ubsan tests, and the 32-bit asan tests if it really turns out that they continue to work, it's probably not worth the effort given the age of the OS version and the missing provision for enabling ubsan separately.
So you could just disable asan and keep ubsan (set ASAN_SUPPORTED=no in libsanitizer/configure.tgt for a particular darwin OS version, and if it is 32-bit only, also test x$ac_cv_sizeof_void_p = x4)? Of course, trying to work around kernel bugs this way is weird, but if it isn't supported anymore or Apple isn't willing to fix their bugs...
> --- Comment #1 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
> So you could just disable asan and keep ubsan (set ASAN_SUPPORTED=no in
> libsanitizer/configure.tgt for a particular darwin OS version, and if it is
> 32-bit only, also test x$ac_cv_sizeof_void_p = x4)?

Right now there's only [LT]SAN_SUPPORTED in configure.{ac,tgt}. Sure, ASAN_SUPPORTED (and/or UBSAN_SUPPORTED) could be added, but I doubt it's worth the effort. I have a prototype patch that just sets UNSUPPORTED=1 for *86*-apple-darwin11*.

> Of course, trying to work around kernel bugs this way is weird, but if it isn't
> supported anymore or Apple isn't willing to fix their bugs...

Mac OS X 10.7 is almost 9 years old by now and long past support. I don't feel particularly inclined to reghunt which gcc/sanitizer change caused this, let alone debug the Darwin kernel either.
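For illustration, the prototype approach mentioned above (marking the whole library unsupported for one Darwin release) could be sketched roughly as follows. This is only a sketch under assumptions: the function name `sanitizer_unsupported` is hypothetical and exists just to make the fragment testable on its own, whereas the real libsanitizer/configure.tgt matches on `${target}` directly. Only the `UNSUPPORTED=1` setting and the `*86*-apple-darwin11*` pattern are taken from the comment.

```shell
#!/bin/sh
# Hypothetical helper (not part of GCC): prints 1 if the sanitizer
# runtime would be marked unsupported for the given target triplet,
# 0 otherwise.
sanitizer_unsupported () {
  target=$1
  UNSUPPORTED=0
  case "${target}" in
    *86*-apple-darwin11*)
      # Darwin 11 (Mac OS X 10.7): asan tests panic the kernel.
      UNSUPPORTED=1
      ;;
  esac
  echo "${UNSUPPORTED}"
}

# Example triplets; the exact minor versions are illustrative.
sanitizer_unsupported x86_64-apple-darwin11.4.2
sanitizer_unsupported i686-apple-darwin11.4.2
sanitizer_unsupported x86_64-apple-darwin12.6.0
```

A glob on the triplet keeps the change to a single case arm, at the cost of disabling ubsan along with asan on that release.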
These systems are EOL so we can't expect any fixes to the systems themselves.

The question is "is the latest imported asan version even supposed to support 10.7"?

I have a patch to unsupport the sanitiser for <= 10.6 [where it has been unsupported upstream since at least the last release]. That is something that I can apply immediately.

If the latest sanitiser code is _supposed_ to work on 10.7 - we should at least take a cursory look at why/where it's failing before punting.

I agree that spending much time on making the sanitisers work on EOL machines is not a priority. I don't have access to my 10.7 box right now - but will take a look next week.
(In reply to Iain Sandoe from comment #3)
> These systems are EOL so we can't expect any fixes to the systems themselves.
>
> The question is "is the latest imported asan version even supposed to
> support 10.7"?
>
> I have a patch to unsupport the sanitiser for <= 10.6 [where it has been
> unsupported upstream since at least the last release]. That is something
> that I can apply immediately.
>
> If the latest sanitiser code is _supposed_ to work on 10.7 - we should at
> least take a cursory look at why/where it's failing before punting.
>
> I agree that spending much time on making the sanitisers work on EOL
> machines is not a priority. I don't have access to my 10.7 box right now -
> but will take a look next week.

I'm on 10.6 and have been configuring with --disable-libsanitizer for some time now anyways, so it won't be too much of a loss if that becomes the default.
> --- Comment #3 from Iain Sandoe <iains at gcc dot gnu.org> ---
> These systems are EOL so we can't expect any fixes to the systems themselves.
>
> The question is "is the latest imported asan version even supposed to support
> 10.7"?

When I tried to build all of LLVM before the 9.0 release and ran into a couple of issues, I asked the same question. Getting an answer was like pulling teeth, unfortunately, and in the end no one was able or willing to state which macOS versions are supposed to be supported by LLVM. Very disappointing IMO, but given this precedent I don't expect anything better now.

> I agree that spending much time on making the sanitisers work on EOL machines
> is not a priority. I don't have access to my 10.7 box right now - but will
> take a look next week.

I'm only building mainline on 10.7 because we happen to have a couple of old Mac minis running 10.7 still around that I can use for the purpose. Given the nightmarish slowdowns since 10.13, that is still a decent option. That said, when experimenting with bootstraps inside VirtualBox VMs, I had also tried 10.11 where, unlike 10.7, I could run with 4 virtual cpus, getting reasonable build times.
Only affecting EOL systems, moving to P4.
> --- Comment #2 from ro at CeBiTec dot Uni-Bielefeld.DE <ro at CeBiTec dot
> Uni-Bielefeld.DE> ---
>> --- Comment #1 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
[...
>> Of course, trying to work around kernel bugs this way is weird, but if it isn't
>> supported anymore or Apple isn't willing to fix their bugs...
>
> Mac OS X 10.7 is almost 9 years old by now and long past support. I
> don't feel particularly inclined to reghunt which gcc/sanitizer change
> caused this, let alone debug the Darwin kernel either.

I've since experimented a bit more: 32-bit 10.7 is affected just the same. Afterwards, I've copied both the 32 and 64-bit alloca_big_alignment.exe and the corresponding libasan.6.dylib and libgcc_s.1.dylib to a 10.8 VM where they run just fine, so this is obviously a 10.7-only issue.

While working on this, I've created VirtualBox VMs for every single macOS release between 10.7 and 10.15, each with the latest updates and last supported Xcode version installed and ready for experiments if needed.
(In reply to ro@CeBiTec.Uni-Bielefeld.DE from comment #7)
> > --- Comment #2 from ro at CeBiTec dot Uni-Bielefeld.DE <ro at CeBiTec dot
> > Uni-Bielefeld.DE> ---
> >> --- Comment #1 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
> [...
> >> Of course, trying to work around kernel bugs this way is weird, but if it isn't
> >> supported anymore or Apple isn't willing to fix their bugs...
> >
> > Mac OS X 10.7 is almost 9 years old by now and long past support. I
> > don't feel particularly inclined to reghunt which gcc/sanitizer change
> > caused this, let alone debug the Darwin kernel either.
>
> I've since experimented a bit more: 32-bit 10.7 is affected just the
> same. Afterwards, I've copied both the 32 and 64-bit
> alloca_big_alignment.exe and the corresponding libasan.6.dylib and
> libgcc_s.1.dylib to a 10.8 VM where they run just fine, so this is
> obviously a 10.7-only issue.

Yeah, I'm just waiting for the x86_64-darwin13 run to finish with libsanitizer disabled (the fault repeats for me on 64b).

It's fairly low on my priority list to look at this with the current sanitiser output, since that is emitting a different ABI for Darwin than clang does (so the emitted code would be the first thing to fix).

> While working on this, I've created VirtualBox VMs for every single
> macOS release between 10.7 and 10.15, each with the latest updates and
> last supported Xcode version installed and ready for experiments if
> needed.

VB is more reliable for some versions than others (which might have little to do with VB, of course ;) ). It's pretty hard to get anything < 10.6 to work there, and obv. it is no use for ppc.

----

Right now, I'm thinking of disabling the sanitizer by default on master for <= 10.7 and on 9.x for <= 10.6. I'll do that today or tomorrow since I want to make the 9.3 deadline.
One additional point.

For earlier OS versions the 'atos' version installed is not sufficient to get sensible output from the sanitizer (characterised by very long timeouts on failed tests). In that case, it is better to install llvm-symbolizer from a recentish LLVM and to set ASAN_SYMBOLIZER_PATH=/path/to/llvm-symbolizer before running tests (FWIW, I tend to do this about 50% of the time even on recent OS versions to ensure that the fails seen are from the sanitiser, not atos).

atos is closed-source so we can't fix/rebuild it. Unfortunately, the llvm-symbolizer exe is not part of the Xcode distributions, so it has to be built from source.

In the case of the x86_64-darwin11 kernel panics, this made no difference to the observed fails.
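As a rough sketch of the setup described above, the environment variable can be set from a small wrapper before the tests are run. The helper name `use_symbolizer` is hypothetical, and `/path/to/llvm-symbolizer` is the placeholder from the comment rather than a real location. The only thing assumed about the sanitizer runtime is that it honours `ASAN_SYMBOLIZER_PATH`, as stated above.

```shell
#!/bin/sh
# Hypothetical helper (not part of GCC or LLVM): export
# ASAN_SYMBOLIZER_PATH only if the given file exists and is executable,
# so a wrong path does not silently change how reports are symbolized.
use_symbolizer () {
  if [ -x "$1" ]; then
    ASAN_SYMBOLIZER_PATH=$1
    export ASAN_SYMBOLIZER_PATH
    return 0
  fi
  echo "no executable symbolizer at $1; leaving the default in place" >&2
  return 1
}

# /path/to/llvm-symbolizer is a placeholder; substitute the location of
# a locally built llvm-symbolizer before running the testsuite.
use_symbolizer /path/to/llvm-symbolizer || true
```

Checking for an executable first makes it easier to tell whether a slow or odd-looking failure comes from the sanitiser itself or from a missing symbolizer.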
The master branch has been updated by Iain D Sandoe <iains@gcc.gnu.org>:

https://gcc.gnu.org/g:63cc547f6d85819192afa795e9ade14f0800eda9

commit r10-6951-g63cc547f6d85819192afa795e9ade14f0800eda9
Author: Iain Sandoe <iain@sandoe.co.uk>
Date:   Sun Mar 1 14:40:57 2020 +0000

    Darwin, libsanitizer: Adjust minimum supported Darwin version (PR93731).

    The current imported libsanitizer code produces kernel panics for
    Darwin 11 (macOS 10.7) and is unsupported for earlier versions already.

    It is not clear if the current sources are even intended to be supported
    on Darwin 11, so this patch causes the default to be build without
    sanitizers for Darwin <= 11.

    2020-03-01  Iain Sandoe  <iain@sandoe.co.uk>

            PR sanitizer/93731
            * configure.tgt (x86_64-*-darwin*, i?86-*-darwin*): Enable by
            default only for Darwin versions greater than 12 (macOS 10.8).
I checked current gcc-9 on Darwin11 and the sanitiser builds and tests there without any such issue, so we only need to exclude for Darwin <= 10 for gcc-9. This PR should be fixed now.