[PATCH v2] tree-ssa-threadbackward.c (profitable_jump_thread_path): Do not allow __builtin_constant_p () before IPA.

Jeff Law law@redhat.com
Tue Dec 1 22:34:41 GMT 2020



On 11/23/20 7:36 AM, Ilya Leoshkevich wrote:
> On Fri, 2020-11-20 at 12:14 -0700, Jeff Law wrote:
>> On 6/30/20 12:46 PM, Ilya Leoshkevich wrote:
>>> v1: https://gcc.gnu.org/pipermail/gcc-patches/2020-June/547236.html
>>>
>>> This is the implementation of Jakub's suggestion: allow
>>> __builtin_constant_p () after IPA, but fold it into 0.  Smoke test
>>> passed on s390x-redhat-linux, full regtest and bootstrap are
>>> running on
>>> x86_64-redhat-linux.
>>>
>>> ---
>>>
>>> Linux Kernel (specifically, drivers/leds/trigger/ledtrig-cpu.c)
>>> build
>>> with GCC 10 fails on s390 with "impossible constraint".
>>>
>>> The problem is that jump threading makes __builtin_constant_p ()
>>> lie
>>> when it splits a path containing a non-constant expression in a way
>>> that on each of the resulting paths this expression is constant.
>>>
>>> Fix by disallowing __builtin_constant_p () on threading paths
>>> before
>>> IPA and fold it into 0 after IPA.
>>>
>>> gcc/ChangeLog:
>>>
>>> 2020-06-30  Ilya Leoshkevich  <iii@linux.ibm.com>
>>>
>>> 	* tree-ssa-threadbackward.c (thread_jumps::m_allow_bcp_p): New
>>> 	member.
>>> 	(thread_jumps::profitable_jump_thread_path): Do not allow
>>> 	__builtin_constant_p () on threading paths unless m_allow_bcp_p
>>> 	is set.
>>> 	(thread_jumps::find_jump_threads_backwards): Set m_allow_bcp_p.
>>> 	(pass_thread_jumps::execute): Allow __builtin_constant_p () on
>>> 	threading paths after IPA.
>>> 	(pass_early_thread_jumps::execute): Do not allow
>>> 	__builtin_constant_p () on threading paths before IPA.
>>> 	* tree-ssa-threadupdate.c (duplicate_thread_path): Fold
>>> 	__builtin_constant_p () on threading paths into 0.
>>>
>>> gcc/testsuite/ChangeLog:
>>>
>>> 2020-06-30  Ilya Leoshkevich  <iii@linux.ibm.com>
>>>
>>> 	* gcc.target/s390/builtin-constant-p-threading.c: New test.
>> So I'm finally getting back to this.  Thanks for your patience.
>>
>> It's a nasty little problem, and I suspect there's actually some
>> deeper
>> issues here.  While I'd like to claim its a bad use of b_c_p, I don't
>> think I can reasonably make that argument.
>>
>> So what we have is a b_c_p at the start of an if-else chain. 
>> Subsequent
>> tests on the "true" arm of the the b_c_p test may throw us off the
>> constant path (because the constants are out of range).  Once all the
>> tests are passed (it's constant and the constant is in range) the
>> true
>> arm's terminal block has a special asm that requires a constant
>> argument.   In the case where we get to the terminal block on the
>> true
>> arm, the argument to the b_c_p is used as the constant argument to
>> the
>> special asm.
>>
>> At first glace jump threading seems to be doing the right thing. 
>> Except
>> that we end up with two paths to that terminal block with the special
>> asm, one for each of the two constant arguments to the b_c_p call. 
>> Naturally since that same value is used in the asm, we have to
>> introduce
>> a PHI to select between them at the head of the terminal block.   Now
>> the argument in the asm is no longer constant and boom we fail.
>>
>> I briefly pondered if we should only throttle when the argument to
>> the
>> b_c_p is not used elsewhere.  But I think that just hides the problem
>> and with a little work I could probably extend the testcase to still
>> fail in that scenario.
>>
>> I also briefly pondered if we should isolate the terminal block as
>> well
>> (essentially creating one for each unique PHI argument).  We'd likely
>> only need to do that when there's an ASM in the terminal block, but
>> that
>> likely just papers over the problem as well since the ASM could be in
>> a
>> successor of the terminal block.
>>
>> I haven't thought real deeply about it, but I wouldn't be surprised
>> if
>> there's other passes that can trigger similar problems.  Aggressive
>> cross-jumping would be the most obvious, but some of the
>> hosting/sinking
>> of operations past PHIs would seem potentially problematical as well.
>>
>> Jakub suggestion might be the best one in this space.   I don't have
>> anything better right now.  The deeper questions about other passes
>> setting up similar scenarios can probably be punted, I'd expect
>> threading to be far and above the most common way for this to happen
>> and
>> I'd be comfortable faulting in investigation of other cases if/when
>> they
>> happen.
>>
>> So I retract my initial objections.  Let's go with the V2 patch.
>>
>>
>> jeff
> Hi Jeff,
>
> Thanks for having another look!
>
> I did x86_64 builds of SPEC and vmlinux, and it seems that in practice
> v2 does not have any benefit over v1.
>
> What do you think about going with the v1, which is less complex?
No strong opinions.  I think whichever is less invasive in terms of code
quality is probably the way to go.  What we want to avoid is suppressing
threading unnecessarily as that often leads to false positives from
middle-end based warnings.  Suppressing threading can also lead to build
failures in the kernel due to the way they use b_c_p.

jeff



More information about the Gcc-patches mailing list