[PATCH] Cancel bswap opt when intermediate stmts are reused

Wed Nov 19 10:39:00 GMT 2014

On Tue, Nov 18, 2014 at 5:33 PM, Thomas Preud'homme
<thomas.preudhomme@arm.com> wrote:
>> From: Richard Biener [mailto:richard.guenther@gmail.com]
>> Sent: Monday, November 17, 2014 12:47 PM
>>
>> Hmm.  I am a little bit concerned about the malloc traffic generated here.
>> So why not use a vec<usedtree>, get rid of the ->next pointer and
>> use a hash_map <gimple, unsigned> to associate the stmt with
>> an index into the vector?
>
> Sure, it even makes things easier. However I don't understand why a vector is
> better than malloc. Is it a matter of fragmentation?

That and with a vector you get log(N) allocations instead of N.

>>
>> At this point I'd rather leave DCE to DCE ...
>
> I thought since all information is there why not do it. It makes it easier to
> read the dump of the pass.

Fair enough ...

>>
>> Ick - do we really have to use gotos here?  Can you refactor this
>> a bit to avoid it?
>
> Yes I can. I needed the same kind of thing for fixing PR63761 (r217409) and
> found a way to avoid it.

Thanks.

>>
>> The whole scheme wouldn't exactly fly with the idea of eventually
>> using a lattice to propagate the symbolic numbers, but well...
>
>>
>> I think the overall idea is sound and if you have time to adjust according
>> to my comments that would be nice.
>
> To be honest I think it should wait for next stage1. After several ping I took
> another look at the testcase with the idea of measuring the size reduction
> the patch could give and was surprised that in all cases I could construct the
> size was actually bigger. Performance might be improved nonetheless but
> I think this needs more careful consideration.
>
> And as you said the approach would need to be changed if bswap was
> rewritten to do a forward analysis. At last, nobody reported a code size or
> performance regression so far due to the changes so that might be a non
> issue. If such a report happens, then it will be a good time to revisit that
> decision.
>
> Do you agree?

Yes.  If you can't come up with a testcase that it improves but it only
pessimizes cases then IMHO this would be a very premature optimization.
I suspect that we'd need to have a more complex cost analysis factoring
in that we _do_ remove a lot of code even if parts need to be kept alive.
So maybe only discard the transform if _all_ stmts need to be kept
live (as when we even remove one of the partial loads then this should
alread offset the single load we add ... for the bswap this may be
slightly different).

>
>>
>> Sorry for the very late review.
>
> That's alright, I know how it is. Thank you for keeping track of it. I actually
> feel sorry I didn't warn about my findings. I thought the patch fell through
> the cracks and didn't want to spam gcc-patches uselessly.

Sure.

Thanks,
Richard.

> Best regards,
>
> Thomas
>
>
>