[Bug target/86968] Unaligned big-endian (scalar_storage_order) access on armv7-a yields 4 ldrb instructions rather than ldr+rev

thopre01 at gcc dot gnu.org gcc-bugzilla@gcc.gnu.org
Fri Oct 12 11:08:00 GMT 2018


https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86968

--- Comment #14 from Thomas Preud'homme <thopre01 at gcc dot gnu.org> ---
(In reply to Eric Botcazou from comment #13)
> > Forgive my naive question as I'm not too familiar with that part of the
> > compiler: why should the get_best_mem_extraction_insn be guarded with
> > reverse? I thought I'd just add an if (reverse) if it succeeds and call
> > flip_storage_order there, likewise after the call to extract_bit_field_1
> > below if successful.
> 
> No, the numbering of bits depends on the endianness, i.e. you need to know
> the endianness of the source to do a correct extraction.  For example, if
> you extract bit #2 - bit #9 of a structure in big-endian using HImode, then
> you cannot do it in little-endian and just swap the bytes afterwards (as a
> matter of fact, there is nothing to swap since the result is byte-sized). 
> The LE extraction is:
>   HImode load + HImode right_shift (2)
> whereas the BE extraction is:
>   HImode load + HImode right_shift (6)
> 
> The extv machinery cannot handle reverse SSO for the time being so the guard
> is still needed for it in the general case; on the contrary,
> extract_bit_field_1 can already and doesn't need an additional call to
> flip_storage_order.
> 
> Of course, for specific bitfields, typically verifying
> simple_mem_bitfield_p, then you can extract in native order and do
> flip_storage_order on the result.
> 
> In other words, the extv path can be used as you envision, but only for
> specific bitfields modeled on those accepted by simple_mem_bitfield_p, and
> then the call to flip_storage_order will indeed be needed.
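
For reference, the shift amounts in the quoted example (bit #2 to bit #9 of a
16-bit unit) can be checked with a small standalone sketch. This is only an
illustration of the bit-numbering arithmetic, not code from expmed.c; the
helper names le_shift, be_shift and extract_field are made up for the example.

```c
#include <stdint.h>

/* Hypothetical helpers, not GCC internals: they only reproduce the
   arithmetic behind the two shift counts quoted above.  */

/* Little-endian bit numbering: bit #0 is the least significant bit,
   so a field starting at bit POS is reached by shifting right by POS.  */
static unsigned le_shift(unsigned pos)
{
    return pos;
}

/* Big-endian bit numbering: bit #0 is the most significant bit, so the
   field's lowest bit sits WORD_BITS - POS - LEN positions above the LSB.  */
static unsigned be_shift(unsigned word_bits, unsigned pos, unsigned len)
{
    return word_bits - pos - len;
}

/* Extract a LEN-bit field (LEN < 16) from a 16-bit word after shifting.  */
static uint16_t extract_field(uint16_t word, unsigned shift, unsigned len)
{
    return (uint16_t)((word >> shift) & ((1u << len) - 1u));
}
```

With pos = 2 and len = 8 in a 16-bit word, le_shift gives 2 and be_shift gives
6, matching the two sequences above. The same loaded value yields different
fields under the two numberings, e.g. extract_field(0x0154, 2, 8) is 0x55 but
extract_field(0x0154, 6, 8) is 0x05, and the result is a single byte, so there
is nothing left to swap afterwards.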

Right, that makes sense. So I tried your suggestion (guard the first if with
!reverse but not the second) and it didn't work. The problem, as you suggested,
is adjust_bit_field_mem_for_reg, which refuses to do an unaligned load (or
rather, bit_field_mode_iterator's next_mode method refuses). I think
get_best_mem_extraction_insn does not have this problem because it just
queries whether an instruction to do the unaligned access exists.

Are you aware of a reason why next_mode does not do the same?
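
To make the two code shapes from the bug title concrete, here is a portable
sketch of an unaligned big-endian 32-bit load written both ways. It is not the
testcase from this PR and does not use scalar_storage_order; the function names
are made up, and it assumes GCC or Clang (for __builtin_bswap32 and the
__BYTE_ORDER__ macro).

```c
#include <stdint.h>
#include <string.h>

/* Byte-by-byte assembly of the value: roughly the shape of the four-ldrb
   sequence currently generated.  */
static uint32_t load_be32_bytes(const unsigned char *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16)
         | ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* One (possibly unaligned) word load followed by a byte swap: roughly the
   shape of the desired ldr + rev sequence.  The swap is only needed when
   the target itself is little-endian.  */
static uint32_t load_be32_word(const unsigned char *p)
{
    uint32_t v;
    memcpy(&v, p, sizeof v);   /* unaligned-safe word load */
#if defined(__BYTE_ORDER__) && __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__
    v = __builtin_bswap32(v);  /* native LE, stored BE: reverse the bytes */
#endif
    return v;
}
```

Both functions return the same value for any pointer, aligned or not; the
second form is only profitable when the target has an unaligned word load,
which is the property get_best_mem_extraction_insn queries.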

