Cleanup the logical DImode operations since the current implementation is way
too complicated. Thumb-1, Thumb-2, VFP/Neon and iwMMXt all work differently,
resulting in a bewildering number of expansions, patterns and splits across
several md files. All this complexity is counterproductive and results in
inefficient code.
A much simpler approach is to split these operations early in the expander
so that optimizations and register allocation are applied on the 32-bit halves.
Codegeneration is unchanged on Thumb-1 and Arm/Thumb-2 without Neon or iwMMXt
(which already expand these instructions early). With Neon these changes save
~1000 instructions from the PR77308 testcase, mostly by significantly reducing
register pressure and spilling.
Bootstrap OK on arm-none-linux-gnueabihf --with-cpu=cortex-a57