Post by David Sainty
Post by Joerg Sonnenberger
Post by Nick Hudson
Standards and modern tools, e.g. gcc[1] 4.8, expect malloc to return
memory with alignment that is different to our current
jemalloc. Attached is a suggested diff to fix the problem for most
platforms.
I still don't see the point of the GCC behavior and I don't agree with
the standard interpretation. The referenced DR covers a quite different
problem (pointer casts). It doesn't make sense to justify the larger
alignment if accessing the storage with any such type is UB because it
is an access beyond the end of the allocation. As such, I do strongly
consider this an overeager optimisation.
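One way to read the argument above: an allocation of n bytes only needs
the alignment of types that can actually fit in n bytes. A minimal C11
sketch of that rule follows. The helper name needed_alignment is
hypothetical and this is not taken from jemalloc or from the attached
diff; it only illustrates the "alignment by size" interpretation.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper: the largest power-of-two alignment that any
 * object fitting in `size` bytes could require, capped at the
 * platform's maximum fundamental alignment (C11 max_align_t).
 */
size_t
needed_alignment(size_t size)
{
	const size_t max_align = _Alignof(max_align_t);
	size_t align = 1;

	assert(size > 0);
	/* Double the alignment while it still fits in the allocation. */
	while (align * 2 <= size && align * 2 <= max_align)
		align *= 2;
	return align;
}
```

Under this reading a 2 byte allocation needs only 2 byte alignment,
whereas the gcc assumption under discussion is that every malloc result
has the maximal alignment regardless of size.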
Surely it makes (the most) sense on any platform that can't write a unit
that small without writing to the surrounding bytes anyway. Isn't that
the case with the (original) Alpha?
I.e. if a two byte write has to be implemented in terms of an aligned 4
byte read-modify-write, then you wouldn't ever want to allocate a pair
of two byte allocations contiguous in the same four bytes anyway - at
least not if the code might be multi-threaded. So you may as well both
align and assume everything aligned on 4 bytes.
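The lost-update hazard described above can be sketched in portable C by
splitting the emulated two byte store into its steps. This is only a
single-threaded model of the interleaving, not real Alpha code; the
names word4, rmw_prepare and rmw_commit are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* One aligned 4 byte word holding two adjacent 2 byte allocations. */
struct word4 {
	unsigned char bytes[4];
};

/*
 * Steps 1 and 2 of the emulated store: read the whole word into a
 * private copy, then patch the two bytes at `offset` (0 or 2).
 */
struct word4
rmw_prepare(const struct word4 *mem, size_t offset, uint16_t value)
{
	struct word4 tmp = *mem;

	assert(offset == 0 || offset == 2);
	memcpy(tmp.bytes + offset, &value, sizeof(value));
	return tmp;
}

/* Step 3: write the whole 4 byte word back to memory. */
void
rmw_commit(struct word4 *mem, struct word4 tmp)
{
	*mem = tmp;
}
```

If two writers both call rmw_prepare on the same word before either
calls rmw_commit, the second commit writes back a stale copy of the
other writer's half and that update is lost. Giving every allocation
its own aligned 4 byte word avoids the problem.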
That is certainly a justification for forcing 4 byte alignment on Alpha.
Indeed x86 also needs 4 byte alignment because the BTC/BTR/BTS instructions
are likely to do an RMW cycle on the 32-bit word.
I'm not sure about ARM; pre-v4 there were no 16-bit accesses. gcc will
do 32-bit accesses for some 16-bit values on later CPUs, but only because
of the limited addressing modes - and they can't affect a 2 byte allocation.
But gcc is assuming the maximal alignment for 'normal' items - which
ends up being 16 bytes for long double on some systems.
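For anyone wanting to check this on a given platform, a small C11
sketch follows. It assumes that max_align_t reflects the "maximal
alignment" meant here; the helper names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* The platform's maximum fundamental alignment as reported by C11. */
size_t
max_fundamental_alignment(void)
{
	return _Alignof(max_align_t);
}

/* Non-zero if `p` is a multiple of `align` (a power of two). */
int
is_aligned_to(const void *p, size_t align)
{
	assert(align != 0 && (align & (align - 1)) == 0);
	return ((uintptr_t)p & (align - 1)) == 0;
}
```

Calling is_aligned_to(malloc(2), max_fundamental_alignment()) shows
whether a small allocation from the current allocator meets the
alignment that the compiler is assuming.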
David
--
David Laight: ***@l8s.co.uk