GCC built-in to swap octets within words
Mason
2011-08-04 14:35:25 UTC
Hello,

I've been looking for the GCC built-in to swap two octets
within a 16-bit word. I looked at the list of GCC built-ins,
and found bswap32 and bswap64, but no bswap16.

http://gcc.gnu.org/onlinedocs/gcc/Other-Builtins.html

int32_t __builtin_bswap32 (int32_t x)
Returns x with the order of the bytes reversed;
for example, 0xaabbccdd becomes 0xddccbbaa.
Byte here always means exactly 8 bits.

int64_t __builtin_bswap64 (int64_t x)
Similar to __builtin_bswap32, except the argument
and return types are 64-bit.
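
To make the documented behaviour concrete, here is a minimal sketch that wraps the two built-ins. The wrapper names `swap32` and `swap64` are made up for this illustration (they are not part of GCC), and the code assumes GCC or a compiler that provides the same built-ins.

```c
#include <stdint.h>

/* Hypothetical wrapper names; only the __builtin_* calls come from GCC. */
static inline uint32_t swap32(uint32_t x)
{
    /* Reverses the order of the four bytes of x. */
    return __builtin_bswap32(x);
}

static inline uint64_t swap64(uint64_t x)
{
    /* Same idea, applied to all eight bytes of x. */
    return __builtin_bswap64(x);
}
```

Following the example in the documentation, `swap32(0xaabbccdd)` should yield `0xddccbbaa`.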

On x86, GCC is smart enough to optimize the operation to
a single instruction.

$ cat swap16.c
#include <stdint.h>
uint16_t swap16(uint16_t n) { return (n >> 8) | (n << 8); }
$ gcc -O3 -fomit-frame-pointer -S swap16.c
_swap16:
        movzwl  4(%esp), %eax
        rolw    $8, %ax
        ret

Is my swap16 function recognized internally as a bswap16
operation, for which hand-written assembly code is provided?

On my platform (SH-4) the operation is not optimized:

$ sh-superh-elf-gcc -O3 -S swap16.c
_swap16:
        extu.w  r4,r4
        mov     r4,r0
        shlr8   r4
        shll8   r0
        or      r4,r0
        rts
        extu.w  r0,r0

Yet the operation could be done in a single swap.b instruction
(possibly followed by an extu.w for ABI compliance).
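
One possible workaround, sketched here on the assumption that `__builtin_bswap32` is available for the target, is to build the 16-bit swap from the 32-bit built-in. The function name `swap16_via_bswap32` is hypothetical, and whether this produces better code than the shift-and-or version on SH-4 has not been checked here.

```c
#include <stdint.h>

/* Hypothetical helper: 16-bit swap expressed through the 32-bit built-in. */
static inline uint16_t swap16_via_bswap32(uint16_t n)
{
    /* n widens to 0x0000HHLL; bswap32 turns it into 0xLLHH0000;
       shifting right by 16 leaves 0xLLHH in the low half. */
    return (uint16_t)(__builtin_bswap32((uint32_t)n) >> 16);
}
```

Comparing the `-S` output of this function with that of the original `swap16` would show which form the target back end handles better.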

So, if I understand the situation correctly, the
bswap16 built-in may not be needed because GCC can
recognize the C pattern for bswap16, but no optimized
assembly code has been provided for my platform.

Is that correct?
--
Regards.
Ian Lance Taylor
2011-08-05 16:29:36 UTC
Post by Mason
> I've been looking for the GCC built-in to swap two octets
> within a 16-bit word. I looked at the list of GCC built-ins,
> and found bswap32 and bswap64, but no bswap16.
I don't think anybody has implemented bswap16 as a general builtin
function (it does exist for PPC).
Post by Mason
> On x86, GCC is smart enough to optimize the operation to
> a single instruction.
> $ cat swap16.c
> #include <stdint.h>
> uint16_t swap16(uint16_t n) { return (n >> 8) | (n << 8); }
> $ gcc -O3 -fomit-frame-pointer -S swap16.c
>         movzwl  4(%esp), %eax
>         rolw    $8, %ax
>         ret
> Is my swap16 function recognized internally as a bswap16
> operation, for which hand-written assembly code is provided?
That is one way to view what is happening. I would describe this as a
target-specific compiler optimization.
Post by Mason
> $ sh-superh-elf-gcc -O3 -S swap16.c
>         extu.w  r4,r4
>         mov     r4,r0
>         shlr8   r4
>         shll8   r0
>         or      r4,r0
>         rts
>         extu.w  r0,r0
> Yet the operation could be done in a single swap.b instruction
> (possibly followed by an extu.w for ABI compliance).
> So, if I understand the situation correctly, the
> bswap16 built-in may not be needed because GCC can
> recognize the C pattern for bswap16, but no optimized
> assembly code has been provided for my platform.
That sounds about right, although personally I would say that gcc ought
to provide __builtin_bswap16 just for consistency.

Ian
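
For readers who want a 16-bit swap that does not depend on a particular compiler version, a minimal sketch follows. It assumes the compiler offers `__has_builtin` to detect `__builtin_bswap16` (later GCC releases provide this built-in on all targets); where detection is unavailable, it falls back to the shift-and-or pattern from the original post. The name `portable_swap16` is hypothetical.

```c
#include <stdint.h>

/* Compilers that lack __has_builtin simply take the fallback path. */
#ifndef __has_builtin
#define __has_builtin(x) 0
#endif

/* Hypothetical helper name, not a GCC API. */
static inline uint16_t portable_swap16(uint16_t n)
{
#if __has_builtin(__builtin_bswap16)
    /* Use the built-in when the compiler reports it. */
    return __builtin_bswap16(n);
#else
    /* Same pattern as swap16 in the original post. */
    return (uint16_t)((n >> 8) | (n << 8));
#endif
}
```

Either branch returns the same value, so callers do not need to know which one was compiled in.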
