casting "extended vectors"

Discussion:

Vincenzo Innocente

2014-07-05 15:33:34 UTC

at the end of
https://gcc.gnu.org/onlinedocs/gcc/Vector-Extensions.html
I read
"It is possible to cast from one vector type to another, provided they are of the same size"

When I try it, the cast looks to me more like type punning than a C-style cast.
Is it possible to cast “vectors” as in C one casts intrinsic types?

given
typedef float __attribute__( ( vector_size( 16 ) ) ) float32x4_t;
typedef int __attribute__( ( vector_size( 16 ) ) ) int32x4_t;

float32x4_t vf{4.2,-3.2, 1.2, 7.2};
int32x4_t vi = int32x4_t(vf);
is not what I would like it to be, i.e.
int32x4_t vi{int(vf[0]),int(vf[1]),int(vf[2]),int(vf[3])};

best,
v.
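
The difference between the two behaviours can be sketched as follows. This is only an illustration, assuming GCC's vector extensions in C++; the helper names reinterpret_bits, convert_values and float_bits are hypothetical and not part of any library.

```cpp
#include <cstring>

typedef float __attribute__((vector_size(16))) float32x4_t;
typedef int   __attribute__((vector_size(16))) int32x4_t;

static_assert(sizeof(int) == sizeof(float), "sketch assumes 32-bit int and float");

// What the vector cast does: the 16 bytes are kept unchanged and merely
// viewed as four ints (a reinterpretation of the bit pattern).
inline int32x4_t reinterpret_bits(float32x4_t vf) {
    return (int32x4_t)vf;
}

// What a C-style cast does for scalars, applied lane by lane: each float
// is converted to int by truncation toward zero.
inline int32x4_t convert_values(float32x4_t vf) {
    int32x4_t vi = {int(vf[0]), int(vf[1]), int(vf[2]), int(vf[3])};
    return vi;
}

// Hypothetical helper: the raw bit pattern of one float, for comparison
// with a lane of reinterpret_bits().
inline int float_bits(float f) {
    int bits;
    std::memcpy(&bits, &f, sizeof bits);
    return bits;
}
```

For the vector in the post, convert_values yields {4, -3, 1, 7}, while each lane of reinterpret_bits equals float_bits of the corresponding float.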

Marc Glisse

2014-07-06 13:45:16 UTC

Post by Vincenzo Innocente
at the end of
https://gcc.gnu.org/onlinedocs/gcc/Vector-Extensions.html
I read
"It is possible to cast from one vector type to another, provided they are of the same size"
When I try it, the cast looks to me more like type punning than a C-style cast.

Indeed, I am not happy about that, but it seems hard to change that now.
If you could send a patch to gcc-patches improving the documentation, it
would be welcome.

Post by Vincenzo Innocente
Is it possible to cast “vectors” as in C one casts intrinsic types?

Not directly, no. You can cast element by element and hope gcc will
vectorize it, but I don't even think that is likely.

From a language point of view, clang recently introduced
__builtin_convertvector. It takes a type as second parameter, that may be
an issue if we try to implement the same in gcc.
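
For readers unfamiliar with the builtin, here is a minimal sketch of how it is called. It assumes a compiler that provides __builtin_convertvector (Clang, or GCC from version 9 onwards, which postdates this thread); the wrapper names to_int and to_float are hypothetical.

```cpp
typedef float __attribute__((vector_size(16))) float32x4_t;
typedef int   __attribute__((vector_size(16))) int32x4_t;

// The second argument is a type, not an expression; that is the unusual
// part of the syntax. Source and destination need the same number of lanes.
inline int32x4_t to_int(float32x4_t vf) {
    return __builtin_convertvector(vf, int32x4_t);    // per-lane float -> int
}

inline float32x4_t to_float(int32x4_t vi) {
    return __builtin_convertvector(vi, float32x4_t);  // per-lane int -> float
}
```

Each lane is converted as a scalar cast would convert it, so to_int truncates toward zero.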

From an implementation point of view, we seem to have most (all?) pieces
(float_expr, vec_unpack_hi_expr, etc) to represent those in the
middle-end.

--
Marc Glisse

Vincenzo Innocente

2014-07-06 14:32:10 UTC

Post by Marc Glisse

Indeed, I am not happy about that, but it seems hard to change that now.

Understood.

Post by Marc Glisse
If you could send a patch to gcc-patches improving the documentation, it would be welcome.

I will try to think of something (maybe use "reinterpret" instead of "cast", as in C++ reinterpret_cast).
The point is that there is no (longer?) any equivalent of a C/C++ type cast.
Of course, adding __builtin_convertvector and its documentation would help to clarify the issue.

Post by Marc Glisse

Post by Vincenzo Innocente
Is it possible to cast “vectors” as in C one casts intrinsic types?

Not directly, no. You can cast element by element and hope gcc will vectorize it, but I don't even think that is likely.

VF convert(itype i) { VF f; for (int j=0;j<N;++j) f[j]=i[j]; return f;}
itype convert(VF f) { itype i; for (int j=0;j<N;++j) i[j]=f[j]; return i;}
Both vectorize for N=8 and N=16, but not for N=4 (independently of the target architecture).
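
A self-contained version of the two loops, for experimenting with different N. The names VF, itype, N and convert come from the snippet above; fixing N to 8 and spelling the vector size as 32 bytes are assumptions made here so that the sketch compiles on its own.

```cpp
constexpr int N = 8;  // assumed lane count; the post also tried 4 and 16

typedef float __attribute__((vector_size(32))) VF;     // 8 floats
typedef int   __attribute__((vector_size(32))) itype;  // 8 ints

static_assert(sizeof(VF) == N * sizeof(float), "N must match vector_size");
static_assert(sizeof(itype) == N * sizeof(int), "N must match vector_size");

// Lane-by-lane int -> float conversion, written as a plain loop.
VF convert(itype i) {
    VF f;
    for (int j = 0; j < N; ++j) f[j] = i[j];
    return f;
}

// Lane-by-lane float -> int conversion (truncates toward zero).
itype convert(VF f) {
    itype i;
    for (int j = 0; j < N; ++j) i[j] = f[j];
    return i;
}
```

With GCC, compiling at -O3 with -fopt-info-vec should report whether each loop was vectorized. The outcome may differ between compiler versions, and on targets without 32-byte vector registers GCC may print an ABI note about passing such vectors by value.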

Post by Marc Glisse
From a language point of view, clang recently introduced __builtin_convertvector. It takes a type as second parameter, that may be an issue if we try to implement the same in gcc.

Strange syntax indeed. Of course in C++ it would be trivial: "__builtin_convertvector<T>"
It would be a very welcome addition, whatever syntax you choose or manage to implement.

If you could also try to add something for “movemask”, that would be great too!

Vincenzo

Post by Marc Glisse
From an implementation point of view, we seem to have most (all?) pieces (float_expr, vec_unpack_hi_expr, etc) to represent those in the middle-end.
--
Marc Glisse

Marc Glisse

2014-07-06 14:57:46 UTC

Post by Vincenzo Innocente

Post by Marc Glisse

Post by Vincenzo Innocente
Is it possible to cast “vectors” as in C one casts intrinsic types?

Not directly, no. You can cast element by element and hope gcc will vectorize it, but I don't even think that is likely.

Even for N=8 or 16, if you inline convert, it won't vectorize anymore. The
vectorizer needs to see a non-local object to start working.

Post by Vincenzo Innocente
Strange syntax indeed. Of course in C++ it would be trivial:
"__builtin_convertvector<T>"
It would be a very welcome addition, whatever syntax you
choose or manage to implement.
If you could also try to add something for “movemask”, that would be great too!

Please file enhancement PRs (after making sure they don't already exist)
with as much information as you can. I am unlikely to implement that any
time soon, maybe somebody else will be...

For the vector extension, we don't want things that are too specific to a
processor. Maybe an operation that takes an integer vector guaranteed to
have only 0 and -1 as elements and compacts it to a bitfield would make
sense (and the reverse operation); this would be relevant for sparc-vis
and for avx512. Movemask seems a bit too weird for direct support.
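
The operation described above can be written out in scalar form to pin down its semantics. This is only a sketch of the intended behaviour, not a proposed interface: the names compact_mask and expand_mask are hypothetical, and a 4-lane int vector is assumed.

```cpp
typedef int __attribute__((vector_size(16))) int32x4_t;

// Pack a vector whose lanes are all 0 or -1 into the low 4 bits of an
// unsigned: bit j is set exactly when lane j is -1.
inline unsigned compact_mask(int32x4_t m) {
    unsigned bits = 0;
    for (int j = 0; j < 4; ++j)
        bits |= (static_cast<unsigned>(m[j]) & 1u) << j;
    return bits;
}

// The reverse operation: bit j of `bits` becomes lane j, as 0 or -1.
inline int32x4_t expand_mask(unsigned bits) {
    int32x4_t m;
    for (int j = 0; j < 4; ++j)
        m[j] = -static_cast<int>((bits >> j) & 1u);
    return m;
}
```

Vector comparisons in GCC already produce such 0/-1 vectors, so compact_mask(a < b) gives one bit per lane of the comparison.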

--
Marc Glisse

Vincenzo Innocente

2014-07-07 09:01:21 UTC

Post by Marc Glisse

Post by Vincenzo Innocente
If you could also try to add something for “movemask”, that would be great too!

Please file enhancement PRs (after making sure they don't already exist) with as much information as you can. I am unlikely to implement that any time soon, maybe somebody else will be...
For the vector extension, we don't want things that are too specific to a processor. Maybe an operation that takes an integer vector guaranteed to have only 0 and -1 as elements and compacting it to a bitfield would make sense (and the reverse operation), this would be relevant for sparc-vis and for avx512. Movemask seems a bit too weird for direct support.

I submitted PR56829 in April last year …
Maybe it needs a bit of rewording to attract possible contributors.
v.

Marc Glisse

2014-07-07 09:15:02 UTC

Post by Vincenzo Innocente

Post by Marc Glisse

Post by Vincenzo Innocente
If you could also try to add something for “movemask”, that would be great too!

Please file enhancement PRs (after making sure they don't already exist) with as much information as you can. I am unlikely to implement that any time soon, maybe somebody else will be...
For the vector extension, we don't want things that are too specific to a processor. Maybe an operation that takes an integer vector guaranteed to have only 0 and -1 as elements and compacting it to a bitfield would make sense (and the reverse operation), this would be relevant for sparc-vis and for avx512. Movemask seems a bit too weird for direct support.

I submitted PR56829 in April last year …
Maybe it needs a bit of rewording to attract possible contributors.

It sure could do with some more information, like what "movemask" is,
links to the documentation of cuda ballot and x86 movemask, etc (if you
can find relevant instructions for altivec, neon, vis or some others, it
would help convince that this isn't an x86-only feature). The reference to
any/all/popcount may not be enough of a motivation, because all 3 are
reductions. I would find clz/ctz more convincing.
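
To illustrate the clz/ctz case: with one bit per lane in hand, the index of the first matching lane falls out of a count-trailing-zeros, which is not a reduction over the lanes. The sketch below assumes GCC or Clang (for __builtin_ctz and the vector extensions); the names lanes_to_bits and first_negative_lane are hypothetical, and the scalar packing loop merely stands in for a movemask-like operation.

```cpp
typedef float __attribute__((vector_size(16))) float32x4_t;
typedef int   __attribute__((vector_size(16))) int32x4_t;

// Scalar stand-in for a movemask-like operation: bit j is set exactly
// when lane j of the 0/-1 vector m is -1.
inline unsigned lanes_to_bits(int32x4_t m) {
    unsigned bits = 0;
    for (int j = 0; j < 4; ++j)
        bits |= (static_cast<unsigned>(m[j]) & 1u) << j;
    return bits;
}

// Index of the first lane of v that is negative, or -1 if there is none.
inline int first_negative_lane(float32x4_t v) {
    float32x4_t zero = {0.0f, 0.0f, 0.0f, 0.0f};
    int32x4_t m = v < zero;            // vector comparison: 0 or -1 per lane
    unsigned bits = lanes_to_bits(m);
    if (bits == 0) return -1;          // __builtin_ctz(0) is undefined
    return __builtin_ctz(bits);        // position of the lowest set bit
}
```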

Not that any of that will make free time magically appear for some
contributor :-(

--
Marc Glisse