help with fusing multiple dependent ops in gcc combine pass
Cherry Vanc
2014-08-15 01:21:23 UTC
I received very helpful comments on my earlier question
(https://gcc.gnu.org/ml/gcc-help/2014-08/msg00010.html), and I was
able to fuse dependent ops like the following:

...
r1 = (r1) op1 (const)
...
...
r1 = (r1) op2 (r2)
...
...
r3 = op3 (r1)
...

into a new op "testnew36" using a define_insn pattern.

Now, how can I fuse the following stream of ops:

...
op1
...
op2 (consumes result of op1)
...
op3 (consumes result of op2)
...
op4 (consumes result of op2)
...

to the following :

...
testnew36
...
testnew40

The relevant pattern in the .combine dump file is a parallel expression:

(parallel [
    (set (reg:DI 256 [ *_15 ])
        (op3:DI (op2:DI (op1:DI (reg:DI 202 [ D.1563 ])
                    (const_int 4 [0x4]))
                (reg:DI 242 [ inbuf ]))))
    (set (reg:DI 205 [ D.1566 ])
        (op2:DI (op1:DI (reg:DI 202 [ D.1563 ])
                (const_int 4 [0x4]))
            (reg:DI 242 [ inbuf ])))
])

Is the following the correct way to combine the four ops?
1. Define a new define_insn "*matchtestnewparallel" that matches the
parallel expression above, substitutes testnew36 for the first set
expression (the op1+op2+op3 combination), and leaves the second set
expression (op1+op2) as is.
2. Define a new define_insn "*testnew40" pattern that matches the
op1+op2+op4 combination.
(I already have a define_insn "*testnew36" pattern that matches the
op1+op2+op3 combination.)

I have done what I just described, but I am not seeing the result I
want. The order in which I defined the patterns in the md file is
"*matchtestnewparallel", "*testnew40", "*testnew36". Either I am not
doing it right or this is just not the right way to do it.
Can you give me some hints, please?

Thanks
Jeff Law
2014-08-15 05:15:58 UTC
If you see a PARALLEL, it means that one of the output operands in
the original series of insns is used later, so that side effect
must be preserved. In the example above, you'll find later uses of
both reg 256 and reg 205.

PARALLELs are usually far less useful because targets rarely have
many instructions that produce multiple outputs. When a PARALLEL is
generated, you will typically end up outputting multiple instructions
for it. In that case you're better off using a
define_insn_and_split. You can find many examples in the various MD
files distributed with GCC.
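
As a rough illustration only, a define_insn_and_split for a PARALLEL of the shape shown above might look like the sketch below. Everything concrete in it is an assumption: ashift, plus and not merely stand in for op1, op2 and op3 (the thread never names the real operators), the predicates and constraints are generic, and the target is assumed to already have insns that match the two sets produced by the split. The form combine actually presents has to be checked against the .combine dump.

```
;; Sketch under stated assumptions: ashift/plus/not are placeholders
;; for op1/op2/op3, not the real operators from the thread.
(define_insn_and_split "*matchtestnewparallel"
  [(set (match_operand:DI 0 "register_operand" "=r")
        (not:DI
          (plus:DI
            (ashift:DI (match_operand:DI 1 "register_operand" "r")
                       (match_operand:DI 2 "const_int_operand" "n"))
            (match_operand:DI 3 "register_operand" "r"))))
   ;; Second output: the intermediate op1+op2 value that is used later.
   (set (match_operand:DI 4 "register_operand" "=r")
        (plus:DI
          (ashift:DI (match_dup 1) (match_dup 2))
          (match_dup 3)))]
  ""
  "#"                     ; no assembler output; this insn must be split
  "&& reload_completed"
  ;; Compute the intermediate value first, then apply the outer op to it.
  [(set (match_dup 4)
        (plus:DI (ashift:DI (match_dup 1) (match_dup 2))
                 (match_dup 3)))
   (set (match_dup 0)
        (not:DI (match_dup 4)))]
  "")
```

Emitting the intermediate value first means the second insn reads only operand 4, so it does not matter if operand 0 or operand 4 ends up in the same register as one of the inputs. Splitting after reload is just one common choice; whether an earlier split is better depends on the target.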

If all the intermediate destinations die when they are consumed, the
combiner does not need to preserve the side effects, so it won't
generate a PARALLEL. You would then implement that as a simple
define_insn in the machine description. Again, you can find many
examples of patterns for the combiner in the various MD files
included in GCC.
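
For the simple case, a minimal sketch of such a combiner pattern could look like the following. It is not taken from any real port: ashift, plus and not are placeholders for op1, op2 and op3, and the "testnew36" mnemonic with its operand order is a hypothetical assembler syntax.

```
;; Sketch only: a single-set pattern that combine can match when the
;; intermediate results of op1 and op2 are dead after their one use.
;; ashift/plus/not are placeholders for op1/op2/op3.
(define_insn "*testnew36"
  [(set (match_operand:DI 0 "register_operand" "=r")
        (not:DI
          (plus:DI
            (ashift:DI (match_operand:DI 1 "register_operand" "r")
                       (match_operand:DI 2 "const_int_operand" "n"))
            (match_operand:DI 3 "register_operand" "r"))))]
  ""
  "testnew36\t%0,%1,%2,%3")   ; hypothetical operand syntax
```

The nesting of the template has to mirror the expression that combine builds, so the safest way to write it is to copy the shape from the .combine dump and replace each register or constant with a match_operand.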


Jeff
