Discussion:
fuse multiple ops into one new op
Cherry Vanc
2014-08-01 23:18:49 UTC
I need to fuse multiple instructions into a single one.
...
r1 = (r1) op1 (const)
...
...
r1 = (r1) op2 (r2)
...
...
r3 = op3 (r1)
...

I defined a peephole2 pattern in my GCC backend .md file. If these
three instructions are contiguous, I do get my test "testnew"
instruction. If they are far apart, I don't.

(define_peephole2
[(set (match_operand:DI 0 "register_operand" "")
(op1:DI (match_dup 0) (match_operand:SI 1 "immediate_operand" "") ))
(set (match_dup 0)
(op2:DI (match_operand:DI 2 "register_operand" "") (match_dup 0)))
(set (match_dup 0)
(sign_extend:DI (op3:SI (match_dup 0))))]
"TARGET_MYCORE"
[(set (match_dup 0) (sign_extend:DI (op3:SI (op2:SI (op1:SI
(match_dup 0) (match_dup 1)) (match_dup 0)))))]
"")

(define_insn "*testnew"
[(set (match_operand:DI 0 "register_operand" "=d")
(sign_extend:DI (op3:SI (op2:SI (op1:SI (match_dup 0)
(match_operand:SI 1 "immediate_operand" "I")) (match_dup 0)))))]
"TARGET_MYCORE"
"testnew 36"
[(set_attr "mode" "DI")])

How can I fuse multiple instructions that are far apart into a
single new opcode that MYCORE has?
Marc Glisse
2014-08-02 07:02:36 UTC
Hello,

I probably haven't looked closely enough, but could you explain why
the 'combine' pass isn't already doing what you want?
--
Marc Glisse
Oleg Endo
2014-08-02 09:08:37 UTC
Yes, this kind of stuff is usually done using the combine pass.
However, it will not try out all permutations of instructions, but
rather follows some rules. Thus the patterns in the .md have to
match combine's expectations. To see which patterns it tries out,
look at the rtl pass dump. From there it should be rather easy to
write down the expected pattern. Notice also that sometimes
patterns will not be picked if the rtx costs are off. Again,
see combine's log for when this happens.

Cheers,
Oleg
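
As a rough aid for reading that dump, the sketch below tallies what combine tried. It assumes the dump was written with detail output and that it contains lines starting with "Trying", "Failed to match this instruction:", "Successfully matched this instruction:" and "rejecting combination of insns". Those marker strings are assumptions about the dump text and may differ between GCC versions, and the function name summarize_combine_dump is hypothetical.

```python
# Marker prefixes assumed to appear in a combine dump; adjust them if
# your GCC version words these lines differently.
MARKERS = {
    "Trying": "attempts",
    "Failed to match this instruction:": "failed",
    "Successfully matched this instruction:": "matched",
    "rejecting combination of insns": "rejected_by_cost",
}


def summarize_combine_dump(text):
    """Count how often each assumed marker starts a line of the dump."""
    counts = {label: 0 for label in MARKERS.values()}
    for line in text.splitlines():
        stripped = line.strip()
        for prefix, label in MARKERS.items():
            if stripped.startswith(prefix):
                counts[label] += 1
                break
    return counts
```

Feeding it the contents of the combine dump file gives a quick picture: many failed attempts suggest the .md pattern does not have the shape combine builds, while a nonzero rejected_by_cost count would point at the rtx costs mentioned above.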
Jeff Law
2014-08-02 09:51:59 UTC
I suspect the problem is "r1" is set/used multiple times. That will
inhibit instruction combination. If at all possible you really want
that code to look like:


r4 = (r1) op1 (const) /* r1 dies */
r5 = r4 op2 (r2) /* r4 and r2 die */
r3 = op3 (r5) /* r5 dies */


Then the combiner will attempt to combine those instructions in the
obvious ways. For the combiner you want to use a define_insn pattern.

define_peephole2 is primarily used in cases where there is no obvious
dataflow between the patterns.


Jeff
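
The rewritten sequence above can be reproduced mechanically with a toy renaming pass. The sketch below only illustrates the idea of giving every redefinition of a register a fresh name, so that each value has a single definition. It is not how GCC represents or renames registers, and the tuple format and the function name rename_redefinitions are made up for this example.

```python
import re

REG = re.compile(r"r(\d+)$")


def rename_redefinitions(insns):
    """Give each redefined register a fresh name.

    insns is a list of (dest, op, sources) tuples. Names such as "r1"
    are registers; anything else (e.g. "const") is left untouched.
    """
    names = [dest for dest, _, _ in insns]
    names += [s for _, _, srcs in insns for s in srcs]
    matches = (REG.match(n) for n in names)
    next_id = 1 + max((int(m.group(1)) for m in matches if m), default=0)

    seen = set()      # registers read or written so far
    current = {}      # original name -> name holding its latest value
    out = []
    for dest, op, srcs in insns:
        new_srcs = tuple(current.get(s, s) for s in srcs)
        seen.update(s for s in srcs if REG.match(s))
        if dest in seen:
            # The destination overwrites a live name: use a fresh one.
            new_dest = "r%d" % next_id
            next_id += 1
        else:
            new_dest = dest
        seen.add(dest)
        current[dest] = new_dest
        out.append((new_dest, op, new_srcs))
    return out


original = [
    ("r1", "op1", ("r1", "const")),
    ("r1", "op2", ("r1", "r2")),
    ("r3", "op3", ("r1",)),
]
for dest, op, srcs in rename_redefinitions(original):
    print(dest, "=", op, srcs)
```

With the three instructions from the original question as input, the two redefinitions of r1 become r4 and r5, and r3 keeps its name because it is defined only once.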
Cherry Vanc
2014-08-05 23:26:05 UTC
Thanks. I am now using a define_insn based on your input:

(define_insn "testnew36"
[(set (match_operand:DI 0 "register_operand" "")
(op1:DI (match_operand:DI 1 "register_operand" "")
(match_operand:SI 2 "immediate_operand" "") ))
(set (match_operand:DI 3 "register_operand" "")
(op2:DI (match_operand:DI 4 "register_operand" "") (match_dup 0)))
(set (match_operand:DI 5 "register_operand" "")
(sign_extend:DI (op3:SI (match_dup 3))))]
"TARGET_MYCORE"
"testnew 36"
[(set_attr "mode" "DI")])

Why doesn't -fdump-rtl-all-all / -fdump-rtl-all generate those .life
and .combine files so that I can take a look at what the combine pass
is doing? -fdump-rtl-combine doesn't produce anything either. MYCORE
is a MIPS adaptation using GCC 4.9.0.
Cherry Vanc
2014-08-05 23:27:36 UTC
I forgot to mention that this define_insn pattern doesn't work for me.
Post by Cherry Vanc
(define_insn "testnew36"
[(set (match_operand:DI 0 "register_operand" "")
(op1:DI (match_operand:DI 1 "register_operand" "")
(match_operand:SI 2 "immediate_operand" "") ))
(set (match_operand:DI 3 "register_operand" "")
(op2:DI (match_operand:DI 4 "register_operand" "") (match_dup 0)))
(set (match_operand:DI 5 "register_operand" "")
(sign_extend:DI (op3:SI (match_dup 3))))]
"TARGET_MYCORE"
"testnew 36"
[(set_attr "mode" "DI")])
Marc Glisse
2014-08-06 05:51:14 UTC
Post by Cherry Vanc
(define_insn "testnew36"
[(set (match_operand:DI 0 "register_operand" "")
(op1:DI (match_operand:DI 1 "register_operand" "")
(match_operand:SI 2 "immediate_operand" "") ))
(set (match_operand:DI 3 "register_operand" "")
(op2:DI (match_operand:DI 4 "register_operand" "") (match_dup 0)))
(set (match_operand:DI 5 "register_operand" "")
(sign_extend:DI (op3:SI (match_dup 3))))]
"TARGET_MYCORE"
"testnew 36"
[(set_attr "mode" "DI")])
Er, no, that's not what was recommended. Your *testnew in the previous
email was much better.
Post by Cherry Vanc
Why doesnt -fdump-rtl-all-all / -fdump-rtl-all generate those .life
and .combine files so that I can take a look at the combine pass is
doing ? dump-rtl-combine doesnt spit anything either. MYCORE is a mips
adaptation using GCC 4.9.0.
Are you sure compiling file.c with options -O -da (or any of the options
you tried) doesn't create file.c.201r.combine (number can vary)? You'll
need to debug that first then.
--
Marc Glisse
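
To check quickly whether such a dump file exists, a small helper like the one below can look for it. This is a sketch under assumptions: it supposes the dump is named after the source file with a pass number and an "r.combine" suffix, as in the file.c.201r.combine example above, and that it lands in the directory being searched. The function name find_combine_dumps and its dump_dir parameter are hypothetical.

```python
import glob
import os


def find_combine_dumps(source_path, dump_dir="."):
    """Return dump files that look like '<source>.<number>r.combine'."""
    base = os.path.basename(source_path)
    # Escape the fixed parts so only the pass number acts as a wildcard.
    pattern = os.path.join(
        glob.escape(dump_dir),
        glob.escape(base) + ".[0-9]*r.combine",
    )
    return sorted(glob.glob(pattern))
```

An empty result after compiling with the dump options means the dump really was not written to that directory, which is the situation that needs debugging first.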
Cherry Vanc
2014-08-07 18:34:55 UTC
Thanks all for your comments. Posting my comments for posterity.

I defined a define_insn pattern as follows and it worked well for me :

(define_insn "testnew36"
  [(set (match_operand:DI 0 "register_operand" "=d")
        (op3:DI (op2:DI (op1:DI (match_operand:DI 1 "register_operand" "")
                                (match_operand:SI 2 "immediate_operand" ""))
                        (match_operand:DI 3 "register_operand" ""))))]
  "TARGET_MYCORE"
  "testnew36"
  [(set_attr "mode" "DI")
   (set_attr "length" "4")])

I am still working on a clean way to update the rtx_costs. But
once the insn costs are in place (I put in a dirty hack for now),
GCC's combine pass does fuse these ops for me. There are a few
cases where the combine pass falters and produces suboptimal patterns.

One such case is when (I think) GCC decides that the result of the
op1 + op2 combination is required by a later insn:

(parallel [
(set (reg:DI 256 [ *_15 ])
(op3:DI (op2:DI (op1:DI (reg:DI 202 [ D.1563 ])
(const_int 4 [0x4]))
(reg/v/f:DI 242 [ inbuf ])) [0 *_15+0 S4 A32]))
(set (reg/f:DI 205 [ D.1566 ])
(op2:DI (op1:DI (reg:DI 202 [ D.1563 ])
(const_int 4 [0x4]))
(reg/v/f:DI 242 [ inbuf ])))
])

The second set in the above parallel expression can be combined
with another define_insn pattern that fuses op1, op2 and "op4" into a
new insn "testnew40". Is there a way to accomplish this? When does
the combine pass create these parallel expressions?

To be clear, I have two define_insn patterns at the moment :

1. testnew36 (fuses op1, op2, and op3)
2. testnew40 (fuses op1, op2, and op4)

So a stream of insns like below :

...
op1
...
op2 (consumes result of op1)
...
op3 (consumes result of op2)
...
op4 (consumes result of op2)
...

gets translated to :

...
testnew36
...
testnew40