Stack frame question on x86 code generation

Discussion:

Gang-Ryung Uh

2005-04-23 18:09:44 UTC

Could anyone help me understand gcc's strategy
for laying out the stack frame?
For the following function,

void function(int a, int b, int c)
{
    char buffer1[5];
    char buffer2[10];
    int *ret;

    ret = &buffer1[0]+28;
    printf("0x%x=return address, *ret);
}

I compiled it with gcc -O0 -S, and I cannot quite
follow the stack frame layout strategy in the code
that the compiler produces.

function:
    pushl %ebp
    movl %esp, %ebp
    subl $56, %esp // question 1
    leal -24(%ebp), %eax
    addl $28, %eax
    movl %eax, -44(%ebp)
    subl $8, %esp // question 2
    movl -44(%ebp), %eax
    pushl (%eax)
    pushl $.LC0
    call printf

Here are my questions:
question1: Why is the stack frame size 56?
observation: (1) the compiler adds 16 bytes of
padding before allocating storage
for array buffer1; (2) buffer1 needs
5 bytes, but for alignment the compiler
seems to add 3 extra bytes.
Thus, -24(%ebp) should point to buffer1[0].

Then, why does it add 16 bytes of padding?

question2: Why does gcc make the stack frame bigger before
the function call to printf?

subl $8, %esp

Is it related to printf? If so,
could you explain why?

Thanks in advance.
Best regards,

__________________________________________________
Do You Yahoo!?
Tired of spam? Yahoo! Mail has the best spam protection around
http://mail.yahoo.com

Niko Matsakis

2005-04-24 05:04:07 UTC

Post by Gang-Ryung Uh
I compiled with gcc -O0 -S option and the compiler
produces the code that I cannot quite follow the
stack frame layout strategy in gcc.

First of all, I think you'll find in more complicated functions that
the stack layout for -O3 is dramatically different. This is partly
because many variables do not need to end up on the stack, and partly
because of the effect of other optimizations.

Hopefully this is just for educational purposes, and you're not
planning on writing code that relies on the layout of the stack frame!
I can guarantee you that this will change. If you just want to find
the return address, there are easier ways...

Post by Gang-Ryung Uh
question1: Why the stack frame size is 56?
observation: (1) compiler add 16 bytes
padding before allocating storage
for array buffer1 (2) buffer1 need
5 bytes. However, due to alignment
issue, they seem to add 3 extra bytes.
Thus, -24(%ebp) should point to buffer1[0].
Then, why they adding 16 bytes padding?

I can't say for sure, but remember that the stack frame has more than
just variables on it. There are also saved registers, such as the
caller's frame pointer, and a number of other things. Also, many
machines work best if they use 32-bit alignment for memory loads.

Post by Gang-Ryung Uh
question2: Why gcc makes the stack frame bigger before
the function call printf?
subl $8, %esp
Does it related to printf? If it does, then
could you explain why?

This is to make room for the parameters to printf(); there are two, and
each is a pointer, and hence four bytes, so the total space required is
eight bytes.

Niko

Arturas Moskvinas

2005-04-24 09:10:22 UTC

Post by Niko Matsakis
This is to make room for the parameters to printf(); there are two, and
each is a pointer, and hence four bytes, so the total space required is
eight bytes.

I think you are not quite correct in this part: if you write a simple
function which calls a function without parameters, you'll see that gcc
also subtracts 8 (12 if you use -fomit-frame-pointer). The pushl
instruction itself subtracts 4 from ESP. So I think gcc is trying to
align the variables (and return address) on the stack.

Arturas Moskvinas

Niko Matsakis

2005-04-24 09:12:53 UTC

Yes, you're right of course.

Niko

Gang-Ryung Uh

2005-04-24 15:03:07 UTC

First of all, thanks for your kind response.
I have a couple of follow up questions.
My questions are below.

Best regards,

Post by Niko Matsakis

Post by Gang-Ryung Uh
I compiled with gcc -O0 -S option and the compiler
produces the code that I cannot quite follow the
stack frame layout strategy in gcc.

Yes, I have a very clear understanding (1) that the optimizer
can map the lifetimes of variables to registers, so the
stack frame size can differ, and (2) of how the activation
record is constructed by caller and callee.

It appears that gcc generates code to keep %esp aligned
on a 16-byte boundary. If this is correct, what is the
benefit on x86? Why a 2-word boundary? Targeting for I64? Are
there any x86 instructions that exploit such alignment to reduce
the function call overhead or context switch?

However, my first question still remains, since I cannot
explain the 16 bytes of padding for the array buffer1.

Thanks again for your help.

Arturas Moskvinas

2005-04-24 16:13:01 UTC

Post by Gang-Ryung Uh
However, my first question still remains since I
cannot
reasoning about the 16 bytes padding for the array
buffer1.

I think the reason is that you chose to use int *ret; remove this
variable, and you'll see that gcc now tries a different alignment
strategy.

It might work like this:
1. align char[5] to 8.
2. align char[10] to 16.
3. align int *ret to 4.
Some misalignment (the biggest member is size 16):
1. align char[5] to 16.
2. align char[10] to 16.
3. align int *ret to 16.
Now we have 48. Let's align it to 64 (2^6):
1. add 16 bytes of padding.
2. align char[5] to 16.
3. align char[10] to 16.
4. align int *ret to 16.

I think we lose 4 bytes for the return address, and an additional 4 bytes
for pushing EBP onto the stack.
And we have 56 bytes.

Arturas Moskvinas
P.S.: From the Intel optimization guide:
ftp://download.intel.com/design/Pentium4/manuals/24896611.pdf
"Employ data structure layout optimization to ensure efficient use of
64-byte cache line size."
AMD does not say much about alignment; they only say it should be a
multiple of a doubleword or quadword.
x86 processors allow misaligned memory access, but it costs at least two
memory read cycles to read it!

Ian Lance Taylor

2005-04-24 18:04:19 UTC

Post by Gang-Ryung Uh
It appears that gcc generates code to keep %esp aligned
on a 16-byte boundary. If this is correct, what is the
benefit on x86? Why a 2-word boundary? Targeting for I64? Are
there any x86 instructions that exploit such alignment to reduce
the function call overhead or context switch?

See the documentation for the -mpreferred-stack-boundary option.

Ian

Gang-Ryung Uh

2005-04-24 15:03:12 UTC

First of all, thanks for your kind response.
I have a couple of follow up questions.
My questions are below.

Best regards,

Gang-Ryung Uh

2005-04-24 15:06:27 UTC

Thank you!
Your explanation helps a lot.

Best regards,

Gang-Ryung Uh

2005-04-24 20:59:42 UTC

Thanks for your help! Your response is truly helpful.
I will follow the link that you suggested.

Please allow me to ask one more question. How about
incoming parameters? (The running example that I used has
three int arguments; in other words, the caller (main) will
pushl 3 times to pass the arguments on the stack.)
Aren't incoming parameters considered part of the activation
record (stack frame)?

Thanks again,

Ian Lance Taylor

2005-04-25 01:55:05 UTC

Post by Gang-Ryung Uh
Please allow me to ask one more question. How about
incoming parameters? (The running example that I used has
three int arguments; in other words, the caller (main) will
pushl 3 times to pass the arguments on the stack.)
Aren't incoming parameters considered part of the activation
record (stack frame)?

gcc will try to align the stack to the preferred stack boundary
(default 16) at function entry. For this purpose the incoming
parameters are part of the caller's stack frame.

Ian

James E Wilson

2005-04-26 00:26:11 UTC

Post by Gang-Ryung Uh
Could anyone help me understand what is the gcc
strategy to prepare the stack frame?

You didn't mention the gcc version, or the gcc target. Different gcc
versions and targets will give different answers. Even different x86
targets work differently.

Post by Gang-Ryung Uh
printf("0x%x=return address, *ret);

You are missing a quote here.

Post by Gang-Ryung Uh
question1: Why the stack frame size is 56?

A bug. It is 40 in current gcc development sources, or rather, I should
say that it is 40 that gets subtracted from the stack pointer. The
actual frame size also includes stuff that is being pushed.

This is probably the same issue as discussed in the thread here
http://gcc.gnu.org/ml/gcc/2005-04/msg01191.html

Post by Gang-Ryung Uh
Then, why they adding 16 bytes padding?

Probably the same bug. I get "leal -9(%ebp), %eax" which makes sense
for a 5 byte array, with 4 bytes of data allocated ahead of it.

Post by Gang-Ryung Uh
question2: Why gcc makes the stack frame bigger before
the function call printf?

This is probably to maintain 16-byte stack alignment when we reach
printf. We maintain 16-byte stack alignment so that MMX/SSE
instructions will work.

--
Jim Wilson, GNU Tools Support, http://www.SpecifixInc.com

Gang-Ryung Uh

2005-04-26 01:22:48 UTC

Thanks for your response. Please find my follow-up
question below.

Best regards,

Post by James E Wilson

Post by Gang-Ryung Uh
Could anyone help me understand what is the gcc
strategy to prepare the stack frame?

You didn't mention the gcc version, or the gcc target. Different gcc
versions and targets will give different answers. Even different x86
targets work differently.

% gcc -v
Reading specs from /usr/lib/gcc-lib/i386-redhat-linux/3.3.2/specs
Configured with: ../configure --prefix=/usr --mandir=/usr/share/man --infodir=/usr/share/info --enable-shared --enable-threads=posix --disable-checking --with-system-zlib --enable-__cxa_atexit --host=i386-redhat-linux
Thread model: posix
gcc version 3.3.2 20031022 (Red Hat Linux 3.3.2-1)

Post by James E Wilson

Post by Gang-Ryung Uh
printf("0x%x=return address, *ret);

You are missing a quote here.

You are absolutely right.

Post by James E Wilson

Post by Gang-Ryung Uh
question1: Why the stack frame size is 56?

A bug. It is 40 in current gcc development sources, or rather, I should
say that it is 40 that gets subtracted from the stack pointer. The
actual frame size also includes stuff that is being pushed.

Well, that answers it.

Post by James E Wilson
This is probably the same issue as discussed in the thread here
http://gcc.gnu.org/ml/gcc/2005-04/msg01191.html

Post by Gang-Ryung Uh
Then, why they adding 16 bytes padding?

Probably the same bug. I get "leal -9(%ebp), %eax" which makes sense
for a 5 byte array, with 4 bytes of data allocated ahead of it.

That answers it, too!

Post by James E Wilson

Post by Gang-Ryung Uh
question2: Why gcc makes the stack frame bigger before
the function call printf?

This is probably to maintain 16-byte stack alignment when we reach
printf. We maintain 16-byte stack alignment so that MMX/SSE
instructions will work.

I am not quite following this. It sounds like the stack frame
of the current development gcc version is not aligned
with 16 bytes with -40. How can you make it 16 bytes
aligned with -8?

James E Wilson

2005-04-26 01:45:36 UTC

Post by Gang-Ryung Uh
of the current development gcc version is not aligned
with 16 bytes with -40. How can you make it 16 bytes
aligned with -8?

push/push/subl -40 gives -48 which is 16-byte aligned. Or perhaps that
is call/push/subl -40 which gives the -48. I'm not sure how x86 call
insns work.

subl -8/push/push gives -16 which is 16 byte aligned.

You can step through code in gdb with display/x $sp if you want to
follow this, and see what the real stack pointer value is.

--
Jim Wilson, GNU Tools Support, http://www.SpecifixInc.com

Gang-Ryung Uh

2005-04-26 02:30:40 UTC

You are right! Thanks!
Now, it makes sense.

Best regards,
