Array & Pointer: Are They Equivalent?
Consider the following two pieces of code:
int *p;
...
c = p[1];
int p[10];
...
c = p[1];
Are they equivalent? If not, which is faster? The answer is here:
Disassembled for c = p[1], where p is a pointer (int *p):
mov eax,dword ptr[p]
mov ecx,dword ptr[eax+4]
mov dword ptr[c],ecx
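Disassembled for c = p[1], where p is an array (int p[10]):
mov eax,dword ptr[ebp-28h]
mov dword ptr[c],eax
They are not equivalent. With the pointer, the code first loads the value of p and then loads from that address plus 4. With the array, the address of p[1] is a fixed offset from ebp, so a single load is enough and the array version is faster.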
FAQ: How to allocate a matrix dynamically?
Method 1: one contiguous block (n rows, m columns)
int* a = (int*)malloc(sizeof(int) * n * m);
// Use a[i * m + j] to access element (i, j)
Method 2: one contiguous block plus an array of row pointers
int* buf = (int*)malloc(sizeof(int) * n * m);
int** a = (int**)malloc(sizeof(int*) * n);
int i;
for (i = 0; i < n; ++i)
    a[i] = buf + i * m;
Method 3: a separate block for each row
int** a = (int**)malloc(sizeof(int*) * n);
int i;
for (i = 0; i < n; ++i)
    a[i] = (int*)malloc(sizeof(int) * m);
Pointer to Function And Event Handling
Let’s take the qsort() routine as an example:
void qsort(void *, size_t, size_t,
int (*)(const void *, const void *));
The qsort() routine doesn't know how to compare two elements, so it requires a function that tells it which one is smaller. More generally, a function pointer lets a library routine call code that the user defines. This mechanism is called a callback.
Let’s look into another example which shows the classic POSIX signal usage:
#include <stdio.h>
#include <signal.h>
void handle_zero(int p)
{
puts("blabla");
signal(SIGFPE, handle_zero);
}
int main()
{
int i = 0;
int j = 1;
signal(SIGFPE, handle_zero);
j /= i;
raise(SIGFPE);
}
Notes on the code above:
signal(SIGFPE, handle_zero): when an arithmetic exception is raised, handle_zero will be called.
j /= i: raises a SIGFPE exception.
raise(SIGFPE): raises a SIGFPE exception manually.
Dinner: Memory, Stack and Calling Convention
Stack and Heap
int main()
{
    int *p = malloc(sizeof(int)); /* p itself is on the stack; the int it points to is on the heap */
    int i;                        /* i is on the stack */
    ...
}
BSS: Better Save Space
int m[1000000];
int n[1000000] = {1};
int main()
{
int m1[1000000];
int n1[1000000] = {1};
return 0;
}
Each int[1000000] takes 4M of space. How large should the .exe file be? The answer is 4M.
m and n (globals): both would be expected to take space in the file but, as the name BSS implies, the uninitialized one (m) is omitted.
m1 and n1 (locals): both are on the stack, allocated when main is called.
Alignment
Consider the following structure:
struct A
{
char c1;
int c2;
char c3[5];
};
sizeof(struct A) = 1 + 4 + 5 = 10?
In my project, it's 16, which speeds up memory access. In a 32-bit environment, the CPU accesses addresses that are multiples of 4 much faster. So the compiler places c2 on a 4-byte boundary by adding 3 padding bytes after c1, and adds 3 more after c3 so that the size of the structure is a multiple of 4: 1 + 3 + 4 + 5 + 3 = 16.
Calling Convention
Some questions about a function call:
- What happens before a function is called?
  - Pass the arguments on the stack or in registers.
  - Push the address of the next instruction (the return address).
  - Jump to the function.
- What does a called function do?
  - Get the arguments (if any).
  - Do the work.
  - Clean up the stack (optional).
  - Pop the saved address and jump to it.
- What happens after a function is called?
  - Clean up the stack (optional).
And what are calling conventions?
- Argument-passing order
- Stack-maintenance responsibility
- Name-decoration convention
- …
Example:
The cdecl calling convention
Argument-passing order: Right to left
Stack-maintenance responsibility: Calling function pops the arguments from the stack
Name-decoration convention: Underscore character (_) is prefixed to names
Apply the calling convention to the following code:
int main()
{
...
f(1, 2);
...
}
_main:
    push 2 onto the stack
    push 1 onto the stack
    push the address of the next instruction onto the stack
    goto _f
_f:
    get the parameters and do the work
    pop the return address and return
_main:
    pop 1 and 2 from the stack
Other Conventions:
stdcall: Win32 API
Stack-maintenance responsibility: Called function pops the arguments from the stack
fastcall: FAST call
Argument-passing order: The first two DWORD or smaller arguments are passed in the ECX and EDX registers; all other arguments are passed right to left.
Stack-maintenance responsibility: Called function pops the arguments from the stack
Variable Arguments
int sum(int num, ...)
{
    /* ??? how to get the ... arguments? */
}
int main()
{
int a = 1, b = 2, c = 3;
int n = sum(3, a, b, c); // n = 6 expected.
}
Cdecl and Stdcall Pass Arguments Right to Left
When sum is called, a, b and c sit on the stack just above num, so:
a = *(&num + 1);
b = *(&num + 2);
c = *(&num + 3);
So we can compute the sum like this:
int sum(int num, ...)
{
int i, retval = 0;
for(i = 1; i <= num; ++i)
retval += (&num)[i];
return retval;
}
Question: should sum() be stdcall?
If sum() were stdcall, it would have to pop the arguments from the stack itself, but how could it know how many arguments main pushed? So we get:
int _cdecl sum(int num, ...);
Question in 88, stage 1:
Print 1 to 1000 without loop code (for, while, goto, jmp)
int N = 1;
void main(int argc, char* argv[])
{
printf("%d\n", N);
*(&argc - 1) -= N++ == 1000 ? 0 : 5;
}
Address      Content
&argv        ""
&argc        0
&argc - 1    Ret addr
...          ...
The call instruction is 5 bytes long, so all we need to do is subtract 5 from the return address. main then returns to the call instruction itself and gets called again.
00412529 call 4114Ech
0041252E ...
longjmp
void f()
{
goto label1;
}
int main()
{
label1:
f();
}
error C2094:
Undefined label "label1"
#include <setjmp.h>
jmp_buf b;
void f()
{
longjmp(b, 1);
}
int main()
{
setjmp(b);
f();
}
This makes f() run repeatedly: longjmp returns control to the setjmp call, after which f() is called again.
Write My longjmp and setjmp
setjmp saves the current environment, while longjmp restores the saved environment. So we need a data structure to hold the "environment".
struct myjmp_buf
{
char stack[1024]; /* save the stack */
int esp; /* save the esp register */
int ebp; /* save the ebp register */
};
And this is our own version of setjmp:
int mysetjmp(struct myjmp_buf* b)
{
static int a1, a2;
_asm
{ /* save the registers */
mov a1, esp;
mov a2, ebp;
}
b->esp = a1;
b->ebp = a2;
memcpy(b->stack, (int*)b->esp, /* save the stack */
b->ebp - b->esp + 2 * sizeof(int));
return 0;
}
Why is (int*)ebp + 1 the boundary of the stack?
When a function is called, two instructions are executed first:
push ebp
mov ebp, esp
First, the old EBP is pushed, so that EBP can then hold the initial value of ESP, which always points to the top of the stack. After this, the stack grows, which means ESP changes. The current frame now runs from ESP up to EBP: the saved EBP is at [EBP] and the return address is at [EBP + 4]. So copying 2 * sizeof(int) bytes beyond EBP also saves the old EBP and the return address.
The following code is our mylongjmp():
int mylongjmp(struct myjmp_buf* b, int n)
{
static int a1, a2, n2;
n2 = n;
a1 = b->esp;
a2 = b->ebp;
memcpy((int*)b->esp, b->stack, /* load stack */
b->ebp - b->esp + 2 * sizeof(int));
_asm /* load registers */
{
mov esp, a1;
mov ebp, a2;
}
return n2;
}
Exception Handling in C
#include <stdio.h>
#include <signal.h>
struct myjmp_buf buf;
void handle_zero(int p)
{
puts("blabla");
mylongjmp(&buf, 1); /* jump and let mysetjmp return 1 */
}
int intdivide(int a, int b)
{
if (!b)
raise(SIGFPE); /* call handle_zero */
return a / b;
}
int main()
{
int a, b;
if (mysetjmp(&buf))
{
puts("Divided By Zero, reinput!");
}
signal(SIGFPE, handle_zero);
scanf("%d %d", &a, &b);
printf("%d", intdivide(a, b));
}
Dessert: Macro and Precompiling
#define
We can use #define and macros to simulate the C++ template in C:
#define DeclMax(ElemType) \
ElemType Max_##ElemType(ElemType a, ElemType b) \
{ \
return a > b ? a : b; \
}
DeclMax(int);
int main()
{
printf("%d", Max_int(1, 2));
system("pause");
}
#pragma
#pragma pack(1) /* this will cancel the alignment */
struct A
{
char c1;
int c2;
char c3[5];
};
int main()
{
printf("%d", (int)sizeof(struct A)); /* output 10 */
system("pause");
}
#pragma warning(disable: 4013) /* disable warning 4013 */
int main()
{
printf("Hello World!");
system("pause");
}