Linking to C/C++ in Protected Mode
Programs written for x86 processors running in Protected mode can sometimes have bottlenecks
that must be optimized for runtime efficiency. If they are embedded systems, they may have
stringent memory size limitations. With such goals in mind, we will show how to write external
procedures in assembly language that can be called from C and C++ programs running in Protected mode. Such programs consist of at least two modules: The first, written in assembly
language, contains the external procedure; the second module contains the C/C++ code that
starts and ends the program. There are a few specific requirements and features of C/C++ that
affect the way you write assembly code.
Arguments Arguments are passed by C/C++ programs from right to left, as they appear in
the argument list. After the procedure returns, the calling program is responsible for cleaning up
the stack. This can be done by either adding a value to the stack pointer equal to the size of the
arguments or popping an adequate number of values from the stack.
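The caller's side of this convention can be modeled in a few lines of C++. What follows is only a toy sketch: ToyStack and callWithThreeArgs are hypothetical names introduced here, a std::vector stands in for the runtime stack, and the return address pushed by the CALL instruction is left out. It shows the arguments being pushed from right to left, so that the first argument ends up nearest the top of the stack, and the caller discarding them afterward.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Toy model of a 32-bit runtime stack (hypothetical; each slot is one doubleword).
struct ToyStack {
    std::vector<std::uint32_t> slots;

    void push(std::uint32_t value) { slots.push_back(value); }

    // Caller cleanup: drop n doublewords, comparable to "add esp, 4*n".
    void discard(std::size_t n) { slots.resize(slots.size() - n); }

    // Read the k-th doubleword below the top (0 = most recently pushed).
    std::uint32_t fromTop(std::size_t k) const {
        return slots[slots.size() - 1 - k];
    }
};

// Models a caller invoking some procedure f(a, b, c) under the C convention.
// Returns the stack depth after the caller has cleaned up.
std::size_t callWithThreeArgs(ToyStack& stack, std::uint32_t a,
                              std::uint32_t b, std::uint32_t c,
                              std::uint32_t& firstArgSeen) {
    stack.push(c);                      // last argument is pushed first
    stack.push(b);
    stack.push(a);                      // first argument is pushed last
    firstArgSeen = stack.fromTop(0);    // what the callee finds nearest the top
    stack.discard(3);                   // caller removes the three arguments
    return stack.slots.size();
}
```

Calling callWithThreeArgs on an empty ToyStack leaves the stack empty again, which is the point of caller cleanup: the stack pointer ends up where it was before the arguments were pushed.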
External Identifiers In the assembly language source, specify the C calling convention in
the .MODEL directive and create a prototype for each procedure called from an external C/C++
program:
.586
.model flat,C
AsmFindArray PROTO,
srchVal:DWORD, arrayPtr:PTR DWORD, count:DWORD
Declaring the Function In a C program, use the extern qualifier when declaring an external
assembly language procedure. For example, this is how to declare AsmFindArray:
extern bool AsmFindArray( long n, long array[], long count );
If the procedure will be called from a C++ program, add a “C” qualifier to prevent C++ name
decoration:
extern "C" bool AsmFindArray( long n, long array[], long count );
Name decoration is a standard C++ compiler technique that involves modifying a function name
with extra characters that indicate the exact type of each function parameter. It is required in any
language that supports function overloading (two functions having the same name, with different
parameter lists). From the assembly language programmer’s point of view, the problem with
name decoration is that the C++ compiler tells the linker to look for the decorated name rather
than the original one when producing the executable file.
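The difference between a decorated and an undecorated name can be sketched in C++ alone. In the fragment below, the two Largest overloads are hypothetical functions added for illustration; the compiler must give each a distinct decorated name so the linker can tell them apart. AsmFindArray, by contrast, is declared extern "C", so its name is not decorated with parameter types. Its body here is only a C++ stand-in, assumed so that the fragment compiles on its own; in the arrangement the text describes, the body would live in the assembly language module and only the declaration would appear in the C++ source.

```cpp
#include <cassert>

// Two overloads with the same name: the C++ compiler emits a different
// decorated name for each, based on the parameter types.
long Largest(long a, long b) { return a > b ? a : b; }
double Largest(double a, double b) { return a > b ? a : b; }

// extern "C" suppresses C++ name decoration, so the linker looks for a
// C-style name that an assembly module can supply.
// NOTE: this body is a hypothetical stand-in for the assembly procedure.
extern "C" bool AsmFindArray(long n, long array[], long count) {
    for (long i = 0; i < count; i++) {
        if (array[i] == n)
            return true;
    }
    return false;
}

// Hypothetical check routine exercising both kinds of function.
void checkLinkage() {
    long values[] = {1, 2, 3};
    assert(AsmFindArray(2, values, 3));
    assert(Largest(2L, 3L) == 3L);      // resolves to the long overload
    assert(Largest(1.5, 2.5) == 2.5);   // resolves to the double overload
}
```

Because an extern "C" name carries no parameter information, such a function cannot be overloaded; that is the trade-off for having a name the assembly module can match.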
13.3.1 Using Assembly Language to Optimize C++ Code
One of the ways you can use assembly language to optimize programs written in other languages
is to look for speed bottlenecks. Loops are good candidates for optimization because any
extra statements in a loop may be repeated enough times to have a noticeable effect on your program’s
performance.
Most C/C++ compilers have a command-line option that automatically generates an assembly
language listing of the C/C++ program. In Microsoft Visual C++, for example, the listing file can
contain any combination of C++ source code, assembly code, and machine code, shown by the
options in Table 13-2. Perhaps the most useful is /FAs, which shows how C++ statements are
translated into assembly language.
Table 13-2 Visual C++ Command-Line Options for ASM Code Generation.
Command Line | Contents of Listing File
/FA | Assembly-only listing
/FAc | Assembly with machine code
/FAs | Assembly with source code
/FAcs | Assembly, machine code, and source
FindArray Example
Let’s create a program that shows how a sample C++ compiler generates code for a function
named FindArray. Later, we will write an assembly language version of the function, attempting
to write more efficient code than the C++ compiler. The following FindArray function (in C++)
searches for a single value in an array of long integers:
bool FindArray( long searchVal, long array[], long count )
{
    for(int i = 0; i < count; i++)
    {
        if( array[i] == searchVal )
            return true;
    }
    return false;
}
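Before looking at the generated code, it can help to confirm what the function is expected to return. The following sketch repeats FindArray as given in the text and adds a small check routine; checkFindArray and the sample values are assumptions made here for illustration, not part of the original program.

```cpp
#include <cassert>

// FindArray as given in the text.
bool FindArray(long searchVal, long array[], long count) {
    for (int i = 0; i < count; i++) {
        if (array[i] == searchVal)
            return true;
    }
    return false;
}

// Hypothetical check routine; the sample values are arbitrary.
void checkFindArray() {
    long sample[] = {10, 20, 30, 40};
    const long count = sizeof(sample) / sizeof(sample[0]);

    assert(FindArray(30, sample, count));    // value present
    assert(!FindArray(35, sample, count));   // value absent
    assert(!FindArray(10, sample, 0));       // count of 0: loop body never runs
}
```

The same checks could later be pointed at an assembly language version with the same parameter list, which gives a quick way to confirm that a hand-written replacement behaves identically.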
FindArray Code Generated by Visual C++
Let’s look at the assembly language source code generated by Visual C++ for the FindArray
function, alongside the function’s C++ source code. This procedure was compiled to a Release
target with no code optimization in effect:
; Listing generated by Microsoft (R) Optimizing Compiler Version 15.00.30729.01

    TITLE   c:\Users\Stud\Documents\Visual Studio 2008\Projects\FindArray\FindArray\FindArray.cpp
    .686P
    .XMM
    include listing.inc
    .model  flat

INCLUDELIB MSVCRTD
INCLUDELIB OLDNAMES

PUBLIC  ?FindArray@@YA_NJQAJJ@Z             ; FindArray
EXTRN   __RTC_Shutdown:PROC
EXTRN   __RTC_InitBase:PROC
;   COMDAT rtc$TMZ
; File c:\users\stud\documents\visual studio 2008\projects\findarray\findarray\findarray.cpp
rtc$TMZ SEGMENT
__RTC_Shutdown.rtc$TMZ DD FLAT:__RTC_Shutdown
rtc$TMZ ENDS
;   COMDAT rtc$IMZ
rtc$IMZ SEGMENT
__RTC_InitBase.rtc$IMZ DD FLAT:__RTC_InitBase
; Function compile flags: /Odtp /RTCsu /ZI
rtc$IMZ ENDS
;   COMDAT ?FindArray@@YA_NJQAJJ@Z
_TEXT   SEGMENT
_i$5245 = -8                                ; size = 4
_searchVal$ = 8                             ; size = 4
_array$ = 12                                ; size = 4
_count$ = 16                                ; size = 4
?FindArray@@YA_NJQAJJ@Z PROC                ; FindArray, COMDAT
; Line 7
    push    ebp
    mov     ebp, esp
    sub     esp, 204                        ; 000000ccH
    push    ebx
    push    esi
    push    edi
    lea     edi, DWORD PTR [ebp-204]
    mov     ecx, 51                         ; 00000033H
    mov     eax, -858993460                 ; ccccccccH
    rep stosd
; Line 8
    mov     DWORD PTR _i$5245[ebp], 0
    jmp     SHORT $LN4@FindArray
$LN3@FindArray:
    mov     eax, DWORD PTR _i$5245[ebp]
    add     eax, 1
    mov     DWORD PTR _i$5245[ebp], eax
$LN4@FindArray:
    mov     eax, DWORD PTR _i$5245[ebp]
    cmp     eax, DWORD PTR _count$[ebp]
    jge     SHORT $LN2@FindArray
; Line 10
    mov     eax, DWORD PTR _i$5245[ebp]
    mov     ecx, DWORD PTR _array$[ebp]
    mov     edx, DWORD PTR [ecx+eax*4]
    cmp     edx, DWORD PTR _searchVal$[ebp]
    jne     SHORT $LN1@FindArray
; Line 11
    mov     al, 1
    jmp     SHORT $LN5@FindArray
$LN1@FindArray:
; Line 12
    jmp     SHORT $LN3@FindArray
$LN2@FindArray:
; Line 13
    xor     al, al
$LN5@FindArray:
; Line 14
    pop     edi
    pop     esi
    pop     ebx
    mov     esp, ebp
    pop     ebp
    ret     0
?FindArray@@YA_NJQAJJ@Z ENDP                ; FindArray
_TEXT   ENDS
PUBLIC  _wmain
; Function compile flags: /Odtp /RTCsu /ZI
;   COMDAT _wmain
_TEXT   SEGMENT
_argc$ = 8                                  ; size = 4
_argv$ = 12                                 ; size = 4
_wmain  PROC                                ; COMDAT
; Line 16
    push    ebp
    mov     ebp, esp
    sub     esp, 192                        ; 000000c0H
    push    ebx
    push    esi
    push    edi
    lea     edi, DWORD PTR [ebp-192]
    mov     ecx, 48                         ; 00000030H
    mov     eax, -858993460                 ; ccccccccH
    rep stosd
; Line 18
    xor     eax, eax
; Line 19
    pop     edi
    pop     esi
    pop     ebx
    mov     esp, ebp
    pop     ebp
    ret     0
_wmain  ENDP
_TEXT   ENDS
END
To produce such a listing, specify the /FA switch for the cl compiler. The value of the switch determines whether the listing contains assembly code only or also includes the source code, the machine code, or both, and it determines the extension of the listing file (.asm or .cod). Here are the supported values:
- /FA Assembly code; .asm
- /FAc Machine and assembly code; .cod
- /FAs Source and assembly code; .asm
- /FAcs Machine, source, and assembly code; .cod
In Visual Studio, the same setting is reached through the project properties: click "Project" -> double-click "Configuration Properties" -> choose "C/C++" -> click "Output Files" ->
Three 32-bit arguments were pushed on the stack in the following order: count, array, and
searchVal. Of these three, array is the only one passed by reference because in C/C++, an array
name is an implicit pointer to the array’s first element. The procedure saves EBP on the stack and
creates space for its local variables, including i (Figure 13–1).
Inside the procedure, the compiler reserves local stack space by subtracting 204 from ESP
(line 7); the variable i is located at [ebp-8]. Because run-time checks are enabled (/RTCsu),
the reserved space is then filled with the value CCCCCCCCh. The same storage is released at the
end when EBP is copied back into ESP (line 14).
There are 14 instructions between the labels $LN3@FindArray and $LN2@FindArray, which
constitute the main body of the loop. We can easily write an assembly language procedure that is
more efficient than the code shown here.
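One way to picture the kind of tightening a hand-written procedure can aim for is to express it in C++ first. The sketch below is an assumption about what a leaner loop might look like, not the procedure the text goes on to develop. FindArrayTight is a hypothetical name; the idea is to hold the search value and a running pointer in locals, which can be kept in registers, so that each pass performs one load, one compare, and one pointer advance instead of reloading i, array, and searchVal from the stack as the unoptimized listing does.

```cpp
// Hypothetical leaner variant of FindArray (illustrative sketch only).
bool FindArrayTight(long searchVal, const long* array, long count) {
    if (count <= 0)
        return false;                   // nothing to search

    const long* end = array + count;    // one past the last element
    for (const long* p = array; p != end; ++p) {
        if (*p == searchVal)            // single load and compare per pass
            return true;
    }
    return false;
}
```

Whether this form actually runs faster depends on the compiler and its optimization settings; with optimization enabled, a compiler may well generate similar code from the original FindArray.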