An in-depth understanding of the C language

xiaoxiao2021-03-06  22

Code generated from C is generally more efficient than code generated from other high-level languages. Let's take a look at what the code generated by a C compiler actually looks like. After reading this article, you should understand C one step further.

This article explains the C language through actual example programs.

Research case one

Tools: Turbo C v2.0, DEBUG, MASM v5.0, NASM

Example C program:

/* example1.c */
char ch;
int e_main()
{
    e_putchar(ch);
}

Goal: how C calls a function, and the details of the call

The C compiler we use is the 16-bit Turbo C v2.0. It generates 16-bit code, which is relatively simple and easy to study. We also need DEBUG under DOS for disassembly. Because our programs are in many cases not complete C programs, TLINK from Turbo C will not generate the target program for us, so I use LINK.EXE from MASM instead; EXE2BIN can then convert the EXE file into a BIN file for us.

This program has no main function; we use e_main in its place. This way we avoid the series of special processing that C performs for the main function. Similarly, we use e_putchar() in place of the usual putchar(). Here the "e" stands for "example".

Without a main function, our C program has no entry point, so before compiling the C code I have to write a few lines of assembly code to serve as the program's entry point.

Entry point of the C program, start.asm:

[bits 16]
[global start]
[extern _e_main]
start:
    call _e_main

By convention, the C compiler automatically prepends an underscore "_" to every C symbol name. So to call the e_main function from assembly, we must refer to it as _e_main. This assembly code contains only one instruction, call _e_main, which calls the e_main function in our C program.
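The naming rule can be illustrated in C itself. The sketch below only illustrates the underscore convention described above; decorate_c_name is a hypothetical helper, not part of Turbo C or NASM, and other compilers or calling conventions may decorate names differently.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: build the link-level name of a C identifier
 * by prepending '_', e.g. "e_main" -> "_e_main".
 * Returns 0 on success, -1 if `out` (capacity `cap` bytes) is too small. */
int decorate_c_name(const char *c_name, char *out, size_t cap)
{
    size_t len = strlen(c_name);

    if (cap < len + 2) {                /* '_' + name + '\0' */
        return -1;
    }
    out[0] = '_';
    memcpy(out + 1, c_name, len + 1);   /* copies the terminator too */
    return 0;
}
```

For example, calling decorate_c_name("e_main", buf, sizeof buf) fills buf with "_e_main", the name that start.asm declares as extern.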

I assemble this code with NASM to generate start.obj:

nasmw -f obj -o start.obj start.asm

Next we compile the C code with Turbo C, link the two object files, and convert the result:

tcc -mt -oexample1.obj -c example1.c
link start.obj example1.obj, example1.exe,,,
exe2bin example1.exe

In this way we obtain the machine code file (example1.bin) compiled from this C code.

Now we use DEBUG to disassemble example1.bin.

debug
-n example1.bin
-l 0
-u 0
XXXX:0000 CALL 0003
XXXX:0003 MOV AX,000B
XXXX:0006 PUSH AX
XXXX:0007 CALL 0020
XXXX:000A POP CX

This listing is the code generated for our entire program.

The first instruction, CALL 0003, is the code generated from start.asm, which we assembled with NASM.

Our main goal is to study the code generated from C. The first instruction, generated from start.asm, is trivial: it just calls the e_main function. Our e_main function is the remaining four instructions, starting at offset 0003.

In the C source, e_main does just one thing: it calls e_putchar(ch), where ch is the parameter passed to e_putchar.

MOV AX, 000B

000B is the address of our global variable ch. C places all global variables in a separate memory area. The C code first loads this address into AX, and then

PUSH AX

pushes the value of AX, that is, the address of ch, onto the stack. And then

CALL 0020

0020 is the address of the e_putchar code. Through this instruction, the computer jumps to the code of e_putchar. I don't give the code of e_putchar here, because this case is only about how C passes parameters to a function, regardless of how e_putchar uses them. In the next case we will study how a function retrieves its parameters. I do have to explain the CALL instruction here, because otherwise the next case may be confusing. The CALL XXXX instruction is simply equivalent to

Push IP

JMP XXXX

It first pushes the current execution address IP onto the stack, then jumps to the address being called. CALL and RET are used as a pair. The RET instruction is equivalent to

POP IP

That is, it restores the execution address IP that was saved before the CALL.

Because of this, each time a CALL instruction executes, the stack pointer SP is automatically decreased (by 2 for a near call in 16-bit code), and RET increases it again.
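To make the effect of CALL and RET on IP and SP concrete, here is a toy model in C. It assumes a near call in 16-bit mode (a 2-byte return address) and ignores segments; Cpu16, cpu_call and cpu_ret are hypothetical names made up for this sketch.

```c
#include <stdint.h>

/* Toy model of the registers and memory involved in CALL/RET.
 * All names here are hypothetical. */
typedef struct {
    uint16_t ip;            /* address of the next instruction */
    uint16_t sp;            /* stack pointer; the stack grows downward */
    uint8_t  mem[0x10000];  /* 64 KB of flat memory */
} Cpu16;

/* Store a 16-bit value on the stack, low byte first (little-endian). */
static void push16(Cpu16 *cpu, uint16_t value)
{
    cpu->sp = (uint16_t)(cpu->sp - 2);
    cpu->mem[cpu->sp] = (uint8_t)(value & 0xFF);
    cpu->mem[(uint16_t)(cpu->sp + 1)] = (uint8_t)(value >> 8);
}

/* Load a 16-bit value from the top of the stack and release it. */
static uint16_t pop16(Cpu16 *cpu)
{
    uint16_t lo = cpu->mem[cpu->sp];
    uint16_t hi = cpu->mem[(uint16_t)(cpu->sp + 1)];

    cpu->sp = (uint16_t)(cpu->sp + 2);
    return (uint16_t)(lo | (hi << 8));
}

/* Near CALL: ip must already point at the instruction after the CALL. */
void cpu_call(Cpu16 *cpu, uint16_t target)
{
    push16(cpu, cpu->ip);   /* "push ip" */
    cpu->ip = target;       /* "jmp target" */
}

/* Near RET: "pop ip". */
void cpu_ret(Cpu16 *cpu)
{
    cpu->ip = pop16(cpu);
}
```

Starting with SP = FFFE and IP = 000A (the instruction after CALL 0020), cpu_call(&cpu, 0x0020) leaves SP = FFFC and IP = 0020; cpu_ret(&cpu) then restores both.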

POP CX

This is an essential step after every function call returns. The value popped into CX is not used; its only role is to balance the PUSH AX that precedes CALL 0020, so that the stack pointer SP returns to its original value.

OK, the simple first case study is over. Although it is only four instructions long, it shows how C passes a parameter. In summary:

"MOV AX, parameter address" loads the address of the parameter into AX, then "PUSH AX" pushes it onto the stack. Next, "CALL function address" transfers control to the function being called. Finally, "POP CX" restores the stack pointer SP.

Research case two

Tools: Turbo C v2.0, DEBUG, MASM v5.0, NASM, TASM

Example C program:

/* example1.c */
char ch;
extern void e_putchar(char c);
int e_main()
{
    ch = 0x44;
    e_putchar(ch);
}

Example assembly program:

eio.asm

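_TEXT   segment byte public 'CODE'
DGROUP  group   _TEXT
        assume  cs:_TEXT, ds:DGROUP, ss:DGROUP
        public  _e_putchar          ; name must match e_putchar in the C source
_e_putchar proc near
        push    bp                  ; save the caller's BP
        mov     bp, sp              ; BP now addresses the stack frame
        mov     ah, 0Eh             ; BIOS teletype output
        mov     bx, 7h              ; BH = page 0, BL = color 7
        mov     al, byte ptr [bp+4] ; first parameter
        int     10h                 ; BIOS video service
        pop     bp                  ; restore the caller's BP
        ret
_e_putchar endp
_TEXT   ends
        end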

Goal: how a C function accesses its parameters

In this section we use TASM to write a standard C function in assembly. You may have seen this material in many assembly books, where they discuss how to link C with assembly language. You may find it strange that, with two assemblers already at hand (MASM and NASM), we use yet another one, TASM. I don't know whether MASM can work together with Turbo C, but TASM definitely can. After all, both are Borland products, and the assembly code that Turbo C generates follows TASM syntax. This alone shows how closely Turbo C and TASM fit together.

In this case we mainly study not the C code but the C function written in assembly.

        push    bp
        mov     bp, sp
        mov     ah, 0Eh
        mov     bx, 7h
        mov     al, byte ptr [bp+4]
        int     10h
        pop     bp
        ret

Here byte ptr [bp+4] is the parameter value we passed to e_putchar().

In the previous case we saw that C passes parameters to a function by pushing them onto the stack. So a standard C function reads its parameters back from the stack.

A standard C function always begins with these two lines:

        push    bp
        mov     bp, sp

First we save the value of BP, then copy the current stack pointer into BP; all access to the parameters passed to the function goes through BP. The first parameter value is at address BP+4, the second at BP+6, and so on. [bp+2] holds the value of IP from before the call, because when call executes, the CPU automatically pushes the current IP onto the stack, as explained in the previous case; [bp] itself holds the saved BP. Although this function is written in assembly language, it is a complete C function.
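The frame layout described above can be checked with a small simulation in C. This is a sketch under stated assumptions: a near call, 2-byte parameters, and hypothetical names (Frame16, enter_two_arg_function, read16). It models only the stack, not a real 8086.

```c
#include <stdint.h>

enum { STACK_BYTES = 128 };

/* Toy 16-bit stack: sp and bp are byte offsets into mem.
 * All names here are hypothetical. */
typedef struct {
    uint16_t sp;
    uint16_t bp;
    uint8_t  mem[STACK_BYTES];
} Frame16;

/* Push a 16-bit value, low byte first; no bounds checking in this sketch. */
static void push16(Frame16 *m, uint16_t value)
{
    m->sp = (uint16_t)(m->sp - 2);
    m->mem[m->sp] = (uint8_t)(value & 0xFF);
    m->mem[m->sp + 1] = (uint8_t)(value >> 8);
}

/* Read the 16-bit word stored at byte offset addr. */
uint16_t read16(const Frame16 *m, uint16_t addr)
{
    return (uint16_t)(m->mem[addr] | (m->mem[addr + 1] << 8));
}

/* Replay what happens up to the end of the function prologue. */
void enter_two_arg_function(Frame16 *m, uint16_t arg1, uint16_t arg2,
                            uint16_t return_ip)
{
    push16(m, arg2);        /* caller: parameters pushed right to left */
    push16(m, arg1);
    push16(m, return_ip);   /* call: pushes the return IP */
    push16(m, m->bp);       /* callee: push bp */
    m->bp = m->sp;          /* callee: mov bp, sp */
}
```

After enter_two_arg_function(&m, 0x0044, 0x0055, 0x000A), read16(&m, m.bp) returns the saved BP, read16(&m, m.bp + 2) the return IP, read16(&m, m.bp + 4) the first parameter, and read16(&m, m.bp + 6) the second.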

OK, let's build it.

tasm eio.asm eio.obj
tcc -mt -oexample1.obj -c example1.c
link start.obj eio.obj example1.obj, example1.exe,,,
exe2bin example1.exe

Ok, it's time to summarize.

A C function accesses its parameters by first saving BP with "push bp", then copying the current stack pointer into BP with "mov bp, sp". The first parameter is at address BP+4, the second at BP+6, and so on; for example, "mov ax, word ptr [bp+4]" loads the first parameter value into AX. Note that C/C++ pushes parameters in reverse order: from right to left, so the further right a parameter is in the argument list, the higher its address on the stack.
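At the C source level, this mechanism shows up as pass-by-value: the callee works on its own copy of each parameter in its stack frame, and it can reach a caller's variable only when an address is passed explicitly. A minimal sketch, with hypothetical function names chosen for illustration:

```c
/* Hypothetical callee: modifies only its own copy of the parameter. */
int bump(int c)
{
    c = c + 1;          /* changes the copy in this function's frame */
    return c;
}

/* Hypothetical callee taking an address: the pointer itself is passed
 * by value, but it lets the callee reach the caller's variable. */
void bump_in_place(int *p)
{
    *p = *p + 1;
}
```

With int ch = 0x44, the call bump(ch) returns 0x45 and leaves ch unchanged, while bump_in_place(&ch) changes ch itself to 0x45.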

When reposting, please credit the original address: https://www.9cbs.com/read-65543.html
