-
Definitions, assumptions and references of segments
-
Definition of segment
-
format
-
```assembly
segment_name segment [align] [combine] ['class']
    statements
segment_name ends
```

- The keyword `segment` marks the beginning of a segment definition, and the keyword `ends` (end of segment) marks its end.
- `segment_name` is the segment name. Every segment must have one.
- `statements` are the assembly language statements inside the segment.
- `align` is the alignment type, one of five keywords: `byte`, `word`, `dword`, `para`, `page`. They specify that the segment starts on a byte, word, doubleword, paragraph, or page boundary. A paragraph is 16 bytes and a page is 256 bytes. If the alignment type is omitted, the default is `para`. It generally does not need to be specified.
- `combine` is the combine type, one of five keywords: `public`, `stack`, `common`, `memory`, `at`.
  - `public` is generally used for code segments and data segments. Segments that have the same name and the combine type `public` are merged into one segment at link time.
  - `stack` is used for the stack segment. When the program is loaded into memory and is ready to run, `ss` is automatically initialized to the segment address of the stack segment and `sp` to the offset address of the last byte of the segment + 1, so the program generally does not need to set them itself. If a stack segment is defined in the program, its combine type should be `stack`.
- `'class'` is a class name. Its content is not fixed and can be chosen freely. It makes the linker place this segment together with other segments that have the same class name. It generally does not need to be specified.

In general, a segment can simply be defined like this:

```assembly
segment_name segment
    statements
segment_name ends
```

A stack segment is generally defined like this:

```assembly
stk segment stack
    db 100h dup(0)   ; define 100h bytes, each with the value 0
stk ends
```
-
-
-
Segment assumptions
-
The assembler directive `assume` establishes the correspondence between segment registers and segments
-
format
-
```assembly
assume segreg:segment_name   ; segreg is one of the segment registers cs, ds, es, ss
```
-
-
Note that the correspondence established by `assume` only tells the assembler which segment register to use when it turns a reference to that segment into an address. It does not assign a value to the segment register, so it cannot guarantee that the register actually holds the corresponding segment address
-
When the program starts running, only `cs` and `ss` are set automatically by the operating system. `ds` and `es` are not equal to the segment addresses named in `assume`, so the program has to load them itself
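
Since `assume` loads nothing, a program that uses a data segment usually sets `ds` itself at entry. The following is a minimal sketch of that pattern, not taken from the original notes; the segment name `data` and the variable `msg_len` are hypothetical names chosen for illustration.

```assembly
data segment
    msg_len db 5              ; hypothetical variable, for illustration only
data ends

code segment
    assume cs:code, ds:data   ; informs the assembler only; no register is loaded
main:
    mov ax,data               ; a segment register cannot take an immediate value,
    mov ds,ax                 ; so the segment address is passed through ax
    mov al,msg_len            ; ds:msg_len now really addresses the variable
    mov ah,4ch                ; DOS function 4Ch: terminate, al is the return code
    int 21h
code ends
end main
```

Without the two `mov` instructions at `main`, `ds` would still hold the value DOS gave it at load time, and `mov al,msg_len` would read from the wrong segment.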
-
-
Reference to segment
-
Reference segment address with segment name
-
Segment names can be used instead of segment addresses
-
Reference segment address with seg operator
-
format
-
```assembly
seg variablename   ; segment address of the segment that contains the variable variablename
seg labelname      ; segment address of the segment that contains the label labelname
```
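
As a small illustration of both ways of referencing a segment address, the sketch below loads `ds` using `seg` instead of the segment name. The names `data` and `var1` are hypothetical and only serve the example.

```assembly
data segment
    var1 db 1                 ; hypothetical variable name
data ends

code segment
    assume cs:code, ds:data
main:
    mov ax,seg var1           ; segment address of the segment containing var1,
    mov ds,ax                 ; the same value as mov ax,data
    mov bx,seg main           ; segment address of the segment containing main (code)
    mov ah,4ch
    int 21h
code ends
end main
```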
-
-
-
-
-
-
Ending a program
-
End of source program
-
The source program ends with the assembler directive `end`
-
Format:
-
```assembly
end labelname
```

- `end` indicates that the source program ends here. `labelname` is the name of a label in the code segment and tells the program to start running from that label.
- If `labelname` is omitted after `end`, the program starts from the first instruction in the code segment.
-
-
-
-
Program segment prefix
-
The PSP (program segment prefix) is a block of memory 100h bytes long
-
When DOS runs an executable program, it first allocates a PSP for the program in memory, then reads the program file and loads it into memory after the PSP. Finally, DOS sets DS and ES to the PSP segment address, sets SS to the stack segment address and SP to the offset of the last byte of the stack segment + 1, sets CS to the program's code segment address and IP to the offset of the label specified by `end` in the source program, and the program starts running at CS:IP
-
| Offset | Length (bytes) | Content |
| --- | --- | --- |
| 0000h | 2 | `int 20h` (0CDh, 20h) |
| 0016h | 2 | PSP segment address of the parent program |
| 002Ch | 2 | Environment block address |
| 0080h | 1 | Command line parameter length |
| 0081h | Variable (up to 7Fh bytes) | Command line parameters |
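
To show how a field in this table might be read, here is a hedged sketch that returns the command line length as the program's exit code. It assumes the program is entered with `ds` still equal to the PSP segment address, as described above, and it uses the DOS function 4Ch call covered in the next section.

```assembly
code segment
    assume cs:code
main:
    ; at entry ds = es = PSP segment address, so ds:[80h] is the length byte
    mov bx,80h
    mov al,[bx]               ; al = number of characters on the command line
    mov ah,4ch                ; terminate and hand al to the parent as the return code
    int 21h
code ends
end main
```

Reading the PSP has to happen before `ds` is reloaded with a data segment address; otherwise the PSP segment address must be saved first.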
-
-
Program termination
-
The `end` in an assembly source program only marks the end of the source text. It is just an assembler directive. After the program is assembled, `end` disappears and is not converted into any machine code
-
To actually terminate the program, the usual way is DOS function 4Ch
-
```assembly
mov ah,4ch
mov al,return_code
int 21h
; the return code in al passes this program's result to the parent program
; (the caller of the currently running program)
```
-
Besides DOS function 4Ch, the `int 20h` interrupt and DOS function 00h can also terminate the program. However, both of these require the code segment register CS to equal the PSP segment address. If CS does not equal the PSP segment address when they are called, the machine will crash
-
There is another method of termination
-
```assembly
code segment
    assume cs:code
    push ds        ; at program start ds holds the PSP segment address, so this saves it on the stack
    mov ax,0
    push ax
    ....
    retf           ; pops 0 into ip and the saved ds (the PSP segment address) into cs,
                   ; which is equivalent to jmp psp:0000; an int 20h instruction is stored
                   ; there, so the program terminates
code ends
```
-
-
-
-
Assembly language statement
-
Format of Assembly statement
-
```assembly
name mnemonic operand ;comment
```

- `name` is the name field. It is mainly a variable name or a label name, and can also be a segment name, a procedure name, and so on. The name field is optional, and most statements do not need it.
- `mnemonic` is mainly an 8086 instruction (e.g. `mov`, `add`, `jmp`). It can also be an assembler directive (`assume`, `end`, `segment`) or a pseudo-instruction (`db`, `dw`).
- `operand` is the operand, that is, the argument of the mnemonic.
- `comment` is a comment and always starts with a semicolon.
-
-
Constants and constant expressions
-
The constants supported by assembly include integer constants, character constants and string constants
-
Integer constant
-
Integer constants can be 8-bit, 16-bit, or 32-bit, and can be positive, negative, or unsigned. They can be written in decimal, binary, octal, or hexadecimal
-
```assembly
10, -10      ; decimal, no suffix
1011B        ; binary, suffix B
177Q         ; octal, suffix Q
3Fh, 0FFh    ; hexadecimal, suffix h (a leading 0 is needed when the first digit is a letter)
```
-
-
Character constant
- A single character enclosed in single or double quotation marks, such as `'a'` or `"a"`. Its numeric value is the ASCII code of the character
-
String constant
- A string of characters enclosed in single or double quotation marks, such as `'ABC'` or `"ABC"`
- Note that a string constant in assembly contains only the characters inside the quotation marks. No `\0` is appended to end the string
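
Because no terminator is added automatically, any terminator a routine expects has to be written into the data. As a sketch under that assumption, DOS function 09h prints the string at `ds:dx` up to a `'$'` character, so the `'$'` is stored explicitly. The names `data` and `msg` are hypothetical.

```assembly
data segment
    msg db 'ABC',0Dh,0Ah,'$'  ; text, carriage return, line feed, explicit '$' terminator
data ends

code segment
    assume cs:code, ds:data
main:
    mov ax,data
    mov ds,ax
    mov dx,offset msg
    mov ah,9                  ; DOS function 09h: print the '$'-terminated string at ds:dx
    int 21h
    mov ah,4ch
    mov al,0
    int 21h
code ends
end main
```

If the `'$'` were left out, function 09h would keep printing whatever bytes follow `msg` in memory until it happened to meet a `'$'`.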
-
Constant expression
-
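Operators supported by the assembler
-
| Operator | Format | Meaning |
| --- | --- | --- |
| `+` | `+expression` | Positive |
| `-` | `-expression` | Negative |
| `+` | `expression + expression` | Addition |
| `-` | `expression - expression` | Subtraction |
| `*` | `expression * expression` | Multiplication |
| `/` | `expression / expression` | Division |
| `MOD` | `expression MOD expression` | Remainder |
| `SHR` | `expression1 SHR expression2` | Shift expression1 right by expression2 bits |
| `SHL` | `expression1 SHL expression2` | Shift expression1 left by expression2 bits |
| `NOT` | `NOT expression` | Bitwise NOT |
| `AND` | `expression AND expression` | Bitwise AND |
| `OR` | `expression OR expression` | Bitwise OR |
| `XOR` | `expression XOR expression` | Bitwise XOR |
| `SEG` | `SEG variable name or label name` | Segment address |
| `OFFSET` | `OFFSET variable name or label name` | Offset address |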
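
Constant expressions are evaluated by the assembler, not at run time, so each one ends up in the program as a plain immediate value. The following sketch illustrates a few of the operators; the values are arbitrary examples.

```assembly
code segment
    assume cs:code
main:
    mov ax,(3+5)*2            ; assembled as mov ax,16
    mov bx,17 mod 5           ; assembled as mov bx,2
    mov cx,1 shl 4            ; assembled as mov cx,16
    mov dx,0F0h and 3Ch       ; assembled as mov dx,30h
    mov ah,4ch
    int 21h
code ends
end main
```

Here `and`, `shl`, and `mod` are operators inside the operand, which is different from the 8086 instructions `and` and `shl` that work on registers at run time.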
-
-
Symbolic Constant
-
A symbolic constant is a constant written in symbolic form. It is defined with `equ` or `=`
-
```assembly
symbol equ expression
symbol = expression
; symbol is the symbol name, expression is an expression
```

- The operand of `=` can only be a constant or a constant expression of numeric or character type. The same symbol may be redefined with `=`.
- The operand of `equ` can be a constant or a constant expression of numeric or character type, and can also be a string or even an assembly statement, but the same symbol cannot be redefined.

```assembly
char = 'A'
exitfun equ <mov ah,4ch>
dosint equ <int 21h>
code segment
    assume cs:code
main:
    mov ah,2
    mov dl,char     ; equivalent to mov dl,'A'
    dosint          ; int 21h
char = 'B'          ; redefine
    mov ah,2
    mov dl,char     ; mov dl,'B'
    dosint
    exitfun         ; mov ah,4ch
    dosint
code ends
end main
```
-
-
-
Variables and labels
-
Variable name and label name
- A variable name cannot start with a digit, and `$` and `?` cannot be used alone as a variable name. A variable name can contain at most 31 characters. By default, variable names are not case-sensitive. A variable name cannot be defined more than once, cannot be a keyword, and cannot contain spaces
-
Definition of variables
-
```assembly
variable_name db|dw|dd initial_value
; db (define byte)
; dw (define word)
; dd (define double word)

; examples (each group stands on its own, so some names are reused):
x db 3Fh
y db 1,2,3                      ; like defining an array y[3]
z db 'ABC',0Dh,0Ah,'$'          ; z[6]
abc dw 1234h,5678h              ; abc[2]

; the dup operator repeats the same initial value
abc db 100 dup(0)               ; abc[100], each element initialized to 0

; the parentheses after dup can hold several values
x db 3 dup(1,2)                 ; byte array x with 3*2 elements: 1,2,1,2,1,2

; dup can also be nested
y db 2 dup('A',3 dup('B'),'C')  ; equivalent to 'ABBBCABBBC'
```
-
-
Definition of label
-
A label is used as the target of a jump (the `jmp` family) or a `call`
-
```assembly
label_name:                     ; the simplest form, equivalent to the near form below
label_name label near|far|byte|word|dword
; near holds only the offset address; far holds the segment address and the offset address
; with byte, word, or dword, what is actually defined is a variable

abc label byte
    db 7Fh
; the first line declares abc as a byte variable without allocating memory,
; so the address of abc equals the address of the data that follows;
; this is equivalent to: abc db 7Fh

; this form lets the same data be accessed both as bytes and as words
www label word
abc db 12h,34h                  ; www is 3412h

bbb label byte
xyz dw 5678h                    ; bbb is 78h
```
-
-
Variable reference
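- In the code segment, `var` or `[var]` can be used as the operand of an 8086 instruction to represent the value of the variable
-
Variable cast
-
The type of an operand can be overridden with a `ptr` operator, as in `word ptr [var]`. The available forms are `byte ptr`, `word ptr`, `dword ptr`, `near ptr`, and `far ptr`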
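
As an illustration of when a cast is needed, the sketch below reads the two bytes of a word variable separately and then stores through a register, where the operand size cannot be inferred. The names `data` and `wval` are hypothetical.

```assembly
data segment
    wval dw 1234h             ; hypothetical word variable, stored as 34h, 12h
data ends

code segment
    assume cs:code, ds:data
main:
    mov ax,data
    mov ds,ax
    mov al,byte ptr [wval]    ; al = 34h, the low byte (the 8086 is little-endian)
    mov ah,byte ptr [wval+1]  ; ah = 12h, the high byte
    mov bx,offset wval
    mov word ptr [bx],0       ; [bx] alone has no size, so word ptr is required
    mov ah,4ch
    int 21h
code ends
end main
```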
-
-
Position counter
-
While assembling the source program, the assembler uses a variable called the position counter (often called the location counter) to record the current offset address within the current segment. When a segment definition starts, the position counter is set to 0. After each statement is assembled, the position counter advances by the number of bytes that statement occupies. `$` gives the current value of the position counter. With it, you can, for example, get the length of a string
-
```assembly
data segment
    poem db "abcsdefsdq"
    len db $-offset poem   ; $ is the offset just past poem here, so len is the length of poem
data ends
code segment
    assume cs:code
main:
    mov ah,4ch
    int 21h
code ends
end main
```
-
-
-
- As expected, `len` is 10, the number of characters in the string
- To set or modify the value of the position counter, use `org`

```assembly
data segment
    org 1000h          ; set $ to 1000h, so the offset address of abc is 1000h, not 0000h
    abc db 12h,34h
    org $+100h
    xyz dw 5678h
data ends
code segment
    assume cs:code
main:
    mov ah,4ch
    int 21h
code ends
end main
```
-
The data that would otherwise have started at offset 0 of the data segment is absent, and that region is filled with 0
The bytes 12h and 34h appear at offset 1000h. This shows that `$` behaves a bit like a stack pointer: it holds the offset address where the next item will be placed. Once it is modified, the data is no longer laid out contiguously in definition order. We can also work out the offset address of `xyz`: 1000h + 2 + 100h = 1102h
So unless it is needed, do not change the value of `$`. Note also that the `org` directive itself occupies no memory. It only takes effect while the program is being assembled
- Reference to label
- If lab is a label name, lab or offset lab can represent the offset address of the label
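
A minimal sketch of both uses of a label follows: `offset done` yields the label's offset address as a number, and that number can then serve as a jump target. The label name `done` is hypothetical.

```assembly
code segment
    assume cs:code
main:
    mov al,0
    mov bx,offset done        ; bx = offset address of the label done
    jmp bx                    ; indirect near jump, same destination as jmp done
    mov al,1                  ; skipped by the jump
done:
    mov ah,4ch                ; terminate with return code 0
    int 21h
code ends
end main
```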