  • Number of slides: 127

CS 499 Intel Architecture

Intel Architecture References
IA-32 Intel® Architecture Software Developer’s Manual
  • Volume 1: Basic Architecture
  • Volume 2: Instruction Set Reference
www.intel.com/design/pentiumii/manuals/

Number Systems
Decimal-to-Hexadecimal: 420.625₁₀ = 420₁₀ + 0.625₁₀

Division        Quotient   Remainder
420 ÷ 16        26         4 (LSB)
26 ÷ 16         1          10 (or A)
1 ÷ 16          0          1 (MSB)

Multiplication  Product    Carry out
0.625 x 16      10.00      10 (or A)

420.625₁₀ = 1A4.A₁₆
4135₁₀ = 1027₁₆
625.625₁₀ = 271.A₁₆
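The division and multiplication steps above can be sketched in Python. This is only an illustration: the function names int_to_hex and frac_to_hex are made up for this example, and the fraction is passed as an exact integer ratio (625/1000) to avoid floating-point rounding.

```python
HEX_DIGITS = "0123456789ABCDEF"

def int_to_hex(n):
    """Integer part: divide by 16 repeatedly; remainders appear LSB first."""
    if n == 0:
        return "0"
    digits = []
    while n > 0:
        n, remainder = divmod(n, 16)
        digits.append(HEX_DIGITS[remainder])
    return "".join(reversed(digits))  # MSB was produced last

def frac_to_hex(numerator, denominator, max_digits=8):
    """Fraction part: multiply by 16 repeatedly; each carry-out is a digit."""
    digits = []
    while numerator and len(digits) < max_digits:
        carry, numerator = divmod(numerator * 16, denominator)
        digits.append(HEX_DIGITS[carry])
    return "".join(digits)

# 420.625 decimal, split into 420 and 625/1000 as on the slide
print(int_to_hex(420) + "." + frac_to_hex(625, 1000))
```

The max_digits limit is there because many decimal fractions (0.1, for example) never terminate in hexadecimal.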

Number Systems
Binary-Coded Hexadecimal (BCH):
2AC₁₆ = 0010 1010 1100₂
1000 0011 1101.1110₂ = 83D.E₁₆

Complements
Data are stored in complement form to represent negative numbers
One’s complement of 0100 1100:
    1111 1111
  - 0100 1100
    1011 0011
Two’s complement:
    1011 0011
  + 0000 0001
    1011 0100
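A short Python sketch of the same two operations on an 8-bit value. The helper names are illustrative only; the bits parameter is an assumption added so the same code works for 16-bit words.

```python
def ones_complement(value, bits=8):
    """Invert every bit; equivalent to subtracting from all ones."""
    mask = (1 << bits) - 1
    return ~value & mask

def twos_complement(value, bits=8):
    """One's complement plus one, truncated to the register width."""
    mask = (1 << bits) - 1
    return (ones_complement(value, bits) + 1) & mask

value = 0b01001100
print(format(ones_complement(value), "08b"))
print(format(twos_complement(value), "08b"))
```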

The 80x86 MICROPROCESSOR

Some buzz words
CISC – Complex Instruction Set Computer
  • Refers to the number and complexity of instructions
  • Improvements included multiply and divide instructions
  • The number of instructions increased from 45 on the 4004 to 246 on the 8085 and to more than 20,000 (counting all variations) on the 8086 and 8088
RISC – Reduced Instruction Set Computer
  • Executes one instruction per clock
Newer RISC – Superscalar Technology
  • Executes more than one instruction per clock

Inside The 8088/8086
Concepts important to the internal operation of the 8088/8086
  • Pipelining
  • Registers

Inside The 8088/8086…pipelining
• Pipelining
  – Two ways to make the CPU process information faster:
    • Increase the working frequency (technology dependent)
    • Change the internal architecture of the CPU
  – Pipelining allows the CPU to fetch and execute at the same time

Inside The 8088/8086…pipelining
Intel implemented the concept of pipelining by splitting the internal structure of the 8088/8086 into two sections that work simultaneously:
  • Execution Unit (EU) – executes instructions previously fetched
  • Bus Interface Unit (BIU) – accesses memory and peripherals

Inside The 8088/8086
[Figure: register diagram showing the high bytes AH, BH, CH, DH and the low bytes AL, BL, CL, DL]

• Registers Overview
  – General purpose registers (8)
    • Operands for logical and arithmetic operations
    • Operands for address calculations
    • Memory pointers
  – Segment registers (6)
  – EFLAGS register
  – The instruction pointer register
• The stack

Inside The 8088/8086…registers
• Registers – To store information temporarily

Category     Bits  Register Names
General      16    AX, BX, CX, DX
             8     AH, AL, BH, BL, CH, CL, DH, DL
Pointer      16    SP (stack pointer), BP (base pointer)
Index        16    SI (source index), DI (destination index)
Segment      16    CS (code segment), DS (data segment), SS (stack segment), ES (extra segment)
Instruction  16    IP (instruction pointer)
Flag         16    FR (flag register)

AX is a 16 bit register; AH and AL are its two 8 bit halves.

Anatomy of a Register
[Figure: one register split into bit fields]
  Extended Register   Bits 0-31 (bits 16-31 belong to the extended register only)
  Word Register       Bits 0-15
  High Byte Register  Bits 8-15
  Low Byte Register   Bits 0-7

General Registers

32 bit Registers   16 bit Registers   8 bit Registers
(Bits 0-31)        (Bits 0-15)        (Bits 8-15, Bits 0-7)
EAX                AX                 AH, AL
EBX                BX                 BH, BL
ECX                CX                 CH, CL
EDX                DX                 DH, DL
EBP                BP
ESI                SI
EDI                DI
ESP                SP
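The overlap between the 32, 16 and 8 bit names can be modelled with shifts and masks. This Python sketch uses a hypothetical helper, register_views, which is not part of any real tool; it just shows which bits each name refers to.

```python
def register_views(eax):
    """Return the sub-register values that alias a 32-bit EAX value."""
    eax &= 0xFFFFFFFF          # bits 0-31
    ax = eax & 0xFFFF          # bits 0-15
    return {
        "EAX": eax,
        "AX": ax,
        "AH": (ax >> 8) & 0xFF,  # bits 8-15
        "AL": ax & 0xFF,         # bits 0-7
    }

views = register_views(0x12345678)
for name, value in views.items():
    print(name, format(value, "X"))
```

The same pattern applies to EBX, ECX and EDX. EBP, ESI, EDI and ESP have 16 bit names but no AH/AL-style byte halves in this table.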

General Registers I
• EAX – ‘Accumulator’
  • accumulator for operands and results data
  • usually used to store the return value of a procedure
• EBX – ‘Base Register’
  • pointer to data in the DS segment
• ECX – ‘Counter’
  • counter for string and loop operations
• EDX – ‘Data Register’
  • I/O pointer

General Registers II
• ESI – ‘Source Index’
  • source pointer for string operations
  • typically a pointer to data in the segment pointed to by the DS register
• EDI – ‘Destination Index’
  • destination pointer for string operations
  • typically a pointer to data/destination in the segment pointed to by the ES register

General Registers III
• EBP – ‘Base Pointer’
  • pointer to data on the stack
  • points to the current stack frame of a procedure
• ESP – ‘Stack Pointer’
  • pointer to the top address of the stack
  • holds the stack pointer and as a general rule should not be used for any other purpose

Segment Registers
• CS – ‘Code Segment’
  – contains the segment selector for the code segment, where the instructions being executed are stored
• DS (ES, FS, GS) – ‘Data Segment’
  – contain the segment selectors for the data segments, where data is stored
• SS – ‘Stack Segment’
  – contains the segment selector for the stack segment, where the procedure stack is stored

The EFLAGS Register I
• Carry Flag – CF (bit 0)
  – Set if an arithmetic operation generates a carry or a borrow out of the most significant bit of the result; cleared otherwise.
• Parity Flag – PF (bit 2)
  – Set if the least significant byte of the result contains an even number of 1 bits; cleared otherwise.
• Adjust Flag – AF (bit 4)
  – Set if an arithmetic operation generates a carry or a borrow out of bit 3 of the result; cleared otherwise.

The EFLAGS Register II
• Zero Flag – ZF (bit 6)
  – Set if the result is zero; cleared otherwise.
• Sign Flag – SF (bit 7)
  – Set equal to the most significant bit of the result, which is the sign bit of a signed integer.
• Overflow Flag – OF (bit 11)
  – Set if the integer result is too large a positive number or too small a negative number (excluding the sign bit) to fit in the destination operand; cleared otherwise.
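To make the six status-flag definitions concrete, here is a Python sketch that computes them for an 8-bit ADD. It is a model written for this example, not Intel's specification text; the name add8_flags and the chosen operands are assumptions.

```python
def add8_flags(a, b):
    """Add two 8-bit values and derive CF, PF, AF, ZF, SF, OF."""
    a &= 0xFF
    b &= 0xFF
    total = a + b
    result = total & 0xFF
    flags = {
        "CF": int(total > 0xFF),                         # carry out of bit 7
        "PF": int(bin(result).count("1") % 2 == 0),      # even number of 1 bits
        "AF": int((a & 0x0F) + (b & 0x0F) > 0x0F),       # carry out of bit 3
        "ZF": int(result == 0),
        "SF": result >> 7,                               # copy of the MSB
        # signed overflow: both operands differ in sign from the result
        "OF": int(((a ^ result) & (b ^ result) & 0x80) != 0),
    }
    return result, flags

result, flags = add8_flags(0x7F, 0x01)
print(format(result, "02X"), flags)
```

Adding 7FH and 01H is a useful test case because the unsigned result fits (CF = 0) while the signed result does not (OF = 1).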

Instruction Pointer
• EIP – ‘Instruction Pointer’
  – Contains the offset within the code segment of the next instruction to be executed
  – Cannot be accessed directly by software

The Stack
The stack starts in high memory and grows toward low memory
[Figure: stack layout showing ESP, EBP, the current stack frame, the caller’s stack frame, and the direction of stack growth]

Intel Assembly

Intel Assembly
Goal: to gain a knowledge of Intel 32 bit assembly instructions
References:
• M. Pietrek, “Under the Hood: Just Enough Assembly Language to Get By”, MSJ Article, February 1998, www.microsoft.com/msj
• M. Pietrek, “Under the Hood: Just Enough Assembly Language to Get By, Part II”, MSJ Article, June 1998, www.microsoft.com/msj
• IA-32 Intel® Architecture Software Developer’s Manual
  • Volume 1: Basic Architecture, www.intel.com/design/Pentium4/documentation.htm
  • Volume 2A: Instruction Set Reference A-M, www.intel.com/design/pentium4/documentation.htm
  • Volume 2B: Instruction Set Reference N-Z, www.intel.com/design/pentium4/documentation.htm

Assembly Programming
• Machine Language
  • binary
  • hexadecimal
  • machine code or object code
• Assembly Language
  • mnemonics
  • assembler
• High Level Language
  • Pascal, Basic, C
  • compiler

Assembly Language Programming

What Does It Mean to Disassemble Code?
[Figure: build pipeline. Source Code → Preprocessing & Compiling → Assembly Code → Assembly → Object Code → Linking (with DLLs) → Executable Code]

What Does It Mean to Disassemble Code?
[Figure: the build pipeline from Source Code through Preprocessing & Compiling, Assembly, Object Code and Linking (with DLLs) to Executable Code, with a DISASSEMBLY arrow leading back from the Executable Code]

Why is Disassembly Useful in Malware Analysis?
• It is not always desirable to execute malware: disassembly provides a static analysis.
• Disassembly enables an analyst to investigate all parts of the code, something that is not always possible in dynamic analysis.
• Using a disassembler and a debugger in combination creates synergy.

32 bit Instructions
• Instructions are represented in memory by a series of “opcode bytes.”
• Because instruction sizes vary, disassembly is position specific: decoding must start at an instruction boundary.
• Most instructions take zero, one, or two arguments:
    instruction destination, source
  For example:
    add eax, ebx
  is equivalent to the expression eax = eax + ebx

Rule #3: If a value of FFH or less is moved into a 16 bit register, the rest of the bits are assumed to be all zeros.
    MOV BX, 5    ; BX = 0005, BH = 00, BL = 05

Program Segments
• A segment is an area of memory that includes up to 64K bytes
• Begins on an address evenly divisible by 16
• The 8085 could address a maximum of 64K bytes of physical memory because it has only 16 pins for the address lines (2^16 = 64K)
• The 8088/86 stayed compatible with the 8085
  – Range of 1 MB of memory, since it has 20 address pins (2^20 = 1 MB)
  – Can handle 64 KB of code, 64 KB of data, 64 KB of stack
• A typical Assembly language program consists of three segments:
  • Code segment
  • Data segment
  • Stack segment

Program Segments…a sample

Program Segments

Program Segments
Code segment
The 8086 fetches the instructions (opcodes and operands) from the code segment.
The 8086 address types:
  – Physical address
  – Offset address
  – Logical address
• Physical address
  – 20 bit address that is actually put on the address pins of the 8086
  – Decoded by the memory interfacing circuitry
  – A range of 00000H to FFFFFH
  – It is the actual physical location in RAM or ROM within the 1 MB memory range
• Offset address
  – A location within a 64 KB segment range
  – A range of 0000H to FFFFH
• Logical address
  – Consists of a segment value and an offset address

Program Segments…example
Define the addresses for the 8086 when it fetches the instructions (opcodes and operands) from the code segment.
• Logical address
  – Consists of a CS (code segment) and an IP (instruction pointer); the format is CS:IP
• Offset address
  – IP contains the offset address
• Physical address
  – Generated by shifting the CS left one hex digit and then adding it to the IP
  – The resulting 20 bit address is called the physical address

Program Segments…example
Give me some numbers… OK. Suppose we have:
    CS = 2500    IP = 95F3
• Logical address
  – Consists of a CS (code segment) and an IP (instruction pointer); the format is CS:IP
    2500:95F3H
• Offset address
  – IP contains the offset address, which is 95F3H
• Physical address
  – Generated by shifting the CS left one hex digit and then adding it to the IP
    25000 + 95F3 = 2E5F3H
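The shift-and-add rule can be checked with a few lines of Python. The function name physical_address is illustrative; the final mask models the 20 address pins, so a sum past FFFFFH wraps around.

```python
def physical_address(segment, offset):
    """Shift the segment left one hex digit (4 bits) and add the offset."""
    return ((segment << 4) + offset) & 0xFFFFF  # 20-bit address bus

cs, ip = 0x2500, 0x95F3
print(format(physical_address(cs, ip), "05X") + "H")
```

The same function works for DS:offset and SS:SP pairs, since every segment register is combined with its offset in the same way.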

Program Segments
Data segment refers to an area of memory set aside for data
• Format: DS:BX, DS:DI or DS:SI
• Example:
    DS:0200 = 25
    DS:0201 = 12
    DS:0202 = 15
    DS:0203 = 1F
    DS:0204 = 2B

Program Segments
Data segment
Example: Add 5 bytes of data: 25H, 12H, 15H, 1FH, 2BH
Not using the data segment:
    MOV AL, 00H    ; clear AL
    ADD AL, 25H    ; add 25H to AL
    ADD AL, 12H
    ADD AL, 15H
    ADD AL, 1FH
    ADD AL, 2BH

Program Segments
Data segment
Example: Add 5 bytes of data: 25H, 12H, 15H, 1FH, 2BH, using the data segment with a constant offset
Data location in memory:
    DS:0200 = 25
    DS:0201 = 12
    DS:0202 = 15
    DS:0203 = 1F
    DS:0204 = 2B
Program:
    MOV AL, 0
    ADD AL, [0200]
    ADD AL, [0201]
    ADD AL, [0202]
    ADD AL, [0203]
    ADD AL, [0204]

Program Segments
Data segment
Example: Add 5 bytes of data: 25H, 12H, 15H, 1FH, 2BH, using the data segment with an offset register
Program:
    MOV AL, 0
    MOV BX, 0200H
    ADD AL, [BX]
    INC BX         ; same as “ADD BX, 1”
    ADD AL, [BX]

Endian convention
• Little endian convention: In the case of 16 bit data, the low byte goes to the low memory location and the high byte goes to the high memory address. (Intel, Digital VAX)
• Big endian convention: The high byte goes to the low address. (Motorola)
Example: Suppose DS:6826 = 48 and DS:6827 = 22. Show the contents of register BX after the instruction MOV BX, [6826].
Little endian convention: BL = 48H and BH = 22H, so BX = 2248H
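The same two bytes can be read both ways with Python's standard struct module, where "<H" means a little endian 16 bit word and ">H" a big endian one. The variable names are only for illustration.

```python
import struct

# Bytes at DS:6826 and DS:6827, lowest address first
memory = bytes([0x48, 0x22])

(bx_little,) = struct.unpack("<H", memory)  # Intel: low byte at low address
(bx_big,) = struct.unpack(">H", memory)     # big endian reading, for contrast

print(format(bx_little, "04X"), format(bx_big, "04X"))
```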

Program Segments
Stack segment
Stack: a section of RAM used by the CPU to store information temporarily.
• Registers: SS (stack segment) and SP (stack pointer)
• Operations: PUSH and POP
  – PUSH – the storing of a CPU register in the stack
  – POP – loading the contents of the stack back into the CPU
• Logical and offset address format: SS:SP
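A small Python model of 16-bit PUSH and POP, assuming the usual 8086 behaviour: PUSH decrements SP by 2 and then stores the word at SS:SP in little endian order, and POP reverses this. The class name Stack8086 and the SS/SP values below are made up for the sketch.

```python
class Stack8086:
    """Toy model of the 8086 stack; memory is a dict of byte addresses."""

    def __init__(self, ss, sp):
        self.ss = ss
        self.sp = sp
        self.memory = {}

    def _physical(self, offset):
        return ((self.ss << 4) + offset) & 0xFFFFF

    def push(self, word):
        self.sp = (self.sp - 2) & 0xFFFF              # stack grows downward
        self.memory[self._physical(self.sp)] = word & 0xFF
        self.memory[self._physical((self.sp + 1) & 0xFFFF)] = (word >> 8) & 0xFF

    def pop(self):
        low = self.memory[self._physical(self.sp)]
        high = self.memory[self._physical((self.sp + 1) & 0xFFFF)]
        self.sp = (self.sp + 2) & 0xFFFF
        return (high << 8) | low

stack = Stack8086(ss=0x0700, sp=0x1236)
stack.push(0x24B6)
print(format(stack.sp, "04X"))
```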

Flag Register
• Flag Register (status register)
  – 16 bit register
  – Conditional flags: CF, PF, AF, ZF, SF, OF
  – Control flags: TF, IF, DF

Flag Register and ADD instruction
• Flags that may be affected
  – Conditional flags: CF, PF, AF, ZF, SF, OF

Flow Control I
• JMP location
  Transfers program control to a different point in the instruction stream without recording return information.
    jmp eax
    jmp 0x00934EE4

Flow Control II
• CMP value, value / Jcc location
  The compare instruction compares two values, setting or clearing a variety of flags (e.g., ZF, SF, OF). Various conditional jump instructions use flags to branch accordingly.
    cmp eax, 4
    je 40320020
    cmp [ebp+10h], eax
    jne 40DC0020

Flow Control III
• TEST value, value / Jcc location
  The test instruction does a logical AND of the two values and discards the result. It sets the SF, ZF, and PF flags according to that result. Various conditional jump instructions use these flags to branch.
    test eax, eax
    jnz 40DA0020
    test edx, 0056FCE2
    jz 56DC0F20
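How CMP and TEST feed a conditional jump can be sketched in Python. This is a simplified model for illustration: the function names are invented, only a few flags are computed, and CMP is treated as a subtraction whose result is thrown away.

```python
MASK32 = 0xFFFFFFFF

def flags_after_cmp(a, b):
    """CMP a, b: compute a - b and keep only the flags."""
    a &= MASK32
    b &= MASK32
    result = (a - b) & MASK32
    return {"ZF": int(result == 0), "SF": result >> 31, "CF": int(a < b)}

def flags_after_test(a, b):
    """TEST a, b: compute a AND b and keep only the flags."""
    result = a & b & MASK32
    low_byte = result & 0xFF
    return {
        "ZF": int(result == 0),
        "SF": result >> 31,
        "PF": int(bin(low_byte).count("1") % 2 == 0),
    }

def je_taken(flags):
    return flags["ZF"] == 1   # JE/JZ branch when ZF is set

def jnz_taken(flags):
    return flags["ZF"] == 0   # JNE/JNZ branch when ZF is clear

eax = 4
print(je_taken(flags_after_cmp(eax, 4)))       # cmp eax, 4 / je ...
print(jnz_taken(flags_after_test(eax, eax)))   # test eax, eax / jnz ...
```

The "test eax, eax" idiom is a compact way to ask whether a register is zero, since a value ANDed with itself is zero only when the value is zero.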

Looping using zero flag
• The zero flag is set (ZF=1) when the counter becomes zero (CX=0)
• Example: add 5 bytes of data
            MOV CX, 05       ; CX holds the loop count
            MOV BX, 0200H    ; BX holds the offset data address
            MOV AL, 00       ; initialize AL
    ADD_LP: ADD AL, [BX]     ; add the next byte to AL
            INC BX           ; increment the data pointer
            DEC CX           ; decrement the loop counter
            JNZ ADD_LP       ; jump to next iteration if counter not zero
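The loop can be traced in Python to see the result it leaves in AL. The sketch assumes the five data bytes from the earlier data segment slides (25H, 12H, 15H, 1FH, 2BH at DS:0200 to DS:0204); sum_bytes_loop is a hypothetical name, and memory is a plain dict keyed by offset.

```python
def sum_bytes_loop(memory, start, count):
    """Mirror the ADD_LP loop: AL accumulates, BX walks, CX counts down."""
    cx, bx, al = count, start, 0          # MOV CX / MOV BX / MOV AL
    while True:
        al = (al + memory[bx]) & 0xFF     # ADD AL, [BX] (8-bit wraparound)
        bx = (bx + 1) & 0xFFFF            # INC BX
        cx = (cx - 1) & 0xFFFF            # DEC CX sets ZF when CX reaches 0
        zero_flag = int(cx == 0)
        if zero_flag:                     # JNZ falls through when ZF = 1
            break
    return al

data = {0x0200: 0x25, 0x0201: 0x12, 0x0202: 0x15, 0x0203: 0x1F, 0x0204: 0x2B}
print(format(sum_bytes_loop(data, 0x0200, 5), "02X") + "H")
```

As in the assembly version, the body runs before the counter is checked, so the count is assumed to be at least 1.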

Addressing Modes – Accessing operands (data) in various ways

; move contents of DS:2400H into DL

; move contents of DS:SI into CL
; move contents of AH into DS:DI
; move contents of AX into memory locations DS:SI and DS:SI+1

; move DS:BX+10 & DS:BX+10+1 into CX. PA = DS(sl) + BX + 10
; PA = SS(sl) + BP + 5

; PA = DS(sl) + SI + 5
; PA = DS(sl) + DI + 20

; PA = DS(sl) + BX + DI + 8
; PA = SS(sl) + BP + SI + 29
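The PA formulas in these comments can be evaluated with a short Python sketch. It reads "(sl)" as "shifted left one hex digit", which is an assumption based on the earlier physical address slides, and the register values used below are invented for the example.

```python
def effective_address(base=0, index=0, displacement=0):
    """Offset part: base + index + displacement, wrapped to 16 bits."""
    return (base + index + displacement) & 0xFFFF

def physical(segment, offset):
    """Segment shifted left 4 bits plus offset, wrapped to 20 bits."""
    return ((segment << 4) + offset) & 0xFFFFF

# Hypothetical register contents
ds, bx, di = 0x4500, 0x2100, 0x8500

# PA = DS(sl) + BX + DI + 8
offset = effective_address(base=bx, index=di, displacement=8)
print(format(physical(ds, offset), "05X") + "H")
```

The simpler modes are special cases: a direct address uses only the displacement, and a register indirect access uses only the base or the index.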

Assembly Language Programming

Assembly Programming
• An Assembly Language instruction consists of four fields:
    [label:] mnemonic [operands] [;comment]
• Labels
  • See rules
• mnemonic, operands
  • MOV AX, 6764
• comment
  • ; this is a sample program

Model Definition
MODEL directive – selects the size of the memory model
• .MODEL MEDIUM
  • Data must fit into 64 KB
  • Code can exceed 64 KB
• .MODEL COMPACT
  • Data can exceed 64 KB
  • Code cannot exceed 64 KB
• .MODEL LARGE
  • Data can exceed 64 KB (but no single set of data should exceed 64 KB)
  • Code can exceed 64 KB
• .MODEL HUGE
  • Data can exceed 64 KB (data items, i.e. arrays, can exceed 64 KB)
  • Code can exceed 64 KB
• .MODEL TINY
  • Data must fit into 64 KB
  • Code must fit into 64 KB
  • Used with COM files

Segments
Segment definition: The 80x86 CPU has four segment registers: CS, DS, SS, ES
Segments of a program:
    .STACK            ; marks the beginning of the stack segment
  example:
    .STACK 64         ; reserves 64 B of memory for the stack
    .DATA             ; marks the beginning of the data segment
  example:
    .DATA
    DATA1 DB 52H      ; DB directive allocates memory in byte size chunks

Segments
    .CODE             ; marks the beginning of the code segment
• starts with the PROC (procedure) directive
• the PROC directive may have the option FAR or NEAR
• ends with the ENDP directive

Assemble, Link, and Run Program

STEP                     INPUT        PROGRAM         OUTPUT
1. Edit the program      keyboard     editor          myfile.asm
2. Assemble the program  myfile.asm   MASM or TASM    myfile.obj, myfile.lst, myfile.crf
3. Link the program      myfile.obj   LINK or TLINK   myfile.exe, myfile.map

Assemble, Link, Run Files
.asm – source file
.obj – machine language file
.lst – list file: lists all the opcodes, offset addresses, and errors that MASM detected
.crf – cross reference file: an alphabetical list of all symbols and labels used in the program, as well as the program line numbers in which they are referenced
.map – map file: shows the location and number of bytes used when there are many segments for code or data

PAGE and TITLE directives
PAGE [lines], [columns]
• Tells the printer how the list should be printed
• Default mode is 66 lines per page with 80 characters per line
• The range for the number of lines is 10 to 255, and for columns 60 to 132
TITLE
• Prints the title of the program
• The text after the TITLE pseudo instruction cannot be more than 60 ASCII characters

Control Transfer Instructions
• NEAR – When control is transferred to a memory location within the current code segment
• FAR – When control is transferred outside the current code segment
• CS:IP – This register pair always points to the address of the next instruction to be executed
  • In a NEAR jump, IP is updated and CS remains the same
  • In a FAR jump, both CS and IP are updated

Control Transfer Instructions
• Conditional Jumps – See Table 2-1
• Short Jump
  – All conditional jumps are short jumps
  – The address of the target must be within –128 to +127 bytes of the IP
  – The conditional jump is a two byte instruction:
    • One byte is the opcode of the J condition
    • The 2nd byte is a value between 00 and FF
  – 256 possible addresses: forward jump to +127, backward jump to –128

Control Transfer Instructions
• Forward jump to +127:
  • Calculation of the target address: the operand (2nd byte) is added to the IP of the instruction following the jump (see page 65)
• Backward jump to –128:
  • The 2nd byte is the 2’s complement of the displacement value
  • Calculation of the target address: the 2nd byte is added to the IP of the instruction after the jump (see Program 2-1, and page 65)
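A Python sketch of the short jump calculation. The displacement byte is interpreted as a signed 2's complement value and added to the IP of the instruction after the jump; the function name and the sample addresses are illustrative and are not taken from the referenced Program 2-1.

```python
def short_jump_target(ip_after_jump, displacement_byte):
    """Sign-extend the 2nd byte and add it to the following instruction's IP."""
    if displacement_byte & 0x80:                 # 80H..FFH means backward
        displacement = displacement_byte - 0x100
    else:                                        # 00H..7FH means forward
        displacement = displacement_byte
    return (ip_after_jump + displacement) & 0xFFFF

# Forward: the jump's 2nd byte is 06H and the next instruction is at 0006H
print(format(short_jump_target(0x0006, 0x06), "04X"))
# Backward: the 2nd byte is FAH (-6) and the next instruction is at 000FH
print(format(short_jump_target(0x000F, 0xFA), "04X"))
```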

Control Transfer Instructions
• Unconditional Jumps – “JMP label”: control is transferred unconditionally to the target location label
  § SHORT JUMP – “JMP SHORT label”
  § NEAR JUMP – “JMP label”
  § FAR JUMP – “JMP FAR PTR label”
• CALL statements – A control transfer instruction used to call a procedure
  • In a NEAR call, IP is saved on the stack (see figure 2-5, page 67)
  • In a FAR call, both CS and IP are saved on the stack
  • RET – the last instruction of the called subroutine

Control Transfer Instructions
• Assembly Language Subroutine
  § one main program and many subroutines
  § main program – is the entry point from DOS and is FAR
  § subroutines – called within the main program
    • can be FAR or NEAR
  § if nothing is mentioned after PROC, it defaults to NEAR

Data Types and Data Definition
• 80x86 data types
  § 8 bit or 16 bit
  § Positive or negative
  § example 1: the number 5₁₀ (101₂) will be 0000 0101
  § example 2: the number 514₁₀ (10 0000 0010₂) will be 0000 0010 0000 0010

Data Types and Data Definition
• Assembler data directives
  § ORG (origin) – to indicate the beginning of the offset address
    § example: ORG 0010H
  § DB (define byte) – allocation of memory in byte sized chunks
    § example:
        DATA1 DB 25           ; decimal
        DATA2 DB 10001001B    ; binary
        DATA3 DB 12H          ; hex
        DATA4 DB '2591'       ; ASCII numbers
        DATA5 DB ?            ; set aside a byte
        DATA6 DB 'Hello'      ; ASCII characters
        DATA7 DB "O' Hi"

Data Types and Data Definition
• Assembler data directives
  § DUP (duplicate) – to duplicate a given number of characters
    § example:
        DATA1 DB 0FFH, 0FFH, 0FFH, 0FFH   ; fill 4 bytes with FF
      Can be replaced with:
        DATA2 DB 4 DUP(0FFH)              ; fill 4 bytes with FF
        DATA3 DB 30 DUP(?)                ; set aside 30 bytes
        DATA4 DB 5 DUP (2 DUP (99))       ; fill 10 bytes with 99
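The effect of DUP, including the nested form, can be mimicked with Python list repetition. The helper name dup is invented for this sketch and is not assembler syntax.

```python
def dup(count, values):
    """Repeat a sequence of byte values count times, like 'count DUP(values)'."""
    return list(values) * count

data2 = dup(4, [0xFF])            # DB 4 DUP(0FFH)
data4 = dup(5, dup(2, [99]))      # DB 5 DUP (2 DUP (99))

print(len(data2), len(data4))
```

Nesting multiplies the counts, which is why the DATA4 example fills 10 bytes.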

Data Types and Data Definition
• Assembler data directives
  § DW (define word) – allocate memory 2 bytes (one word) at a time
    § example:
        DATA1 DW 342                     ; decimal
        DATA2 DW 0101001B                ; binary
        DATA3 DW 123FH                   ; hex
        DATA4 DW 9, 6, 0CH, 0111B, 'Hi'  ; Data numbers
        DATA5 DW 8 DUP (?)               ; set aside 8 words
  § EQU (equate) – define a constant without occupying a memory location
    § example:
        COUNT EQU 25    ; COUNT can be used in many places in the program

Data Types and Data Definition
• Assembler data directives
  § DD (define doubleword) – allocate memory 4 bytes (2 words) at a time
    § example:
        DATA1 DD 1023                   ; decimal
        DATA2 DD 0101001001110110B      ; binary
        DATA3 DD 7A3D43F1H              ; hex
        DATA4 DD 54H, 65432H, 65533     ; Data numbers
  § DQ (define quadword) – allocate memory 8 bytes (4 words) at a time
    § example:
        DATA1 DQ 6723F9H    ; hex
        DATA2 DQ 'Hi'       ; ASCII characters
        DATA3 DQ ?          ; nothing

Data Types and Data Definition
• Assembler data directives
  § DT (define ten bytes) – allocates packed BCD numbers (used in multibyte addition of BCD numbers)
    § example:
        DATA1 DT 123456789123    ; BCD
        DATA2 DT ?               ; nothing
        DATA3 DT 76543d          ; assembler will convert decimal number to hex and store it

Full Segment Definition
§ Simple segment definition – refers to the newer definition
  § Microsoft MASM 5.0 or higher
  § Borland’s TASM ver. 1
§ Full segment definition – refers to the older definition
  § SEGMENT directive – indicates to the assembler the beginning of a segment
  § ENDS directive – indicates to the assembler the end of a segment
Example:
    label SEGMENT [options]
          ; statements
    label ENDS

EXE vs. COM
• COM files
  • Smaller in size (max of 64 KB)
  • Do not have a header block
• EXE files
  • Unlimited size
  • Do have a header block (512 bytes of memory; contains information such as size, address location in memory, stack address)

Converting from EXE to COM
Procedure for conversion to COM from EXE
1. Change the source file to the COM format
2. Assemble
3. Link
4. Use the utility program EXE2BIN that comes with DOS
    C:>EXE2BIN program1, program1.com

Arithmetic and Logic Instructions and Programs

Used to mask certain bits and to test for zero.
Used to clear the contents of a register, and also to see whether two registers have the same value.
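The instruction names for these two notes did not survive on the slide. Assuming they describe AND (masking, zero test) and XOR (clearing, equality test), which is the conventional usage, the four tricks look like this in Python. All function names are invented for the sketch.

```python
def mask_low_nibble(value):
    """AND with 0FH keeps bits 0-3 and forces bits 4-7 to zero."""
    return value & 0x0F

def is_zero(value):
    """AND of a value with itself is zero only if the value is zero."""
    return (value & value) == 0

def clear(value):
    """XOR of a value with itself always gives zero."""
    return value ^ value

def same_value(a, b):
    """XOR of two values is zero only if they are equal."""
    return (a ^ b) == 0

print(format(mask_low_nibble(0x35), "02X"), clear(0xABCD), same_value(0x55, 0x55))
```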

Interrupt Programming with C
• Using C as “high level assembly”
• C programmers do not need to have detailed knowledge of 80x86 assembly
• C programmers write programs using:
  – DOS function calls (INT 21H)
  – BIOS interrupts
• Compilers provide functions:
  – int86 (calling any of the PC’s interrupts)
  – intdos (only for INT 21H DOS function calls)

Interrupt Programming with C
• Programming BIOS interrupts with C/C++
  – Set registers to desired values
  – Call int86
• Upon return from int86, the 80x86 registers can be accessed
  – To access the 80x86 registers, use a union of the REGS structure already defined by the C compiler
    • union REGS regin, regout;
• Registers are accessed in 16 bit (x) or 8 bit (h) format

Interrupt Programming with C
Example:
    /* C language                           Assembly language */
    union REGS regin, regout;
    regin.h.ah = 0x25;                      /* mov ah, 25h    ; AH=25H   */
    regin.x.dx = 0x4567;                    /* mov dx, 4567h  ; DX=4567H */
    regin.x.si = 0x1290;                    /* mov si, 1290h  ; SI=1290H */
    int86(interrupt#, &regin, &regout);     /* int #                     */

Interrupt Programming with C
• Programming INT 21H DOS function calls with C/C++
  – intdos is used for DOS function calls
    • intdos(&regin, &regout); /* to be used for INT 21H only */

Signed Numbers, Strings, Tables

Using MASM
• Developed by Microsoft
• Used to translate 8086 assembly language into machine language
• 3 steps:
  – Prepare the .ASM file using a text editor
  – Create the .OBJ file using MASM
  – Create the .EXE file using LINK
• Once you have the .EXE file, debug can be used to test and run the program

Disassembly Using IDA Pro

IDA Pro Disassembler
• Interactive disassembler commercially developed by Datarescue
• Supports over 30 families of processors (Intel x86, SPARC)
• Supports many file formats (PE, ELF)
• Provides a powerful SDK

Using IDA Pro
• Loading a file
• General settings
• Views
• Navigating through the code
• Adding analysis content
• Searches (binary, text)
• Patching & scripting
• Exiting and saving

Tools – IDA Pro Demonstration